Incremental Innovation

A Bayesian approach to changing the world.

“Innovation” is a good word for what I mean, but there are a couple of other phrases that relate to the concept I’m really trying to address. It’s “making a difference”, “changing the world”, “doing great things”, “establishing a legacy”. In antiquity this is exemplified by what Alexander the Great did for military strategy and Aristotle did for philosophy. Today it’s what turns kids in dorm rooms and hippies in garages into billionaires in jeans and turtlenecks. In its highest realization, it gets people on lists of historical greats. This notion appears in every field. In literature there’s the works of Dickens and Hemingway. In art there’s Michelangelo and van Gogh. In science Newton, Darwin, and Einstein. The business acumen of Steve Jobs and Elon Musk. N.W.A. did it for rap, the Beatles did it for rock. We attribute innovation to people and groups alike. Whether we’re thinking of “Google” or “Larry Page”, the point is the same – we’re describing a phenomenon that seems to somehow transcend the ordinary improvements that 99% of the world produces in favor of transformational improvement. This is the phenomenon I mean when I say innovation.

Zero to One Innovation

I’m going to borrow language from Peter Thiel’s Zero to One. The work focuses on innovation in the context of startup companies, but it relates directly to my broader concept. Thiel describes innovation as a “zero to one” improvement in some field. To go from zero to one is to do something that no one else is doing, or to create something that did not exist before. Thiel contrasts this with events that take fields from one to n. For example, the invention of penicillin is a “zero to one” event, while opening another pizza shop in Manhattan is a “one to n” event. Penicillin is innovative, the pizza shop is not. Thiel qualifies this model by requiring that successful innovation also introduces an order of magnitude (10x) improvement over competition. Incremental improvement isn’t enough. A car that is 5% more fuel efficient than any other car is not world changing and will not dominate the field, but the development of cars is a world changing event. Cars can be said to be at least an order of magnitude more effective than horse drawn carriages. In my experience, Thiel accurately describes the model of innovation that most people hold. The innovative agent of change must be better than existing agents. In order to generate the legacy that great innovation deserves, it can’t just be a little better – it must be a lot better. To be better than existing practices the innovation must be new, or if not new, at least rediscovered. I’ll call these conditions the “Zero to One (Zt1) conditions” for innovation regardless of whether Thiel himself appreciates this definition, because I’ve seen many people walk away from his book with this impression.

I’ll use Michelangelo to demonstrate Thiel’s model in the wild. Michelangelo’s preeminent biographer, Giorgio Vasari, identifies the Zt1 conditions of innovation:

“One [biography], by Giorgio Vasari, proposed that [Michelangelo] was the pinnacle of all artistic achievement since the beginning of the Renaissance, a viewpoint that continued to have currency in art history for centuries.” Vasari on Michelangelo’s David: “in it may be seen most beautiful contours of legs, with attachments of limbs and slender outlines of flanks that are divine; nor has there ever been seen a pose so easy, or any grace to equal that in this work, or feet, hands and head so well in accord, one member with another, in harmony, design, and excellence of artistry. And, of a truth, whoever has seen this work need not trouble to see any other work executed in sculpture, either in our own or in other times, by no matter what craftsman.” – Wikipedia

Vasari claims Michelangelo demonstrated a new level of artistic merit, one that is significantly better than any before him. He even claims that Michelangelo created works significantly better than anything that came after. Attacking a claim this extreme is too close to weak manning for my taste, but my point is that this mode of thinking is common. In an attempt to develop a coherent model of a world in which one person achieves such prestige and influence in art, we come to the conclusion that the skills of the artist and the merits of his work must have been far greater than those of his competitors.

Incremental Innovation

This would be boring if I didn’t tell you there are significant problems with the Zt1 model of innovation. I hold that Michelangelo does not represent an order of magnitude improvement in the world of Renaissance art. People get this impression because the fields of painting and sculpture were growing an order of magnitude faster than normal. He was on the right point of the S-curve of innovation. I argue that the contributions Michelangelo made to the fields in both theory and execution can be characterized as incremental relative to his peers at the time, and there is nothing particularly anomalous about the man himself. The important differences between my model and Zt1 innovation come not from the effects of innovation on its field, but rather from what innovation looks like from the inside.

I can make the differences between my model and the conventional model concrete. I’ll continue to use Michelangelo as an explanatory example, but this is a general framework that can be applied to any innovator (or innovation, with some tweaking).

We model the relative skills of a collection of agents in some field with a normal distribution. The incremental innovation hypothesis is what scientists and statisticians call the null hypothesis H_{0}. It holds that the sample in question, the innovator, is not significantly different from the general population. I take general population to refer to the competitors of the innovator. This is the correct way to interpret general population because it is not an impressive claim to say that Michelangelo was significantly more skilled in art than the average citizen of Florence. The Zt1 model of innovation means to say that the innovator made a significant improvement in their field. Thus our population must be restricted to active members of the field in order to model the phenomenon that we’re addressing. You wouldn’t argue that LeBron James is the GOAT by comparing him to an actuary playing pick-up after work.

H_{0}: The innovator or innovation is not significantly better than the competition.

What do I mean by “not significantly better”? I really mean that at least one of the innovator’s contemporaries could be considered as skillful or more skillful than the innovator. This implies that in a plausible alternate universe, a competitor could be switched out for the innovator and attain a comparable legacy. We make this concrete by specifying how far from the mean of the talent distribution the innovator needs to be in order to conclude that they were truly irreplaceable. For the sake of discussing the general framework, I’ll use the three sigma rule as a conservative rule of thumb.

This should really be adjusted depending on the size of the population (number of competitors to the innovation) and the strength of the claim of greatness (Do you agree that in order to be considered as great as Michelangelo you must be once-in-a-lifetime? Maybe it’s once every two lifetimes…maybe it’s once ever. How rare must you be to be as great as Michelangelo?). In any case, the three sigma rule is the rule I’m going to adopt because it is just about the weakest claim I can make that models the Zt1 conditions. Three-sigma corresponds to roughly 99.9% of the competitors being worse than Michelangelo. If this holds, then for 1,000 artists in Renaissance Italy we would expect only about one artist at or above that level: Michelangelo himself. The more artists there were, the more sigmas we need to maintain that he was unique among his peers. Again I’m going to try to avoid weak-manning and targeting the hyperbolic Vasari, who is likely defending a 10+ sigma alternate hypothesis.
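To make the sigma-to-rarity conversion concrete, here is a minimal sketch that assumes skill is exactly normally distributed and uses a one-sided tail (only being better than the mean counts). The function names are illustrative, not part of the original argument. Under these assumptions the exact three-sigma tail is about 0.00135, which the text rounds to 0.001.

```python
import math

def upper_tail(k_sigma: float) -> float:
    """P(Z > k_sigma) for a standard normal variable Z (one-sided tail)."""
    return 0.5 * math.erfc(k_sigma / math.sqrt(2))

def expected_peers_above(k_sigma: float, population: int) -> float:
    """Expected number of competitors beyond k_sigma in a field of this size."""
    return population * upper_tail(k_sigma)

# How rare is each threshold, and how many such artists would a field
# of 1,000 competitors be expected to contain?
for k in (2, 3, 4, 5):
    print(f"{k} sigma: tail = {upper_tail(k):.2e}, "
          f"expected among 1,000 = {expected_peers_above(k, 1000):.3f}")
```

Raising the population argument shows why a larger field demands more sigmas before anyone can be called unique.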

The alternate hypothesis H is the negation of the null hypothesis – that Michelangelo’s skill as an artist is more than three sigma above the mean of his competitors. From this it follows that (given there were 1,000 artists in Italy during his lifetime) Michelangelo was at least a once in a lifetime event, while allowing that perhaps he was a once-ever event.

These hypotheses line up with the Zt1 conditions of innovation. A unique innovation is one that only happens once in the lifetime of the innovators. There is only one in the population. This takes the field from zero to one. We have a separate issue of whether an innovation is 10x better in some sense than what’s currently being done – this raises the question of how you apply multiplication to whatever the x-axis of our normal distribution is. I won’t get into that, but as far as the Zt1 conditions go, this framing seems to line up with Thiel’s language.

The General Case for Incrementalism: A Bayesian Analysis

So we’ve defined our two hypotheses – the incrementalist model H_{0} and the Zt1 model H. We also have a body of evidence E which encompasses all observations about Michelangelo and his peers, their works and the impact of their works, facts about Renaissance Italy, and really anything that can be used to support either claim. E is shared among both hypotheses. Like a good rationalist, I appeal to Bayes’ Theorem when applicable.

 P(H|E)=\frac{P(E|H)P(H)}{P(E)}

And similarly for the null hypothesis,

 P(H_{0}|E)=\frac{P(E|H_{0})P(H_{0})}{P(E)}

P(H) and P(H_{0}) are given by construction. For the three-sigma interpretation, P(H_{0}) is 0.999 and P(H) is 0.001. If you think your innovator is more special than one in one thousand, you need more sigmas: P(H_{0}) gets higher and P(H) gets lower. To compare the two hypotheses, we can look at the ratio of their probabilities given the evidence:

 \frac{P(H|E)}{P(H_{0}|E)} = \frac{P(E|H)P(H)}{P(E|H_{0})P(H_{0})}

If this ratio is greater than one, the Zt1 model is more likely to be correct than my model. If this ratio is less than one, my model is more likely to be correct. We have our knowns so far:

 P(H_{0}) = 0.999

 P(H) = 0.001

 P(E|H) = 1

Why am I going with P(E|H) = 1? If you assume that an innovator is at least a once-in-a-lifetime event, or make a stronger claim, then the probability that the innovator will produce innovations deserving of the esteem he or she has achieved is probably close to 1. Equating this probability with 1 is a conservative assumption that gives the benefit of the doubt to the Zt1 model of innovation.

The core of my argument, and the place where discussions about these innovators normally lead, rests in the evaluation of P(E|H_{0}). P(E|H_{0}) is the degree of belief that one of the competitors of the innovator could have produced the innovations presented in E – that the innovator is not of significantly higher merit than his contemporaries.

In order to go from the most contentious part of the discussion, P(E|H_{0}), to the conclusions, P(H|E) and P(H_{0}|E), we must apply Bayes’ Theorem – which means multiplying by the Bayesian priors P(H) and P(H_{0}). This is a step people often skip – that omission is the base rate fallacy. It’s why Bayesian rationality is an important movement – recognizing this fallacy helps us avoid incorrect models of the universe.

We have

 \frac{P(H|E)}{P(H_{0}|E)} = \frac{P(E|H)P(H)}{P(E|H_{0})P(H_{0})}

 \frac{P(H|E)}{P(H_{0}|E)} = \frac{(1)(0.001)}{P(E|H_{0})(0.999)}

 \frac{P(H|E)}{P(H_{0}|E)} = \frac{1}{999 \, P(E|H_{0})}

In order to consider the alternate hypothesis more likely than the null, we must convince ourselves that \frac{P(H|E)}{P(H_{0}|E)} > 1. For this to hold, P(E|H_{0}), which is our degree of belief that the innovator could be swapped out with a peer, must be less than 1/999, or about 0.001.

Now we see why it’s so important to consider Bayes’ Theorem in detail. Normally when discussing this model, people get to this point and think “it seems more likely than not that Michelangelo was unique in his time, so he must have had significant artistic merit compared to his peers”. This is blatantly wrong. You can’t just conclude that it’s “more likely than not”. With the Bayesian interpretation of probability, the evidence has to be about 1,000 times more likely if Michelangelo was uniquely meritorious than if he was not. This is the standard that must be met for an innovator to satisfy the Zt1 conditions of innovation.
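The ratio above is easy to play with numerically. The following is a minimal sketch using the priors from the three-sigma construction (P(H) = 0.001, P(H_{0}) = 0.999) and the generous P(E|H) = 1. The function names and the sample values of P(E|H_{0}) are illustrative assumptions, not estimates for any real innovator.

```python
def posterior_odds(p_e_given_h: float, p_e_given_h0: float,
                   prior_h: float = 0.001) -> float:
    """Return P(H|E) / P(H0|E). P(E) cancels, so it is never needed."""
    prior_h0 = 1.0 - prior_h
    return (p_e_given_h * prior_h) / (p_e_given_h0 * prior_h0)

def break_even_likelihood(p_e_given_h: float = 1.0,
                          prior_h: float = 0.001) -> float:
    """Value of P(E|H0) at which the two hypotheses are equally probable."""
    return p_e_given_h * prior_h / (1.0 - prior_h)

print(f"break-even P(E|H0) = {break_even_likelihood():.6f}")

# Hypothetical degrees of belief that a peer could have done the same.
for p in (0.5, 0.1, 0.01, 0.001, 0.0005):
    odds = posterior_odds(1.0, p)
    print(f"P(E|H0) = {p}: posterior odds for Zt1 = {odds:.3f}")
```

Even a 1% belief that a peer could have produced the same body of work leaves the odds well below one.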

Individual Examples

Harry was wondering if he could even get a Bayesian calculation out of this. Of course, the point of a subjective Bayesian calculation wasn’t that, after you made up a bunch of numbers, multiplying them out would give you an exactly right answer. The real point was that the process of making up numbers would force you to tally all the relevant facts and weigh all the relative probabilities. Like realizing, as soon as you actually thought about the probability of the Dark Mark not fading if You-Know-Who was dead, that the probability wasn’t low enough for the observation to count as strong evidence. One version of the process was to tally hypotheses and list out evidence, make up all the numbers, do the calculation, and then throw out the final answer and go with your brain’s gut feeling after you’d forced it to really weigh everything.

– Eliezer Yudkowsky in Harry Potter and the Methods of Rationality Chapter 86

Now we have to evaluate P(E|H_{0}). This is the most important and most subjective portion of the argument (although you could argue any of these numbers are subjective, but that’s less contentious than what I’m about to get into). The evaluation dives into the details of the innovator and his competitors. It requires knowledge of the works in the field at the time of the innovator’s life, knowledge of the works of the innovator, and knowledge of how these innovations were leveraged and through what means. You have to gather all the evidence E that you can to address this claim. I will only be making claims on rough order of magnitude estimates, but I hope to convey that the claims I make are absolutely reasonable given the evidence at hand.

I have a post with my arguments for  P(E|H_{0})  – the degree of belief that another innovator could have established similar results – for Michelangelo. I think after reading this you can formulate your own conclusions for any innovator of choice. I have thoughts on covering more (Einstein, Jobs, Alexander the Great, the Beatles) but haven’t committed time to filling those out yet. If you’re honest with yourself and diligent in your research, I think you’ll find that most fail to meet the Zt1 conditions. I’m open to the notion that some innovators really were truly unique in their time, but we must realize exactly how hard it is to really justify that claim.

My goal with this argument is not to convince you that there are no real heroes, or to rob you of role models. I seek a model of innovation that is as accurate as possible. We should not overlook a number of cognitive biases that come into play with the formulation of the Zt1 model of innovation – anchoring on fame and influence, the illusory truth effect, fundamental attribution error, and perhaps most of all the base rate fallacy. I’m also playing around with the idea that there’s some fundamental human inclination to hero worship, although that’s for another time. We should also recognize there is incentive for established innovators to propagate the Zt1 model of innovation. When an innovation is believed to be a “zero to one” event, rather than a nudge above the competition, it’s that much easier to market and profit from it.

My aim is not to present a bleak view of the world in which only external forces determine who changes the world. I do believe in the importance of individual virtues – and I suggest that the best way to maximize your chances of making an impact is to recognize the true drivers of innovation.

Michelangelo from the Inside

 In which I claim Michelangelo was aiight.

Michelangelo was on top of the game during his lifetime, and his legacy seems to speak for itself. But how much of his success can we attribute to his unique skill as an artist? There are certainly other drivers – like leverage of high visibility commissions, and the explosion of the art world itself during the Renaissance. Instead of vaguely throwing our hands up and saying “yes, those matter, but you can’t deny he was talented”, I’d like to actually address the relative weights of these factors. This ties back to a larger conversation on the nature of innovation.

One of my larger goals is to address what innovation looks like from the inside. Michelangelo’s portfolio looks incredible to someone 500 years removed. The statue of David, St. Peter’s Basilica, the Sistine Chapel ceiling? They’re all beautiful and famous, not to mention cornerstones of achievement across sculpting, architecture, and fresco. And he didn’t even like painting. At this point, most people end their analysis. It’s self-evident! Michelangelo must have been remarkably talented in order to have built such a portfolio.

I’m not so sure. For starters, look at what else was happening in his time. At least two other artists from the same setting built comparable legacies: Leonardo da Vinci and Raphael Sanzio da Urbino. To complete the Teenage Mutant Ninja Turtle gang, Donatello was making history only a generation earlier. These artists – and other notables like Giambologna and Titian – are the people I would consider Michelangelo’s peers, and therefore his competitors. They saw it the same way. These artists, and many more remarkable artists, are relevant because in order to understand Michelangelo’s merits we must view them in the context of other artists at the time.

Central Italy was the Wall Street of medieval Europe. The Medici dominated the financial capital Florence, and these bankers in particular routed an unprecedented flow of capital into art education and projects – grand cathedrals, sculptures, and paintings. (There’s a larger point here on the unappreciated benefits of modern financial institutions.) Are you surprised that so many influential artists came out of this setting? Those individuals lived at the right point on the S-curve of their fields. It’s easy to produce novel improvements when you’re among the first people in a group to do something. These artists were among the first since the Classical period (hence ‘Renaissance’) to be backed by this much capital, and they had the advantage of significant technological advancements. Don’t sleep on the printing press.

This setting should temper how we view the relation between Michelangelo’s skill and his legacy. It would be anomalous for an artist now to achieve Michelangelo’s level of influence – but that’s because the fields of marble sculpture, painting, and architecture are saturated. We’re too far along on those S-curves. It’s not because no one approaches Michelangelo’s level of skill – actually, I bet there have been a ton of marble sculptors since Michelangelo that surpass his merits. I’ll go so far as to say that maybe there were some of those artists during his lifetime as well. So why is Michelangelo the go-to artistic genius? Why didn’t every artist in Florence get to that level?

Michelangelo was perhaps more artistically skilled than his contemporaries, but he was certainly better leveraged. Take a look at the political leverage of one of his most influential (and note: one of his earliest) works: the statue of David. David stood in the center of Florence, in front of the Palazzo Vecchio, as a symbol of the essence of the city.

Because of the nature of the hero it represented, the statue soon came to symbolize the defense of civil liberties embodied in the Republic of Florence, an independent city-state threatened on all sides by more powerful rival states and by the hegemony of the Medici family. The eyes of David, with a warning glare, were turned towards Rome. – Wikipedia

David is certainly well sculpted, but the placement of the statue was a bigger win for Michelangelo than any of its artistic merits. The amount of social capital this placement earned him should not be underestimated. A smart man – which Michelangelo certainly was – could leverage this attention for more valuable commissions (which he certainly did) without necessarily demonstrating a meaningful level of skill over his competitors. Being rich makes it easier to get richer, being famous makes it easier to get more famous, and having clients makes it easier to acquire clients. I’m not making the assertion that Michelangelo was less skilled than his competitors. I am denying a relation between outlying artistic skill and outlying fame. I’m saying that the order of magnitude difference between Michelangelo’s legacy and the legacy of an average Florentine artist should not be attributed to an order of magnitude difference in skill, if there was even a skill difference at all. Skill grows linearly. Legacy does not scale with skill – it scales with social capital, which grows exponentially.
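The “having clients makes it easier to acquire clients” dynamic can be illustrated with a toy simulation. This is a sketch under loud assumptions: every artist has identical skill, each starts with one commission, and each new commission goes to an artist with probability proportional to the commissions they already hold (a Polya-urn style rich-get-richer rule). The numbers of artists and commissions are made up for illustration and are not historical estimates.

```python
import random

def simulate_commissions(n_artists: int = 20, n_commissions: int = 1000,
                         seed: int = 0) -> list[int]:
    """Rich-get-richer sketch: identical artists, and each new commission
    is awarded with probability proportional to commissions already held."""
    rng = random.Random(seed)      # seeded so the run is reproducible
    counts = [1] * n_artists       # everyone starts with one commission
    for _ in range(n_commissions):
        winner = rng.choices(range(n_artists), weights=counts, k=1)[0]
        counts[winner] += 1
    return counts

counts = simulate_commissions()
mean = sum(counts) / len(counts)
print(f"mean = {mean:.1f}, max = {max(counts)}, min = {min(counts)}")
```

Any gap between the most and least successful artist in this model comes entirely from early luck compounding, since skill never enters the rule.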

None of this precludes Michelangelo being in another league than other artists. It does set the bar for supporting that claim very high – even higher than it already was. And before establishing Michelangelo was an especially remarkable artist, you need to at least examine works of his contemporaries. In painting, Leonardo da Vinci has The Last Supper, The Mona Lisa, and Adoration of the Magi. Raphael has the School of Athens. In sculpture Donatello (although a generation earlier) has Judith and Holofernes. Bandinelli has Hercules and Cacus and Giambologna has The Rape of the Sabines. If you want to validate the claim that many make – that Michelangelo is one of the greatest artists of all time – it’s not enough to show he was skilled, or even a little better than his peers. You have to show he was significantly more skilled, without relying on the fact that he gained a significant amount of influence on the art world.

[Images: Bandinelli’s Hercules and Cacus; Giambologna’s The Rape of the Sabine Women; Michelangelo’s David]

Which of these came from divine talent, and which from mere mortals?

This is a judgement call on your part. A priori it’s objectively harder to take the position that Michelangelo was extraordinary, because extraordinary claims require extraordinary evidence. I’m not an expert in art criticism, but I could not find any distinguishing features of David that made it obviously more skillfully sculpted than the other sculptures on display in Florence. I could be convinced that it was the best sculpture there, but I would be surprised to hear an argument that marks it far above Giambologna’s The Rape of the Sabine Women, for example. The same goes for Michelangelo’s The Last Judgement compared to da Vinci’s The Last Supper or Veronese’s The Wedding at Cana. The Sistine Chapel ceiling, Pieta, the dome of St. Peter’s Basilica…they are beautiful and awe-inspiring. So are most of the other works done by skilled artists of the period. The point is that it is clear that Michelangelo was definitely good, but it’s not clear that he was incredibly good when you look at what everyone else was able to do at the time. There’s not enough evidence that an average competitor in his time and place couldn’t have attained a similar legacy.