TEDxGoodenoughCollege

May 9th, 2014

Nick Werle – TEDxGoodenough College

How much has the nuclear meltdown in Japan cost? What about the global financial crisis of 2008? These are hard questions, but given the chance, I think most would agree that both the nuclear power station and the Too Big To Fail banks should have spent more on safety before they caused disasters, whatever the upfront costs. Unfortunately, the risk management philosophies that are meant to protect us from catastrophe leave us vulnerable, largely because they are caught up in a metaphor that confuses safety in the real world with winning at the betting table. I want to explain why thinking about games of chance when trying to understand probability is a mistake.

Today, much of our mathematical and technical expertise works to tame the uncertainty of the future and give us an idea of the risks we face. In general, we know the types of threats, and we know some defensive measures that might protect us. But this protection is expensive. There are up-front costs of defensive infrastructure, such as levees to fortify a city against flooding. And there are also costs borne over time, as people forgo the most profitable ways of doing things, like prohibiting new houses along a riverbank.

The discipline of risk management has sprung up to help answer two questions that are fundamental to our safety: How often will the river flood? and How high might the floodwaters be at their worst? Probabilistic thinking can help society decide how much to spend in defense. For a given threat, such as a flood, an earthquake, or a stock market crash, the decision usually comes down to choosing the worst disaster our defenses should be designed to repel. After all, if a levee can hold back a 5 meter flood, it should have no problem with a 3 meter flood.

Modern society is full of complex systems; what connects them is the high cost society bears when they fail. Risk management is often seen as a scientific approach to organizing society and a neutral, objective way to determine how much safety we need. But I want to argue that when we think about the uncertain future through the lens of risk management, we misunderstand the meaning of probability. **In short, we treat the real world too much like a game of chance, and in doing so, we make three big mistakes: we try to play the odds, we think of the parts but forget the whole, and we think we know just how bad things can get**.

Now, remember when you first learned about probability. If your primary school maths teachers were anything like mine, then you probably did some experiments. You might have rolled some dice and flipped some coins, counting the number of times each outcome occurred. These experiments help us internalize the idea that if we flip a coin 100 times or 1000 times, very close to half of those flips will land on heads, and so we say that the probability of heads is 1/2. This idea is known as the frequentist interpretation of probability. It makes a lot of sense in these simple games, where we can repeat the same experiment many times and count how often each result appears. If you’re in the business of gambling, the frequentist version of probability works great, because if you play enough games and keep the odds in your favor, you are very, very likely to win.
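
To make the frequentist idea concrete, here is a minimal Python sketch of the classroom experiment. It assumes a perfectly fair coin simulated by a pseudo-random number generator; the function name `heads_fraction` and the fixed seed are choices made for this illustration only.

```python
import random


def heads_fraction(num_flips, seed=0):
    """Simulate fair coin flips and return the observed fraction of heads."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips


# The observed fraction tends to settle near 1/2 as the number of flips grows.
for n in (100, 1_000, 100_000):
    print(f"{n:>7} flips: fraction of heads = {heads_fraction(n):.4f}")
```

The point of the sketch is that this only works because the same experiment can be repeated as often as we like.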

Professional gamblers, for instance, don’t come close to winning every bet. They are professionals because they recognize that a wager with a 53% chance of winning is good enough to break even. Few have a long-term winning rate over 58%. They may lose a lot along the way, but the wins more than cancel out the losses. Over the long run, luck has very little to do with gamblers’ success.
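
The break-even figure can be checked with a short calculation. As an assumption for illustration (the talk does not state the odds), suppose the bettor risks 11 units to win 10, a common pricing for even-looking bets, and wins with probability $p$. The expected profit per bet is zero when:

$$
10\,p - 11\,(1 - p) = 0 \quad\Longrightarrow\quad p = \frac{11}{21} \approx 0.524
$$

Under that assumption, a 53% win rate sits just above break-even, and a 58% win rate yields an expected profit of $10(0.58) - 11(0.42) = 1.18$ units per bet.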

As it turns out, financial markets usually work this way too. Professional investors don’t need to win on every trade; they just need to win more than they lose and avoid going bankrupt in the process.

But the real world is not a game. We don’t know all the rules, and we can’t treat the real world like a repeatable experiment, counting the number of real worlds in which a disaster does or does not occur. **This is our first mistake.** In defending society from threats in the uncertain future, we need to use probability as a guide to subjective decision making, not gambling. You can’t play the long-term odds with the safety of a nuclear power station. There is no amount of time operating without an accident that will compensate for a single meltdown. Like many complex systems in the real world, nuclear power stations are too important, too dangerous to fail. And the frequentist version of probability that works for gamblers can’t help us.

One reason dice are mathematically well behaved is that their results are mutually independent. This means we can decompose a complex dice game into problems of individual dice, and then add them up. We often apply this same independence assumption to real world problems, but in doing so, we make our **second mistake: Thinking only about parts but forgetting about the whole**. When faced with a complex system, like a nuclear power station, risk management typically breaks the problem into parts and calculates the probabilities of certain types of failures. But real world complexity defies this decomposition: In real life, elements of systems are deeply connected, not mutually independent. Treating each risk separately leaves us vulnerable to the biggest, most catastrophic forms of failure, even if we are protected from the little bumps along the way.
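
A minimal sketch of why the independence assumption matters, using made-up numbers rather than any real plant data. It compares the chance that several backup systems all fail in the same year when they are truly independent with the chance when a shared shock (a hypothetical "common cause") can knock them all out at once. Each system's individual failure probability is held the same in both cases; the function names and the parameter `shock` are assumptions introduced here.

```python
def joint_failure_independent(p, n):
    """Probability that all n systems fail, if failures are independent."""
    return p ** n


def joint_failure_common_cause(p, n, shock):
    """Probability that all n systems fail when a shared shock can disable
    every system at once.

    p     : each system's overall (marginal) failure probability
    shock : probability of the shared shock; must not exceed p
    """
    if not 0 <= shock <= p:
        raise ValueError("shock probability must lie between 0 and p")
    # Residual independent failure rate, chosen so each system's
    # overall failure probability is still exactly p.
    residual = (p - shock) / (1 - shock)
    return shock + (1 - shock) * residual ** n


# Illustrative numbers only: three defenses, each failing 1% of the time.
print(joint_failure_independent(0.01, 3))
print(joint_failure_common_cause(0.01, 3, shock=0.005))
```

With these illustrative inputs, the independent calculation gives about one in a million, while the common-cause version gives roughly one in two hundred, thousands of times higher, even though each part looks equally reliable on its own.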

In the case of Japan’s Fukushima Daiichi nuclear power station, for example, its owner TEPCO considered a number of dangerous threats to its plant, including earthquakes, tsunamis, and failure of the local power grid. To address each threat, TEPCO installed defenses and backup systems: automatic shutoff routines triggered by earthquakes; seawalls to defend the plant from tsunamis; and backup generators that could power cooling equipment to prevent core meltdown if the station lost external power. Individually, these defensive systems may have worked all right, but when trouble came, all of these threats materialized at once, and the individual backup systems failed to act together to save the plant.

On March 11, 2011, a 9.0 magnitude earthquake struck off the coast, AND a 15 meter tsunami struck the plant, AND connections to the local electricity network failed, AND seawater flooded the backup diesel generators foolishly installed in basements, AND the flooding disabled a second set of backup generators that served as the last line of defense against a meltdown. The plant was designed to withstand any one of these problems individually. But by assuming that all of these risks were independent, TEPCO not only ignored that these problems were likely to happen simultaneously, they also ignored that the automatic response to one threat could make the plant more vulnerable to another. The nuclear disaster, which continues to unfold today, can be directly traced to the backup systems designed to deal with risks one by one, as if the threats were handfuls of dice rolled again and again.

Although the Fukushima Daiichi meltdowns were triggered by an earthquake, tragically, this is a human disaster at its root. It’s not true that no one had thought of these combined threats to the plant, or realized that tsunamis often follow earthquakes along the Japanese coast. There were numerous reports in the years before the earthquake, warning TEPCO and the Japanese government of these vulnerabilities, but the company avoided making the necessary but expensive improvements. No one could have stopped the earthquake and the tsunami. But other nuclear power stations along the coast that had invested in upgrades – including another TEPCO plant only 7 miles away – survived the threat without creating a disaster.

Now, let’s turn to financial markets, where we’ll see another disaster caused more by human failure than by some unforeseeable force of nature. There were many causes of the 2008 financial crisis, but I want to focus on some very faulty independence assumptions like those that doomed Fukushima Daiichi. Building on the idea that property markets are local, financial economists argued that the probabilities of homeowners in different places failing to pay back their mortgages are independent of each other. Thus, investment banks in The City and on Wall Street told the world they could eliminate risk by packaging thousands of mortgages together and selling off the pieces to pension funds, mutual funds, hedge funds, and each other. But in reality, these homeowners ARE connected, and the new system of mortgage finance had made housing markets more interlinked than ever before. In 2007, housing prices and job prospects declined throughout the global economy, making it difficult for borrowers all over to make their mortgage payments. As the Too Big To Fail banks drowned in an alphabet soup of mortgage debt, taxpayers around the globe were asked to bail them out. Hopefully, we’ve learned that this assumption of statistical independence and the general idea that banks can eliminate risk are both false! Mathematically, this failure appeared as though the dice in a game of chance were conspiring to bring down the global economy.

This brings us to **the third mistake** we make **when we try to use probability to understand the real world as though it were a game of chance: We think we know exactly how bad things can get**. In a game of chance, like dice or blackjack, we have very detailed knowledge about the underlying processes that generate the randomness we observe. We know both how many cards a standard deck contains and how many of each type there are. With a six-sided die, we know that no one will roll a 5,000, no matter how long she tries. By understanding these limits to what might happen in a game, we can construct a mathematical equation, called a probability distribution, that reflects exactly, or almost exactly, the underlying process generating the randomness we see.

But in the real world, we don’t know the true processes causing earthquakes, floods, or stock market crashes, so we can’t build our probability models from the bottom up. Instead, we tend to take convenient, readymade probability distributions, like a bell curve, and “fit” them to the data we observe. In doing so, we are implicitly making some very strong assumptions about the processes driving change in the real world. Some of these fits are better than others, but we should never confuse something that is essentially an advanced method of counting with a deep understanding of how a complex system works. Inevitably, the bulk of the data used to fit a probability model will be very close to average, since market crashes and devastating tsunamis are very rare. Thus, the model will be tuned to describe the system’s normal behavior, but not its rare and extreme catastrophes, like the global financial crisis or Fukushima. Indeed, statisticians have recently shown that it is impossible to estimate accurately either the frequency or the intensity of rare and extreme events. Unfortunately, it’s these extreme events that cause the disasters that cost society so much.
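
The effect of fitting a bell curve to data that is not bell-shaped can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a model of any real market: the "real world" here is stood in for by a heavy-tailed Student-t distribution with 3 degrees of freedom, chosen only because it produces occasional extreme values. The helper names `student_t_sample` and `tail_comparison` are invented for this sketch.

```python
import math
import random
from statistics import NormalDist


def student_t_sample(rng, df):
    """Draw one heavy-tailed sample: a normal divided by a scaled chi-square."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(df / 2.0, 2.0)  # chi-square with df degrees of freedom
    return z / math.sqrt(chi2 / df)


def tail_comparison(n=100_000, df=3, k=5.0, seed=1):
    """Fit a bell curve to heavy-tailed data, then compare how often the fitted
    curve says a k-standard-deviation drop should occur with how often such
    drops actually appear in the data."""
    rng = random.Random(seed)
    data = [student_t_sample(rng, df) for _ in range(n)]
    fitted = NormalDist.from_samples(data)  # the "readymade" bell curve
    threshold = fitted.mean - k * fitted.stdev
    predicted = fitted.cdf(threshold)       # what the fitted model expects
    observed = sum(x < threshold for x in data) / n
    return predicted, observed


predicted, observed = tail_comparison()
print(f"bell curve predicts: {predicted:.2e}")
print(f"data actually shows: {observed:.2e}")
```

In this toy setting the fitted curve describes the middle of the data reasonably well, yet it understates the frequency of extreme drops by orders of magnitude.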

We should be very wary of people saying they have eliminated the risk of disasters and they know exactly how bad things can get, because these claims are impossible. Usually, these overconfident people are trying to convince us that they shouldn’t be required to spend money on some profit-reducing safety buffers. We sometimes trust these people because we rely on the idea that they have an incentive to look out for their own survival. But what we don’t consider is that when disaster eventually strikes, these same people will call us for help, claiming that the catastrophe was unpredictable and so they are not responsible for the damage.

This is exactly what we all saw in the wake of the 2008 global financial crisis, when massive market fluctuations brought century-old banks to their knees. Instead of building up supplies of capital during boom years to absorb losses during a bust – the financial equivalent of building a flood barrier that can be raised in a storm – the financial institutions took on massive amounts of debt to boost their trading profits through extremely high leverage. This is the equivalent of building a new palace at the water’s edge during low tide.

No one knew how the global financial crisis would play out, but many people were warning of massive losses to come. There was time to prepare. However, when the faulty mortgage bonds I just told you about became worthless, throwing the markets into turmoil, the bank directors claimed they were victims of astronomically bad luck. Famously, the Chief Financial Officer of one Wall Street investment bank explained his firm’s massive losses by telling the Financial Times, “We were seeing things that were 25-standard deviation moves, several days in a row.” To put this into perspective, a 25-standard deviation event should be so rare that if the markets had been open every single day for the 14 billion years since the universe began, it would still be nearly impossible for this event to have happened, even once. And the CFO claimed this virtually impossible event happened several times in a single week, which is unfathomably rarer still. Bad luck had nothing to do with the crash. Instead of arguing that the world is wrong because it doesn’t match the models’ predictions, these risk managers should throw out their models. If the behavior of complex systems like financial markets is so difficult to predict, then maybe our disaster preparation strategy shouldn’t rely so heavily on predicting exactly what will happen.
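
The 25-standard-deviation claim can be checked with a rough calculation. The sketch below assumes daily market moves follow a bell curve (the assumption built into the models being criticized) and, as in the thought experiment above, that markets were open every day for 14 billion years. The function name `normal_upper_tail` is introduced here for illustration.

```python
import math


def normal_upper_tail(k):
    """P(Z > k) for a standard normal variable Z."""
    return 0.5 * math.erfc(k / math.sqrt(2.0))


# Assumption from the talk's thought experiment: one trading day per
# calendar day for the whole age of the universe.
DAYS_SINCE_BIG_BANG = 14e9 * 365.25

p_daily = normal_upper_tail(25)
expected_events = p_daily * DAYS_SINCE_BIG_BANG

print(f"chance of a 25-sigma move on a given day: {p_daily:.2e}")
print(f"expected number since the universe began: {expected_events:.2e}")
```

Under these assumptions the daily probability is on the order of 10^-138, so even trillions of trading days leave the expected count indistinguishable from zero. The sensible conclusion is that the bell-curve model was wrong, not that the bank was unlucky.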

We’ve seen again and again how our risk models systematically underestimate both the severity and the frequency of big disasters. When we outsource to risk managers decisions about how much safety we need, we will end up under-protected and overexposed. My suggestion is that we should pay far more attention to how much damage we will suffer should a disaster strike than argue about exactly how often it will happen.

Preventing disasters is not a game. We shouldn’t think of it as one.