TLDR: Thinking Fast And Slow

Mac Scanlan
13 min read · May 10, 2022
Thinking Fast and Slow. Photo by Monica Sauro on Unsplash

Human beings are fundamentally irrational. No matter what Aristotle may tell you, the majority of our decisions are made quickly, heuristically, and without anything resembling a formal proof.

Amazon is awash with books whose opening chapters are dedicated to that thesis. Their arguments vary widely, ranging from academic studies to florid anecdotes to literary-tier thought experiments, all of which describe reasonable people doing things which feel right in the moment, then lead to… uh… sub-optimal results. The majority of these books are advice manuals, describing either ways to counteract your own irrationality or, more insidiously, ways to take advantage of somebody else’s. Thinking Fast and Slow by Daniel Kahneman does neither. Instead, it seeks to define our irrationality. It attempts to understand the structure of human decision making, providing not just a curated list of biases and heuristics, but also explanations in terms of a simple and powerful framework. In this TLDR, I’ll describe that framework and make the case that Kahneman’s book is a phenomenal starting point for understanding the mind.

System 1 and System 2

The human mind is not unitary. Like any complex system, it is composed of a number of interlocking components which can be divided and described at many layers of abstraction. Here we will focus on the conscious mind — the world of thoughts and ideas to which we have access. It consists of a number of components and specialized hardware which can be grouped into two interconnected subsystems, which Kahneman labels System 1 and System 2. The act of cognition consists of an endless, looping interaction between the two as they attempt to make sense of the world both around and inside them.

System 2 is what we tend to think of as the rational mind. It is directed, precise, and capable of performing essentially arbitrary operations on data. When we add two numbers or hold a set of items in memory, that’s System 2. However, it is also fairly weak, and consequently, very lazy. Try remembering the ten digits associated with an American-style phone number. Now try putting them in numerical order without writing them down. Notice the strain? If not, try multiplying two phone numbers together. The effort you experience is called ‘cognitive load’. While these operations are trivial for a computer, they are slow and difficult for a human being, and the brain would prefer to avoid them whenever possible. If there is a better way to get a passable result without invoking System 2, it will take it. Enter System 1.

The majority of conscious (or semi-conscious?) data processing takes place within a vast associative network — System 1. It is powerful and automatic, generating links between data points and comparing them to learned information about the world. It is responsible for constructing narratives, detecting anomalies, and piecing together most of the cause-effect relationships we take for granted in the world around us. It is also gullible and prone to naive mistakes, as its metric for truth is not rigorous proof, but rather internal consistency.

The interaction between System 1 and System 2 runs roughly as follows. New data enters through the sensory channels and is fed into System 1’s associative network, where it is integrated into that network’s representation of the world. The network comes with a set of expectations, or ‘norms’, which describe how the world is supposed to behave. If the data doesn’t violate those norms (or trip a watch that System 2 has set in advance), integration simply completes. Otherwise, System 2 is activated to perform the more difficult mental operations required to make sense of the data — either by manipulating it consciously, or by directing the System 1 hardware to generate new narratives and integrations as part of the ongoing cognitive process. This dialogue continuously generates new narratives and patterns, and it forms the basis of most of our thoughts.
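For the programmers in the audience, that loop can be caricatured in a few lines of Python. To be clear, the function names, the ‘norm check’, and the structure below are my own illustrative stand-ins for the framework as described — not a formal model from the book.

```python
# A toy caricature of the System 1 / System 2 loop described above.
# The names and the "norm check" are illustrative stand-ins, not Kahneman's model.

def system_1(percept, worldview):
    """Fast and automatic: file the percept into the associative network
    and report whether it violates the network's expectations ('norms')."""
    worldview.append(percept)
    return not percept.get("fits_norms", True)  # True means "surprising"

def system_2(percept):
    """Slow and effortful: deliberately rework the story until it fits."""
    percept["explanation"] = "constructed with effort"  # stand-in for real reasoning

def perceive(percept, worldview):
    if system_1(percept, worldview):  # norms violated...
        system_2(percept)             # ...so the lazy System 2 is finally engaged

worldview = []
perceive({"event": "the sun rose", "fits_norms": True}, worldview)
perceive({"event": "the cat spoke", "fits_norms": False}, worldview)
print(worldview)  # only the surprising event carries an effortful explanation
```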

In reality, System 2 is not engaged often. The majority of reality fits our narratives, and we don’t think anything of it. However, System 1 is limited to associations and narrative links, so when a genuinely complex question arises, it finds itself at a loss. Attempting to answer such questions without invoking System 2 leads to biases, and because of that system’s laziness, our decisions are riddled with them.

Biases

System 1’s biases can be grouped into three broad categories — heuristic, narrative, and utility. Heuristic biases are caused by System 1’s inability to handle complexity — asked a complex question, it finds a simple question which can serve as a proxy, then answers that question instead. Narrative biases stem from System 1’s inability to think statistically — it sees patterns where there are none, turning statistical noise into narratives. Finally, utility biases are decision-making biases which cause deviations from pure decision theory — cases in which System 1 makes logically inconsistent assertions about its own preferences. This section contains a list of named biases ordered by category. Note that for the sake of my own sanity, I’ve left out references to the supporting studies — for those, you’ll actually need to read the book.

Heuristic Biases

Heuristic biases arise from the fact that some queries are simply too difficult for System 1 to handle. Take the question “is politician X trustworthy?” Your exposure to politician X — unless you are a friend, relative, or voyeur — is likely mediated through so many layers of media filter that you have exactly zero knowledge with which to answer that question. What you do know, however, is that X served with distinction in a previous war, has no hard-to-dismiss scandals, and generally fits your beliefs. Thus, your System 1 substitutes the questions “is X patriotic”, “is X not a total scumbag”, and “does X seem like part of my tribe”. Those questions — especially the last one — are easy to answer, so unless you choose to perform an actual System 2 analysis of the data, you are likely to believe the affirmative.

The first named heuristic bias is the Halo Effect — if you like some characteristics of a thing, you’re likely to view its other characteristics in a positive light, and vice versa. This is why first impressions matter — if you find a person immediately likable, you tend to view their future actions through rose-tinted glasses. The same is true if you find them irritating. This is also why many caustic political ideologies take hold — if one of their tenets strikes a deep chord with followers, those followers are more likely to start limbering up for the mental gymnastics required to justify the less pleasant ones.

The next heuristic bias is What You See Is All There Is, shortened (at what cost?) to WYSIATI, for which everyone should have their own pronunciation. WYSIATI is the construction of an opinion from the available evidence, even if that evidence is partial or sparse. System 1 does not care whether the story is complete, only whether it is consistent. It is the reason point of view is so important — when a story is presented from only one point of view, the actions and motivations of the other characters are simplified or ignored. It is also why both sides in a court of law must be allowed to argue their case, rather than simply presenting raw factual evidence.

Next is the Availability Heuristic: the importance of an item is proxied by the ease with which it is recalled. This is why it is easy to assume that politicians are more likely to be involved in romantic scandals than engineers (though perhaps not the only reason) — their scandals show up in the media, meaning it’s easier to recall instances of them. This is also why we often blame complex social problems on the few simple issues up for national debate at the time.

Fourth is the Conjunction Fallacy, which is less of a heuristic and more of a direct failure in reasoning. The example given in the book is of a hypothetical woman named Linda, who spends most of her college years and early adulthood involved with pro-social-justice organizations. It asks people which is more probable: that Linda is a bank teller, or that she is both a bank teller and a feminist. Interestingly, the latter is often deemed more probable, despite representing a proper subset of the former, simply because it is easier to imagine given the background knowledge.
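To see why the “feminist bank teller” answer can never be right, it helps to write the probabilities down. Here is a minimal sketch; the specific numbers are invented for illustration and are not from the book.

```python
# The conjunction rule with made-up numbers for "Linda". The probabilities
# below are illustrative assumptions, not figures from the book.

p_teller = 0.05                  # P(Linda is a bank teller)
p_feminist_given_teller = 0.60   # P(Linda is a feminist, given she is a teller)

# P(teller AND feminist) = P(teller) * P(feminist | teller).
# Since the conditional probability is at most 1, the conjunction can
# never be more probable than "bank teller" alone.
p_both = p_teller * p_feminist_given_teller

print(f"P(bank teller)              = {p_teller:.3f}")   # 0.050
print(f"P(bank teller AND feminist) = {p_both:.3f}")     # 0.030
assert p_both <= p_teller
```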

Finally comes the Anchoring Effect. Here, we are asked to estimate a number, whether it be the number of judges in all federal courts or how much we should pay for a car. The System 1 method involves finding a number which sounds reasonable, then adjusting up or down based on situation-specific considerations. This is a source of error, because the process can easily be “seeded” with an arbitrary number suggested beforehand, onto which System 1 will latch. This is why — at least psychologically — it is beneficial to make the first offer in a negotiation.

Narrative Biases

Narrative biases arise from the fact that System 1 tends to assign narratives and volitions where they don’t belong. This primarily takes the form of simplistic cause/effect relationships grafted onto complex and chaotic systems, though in sociological contexts, it often involves positing the existence of willful elements pulling the strings. It is also what leaves us blindsided by Taleb’s Black Swans.

The first named narrative bias is Regression To The Mean. Ask yourself: why is it that, in America, the counties with the lowest incidence of heart attacks are rural? Is it the bucolic clean air, the active lifestyle? Alternatively, why are the counties with the greatest incidence of heart attacks also rural? Is it lack of access to good healthcare, or unhealthy dietary norms? The answer is neither. Rural counties are both numerous and sparsely populated, so random variation alone produces more extreme rates — high and low — than it does in large urban counties. System 1, however, is bad with statistics and good with narratives. It naturally attempts to assign a cause to these extreme values, even when they are merely the product of small sample sizes. Another, more immediately accessible example is that of the MVP athlete. Why do athletes who perform best in one season tend to perform disappointingly in the next — not below average, but not at the elite level their MVP months led us to expect? Usually because their spectacular performance owed more to luck than to extra skill, and they simply regress to the mean. Once again, System 1 doesn’t like such regression, and insists on a pattern. It feels silly to bet against the team that won the Super Bowl last year, but it might be a good idea.
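If the county example still feels like it needs a cause, a quick simulation shows how far plain chance goes. In the sketch below every county — rural or urban — shares exactly the same underlying heart-attack rate; the populations and the rate itself are invented numbers, chosen only to make the contrast visible.

```python
# Why small (rural) counties dominate both extremes of an observed rate.
# Every county shares the SAME underlying rate; the county counts,
# populations, and the rate itself are invented for illustration.
import numpy as np

rng = np.random.default_rng(0)
TRUE_RATE = 0.01  # identical underlying annual rate for every county

rural_pops = rng.integers(500, 2_000, size=400)          # many small counties
urban_pops = rng.integers(100_000, 1_000_000, size=50)   # few large counties
counties = [("rural", int(p)) for p in rural_pops] + \
           [("urban", int(p)) for p in urban_pops]

# Observed rate = (random binomial count of cases) / population.
# The variance of that rate shrinks as the population grows.
observed = sorted((rng.binomial(pop, TRUE_RATE) / pop, kind)
                  for kind, pop in counties)

print("Counties with the lowest rates: ", [kind for _, kind in observed[:5]])
print("Counties with the highest rates:", [kind for _, kind in observed[-5:]])
# Both lists come back overwhelmingly 'rural' -- the extremes are a
# small-sample artifact, not evidence about rural life.
```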

Related to Regression to the Mean is the Law of Small Numbers. Typically, a statistically valid result requires a large quantity of data. This is to prevent noise from dominating the outcome — the odds of getting all heads on 6 coin flips are far higher than the odds of getting all heads on 60. Despite the existence of very precise math defining the required sample size, System 1 is quite happy to draw a conclusion from small amounts of data. It is far more likely to read six heads in a row as evidence of an unfair coin than as statistical noise. Anecdotal evidence (from the book, I promise) indicates that even experts frequently come to erroneous conclusions based on insufficient sample sizes.
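The coin-flip arithmetic is worth seeing once — a streak that means nothing in a small sample would be damning in a large one.

```python
# The arithmetic behind the coin-flip comparison.
p_six_heads = 0.5 ** 6     # about 1.6% -- roughly 1 run in 64 does this by chance
p_sixty_heads = 0.5 ** 60  # about 8.7e-19 -- effectively never by chance

print(f"P(6 heads in a row)  = {p_six_heads:.4f}   (1 in {2 ** 6})")
print(f"P(60 heads in a row) = {p_sixty_heads:.1e}  (1 in {2 ** 60:,})")
```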

This brings us to Expert Bias. In a series of studies cited in the book, simple formulas turned out to match or beat trained experts at predicting future outcomes. In one example, a simple formula based on a half dozen parameters — including the weather during the growing year — outperformed wine experts in predicting the future price of a vintage. In another, formulas were terrible at predicting the stock market — but so were the experts. It turns out that expertise is largely an exercise in associative memory, and it only applies when two conditions hold: the process must be sufficiently regular that previous experience provides valuable predictive data, and the expert must have spent enough time exposed to the process to pick up that experience.

The fourth narrative bias is the Optimism Bias — when planning a project, most people ignore the statistics associated with similar projects. It’s a classic case of “that couldn’t happen to me”. For example, the vast majority of startups fail, consuming far more capital than they ever produce. However, people still start companies, often without a fallback plan. We as a society should be thankful — this is the engine which has driven most of capitalism’s unparalleled dynamism — but it is built on the back of overoptimistic failures.

The final narrative bias is related to the Optimism Bias — it’s known as the Planning Fallacy. It occurs when people fail to account for everything that can go wrong with a project. Rather, we generate timelines based on everything running smoothly — perhaps with some buffers for friction and unexpected events — but ignore the often cataclysmic effects of Black Swans. If you’ve ever found yourself working on an ‘easy’ project at midnight before a deadline, this is why. We’ve all done it.

Utility Biases

Utility Biases are those in which decisions contradict pure decision theory. In decision theory, the optimal decision is typically the one with the highest expected value — i.e. the maximal sum of outcomes weighted by their probabilities of occurrence. The mind, however, tends to place undue emphasis on loss avoidance and certainty. The section of the book dedicated to utility biases is essentially a synopsis of Kahneman and Tversky’s 1979 paper on Prospect Theory (which has been added to my reading list), and can be thought of as a catalog of deviations from standard economics.

The most important utility bias — and the one from which essentially all the others spring — is Loss Aversion. Losses feel worse than equivalent gains. Given a choice between a certain outcome and a gamble with equal odds of winning or losing the same amount, people will therefore usually choose certainty; the gamble’s expected emotional payoff is negative, because the pain of the loss would outweigh the joy of the gain. This is why insurance companies exist — even though the expected monetary outcome of buying insurance is negative, the pain of a large loss drastically outweighs the utility of saving on the premium.
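A rough sketch of the arithmetic: take a fair 50/50 bet to win or lose $100. Its expected monetary value is zero, but once losses are weighted more heavily than gains it “feels” like a losing proposition. The 2x loss weight below is an assumption in the rough range Kahneman and Tversky report, not a precise constant.

```python
# Expected value vs. a loss-averse "felt" value for a fair 50/50 gamble.
# The 2x loss weight is an assumed, illustrative figure (roughly the range
# reported for loss aversion), not a precise constant from the book.
LOSS_WEIGHT = 2.0  # a $1 loss hurts about as much as a $2 gain pleases

def expected_value(outcomes):
    """Sum of outcomes weighted by their probabilities."""
    return sum(p * x for p, x in outcomes)

def felt_value(outcomes, loss_weight=LOSS_WEIGHT):
    """Same weighted sum, but losses are amplified by the loss-aversion weight."""
    return sum(p * (x if x >= 0 else loss_weight * x) for p, x in outcomes)

gamble = [(0.5, 100), (0.5, -100)]  # 50/50 chance to win or lose $100
decline = [(1.0, 0)]                # refuse the bet, keep what you have

print("Expected value of gamble:", expected_value(gamble))  #   0.0 -- a fair bet
print("Felt value of gamble:    ", felt_value(gamble))      # -50.0 -- feels like a loser
print("Felt value of declining: ", felt_value(decline))     #   0.0 -- certainty wins
```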

The second bias is Preference Reversal: the utilities associated with outcomes when considered separately are often different from those associated with those outcomes considered together. Considering options together activates System 2, which is forced to decide which is fundamentally better — one is essentially anchored against the other. Without a second option to anchor against, however, it is very easy to overstate or understate the utility of an outcome.

Finally, the Framing Effect states that people are far more likely to accept a risk when it is couched in terms of its potential gain than when it is couched in terms of its potential loss. Take a risky surgery with a 95% chance of success. This is far more acceptable to most people than the same surgery described as having a 5% chance of death. The second phrasing triggers our loss-avoidance infrastructure, while the first skirts it.

The Two Selves

After spending so long describing System 1 and System 2 — their differences, biases, and relevance to real life — the book concludes by briefly bringing up a second dichotomy: that between the Experiencing and Remembering Selves. The Experiencing Self is that of our lived reality. If you take a snapshot of your current state, including thoughts, emotions, and desires, the flux of that state over time defines your experience. The Remembering Self is more of a construct, which defines reality not as it is experienced, but as it is remembered.

Before dismissing the distinction as irrelevant, we should examine it in more detail. The Remembering Self follows two laws — the Peak-End Rule and Duration Neglect. The Peak-End Rule states that the remembered pain (or pleasure) of an experience is determined by the average of its most intense moment and its final moments. Duration Neglect states that how long the pain or pleasure lasted barely affects the memory at all — only the peak and the end count. Combined, these imply that people will remember a shorter experience with greater peak/end pleasure more fondly than a much longer one with lower peak/end pleasure, even if the integral of pleasure over time is substantially larger for the second. Situations like this expose a disconnect between the Experiencing and the Remembering Selves.
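A toy worked example makes the disconnect concrete. The per-minute pain scores below are invented, loosely patterned on the cold-water experiments described in the book: the longer procedure contains strictly more total pain, yet is remembered as the milder one.

```python
# Peak-End Rule and duration neglect on invented per-minute pain scores
# (0 = no pain, 10 = worst). The numbers are made up for illustration.
short_procedure = [6, 8, 8]           # 3 minutes, ends at its painful peak
long_procedure = [6, 8, 8, 5, 3, 2]   # same start, plus a milder tail

def total_pain(samples):
    """What the Experiencing Self lives through: pain summed over time."""
    return sum(samples)

def remembered_pain(samples):
    """What the Remembering Self keeps: the average of peak and end."""
    return (max(samples) + samples[-1]) / 2

for name, samples in [("short", short_procedure), ("long", long_procedure)]:
    print(f"{name}: total = {total_pain(samples)}, remembered = {remembered_pain(samples)}")
# short: total = 22, remembered = 8.0
# long:  total = 32, remembered = 5.0  -- more total pain, milder memory
```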

Additionally, the Remembering Self tends to be far more obsessed with narrative. It has trouble dealing with the messy chaos of reality, and tends to fill it with causes, effects, storylines, and most importantly — meaning. While the Experiencing Self muddles through an unpredictable maelstrom of random events, the Remembering Self goes back and applies narratives to them. Following the Peak-End Rule and Duration Neglect, it reassigns meaning to experiences and reassesses their benefits to the person as a whole. It easily forgets constant, low-grade pain, while continuously revisiting brief and traumatic experiences.

The difference is relevant for decision making. An experiment referenced in the book describes how most people — given the option of completely forgetting an experience once it is over — will treat the Experiencing Self, who must go through it, as little more than an acquaintance. They’ll be willing to accept much larger pain/reward tradeoffs if allowed to forget them than if forced to remember. While memory-optionality is not practical in the real world (barring insane quantities of liquor), we often treat our Experiencing Selves the same way. We frequently condemn ourselves to consistent low-grade pain in order to avoid a brief but higher-grade version. If we can remember this distinction and find ways to counteract it, the world of lived experience can become a lot more pleasant.

Conclusions

Thinking Fast And Slow serves as a manual to the human psyche, and should be treated as the 101 course of any self-development regimen. Not only does it provide a framework for evaluating our own decisions and biases, it also offers a mechanism for ensuring that we don’t bite off more than we can chew. By understanding the difference between the two systems, we can make more rational choices and form more nuanced opinions. By remembering the biases of the Remembering Self, we can avoid putting ourselves through too much misery in search of insubstantial goals, and instead focus on maximizing returns for the life we actually experience. Everyone should read this book. 10/10.

A final word. Kahneman does a very good job of not preaching. Despite the many references to irrationality and the utility of personal experience, he saves the coulds and shoulds for a few pages at the very end. This normative section essentially boils down to one axiom: whatever you’re focusing on — whether a worry, a goal, or an ideology — probably matters less than you think. So take some time to enjoy your life.

