Mr. Brown

A Review of Made to Stick & The Computer Science of Human Decisions

April 1, 2021

Computers are man-made creations that closely reflect our way of thinking. Neurons are replaced with electrical wiring, and our fleshy brain with the cold polished metal of a central processing unit, or CPU. It therefore comes as no surprise that many of the principles computer scientists apply to computers are modeled on concepts we use in our own lives. This opens a two-way street of information: we can advance our computers and simultaneously glean vital clues about our own lives. At one end of the street, Chip and Dan Heath illustrate, in their book Made to Stick, the principles that make ideas more likely to “stick” and go mainstream. They assert that ideas kept as simple as possible have the best chance of spreading and surviving the test of time. In Freakonomics, this same preference for simplicity appears in incentive structures, which succeed or fail depending on whether the right incentive is chosen. Although these read as very human problems, an astonishing similarity emerges between this preference for simplicity in our lives and the other end of the street: the way computers interpret data.

The first chapter of Made to Stick discusses the success Southwest Airlines achieved through its mission statement of being the cheapest airline out there. As Herb Kelleher, former CEO of Southwest Airlines, once told an employee, “I can teach you the secret to running this airline in thirty seconds. This is it: We are THE low-fare airline” (Heath 29). This simple motto has shaped the company’s complex business decisions for more than thirty years and guided it to what it is today: THE low-fare airline and one of the most profitable airlines by far. For employees, this phrase drives their daily decision-making. Other factors still matter, but they are ranked: “The central circle, the core, is ‘THE low-fare airline,’ but the very next circle might be ‘Have fun at work’” (Heath 30). Since the main message is to be “THE low-fare airline,” having fun is secondary to the company’s motto: a classic example of prioritization.

Authors Brian Christian and Tom Griffiths make their own case for simplicity and prioritization in their book, The Computer Science of Human Decisions. They describe a nine-factor formula, built from survey data, that models and predicts the level of happiness in marriages over time. Each factor is a variable that should, in theory, make the prediction more accurate, so every additional factor ought to help. In reality, however, every factor after the second reduced the program’s accuracy. In a sense, the program was “overfitted” to the data: it overemphasized the little things. The same dynamic applies to the Southwest Airlines employees: amid the chaos of daily business, it would have been impossible to pinpoint the most important consideration if they had been handed a long list of factors, just as the nine-factor model proved less effective than a two-factor one.
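The nine-factor failure can be reproduced in miniature. The sketch below uses invented toy data, not the authors’ marriage survey: it fits the same noisy, fundamentally one-factor data two ways, once with a simple line and once with a degree-nine polynomial that passes through every training point exactly. The flexible model has zero error on the points it saw, but a worse error on the points it didn’t, because it has memorized the noise.

```python
import random

random.seed(1)
xs = [i / 19 for i in range(20)]
ys = [2 * x + random.gauss(0, 0.2) for x in xs]  # the true relation uses one factor

xt, yt = xs[::2], ys[::2]    # training half
xv, yv = xs[1::2], ys[1::2]  # held-out half, for measuring real accuracy

def linear_fit_error():
    """One-factor model: ordinary least-squares line through the training half."""
    n = len(xt)
    mx, my = sum(xt) / n, sum(yt) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xt, yt))
             / sum((x - mx) ** 2 for x in xt))
    intercept = my - slope * mx
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xv, yv)) / len(xv)

def interpolant_error():
    """Nine-factor model: the degree-9 Lagrange polynomial through all ten
    training points -- zero training error, but it memorizes the noise."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xt, yt)):
            term = yi
            for j, xj in enumerate(xt):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return sum((p(x) - y) ** 2 for x, y in zip(xv, yv)) / len(xv)

# the "smarter" nine-factor model does worse on data it has never seen
```

The point mirrors the book’s finding: held-out error, not training error, is what the extra factors quietly make worse.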

One of the best ways to combat overcomplexity in computer systems is simply to penalize it. The nine-factor model could be trimmed to its ideal number of factors: not so many that it loses focus, but enough to capture the data. This process of penalizing complexity is known as regularization. One widely used form of it is the Lasso, and, just like a physical lasso, it acts to restrain the complexity of the model. This technique is the backbone of many predictive algorithms that juggle an extremely large number of variables and must prioritize heavily to arrive at conclusions that are not “overfitted.” Penalizing complexity is essentially what Chip and Dan Heath are describing when they analyze how simple ideas stick. In journalism, the pitfall of “burying the lead” arises when proper regularization isn’t applied to the text. As the Heaths put it, “A common mistake reporters make is that they get so steeped in the details that they fail to see the message's core…” (Heath 26). To avoid burying the lead, journalists use the inverted pyramid structure, in which, “After the lead, information is presented in decreasing order of importance” (Heath 31). Each successive piece of information is less likely to be read and therefore carries less weight; that, too, is regularization.
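The Lasso’s penalty can be seen directly in code. The sketch below is a toy coordinate-descent implementation on invented data, not anything from the books: three candidate factors are offered, only the first actually matters, and the L1 penalty “soft-thresholds” the irrelevant ones to exactly zero, the algorithm’s version of cutting the secondary circles.

```python
import random

def soft_threshold(z, t):
    """Shrink z toward zero by t; the exact zeros are what make the Lasso prioritize."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def lasso(X, y, lam, iters=200):
    """Toy coordinate-descent Lasso (an illustrative sketch, not production code)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(iters):
        for j in range(p):
            # residual with factor j's own contribution removed
            r = [y[i] - sum(w[k] * X[i][k] for k in range(p) if k != j)
                 for i in range(n)]
            rho = sum(X[i][j] * r[i] for i in range(n))
            w[j] = soft_threshold(rho, lam) / sum(X[i][j] ** 2 for i in range(n))
    return w

random.seed(0)
# only the first of three candidate factors actually matters
X = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(50)]
y = [3 * row[0] + random.gauss(0, 0.1) for row in X]
weights = lasso(X, y, lam=2.0)
# the penalty drives the two irrelevant weights to exactly 0.0
```

Note that the Lasso does not merely shrink the noise factors; it eliminates them outright, which is why the result stays simple enough to act on.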

Regularization also surrounds us and shapes the way we think, even though most people are unaware of the complexity penalty they apply to themselves. Ed Cray, a former newspaper writer and professor of communications, says, “The longer you work on a story, the more you can find yourself losing direction” (Heath 31). One application of regularization is, as he suggests, to stop working once a given amount of time has been spent on the topic. In computer science this is known as “early stopping”: a predictive algorithm halts its training after it has identified the few most important factors, before it starts fitting the noise. This “termination of thinking,” so to speak, is best illustrated by Jeff Hawkins, a project lead who carried around a wooden block as a replica of the handheld device his team was developing. The block was divided among four functions: “a calendar, contacts, memos, and task lists” (Heath 49). Whenever anyone proposed an additional feature, “Hawkins would pull out the wooden block and ask them where it would fit” (Heath 50). The spatial restriction of the wooden block served as the complexity penalty: additional features would not fit, forcing the team to focus on those four functions and those functions only. This is a prime example of regularization, and the Lasso, at work in day-to-day life.
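Ed Cray’s time limit maps onto early stopping almost literally. A minimal sketch of the rule, with a hypothetical helper not taken from either book: watch a model’s error on held-out data after each round of work, and stop at the first sign that further effort has stopped paying off.

```python
def early_stop(val_losses, patience=3):
    """Return the index of the best step seen once `patience` consecutive
    steps pass with no improvement -- i.e., when to stop working."""
    best, best_step = float("inf"), 0
    for step, loss in enumerate(val_losses):
        if loss < best:
            best, best_step = loss, step
        elif step - best_step >= patience:
            break  # further work is "losing direction"; keep the best version
    return best_step

# error on held-out data improves for three steps, then worsens:
# stop and keep the step-2 model instead of grinding on
stopped_at = early_stop([5, 4, 3, 3.1, 3.2, 3.3, 9])  # returns 2
```

The rule never asks whether the model is perfect, only whether more effort is still helping, which is exactly the penalty Cray recommends applying to a story.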

It is hard at first to think of regularization outside the world of computing and math, but natural barriers act as penalizing factors in our lives as well. The most evident example in nature is evolution. It may come as a surprise, but physical parameters such as the number of calories our brain consumes serve as the regularizing element for our intelligence: “we can also infer that a substantially more complex brain probably didn’t provide sufficient dividends, evolutionarily speaking” (Christian 161). The brain also tries to minimize the number of neurons firing at any given moment, which likewise acts as a Lasso that keeps us from overfitting on every piece of information we encounter. This ties directly back to the journalists’ inverted pyramid structure, or rather to the fact that they communicate through language at all: language is another natural Lasso. Talking at great length taxes the listener’s attention, and that cost penalizes prolonged statements. This language Lasso pushes us toward concise, catchy phrases that can hold a listener’s attention, urging us to simplify, simplify, simplify. Before an idea can be “sticky,” it must pass through this “inherent Lasso of memory” (Christian 162). Advice like “A bird in hand is worth two in the bush” (Heath 47) becomes a well-known proverb only because it is compact and easy to remember. In fact, this proverb has been around for over 2,500 years and appears in a plethora of languages.

Proverbial advice is best suited to the most complex situations. Not only does it clear the inherent Lasso of memory, but its concise, simple nature also avoids overfitting. Proverbs are best seen as heuristics: non-optimal, non-tailored solutions to overly complex situations. On the battlefield, for example, this kind of heuristic resembles the idea of “Commander’s Intent” that Chip and Dan Heath explain in Made to Stick. As Colonel Tom Kolditz states, “No plan survives first contact with the enemy” (Heath 25). In a chaotic setting such as a battlefield, or the world of business, it is most effective to go with a simple plan that works well across many situations. The simple solution avoids “overfitting” to any one situation and instead keeps the focus on what one is trying to accomplish. “Training scars” are a typical example of the overfitting that occurs when a person follows instructions strictly by the book. Police officers who had gone through target practice, for instance, “found spent shells in their pockets with no recollection of how they got there” during real-life shootouts (Christian 160). During training they had been instructed to pick up after themselves, since this was considered proper etiquette; because of this overfitting, they lost vital seconds picking up shells in actual shootouts, courtesy of their “training scars.” Like the Commander’s Intent that the Heaths discuss, a step-by-step approach fails in complex situations no matter how carefully it is optimized. It is therefore best to use a heuristic and work toward a goal or objective, which is exactly how successful incentive structures are implemented.

“Incentive structures work,” as Steve Jobs put it. “So you have to be very careful of what you incentivize people to do, because various incentive structures create all sorts of consequences that you can’t anticipate” (Christian 157). A failed incentive structure is one that uses the wrong incentive to produce the desired outcome. Finding an incentive is easy: you reward yourself for hard work, or give your dog treats for good behavior. What is easy to forget is the perverse effect each incentive can have on something as multi-faceted and intricate as a human being. Every incentive can be exploited by an ulterior motive, and often it is. In Freakonomics, Steven Levitt describes an incentive system for teachers in which, “if her students do well enough, she might find herself praised, promoted, and even richer” (Levitt 22). The incentive aims to make teachers perform better, but some teachers pervert the idea and cheat to boost their status: the metric for teacher success was her students’ test scores, which she could alter to make herself look better. To address this major flaw of incentive structures, it is crucial to implement cross-validation, or in this case standardized testing. Standardized testing lets us compare classroom performance against a metric applied nationally. No incentive is perfect, but having multiple metrics prevents overfitting to one characteristic (classroom test scores) and limits the possibilities for perversion. The goal, of course, is better student performance, and that is where the heuristic comes from, but the factors measuring “better performance” are not always incorruptible. Hence cross-validation: a technique that prevents “the ruthless and clever optimization of the wrong thing” (Christian 158).
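The mechanics of cross-validation are simple enough to sketch in a few lines. Below is a generic k-fold split (invented illustration, not tied to the teacher data): every record takes one turn in the held-out set, so a model, like a teacher, is never judged solely on numbers it could have tuned for itself.

```python
def k_fold_splits(n, k):
    """Yield (train, test) index lists for k-fold cross-validation:
    each fold takes one turn as the held-out 'standardized test'."""
    indices = list(range(n))
    fold_size = n // k
    for f in range(k):
        test = indices[f * fold_size:(f + 1) * fold_size]
        train = indices[:f * fold_size] + indices[(f + 1) * fold_size:]
        yield train, test

# ten records, five folds: each record is held out exactly once,
# so performance is always measured on data the model never saw
splits = list(k_fold_splits(10, 5))
```

The design choice matters: because the held-out fold rotates, no single metric can be gamed the way a classroom-graded test can, which is the essay’s point about adding a second, external measurement.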

After looking at the problems with ideas and incentive structures discussed in Made to Stick and Freakonomics, and at how their solutions resonate with those in The Computer Science of Human Decisions, we can see both ends of the street. Our inherent penalization of complexity, in how our brains work and how we have evolved, supplies solutions to our most challenging algorithms. Terms such as regularization and the Lasso are, in essence, our shorter attention spans and our inability to remember the little things. Overfitting, in which a predictive algorithm adheres too closely to its data, shows up in incentive structures that assess the wrong things and fail. And the way computers use cross-validation to vet their models can be turned around and applied to our own incentive structures. These books separately explore the ideas that connect the computing world to ours, with an emphasis on the advantage of simplicity, whether the goal is to decrease error in a predictive algorithm or to write a killer newspaper article.
