The following is a guest post by @sunset_shazz.
With their NFL team celebrating a come-from-behind victory capped by a last-second, team-record 61-yard field goal by an unheralded rookie kicker, all of Philadelphia should understandably be basking in reflected glory.
Instead, the city is fulminating in collective outrage over Doug Pederson’s decision to eschew punting on 4th and 8 from the opponent’s 43-yard line, with 2:36 left in the 1st half and a 7-point lead. Numerate commentators Bo Wulf and Jimmy Kempski have demonstrated that Pederson’s decision was by no means incorrect (most likely it was a push). The case for more aggressive fourth down decisions is over a decade old; there is meagre profit in arguing with those who are impervious to evidence.
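For readers who want the arithmetic behind a "push," here is a minimal sketch of the expected-win-probability comparison a staff model performs. Every probability below is a made-up placeholder for illustration, not an output of the Eagles' actual model:

```python
# Illustrative sketch of the expected-win-probability comparison behind a
# fourth-down call. All probabilities are hypothetical placeholders, not
# the Eagles' actual model outputs.

def go_for_it_wp(p_convert, wp_if_convert, wp_if_fail):
    """Expected win probability of attempting the conversion."""
    return p_convert * wp_if_convert + (1 - p_convert) * wp_if_fail

# Assumed inputs for a 4th-and-8 near midfield with a 7-point lead:
p_convert = 0.40        # base rate of converting 4th-and-8 (assumed)
wp_if_convert = 0.82    # win probability after a successful conversion (assumed)
wp_if_fail = 0.72       # win probability after a turnover on downs (assumed)
wp_punt = 0.76          # win probability after a punt (assumed)

wp_go = go_for_it_wp(p_convert, wp_if_convert, wp_if_fail)
print(f"go for it: {wp_go:.3f}  punt: {wp_punt:.3f}")
```

With these assumed numbers the two options tie exactly, which is the toy version of a "push": the call then turns on inputs the model cannot see, such as matchups, weather, and injuries.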
I am far more intrigued by the decision process itself. Sheil Kapadia quotes Jeffrey Lurie discussing the 4th down decision-making process, unprompted, in an informal chat with reporters after a recent presser:
“A lot of teams — ours is one — where it’s all in the offseason done with mathematics,” Lurie said. “It’s not based on any form of instinct. If it’s going to be 50/50, 48/52, then a coach is going to have their instinctual predilection, right? But what we found is there’s been so many decisions over time that are too conservative for the odds of maximizing your chance to win that the opportunity. … I mean, you’ve seen certain coaches that are deemed more aggressive because the math leads them there. That’s all it is.”
Following some snickering that Pederson’s decision-making is being dictated by his superiors, the head coach clarified that he is the decider, with help from an analytics staff, including coaching assistant/linebackers coach Ryan Paganetti and Jon Ferrari (the latter’s title – “director of football compliance” – was obviously conceived by Oceania’s Ministry of Truth).
Is this decision-making setup weird? No; in fact, it may be ideal.
I was struck by how Pederson, during a press conference, corrected a reporter’s estimate of the historical probability of success (the “base rate”), citing his staff’s model estimate off the top of his head. His facility of recall regarding the base rate is textbook behavioral science. The seminal work of Daniel Kahneman and Amos Tversky showed that what they termed “base rate neglect” is common in poor decision making. Here is an example from Kahneman’s Thinking, Fast and Slow:
An individual has been described by a neighbor as follows: “Steve is very shy and withdrawn, invariably helpful but with little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.” Is Steve more likely to be a librarian or a farmer?
As Kahneman wryly notes, it helps to know that there are roughly 20x more male farmers than male librarians in the United States. The base rate is very important information in making the right judgment.
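The odds form of Bayes’ rule makes the point concrete. Assuming, purely for illustration, that Steve’s description is four times as likely to fit a librarian as a farmer (a figure not from Kahneman’s text), the 20:1 base rate still carries the day:

```python
# A quick Bayes check of why the 20:1 base rate dominates. The likelihood
# ratio of 4 (how much more "librarian-like" the description is) is an
# assumed illustrative figure, not from Kahneman's text.

def posterior_odds(prior_odds, likelihood_ratio):
    """Posterior odds = prior odds * likelihood ratio (Bayes' rule in odds form)."""
    return prior_odds * likelihood_ratio

prior_odds_farmer = 20.0   # ~20x more male farmers than male librarians
lr_librarian = 4.0         # description favors "librarian" 4:1 (assumed)

# Dividing by the likelihood ratio converts "4:1 for librarian" into
# evidence against "farmer" before combining with the prior.
odds = posterior_odds(prior_odds_farmer, 1.0 / lr_librarian)
print(f"posterior odds, farmer : librarian = {odds:.1f} : 1")
```

Even after strongly librarian-flavored evidence, the posterior still favors farmer 5:1. Neglecting the base rate inverts the answer.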
I am struck by Pederson’s description of how he receives the base rate and other key information immediately prior to making an informed judgment, in which he will also take non-quantitative factors into account (e.g., how the defense is playing, the weather, injuries, etc.). Here, the Eagles are harnessing another key behavioral tic – anchoring bias. Anchoring is the tendency to overweight proximate (sometimes irrelevant) information that is the starting point in making a decision under conditions of uncertainty.
Again, Kahneman and Tversky were among the first to describe and investigate anchoring, and by 2017 a vast academic literature on it had accumulated. My favorite study was conducted by James Montier, a financial economist (full disclosure: also a former colleague). Montier asked hundreds of subjects (mainly fund managers and financial analysts) the following:
1) Please write down the last four digits of your telephone number
2) Is the number of physicians in London higher or lower than this number?
3) What is your best guess as to the number of physicians in London?
Montier loves shocking people with his results:
[T]hose with telephone numbers above 7000 believe there are on average just over 8000 doctors. Those with telephone numbers below 3000 think [there] are around 4000 doctors. This represents a very clear difference of opinion driven by the fact that investors are using their telephone numbers, albeit subconsciously, as inputs into their forecast.
Clearly, one’s personal telephone number should have no bearing on one’s estimate of the number of physicians in London. The fact that it does, consistently, for intelligent, educated, statistically-minded professionals speaks to the power of anchoring bias.
Over the last decade, hedge funds have paid attention to the behavioral science literature and have sought to anchor their professionals to salient, predictive data. This has driven hybrid-quantitative trading strategies in which a fund manager is augmented by an algorithmic or otherwise quantitative model. The outputs of the model are then used as anchors for further tweaking by a human who is aware of variables outside the model’s specification. Wall Street got this idea, in part, from the world of chess, where, by 2005, the best type of player was a hybrid of expert and model, capable of beating Grandmasters and machines. Similarly, University of Pennsylvania professor Philip Tetlock has found that expert forecasters can significantly improve their decision processes by relying on models to improve their calibration of variables such as base rates. The Intelligence Advanced Research Projects Activity (IARPA) has taken note of these results, and the CIA has been studying similar literature since 1999.
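A minimal sketch of that "augmented expert" pattern: the model's estimate serves as the anchor, and the human may adjust it within a bounded band to account for out-of-model information. The band width and the function names here are assumed design choices for illustration, not taken from any cited fund or team:

```python
# Sketch of the augmented-expert pattern: start from the model's output
# (the anchor) and allow a bounded human tweak. The 0.05 band width is an
# assumed design parameter, not from any cited fund or team.

def anchored_estimate(model_value, human_adjustment, max_shift=0.05):
    """Return the model's estimate plus a human adjustment clamped to a band."""
    shift = max(-max_shift, min(max_shift, human_adjustment))
    return model_value + shift

# Model says 48% conversion odds; the human, seeing a banged-up offensive
# line, nudges the estimate down, but cannot stray far from the anchor.
print(anchored_estimate(0.48, -0.08))
```

Clamping the adjustment is one way to exploit anchoring deliberately: the expert contributes local knowledge while the model's base rate keeps the final estimate from drifting on instinct alone.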
When Doug Pederson hears base rate data and associated variables over his headset, the Eagles organization is harnessing anchoring bias, and turning it from a bug into a feature. Moreover, the augmented expert approach is consistent with that of the more sophisticated analysts in the fields of finance, academia, chess, and government.
It’s nice to root for a team that pays attention to the world outside sports, rather than snickering at it dismissively.
Montier’s initial sample was 300 subjects, though I believe he has replicated this finding with subsequent samples.
Disclosure: since 2012, I have been a participant in Tetlock’s Good Judgment Project.