The following is a guest post by @sunset_shazz.
Should the Carolina Panthers have fired Head Coach Ron Rivera or traded QB Cam Newton the day after they lost the Super Bowl? Scott Kacsmar at FiveThirtyEight argues they should have done either or both. Do read the whole piece; the argument is presented as follows:
- In NFL history, only 4 coaches have won their first Super Bowl after more than 5 seasons on the job with the same team;
- No team has ever started the same quarterback under the same head coach for more than 5 years and seen that duo win its first championship.
Having examined the history of prior first-time Super Bowl winners, FiveThirtyEight infers that a tenure of five years or fewer is conducive to winning a first championship. The study’s conclusion: “If championship success doesn’t come within five years, things tend to get stale, and someone eventually has to move on from their position of power.”
Can you spot the flaw in this reasoning?
What if I applied the exact same logic to a more emotionally salient characteristic?
- In NFL history, only 4 minority head coaches have won Super Bowls. Therefore you shouldn’t hire minority head coaches. [1]
Does that framing device make the flaw in reasoning clearer?
FiveThirtyEight’s study suffers from the confusion of the inverse, a statistical fallacy that undergraduates are commonly taught to avoid. One of the best recent treatments of this problem was a brilliant piece by Katherine Hobson on the lab-testing startup Theranos (also, funnily enough, at FiveThirtyEight). Chapter 8 of Nate Silver’s excellent The Signal and the Noise provides a lucid discussion on this topic, in the context of Bayes’s theorem.
Here is the issue: the fraction of Super Bowl winners that possess a certain characteristic, by itself, tells you nothing about the probability that those who possess that characteristic will win a Super Bowl. A better way to estimate the latter would be to go back and examine the historical success rate of coaches who possess the characteristic you’d like to study.
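To make the distinction concrete, here is a minimal sketch in Python that uses the tenure counts reported in the next paragraph. The variable names are illustrative, and the only inputs are the four counts quoted in this post.

```python
# Counts taken from the tenure breakdown reported later in this post.
short_wins, short_seasons = 24, 1009   # tenure of 5 years or fewer
long_wins, long_seasons = 4, 176       # tenure of 6 or more years

total_wins = short_wins + long_wins
total_seasons = short_seasons + long_seasons

# P(long tenure | first-time winner): the kind of figure the original argument cites.
p_long_given_win = long_wins / total_wins

# P(first-time win | long tenure): the figure that matters for the decision.
p_win_given_long = long_wins / long_seasons

# Bayes's theorem links the two through the base rates.
p_win = total_wins / total_seasons
p_long = long_seasons / total_seasons
via_bayes = p_long_given_win * p_win / p_long

print(f"P(long tenure | winner) = {p_long_given_win:.1%}")
print(f"P(winner | long tenure) = {p_win_given_long:.1%}")
print(f"Same value via Bayes    = {via_bayes:.1%}")
```

Long-tenured coaches account for roughly 14% of first-time winners, but they also account for roughly 15% of the opportunities (176 of 1185). Once the base rate is taken into account, the small share of winners stops looking like evidence against long tenure.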
I compiled every season coached since the 1970 merger, then excluded the seasons after a coach had won his first Super Bowl. Coaches with 5 or fewer years of tenure with their teams won 24 first Super Bowls in 1009 opportunities, a success rate of 2.38%. Coaches with 6 or more years of tenure won 4 first Super Bowls in 176 opportunities, a success rate of 2.27%. Using a technique previously used in the Duck Bias study, I applied the cumulative distribution function of the binomial distribution to test whether the two success rates differ to a statistically significant degree. The p-value of 0.592 indicates no statistically significant difference.[2]
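A minimal sketch of one such test follows. It assumes the short-tenure success rate serves as the null probability and asks how likely 4 or fewer wins would be in 176 long-tenure seasons. Whether this matches the exact calculation behind the reported p-value is an assumption, and `binom_cdf` is a helper written for this example rather than part of any library.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """Return P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Counts quoted in the paragraph above.
short_wins, short_seasons = 24, 1009   # tenure of 5 years or fewer
long_wins, long_seasons = 4, 176       # tenure of 6 or more years

# Assumption: the short-tenure rate is the null-hypothesis success probability.
null_rate = short_wins / short_seasons

# Probability of seeing this few wins (or fewer) in the long-tenure group
# if long-tenured coaches actually win at the short-tenure rate.
p_value = binom_cdf(long_wins, long_seasons, null_rate)

print(f"short-tenure rate: {null_rate:.2%}")
print(f"long-tenure rate:  {long_wins / long_seasons:.2%}")
print(f"P(X <= {long_wins}) under the null: {p_value:.3f}")
```

A value this far above a conventional threshold such as 0.05 means the long-tenure record is fully consistent with the short-tenure rate.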
However, Super Bowl wins are a noisy, sparse measure of success because the sample is so small. An alternative measure of coaching success that enjoys the advantage of more data is the frequency with which a coach makes the playoffs. I compiled the playoff rate for every coach in the dataset and compared it with the base rate of success for that year.[3] The data show that coaches with longer tenure are actually more likely to make the playoffs (47.7%) than both shorter-tenured coaches (31.1%) and the base rate (38.0%); both of these differences are statistically significant.
Obviously, this data doesn’t tell you anything about causation. There is likely a survivorship bias / selection effect: those coaches who are kept by their team after 5 years without a championship are likely of higher quality than average, which is probably why their subsequent success rate is higher.