
McNabb or Kolb

On Dallas Goedert And The Eagles' Two-Tight End Dominance

B80E8860.jpg

Are you curious why the Eagles drafted a second tight end, Dallas Goedert, when they already have Pro Bowler and Super Bowl-winner Zach Ertz on the roster? Look no further than The Athletic, where I penned an article recently that examined the Eagles' use of multiple tight end sets last year. The numbers surprised me, especially what emerged about the Eagles in the red zone, in the playoffs, and specifically the effectiveness of Ertz and Trey Burton in the same formation. Make sure to subscribe to The Athletic Philly (hat tip to the inimitable Sheil Kapadia for asking me to contribute) and check it out:

‘Big bodies on smaller bodies’: Why the Eagles doubled down on the two-tight end offense

Tagged with 2017, 2018, Tight End, Zach Ertz, Dallas Goedert, Sheil Kapadia, The Athletic, Brent Celek, Trey Burton, Doug Pederson, NFL Draft.

May 18, 2018 by Brian Solomon.
Nick Foles Is The Playoff GOAT

nick-foles-takes.jpg

The following is a guest post by @sunset_shazz.

Nick Foles is a high-variance quarterback. His performance ricochets from abysmal to sublime with such frequency that he made me re-adjust my chart axis, twice. And yet: including the 2013 loss to the Saints (in which he engineered a comeback from a 13-point deficit and left the field with the lead) his postseason play has been consistently excellent. There have been 93 quarterbacks since the 1970 merger who have played at least 4 playoff games. Of these, Foles ranks 1st in completion percentage and 2nd in Adjusted Net Yards / Attempt (ANY/A).

Screen Shot 2018-02-12 at 11.50.18 PM.png

Obviously, this is not statistically dispositive. Nothing about playoff analysis is. Mark Messier and Reggie Jackson’s playoff performances comprised a mere fraction of their total careers, yet their knack for elevating their game on the biggest stage is what made them memorable. One way to think about the playoffs: there is a tide in the affairs of men, which, taken at the flood, leads on to fortune. As I will show, Foles has taken the tide at the flood in historic fashion.

Note, from the chart above, that the fewer games played, the greater variance in ANY/A between individual players. But what about each player’s game-by-game variance? I measured the standard deviation of each player’s game ANY/A, and scaled this by his mean ANY/A, thus constructing a coefficient of variation.

Screen Shot 2018-02-12 at 11.50.57 PM.png

Of all 93 QBs in the sample, Foles has been the 4th most consistent (i.e. has the 4th lowest variation). Moreover, he has the lowest variation of the 16 QBs who have only played 4 games.
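For readers who want to reproduce the consistency measure, here is a minimal sketch of the coefficient of variation described above. It assumes the sample standard deviation (the post does not say whether the sample or population version was used), and the per-game numbers are made up for illustration, not taken from the data set.

```python
from statistics import mean, stdev


def coefficient_of_variation(game_anya):
    """Standard deviation of per-game ANY/A scaled by the mean ANY/A.

    Assumption: sample standard deviation (n - 1 in the denominator).
    """
    if len(game_anya) < 2:
        raise ValueError("need at least two games")
    avg = mean(game_anya)
    if avg == 0:
        raise ValueError("mean ANY/A is zero, so the ratio is undefined")
    return stdev(game_anya) / avg


# Hypothetical per-game playoff ANY/A for two invented quarterbacks.
steady = [7.5, 8.0, 8.5, 8.0]
streaky = [2.0, 14.0, 4.0, 12.0]

print(round(coefficient_of_variation(steady), 3))
print(round(coefficient_of_variation(streaky), 3))
```

Both invented quarterbacks average 8.0 ANY/A, but the second one's coefficient of variation is far higher, which is the sense in which a lower number means a more consistent player.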

Perhaps Foles has benefitted from playing in a QB-friendly era? I compared each QB’s game ANY/A to the league average for the year in which that game was played. One can then plot mean Relative ANY/A against the coefficient of variation:

Screen Shot 2018-02-12 at 11.52.21 PM.png

Foles has the 5th highest Relative ANY/A in addition to having the 4th lowest variation. One way to think about the above graph is to imagine an “efficient frontier” on the upper left quadrant. When considering similar efficient frontiers in the context of financial economics, Nobel Laureate William F. Sharpe constructed a “Sharpe ratio” which compares a fund manager’s relative return (e.g. versus an index) to the standard deviation of the fund’s return.

I similarly devised a playoff QB Sharpe Ratio, which is each QB’s mean Relative ANY/A divided by the standard deviation of his game ANY/A. Think of it as one number which captures both efficiency and consistency of play. The following table shows the top 10 playoff QB Sharpe Ratios since the merger:

Screen Shot 2018-02-12 at 11.52.51 PM.png

All 10 of these quarterbacks played in a Super Bowl, and all but two of them were champions. Only Bengals starter Ken Anderson and Bills backup Frank Reich did not win the season’s final game. (Reich, of course, will receive a ring as Offensive Coordinator of the 2017 Super Bowl champions.)
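The playoff QB Sharpe Ratio is simple enough to sketch in a few lines. The version below assumes that Relative ANY/A is a difference (game ANY/A minus that year's league average) and that the standard deviation is the sample version; the function name and the numbers are hypothetical.

```python
from statistics import mean, stdev


def playoff_sharpe(game_anya, league_avg_anya):
    """Mean Relative ANY/A divided by the standard deviation of game ANY/A.

    game_anya: ANY/A for each playoff game.
    league_avg_anya: league-average ANY/A for the season of each game.
    Assumption: "relative" means game value minus league average.
    """
    if len(game_anya) != len(league_avg_anya):
        raise ValueError("one league average is needed per game")
    if len(game_anya) < 2:
        raise ValueError("need at least two games")
    spread = stdev(game_anya)
    if spread == 0:
        raise ValueError("zero variation, so the ratio is undefined")
    relative = [g - avg for g, avg in zip(game_anya, league_avg_anya)]
    return mean(relative) / spread


# Hypothetical four-game playoff sample in a season with a 6.0 league average.
games = [7.0, 9.0, 8.0, 10.0]
league = [6.0, 6.0, 6.0, 6.0]
print(round(playoff_sharpe(games, league), 3))
```

A quarterback raises this number either by beating the league average by more or by varying less from game to game, which is why it captures efficiency and consistency at once.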

By this metric, Foles will have to settle for second place out of 93 playoff QBs. The Raiders’ Ken Stabler, who played in 13 playoff games between the 1971 and 1979 seasons, passed for 3.08 ANY/A above average (3rd) and had the 8th lowest coefficient of variation in the sample. Combining efficiency and consistency, he is the greatest playoff quarterback of all time. Here are the rankings of some other notable QBs, and Eli Manning:

Screen Shot 2018-02-12 at 11.53.23 PM.png

Obviously, I’m not suggesting Foles is better than any of those quarterbacks (except Eli; he’s indisputably better than Eli, it’s not even close). However, in the inherently limited sample that consists of the playoffs, Foles has performed at a historically great level, in terms of both efficiency and consistency. Also, he can catch.

Tagged with Super Bowl, Nick Foles, 2017, 2018, Playoffs, Quarterback, Statistics.

February 13, 2018 by Brian Solomon.
Can't Run On Us

121315_cox-happy_1200.jpg

The following is a guest post by @sunset_shazz.

As the inimitable Jimmy Kempski recently explained, the Eagles’ defensive game plan is rather simple:

  1. Stop the run.
  2. Make the opposing offense one-dimensional.
  3. Get after the quarterback.

Indeed, Eagles opponents do appear to give up on the run – this year the defense has faced the fewest rushing attempts per game. One caveat: they also have the second-highest point-differential. As everyone knows, you face fewer run attempts when holding a lead.

Are the Eagles’ opponents giving up on the run because they’ve fallen behind? Or are the Eagles facing fewer rush attempts due to their stout run defense, irrespective of the scoreboard?

I examined Game Script data compiled by Chase Stuart. Game Script is basically the average score margin over an entire game. As a stylized example, let’s say Team A returns the opening kickoff for a touchdown, kicks the extra point, then neither team scores for the rest of the game. Team A’s Game Script in this simplified example would be +7 (the average lead held over the entire game); Team B’s Game Script would be -7. A higher game script is often associated with more rushing by the team that’s leading (because you run when you win, not win when you run) and more passing by the opposing team (which has a negative game script).
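The stylized example can be checked with a short sketch. It assumes Game Script is a time-weighted average of the score margin over 60 minutes of regulation, and it treats the opening kickoff return as happening at time zero; the function and its input format are my own illustration, not Stuart's code.

```python
def game_script(scoring_events, game_seconds=3600):
    """Time-weighted average score margin from one team's point of view.

    scoring_events: list of (seconds_elapsed, margin_change) sorted by time.
    margin_change is positive when this team scores, negative when the
    opponent scores.
    """
    weighted_total = 0.0  # running sum of (margin * seconds that margin was held)
    margin = 0
    last_time = 0
    for seconds_elapsed, margin_change in scoring_events:
        if seconds_elapsed < last_time or seconds_elapsed > game_seconds:
            raise ValueError("events must be sorted and inside the game clock")
        weighted_total += margin * (seconds_elapsed - last_time)
        margin += margin_change
        last_time = seconds_elapsed
    # Account for the stretch from the final score to the end of the game.
    weighted_total += margin * (game_seconds - last_time)
    return weighted_total / game_seconds


team_a = game_script([(0, 7)])    # touchdown plus extra point at the opening kickoff
team_b = game_script([(0, -7)])   # same game from the other sideline
late_score = game_script([(1800, 7)])  # same score, but at halftime

print(team_a, team_b, late_score)
```

The third call shows why timing matters: the same seven points scored at halftime produce a Game Script of only +3.5, because the lead was held for half as long.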

The plot below shows each team’s average Game Script on the X-axis and average Pass/Run Ratio on the Y-axis (all data from weeks 1-10). The regression line represents the expected Pass/Run Ratio, given Game Script, computed from the 146 games that were played through week nine. [1] A team that is toward the right (LAR, PHI) has enjoyed a higher average lead, and a team toward the top (SFO) has a higher pass/run ratio.

By comparing each team’s pass-run ratio to what one would theoretically expect given game situation (denoted by the regression line above), one may construct a “Pass Heavy Index”:

This year, Bill Belichick has been 10.9% more likely to call a pass, given game situation, than average. With Mitchell Trubisky behind center, John Fox is 15.5% less likely to call a pass, given game situation, than average. Despite almost being run out of town after a week 2 game in which his Pass Heavy Index was +27%, Pederson is basically in the middle of the pack.
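Here is a rough sketch of how a Pass Heavy Index of this kind could be computed. It fits an ordinary least-squares line of pass ratio on Game Script and reports actual minus expected pass ratio. The post does not say whether the index is a difference or a ratio, so the difference is an assumption, as are the function names and the toy data.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def pass_heavy_index(script, pass_ratio, intercept, slope):
    """Actual pass ratio minus the ratio the regression line expects."""
    expected = intercept + slope * script
    return pass_ratio - expected


# Hypothetical team-games. Each game contributes two rows, one per team,
# so the Game Scripts come in +x / -x pairs and average to zero.
scripts = [7, -7, 3, -3, 10, -10]
pass_ratios = [0.50, 0.66, 0.55, 0.61, 0.48, 0.68]

intercept, slope = fit_line(scripts, pass_ratios)
index = pass_heavy_index(-7, 0.70, intercept, slope)
print(round(intercept, 3), round(slope, 4), round(index, 3))
```

Because the Game Scripts in the toy data sum to zero, the fitted intercept equals the plain average of the pass ratios, which is the same property footnote [1] describes for the real regression.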

What about the defense? One may similarly plot each team’s opponent’s average pass/run ratio against the opponent’s average game script:

Did you notice the outlier on the upper left? One may also compute each team’s opponents’ Pass Heavy Index:

The table above shows that Eagles opponents are not only passing more than any other team’s opponents (70.1% of the time), but that they are the most pass heavy adjusted for game situation. The evidence supports Kempski’s thesis: Eagles opponents this year have become one dimensional. One way to look at it is that opponents have a healthy respect for the Birds’ run defense. Another way to view it: they think they can attack the secondary. Somebody please inform the Green Goblin that he’s being disrespected.

Stick Figure GIF reprinted with permission, courtesy Jimmy Kempski

Alternatively, perhaps this is an artifact of sampling bias – maybe the Eagles just happen to have faced teams who pass a lot (like Arizona).

Looking at it game-by-game, 6 out of 9 teams the Eagles played chose to pass more than they typically do, adjusted for game situation. There were three exceptions: the Cardinals’ 76.7% pass ratio, though 8.3% higher than expected, was a tad (0.5%) less pass heavy than Bruce Arians’ typical game; this was due to an extreme game script driven by a three-touchdown first quarter by Carson Wentz. In the most recent two weeks, the Niners and Broncos both continued to run more than would be expected, despite falling behind by two scores in game script. Could this portend a change in opponent strategy, perhaps due to the absence of LB Jordan Hicks, whose season ended in the first series of Week 7? Or was this merely due to the injury to Joe Staley for the Niners and the move to Brock Osweiler for the Broncos? The next few weeks should be interesting.

The analysis presented above demonstrates that Eagles opponents are 10% less likely to run the ball than average, given the game situation. If opponents indeed are choosing to attack the Eagles’ passing defense, they are picking a different, though still potent, poison. The Eagles’ passing defense ranks 7th in ANY/A allowed and 8th in defensive passing DVOA.

Interestingly, the Rams’ and Jaguars’ average lead has been similar to the Eagles’, though their opponents are running more than typical, given such a deficit. Those teams are ranked 2nd and 1st against the pass, respectively, in DVOA, and are ranked 15th and 30th against the run. Though opponents of each team are falling behind during games at a similar rate, they are choosing to attack the Eagles differently, given the relative strengths of their defensive units.

Thanks to Eagles fan Noah Becker and MoK Editor-in-Chief Brian Solomon for discussion leading to this post. 

[1] The Y-intercept indicates the neutral pass/run ratio, 57.8%, which also mathematically corresponds to the league average pass-run ratio.

Tagged with 2017, Run Defense, Defense, Statistics, Chase Stuart, Jalen Mills, Jimmy Kempski, Philadelphia Eagles.

November 18, 2017 by Brian Solomon.
The Kids Are Alright

usa_today_9781843.0.jpg

The following is a guest post by @sunset_shazz.

Carson Wentz’s start to the 2017 season has garnered national plaudits for his stewardship of the Eagles’ league-leading offense. But it being 2017, there lurks a coterie of skeptics who claim his underlying ability is “horrendous” like Blake Bortles or merely pedestrian like Andy Dalton. Even more emphatically, poor Jared Goff was confidently pronounced a bust after one season.

Is it fair to judge a quarterback solely on his rookie year? What about after the first nine weeks of his second season in the league? And how might one systematically evaluate a developing quarterback, relative to historical data?

Let us consider some advanced metrics that are used to evaluate quarterbacks:

  • Adjusted Net Yards / Attempt (ANY/A) was developed by the great Chase Stuart, and accounts for sack yards, while providing a bonus for touchdowns and a penalty for interceptions. Both Stuart and Topher Doll have shown that ANY/A predicts wins. Danny Tuccitto has brilliantly used confirmatory factor analysis to show that ANY/A is a stable indicator of QB quality.
  • Defense-adjusted Value over Average (DVOA), the brainchild of Aaron Schatz at Football Outsiders, is a success-based, opponent-adjusted per-play efficiency metric intended to both correlate with non-opponent adjusted wins (descriptive) and to predict future opponent-adjusted wins.
  • Defense-adjusted Yards above Replacement (DYAR) uses similar success-rate inputs to DVOA, in order to compute an aggregate value for a player (combining volume and efficiency).
  • Total QBR is ESPN Stats & Information’s proprietary efficiency metric that combines both passing and running contributions, adjusted for game situation, with charting to assign responsibility to a quarterback’s receivers and blockers.
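Of the four metrics, ANY/A is the only one that can be computed from a box score, so a small sketch may help. It uses the commonly published form of the formula, with a 20-yard bonus per touchdown pass and a 45-yard penalty per interception; those weights are not stated in this post, and the stat line below is invented.

```python
def any_a(pass_yards, pass_tds, interceptions, sack_yards, attempts, sacks):
    """Adjusted Net Yards per Attempt.

    Assumption: the widely published weights of +20 yards per touchdown
    pass and -45 yards per interception. Sack yards are entered as a
    positive number of yards lost.
    """
    dropbacks = attempts + sacks
    if dropbacks == 0:
        raise ValueError("no pass attempts or sacks")
    adjusted_yards = pass_yards + 20 * pass_tds - 45 * interceptions - sack_yards
    return adjusted_yards / dropbacks


# Hypothetical game: 300 yards, 3 TDs, 1 INT, 2 sacks for 15 yards, 38 attempts.
print(any_a(300, 3, 1, 15, 38, 2))
```

In the invented game the touchdown bonus (60) is exactly cancelled by the interception penalty and sack yardage (45 + 15), leaving 300 adjusted yards over 40 dropbacks.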

Through nine weeks, the 2017 sophomore class is playing at an extraordinarily high level, as measured by each of these advanced stats:

Please note that nothing herein intends to argue for any of these quarterbacks to the detriment of the others. Though the data presented above is insufficiently precise to draw ordinal rankings, it is unequivocal:

Wentz is good. Goff is good. Prescott is good. All three of these things can simultaneously be true, pace internet trolls.

Some epistemic humility is in order: the first-nine-week sample size is obviously noisy, with varying degrees of luck, opponent quality, team injuries, coaching quality and supporting casts influencing the statistical performance of each QB. Danny Tuccitto warns us that ANY/A stabilizes at 326 dropbacks, and even at that sample size, 50% of the observation represents randomness/luck. Nonetheless, the broad takeaway should be that each sophomore QB has thus far performed at a top-quartile level, judged by a variety of different metrics. Is this good? And how confident can we be that such performance will continue?

Recently, Chase Stuart noted that three sophomores from the same class have not played this well since at least the NFL-AFL merger. Though ANY/A is less context-specific than the other measures, it has the advantage of being transparent and easy to calculate, permitting historical analysis. Stuart compared the first 8 weeks of 2017 for Goff, Prescott and Wentz to full seasons of prior 2nd year QBs. Comparing partial to full seasons isn’t quite neutral, due to the disparity in number of games sampled; we should expect some mean reversion of our reference QBs as sample size increases. Using pro-football-reference’s excellent query engine, I examined the first 9 weeks for each sophomore quarterback from 1999 through 2017. Historical comparisons need to be adjusted for era, due to the enormous change in average NFL passing efficiency over time. To account for this, I divided each quarterback’s ANY/A by the league average for that year. [1]

Top ANY/A vs Average since 1999, sophomore QBs, weeks 1-9

The 76 QB sample set in this study is itself a product of survivorship bias: only those QBs who were successful enough to throw 100 passes in the first 9 weeks of their second year in the league are included. On the other side of the distribution, successful QBs who rode the pine for their first few years (like Aaron Rodgers, Tony Romo or Philip Rivers) are not in this sample. The average age of the sample is 24, similar to our reference QBs.

The three 2017 sophomores are, as Stuart observed, performing extraordinarily well relative to their peer set (all are in the top quartile of the sample). Relative to their era, they are passing with greater efficiency than Tom Brady, Drew Brees, Matt Ryan or Andrew Luck did in their second seasons.

You will also note that the top ranked sophomore QBs include many future hits (Big Ben, Kurt Warner, P. Manning) and a few notable misses (Nick Foles, Derek Anderson). The last column I included is the Career Approximate Value (CAV), which is a (very) rough method developed by Doug Drinen that puts a single number on a player’s total career, encompassing both longevity and performance.

Below, I plotted log Career Approximate Value against ANY/A relative to league average for the first 9 weeks for second year QBs from 1999-2015 (I excluded QBs from 2016-2017 because recent QBs have not yet had sufficient time to accumulate CAV points).

The positive relationship shown above indicates that the first 9 weeks of a sophomore season explain roughly 37% of the variation in a QB’s future CAV. Do note that the correlation is sensitive to a few outliers. The odious Ryan Leaf and Akili Smith are on the bottom left, whereas Foles and Anderson are on the bottom right. I don’t want to ascribe an illusion of precision to this rough analysis – don’t fixate on the exact R-squared number, or the model coefficients. Both sample size and the extremely imprecise nature of CAV make me hesitant to draw definitive conclusions from the data. What is interesting to me is that the same plot using a QB’s full rookie season yields an R-squared of 0.224 – in other words, the first 9 weeks of a QB’s sophomore season tell you roughly 70% more about his future career than his entire rookie season does. Extending this analysis to full seasons since 1970, the R-squared is 0.083 and 0.2348 for rookie and sophomore years, respectively (n=155 & 204). My interpretation of this data: though rookie and second year passing efficiency predict only a small fraction of a quarterback’s career value, the sophomore year deserves 2.8x as much weight as the rookie year, in terms of confidence about predictive power. Rookie performance, in particular, is extremely noisy. One would have been wise to heavily discount Troy Aikman, Donovan McNabb and Terry Bradshaw’s dreadful rookie seasons. Rams fans should take note.
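For anyone checking the arithmetic, here is a small sketch of how R-squared falls out of a one-variable fit, followed by the ratios of the R-squared values quoted above. The helper function and the toy data points are hypothetical; only the four R-squared figures come from the post, and 0.37 is the rounded value.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation, equal to R-squared for a one-variable linear fit."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("a series with no variation has no defined correlation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Toy check: a perfectly linear relationship has an R-squared of 1.
perfect = r_squared([1, 2, 3, 4], [2, 4, 6, 8])

# Ratios of the R-squared values quoted in the post.
first9_vs_rookie = 0.37 / 0.224        # sophomore weeks 1-9 vs full rookie year
full_soph_vs_rookie = 0.2348 / 0.083   # full seasons since 1970

print(perfect, round(first9_vs_rookie, 2), round(full_soph_vs_rookie, 2))
```

With the rounded 37% figure the first ratio comes out near 1.65, and the second comes out near 2.83, which is where the 2.8x weighting comes from.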

Relatedly, I didn’t find any predictive power when measuring the degree of era-adjusted-ANY/A improvement from rookie to sophomore season. This echoes Vincent Verhei’s study of second year improvement using DVOA. In hypothesis testing, a negative result can be an interesting result.

Quantitative analysis is not the only tool in an NFL researcher’s kit. Film study (though not my sphere of competence) is also valuable. Though Nick Foles had a magical sophomore season, the film showed reason for concern, as my friend Derek Sarley noted. I don’t personally see similar issues with Wentz – both his pre-snap adjustments and post-snap play appear to pass the “eye test”. No, he’s not perfect. Yes, he has flaws he needs to address. But so do all second year quarterbacks.

Moreover, our penchant for treating quarterbacks as static vessels of talent/ability shortchanges the importance of coaching and development. The installation of a new coaching regime in Los Angeles appears to be an interesting natural experiment, in terms of Goff’s maturation. Similarly, we can view Ezekiel Elliott’s probable(?) suspension as an instrumental variable when evaluating Prescott.

All inductive statements are, by their very nature, revisable. We don’t know the future; we can only use informed judgment to hazard a prediction. The false-positive rate for the top 20 QBs in table 2 above is 25% by my count [2], so let’s take that as the “base rate” of failure for the 2017 sophomore QBs. It is therefore reasonable to expect that two – perhaps all three – of the 2017 sophomores will enjoy successful careers as NFL starters.
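To put rough numbers on that expectation, here is a minimal sketch that treats each of the three reference QBs as an independent coin flip with a 25% chance of failure. Independence is my simplifying assumption, not a claim from the analysis above.

```python
from math import comb


def prob_at_least(k, n, p_success):
    """Binomial probability of at least k successes in n independent trials."""
    return sum(
        comb(n, i) * p_success ** i * (1 - p_success) ** (n - i)
        for i in range(k, n + 1)
    )


p_success = 1 - 0.25  # one minus the 25% base rate of failure

all_three = prob_at_least(3, 3, p_success)
at_least_two = prob_at_least(2, 3, p_success)
expected_hits = 3 * p_success

print(all_three, at_least_two, expected_hits)
```

Under those assumptions the chance that all three pan out is about 42%, the chance that at least two do is about 84%, and the expected number of hits is 2.25.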

Finally, in these impatient times, let us remind ourselves that transcendent quarterbacks do not emerge, fully formed, from the forehead of Zeus. Each of these young, relatively inexperienced quarterbacks is playing the most technically and cognitively demanding position in sports at a very high level. Adjusted for experience and era, their achievements are even more astounding. The evidence suggests that the future of quarterback play is bright. Football fans, rejoice.

Thanks to Eagles fan / Data Scientist Sean J. Taylor for his insightful discussion on methodology. Any errors are mine alone.

[1] PFR’s partial season engine shows results from 1999 onward. Full season results go back before the merger, and also generate an era-adjusted ANY/A+ which uses a “Z-score” methodology, expressed in standard deviations above or below the population mean. My method is less sophisticated, though nonetheless robust.

[2] I excluded the reference QBs, as well as Marcus Mariota.

@sunset_shazz is an Eagles fan who lives in Marin County, California. He previously wrote about 4th down decisions.

Tagged with 2017, Carson Wentz, Dak Prescott, Jared Goff, Quarterback, Statistics, Rookie.

November 11, 2017 by Brian Solomon.
Think Again About Fourth Downs

Doug-Pederson.jpg

The following is a guest post by @sunset_shazz.

With their NFL team celebrating a come-from-behind victory capped by a last-second, team-record 61-yard field goal by an unheralded rookie kicker, all of Philadelphia is understandably basking in reflected glory.

Just kidding.

Instead, the city is fulminating in collective outrage over Doug Pederson’s decision to eschew punting on 4th and 8 from the opponent’s 43-yard line, with 2:36 left in the 1st half and a 7-point lead. Numerate commentators Bo Wulf and Jimmy Kempski have demonstrated that Pederson’s decision was by no means incorrect (most likely it was a push). The case for more aggressive fourth down decisions is over a decade old; there is meagre profit in arguing with those who are impervious to evidence.

I am far more intrigued by the decision process itself. Sheil Kapadia quotes Jeffrey Lurie discussing the 4th down decision-making process, unprompted, in an informal chat with reporters after a recent presser:

“A lot of teams — our’s is one — where it’s all in the offseason done with mathematics,” Lurie said. “It’s not based on any form of instinct. If it’s going to be 50/50, 48/52, then a coach is going to have their instinctual predilection, right? But what we found is there’s been so many decisions over time that are too conservative for the odds of maximizing your chance to win that the opportunity. … I mean, you’ve seen certain coaches that are deemed more aggressive because the math leads them there. That’s all it is.”

Following some snickering that Pederson’s decision-making is being dictated by his superiors, the head coach clarified that he is the decider, with help from an analytics staff, including coaching assistant/linebackers coach Ryan Paganetti and Jon Ferrari (the latter’s title – “director of football compliance” – was obviously conceived by Oceania’s Ministry of Truth).

Is this decision-making setup weird? No, in fact it may be ideal.

I was struck by how Pederson, during a press conference, corrected a reporter’s estimate of the historical probability of success (the “base rate”), citing his staff’s model estimate off the top of his head. His facility of recall regarding the base rate is textbook behavioral science. The seminal work of Daniel Kahneman and Amos Tversky showed that what they termed “base rate neglect” is common in poor decision making. Here is an example from Kahneman’s Thinking, Fast and Slow:

An individual has been described by a neighbor as follows: “Steve is very shy and withdrawn, invariably helpful but with little interest in people or in the world of reality. A meek and tidy soul, he has a need for order and structure, and a passion for detail.” Is Steve more likely to be a librarian or a farmer?

As Kahneman wryly notes, it helps to know that there are roughly 20x more male farmers than male librarians in the United States. The base rate is very important information in making the right judgment.
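The Steve example can be worked through with Bayes' rule in odds form. The 20-to-1 base rate comes from the passage above; the likelihood ratio (how much better the description fits a librarian than a farmer) is a number I made up purely for illustration.

```python
def posterior_librarian(farmers_per_librarian, likelihood_ratio):
    """Probability that Steve is a librarian rather than a farmer.

    farmers_per_librarian: the base rate, e.g. 20 male farmers per male librarian.
    likelihood_ratio: how many times more likely the description is to fit
    a librarian than a farmer (a hypothetical input).
    """
    if farmers_per_librarian < 0 or likelihood_ratio < 0:
        raise ValueError("inputs must be non-negative")
    librarian_weight = 1 * likelihood_ratio
    farmer_weight = farmers_per_librarian * 1
    if librarian_weight + farmer_weight == 0:
        raise ValueError("at least one input must be positive")
    return librarian_weight / (librarian_weight + farmer_weight)


# Suppose the description is 4x more typical of librarians (an invented figure).
print(round(posterior_librarian(20, 4), 3))

# The description would need to be 20x more typical just to reach even odds.
print(posterior_librarian(20, 20))
```

Even if the description fits librarians four times better, Steve is still about five times more likely to be a farmer, which is the whole point of not neglecting the base rate.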

I am struck by Pederson’s description of how he receives base rate and other key information, immediately prior to making an informed judgment in which he also will take non-quantitative measures into account (e.g. how the defense is playing, the weather, injuries, etc.). Here, the Eagles are harnessing another key behavioral tic – anchoring bias. Anchoring is the tendency to overweight proximate (sometimes irrelevant) information that is the starting point in making a decision under conditions of uncertainty.

Again, Kahneman and Tversky were among the first to describe and investigate anchoring, and by 2017 there exists a vast academic literature on it. My favorite study was conducted by James Montier, a financial economist (full disclosure: also a former colleague). Montier asked hundreds of subjects [1] (mainly fund managers and financial analysts) the following:

1) Please write down the last four digits of your telephone number
2) Is the number of physicians in London higher or lower than this number?
3) What is your best guess as to the number of physicians in London?

Montier loves shocking people with his results:

[T]hose with telephone numbers above 7000 believe there are on average just over 8000 doctors. Those with telephone numbers below 3000 think [there] are around 4000 doctors. This represents a very clear difference of opinion driven by the fact that investors are using their telephone numbers, albeit subconsciously, as inputs into their forecast.

Clearly, one’s personal telephone number should have no bearing on one’s estimate of the number of physicians in London. The fact that it does, consistently, for intelligent, educated, statistically-minded professionals speaks to the power of anchoring bias.

Over the last decade, hedge funds have paid attention to the behavioral science literature, and have sought to anchor their professionals to salient, predictive data. This has driven hybrid-quantitative trading strategies where a fund manager is augmented by an algorithmic or otherwise quantitative model. The outputs of the model are then used as anchors for further tweaking by a human who is aware of variables outside the model’s specification. Wall Street got this idea, in part, from the world of chess, where, by 2005, the best type of player was a hybrid expert + model, capable of beating Grandmasters and machines. Similarly, University of Pennsylvania professor Philip Tetlock [2] has found that expert forecasters can significantly improve their decision processes by relying on models to improve their calibration of variables such as base rates. The Intelligence Advanced Research Projects Activity (IARPA) has taken note of these results, and the CIA has been studying similar literature since 1999.

When Doug Pederson hears base rate data and associated variables over his headset, the Eagles organization is harnessing anchoring bias, and turning it from a bug into a feature. Moreover, the augmented expert approach is consistent with that of the more sophisticated analysts in the fields of finance, academia, chess, and government.

It’s nice to root for a team that pays attention to the world outside sports, rather than snickering at it dismissively.

@sunset_shazz is an Eagles fan who lives in Marin County, California. Check out his previous article during last year's Air Yards debate.

[1] His initial sample was 300 subjects, though I believe he has replicated this finding with subsequent samples.

[2] Since 2012, I have been a participant in Tetlock’s Good Judgment Project.

Tagged with Doug Pederson, Fourth Down, 2017, Jeffrey Lurie.

September 28, 2017 by Brian Solomon.

McNabb or Kolb

The Eagles blog that outlasted two quarterbacks.

Copyright © 2010-19 McNabb or Kolb. All Rights Reserved.