For a project for @simplystats, I had to analyze a dataset. I picked NFL data, since NFL.com had data from 2000-2012 on its website. I first scraped the play-by-play data, but cleaning it would have taken more time than I had. I decided instead to scrape the box-score data (text example, more polished output), but even this had some problems. Two interesting summary measures were Kickoff Returns (Number-Yards) and Kickoffs (Number-In End Zone-Touchbacks).
Many of you know that in 2011 the NFL moved the kickoff line from the 30-yard line to the 35-yard line, but did not change where the ball is placed after a touchback (still the 20). As a result, we wanted to see how dramatically touchbacks and return yardage have changed in the 2 years since (2011-2012). We wanted to keep things very simple and at a grand-view level.
Overall, we looked at the total number of kickoffs, those that reached the end zone, the number of touchbacks, and the number of returns. Across regular-season and post-season games, we see that the number of touchbacks has increased dramatically (and naturally the number of returns has decreased) since the change in 2011.
Also, the bottom panel shows the distribution of these metrics (aggregated at a team level), with the overall boxplot showing that the variability is not changing drastically over the years. Individual teams may be changing drastically, but again we’re looking more at a “league level”.
Now, using the average return yardage (kickoff return Yards divided by the Number of kickoff returns), we looked at how this may have changed in 2011-2012.
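The calculation is just total yards over total returns. Here is a minimal sketch in Python, assuming each box-score entry stores Kickoff Returns as a "Number-Yards" string such as "3-75". The function names and the sample values are made up for illustration; they are not from the scraped data.

```python
def parse_kickoff_returns(field):
    """Split a 'Number-Yards' string into (number_of_returns, yards).

    Splitting only on the first hyphen keeps a negative yardage
    such as '1--3' intact (1 return for -3 yards).
    """
    number, yards = field.split("-", 1)
    return int(number), int(yards)


def average_return_yards(fields):
    """Pooled average: total yards divided by total number of returns."""
    total_returns = 0
    total_yards = 0
    for field in fields:
        number, yards = parse_kickoff_returns(field)
        total_returns += number
        total_yards += yards
    if total_returns == 0:
        return None  # no returns at all (e.g. every kickoff was a touchback)
    return total_yards / total_returns


# Hypothetical entries for one team's games, not real data.
example_fields = ["3-75", "0-0", "5-110"]
example_average = average_return_yards(example_fields)
```

Pooling yards and returns before dividing, rather than averaging per-game averages, keeps games with few returns from being over-weighted.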
The top plot shows the spaghetti plot for each team (the Houston Texans start in 2002), and a loess (Cleveland) smoother (with 95% confidence interval) shows a weak increase in average return yardage over the years. Looking at the distributions across teams (bottom panel) actually shows that seasons 2011-2012 have a slight jump compared to the other seasons. There are a few ways to test this (Wilcoxon rank-sum test collapsing the past years, linear spline, etc., and each was “significant”), but we simply present the data. Overall, there’s an estimated 1.75 yard increase (compared to 2000-2010) or 1.53 yard increase (compared to just the 2010 average) in the 2011-2012 seasons.
So there are more touchbacks, but it seems that when a kickoff is returned, it is returned for slightly more yards than in years past. This could be for a slew of reasons: a better selection procedure, the kicks being returned are shorter, maybe it is harder for the defense to cut angles (thanks T. Louis), or something else. Whatever it is, I thought it was interesting.
Term | Estimate (lower, upper bound)
--- | ---
Avg Return Yardage for Year 2000 | 21.51 (21.05, 21.96)
Slope for Years 2000-2010 (avg yards/year) | 0.13 (0.07, 0.19)
Change in slope for years 2011-2012 (avg yards/year) | 0.51 (0.15, 0.87)
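The three rows above read like the coefficients of a linear spline: an intercept at year 2000, a slope for 2000-2010, and a change in slope afterwards. Here is a rough sketch of how such a model could be set up in plain Python. It assumes a single knot at 2010 and ordinary least squares, which may not be exactly what was fit here; the function names are hypothetical and the example input is synthetic, not the scraped data.

```python
def spline_design_row(year, origin=2000, knot=2010):
    """Row of the design matrix: intercept, years since origin, hinge term."""
    return [1.0, float(year - origin), float(max(0, year - knot))]


def solve_linear_system(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_linear_spline(years, values, origin=2000, knot=2010):
    """Least-squares fit via the normal equations (X'X) beta = X'y.

    Returns [value at origin, slope up to the knot, change in slope after it].
    """
    rows = [spline_design_row(y, origin, knot) for y in years]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Synthetic, noise-free league averages generated from known coefficients,
# only to show that the fit recovers them. Not real NFL data.
years = list(range(2000, 2013))
values = [21.5 + 0.1 * (y - 2000) + 0.5 * max(0, y - 2010) for y in years]
intercept, slope, slope_change = fit_linear_spline(years, values)
```

With this parameterization the slope after the knot is `slope + slope_change`, so the third coefficient directly measures how much the trend changed in 2011-2012.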
Acknowledgements: ggplot2 is awesome