Thursday, August 23, 2018

Due North: Analytics Research in Canadian Football

1-Abstract

A broad survey of the history of scholarship in Canadian football analytics as it relates to in-game decision-making. Works discussed span from 1982 to 2016, and while most focus on CFL data there is some mention of Canadian university football research (then CIS, now known as U Sports). Eleven different works covering a variety of topics create the basis of modern analytics in Canadian football.

2-Introduction

Although football analytics research has been well discussed (Clement 2018a, 2018b, 2018c) insofar as one considers only American football, the discussion of Canadian football research in a meta-fashion has remained relatively sparse. Work has largely been hampered by a lack of data, in both supply and availability. Supply is short because there are only ~100 CFL or CIS games per year, versus 256 NFL and ~800 NCAA FBS games per year. Availability is short because Canadian play-by-play data is hard to come by, whereas American football play-by-play data has been cultivated and stored in a number of different well-organized databases.

3-Analytics

The first, and perhaps most definitive, work is that of Peter C. Bell from the University of Western Ontario, who created an EP model in the style of Carter & Machol (1978) using 46 games of CFL data from 1978 (Bell 1982). Following Carter & Machol’s methodology, Bell divided the field into 10-yard chunks to bin his limited data, and from each of these used a Markovian approach. Each 1st & 10 state can lead to another 1st & 10 in one of the eleven 10-yard chunks, terminate in an opponent’s possession in one of the 10-yard chunks, or end with one of the four scoring plays Bell considered most likely: touchdown, field goal, rouge, or defensive touchdown. Interestingly, Bell does not consider the safety among the likely ending states of his model, perhaps because while it is a common ending in the first couple of 10-yard chunks, it is almost impossible to surrender a safety once the offense is past its own 30-yard line. It is also possible that in the CFL in 1978 teams did not concede safeties with the same regularity that they would in later decades. Bell adjusted the value of scoring plays to account for the subsequent change in possession, but unlike all such calculations done in American football (Clement 2018b), Bell’s work shows that a kickoff has positive expectation for the kicking team, and therefore that a touchdown has an effective value slightly greater than 7 (7.050) points. The magnitude of the increase is too small to be of consequence, almost certainly being absorbed. Table 1 shows Bell’s results for EP by field position on 1st and 10.
Field position (yards to goal line)    Expected Points
105                                    -1.690
95                                     -0.873
85                                     -0.171
75                                      0.305
65                                      1.051
55                                      1.604
45                                      2.353
35                                      3.160
25                                      3.672
15                                      4.248
5                                       5.751
Table 1 Expected Points by Field Position (Bell 1982)
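The near-linearity of Table 1 can be checked with a straight-line fit. The following is a minimal sketch, assuming an ordinary least-squares fit to the eleven tabulated points; it is not necessarily the regression procedure Bell himself used, and the function name ols_fit is only illustrative.

```python
# Values transcribed from Table 1 (Bell 1982): yards to goal line and EP on 1st & 10.
yards_to_goal = [105, 95, 85, 75, 65, 55, 45, 35, 25, 15, 5]
expected_points = [-1.690, -0.873, -0.171, 0.305, 1.051, 1.604,
                   2.353, 3.160, 3.672, 4.248, 5.751]


def ols_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = ols_fit(yards_to_goal, expected_points)
# The slope is negative because EP falls as the distance to the goal line grows;
# its magnitude is the EP gained per yard advanced.
print(f"EP gained per yard advanced: {-slope:.3f}")
```

On these values the fit gives roughly 0.07 points per yard, with the largest departure from the line in the chunk nearest the goal line.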
Consistent with other works, and especially with Carter & Machol, Bell’s results show EP being largely linear, trending sharply upwards in the last section. Bell’s regression slope matches that of Carter & Machol at 0.07 points per yard. Bell also calculates P(1D), and from it derives both a set of 3rd down recommendations and the break-even P(1D) rates for different situations. In broad strokes, Bell’s conclusions encourage coaches to go for it on 3rd & 1 anywhere on the field, 3rd & 3 or less in field goal range, and 3rd & 5 or less in the red zone. Table 2 shows Bell’s P(1D) results for 3rd down by distance.
Distance    Attempts    Successes    Observed P(1D)    Smoothed P(1D)
1           86          71           0.83              0.76
2           19          10           0.53              0.69
3           8           5            0.62              0.61
4           7           3            0.43              0.54
5           8           5            0.62              0.46
6           2           1            0.50              0.39
7           0           0            ---               0.32
8           6           2            0.33              0.24
9           3           0            0.00              0.17
Table 2 P(1D) of 3rd down by Distance (Bell 1982)
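The notion of a break-even P(1D) can be sketched as follows. Going for it is worthwhile when p * EP(success) + (1 - p) * EP(failure) exceeds the EP of kicking, so the break-even rate is p = (EP(kick) - EP(failure)) / (EP(success) - EP(failure)). The function name and the scenario below are illustrative assumptions: the EP inputs are read from Table 1, the punt outcome is invented, and the result is not meant to reproduce Bell’s own recommendations.

```python
# Smoothed P(1D) on 3rd down by distance, transcribed from Table 2 (Bell 1982).
SMOOTHED_P1D = {1: 0.76, 2: 0.69, 3: 0.61, 4: 0.54, 5: 0.46,
                6: 0.39, 7: 0.32, 8: 0.24, 9: 0.17}


def break_even_p1d(ep_success, ep_fail, ep_kick):
    """Conversion probability at which going for it and kicking have equal EP.

    Solves p * ep_success + (1 - p) * ep_fail = ep_kick for p.
    """
    if ep_success <= ep_fail:
        raise ValueError("a conversion must be worth more than a failure")
    return (ep_kick - ep_fail) / (ep_success - ep_fail)


# Illustrative scenario, NOT Bell's calculation: 3rd down with 55 yards to go
# to the goal line, using the 1st & 10 values from Table 1.
ep_success = 1.604   # keep the ball: 1st & 10 in the 55-yard chunk
ep_fail = -1.604     # turnover on downs: opponent has 110 - 55 = 55 yards to go
ep_kick = 0.171      # ASSUMED punt leaving the opponent 85 yards to go (EP -0.171 for them)

threshold = break_even_p1d(ep_success, ep_fail, ep_kick)
go_for_it = [d for d, p in SMOOTHED_P1D.items() if p >= threshold]
print(f"break-even P(1D): {threshold:.3f}")
print(f"distances where smoothed P(1D) clears it: {go_for_it}")
```

The threshold is sensitive to the assumed kick outcome, so the inputs deserve more scrutiny than the formula.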
Keith Willoughby is perhaps the most prolific author in the field, having written three published pieces on Canadian football. While reviewing the break-even point for missed field goal returns (Willoughby 2001) he developed an EP model by applying a linear regression to starting field position for drives in 18 games (n=272). It remains an open question whether using only the starting plays of drives (known as P & 10) is materially different from using all 1st & 10 plays or whether this is a needless restriction of the data. Regardless, Willoughby’s regression function comes out to y = -1.738 + 0.054x. While the x-intercept is consistent with other EP results in American football (Clement 2018b), the EP for drives is lower than what we would expect, with a maximum value of 4.202 points. A look at the visualization of Willoughby’s data shows a scarcity of data points as the distance to the goal line decreases, with the data dominated by points between the -10 and -45 yard lines (Willoughby 2001). Thus, the fitted line is largely an extrapolation, and previous work in American football has consistently shown a break from the linear model towards either goal line (Clement 2018b). Nevertheless, since the focus of Willoughby’s work was the decision between conceding a rouge after a missed field goal and returning the kick, the break-even point certainly lies in the area with the most data, and any breakdowns of the model outside this region are not relevant to the task at hand. If one were to make further criticism of the piece it would focus on the small sample size and the inclusion of data from only a single team in the league. However, in 2001, while computers had certainly infiltrated academia, they were still trying to make inroads in the video departments of football teams, and the support provided by the Saskatchewan Roughriders was certainly invaluable.
Willoughby argues that his slope agrees “reasonably well” with Carter & Machol (1971), but their slope was measured at 0.07, roughly 30% steeper than Willoughby’s 0.054 result.
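Willoughby’s fitted line is easy to evaluate directly. The sketch below assumes that x is measured in yards from the offense’s own goal line on a 110-yard field, which is the reading that reproduces the 4.202-point maximum; the function and constant names are illustrative only.

```python
# Coefficients of the fitted line reported in Willoughby (2001).
INTERCEPT = -1.738
SLOPE = 0.054
FIELD_LENGTH = 110  # yards between goal lines on a Canadian field


def willoughby_ep(yards_from_own_goal):
    """EP of a drive starting the given number of yards from the team's own goal line."""
    return INTERCEPT + SLOPE * yards_from_own_goal


max_ep = willoughby_ep(FIELD_LENGTH)   # value of the line at the far goal line
x_intercept = -INTERCEPT / SLOPE       # field position where the fitted EP is zero

print(f"maximum EP on the fitted line: {max_ep:.3f}")
print(f"EP crosses zero about {x_intercept:.1f} yards from the own goal line")
```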
Willoughby later used a number of per-game team statistics (rushing, passing, fumbles, interceptions, and sacks) to develop a logistic regression model allowing prediction of pre-game win probabilities (Willoughby 2002). He observes the strong statistical significance (p<0.01) of the coefficients for interceptions and rushing yards, but misunderstands the causality. Teams who are winning will run the ball, irrespective of its effectiveness, to run the clock. Conversely, teams who are trailing need to pass the ball and often take greater risks, leading to more opportunities for interceptions. Teams with many interceptions and rushing yards accumulated these as a consequence of winning, and they are only useful measures in the sense that winning teams tend to win (an unremarkable tautology).
Tangentially related is Dr. Willoughby’s development of a more sophisticated technique for CFL scheduling (Kostuk and Willoughby 2012), which, while not germane to this discussion, certainly serves to reinforce his passion for Canadian football and his support for it however he is best able to provide it.
Derek Taylor, currently of TSN, wrote a four-piece work while employed as a journalist at Global News. Using a dataset of Canada West games from 2007-2012 (~168 games) he looked at the raw EP in Canadian university football (Taylor 2013a). Because a raw model does not look beyond the current drive, there is no consideration for a defensive field goal, defensive rouge, or defensive safety. Taylor’s stated goal was examining the decisions that occur commonly in Canadian football, namely: whether to take a kickoff return or the ball at the -35 after surrendering a field goal, whether to return punts out of the end zone, and whether to punt or surrender a safety.
When discussing the benefits of choosing to return a kickoff after having conceded a field goal, Taylor considers the impact of returns for touchdowns and the average position of returns. His determination of the value of a kickoff return is based on the EP value of the average field position after a kickoff return (Taylor 2013b). This assumes that the underlying EP chart is linear. To test this we look to previous research on EP. While most EP studies do show a linear relationship (Clement 2018b), Taylor’s model is a raw one, and raw models demonstrate a quadratic relationship. If Taylor’s model is self-consistent, the impact of longer returns must be more heavily weighted, and the EP value of the average return is not the average EP value of the returns. Taylor uses the same methodology to discuss the option after a safety, but in this case, because the average return is beyond the -35, there is no question as to the better option.
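The gap between the EP of the average return and the average EP of the returns can be illustrated with a toy example. Everything below is hypothetical: the quadratic coefficient and the return positions are invented to show the effect of a convex EP curve and are not Taylor’s data.

```python
def toy_raw_ep(yards_from_own_goal):
    """HYPOTHETICAL convex (quadratic) raw-EP curve; the coefficient is invented."""
    return 0.0004 * yards_from_own_goal ** 2


# HYPOTHETICAL field positions (yards from own goal line) after five kickoff
# returns, including one long return.
return_positions = [15, 25, 30, 40, 90]

mean_position = sum(return_positions) / len(return_positions)
ep_of_mean = toy_raw_ep(mean_position)
mean_of_ep = sum(toy_raw_ep(x) for x in return_positions) / len(return_positions)

print(f"EP at the average field position:  {ep_of_mean:.3f}")
print(f"average EP over individual returns: {mean_of_ep:.3f}")
```

With a convex curve the second figure is always at least as large as the first, and the long return accounts for most of the difference.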
In the third installment of his series Taylor examined the rule wherein a returner on a punt or missed field goal who elects to return a kick out of the end zone will receive the ball at no less than the -20. Here he considers not only the initial difference in value but also the impact on future drives, by accounting for a theoretical 15-yard difference for the opponent on the next possession. Ultimately his numbers conclude convincingly that kicks should be returned out of the end zone regardless of whether the return expects to reach the 35-yard line. His conclusion holds not only for missed field goals, which provide the bulk of his analysis, but also for punts, which are more rarely fielded in the end zone, and even for kickoffs, where there is no guarantee of reaching the 20-yard line. We can question whether the use of a raw EP model has unduly influenced this third analysis, since the quadratic nature of raw EP models leads to relatively small differences in EP near one’s own goal line.
Taylor’s final and most impassioned piece looked at the habit of conceding safeties when faced with 3rd down in one’s own end of the field (Taylor 2013c). Taylor posits that the hidden part of the damage inflicted by the safety comes from the conceded possession and field position, which he values at 1.42 points. Here Taylor’s methods become convoluted. At the risk of hypothesizing atop an interpretation, Taylor seems to be doing his work manually, rather than having coded a scraper and parser to allow a computer to do the laborious parts of the work. As a result he is forced to make a number of assumptions, generalizations, and broad categorizations. Ultimately his conclusion is that the concession of a safety is the better course of action only within one’s own 10-yard line. University of Manitoba defensive coordinator Stan Pierre (whom the author considers a personal friend, but with whom he frequently disagrees) posited that allowing a touchdown carried with it some degree of immeasurable effect greater than the score itself. Taylor reworked his calculation assigning 9 points to a touchdown, an arbitrary figure meant to account for this unseen force, but his conclusions remained unchanged.
A team at Carleton University developed a Python scraper for CFL play-by-play data (Wu 2015a). Owing to the non-obvious structure of the HTML of CFL game data, this required the use of a number of Python packages. The play-by-play was organized by games and seasons into a complete database within an arborescence of folders (Wu 2015b). Finally, this database was loaded into MongoDB, a document database (Wu 2015c). Although this work did not produce any conclusions, the team did make their code available via GitHub (Wu 2015d), facilitating future work in the field by making available play-by-play data which would otherwise be very laborious to scrape.
The website 3 Down Nation, a leading source of journalism in Canadian football, published A Primer on Advanced Stats in Football (Dryden 2015a), introducing CFL fans to such notions as Pythagorean Wins and the Simple Rating System (SRS). SRS values for CFL teams are tracked through the related site CFLstats.ca (Dryden n.d.), which serves as a repository for player, team, and game data for CFL games reaching back to 2008.
A later piece from 3 Down Nation looked at the relationship between various turnover-related metrics and winning percentage (Fulton 2015). Unsurprisingly, his primary conclusion is that turnovers are strongly tied to winning. Fulton goes deeper than this tautology, however, and shows turnover ratio to be the strongest predictor of success. He also finds that interceptions are more costly than fumbles. This ties back to Willoughby (2002) above: teams that are losing throw the ball in a desperate attempt to come back, and are more willing to risk interceptions in the process. Additionally, these teams are generally worse than average, given that they are already losing. Conversely, teams who are winning do not throw the ball very much, and so avoid interceptions. Curiously, Fulton finds that giveaways are more impactful than takeaways, by a factor of 248%. By symmetry this is indeed odd and may merit further investigation.
In a continuation of Fulton’s work, Dryden (2015) goes further into the distinction between fumbles and interceptions. He notes that the field position difference is 3 yards in the offense’s favour for fumbles and 0 yards for interceptions, but that the gap grows to 12 yards if one considers punt return fumbles. Ultimately, Dryden finds that far more interceptions happen when the offense is trailing, whereas the gap for fumbles is relatively small.
A final piece from Fulton (2016) looked at season-to-season mean reversion of CFL teams. While Fulton shows some evidence of mean reversion, he does so in the context of season-to-season comparison and without a team-specific expectation, using a simple .500 record as the baseline instead. He finds weak support for the idea, though it is likely that if his methods had looked for mean reversion in teams relative to their Pythagorean expectation the evidence would have been much stronger, simply owing to the nature of good teams to remain good at least in the short term.
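Measuring a team against its Pythagorean expectation rather than against .500 could look like the sketch below. It uses the standard Pythagorean form PF^k / (PF^k + PA^k); the exponent of 2.37 is the value commonly quoted for the NFL and is only an assumption here, since no CFL-specific exponent is given in the works surveyed. The function names and the example team are hypothetical.

```python
def pythagorean_expectation(points_for, points_against, exponent=2.37):
    """Expected winning percentage from points scored and points allowed.

    exponent=2.37 is borrowed from common NFL usage (ASSUMPTION); a CFL-specific
    value would have to be fitted from CFL data.
    """
    pf = points_for ** exponent
    pa = points_against ** exponent
    return pf / (pf + pa)


def record_above_expectation(actual_win_pct, points_for, points_against):
    """Actual winning percentage minus Pythagorean expectation.

    Positive values mark teams whose record outran their point differential,
    the teams one would expect to regress the following season.
    """
    return actual_win_pct - pythagorean_expectation(points_for, points_against)


# HYPOTHETICAL team: 12 wins in 18 games, 480 points for, 450 against.
surplus = record_above_expectation(12 / 18, 480, 450)
print(f"winning percentage above Pythagorean expectation: {surplus:+.3f}")
```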
Fulton would continue his work through the CFL Analytics Lab, beginning with a piece discussing whether there was a statistically significant advantage for the Hamilton Tiger-Cats kicking field goals at home (Fulton 2016b). While Fulton finds that an advantage is present, he does not provide any measure of statistical significance, and given the recency of the Tiger-Cats’ new stadium, which forms the basis of the discussion, the effect of Hamilton’s own kickers in that brief time confounds the results. Still, in the field of Canadian football analytics his is the only investigation of field goal kicking. This piece also discussed differences in punting averages, but this proved to be confounded not only by the punters but also by the returners. Ultimately the study proved premature and in need of more time and data.

4-Conclusion

The corpus of Canadian football research presently consists of a small but growing collection of small pieces, led by the works of Bell (1982) and Willoughby (2001). The growing availability of data, as provided by Wu (2015d), will hopefully lead to growth in the field.
Modernized Canadian equivalents to the current American football analytics of P(1D), EP, WP and various kicking metrics are the obvious direction for future research to take, as well as comparative work between American and Canadian metrics.

5-References
