number of yards advanced = 2*off1*random1 - 2*def2*random2
where off1 is team 1's offensive rating, def2 is team 2's defensive rating, and random1 and random2 are random numbers between 0 and 1. One might argue that real games are more thought out than this (I hope they are!), but there is a random component to the game which is ignored by most. The model simulates football games, using the above formulation to advance the ball. For a given matchup of two teams, the results of one simulated game differ from the next (just as real games would differ if two teams could find a way to play each other again and again). For given offensive and defensive strengths, the model plays 10,000 simulated games to obtain the average scores and the percentage of games in which team 1 beats team 2. A database covering a range of offensive and defensive matchups has been compiled for matching against real games.
The model is applied to actual football games by striking a balance between fitting each team's offensive and defensive strengths to the actual scores and keeping those strengths consistent from week to week.
As an illustration, let's assume that BYU and New Mexico decided to play five games in a row (not likely, but it simplifies the example). Let's assume that for these games their offensive and defensive strengths remain the same (no injuries, no improvements in skill, etc.). Here's how the games might turn out (these scores were generated from game simulations):
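As a rough sketch of the yardage formula above (not the model's actual code), the Python below draws the two random numbers and averages the result over many plays. The function names, the seed, and the ratings 9.4 and -1.1 are chosen here for illustration only. The rules that turn yardage into downs and points are not given in the text, so the sketch stops at the per-play level.

```python
import random


def yards_advanced(off1, def2, rng):
    """One play: 2*off1*random1 - 2*def2*random2, with randoms in [0, 1)."""
    random1 = rng.random()
    random2 = rng.random()
    return 2 * off1 * random1 - 2 * def2 * random2


def mean_yards(off1, def2, n_plays=10_000, seed=0):
    """Average yards per play over n_plays simulated plays (illustrative helper)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = sum(yards_advanced(off1, def2, rng) for _ in range(n_plays))
    return total / n_plays


# Each random number averages 0.5, so the long-run mean is off1 - def2.
# With the illustrative ratings below, that is 9.4 - (-1.1) = 10.5 yards.
print(mean_yards(9.4, -1.1, n_plays=100_000, seed=1))
```

Because each random number averages one half, the factor of 2 means the expected gain per play is simply off1 minus def2; any single play can land well above or below that.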
Hypothetical series matchup between Brigham Young and New Mexico:

          Brigham Young                        New Mexico
score      game_fit    series_fit   score      game_fit    series_fit
         def    off    def    off            def    off    def    off
   49    0.7    9.8    1.5    9.4      45   -1.4   12.8   -1.1   11.5
   49    1.1    9.9    1.5    9.4      41   -1.4   12.2   -1.1   11.5
   42    0.9    8.9    1.6    9.4      42   -0.8   12.5   -1.1   11.5
   45    3.1    9.3    1.6    9.4      16   -1.0    9.2   -1.1   11.4
   45    1.4    9.4    1.7    9.4      38   -1.1   11.7   -1.1   11.3
If, for each game, the offensive and defensive strengths are adjusted so that the average score produced by those strengths matches the actual game result, the resulting parameters are those given above in the "game_fit" columns. On average these parameters roughly agree with the "actual" ones (those that I used to generate this hypothetical matchup: BYU off=9.4 def=1.8, NM off=11.1 def=-1.1).
In the real world, it is not easy to distinguish changes in scores caused by changes in strengths from those caused by simple random fluctuations. The model simply attempts to find the best balance it can between fitting the offense and defense to the scores and keeping them constant between games. Above I have also listed how the model would interpret this hypothetical series of games in the "series_fit" columns.
Now, what if the real offensive or defensive strengths of a team change? The model cannot predict that, but it can attempt to track it. For example, below is a chart for a hypothetical championship series between two teams. In the middle of the series, one team's offense drops in value (let's say the starting quarterback is injured).
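The text does not give the exact fitting procedure, so the following is only a sketch of one way such a balance could be struck, assuming a penalized least-squares form: stay close to each game's fitted value while penalizing changes between consecutive games. The function name `smooth_ratings` and the weight `lam` are hypothetical, and the result is not expected to reproduce the "series_fit" columns.

```python
def smooth_ratings(game_fit, lam):
    """Minimize sum((s[t] - game_fit[t])**2) + lam * sum((s[t] - s[t-1])**2).

    lam = 0 returns the per-game fits unchanged; a large lam pulls every
    game toward one constant value. Solved as a tridiagonal linear system.
    """
    n = len(game_fit)
    if n == 0:
        return []
    # Diagonal: 1 plus lam for each neighbouring game (one at the ends, two inside).
    diag = [1.0 + lam * ((i > 0) + (i < n - 1)) for i in range(n)]
    rhs = [float(g) for g in game_fit]
    # Forward elimination (off-diagonal entries are all -lam).
    for i in range(1, n):
        w = -lam / diag[i - 1]
        diag[i] -= w * (-lam)
        rhs[i] -= w * rhs[i - 1]
    # Back substitution.
    s = [0.0] * n
    s[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        s[i] = (rhs[i] + lam * s[i + 1]) / diag[i]
    return s


# BYU's per-game defensive fits from the hypothetical series above.
byu_def_game_fit = [0.7, 1.1, 0.9, 3.1, 1.4]
print(smooth_ratings(byu_def_game_fit, lam=2.0))
```

In this sketch the smoothed values keep the same average as the per-game fits but vary less from game to game, which is the general behavior described above: a single unusual score (such as the 3.1 in the fourth game) moves the rating only part of the way.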
Hypothetical series matchup between Penn State and Florida:

           Penn State                           Florida
score        "real"    series_fit   score        "real"    series_fit
         def    off    def    off            def    off    def    off
   38    5.0   16.5    4.8   15.9      41    4.3   15.3    4.7   15.7
   49    5.0   16.5    4.7   15.6      49    4.3   15.3    4.9   15.8
   42    5.0   15.7    4.6   15.2      38    4.3   15.3    5.2   15.8
   27    5.0   15.0    4.6   14.8      41    4.3   15.3    5.4   15.9
   14    5.0   15.0    4.6   14.6      41    4.3   15.3    5.6   15.9
The model does detect a change, but since it has to average between games (because of the erratic results any one game gives), the change is spread out over a few games. Also note that some of the drop in Penn State's "real" offense is translated into an increase in Florida's "series_fit" defense, since the model could not tell whether the drop in Penn State's scoring came from a weaker Penn State offense or a stronger Florida defense. (If these teams played other teams as well, the model could sort this out better.)
Please email comments or questions to bfm@BassettFootball.net