This is the latest in a series of posts looking at which team-level metrics are repeatable from season to season in the Premiership. The next three paragraphs are generic for the series, so feel free to skip them.
As I’ve said before, it’s all well and good knowing that team ‘x’ took 20 shots in the first half against team ‘y’, but unless we know whether the number of shots a team takes is repeatable over time then trying to put that number into context is essentially useless (a point that is made far more eloquently by Richard Whittall in this column).
Ultimately, determining how repeatable a metric is allows it to be broken down into a ‘skill’ component – something that teams can control – and a ‘luck’ component – something teams have no control over. (For the story so far, I’ve placed a summary table at the bottom of this post, which I’ll continue to update in the future.) Metrics that are dominated by luck are wonderful insofar as it’s funny to watch the media play out narratives that can be explained very simply by regression towards the mean over time. Metrics that are dominated by skill, though, tell us something useful about the team posting the numbers, something that will be repeated season after season, and so they are the metrics we should truly be interested in.
The theory here in this series is pretty simple. I take a big group of teams and compare how well the value they record for a given metric in one season correlates with the same metric the following season. Determining the correlation coefficient (R value) of a plot with year ‘n’ on the x axis and year ‘n+1’ on the y axis allows the breakdown of a metric into skill and luck components to be established. The sample comprises the 204 pairs of ‘back-to-back’ team Premiership seasons that have occurred since the beginning of the ’00-01 Premiership season (17 non-relegated teams per season x 12 back-to-back seasons).
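For anyone who wants to try the year ‘n’ versus year ‘n+1’ comparison themselves, here is a minimal Python sketch of it. The numbers in it are made up purely for illustration, and the function names are my own. One loud assumption: the sketch reads the skill share straight off as R (so R = 0.75 would mean 75% skill), which is one plausible reading of the splits quoted in this series rather than a confirmed recipe.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient (R) between two paired samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample has no spread")
    return cov / sqrt(var_x * var_y)


def skill_luck_split(r):
    """ASSUMPTION: skill share is taken as R itself, luck as the remainder.

    Returns (skill %, luck %) rounded to whole percentages.
    """
    skill = round(100 * r)
    return skill, 100 - skill


# Made-up values of some metric for six teams in back-to-back seasons.
year_n = [68, 55, 47, 72, 39, 51]
year_n_plus_1 = [71, 49, 45, 66, 44, 48]

r = pearson_r(year_n, year_n_plus_1)
skill, luck = skill_luck_split(r)
print(f"R = {r:.2f}, skill = {skill}%, luck = {luck}%")
```

In the real analysis each list would hold 204 entries, one per back-to-back team season.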
This time I’m focussing on goals. Goals are clearly important in football – as I showed a couple of years back they’re highly correlated to points – and it’d be ideal if they were highly repeatable from season to season. At the end of the season they’re easily found and freely available in the league table, so the only extra information we’d need to build a predictive model is the relationship between points and goals, and how far we should regress the value posted in year ‘n’ to get the best prediction for year ‘n+1’. So let’s see if we can go about doing that.
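To make the ‘how far should we regress’ idea concrete, here is a small Python sketch of the textbook regression-to-the-mean estimate: pull the year ‘n’ value towards the league average, keeping a fraction R of the gap. It assumes a linear relationship and a similar spread of values in both seasons, and the function name, league average and R used in the example are hypothetical, not figures from this series.

```python
def regress_to_mean(value, league_mean, r):
    """Estimate next season's value by shrinking towards the league mean.

    ASSUMPTION: linear relationship and equal spread in both seasons,
    so the expected year n+1 value keeps a fraction r of the year n gap.
    """
    return league_mean + r * (value - league_mean)


# Hypothetical example: a team scores 70 when the league average is 50,
# and the season-to-season correlation for the metric is 0.75.
prediction = regress_to_mean(70, 50, 0.75)
print(prediction)  # 65.0
```

The closer R is to 1, the more of the year ‘n’ value survives; with R near 0 the best guess is simply the league average.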
First up, this is the number of goals scored by a given team in year ‘n’, and year ‘n+1’.
That’s a promising start – the breakdown is 75% skill and 25% luck. The best we’ve seen so far in this series.
Next, let’s take a look at goals against.
Not quite the same – and if you check out the table below, it’s a pattern that has been emerging in this series: the metric that is controlled by defense is less repeatable than the equivalent controlled by a team’s attacking ability.
Next up let’s take a look at goal difference – it’d be really useful if the correlation here was good – it’s included in every league table, almost regardless of how simple they are.
So yer, that’s kind of handy. We’re up to 83% skill and 17% luck. Basically if we look at the league table and a team seems way out of place given their goal difference (hey Newcastle ’11-12) we can reasonably expect some regression the following season.
One last thing, goal ratio, defined as goals for/(goals for + goals against).
So there’s a slight improvement here, but it’s minuscule. The split remains 83/17 and, unless you have a spreadsheet that will do it for you at the click of a button, it’s probably not worth the extra effort it takes to calculate.
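If you’d rather not do it by hand, the two numbers compared above take only a few lines of Python. This is just a sketch of the definitions given in this post; the function names and the example scoreline are my own.

```python
def goal_difference(goals_for, goals_against):
    """Goal difference as shown in a league table."""
    return goals_for - goals_against


def goal_ratio(goals_for, goals_against):
    """Goal ratio: goals for / (goals for + goals against)."""
    total = goals_for + goals_against
    if total == 0:
        raise ValueError("goal ratio is undefined when no goals were scored or conceded")
    return goals_for / total


# Hypothetical season: 60 scored, 40 conceded.
print(goal_difference(60, 40))  # 20
print(goal_ratio(60, 40))       # 0.6
```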
In short though, goals are really repeatable, and really useful. Given how well they’re correlated with points, they stand a solid chance of being the basis of a great seasonal predictive model.
Finally, below is a table summarising this series so far, with each metric broken down into its skill and luck components. Skill and luck are defined therein in the context of ‘the repeatability of metric ‘x’ is ‘y%’ skill driven, and ‘z%’ luck driven at the team level over the course of a Premiership season’. Click on the names of any of the metrics to be taken to the post with the relevant plots posted.
|Metric||% skill||% luck|
|% of total shots that are on target (%TSOT) for||53||47|
|%TSOT for + %TSOT against||52||48|
|PDO (penalties excluded) (1)||46||54|
|% of total shots that are on target (%TSOT) against||44||56|
|sh% on shots from inside the box (2)||37||63|
|sh% (penalties excluded) (1)||36||64|
|sv% (penalties excluded) (1)||32||68|
|sv% on shots from inside the box (2)||24||76|
|sv% on shots from outside the box (2)||23||77|
|Penalties awarded differential (penalties awarded for minus penalties awarded against) (1)||||
|Having penalties awarded against (1)||9||91|
|(penalty goals for minus penalty goals against) (1)||||
|sh% on shots from outside the box (2)||8||92|
|Being awarded penalties (1)||4||96|
|Penalty goals conceded (1)||3||97|
|Penalty goals scored (1)||<1||>99|