**I asked the nice people at Infostrada Sports Group for a whole slew of data regarding penalties taken in the Premiership and they were kind enough to provide me with it. They can be found *online* and on *twitter*.**

The plot below contains every referee who has officiated a game since the beginning of the 2001-02 season. The axes are pretty self-explanatory, but the labelling can be confusing due to the sheer number of names. Click on any of the plots in this post to view a larger, high-res version.

I expected this spread to look kind of messy towards the left, where variance can have a huge effect, but I'm not sure how large I expected the spread to be towards the right-hand side. Luck should do a good job of evening things out there, yet Mark Clattenburg, for example, awards ~50% more penalties than Graham Poll or Peter Walton. I suspect part of this is down to officiating in a different era: more penalties are awarded now than ten years ago (link), but whether that is due to a change in play or a change in the mentality of referees I'm not sure. It could well be a mixture of both.

We can, however, take this plot one step further and add some lines to map the standard error, i.e. the amount of random variation we'd expect to find within the sample. The plot below, known as a funnel plot, has had three lines added to it. The solid line at the centre represents the mean number of penalties awarded per game (0.241) and the two dotted lines represent the upper and lower confidence limits, which lie two standard errors either side of the mean.
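As a rough sketch of how such limits can be computed (treating penalties-per-game approximately as a binomial proportion, which is my own assumption since the post doesn't spell out its exact method, and multi-penalty games do occur):

```python
import math

P_BAR = 0.241  # league-wide mean penalties per game, from the post


def funnel_limits(n_games, p=P_BAR, z=2.0):
    """Lower and upper control limits at z standard errors for a
    referee who has officiated n_games matches.

    Approximates penalties-per-game as a binomial proportion, a
    reasonable sketch while games with multiple penalties are rare.
    """
    se = math.sqrt(p * (1 - p) / n_games)
    return p - z * se, p + z * se


# After 264 games (Mike Dean's sample) the 2-SE band is roughly
# 0.188 to 0.294, so his 0.375 penalties per game sits well above it.
lo, hi = funnel_limits(264)
```

The key point is that the band width shrinks like 1/sqrt(n), which is what gives the limits their funnel shape.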

The basic idea of a funnel plot is that, if you take an action such as flipping a coin, then in the first few flips you're much more likely to see huge deviations from the expected behaviour (the mean) than over a long run of flips, i.e. you're more likely to flip four straight heads than ten straight heads. As the action is repeated the results will regress towards the mean, and so the confidence limits become narrower the more times the action is repeated, hence the funnel shape of the confidence limits.
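The coin-flip intuition is easy to check directly, since the chance of opening with k consecutive heads on a fair coin is simply 0.5^k:

```python
# Probability of opening with k consecutive heads on a fair coin.
def p_straight_heads(k: int) -> float:
    return 0.5 ** k


p4 = p_straight_heads(4)    # 1/16 = 0.0625
p10 = p_straight_heads(10)  # 1/1024, roughly 0.001
```

Four straight heads is 64 times more likely than ten straight, which is the sense in which small samples produce wild extremes that longer runs smooth away.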

So what does this plot tell us? Well, ~95% of the data points should fall within the lines, and that is pretty much reflected in the plot; only two referees sit a significant distance from the mean: Jonathan Moss (8 penalties in 13 games, 0.615 penalties per game) and Mike Dean (99 in 264, 0.375).

Moss officiated relatively few games, possibly for just this reason (on average we'd expect to see 3 penalties in 13 games). Dean, on the other hand, awards 55% more penalties than average even after 264 games, and sits as the true outlier on this plot. Basically, if I were desperate for my team to win a penalty I'd want Mike Dean on the pitch.
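To put a rough number on how surprising Moss's tally is, we can model penalties as arriving at the league-average rate and ask how likely 8 or more in 13 games would be. The Poisson model here is my own assumption, not a calculation from the post:

```python
import math


def poisson_tail(k_min, lam):
    """P(X >= k_min) for X ~ Poisson(lam)."""
    return 1.0 - sum(
        math.exp(-lam) * lam**k / math.factorial(k)
        for k in range(k_min)
    )


lam = 0.241 * 13              # expected penalties in 13 games, about 3.1
p_moss = poisson_tail(8, lam)  # chance of 8 or more, about 1.5%
```

Under this sketch a referee awarding at the league-average rate would see 8+ penalties in 13 games only about 1.5% of the time, consistent with Moss sitting outside the 2-SE band.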

The second thing I want to look at is the propensity of referees to 'even up' games by awarding penalties to both teams in the same match. I think most fans suspect this behaviour occurs, but can we be sure that any referees actually do it?

The following plot looks at the number of games in which a referee awards a penalty and how many of those games see penalties awarded to both teams:

So what does this tell us? Well, the confidence limits are wide because of the small sample sizes. Mark Clattenburg, for example, has awarded penalties in 40 different games and has never awarded a penalty to both teams, yet we still can't be sure this isn't down to random effects. At the opposite end of the scale, it's probably still quite a stretch to say with any confidence that Alan Wiley and Mike Dean are in the habit of evening the score. Basically, the small sample sizes mean we don't learn all that much here.
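To see why Clattenburg's zero isn't conclusive, consider a hypothetical base rate: if the league-wide chance that a penalty game sees penalties awarded to both teams were q (the 5% figure below is made up for illustration, not taken from the data), then the chance a referee sees none in 40 such games is:

```python
q = 0.05  # hypothetical league-wide both-teams rate, not from the post

# Probability of zero both-teams games in 40 penalty games,
# assuming independence between games.
p_zero_in_40 = (1 - q) ** 40  # about 0.13
```

At roughly 13%, an all-zero run of 40 games is entirely plausible by chance alone under this assumed rate, which is exactly the "we still aren't sure" conclusion above.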

Next time I'll have a look at the split between the number of penalties awarded to home and away teams, whether home or away teams are better at converting penalties, and whether the ability to convert a penalty is score-dependent.