What the hell happened to Wigan? Part I – scorer bias at the DW?

Rather than reporting TSRs here I’m going to record the percentage of shots a team takes. It simply looks cleaner.

I’m going to start by laying out the basics and building up from there. Firstly, shots aren’t equally distributed between home and away teams. There’s a 56.6/43.4 split – i.e., of every 7 shots in a match, home teams will take an average of 4.

53% of these shots are on target (home=53.6, away=52.7) and, of the shots on target that aren’t penalties, almost exactly 20% are scored (home=20.2, away=19.8). Because the conversion rates are almost equal, the split in (non-penalty) goals, 57.9/42.1, is close to the 56.6/43.4 distribution of shots mentioned above.
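As a rough check on that arithmetic, here is a minimal Python sketch that multiplies the shot share, on-target rate and conversion rate quoted above to get an implied goal split. The function name is my own, and the sketch glosses over the fact that the conversion rates exclude penalties while the on-target rates may not, so it can only approximate the observed 57.9/42.1 figure.

```python
def implied_goal_share(shot_share, on_target_rate, conversion_rate):
    """Goals per 100 match shots implied by the three rates (all given as fractions)."""
    return 100 * shot_share * on_target_rate * conversion_rate


# Figures quoted in the post: share of shots, share on target, share of those scored
home = implied_goal_share(0.566, 0.536, 0.202)
away = implied_goal_share(0.434, 0.527, 0.198)

home_goal_share = home / (home + away)
away_goal_share = away / (home + away)

print(f"Implied goal split: {home_goal_share:.1%} / {away_goal_share:.1%}")
```

The implied split lands at roughly 57.5/42.5, within about half a point of the observed one, which is what you’d expect if the rounded rates are doing most of the work.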

I haven’t tested this across every team in the sample because it takes a fair bit of work but, for those I have, this gap of about 13 percentage points (56.6-43.4=13.2) is pretty repeatable, regardless of quality. My ‘go-to’ good team are United, and their home/away split is 66/54, whereas my ‘go-to’ bad team are Sunderland, whose split is 51/39.

That’s the groundwork covered, I think. I’m interested in this because for a few years there have been specific teams that consistently over- or under-perform every shots model I build to predict how well teams will do. Wigan are a prime example: they’ve taken 50.5% of the shots in the eight seasons they’ve been in the Premiership, and TSR suggests this should be good for ~424 points, or 53 per season. In fact they’ve racked up only 331, or 41 per season (as an aside, it’s incredible they’ve managed to average so few points and only be relegated once). Given the current state of football analytics, the simplicity of the models I’m using, and the role of luck in the Premiership, a ~20% error over one season isn’t a terrible prediction, but over 304 games that’s a long, long way to be out. So I’ve been looking at different ways to slice the data I have in order to understand the discrepancy. I don’t expect to find a simple fix that gives a perfect answer, but it would be useful to at least get an indication.
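For anyone who wants to check the sums, the per-season averages and the size of the miss can be reproduced in a few lines. This is only a sketch of the arithmetic using the totals quoted above; the variable names are mine.

```python
SEASONS = 8
GAMES_PER_SEASON = 38  # each Premiership side plays 38 league games a season

predicted_points = 424  # what the shots share suggests
actual_points = 331     # what Wigan actually earned

games = SEASONS * GAMES_PER_SEASON
predicted_per_season = predicted_points / SEASONS
actual_per_season = actual_points / SEASONS

# Shortfall relative to the prediction
shortfall = (predicted_points - actual_points) / predicted_points

print(f"{games} games")
print(f"predicted {predicted_per_season:.1f} pts/season, actual {actual_per_season:.1f}")
print(f"shortfall {shortfall:.1%}")
```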

One thing that’s been in the back of my mind for a while is the possibility of scorer bias at certain grounds. I’ve mentioned this a ton of times between this blog and Twitter, so it’s about time I did some of the legwork to look into it. The idea behind scorer bias is that, because the people recording the stats at a given arena are always the same, they may interpret the definitions of stats differently from scorers elsewhere. It’s a well-known phenomenon in the NHL and has been blogged about extensively, but I think the best illustrations are these from JLikens and Gabe Desjardins.

There are some metrics in football that are very strictly defined. For example the number of goals scored is irrefutable, as are the number of corners, cards, and offsides awarded in a given game. There are others though, such as shots, which are open to interpretation. Is a mis-hit cross deemed a shot if it goes on target? How about a blocked/deflected shot? That free kick from out wide that just bends past the far post – is that a shot or a cross?

Fortunately football lends itself particularly well to a home/away split when it comes to looking for scorer bias, because a team will play one away game at every other stadium, and therefore be ‘scored’ by every other set of scorers. Thus any bias in scorers at other stadia will be minimised, as their input is only one game of 19, as opposed to the home scorers, who presumably mark all nineteen of a team’s home games. So how am I determining whether a ground is subject to scorer bias? I’ve come up with three possible indicators:

1. A large discrepancy in the number of attacking events recorded between a team’s home and away games
2. A large discrepancy in the proportion of shots recorded as shots on target between a team’s home and away games
3. A large discrepancy in sh%/sv% between a team’s home and away games
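To make the three indicators concrete, here is a minimal Python sketch of how they might be computed from season totals. Everything in it is an assumption for illustration: the `VenueTotals` container, the function names, the choice to measure sh% and sv% against shots on target, and above all the numbers, which are made up and are not Wigan’s.

```python
from dataclasses import dataclass


@dataclass
class VenueTotals:
    """One team's totals across all its home games, or all its away games."""
    games: int
    shots_for: int
    shots_against: int
    sot_for: int        # shots on target taken
    sot_against: int    # shots on target conceded
    goals_for: int      # non-penalty goals scored
    goals_against: int  # non-penalty goals conceded


def indicators(t: VenueTotals) -> dict:
    total_shots = t.shots_for + t.shots_against
    total_sot = t.sot_for + t.sot_against
    return {
        # 1. attacking events recorded per game (both teams combined)
        "shots_per_game": total_shots / t.games,
        # 2. proportion of recorded shots deemed on target
        "on_target_rate": total_sot / total_shots,
        # 3. shooting and save percentages (assumed here to be per shot on target)
        "sh_pct": t.goals_for / t.sot_for,
        "sv_pct": 1 - t.goals_against / t.sot_against,
    }


def home_away_gaps(home: VenueTotals, away: VenueTotals) -> dict:
    """Home value minus away value for each indicator."""
    h, a = indicators(home), indicators(away)
    return {name: h[name] - a[name] for name in h}


# Made-up totals purely for illustration; NOT real data
home = VenueTotals(games=100, shots_for=1200, shots_against=1000,
                   sot_for=600, sot_against=500, goals_for=120, goals_against=100)
away = VenueTotals(games=100, shots_for=1000, shots_against=1200,
                   sot_for=520, sot_against=624, goals_for=104, goals_against=156)

gaps = home_away_gaps(home, away)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.3f}")
```

A gap near zero on all of these would suggest the home scorers are calling things much like everyone else; a large gap on the judgment-based stats but not on goals would be the suspicious pattern.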

Starting with number 1:

[Image: Screen Shot 2013-05-19 at 11.44.59 PM]

Of these, the largest discrepancy from the mean is in goals, which are 6% lower at the DW than in the average Premiership game. We can, however, discard these: as mentioned before, goals are an irrefutable stat. Shots on target are ~3% either side of the mean, and total shots are spread about 2% either side. There’s no evidence of scorer bias here.

Number 2:

[Image: Screen Shot 2013-05-19 at 11.47.14 PM]

In each case here a 1% deviation from the mean is about 34 shots, so we’d have expected ~40 more shots to be called ‘on target’ than we’ve seen. But this is over a 150-game sample. If there’s bias here, it’s to the tune of one shot per four games being erroneously deemed ‘off target’. That’s not conclusive enough for me.
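The back-of-envelope numbers in that paragraph can be unpacked as follows. This is just a sketch of the arithmetic using the rounded figures quoted above, with variable names of my own choosing.

```python
shots_per_point = 34    # "a 1% deviation from the mean is about 34 shots"
missing_on_target = 40  # approximate shortfall in shots called on target
home_games = 150        # approximate sample size quoted above

# If 1% is ~34 shots, the sample holds roughly this many shots in total
implied_total_shots = shots_per_point * 100

# Size of the shortfall in percentage points
deviation_points = missing_on_target / shots_per_point

# How often a shot would have to be mis-scored to produce the shortfall
games_per_missing_shot = home_games / missing_on_target

print(f"~{implied_total_shots} shots in the sample")
print(f"shortfall of ~{deviation_points:.1f} percentage points")
print(f"about one shot every {games_per_missing_shot:.2f} games")
```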

Number 3:

[Image: Screen Shot 2013-05-19 at 11.59.43 PM]

This is Wigan’s sh% and sv% with penalties stripped out. There’s maybe an effect here, and it’s something to look at later, but I have a feeling that the differences will prove to be too small to be statistically significant.
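One way that significance check might eventually be done is a two-proportion z-test on home versus away sh% (or sv%). The sketch below is not something I’ve run on the Wigan data: the function name is hypothetical, the counts in the example are invented, and the test assumes each shot on target is an independent trial, which is a simplification.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled proportion under the null hypothesis of no difference
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Invented example: 200 goals from 1000 shots on target at home,
# 180 from 1000 away. NOT real data.
z, p = two_proportion_z_test(200, 1000, 180, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With the invented counts above, a two-point gap in sh% over a thousand shots on target each way doesn’t come close to conventional significance, which is the kind of result I’m expecting.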

In short, there probably isn’t scorer bias at the DW, and if there is, it’s nowhere near large enough to account for the discrepancy between Wigan’s TSR and their points total. Hopefully the next round of digging proves more fruitful.

