Distribution of goals in the Premiership

This is the first of three posts over the next week or so. I’ll make them as informative as possible, but I doubt there’ll be any groundbreaking conclusions. I still think they’ll be useful as illustrations of some fundamental principles, and as a resource to refer back to in the future.

What I’m really waiting for is to finish processing the data I have. I’m about halfway through and there are some neat patterns emerging, but I want the whole data set before posting anything. That should take a couple of weeks (depending on how much travelling I do), and once it’s done I’ll be able to demonstrate some pretty interesting stuff.

First, a table of stats. I don’t think it will mean much now, but it could be useful later.

Goals per game   Mean   Standard deviation   95% max   95% min
Total            2.59   1.66                 5.90      -0.72
Home             1.51   1.29                 4.09      -1.08
Away             1.09   1.09                 3.27      -1.10

The 95% min and max are the limits between which, statistically, 95% of results should fall. Obviously a team can’t score a negative number of goals; the negative limits appear because the calculation treats the distribution as normal, and it isn’t a normal one because of the number of times there are zero goals. Treat any negative number in these tables as zero.
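As a rough illustration of how figures like these can be produced, here is a minimal Python sketch. It assumes the limits are the mean plus or minus two standard deviations, which is consistent with the numbers in the table but is not stated in the post. The function name `summary` and the goal counts are made up for the example and are not the real data.

```python
from statistics import mean, pstdev

def summary(goals):
    """Return mean, standard deviation and approximate 95% limits."""
    m = mean(goals)
    sd = pstdev(goals)  # population SD; which form the post used is an assumption
    upper = m + 2 * sd  # assumed rule: mean + 2 SD
    lower = max(0.0, m - 2 * sd)  # goals cannot be negative, so clip at zero
    return m, sd, upper, lower

# Hypothetical goal counts for ten games, for illustration only
example_home_goals = [0, 1, 1, 2, 3, 0, 1, 2, 4, 1]

m, sd, upper, lower = summary(example_home_goals)
print(f"mean={m:.2f} sd={sd:.2f} 95% max={upper:.2f} 95% min={lower:.2f}")
```

Clipping the lower limit inside the function applies the "treat negatives as zero" rule automatically.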

Now a couple of graphs to show how goals are distributed within games. First up is total goals:

Distribution of goals per game

Then split between home and away teams:

Distribution of goals per game split for home and away teams
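For anyone wanting to build the same kind of distribution from their own results, a sketch along these lines would do it. The helper `goal_distribution` and both lists of goal counts are hypothetical, chosen only to keep the example small.

```python
from collections import Counter

def goal_distribution(goals):
    """Map each goal count to the proportion of games in which it occurred."""
    counts = Counter(goals)
    n = len(goals)
    # Include every value from 0 up to the maximum so gaps show as 0.0
    return {k: counts[k] / n for k in range(max(goals) + 1)}

# Hypothetical goal counts for ten games, for illustration only
example_home_goals = [0, 1, 1, 2, 3, 0, 1, 2, 4, 1]
example_away_goals = [1, 0, 0, 2, 1, 1, 0, 3, 1, 0]

home_dist = goal_distribution(example_home_goals)
away_dist = goal_distribution(example_away_goals)

for k in sorted(home_dist):
    print(k, home_dist[k], away_dist.get(k, 0.0))
```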

And finally a plot showing how often home and away teams score at least a given number of goals.

Plot showing how often a home/away team scores at least a certain number of goals

The way to read this plot is as follows. Suppose, for example, you want to know how often a home team scores 2 or more goals. Start by finding 2 on the x-axis and trace straight up until you meet the line for the home team. Then trace from that spot directly left to the y-axis to find the frequency (the proportion of games in which the home team scores 2 or more goals). Multiply the frequency (~0.45 in this case) by 100 and you have the probability as a percentage (~45%).
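The same lookup can be done directly from a list of results rather than read off the plot. This is a minimal sketch; `at_least_frequency` is a name invented here, and the goal counts are made up, so the result will not match the ~0.45 read from the real plot.

```python
def at_least_frequency(goals, k):
    """Proportion of games in which the team scored k or more goals."""
    return sum(1 for g in goals if g >= k) / len(goals)

# Hypothetical home goal counts for ten games, for illustration only
example_home_goals = [0, 1, 1, 2, 3, 0, 1, 2, 4, 1]

freq = at_least_frequency(example_home_goals, 2)
print(f"Home team scores 2 or more in {freq * 100:.0f}% of these games")
```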

For me, the last graph is the most illustrative. The area between the lines represents the extra goals scored by the home team compared with the away team.
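The reason the area works this way is that, for a count such as goals, adding up the "at least k" frequencies for k = 1, 2, 3, ... gives the mean. The gap between the two lines therefore adds up to the difference in mean goals, which would be about 1.51 - 1.09 = 0.42 goals per game using the means in the table above. The sketch below checks this on made-up data; the function name `tail_sum` and the goal counts are assumptions for illustration.

```python
from statistics import mean

def tail_sum(goals):
    """Sum of the 'at least k goals' frequencies for k = 1 .. max(goals)."""
    n = len(goals)
    return sum(
        sum(1 for g in goals if g >= k) / n
        for k in range(1, max(goals) + 1)
    )

# Hypothetical goal counts for ten games, for illustration only
example_home_goals = [0, 1, 1, 2, 3, 0, 1, 2, 4, 1]
example_away_goals = [1, 0, 0, 2, 1, 1, 0, 3, 1, 0]

# Area between the two lines, taken at whole numbers of goals
area_between = tail_sum(example_home_goals) - tail_sum(example_away_goals)
# Difference in mean goals per game, for comparison
mean_gap = mean(example_home_goals) - mean(example_away_goals)

print(round(area_between, 6), round(mean_gap, 6))
```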

Next time there will be a remarkably similar looking post focussing on shots on target.

