Datablog

Alternate ways of displaying football results

We are now two weeks into the Premier League season and the positions in the table are steadily starting to form. Towards the end of last season the league table became an increasingly important tool for getting a clearer picture of the title race, yet it often did an inadequate job of telling the story. Looking at the table alone, it was not clear whether any of the teams had struggled through their season or had long runs of losses, as Liverpool did. Even towards the close of the season, it was unclear how many points Tottenham, Liverpool, Arsenal and Chelsea needed to secure a place in the Champions League (leading to long articles such as this).

So here I propose two solutions. The first is to present the season as a line chart, so that you can see how each of the teams has fared throughout the season and compare them with one another (a good tool once the season has finished). The second is a river flow diagram, so that you can see how a team has fared so far but also predict how well they could do in the future. These are simple versions with a moderate amount of interactivity. I have also suggested how they could be further improved and expanded upon.

For those who are interested, I have explained the issues I have with the current method of displaying league results at the end of this post; the main focus here is on two different ways of displaying the results in an online format. For all of the graphs I have used data from the 2011-2012 season, including for the predictions, in order to see how robust they are.

Points gained by matches played:
(Note: Highlighting an area will zoom in to a certain time period)

This plots the points gained against the number of games played. You can add and remove teams by clicking their names in the legend, which lets you see how the teams have performed across different periods of the season and does more to describe the competition in certain areas of the table. You can zoom into certain areas by highlighting them with the mouse. There is also an option to view a similar graph with all teams selected and alternative labelling.
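As a rough sketch of the data behind this chart, the line for each team is simply a running total of its points after each match. The function name and the sample run of results below are made up for illustration; they are not taken from the 2011-2012 data.

```python
from itertools import accumulate

# Points awarded for a win, a draw and a loss.
POINTS = {"W": 3, "D": 1, "L": 0}

def cumulative_points(results):
    """Return the running points total after each match."""
    return list(accumulate(POINTS[r] for r in results))

# Hypothetical run of results, for illustration only.
sample = ["W", "D", "L", "W", "W"]
print(cumulative_points(sample))  # [3, 4, 4, 7, 10]
```

Plotting one such list per team against the match number gives the line chart described above.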

How this could be improved and built upon

There are quite a few ways to expand this. The biggest improvement that I want to work on is displaying key events of the season on the timeline, such as transfers, major losses and injuries. Additionally, clicking on each data point should provide a link to a match report, as well as information such as league position. I haven’t been able to find a complete and effective way to incorporate this yet, but here is an image of what it could look like:

One possible user interface.

Additionally this should include images (I have not done so here due to copyright) of the season highlights, with links to news reports surrounding specific events.

Using the date as the variable on the x axis, as opposed to matches played, would give a more accurate picture of how the league table looked at different stages of the season.

Dealing with uncertainty

The above chart provides a better look at the league at the end of the season. During the season it could do more, not only displaying the range of points that could be obtained, but also giving some projection of the likely outcome. A good method of doing this graphically is a river flow diagram (also known as a fan chart or, infamously, as the river of blood), a method commonly used by the Bank of England to forecast inflation. Here is one such example.

I then used the river diagram to display a forecast of the points each team could attain by the end of the season. The total blue area represents all of the potential positions that the team could hold, while the darker areas represent a more precise but potentially less accurate projection. The diagram below shows what that fan chart would have projected for each team at each stage of the 2011-2012 season; there is a key provided on the “Select a team” tab.

How this could be improved

When initially planning the project I wanted to find a way of comparing two teams’ forecasts to see if it is likely, or even possible, that one would overtake the other. I have yet to find a way to do this effectively. The above chart was created using a stacked area chart and a macro that exported the 660 images, one for each team at each of the different points of the season. Creating another chart that is able to compare any two teams would require around 6,270 images, which is not really viable.
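The image counts can be checked with a little arithmetic. This sketch assumes 20 teams and 33 forecast points per team; the 33 is inferred from 660 images divided by 20 teams rather than stated outright.

```python
from math import comb

TEAMS = 20            # clubs in the league
FORECAST_POINTS = 33  # assumed: 660 images / 20 teams

# One image per team at each forecast point.
single_team_images = TEAMS * FORECAST_POINTS

# One image per unordered pair of teams at each forecast point.
pair_images = comb(TEAMS, 2) * FORECAST_POINTS

print(single_team_images)  # 660
print(pair_images)         # 6270
```

Under those assumptions there are 190 possible pairings, so the comparison chart needs nearly ten times as many images as the single-team version.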

I have put together an example of how such a chart would look using Photoshop, comparing the forecasts for Manchester City and Wolverhampton 24 games into the season, followed by another diagram 25 games into the season. (Note that City is the faded blue.)

While highly unlikely (as indicated by the current trajectory of form), Wolverhampton could feasibly overtake Man City at this point.

The above image shows the forecasts 24 games into the season, at which point it was possible for Wolverhampton to catch up with Manchester City if Wolves won all of their games and City lost all of theirs. This is shown by the river diagrams overlapping. Should the teams maintain their respective forms, Wolves will not overtake City, which is shown on the diagram by the forecasted trend lines not crossing.

The next image shows that 25 games into the season it was no longer possible for Wolverhampton to overtake Manchester City, as the shaded areas no longer overlap on the diagram.

The point at which Wolverhampton could no longer mathematically beat Manchester City

How the data for the river flow is calculated
The trendline for future matches is simply the average number of points per match that the team has gained throughout the season. The darker blue area represents the uncertainty of this calculation, so there is an 80% certainty that the average lies within this area. For those of you who are interested in how this is calculated, there is a full explanation here. In layman’s terms, it answers the following line of questioning:

“On average how many points does the team win from each match?”
“How much could this average vary?”
“What would happen if the team maintained this average?”
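Those three questions can be sketched in a few lines of code. The full calculation is in the linked explanation, which is not reproduced here, so this version assumes a normal approximation for the 80% band around the average; the function name and the sample results are hypothetical.

```python
from statistics import NormalDist, mean, stdev

SEASON_GAMES = 38

def forecast(points_per_match, confidence=0.80):
    """Project final points as (low, central, high) from results so far."""
    played = len(points_per_match)
    current = sum(points_per_match)
    remaining = SEASON_GAMES - played

    # "On average how many points does the team win from each match?"
    avg = mean(points_per_match)

    # "How much could this average vary?" (assumed: normal approximation)
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(points_per_match) / played ** 0.5

    # A per-match average cannot fall below 0 or rise above 3.
    low_avg = max(0.0, avg - margin)
    high_avg = min(3.0, avg + margin)

    # "What would happen if the team maintained this average?"
    return tuple(current + remaining * a for a in (low_avg, avg, high_avg))

# Hypothetical first ten results, for illustration only.
low, central, high = forecast([3, 3, 1, 0, 3, 3, 1, 3, 0, 3])
print(f"{low:.1f} <= {central:.1f} <= {high:.1f}")
```

The central value is the trendline and the low and high values bound the darker band; with fewer matches played the margin widens, which is why the early-season forecasts are so uncertain.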

I should note that this is a very, very simple method of forecasting which could be greatly improved. Firstly, the data sample that has been used is far too small: at best it is an average taken from 33 data points, at worst from 5. It also doesn’t take other factors into account. For instance, if in their first five matches Manchester United faced lower-ranked teams like Wolves, this method would predict a near certain win in their next match, irrespective of who they would play. There are also much better methods of forecasting than projecting a trendline, such as those in the document from the Bank of England that I linked earlier.

Effectiveness of the calculations
While the forecast could be significantly improved, it does effectively show how a continuation of the teams’ current form would affect their standing in the table. For example, Liverpool had a sudden change of form 24 games into the season; as a result, the projected results were different from the actual points scored:

Prediction after 24 matches played for Liverpool

By contrast, Manchester City maintained their form throughout the season, so the projection the model gave based upon the 24 games City had played is very similar to their final points total:

Prediction made at the point of 24 matches played for Manchester City

Inefficiencies with the current format

Pos Team Pld W D L GF GA GD Pts
1 Manchester City 38 28 5 5 93 29 +64 89
2 Manchester United 38 28 5 5 89 33 +56 89
3 Arsenal 38 21 7 10 74 49 +25 70
4 Tottenham Hotspur 38 20 9 9 66 41 +25 69
5 Newcastle United 38 19 8 11 56 51 +5 65
6 Chelsea 38 18 10 10 65 46 +19 64
7 Everton 38 15 11 12 50 40 +10 56
8 Liverpool 38 14 10 14 47 40 +7 52
9 Fulham 38 14 10 14 48 51 −3 52
10 West Bromwich Albion 38 13 8 17 45 52 −7 47
11 Swansea City 38 12 11 15 44 51 −7 47
12 Norwich City 38 12 11 15 52 66 −14 47
13 Sunderland 38 11 12 15 45 46 −1 45
14 Stoke City 38 11 12 15 36 53 −17 45
15 Wigan Athletic 38 11 10 17 42 62 −20 43
16 Aston Villa 38 7 17 14 37 53 −16 38
17 Queens Park Rangers 38 10 7 21 43 66 −23 37
18 Bolton Wanderers 38 10 6 22 46 77 −31 36
19 Blackburn Rovers 38 8 7 23 48 78 −30 31
20 Wolverhampton Wanderers 38 5 10 23 40 82 −42 25

Of the issues I have, there are three in particular:

  1. It is impossible to ascertain what form the teams are in.
  2. It is not immediately obvious how many games each team has left.
  3. It is not immediately obvious when it is impossible for one team to overtake another.

Issue 1 - It is impossible to ascertain what form the teams are in:
Why it’s a problem:
It’s hard to tell where and how the season was lost. For instance, Manchester United looked certain to win the league until the 36th game of their season, when they ended up level on points with their derby rivals. None of this is very well communicated in the table: all that can be ascertained is that it was a close finish.
How the charts provide a better solution:
A team’s form is displayed by the gradient of the curve: the closer it is to 3 points per match, the better the form.
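As a small illustration, the gradient over a recent stretch of the curve is just the average number of points per match in that stretch. The five-match window and the sample results below are assumptions chosen for the example.

```python
def recent_form(points_per_match, window=5):
    """Average points per match over the last `window` games (0 to 3)."""
    recent = points_per_match[-window:]
    if not recent:
        raise ValueError("no matches played yet")
    return sum(recent) / len(recent)

# Hypothetical results: a poor start followed by a strong run.
results = [0, 0, 1, 0, 3, 3, 3, 1, 3, 3]
print(recent_form(results))  # 2.6
```

A value near 3 corresponds to a steep section of the line and a value near 0 to a flat one.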

Issue 2 - It is not immediately obvious how many games each team has left:
Why it’s a problem:
While it is pretty easy to work out the number of games left in the season, it isn’t immediately obvious from the table. For a league with a total of n teams, a team that has played x games has [2(n-1)]-x games left to play. In a 20-team league, for example, a team that has played 24 games has 14 left.
How the charts provide a better solution:
Both charts would not only show that information clearly and immediately, but would also give a greater sense of how far through the season each of the teams is.
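The formula for games left can be written as a short function. This is a sketch that assumes every team plays every other team twice, home and away; the function name is made up for the example.

```python
def games_left(n_teams, games_played):
    """Games remaining in a double round-robin league: 2(n - 1) - x."""
    total = 2 * (n_teams - 1)
    if not 0 <= games_played <= total:
        raise ValueError("games_played must be between 0 and the season length")
    return total - games_played

print(games_left(20, 24))  # 14
```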

Issue 3 - It is not immediately obvious when it is impossible for one team to overtake another:
Why it’s a problem:
The range of points that a team could finish the season with cannot be read off the table immediately. While the minimum is simple (the points already attained), the maximum requires adding the points still available, which in general terms is ([2(n-1)]-x)*3, to the current total.
How the charts provide a better solution:
The feasibility of one team overtaking another in the table can be immediately ascertained by checking for any crossover of the two teams’ forecasted points.
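The same check can be sketched numerically: work out each team's minimum and maximum final points, then see whether the two ranges overlap. The function names and points totals here are hypothetical, and level finishes (which would be settled on goal difference) are not modelled.

```python
def points_range(current_points, games_played, n_teams=20):
    """Return (minimum, maximum) final points for a team."""
    games_left = 2 * (n_teams - 1) - games_played
    return current_points, current_points + 3 * games_left

def can_catch(chaser_range, leader_range):
    """True while the chaser's best case reaches the leader's worst case."""
    return chaser_range[1] >= leader_range[0]

# Hypothetical points totals after 25 games, for illustration only.
leader = points_range(60, 25)   # (60, 99)
chaser = points_range(20, 25)   # (20, 59)
print(can_catch(chaser, leader))  # False
```

When the function returns False the two shaded areas on the river diagram no longer overlap.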
