Often incorrectly written as ELO, Elo ratings actually
take their name from their inventor, Arpad Elo, a Hungarian-born American physics
professor and chess player who devised the rating method as a way of
comparing the skill levels of chess players. Its use has since expanded, and it
has been adapted for several sports, including American football, basketball
and football, and it is its use in football that is the focus for the rest
of this article.
The Basics
The essence of Elo ratings is that each team has a
rating. When comparing two teams, the team with the higher rating is considered
to be stronger. The ratings are constantly changing, and are calculated based
upon the results of matches. The winner of a match between two teams typically
gains a certain number of points in their rating while the losing team loses
the same amount. The number of points in the total pool thus remains the same.
The number of points won or lost in a contest depends on the difference in the ratings
of the teams, so a team will gain more points by beating a higher-rated team
than by beating a lower-rated team.
Raw Elo
suggests that both teams ‘risk’ a certain percentage of their rating in each
contest, with the winner gaining the total pot, i.e. their rating increases by
the losing team’s ante. In the event of a draw, the pot is shared equally.
A Simple Example
A simple example shows how this works when two evenly
matched teams meet, and both have 5% of their rating at risk. Arsenal and
Chelsea both have a rating of 1000 so both teams risk 5%, i.e. 50 points, and
the pot contains 100 points.
There are
three possible outcomes.
1) Arsenal
win, and the result of this is that Chelsea’s rating drops by 50 to 950, and
Arsenal’s rating increases by 50 to 1050.
2) Chelsea
win, and the result of this is that Arsenal’s rating drops by 50 to 950, and
Chelsea’s rating increases by 50 to 1050.
3) The result
is a draw. The pot is divided between the two teams, resulting in the ratings
for both Arsenal and Chelsea remaining unchanged at 1000.
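The rule behind these three outcomes can be written as a short Python sketch. The function name `settle_match` and the result codes "H", "A" and "D" are made up for illustration; the 5% risk and the equal split of the pot on a draw are taken from the description above.

```python
def settle_match(home, away, result, risk=0.05):
    """Return (new_home, new_away) under the simple 'ante' scheme.

    result is "H" (home win), "A" (away win) or "D" (draw).
    """
    home_ante = home * risk          # points the home side puts in the pot
    away_ante = away * risk          # points the away side puts in the pot
    pot = home_ante + away_ante
    if result == "H":
        home_share, away_share = pot, 0.0
    elif result == "A":
        home_share, away_share = 0.0, pot
    elif result == "D":
        home_share = away_share = pot / 2
    else:
        raise ValueError("result must be 'H', 'A' or 'D'")
    # Each side pays its ante and then collects its share of the pot.
    return home - home_ante + home_share, away - away_ante + away_share


# Arsenal (home) v Chelsea (away), both rated 1000.
for result in ("H", "A", "D"):
    arsenal, chelsea = settle_match(1000, 1000, result)
    print(result, round(arsenal), round(chelsea))
```

Because every point that leaves one rating enters the other, the total number of points stays the same whatever the result.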
A Second Example
A second example shows how this works when the home side
is stronger. Manchester City (with a rating of 1200) plays Aston Villa (with a
rating of 1000). Again, both sides risk 5% (60 points and 50 points respectively),
so the pot contains 110 points.
The three
possible results and their effect on the ratings are:
1) Manchester
City win, and the result of this is that Aston Villa’s rating drops by 50 to
950, while Manchester City’s rating increases by 50 to 1250.
2) Aston Villa
win, in which case Manchester City lose their 60 points and their rating drops
to 1140, while Aston Villa gain the 60 to improve their rating to 1060.
3) The result
is a draw. The (60+50) 110 points in the pot are divided by two, resulting in
Manchester City’s rating dropping by 5 points to 1195, and Aston Villa’s rating
improving to 1005.
A Third Example
A third example shows how this works when the away side
is stronger. Wigan Athletic (with a rating of 800) plays Manchester United
(with a rating of 1000). Again, both sides risk 5% (40 points and 50 points
respectively), so the pot contains 90 points.
The three
possible results and their effect on the ratings are:
1) Wigan win.
Their rating increases by 50 to 850, while Manchester United’s rating decreases
by 50 to 950.
2) Manchester
United win, in which case Wigan lose their 40 points and their rating drops to
760, while Manchester United gain the 40 to improve their rating to 1040.
3) The result
is a draw. The (40+50) 90 points in the pot are divided by two, resulting in
Manchester United’s rating dropping by 5 points to 995, and Wigan’s rating
improving to 805.
The table
below summarises these combinations of pre-match ratings, match results, and
updated ratings:
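As a rough sketch, the combinations from the three examples can be tabulated with a few lines of Python. The helper name `outcomes` and the column layout are my own choices for illustration; the ratings and the 5% risk come from the examples above.

```python
RISK = 0.05  # both sides risk 5% of their rating, as in the examples


def outcomes(home, away, risk=RISK):
    """Map each possible result to (new_home, new_away) for one fixture."""
    h_ante, a_ante = home * risk, away * risk
    half_pot = (h_ante + a_ante) / 2
    return {
        "home win": (home + a_ante, away - a_ante),
        "draw": (home - h_ante + half_pot, away - a_ante + half_pot),
        "away win": (home - h_ante, away + h_ante),
    }


fixtures = [
    ("Arsenal", 1000, "Chelsea", 1000),
    ("Manchester City", 1200, "Aston Villa", 1000),
    ("Wigan Athletic", 800, "Manchester United", 1000),
]

print(f"{'Home':<17}{'Away':<19}{'Result':<10}{'Home':>6}{'Away':>6}")
for home_name, home, away_name, away in fixtures:
    for result, (new_home, new_away) in outcomes(home, away).items():
        print(f"{home_name:<17}{away_name:<19}{result:<10}"
              f"{new_home:>6.0f}{new_away:>6.0f}")
```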
Some Issues
All very simple, but for football, it is much too
simple. Anyone with a basic understanding of football can see a number of
problems with the above examples. One obvious problem is that home advantage is
not taken into account. When two evenly rated teams draw, the away side
should be rewarded and the home side penalised.
In the ‘teams evenly rated’ example above, a draw for Chelsea at Arsenal is
clearly a better result for them than it is for Arsenal, and it is illogical
that both teams walk away at full-time with the same rating as when the match started.
In Part Two, I will look at some ways in which these problems can
be remedied.
In Part One we explained the basic premise of Elo ratings, and illustrated how they are applied. Part Two will offer some suggestions on how the principles of Elo can be enhanced to make our ratings more useful. It is important to understand that these are only suggestions. There are no hard and fast rules that dictate what these parameters should be. There is no right and no wrong, only what works and what doesn’t work.
We finished
Part One with an example of two evenly rated teams, risking the same percentage
of their ratings, and identified one major problem which is that an away draw
is better than a home draw, and it is thus illogical for both teams to end the
match with the same rating as they started.
The Punter’s Revenge: Adjusting For Home Field Advantage
One way to handle this is by having the home team risk a
slightly higher percentage of their rating than the away team. Back in the
early 1980s, two authors, Tony Drapkin and Richard Forsyth, wrote a book called “The
Punter’s Revenge: Computers In The World Of Gambling”, which was targeted at
computer-literate punters at a time when the personal computer was just
becoming popular. One of the more memorable chapters was on rating football
teams, and the authors’ suggestion, after running trials, was to use 7% for the
home team, and 5% for the away team. I’ve found no reason to diverge too far
from these numbers.
If we revisit the earlier examples from Part One, using the 7% and 5% numbers, the results become:
When the teams
are identically rated going in, after a drawn match, the away team gains
slightly, the home team loses slightly, something that intuitively seems right.
If you’re not happy with the adjustments that 7% and 5% give you, then there’s
absolutely no reason not to tweak these, but I would caution against exceeding
10% or going below 3%. Changes in rating should be in modest increments, but at
the same time, not so modest that it takes a season for a declining team’s
rating to reflect its form.
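To see the effect of the 7% and 5% split, here is a minimal Python sketch applied to two teams rated 1000. The function name `settle_match` and the result codes are illustrative assumptions; the percentages are the ones suggested above, and a draw still splits the pot equally.

```python
def settle_match(home, away, result, home_risk=0.07, away_risk=0.05):
    """Return (new_home, new_away); result is "H", "D" or "A"."""
    h_ante = home * home_risk
    a_ante = away * away_risk
    pot = h_ante + a_ante
    # Share of the pot collected by the home side for each result.
    home_share = {"H": pot, "D": pot / 2, "A": 0.0}[result]
    return home - h_ante + home_share, away - a_ante + (pot - home_share)


for result in ("H", "D", "A"):
    home, away = settle_match(1000, 1000, result)
    print(f"{result}: home {home:.0f}, away {away:.0f}")
# H: home 1050, away 950
# D: home 990, away 1010
# A: home 930, away 1070
```

The home side puts in 70 points and the away side 50, so a draw returns 60 to each: the home team ends 10 points down and the away team 10 points up.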
Result Adjustment: Incorporating Margin Of Victory
Now to address the next problem – match results. Basic
Elo doesn’t quantify wins. A win is a win, whether it is by one goal or by a
dozen. Most readers will agree that this is an unsatisfactory state of affairs,
and will want to make adjustments. One method is to increase the percentage that each
team risks, but to split the pool between winners and losers in proportions
that depend on the margin of victory.
For example,
Arsenal and Liverpool are both rated at 1000, and Arsenal are at home. The pot
(or pool) contains 120 points, 70 from Arsenal, 50 from Liverpool. If the game
finishes 6-0 to Arsenal, it’s reasonable to give all the points to them. My own
preference is for a four-goal win or more to be sufficient to secure the entire
pot. A three-goal win is pretty good, and earns most of the pool, whereas a two-goal
win earns a little less, and a one-goal win the minimum. The following table is
a suggestion.
Winning is
worth at least 70% of the pot, with the margin of victory becoming less
significant as it grows. Winning 6-0 rather than 5-0 is neither here nor there,
but winning 1-0 rather than drawing 0-0 is much more significant – even though
the difference between both pairs of scores is just one goal. You may want to
consider a 1-2 defeat as a better result than a 0-1 defeat, but again,
decisions such as these come down to personal preference. With all the time in
the world, you might analyse goal times, and conclude that a 2-0 win decided in
the 30th minute is a stronger win than a 2-0 win in which the second goal was
scored on a breakaway in the 93rd minute with the vanquished team pressing hard
for an equalizer. A fair conclusion in my opinion, and an example of how you
can modify Elo to suit your own needs, and add flexibility based on the amount
of time you have available.
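A minimal Python sketch of this idea follows. The 70% floor for a one-goal win and the full pot for a win by four or more are taken from the text; the 80% and 90% steps for two-goal and three-goal wins are placeholders of mine, not the figures from the table, and the names `WINNER_SHARE` and `settle_with_margin` are made up for illustration.

```python
# Share of the pot awarded to the winner, keyed by winning margin.
# 0.70 (one goal) and 1.00 (four or more) follow the text; 0.80 and 0.90
# are assumed placeholder values for the steps in between.
WINNER_SHARE = {1: 0.70, 2: 0.80, 3: 0.90, 4: 1.00}


def settle_with_margin(home, away, home_goals, away_goals,
                       home_risk=0.07, away_risk=0.05):
    """Return (new_home, new_away) with the pot split by margin of victory."""
    h_ante = home * home_risk
    a_ante = away * away_risk
    pot = h_ante + a_ante
    margin = home_goals - away_goals
    if margin == 0:
        home_share = 0.5                      # a draw splits the pot equally
    else:
        winner_share = WINNER_SHARE[min(abs(margin), 4)]
        home_share = winner_share if margin > 0 else 1 - winner_share
    new_home = home - h_ante + pot * home_share
    new_away = away - a_ante + pot * (1 - home_share)
    return new_home, new_away


# Arsenal (home) v Liverpool, both rated 1000: the pot holds 70 + 50 = 120.
for score in ((6, 0), (1, 0), (0, 1)):
    arsenal, liverpool = settle_with_margin(1000, 1000, *score)
    print(score, round(arsenal), round(liverpool))
```

With these placeholder shares, a 6-0 win moves Arsenal to 1050, while a 1-0 win earns 84 of the 120 points and moves them to 1014.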
Maintaining
accurate ratings is time-consuming, and in previous years I would attempt to
maintain ratings for the Premier League, Football League and Conference as well
as the Scottish Leagues. These days, I restrict my tracking to the top
divisions of England, France, Germany, Italy and Spain, in part because there
is a wealth of data readily available to input, and on the output side, there
are many liquid markets available. It is also my opinion that in the lower
leagues, ratings are not so stable. A modest amount of money goes a long way,
as recently seen with Crawley Town and Fleetwood Town, and ratings can soon be
out of date.
In Part Three, I will look
at more ideas for maintaining accurate ratings.
In Part Two, I looked at one way in which Elo ratings could be improved by measuring the strength of a win based on winning margin. However, the low scoring nature of football means that the match result often does not reflect the performance of the teams.
We have all
seen games where one team has dominated, only to lose 0-1 to a goal very much
against the run of play. If you limit your input to this single figure, goals
scored minus goals conceded, you risk entering less than accurate data into
your ratings.
While it is
true that Birmingham City did beat Chelsea 1-0 on November 20th, 2010, is it
fair and reasonable to award 100% or 70% of the points available to them? You
might think it is, and I would say that is your decision to make, but my take
on it is to look behind the result and use some of the other data that is
readily available these days.
When deciding
what data I should include, my rule is that there must be a correlation between the
data and goals. For example, simple logic tells you that there is a
relationship between shots, shots on target, and goals. 10 shots, of which 5
were on target, doesn’t necessarily mean that a team will score 2 goals, but
for each league there are fairly consistent ratios which we can use.
Charles Reep: Incorporating Shots On Goal
Pioneering football statistician Charles Reep began his
research in 1950 (at 3:50pm on 18 March while watching Swindon Town play
Bristol Rovers to be precise) and discovered (among other things) "that
over a number of seasons it appears that it takes 10 shots to get 1 goal (on average)".
This average
will of course vary from season to season, by league and by team, but the
important thing is that there is a correlation between shots, shots on target,
and goals scored. A note here that some of this data has an element of
subjectivity about it, and you will often see major differences in the
statistics for the same game from individual observers.
Again, how
much effort you want to put into this is a personal choice. Researching the
leagues you are interested in will show there are differences, which you can
incorporate if you wish. For example, as of 2011, the EPL is more efficient at
converting shots to goals than Serie A.
I would
however caution against changing these parameters too frequently once you have
determined reasonable values, with my preference being to use an average for
the past three seasons. The soccerbythenumbers.com website often has some
interesting articles on this subject, along the lines of this entry from
January 2011:
Adding Meaning
This data is important because it allows you to enter
more meaningful data into your calculations. Arsenal 2 Chelsea 1 is a start,
but in my view, the data is made more valuable by entering the shots and
shots-on-target figures also, so you now have for example Arsenal 2:5:12
Chelsea 1:8:19 - a set of numbers that might reasonably lead you to conclude
that Chelsea were a little unlucky in that their goals scored were lower than
might have been expected.
You can
include other data too, although I have yet to see any evidence of correlation
between free-kicks or yellow cards and goals. Red cards can obviously be more
significant, but you would want to factor in the amount of time remaining at
the time of the dismissal. A headline of "10 man City see off United"
might sound dramatic and sell newspapers, or draw clicks, but if the dismissal
was in the 90th minute, it's a little misleading to say the least.
In Part Four,
I'll look at corner kicks and whether this additional data should be included
in your Elo based ratings.
Dangerous Corner
I concluded Part Three with a discussion about what data can or should be
included when adjusting a team's Elo ratings. It might seem logical and
reasonable to include corner kicks, but perhaps surprisingly, the evidence
shows that there is essentially no correlation between the number of corner
kicks and goals scored. What little correlation exists is strongest in the
English Premier League and weakest in Serie A and La Liga.
This isn’t to say
that corners do not lead to goals. One of the problems is that the readily
available data is based on match totals, i.e. they do not reveal how many
corners lead to a goal; only that over the course of a match, Team A had 2
goals and 12 corners. Both goals may have come from corners, but at this
'macro' level, there is no evidence that says there should be say one goal for
every eight corners.
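If you want to check a statistic such as corners against goals in your own data, a correlation coefficient is a reasonable first test. The Python sketch below computes Pearson's r by hand; the match totals in it are invented placeholders, not real figures, and the function name `pearson` is just an illustrative choice.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples of at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sample has no defined correlation")
    return cov / sqrt(var_x * var_y)


# Placeholder match totals for one team (corners won, goals scored).
# These numbers are made up purely to show the call; use your own data.
corners = [12, 3, 7, 9, 4, 6, 10, 5]
goals = [1, 2, 0, 1, 1, 3, 0, 2]
print(f"r = {pearson(corners, goals):+.2f}")
```

A value near zero suggests the statistic adds little to a goals-based rating; values nearer +1 or -1 suggest it may be worth including.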
Having decided what data to include, we are now in a
position to expand upon the simple table seen in Part Two, which looked like this:
Putting It All Together
By using more data than simply the round figure of
goals, it is possible to 'more accurately' reflect the result of a game. I
mentioned in Part Three the real-life example of Birmingham City beating
Chelsea 1-0 on November 20th, 2010, and we will use this match as an example of
how additional data can be incorporated.
Birmingham
City had one shot, and they scored one goal. Chelsea had 24 shots, 9 were on
target, yet none resulted in a goal. If we have done our analysis and concluded
that from ten shots, you can expect one goal (on average), or that from three
shots on target, one goal can be expected, you have tripled the amount of data
you are entering, and this helps to smooth out any outlying data points.
It is at this
point that a basic knowledge of spreadsheets will be useful, since the easiest
way to automate these calculations is by creating LOOKUP tables.
Using the
ratios for this example (Shots : Shots-On-Target : Goals), we have a result of
4:1:1 to 20:10:0. Dividing the first parameter by 10 (10 shots approximate to
1 goal), and the second by 3 (3 shots on target approximate to 1 goal), we have
in goal units 0.4:0.33:1 to 2:3.33:0. You can average (mean or median) these
numbers, or apply a weighting to them; taking the simple mean, the match result becomes
Birmingham City 0.58 Chelsea 1.78. The weightings and the choice of average are a
personal preference. While Chelsea did not win the game, their overall
performance based on these numbers suggests they were the better team, and my
ratings would adjust in accord with a more accurate scale, for example:
Again, it is personal preference how granular you make these numbers. Breaking them down into 0.25 increments is one idea, but you can use any number. Once the factors are entered into your spreadsheet, and you have set the LOOKUPs correctly, they do not need to be maintained. Your spreadsheet can calculate your match result, e.g. 1.78 to 0.58 and update the Elo ratings accordingly.
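For readers who prefer code to a spreadsheet, the same calculation can be sketched in Python. The ratios (10 shots per goal, 3 shots on target per goal) and the 4:1:1 and 20:10:0 inputs are the ones used above; the function names, the equal default weights and the rounding helper are my own assumptions.

```python
SHOTS_PER_GOAL = 10            # ratio used in the text
SHOTS_ON_TARGET_PER_GOAL = 3   # ratio used in the text


def modified_score(shots, on_target, goals, weights=(1, 1, 1)):
    """Weighted mean of the three goal-unit estimates for one team."""
    units = (
        shots / SHOTS_PER_GOAL,
        on_target / SHOTS_ON_TARGET_PER_GOAL,
        goals,
    )
    return sum(w * u for w, u in zip(weights, units)) / sum(weights)


def to_step(value, step=0.25):
    """Round to the nearest step, e.g. to match a 0.25-goal lookup table."""
    return round(value / step) * step


birmingham = modified_score(4, 1, 1)
chelsea = modified_score(20, 10, 0)
print(f"Birmingham City {birmingham:.2f} Chelsea {chelsea:.2f}")
# Birmingham City 0.58 Chelsea 1.78
print("margin bucket:", to_step(chelsea - birmingham))
# margin bucket: 1.25
```

Changing `weights` to, say, `(1, 1, 2)` would give the actual goals twice the influence of either shot count, which is one way of expressing the personal preference mentioned above.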
Modified Results
At this point in the process, you might also want to
consider weighting the ‘modified’ result based on the strength of the
opposition. An implied score of 1.5 to 0.5 can reasonably be considered a more
meritorious achievement against Manchester City than against a struggling
team.
Update the Elo
ratings based on Table A above, or your version of it, and you’re done.
Most matches will see a small change in rating for both teams, some one-sided
affairs may see a bigger shift, but the ratings, once established, ‘should’
reflect the strength of one team when compared with another.
Predictions
How do I use my ratings to make a fortune, I hear you
ask? One way is to expand your spreadsheet to incorporate a predictive feature.
For predicting a future match, you would enter in the two teams’ ratings, say
800 and 1000. Create a table with the same margins as there are in Table A, and
this can easily be programmed to calculate the post-match Elo ratings for each
team if the winning margin is 0, 0.25, 0.5 etc. Your spreadsheet can be coded
to display the margin of victory which will keep the ratings as close to their
pre-game position as possible. Note that you will also need to allow for the negative
equivalents to cater for away wins, and the table above would also have values
assigned for -0.25, -0.5 etc.
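The search described here can be sketched in Python. Table A is not reproduced, so `home_share` below is a purely hypothetical linear stand-in for it (each goal of margin shifts 12.5% of the pot), and the 7% and 5% risks are the defaults suggested in Part Two. The output will therefore not match the figures from my spreadsheet; the point is only to show the mechanics.

```python
def home_share(margin):
    """Hypothetical stand-in for Table A: share of the pot paid to the home side."""
    return min(1.0, max(0.0, 0.5 + 0.125 * margin))


def rating_change(home, away, margin, home_risk=0.07, away_risk=0.05):
    """Home side's rating change for a given (modified) home winning margin."""
    h_ante = home * home_risk
    a_ante = away * away_risk
    return (h_ante + a_ante) * home_share(margin) - h_ante


def expected_margin(home, away, step=0.25, limit=4.0):
    """Margin on the lookup grid that leaves the ratings closest to unchanged."""
    steps = int(limit / step)
    # Negative margins cover away wins, as noted above.
    margins = [i * step for i in range(-steps, steps + 1)]
    return min(margins, key=lambda m: abs(rating_change(home, away, m)))


print(expected_margin(800, 1000))   # weaker side at home
print(expected_margin(1000, 800))   # stronger side at home
```

Swapping in your own table only means replacing `home_share`; the search itself stays the same.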
For example,
Wigan Athletic are currently rated at 1257, Manchester United at 1549. If Wigan
plays Manchester United at home, a margin of 0 would result in the ratings
being unchanged (top right number 0.00) as in the picture below:
If we look at
the reverse fixture between these teams using the same ratings, the spreadsheet
shows the following:
The ‘expected’
result in this example is that Manchester United will win by 1.5 goals.
If the
modified result entered is 3.21 to 0.91, (e.g. United win 3:1 and these numbers
are modified for shots and other criteria) the picture below shows how the
ratings would change. Manchester United would gain 9 points, and Wigan Athletic
would lose the same 9 points. United's win by a margin of 2.3 exceeded
expectations, so they are duly rewarded, but winning does not always boost a
team's ratings.
By entering in
all the upcoming fixtures, your spreadsheet will give you a starting point
before you bet. Whether your preference is to focus on the matches expected to
be draws, or to look for value on the Asian Handicaps based on your
computations, is up to you. I tend to focus on the draws, matches where the predicted
result is 0 to 0.5, but that’s just my preference as I consider the draw price
to be somewhat ignored, and thus more likely to offer value.
Caution
I mentioned that this prediction is a starting point.
You should always be aware of the relevance of the match to both teams – early
and late season can be treacherous, and if you use the ratings in domestic cup
games that some teams may take less seriously than others, the spreadsheet
won’t help you.
This also
raises the question of whether you should include Cup matches in your ratings
or not. My preference is to not use domestic cup matches, but I do use
inter-league Champions League games or Europa League games to adjust my
ratings, e.g. AC Milan v Chelsea.
For anyone
interested in my starting point for these ratings, I used the 2008-09 season
standings, and used UEFA’s coefficients to make the English league stronger than
the French league for example. After three years of maturity, the top four
clubs in sequence are Barcelona, Real Madrid, Manchester City and Manchester
United. The weakest is Ligue 1’s Espérance Sportive Troyes Aube Champagne –
a.k.a. Troyes.
Conclusion
This concludes the series on Elo ratings, and once
again, I would like to make it clear that many of the parameters I use are a personal
preference and can be adjusted in any way you wish. The process described above
is quite possibly unique to me, as it is a combination of ideas and thoughts
collected over more years than I care to remember. It is not intended to be a
‘copy and paste’ answer for you; the purpose of these articles has been to show
you how one person’s thought process works, and perhaps prompt you to have some
ideas of your own.
There is
another component to the spreadsheet which is the use of the ‘modified result’
mentioned above as input to a Poisson calculation from which you can estimate
the probability of every result, and thus all the Over / Under, Match Odds and
Correct Score markets, and that will be the subject of a future article.
While I have tried to make this series as clear and as easy to understand as possible, it is not impossible that I have assumed some knowledge or understanding that I should not have, so if anyone has any questions on the above, please comment or send me an email, and I will try to respond.