Wednesday 25 April 2012

Lucky Run Or Significant?

Since I began my Elo-based ratings spreadsheet about three years ago, almost 3,500 matches have been entered from the top leagues in England, France, Germany, Italy and Spain. Each league has its own personality, but the total Home:Away:Draw percentages (rounded) across all leagues currently stand at 47:27:27, pretty close to the historic ratios.

Looking at the EPL over the last 20 years, the percentage of home wins has been over 50% just twice (05-06 and 09-10), with a low of 42% in 93-94, while draws and aways trade places in the 24% to 30% range, excepting the draws in 05-06, which were at a very atypical low of 20%. The first quarter of that 05-06 season was fairly normal, with 24 draws from 95 matches, but the second quarter saw just 15 draws. The second half totalled 38 draws from 190 matches (exactly 20%), but the following three seasons saw normal service restored, with each season seeing a draw percentage of 26%.

Over a long period of time, the 20 years of the EPL for example, fluctuations such as these are going to occur, but unless there has been a fundamental change in the game (the offside rule is abolished or the goals are widened) the draw percentage was always going to bounce back from that 20%.

Anyone laying the draw in all EPL games that season may well have thought of himself as a 'god', as Peter Webb puts it, but only for a few months until the odds caught up with him. A strategy as simple as laying all draws, backing all draws, backing all favourites or whatever, can never work for long. The market would correct itself if there really was a reason for the run of winners, but in the case of the draw, and this is one reason why I like the XX Draws so much, the price is robust. It is rare to see the draw price below 3.15, with the obvious exception of late season Serie A games.

When is a 'short term' run of results significant, and when is it merely the random nature of probability? Fortunately, to stop ourselves from getting too excited after a week of six draws out of twelve or whatever, we are able to turn to statistics when we need to see if results are "statistically significant" - i.e. unlikely to have occurred by chance.

Of the 3,432 matches I have recorded in the spreadsheet, 922 have been draws (26.86%).

341 matches qualified as XX Draws (Classic), with 111 winners (32.5%) at an average price of 3.54 (implied probability 0.2825).

Using a probability of 0.2686, the probability of hitting at least this strike rate over this number of events is 0.0115.

Using the probability of 0.2825, the probability of hitting at least this strike rate is 0.0456.

Looking at the XX Draws (Extended), 1134 matches qualified, with 360 winners (31.75%). I don't have the average price, as I only recently started recording it, but using the 3.54 doesn't seem unreasonable. This makes the probability of hitting at least this strike rate just 0.0053 - or roughly 1 in 200.
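As a rough cross-check of the figures above, here is a minimal Python sketch of the binomial tail calculation. It assumes every qualifying match is an independent bet with the same win probability, which is a simplification since each match has its own price, and the function names are purely illustrative.

```python
from math import exp, lgamma, log

def binom_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p), worked in log space to avoid overflow."""
    log_comb = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_comb + k * log(p) + (n - k) * log(1 - p))

def prob_at_least(k, n, p):
    """P(X >= k): chance of k or more winners from n independent bets."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

draw_rate = 922 / 3432   # overall draw frequency in the spreadsheet
implied = 1 / 3.54       # implied probability of the average price

# XX Draws (Classic): 111 winners from 341 selections
print(prob_at_least(111, 341, draw_rate))
print(prob_at_least(111, 341, implied))

# XX Draws (Extended): 360 winners from 1134 selections,
# assuming the same 3.54 average price applies
print(prob_at_least(360, 1134, implied))
```

One thing to watch when doing this in a spreadsheet: a cumulative binomial function returns the probability of k or fewer winners, so "at least 111" is one minus the cumulative value at 110, not at 111. Using 111 gives the probability of 112 or more, which is a slightly smaller number.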

The 0.05 (5%) level for significance is the one usually used to state that the results are "unlikely to have occurred by chance" and this does appear to be the case. These numbers look almost too good to be true, which likely means I've missed something, but if one of the statistics experts reading this would like to run the numbers and either confirm them or correct me, I would appreciate it.

The next thing I want to take a look at is whether the strike rate is improving as the Elo-based ratings mature, and since I modified them a year ago to use more data than simply the final score.

Finally, after an upbeat and positive post, a disappointing night in the NBA where I felt the Jazz price was too short for much of the game. It appears I was wrong - it happens occasionally :)  It's funny how the games I look forward to the most always seem to be the games I trade the worst. The play-off curse has come early, as this game was, to all intents and purposes, a play-off.




3 comments:

George said...

I get very very similar numbers using your inputs. Am probably using a slightly different method to arrive there. Here we go:

=1-BINOM.DIST(111,341,1/3.54,1)
I get 0.0354 vs your 0.0456

=1-BINOM.DIST(360,1134,1/3.54,1)
I get 0.0044 vs your 0.0053

BigAl said...

I think there are several holes in the analysis.

The 0.2686 is irrelevant but, to be fair, I think you realised that.

Your "significance test" is too simple.

Might be wrong, I seem to remember your 3.54 number was just a guess as you didn't have all the prices??

Each match is different. Therefore I'm confused by your average price method. If I have 4 matches which, on average, have a 50% probability of draws - is the average correct price necessarily evens? Do the maths with, say, 70%, 30%, 60% and 40%. Average implied price equals......???

BigAl said...

By that, I mean what is the average of the 4 "correct" prices.

I think this is relevant but I haven't thought too hard about quite what you mean so could be wrong.