Wednesday 24 December 2014

Garbage Goal Times

A few days ago, I mentioned some comments by my best friend Shapeshifter (according to one strange individual, if you mention someone's name in a blog post, you are immediately their best friend and scheming to take over the world of betting together). The rather more boring truth is that I don't know Shapie, or any other Betfair Forum blogger for that matter, and don't need to. Betting is a solitary activity, as ultimately we are all competing against everyone else, but occasionally someone writes something mildly interesting.

Shapeshifter mentioned that at least one person is pursuing a strategy based on the times of goals. As anyone who has considered it for more than five minutes will be aware, this idea won't work in a sport like football.

Here's what Shapie wrote:
In a nutshell, he looks at time of the goals and, by half time of a football match, has percentages of outcomes.
An interesting hypothesis but upon reading it, it falls into a couple of theories I have about data:
- As mentioned, context needs to be attached to the data
- Regardless of the context, sometimes too much information can actually make you ‘blind’ when compared to using ‘some’ of the information.
- (in relation to betting) A mistake that can be done is taking data and “molding” it to look like it offers an edge when, in fact, it is either one of two extremes: too general OR overthought.
One of the problems with football betting is that the timing of incidents is imprecise. I don't mean that sundials are being used, rather that the clock counts up from zero to 90 and doesn't stop for breaks in play.

Imagine a game where a serious injury occurs in the first few seconds, and the game is delayed for fifteen minutes while the player is attended to and removed from the field. A goal is scored immediately from the restart and goes down in the record books as a 16th-minute goal, and the first half lasts for 60 elapsed minutes. See the problem? Clearly this is an unlikely situation, but it makes the point, and long stoppages do occur. As a recent example, the Liverpool v Arsenal game last weekend saw nine minutes of time added on.

What might be more useful would be if goal times were recorded against the time remaining in the game or the half, as they are in ice hockey or basketball, but in football the time remaining isn't known until after the game is over. The clock doesn't count down in football, nor does it stop until the end. Thus data comparing goals scored in the 16th minute is useless because, as illustrated in my imaginary game above, 16th-minute goals are not all the same.
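To put rough numbers on the point, here is a minimal Python sketch. Everything in it is an illustrative assumption rather than real match data: the function names, the goal being timed at 15:20 on the running clock, and the two amounts of added time (fifteen minutes for the imaginary injury game, one minute for an ordinary game).

```python
HALF_LENGTH_SECONDS = 45 * 60  # nominal length of a half


def recorded_minute(clock_seconds: int) -> int:
    """Minute as conventionally recorded: 0:00 to 0:59 is the 1st minute."""
    return clock_seconds // 60 + 1


def seconds_left_in_half(clock_seconds: int, added_seconds: int) -> int:
    """Time left in the half when the goal went in.

    This needs the added time, which is only known after the half ends.
    """
    return HALF_LENGTH_SECONDS + added_seconds - clock_seconds


# Hypothetical goals: (running clock at the goal, added time eventually played)
goals = {
    "delayed game": (15 * 60 + 20, 15 * 60),  # long injury stoppage
    "ordinary game": (15 * 60 + 20, 1 * 60),  # one minute added
}

for label, (clock, added) in goals.items():
    minutes_left = seconds_left_in_half(clock, added) / 60
    print(f"{label}: minute {recorded_minute(clock)}, "
          f"{minutes_left:.1f} minutes left in the half")
```

Both goals are logged as 16th-minute goals, yet under these assumptions one leaves roughly 45 minutes of the half still to play and the other roughly 31, and that difference cannot be seen from the recorded minute alone.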

The idea that you can have meaningful percentages for outcomes based on goal times from the first half is clearly nonsensical. You might have percentages, but they aren't meaningful. Garbage data can't generate an edge, however much you may wish that to be the case.  

And on that seasonal note of friendly betting cooperation, friendly advice and friendly constructive commenting (not actually that generous, as I don't bet football in-play), I bid you all a Merry Xmas.


Betslayer said...

I think that is a very naive view. So some goal times aren't accurate? So sample size isn't key to smoothing out the data, then? I don't think the odd delay would make that big a difference when you are working on large margins. From analysing in-play using minute-by-minute data, I can tell you it largely fits very, very accurately to bookmakers' in-play pricing. Funny that you think Elo ratings are accurate enough for your betting and systems, but a theory with 15% ROI over hundreds of bets is deemed to be flawed.

Betslayer said...

I also wonder where you get this mythical data to make your Elo ratings. So you clean up goals so that no incorrect ones are included? What about incorrect offsides? Goals off people's arses? No, thought not. All data and systems are flawed in nature; that doesn't stop people having an edge! The market makers have the same limitations in timings of goals and indeed data. I know of no database for football without errors, and trust me, it's an area I have a large amount of experience in.