Saturday 4 April 2009

Baseball By The Numbers


Still on the subject of statistics, but this time I’m switching to baseball, a sport that (along with cricket) is ideal for statistics. The new baseball season in the USA starts this weekend, and once again it’s the Wall Street Journal, not known for its sporting section, whetting our appetites with an interesting article on win expectancy. A formula based on how many runs a team scores and concedes gives the team’s expected winning percentage, and it has been very accurate over the years.
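For anyone who wants to play with the numbers, the formula is usually known as the Pythagorean expectation, and in its simplest form it squares the runs scored and the runs conceded. Here is a rough sketch of it. The function name and the sample team are my own illustration rather than anything from the article, and refined versions use an exponent nearer 1.83 than 2.

```python
def expected_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean win expectancy: RS^x / (RS^x + RA^x).

    exponent=2.0 is the simplest form; it is a parameter here
    because refined versions of the formula use other values.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Hypothetical team (made-up figures): 800 runs scored, 700 conceded.
GAMES_IN_SEASON = 162
pct = expected_win_pct(800, 700)
expected_wins = pct * GAMES_IN_SEASON
print(f"Expected winning percentage: {pct:.3f}")
print(f"Expected wins over {GAMES_IN_SEASON} games: {expected_wins:.1f}")
```

Comparing the expected wins with a team's actual wins gives the over- or under-performance that the article is talking about.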

The article focuses on how one team (the verbosely named Los Angeles Angels of Anaheim) have exceeded their win expectancy by 24 wins in the past five seasons, and explores reasons why this should be*, but my interest is more in the fact that seldom is a team’s actual number of wins more than three away from the expectation.

With the top teams losing one-third of their games each year, and the worst teams winning at least one-third of theirs, baseball is a totally different beast to tame from football. Perhaps an adapted formula for football could be more useful in predicting the total number of goals in a game rather than the result?

Anyway, I do love my numbers and formulae, so yet another idea to ponder during the upcoming long summer evenings.

* The reasons suggested are that the Angels have had one of the best bullpens in baseball, play good defence, hit well with runners in scoring position, or maybe have the baseball Gods on their side. However, this trend will most likely end this year as the Angels have lost a couple of key players, and have some top starters injured as opening day approaches.

3 comments:

BingBang said...

Baseball is, as you said, a statistical paradise. One thing I learnt last baseball season is that the result is usually determined by the starting pitchers. The starting pitcher for each team changes on a daily basis; the teams have a rota and announce their intended starting pitchers for the next few days.

If you look on the MLB.com site, the one stat that I thought was the most revealing was the ERA (Earned Run Average) for each starting pitcher, and the market I used to dabble in was the under/over total runs market.

Good luck.

Cassini said...

The pre-game odds are always based on the starting pitchers. The same two teams play each other on two, three or four consecutive days, and the odds from one day to the next are usually completely different.

Anonymous said...

The formula is not really a secret. It was developed by Bill James and given public credence in the great book "Moneyball" (one of the best sports books ever).

The stat is even available on the MLB site on the standings page.

The "Pythagoras theorem" can also be used in the NFL to great profitability.

And WHIP is a better stat to use for baseball pitchers than ERA.
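For reference, the two pitching stats mentioned in these comments are simple to work out: ERA is earned runs allowed per nine innings, and WHIP is walks plus hits per inning pitched. A minimal sketch follows. The function names and the pitcher's season line are made up for illustration, and innings are passed as plain decimals rather than the ".1/.2" box-score notation.

```python
def era(earned_runs, innings_pitched):
    """Earned Run Average: earned runs allowed per nine innings."""
    return 9 * earned_runs / innings_pitched


def whip(walks, hits, innings_pitched):
    """Walks plus Hits per Inning Pitched."""
    return (walks + hits) / innings_pitched


# Hypothetical starter (made-up figures): 200 innings, 70 earned runs,
# 180 hits and 50 walks.
print(f"ERA:  {era(70, 200):.2f}")
print(f"WHIP: {whip(50, 180, 200):.2f}")
```

Lower is better for both, and because WHIP counts every walk and hit rather than only the runs that eventually score, the two numbers can rank the same pitchers differently.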