Predicting the outcome of sports games has been the focus of a great deal of data science. Without a doubt, oddsmakers' ability to predict outcomes more accurately than bettors helps them sustain their business. Sports offer a ton of in-game and post-game raw data that can be leveraged to make more accurate predictions, and to make money as well. Bettors need to beat oddsmakers, who also use data analytics to predict the outcome of every game. An academic study on the subject was completed by Hans Manner in 2015 with the help of BigDataBall data.
The figure above is taken from Hal Stern’s article and breaks down how successful oddsmaking is in different sports. Three statistical predictors were compared with oddsmakers. The numbers in parentheses represent the number of games analyzed.
The first predictor in the comparison always predicts that the “home team” will win. The second always predicts that the team that has previously won more games will win. The third predicts that the team that has scored more points will win. Least squares is a regression method that finds the best fit for a dataset by minimizing the sum of squared residuals (errors). As we can see, all three predictors were beaten by the oddsmakers' analytics, which take into account the strengths and weaknesses of the matchup, injuries, and venue (home/away performance).
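To make the three baseline predictors concrete, here is a minimal sketch in Python. It is not the method used in the article behind the figure: the game results are made-up toy data, the function names are hypothetical, and it assumes the points-based predictor is a least-squares rating fitted to point margins with a single home-advantage term, which is only one plausible reading.

```python
import numpy as np

# Hypothetical toy results: (home team, away team, home points, away points).
past_games = [
    ("A", "B", 100, 90),
    ("B", "C", 95, 99),
    ("C", "A", 88, 101),
    ("B", "A", 97, 92),
    ("C", "B", 105, 98),
    ("A", "C", 110, 100),
]
teams = sorted({team for game in past_games for team in game[:2]})
index = {team: i for i, team in enumerate(teams)}


def predict_home(home, away):
    """Predictor 1: always pick the home team."""
    return home


def count_wins(games):
    """Number of past wins per team."""
    wins = {team: 0 for team in teams}
    for home, away, home_pts, away_pts in games:
        wins[home if home_pts > away_pts else away] += 1
    return wins


def predict_record(home, away, wins):
    """Predictor 2: pick the team with more past wins.

    Ties go to the home team (an arbitrary assumption of this sketch).
    """
    return home if wins[home] >= wins[away] else away


def fit_least_squares(games):
    """Fit margin ~ rating[home] - rating[away] + home_advantage."""
    X = np.zeros((len(games), len(teams) + 1))
    y = np.zeros(len(games))
    for row, (home, away, home_pts, away_pts) in enumerate(games):
        X[row, index[home]] = 1.0   # home team rating enters positively
        X[row, index[away]] = -1.0  # away team rating enters negatively
        X[row, -1] = 1.0            # shared home-advantage term
        y[row] = home_pts - away_pts
    # lstsq minimizes the sum of squared residuals; ratings are only
    # determined up to a constant, so it returns the minimum-norm solution.
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[:-1], coef[-1]


def predict_least_squares(home, away, ratings, home_adv):
    """Predictor 3: pick the team with the higher predicted margin."""
    margin = ratings[index[home]] - ratings[index[away]] + home_adv
    return home if margin > 0 else away


wins = count_wins(past_games)
ratings, home_adv = fit_least_squares(past_games)

# Compare the three predictors on a hypothetical upcoming game: B hosts A.
print(predict_home("B", "A"))
print(predict_record("B", "A", wins))
print(predict_least_squares("B", "A", ratings, home_adv))
```

On this toy data the three rules can disagree about the same game, which is the point of comparing them: each uses a different slice of the available information, and none of them sees matchup details such as injuries.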
Baseball looks like the most random (unpredictable) sport. Another conclusion: in each sport, predictions for the college league are considerably better than for the professional league, where there are fewer games (less data), less variation in player quality (salary caps), and balanced schedules (equal numbers of home and away games).
Time will tell whether future research differs from the figures above as the number of teams, the rules, and the tempo of play change.