This course shows learners how to generate forecasts of game results in professional sports using Python. The main emphasis is on logistic regression as a method for modeling game results, using data on team expenditures. Learners are taken through the process of modeling past results and then using the model to forecast the outcome of games not yet played. The course also shows how to evaluate the reliability of a model using data on betting odds. The analysis is applied first to the English Premier League, then to the NBA and NHL. The course also provides an overview of the relationship between data analytics and gambling, its history, and the social issues that arise in relation to sports betting, including the personal risks.
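As a rough illustration of the central idea, the sketch below fits a logistic regression of a win/no-win outcome on a single spending variable. It is a minimal pure-Python sketch, not the course's own code: the data values are invented toy numbers, and the variable `log_spend_ratio` (log of home spending over away spending) and the helper `fit_logistic` are assumptions made for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, steps=5000):
    """Fit P(win) = sigmoid(a + b*x) by gradient ascent on the log-likelihood."""
    a, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(a + b * x)  # observed outcome minus fitted probability
            grad_a += err
            grad_b += err * x
        a += lr * grad_a / n
        b += lr * grad_b / n
    return a, b

# Hypothetical toy data (not real league data):
# x = log(home spending / away spending), y = 1 if the home team won, else 0.
log_spend_ratio = [-1.2, -0.8, -0.5, -0.1, 0.0, 0.2, 0.4, 0.7, 1.0, 1.5]
home_win = [0, 0, 1, 0, 1, 0, 1, 1, 1, 1]

a, b = fit_logistic(log_spend_ratio, home_win)

p_even = sigmoid(a)            # forecast when both teams spend the same
p_big = sigmoid(a + b * 1.0)   # forecast when the log spending ratio is 1

print(f"intercept={a:.3f}, slope={b:.3f}")
print(f"P(home win | equal spending)  = {p_even:.3f}")
print(f"P(home win | log ratio of 1)  = {p_big:.3f}")
```

Unlike a linear probability model, the fitted values here always lie strictly between 0 and 1, whatever the spending ratio. In practice a library routine would normally replace the hand-written fitting loop.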
Overview
Syllabus
- Week 1
- This module introduces regression models for categorical outcome variables in sports contests (i.e., Win, Draw, Lose). It explains the Linear Probability Model (LPM) in terms of its theoretical foundations, computational applications, and empirical limitations. The module then introduces and demonstrates logistic regression as a better alternative to the LPM for categorical dependent variables.
- Week 2
- This module explores the relationship between probability and betting markets. It explains the concept of odds, and the relationship between betting odds and probabilities. It then develops a measure of the accuracy of betting odds using sports examples, and assesses the meaning of efficiency in betting markets.
- Week 3
- This module shows how to forecast the outcome of EPL soccer games using an ordered logit model and publicly available information. It assesses the accuracy of these forecasts against the betting odds and shows that they are remarkably accurate.
- Week 4
- This module assesses the efficacy of the EPL forecasting model covered in the previous week by replicating it in three North American team sports leagues (i.e., NHL, NBA, MLB). Specifically, it shows how to forecast the outcome of NHL, NBA, and MLB regular-season games using an ordered logit model and publicly available information. It then assesses the accuracy of these forecasts against the betting odds.
- Week 5
- In this module we examine the historical and social consequences of gambling, and the relationship between gambling and statistics. Gambling is explored from the perspective of different ethical and religious systems. Issues of problem gambling are explored and assessed.
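The syllabus items on betting odds (Week 2) and ordered logit forecasts (Weeks 3 and 4) can be sketched together as follows. This is an illustrative sketch under stated assumptions, not the course's method: the decimal odds, the latent index, and the two cutpoints are invented numbers, and the Brier score is used as one common accuracy measure, which may differ from the measure the course develops.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def implied_probabilities(decimal_odds):
    """Convert decimal odds to probabilities, removing the bookmaker margin."""
    raw = [1.0 / o for o in decimal_odds]
    overround = sum(raw)  # exceeds 1 when the bookmaker builds in a margin
    return [p / overround for p in raw], overround

def ordered_logit_probabilities(index, cut_low, cut_high):
    """Return [P(home win), P(draw), P(away win)] from a latent index.

    Outcomes are ordered away win < draw < home win, with cut_low < cut_high.
    """
    p_away = sigmoid(cut_low - index)
    p_draw = sigmoid(cut_high - index) - p_away
    p_home = 1.0 - sigmoid(cut_high - index)
    return [p_home, p_draw, p_away]

def brier_score(probs, outcome):
    """Sum of squared errors between forecast and the realized outcome (0 is perfect)."""
    return sum((p - (1.0 if i == outcome else 0.0)) ** 2 for i, p in enumerate(probs))

# Hypothetical decimal odds for one game, ordered home win, draw, away win.
odds = [2.10, 3.40, 3.60]
book_probs, overround = implied_probabilities(odds)

# Hypothetical fitted values: in a real model the index would come from
# estimated coefficients times team data, and the cutpoints from the fit.
model_probs = ordered_logit_probabilities(index=0.3, cut_low=-0.6, cut_high=0.5)

# Suppose the home team won (outcome position 0); lower scores are better.
print("bookmaker:", [round(p, 3) for p in book_probs], "overround:", round(overround, 3))
print("model:    ", [round(p, 3) for p in model_probs])
print("Brier (bookmaker):", round(brier_score(book_probs, 0), 3))
print("Brier (model):    ", round(brier_score(model_probs, 0), 3))
```

Averaging the score over many games, rather than judging a single game, is what allows a model's forecasts to be compared meaningfully with the betting odds.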
Taught by
Youngho Park and Stefan Szymanski