Using the NFL Gambling Market as an Alternative Asset Class

Kevin Meers, Sam Waters, and Zack Wortman

Statistics 107

Spring 2013

A particular focus of modern portfolio theory involves diversifying risk through identifying and investing in assets that are uncorrelated with one another. In this pursuit, many investors have explored so-called “alternative asset classes” that are not traded on major stock markets. In this paper, we identify the market for gambling on games in the National Football League (NFL) as one such alternative asset. By modeling various betting outcomes, we develop multiple betting algorithms that provide returns from 7% to 20%, depending on what level of risk the investor would like to assume. These returns are, in theory, totally uncorrelated with market risk, and therefore valuable to any portfolio manager; further, they appear to dominate multiple market indices.

Table of Contents

Introduction

Methodology

Results

Discussion

Tables

Figures

References

Appendix

I. Introduction

Literature Review

A particular focus of modern portfolio theory involves diversifying risk

through identifying and investing in assets that are uncorrelated with one

another. Given the centrality of traditional securities markets in the contemporary

investment world and the increasing interdependence of the global economy, an

asset that is uncorrelated with worldwide economic performance could provide

especial utility in risk diversification. While the performance of individual

securities may depend on various asset-specific or industry-specific risks,

virtually all can be affected by market volatility, and this shared risk factor

problematizes efforts to minimize risk in a given portfolio.

This shared association of stocks and bonds with market volatility

incentivizes a search for alternative asset classes, and a variety enjoy heavy

investment. Collectibles, from baseball cards to Impressionist paintings, are

among the most public examples, and commodities and real estate can also offer

investments that are (ideally) completely uncorrelated with market returns. In this

project, we have chosen to explore the betting market of the National Football

League as an alternative asset.

The idea of sports gambling as an alternative asset class has existed for

at least forty-five years, since Lyn Pankoff’s essay “Market Efficiency and

Football Betting.” Pankoff hypothesized that, in sports betting, the collective

judgments of individual bettors contained inefficient patterns that could be

exploited for consistent individual profit and that betting odds were not

necessarily a good barometer of intrinsic value (Pankoff 1968). In Pankoff’s view,

football betting provided a unique lens for the analysis and understanding of

market efficiency, as it allowed particular ease of access in searching for patterns

and trends, given the clarity and availability of information (Pankoff 1968).

A generation later, Joseph Golec and Maurry Tamarkin published “The

Degree of Inefficiency in the Football Betting Market” in 1991. In their paper,

Golec and Tamarkin identify several reasons why they hypothesize that analysis

of football betting could produce a system that could yield consistent returns.

First, much of the information regarding professional football games is inherently

public, with various statistical measures of player and team performance (as well

as additional factors like playing surface, kickoff time, etc.) easily accessible to all

prospective bettors (Golec and Tamarkin 311). Second, the market is sufficiently

large for robust statistical analysis, with many possible wagers and many bettors

per wager (Golec and Tamarkin 312). Finally, the mix of professional and

amateur bettors in the market suggests both that consistently profitable betting

systems can exist and that the market is likely inefficient (Golec and Tamarkin

312).

Ultimately, the authors conclude that significant and easily identifiable

inefficiencies do exist in the NFL betting market, with bettors tending to

underestimate the impact of home-field advantage and overestimate the impact

of streaks and recent results (Golec and Tamarkin 312). Golec and Tamarkin

note that both of these inefficiencies seemed to be disappearing, however, and

future researchers or prospective systemized bettors would need to identify new,

less obvious inefficiencies in the market.

The notion of sports betting as an alternative asset class has gained

traction since the publication of Joe Peta’s Trading Bases: A Story About Wall

Street, Gambling, and Baseball, in which Peta outlines his creation of a baseball

betting fund that produced a forty percent return in its first year by exploiting

similar statistical inefficiencies to those that we attempted to identify in this

project. Peta emphasizes a conceptual transition from viewing systemized sports

betting as a form of entertainment gambling to a form of legitimized strategic

trading (Peta 2013). Much as former Oakland Athletics General Manager Billy

Beane utilized the capabilities of sabermetrics to more accurately assess value in

baseball player performance, Peta uses those same capabilities to find value in

the betting market. Peta uses rigorous statistical analysis to, in his own words,

“quantify luck” (Peta 2013).

Gambling Terminology

There are two types of wagers one can traditionally make on an NFL

game: the spread and the over-under. The spread is the number of points “given”

to the away team in order to draw equal volumes of bets on both teams. For

example, if the spread is minus three, Vegas “gives” the away team three points,

so the home team has to win the real game by more than three to “cover” and

win the bet. An over-under simply states the predicted total points scored in a

game by both teams combined. If the over-under is seventy, and the teams

combine to score more than seventy, over would be the winning bet; if they score

under seventy, under would be the winning bet. If, in the last two examples, the

home team won by exactly three and the teams combined to score exactly

seventy points, both bets would be a push and all bets would be nullified.

Given this structure, we made four different types of bets when back-

testing the strategy we devised using our model. These bets were “over”,

“under”, “cover”, and “not cover”.
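The settlement rules described above can be sketched in R as follows. This is a minimal illustration rather than the code used in this study: the helper names `settle_spread` and `settle_total` are hypothetical, and the spread is assumed to be quoted from the home team's side, so that -3 means the home team must win by more than three points.

```r
# Hypothetical helper: settle a bet on the home team against the spread.
# A spread of -3 means the home team must win by more than three to cover.
settle_spread <- function(home_pts, away_pts, spread) {
  adjusted <- (home_pts - away_pts) + spread
  if (adjusted > 0) "cover" else if (adjusted < 0) "not cover" else "push"
}

# Hypothetical helper: settle an over-under bet on the combined score.
settle_total <- function(home_pts, away_pts, over_under) {
  total <- home_pts + away_pts
  if (total > over_under) "over" else if (total < over_under) "under" else "push"
}

settle_spread(27, 20, -3)  # home wins by 7: "cover"
settle_spread(23, 20, -3)  # home wins by exactly 3: "push"
settle_total(38, 35, 70)   # 73 combined points: "over"
settle_total(35, 35, 70)   # exactly 70 combined points: "push"
```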

Data

In our attempts to create a model that outperformed Las Vegas betting

lines, we identified several factors that we felt might yield exploitable patterns,

including the day of the week games were played, the playing surface (grass or

turf), and the stadium type (outdoor, dome, or retractable roof).

In addition to these background characteristics, our model depended

heavily on a metric developed by Football Outsiders called Defense-adjusted

Value Over Average (DVOA). DVOA aggregates the degree of success that a

team had on all of its plays from scrimmage compared to the success of the

average team on those plays. In evaluating each play’s success, it controls for

the situation (down, distance, time, etc.) and opponent quality of that play. DVOA

essentially describes the past performance of a given team at any point in the

season, which we can use to predict how that team will perform in future games.

We collected data on 5159 regular season games from the 1991 through

the 2011 NFL seasons from www.Pro-Football-Reference.com and

www.FootballOutsiders.com to use in our model. These data include the Vegas

spread, over-under, the result of the spread, the result of the over-under, the field

type, stadium type, time of the game, week number, and offensive, defensive,

and special teams DVOA for both the home and away teams.

Even armed with these data, we anticipated that it would be extremely

difficult to create a model that could consistently outperform betting lines set by

professionals. We hoped, however, that the high proportion of football bets that

are made by amateurs (or professionals who do not use statistical models) might

create inefficiencies in the market that would be fairly easy to identify and

significant enough to achieve healthy returns.

II. Methodology

The first component of this study uses R to model the probability of

potential outcomes for lines and over-unders of all NFL regular season games

from 1991 to 2011 using logistic regression. The second part conducts a

simulation in R in which we place bets on all of these games based on the

model’s predictions.

We model four different binary outcomes: total points scored over the

number set by Vegas (one if so, zero if not), total points scored under the number

set by Vegas (one if so, zero if not), the home team does cover the spread set by

Vegas (one if so, zero if not), and the home team does not cover the line set by

Vegas (one if so, zero if not). We model each of these four outcomes separately

rather than using two models with binary outcomes of over and under and cover

and not cover because of the possibility of a third outcome in each case, “push”,

in which the actual outcome falls right on the line or over-under set by Vegas so

that the bet is nullified and money is returned.
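To illustrate why four separate indicators are needed, the sketch below codes the outcomes for three made-up games, the last of which pushes on both the spread and the over-under. The scores and lines are invented for illustration; only the indicator names (Cover, Not.Cover, Over, Under) follow the appendix.

```r
# Hypothetical toy data: three games, the last of which pushes on both lines.
games <- data.frame(
  home_pts   = c(27, 17, 23),
  away_pts   = c(20, 24, 20),
  spread     = c(-3, -3, -3),   # home team favored by three in each game
  over_under = c(44, 44, 43)
)
margin <- games$home_pts - games$away_pts + games$spread
total  <- games$home_pts + games$away_pts

games$Cover     <- as.integer(margin > 0)
games$Not.Cover <- as.integer(margin < 0)
games$Over      <- as.integer(total > games$over_under)
games$Under     <- as.integer(total < games$over_under)

# On a push both indicators in a pair are zero, so Over is not simply 1 - Under.
games[, c("Cover", "Not.Cover", "Over", "Under")]
```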

For each of the four outcomes, we investigate three different model types.

All are logistic regressions. The first uses background characteristics which

include week (1-17), field surface (grass or not grass), dome type (dome or not

dome), roof type (retractable or not retractable), four dummy variables indicating

scheduling outside of the traditional Sunday afternoon time slot (Saturday,

Sunday Night, Monday, and Thursday), the Vegas spread, and the Vegas over-

under. The second model includes these background characteristics along with

Football Outsiders efficiency metrics. These metrics are offensive, defensive, and

special teams DVOA for both the home and away teams. The third model

includes all of the variables from the second model, along with additional

interaction terms for all of the DVOA variables.
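The nesting of the three model types can be sketched as follows. The data here are simulated and carry no real signal, and only a subset of the regressors is included to keep the example short; the column names mirror those in the appendix, while `toy`, `background`, `with_dvoa`, and `with_inter` are names chosen for illustration.

```r
set.seed(107)
n <- 400
# Simulated stand-in for the game data (random values, no real signal).
toy <- data.frame(
  Over        = rbinom(n, 1, 0.5),
  Week        = sample(2:17, n, replace = TRUE),
  Grass       = rbinom(n, 1, 0.6),
  Spread      = round(rnorm(n, -2.5, 6)),
  Over.Under  = round(rnorm(n, 42, 4)),
  Home.O.DVOA = rnorm(n, 0, 0.15),
  Away.D.DVOA = rnorm(n, 0, 0.15)
)

# Each specification extends the previous one.
background <- Over ~ Week + Grass + Spread + Over.Under
with_dvoa  <- update(background, . ~ . + Home.O.DVOA + Away.D.DVOA)
with_inter <- update(with_dvoa,  . ~ . + Home.O.DVOA:Away.D.DVOA)

fits <- lapply(list(background, with_dvoa, with_inter),
               glm, data = toy, family = binomial("logit"))

# Number of estimated coefficients (including the intercept) per model.
sapply(fits, function(f) length(coef(f)))
```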

The betting simulation places $100 bets on all outcomes (over, under,

cover, not cover) for which the model imputes a probability over 50%. We then

calculate profits and expenditures for all bets made on each type of outcome to

find our total returns across all games. We run the same simulation 16 times,

with each simulation progressively cutting out an additional week in the beginning

of the season. (So in the first simulation we make bets starting in Week 2, in the

second we start in Week 3, in the third we start in Week 4, etc.) This enables us

to see how returns and risk vary as we wait longer into each season to start

making our bets. We reset the model for each of these simulations to only

include the weeks being examined. We also examine profits in recent seasons

alone to confirm that the model’s returns hold up and that the market’s

inefficiencies are still exploitable.
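The simulation logic can be sketched compactly as below. This is an illustrative rewrite under stated assumptions, not the appendix code: `flat_bet_profit` is a hypothetical helper, the game data are simulated, and the toy model uses a single regressor. As in the appendix, a pushed bet returns the stake (zero profit) but still counts toward the amount expended.

```r
# Hypothetical helper: profit from flat $100 bets on one outcome type.
# p = model probability, won = 1 if the outcome occurred, push = 1 on a push.
flat_bet_profit <- function(p, won, push, stake = 100) {
  bet <- p > 0.5
  profit <- ifelse(!bet | push == 1, 0, ifelse(won == 1, stake, -stake))
  list(profit = sum(profit), expended = stake * sum(bet))
}

# Four bets are placed: two win, one loses, one pushes.
res <- flat_bet_profit(p    = c(0.62, 0.55, 0.48, 0.71, 0.58),
                       won  = c(1,    0,    1,    0,    1),
                       push = c(0,    0,    0,    1,    0))
res$profit / res$expended  # 100 / 400 = 0.25

# Refit on progressively later starting weeks (simulated data, toy model).
set.seed(1)
n <- 600
toy <- data.frame(Week       = sample(1:17, n, replace = TRUE),
                  Over.Under = round(rnorm(n, 42, 4)),
                  Over       = rbinom(n, 1, 0.5),
                  Push       = 0)
returns <- sapply(2:17, function(start) {
  d   <- subset(toy, Week >= start)
  fit <- glm(Over ~ Over.Under, data = d, family = binomial("logit"))
  r   <- flat_bet_profit(fitted(fit), d$Over, d$Push)
  r$profit / r$expended
})
length(returns)  # one return per starting week, Weeks 2 through 17
```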

III. Results

Since we model four dependent variables (probability of over, under,

cover, or not cover) separately, and we do so for 16 sets of weeks (Week 2-17,

Week 3-17, Week 4-17,…, Week 17), we produced dozens of models over the

course of our analysis. Rather than display and evaluate the results for every

single regression, we will examine in detail only the four full models including

interaction terms for each of the four dependent variables over the entire season

(Weeks 2-17). These models are generally similar to those produced after

subsetting the data by week and are more comprehensive than the models

excluding DVOA terms, so explaining these models should convey the necessary

understanding of all models used.

P(Over)

The results of this regression can be found in Table 1. This model

estimates the probability of an over on the over-under line Vegas sets, given the

background and DVOA variables for the teams involved for all weeks. The only

regressors that were significant at the 5% level were away team offensive DVOA

and home team special teams DVOA in primetime. The coefficients on both are

positive, indicating that the expected probability of an over increases as away

offensive DVOA increases and as home special teams DVOA increases. It

appears that Vegas does not account enough for better away offenses and home

special teams in its expectation for total scoring.

It is important to remember that these results do not mean that these

variables are positively associated with higher-scoring games. They mean that the

variables are positively associated with higher-scoring games in comparison to the

expectations set by Vegas’ over-under. This distinction is key and applies to all of

the over-under and spread models used in this study.

P(Under)

The results of this model are displayed in Table 2. This model estimates

the probability of an under on the over-under line Vegas sets, given the

background and DVOA variables for the teams involved for all weeks. Over-

under and away team offensive DVOA are significant at the 5% level. The

coefficient on over-under is positive, indicating that as the number of total points

that Vegas expects goes up, the probability of an under goes up. In other words,

Vegas is setting its over-unders too high, on average. The coefficient on away

offensive DVOA is negative, so the probability of an under increases as away

offensive productivity decreases. As in the P(Over) model, Vegas underestimates

the impact of the away offense on total points scored.

P(Cover)

The results of this analysis are found in Table 3. This model estimates the

probability of the home team covering the spread that Vegas sets, given the

background and DVOA variables for the teams involved for all weeks. The

dummy variables for Monday and Thursday night games are significant at the 1%

level with positive coefficients. This indicates that the home team is more likely to

cover the spread on Monday and Thursday night games.

The interaction variable between home defensive DVOA and away

offensive DVOA is significant at the 5% level with a positive coefficient, but its

interpretation is complex because a more negative defensive DVOA implies a

stronger defense. Thus the probability of a cover increases when home

defensive DVOA and away offensive DVOA share the same sign; this

relationship implies a mismatch: either the home defense is strong and the away

offense is weak or the home defense is weak and the away offense is strong.

The probability of a cover decreases when there is no mismatch, i.e. when either

both the offense and defense are strong or both are weak.
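The sign logic of this interaction term can be checked with a small worked example. The DVOA values below are made up for illustration; recall that a negative defensive DVOA indicates a strong defense.

```r
# Four hypothetical matchups: every combination of strong/weak home defense
# (negative/positive D DVOA) and weak/strong away offense (negative/positive O DVOA).
scen <- expand.grid(Home.D.DVOA = c(-0.10, 0.10),
                    Away.O.DVOA = c(-0.10, 0.10))
scen$interaction <- scen$Home.D.DVOA * scen$Away.O.DVOA

# With a positive coefficient, the interaction raises P(Cover) only when the
# product is positive, i.e. when the two DVOA values share a sign (a mismatch).
scen$mismatch <- scen$interaction > 0
scen
```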

P(Not Cover)

The results of this model are available in Table 4. This model estimates

the probability of the home team failing to cover the spread that Vegas sets,

given the background and DVOA variables for the teams involved for all weeks.

Over-under and away team offensive DVOA are significant at the 5% level. The

dummy variables for Monday and Thursday night games are significant at the 1%

level with negative coefficients. This indicates that the home team’s probability of

not covering the spread decreases on Monday and Thursday night games. (This

wording is confusing but necessary in order to interpret the model explicitly and

account for pushes.) This is basically the same as saying that the home team

covers more often on Monday and Thursday nights, which corroborates the

results from the P(Cover) model in Table 3.

In order to predict outcomes when betting only on a certain subset of

weeks, we repeated the above modeling techniques while excluding data from

the weeks for which we were not betting. These models are too numerous to

display and analyze in full, but the general framework for analyzing these models

is the same as seen above.

After modeling all four outcomes over all of the necessary timeframes, we

calculated returns and standard deviation for each of those timeframes. Figures 1

and 2 show graphically how risk and return vary depending on when we start

placing bets; the underlying values are listed in Table 5. Figure 1

shows how expected returns increase as the bettor delays his initial week of

betting longer. The solid line represents mean expected return, while the dotted

lines mark one and 1.96 standard deviations on either side of it, labeled in the

figure as 67% and 95% confidence bands.

Figure 2 plots risk against return with each dot representing a different

week at which we began betting. We can see here that risk and return increase

together in approximately linear fashion as we move deeper into the season. A

bettor who starts betting in week two can expect a mean return of 7.0% and risk

of 5.1%, while a bettor who starts in week 17 can expect a mean return of 20.3%

and risk of 16.4%, so return and risk can vary greatly depending on how long the

bettor waits to collect information and make his bets. Separate simulations for

games played in 2000 and later, 2009 and later, and only 2011 produced similar

results, indicating that these types of returns persist over time, and that the

football betting market was still exploitable as of 2011.
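The bands in Figure 1 and the risk-return relationship in Figure 2 can be recomputed from the values reported in Table 5. The sketch below is illustrative; the variable names are chosen for this example and are not those of the appendix.

```r
# Mean yearly return and standard deviation by starting week (Table 5).
start_week <- 2:17
ret  <- c(7.00, 7.84, 7.11, 7.74, 8.01, 8.55, 8.41, 8.63,
          9.21, 8.96, 9.42, 12.10, 12.16, 13.15, 15.02, 20.25) / 100
sdev <- c(5.11, 6.93, 6.73, 5.65, 6.44, 7.16, 6.53, 6.14,
          7.07, 9.13, 8.77, 9.62, 10.11, 12.61, 11.10, 16.44) / 100

# Bands at one and 1.96 standard deviations around the mean return.
bands <- data.frame(start_week,
                    low95   = ret - 1.96 * sdev,
                    low1sd  = ret - sdev,
                    mean    = ret,
                    high1sd = ret + sdev,
                    high95  = ret + 1.96 * sdev)
head(bands, 3)

# Strength and slope of the roughly linear risk-return relationship.
cor(ret, sdev)
coef(lm(ret ~ sdev))
```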

While these returns are in theory completely uncorrelated with the overall

market, it is worth examining whether or not they actually are. To do so, we

downloaded data on the S&P 500 since 1991 and correlated its yearly returns

with the returns of each of our betting strategies. These correlations can be found

in Table 6. The highest correlation is only 0.31, and most are much closer to

zero.
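When computing such correlations, the two yearly series need to be matched by year before they are compared. The sketch below shows one way to do this with simulated numbers; the data frames, their column names, and the year ranges are hypothetical and do not reproduce the values in Table 6.

```r
set.seed(42)
# Simulated yearly returns; the market series covers fewer years on purpose.
bets   <- data.frame(Year = 1991:2011, bet_ret = rnorm(21, 0.07, 0.05))
market <- data.frame(Year = 1993:2011, mkt_ret = rnorm(19, 0.08, 0.18))

# merge() keeps only the years present in both series, so each betting
# return is paired with the market return from the same year.
both <- merge(bets, market, by = "Year")
nrow(both)  # 19 overlapping years
cor(both$bet_ret, both$mkt_ret)
```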

IV. Discussion

The viability of sports betting as an alternative asset class has gained traction in

recent years within the body of extant scholarly literature. Its lack of correlation

with the overall market presents a unique opportunity for risk diversification in

portfolio construction, and possible inefficiencies in the betting market indicate the

potential for healthy returns. We used logistic regression to model the outcomes

of over-unders and spreads for NFL games, in an attempt to exploit these

potential inefficiencies. We were able to predict outcomes with enough accuracy

to earn healthy returns when back-testing our model, using Football Outsiders’

per-play efficiency metrics and background variables like field surface and kickoff

time.

Mean returns depended on which week the bettor started implementing

the model, with risk and return increasing as he waited longer to start. This is not

surprising, because as we get deeper into the season, we collect a larger sample

of information about each team. A team’s DVOA is a more accurate reflection of

its true talent level in Week 17 than it is in Week 2. Returns were higher later in

the season because of the model’s improved accuracy, but this came at the cost

of higher standard deviation because funds were necessarily concentrated

among a smaller number of bets.

This trend allows for the creation of 16 different portfolios, each containing

bets starting in a different week. Portfolios with bets starting in later weeks have

higher expected returns and higher expected risk, while portfolios with bets

starting in earlier weeks have lower expected returns and lower expected risk.

The decision on which of these betting portfolios to invest in depends on the

utility function of the individual investor. A more risk-averse investor would start

betting at the beginning of each season, while a less risk-averse investor would

wait to start betting until the latter half of the season. Either type of investor could

incorporate his preferred betting strategy into his overall portfolio, using the

football betting market’s independence from the overall market to diversify risk in

his overall portfolio.
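As an illustration of how the choice of starting week could follow from an investor's risk aversion, the sketch below ranks the 16 portfolios in Table 5 under a mean-variance utility, U = return - (A/2) * variance. This utility form and the risk-aversion values A are assumptions made for the example; the paper does not specify a particular utility function.

```r
# Mean yearly return and standard deviation by starting week (Table 5).
start_week <- 2:17
ret  <- c(7.00, 7.84, 7.11, 7.74, 8.01, 8.55, 8.41, 8.63,
          9.21, 8.96, 9.42, 12.10, 12.16, 13.15, 15.02, 20.25) / 100
sdev <- c(5.11, 6.93, 6.73, 5.65, 6.44, 7.16, 6.53, 6.14,
          7.07, 9.13, 8.77, 9.62, 10.11, 12.61, 11.10, 16.44) / 100

# Hypothetical helper: starting week that maximizes mean-variance utility
# for a given risk-aversion coefficient A (an assumed utility form).
best_start <- function(A) {
  utility <- ret - (A / 2) * sdev^2
  start_week[which.max(utility)]
}

best_start(2)   # low risk aversion: Week 17
best_start(50)  # very high risk aversion: Week 2
```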

One issue that cuts into the returns discussed above is what sports books

refer to as the vigorish, or the vig. This is the fee that the book collects from the

gamblers on each bet. Pinnacle, the top sports book, charges a vig of 5% on the

amount expended for most bets. This dampens the impact of our returns

discussed above, though one would still expect positive returns. Reasons for

optimism remain, however, for the prospects of consistent healthy returns when

using sports betting as an alternative asset class.
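The effect of the vig can be made concrete with simple arithmetic. The sketch below assumes, as stated above, that the vig is a flat 5% of the amount expended on every bet, so that it reduces the return on expenditure by five percentage points; the gross returns are the Week 2 and Week 17 values from Table 5.

```r
vig   <- 0.05                               # fee as a share of the amount expended
gross <- c(week2 = 0.0700, week17 = 0.2025) # gross returns from Table 5

# Net return on expenditure after paying the vig on every bet.
net <- gross - vig
net  # week2: 0.0200, week17: 0.1525
```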

Our model, while effective, is relatively simple compared to what a

professional bettor could construct with more time and information. We used only

background characteristics and one efficiency metric to model outcomes, but a

wealth of data exists that could improve the model. Using scouting grades,

individual player performance metrics, and individual player interaction terms

would improve the model’s accuracy greatly. Incorporating news regarding player

injuries, suspensions, playing time, and trades, would greatly improve the

model’s efficacy as well.

We might also produce a more accurate model when predicting betting

outcomes in the playoffs. This paper demonstrates that returns increase as we

wait longer into the season to start placing bets. Waiting until the playoffs to start

betting allows even more time to pass and for us to collect even more

information. The unique conditions of playoff football might also produce

additional exploitable inefficiencies that would boost returns. It is encouraging

that such a simple model yielded positive returns, when further research could

yield a significantly more accurate model with much greater expected returns.

Further, our method’s risk and return outperform and

dominate the market indices “SPY” (the S&P 500) and “QQQ” (NASDAQ); our

algorithm almost dominates “TIP” (10-year Treasury Bonds), but TIP’s yearly

returns have a standard deviation a half percentage point lower than our Week 2

and forward betting returns, so our strategy does not fully dominate TIP (though it

does provide five percentage points higher return for that additional half

percentage point of risk). These results are summarized in Figure 3.
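The notion of dominance used here can be written as a simple check: one asset dominates another if it offers at least as much return with no more risk, and is strictly better on at least one of the two. In the sketch below, `dominates` is a hypothetical helper, the Week 2 figures come from Table 5, and the TIP figures are rough values implied by the comparison in the text (about five points less return and half a point less risk), not values reported in this paper.

```r
# Hypothetical helper: does asset a mean-variance dominate asset b?
dominates <- function(a, b) {
  a[["ret"]] >= b[["ret"]] && a[["sd"]] <= b[["sd"]] &&
    (a[["ret"]] > b[["ret"]] || a[["sd"]] < b[["sd"]])
}

week2 <- c(ret = 0.0700, sd = 0.0511)  # Table 5
tip   <- c(ret = 0.0200, sd = 0.0461)  # illustrative values implied by the text

dominates(week2, tip)  # FALSE: higher return, but also slightly higher risk
dominates(tip, week2)  # FALSE: lower risk, but also lower return
```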

While there is significant room for improvement in our methods and

models, our results clearly show that the NFL gambling market fulfills the

requisite conditions for a viable alternative asset class. Returns based on NFL

gambling are largely uncorrelated with major market returns. NFL betting returns

are robustly positive, and we have provided limited evidence that our algorithms

may outperform these market indices. We therefore highly recommend that

portfolio managers invest significantly more time and energy into engineering

models that beat Vegas and arbitraging the significant inefficiencies that we have

found in this market.

Table 1

P̂(Over) = [fitted regression equation not preserved]

Table 2

P̂(Under) = [fitted regression equation not preserved]

Table 3

P̂(Cover) = [fitted regression equation not preserved]

Table 4

P̂(Not Cover) = [fitted regression equation not preserved]


Table 5

Bets Starting   Return   Standard Deviation
Week 2           7.00%    5.11%
Week 3           7.84%    6.93%
Week 4           7.11%    6.73%
Week 5           7.74%    5.65%
Week 6           8.01%    6.44%
Week 7           8.55%    7.16%
Week 8           8.41%    6.53%
Week 9           8.63%    6.14%
Week 10          9.21%    7.07%
Week 11          8.96%    9.13%
Week 12          9.42%    8.77%
Week 13         12.10%    9.62%
Week 14         12.16%   10.11%
Week 15         13.15%   12.61%
Week 16         15.02%   11.10%
Week 17         20.25%   16.44%

Table 6

Bets Starting   Correlation with S&P 500 Yearly Returns
Week 2          -0.03
Week 3           0.14
Week 4           0.20
Week 5           0.31
Week 6           0.24
Week 7           0.11
Week 8           0.03
Week 9           0.01
Week 10          0.17
Week 11          0.26
Week 12          0.19
Week 13         -0.01
Week 14         -0.06
Week 15          0.06
Week 16          0.02
Week 17          0.20

Figure 1: Returns By Week

Figure 2

Figure 3: NFL Betting vs. Market Returns

V. References

Works Cited

Golec, Joseph, and Maurry Tamarkin. "The Degree of Inefficiency in the Football Betting Market." Journal of Financial Economics 30.2 (1991): 311-23. Web. 5 May 2013.

Pankoff, Lyn D. "Market Efficiency and Football Betting." The Journal of Business 41.2 (1968): 203. Print.

Peta, Joe. Trading Bases: A Story about Wall Street, Gambling, and Baseball (not Necessarily in That Order). New York: Dutton, 2013. Print.

Data

www.FootballOutsiders.com

www.Pro-Football-Reference.com

Software

R

Microsoft Excel

Appendix I

mydata=read.csv("/Users/kevinmeers/Documents/Harvard/Junior Year/Spring/Stat 107/PROJECT/data.csv")
primetime=mydata$Saturday+mydata$Sunday.Night+mydata$Monday+mydata$Thursday
mydata=cbind(mydata,primetime)
# Keep only the weeks being bet on (Week>1 is the Weeks 2-17 run; raise the cutoff for later runs)
adv.data=subset(mydata,Week>1)

# Model Over/Under, P(Over)
# Simple
OU.O=glm(Over~Week+Grass+Dome+Retractable.Roof+Saturday+Sunday.Night+Monday+Thursday+
  Spread+Over.Under,data=mydata,family=binomial("logit"))
summary(OU.O)

# FO Stats
OU.O2=glm(Over~Week+Grass+Dome+Retractable.Roof+Saturday+Sunday.Night+Monday+Thursday+
  Spread+Over.Under+Home.O.DVOA+Home.D.DVOA+Home.ST.DVOA+
  Away.O.DVOA+Away.D.DVOA+Away.ST.DVOA,data=mydata,family=binomial("logit"))
summary(OU.O2)

# Full P(Over) Model
# Fit on adv.data so that the fitted values line up row by row with the games being bet on
OU.O3=glm(Over~Week+Grass+Dome+Retractable.Roof+Spread+Over.Under+
  Home.O.DVOA+Home.D.DVOA+Home.ST.DVOA+Away.O.DVOA+Away.D.DVOA+Away.ST.DVOA+
  Home.O.DVOA*Away.D.DVOA+Home.D.DVOA*Away.O.DVOA+Home.ST.DVOA*Away.ST.DVOA+
  Home.O.DVOA*Away.D.DVOA*primetime+Home.D.DVOA*Away.O.DVOA*primetime+
  Home.ST.DVOA*Away.ST.DVOA*primetime,data=adv.data,family=binomial("logit"))
summary(OU.O3)
P.Over=fitted(OU.O3)
adv.data=cbind(adv.data,P.Over)
# P.Under, P.Cover, and P.N.Cover are assumed to be added to adv.data in the same way,
# from the corresponding full models of Under, Cover, and Not.Cover (not shown here)

# Standard bets: $100 on everything
# Calculating "Over" returns
n=length(adv.data$Week)
bet.o=rep(NA,n)
profit.o=c()
for(i in 1:n){
  bet.o[i]=sum(adv.data$P.Over[i]>0.5)
  if(bet.o[i]==1 && adv.data$Over[i]==1){
    profit.o[i]=100
  }else if(bet.o[i]==1 && adv.data$Push[i]==1){
    profit.o[i]=0
  }else if(bet.o[i]==0){
    profit.o[i]=0
  }else {
    profit.o[i]=-100
  }
}
profits.o=sum(profit.o)
wins.o=length(profit.o[profit.o==100])
win.pct.o=wins.o/sum(bet.o)

# Calculating "Under" returns
bet.u=rep(NA,n)
profit.u=c()
for(i in 1:n){
  bet.u[i]=sum(adv.data$P.Under[i]>0.5)
  if(bet.u[i]==1 && adv.data$Under[i]==1){
    profit.u[i]=100
  }else if(bet.u[i]==1 && adv.data$Push[i]==1){
    profit.u[i]=0
  }else if(bet.u[i]==0){
    profit.u[i]=0
  }else {
    profit.u[i]=-100
  }
}
profits.u=sum(profit.u)
wins.u=length(profit.u[profit.u==100])
win.pct.u=wins.u/sum(bet.u)

# Calculating "Cover" returns
bet.c=rep(NA,n)
profit.c=c()
for(i in 1:n){
  bet.c[i]=sum(adv.data$P.Cover[i]>0.5)
  if(bet.c[i]==1 && adv.data$Cover[i]==1){
    profit.c[i]=100
  }else if(bet.c[i]==1 && adv.data$Push[i]==1){
    profit.c[i]=0
  }else if(bet.c[i]==0){
    profit.c[i]=0
  }else {
    profit.c[i]=-100
  }
}
profits.c=sum(profit.c)
wins.c=length(profit.c[profit.c==100])
win.pct.c=wins.c/sum(bet.c)

# Calculating "No Cover" returns
bet.nc=rep(NA,n)
profit.nc=c()
for(i in 1:n){
  bet.nc[i]=sum(adv.data$P.N.Cover[i]>0.5)
  if(bet.nc[i]==1 && adv.data$Not.Cover[i]==1){
    profit.nc[i]=100
  }else if(bet.nc[i]==1 && adv.data$Push[i]==1){
    profit.nc[i]=0
  }else if(bet.nc[i]==0){
    profit.nc[i]=0
  }else {
    profit.nc[i]=-100
  }
}
profits.nc=sum(profit.nc)
wins.nc=length(profit.nc[profit.nc==100])
win.pct.nc=wins.nc/sum(bet.nc)

# Summary
profit.t=profit.o+profit.u+profit.c+profit.nc
expended.t=(bet.o+bet.u+bet.c+bet.nc)*100
adv.data=cbind(adv.data,profit.t,expended.t)
yr.profit=c()
yr.expended=c()
yr.return=c()
for(i in 1991:2011){
  yr.profit[i-1990]=sum(adv.data$profit.t[adv.data$Year==i])
  yr.expended[i-1990]=sum(adv.data$expended.t[adv.data$Year==i])
  yr.return[i-1990]=yr.profit[i-1990]/yr.expended[i-1990]
}
yrreturn1=yr.return
week1=mean(yr.return)
week1s=sd(yr.return)
# week2..week16, week2s..week16s, and yrreturn2..yrreturn16 come from re-running
# everything above with later starting weeks (Week>2, Week>3, ...)

# Summary and Graphics
weeklyreturns=c(week1,week2,week3,week4,week5,week6,week7,week8,week9,week10,week11,week12,week13,week14,week15,week16)
weeklysd=c(week1s,week2s,week3s,week4s,week5s,week6s,week7s,week8s,week9s,week10s,week11s,week12s,week13s,week14s,week15s,week16s)
x=c(2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17)
low=weeklyreturns-weeklysd
high=weeklyreturns+weeklysd
lowest=weeklyreturns-1.96*weeklysd
highest=weeklyreturns+1.96*weeklysd
par(xpd=FALSE)
plot(x,weeklyreturns,type="l",xlab="Week",ylab="Return",main="Returns By Week",ylim=c(-0.1,0.6))
lines(x,low,col="blue",lty=2)
lines(x,high,col="blue",lty=2)
lines(x,lowest,col="red",lty=2)
lines(x,highest,col="red",lty=2)
abline(h=0)
legend("topleft",legend=c("Mean Return","67% Confidence Interval","95% Confidence Interval"),col=c("black","blue","red"),lty=c(1,2,2))

# Market Comparison
library(quantmod) # provides getSymbols() and yearlyReturn()
getSymbols("QQQ",from="1991-01-01")
getSymbols("SPY",from="1991-01-01")
getSymbols("TIP",from="1991-01-01")
SP=as.numeric(yearlyReturn(SPY))
NQ=as.numeric(yearlyReturn(QQQ))
TY=as.numeric(yearlyReturn(TIP))
mean(SP)
mean(NQ)
mean(TY)
sd(SP)
sd(NQ)
sd(TY)
means=c(mean(SP),mean(NQ),mean(TY))
sds=c(sd(SP),sd(NQ),sd(TY))
cor(yrreturn1,SP)
cor(yrreturn2,SP)
cor(yrreturn3,SP)
cor(yrreturn4,SP)
cor(yrreturn5,SP)
cor(yrreturn6,SP)
cor(yrreturn7,SP)
cor(yrreturn8,SP)
cor(yrreturn9,SP)
cor(yrreturn10,SP)
cor(yrreturn11,SP)
cor(yrreturn12,SP)
cor(yrreturn13,SP)
cor(yrreturn14,SP)
cor(yrreturn15,SP)
cor(yrreturn16,SP)

# Graphics
library(calibrate) # provides textxy()
fit=lm(weeklyreturns~weeklysd)
plot(weeklysd,weeklyreturns,xlab="Standard Deviation",ylab="Return",main="NFL Betting vs. Market Returns",type="n",xlim=c(0,0.4),ylim=c(0,0.2))
points(sds,means,type="p",col=c("red","blue","darkgreen"),pch=19)
points(weeklysd,weeklyreturns,type="p")
textxy(weeklysd,weeklyreturns,labs=x,cx=.75)
lines(weeklysd,fitted(fit),col="black")
legend("topright",legend=c("Betting Returns by Starting Week","S&P 500 Yearly Returns","NASDAQ Yearly Returns","10-Year Treasury Bonds"),col=c("black","red","blue","darkgreen"),pch=c(1,19,19,19))