final case study powerpoint

Post on 18-Jul-2015

Category:

Data & Analytics


Final Case Study: Predictive Modelling for Equestrian Sports

N RAMACHANDRAN

Average Handle by Stake Indicator

[Chart: Handle by Stake Indicator. Categories: All, AP, CRC, FG. Series: stake indicator Y and N. Value axis: average handle, 0 to 500,000.]

Average Handle by Day of Week

[Chart: Handle vs Day of Week. X axis: Sun to Sat. Y axis: average handle, 0 to 350,000. Series: All, AP, CRC, FG.]

Average Handle by Hour of Day

[Chart: Handle vs Hour of Day. X axis: hour_of_day, 1 to 9. Y axis: average handle, 0 to 400,000. Series: All, AP, CRC, FG.]

Average Handle by Number of Runners

[Chart: Handle vs Number of Runners. X axis: number of runners, 3 to 14. Y axis: average handle, 0 to 800,000. Series: All, AP, CRC, FG.]

Average Handle by Race Number

[Chart: Handle by Race Number. X axis: race number, 1 to 14. Y axis: average handle, 0 to 1,200,000. Series: All, AP, CRC, FG.]

Average Handle by Month

[Chart: Average Handle by Month. X axis: Jan to Dec. Y axis: average handle, 0 to 350,000. Series: All, AP, CRC, FG.]
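The charts above all plot the same quantity: the mean handle per level of one variable, for all tracks together and for each track separately. A minimal sketch of that aggregation follows. The rows, the column order, and the function name `average_handle` are made up for illustration; the original data lives in the Excel files and is not reproduced here.

```python
from collections import defaultdict

# Hypothetical sample rows: (track_id, day_of_week, handle). Values are made up.
races = [
    ("AP", "Sat", 300000.0),
    ("AP", "Wed", 200000.0),
    ("CRC", "Sat", 150000.0),
    ("CRC", "Sat", 250000.0),
    ("FG", "Wed", 100000.0),
]

def average_handle(rows, key_index, track=None):
    """Mean handle per level of the column at key_index.

    If track is given, only rows for that track are used; otherwise all
    tracks are pooled (the "All" series in the charts).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        if track is not None and row[0] != track:
            continue
        totals[row[key_index]] += row[2]
        counts[row[key_index]] += 1
    return {level: totals[level] / counts[level] for level in totals}

by_day_all = average_handle(races, key_index=1)               # "All" series
by_day_crc = average_handle(races, key_index=1, track="CRC")  # one track
```

The same function covers hour of day, number of runners, race number, or month by pointing `key_index` at a different column.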

Variables and their influence on the handle

Variables influencing Handle

Variable            All   AP    CRC   FG
Purse_USA           +ve   +ve   +ve   +ve
Number of runners   +ve   +ve   +ve   +ve
Holiday             +ve   +ve   +ve   -ve
Weekend             +ve   NA    +ve   +ve
Race Type           -ve   +ve   -ve   +ve
Age Restriction     -ve   -ve   NA    +ve
Sex Restriction     -ve   -ve   -ve   -ve
Race Number         +ve   -ve   -ve   +ve
Hour of day         +ve   -ve   +ve   +ve
Track_Condition     -ve   -ve   NA    NA
Wager Type          +ve   +ve   +ve   +ve

Linear Regression

• Linear regression is the modelling technique used to predict handle. Because handle is a continuous variable, linear regression is well suited to understanding and predicting its values.

• The following charts show the predicted values, and the prediction errors, plotted against the original handle values.

• (Details of the variables used in the regressions are in the Excel files.)
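To make the regression step concrete, here is a minimal sketch of ordinary least squares with two predictors, solved through the normal equations in plain Python. Everything in it is an assumption for illustration: the two predictors (purse in thousands of USD and number of runners), the sample numbers (constructed to be exactly linear so the result is easy to check), and the helper names `solve`, `fit_ols` and `predict`. The slides do not say which tool or predictor set was used for the actual models.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with
    partial pivoting. Assumes the system is non-singular."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_ols(features, target):
    """Least-squares coefficients [intercept, b1, b2, ...]."""
    rows = [[1.0] + list(f) for f in features]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(k)]
    return solve(xtx, xty)

def predict(coeffs, feature_row):
    """Predicted handle for one row of predictor values."""
    return coeffs[0] + sum(c * v for c, v in zip(coeffs[1:], feature_row))

# Made-up data: (purse in thousands of USD, number of runners) -> handle.
# Generated from handle = 10000 + 2000 * purse + 15000 * runners.
features = [(20, 6), (50, 8), (30, 10), (100, 12), (40, 7)]
handle = [140000.0, 230000.0, 220000.0, 390000.0, 195000.0]

coeffs = fit_ols(features, handle)
predicted_handle = predict(coeffs, (60, 9))
```

With real data the fit would not be exact, and categorical variables such as Race Type or Wager Type would first need to be encoded as numeric indicator columns.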

Predicted Handle vs Original Handle for all Track IDs

Original Handle vs Error for all Track IDs

Predicted Handle vs Original Handle for track AP

[Chart: predicted_handle (Y axis, 0 to 1,400,000) against original handle (X axis, 0 to 3,400,000).]

Original Handle value vs Error for Track AP

[Chart: difference (Y axis, -600,000 to 2,800,000) against original handle (X axis, 0 to 3,400,000).]

Predicted Handle vs Original Handle for track CRC

[Chart: predicted_handle (Y axis, 0 to 800,000) against original handle (X axis, 0 to 1,300,000).]

Original Handle value vs Error for Track CRC

[Chart: difference (Y axis, -400,000 to 1,200,000) against original handle (X axis, 0 to 1,400,000).]

Predicted Handle vs Original Handle for track FG

[Chart: predicted_handle (Y axis, 0 to 600,000) against original handle (X axis, 0 to 1,800,000).]

Original Handle value vs Error for Track FG

[Chart: difference (Y axis, -400,000 to 1,400,000) against original handle (X axis, 0 to 1,800,000).]

Important Points

• Handle values up to about 700,000 are predicted with good accuracy.

• The model does not predict higher handle values well.

• In the handle vs error charts, most points lie symmetrically about the x axis, so the errors appear random rather than systematic. This suggests no obvious bias in the residuals; it does not by itself rule out collinearity among the predictors, which would need a separate check.

• Adjusted R squared is in the range 0.60 to 0.75 across all of the analyses (All, AP, CRC, FG).
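For reference, adjusted R squared is derived from ordinary R squared as 1 - (1 - R^2)(n - 1)/(n - p - 1), where n is the number of observations and p the number of predictors. A minimal sketch of both quantities and of the error series follows. The numbers are made up, and taking the error as original minus predicted is an assumption, since the slides label the series only as "difference".

```python
def r_squared(actual, predicted):
    """Share of variance in actual explained by predicted."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(actual, predicted, n_predictors):
    """R squared penalised for the number of predictors in the model."""
    n = len(actual)
    r2 = r_squared(actual, predicted)
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)

# Made-up original and predicted handle values.
actual = [100000.0, 200000.0, 300000.0, 400000.0, 500000.0]
predicted = [120000.0, 180000.0, 310000.0, 390000.0, 500000.0]

# Assumed sign convention: error = original - predicted.
residuals = [a - p for a, p in zip(actual, predicted)]

r2 = r_squared(actual, predicted)
adj_r2 = adjusted_r_squared(actual, predicted, n_predictors=2)
```

Plotting `residuals` against `actual` gives the kind of handle vs error chart shown on the slides; points scattered evenly above and below zero correspond to the symmetric pattern described above.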

Ideal Variable Values to Maximize Handle

Ideal Values for the maximization of Handle

Variable            All             AP              CRC             FG
Number of runners   14              14              13              13
Holiday             1               1               1               0
Weekend             1               0               1               1
Race Type           STK             STK             STK             STK
Age Restriction     4U              34              35              3
Sex Restriction     No Restriction  No Restriction  No Restriction  No Restriction
Race Number         3               9               6               2
Hour of day         7               1               2               2
Track_Condition     FT              GD              FT              FT
Wager Type          E               E               E               E
Month               Jan             Aug             Jan             Jan
Day of Week         Wed             Wed             Mon             Thu
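The slides do not state how these ideal values were chosen. One simple way to arrive at a table of this kind, shown here purely as an assumption, is to pick for each variable the level with the highest average handle. The sample pairs and the function name `best_level` are hypothetical.

```python
from statistics import mean

def best_level(pairs):
    """Return the level whose handle values have the highest mean.

    pairs is an iterable of (level, handle) tuples for one variable.
    """
    groups = {}
    for level, handle in pairs:
        groups.setdefault(level, []).append(handle)
    return max(groups, key=lambda level: mean(groups[level]))

# Made-up (race_number, handle) pairs.
race_number_pairs = [
    (1, 100000.0),
    (3, 400000.0),
    (3, 300000.0),
    (9, 320000.0),
    (1, 200000.0),
]

ideal_race_number = best_level(race_number_pairs)
```

A choice made this way treats each variable in isolation; reading the ideal levels off the fitted regression coefficients instead would account for the other variables, and the two approaches can disagree.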
