a poisson betting model with a kelly criterion element for
Post on 21-Oct-2021
A Poisson Betting Model with a Kelly Criterion Element for European Soccer
Kushal Shah, James Hyman, Dominic Samangy
kshah07@syr.edu, jmhyman@syr.edu, dsamangy@syr.edu
1. Introduction

Sports betting has experienced a rapid rise in popularity as the accessibility and commercialization of daily fantasy and live betting have increased. As sports betting is legalized in different countries and states, we are presented with a new opportunity to create statistical models that predict the outcomes of sporting events and can then be used to find inefficiencies in sportsbooks. In this paper, we attempt to create a model for European soccer and measure its performance against betting markets to understand whether the model can be used to generate profits. The paper shows how we predicted the number of goals scored by each team in a game and used these predictions to create a Poisson distribution that determined the probability of each event (Win, Lose, Draw). To add another dimension to our model, we used an optimization technique known as the Kelly Criterion to determine the optimal amount of money to bet on each match. This technique generates bet amounts while also creating a value (the KCO Value) that acts as an estimator of the risk associated with each match. By exploring the characteristics of this value, we were able to maximize the success of the model.

After running the model for the 2018 and 2019 seasons across the five major European soccer leagues, we can safely say that our model was successful not only in predicting outcomes but also in generating significant profit yields for a user. A profit percentage of 119.48% can be achieved using this model, which implies that a user would essentially double their money. We also evaluate how the model performs in different leagues to understand which league characteristics benefit the model. The highest profit percentage was seen in the 2019 Bundesliga season, at 226.68%. The success of the model can not only help users generate significant profits but also expose certain inefficiencies in the market.
2. Generating λ Values for Our Poisson Distributions

The first step in using our Poisson distribution is determining the λ to be used. In a Poisson distribution, the λ value represents the expected rate of occurrence within a given time interval. In this case, the λ value represents the number of goals we expect a team to score in a game. To ensure we have the most accurate value of λ, we used four different models to create different λ values and tested our models with each value of λ.

An important aspect of any betting model is accounting for the context surrounding the game, specifically whether a team is playing at home or away and the strength of its opponent. For each model, we ensured that the λ values we generate are adjusted for the strength of the opponent and for whether the team is home or away. In addition, we ensured that we prevented data leakage within our model. Data leakage occurs when a model makes predictions using data that would not have been available at the time of the prediction. The λ values for each model were calculated as the season progressed; for example, we would use the data available through 12 weeks of the season to predict outcomes for the matches in game week 13. This also means that our λ values are updated and improved as the season progresses. To summarize, for each model we ensured that our λ values factor in 3 major characteristics:
1. The λ value accounts for the opponent
2. The λ value accounts for whether the team is playing home or away
3. The λ value updates as the season progresses, becoming more accurate
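The data-leakage rule described above (only data from earlier game weeks may feed a prediction) can be illustrated with a small Python sketch. The record layout, the function name `goals_per_game_before`, and the toy fixtures are assumptions made for illustration; they are not taken from the paper's code or data.

```python
from collections import defaultdict


def goals_per_game_before(matches, game_week):
    """Average goals scored per game by each team, using only earlier game weeks."""
    goals = defaultdict(int)
    games = defaultdict(int)
    for match in matches:
        # Skip anything that would not be known before `game_week` kicks off.
        if match["week"] >= game_week:
            continue
        goals[match["home"]] += match["home_goals"]
        goals[match["away"]] += match["away_goals"]
        games[match["home"]] += 1
        games[match["away"]] += 1
    return {team: goals[team] / games[team] for team in games}


# Toy fixtures (made up for illustration, not real results).
matches = [
    {"week": 1, "home": "A", "away": "B", "home_goals": 2, "away_goals": 0},
    {"week": 2, "home": "B", "away": "A", "home_goals": 1, "away_goals": 1},
    {"week": 3, "home": "A", "away": "B", "home_goals": 4, "away_goals": 3},
]

# Inputs for a week-3 prediction only see weeks 1 and 2.
averages_for_week_3 = goals_per_game_before(matches, game_week=3)
```

Recomputing the averages before every game week gives the third characteristic in the list: the inputs to λ are refreshed as the season progresses without ever looking ahead.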
2.1. λ Values based on Goals

The first model we created used goals scored and goals allowed to create attacking and defending strengths for each team (Smarkets, 2020). The formulas we use in the model to develop these strengths are:

$$\text{Attacking Strength} = \frac{\text{Goals Per Game For The Team}}{\text{Goals Per Game Across The League}} \tag{1}$$

$$\text{Defending Strength} = \frac{\text{Goals Allowed Per Game For The Team}}{\text{Goals Allowed Per Game Across The League}} \tag{2}$$
These strengths were calculated with home and away splits. This resulted in every team having 4 different strengths:

1. Home Attacking Strength
2. Home Defending Strength
3. Away Attacking Strength
4. Away Defending Strength

We use these strengths to calculate each team's λ value for a game. The formulas below are used:

$$\text{Home Goals } \lambda = \text{Home Attacking Strength} \times \text{Away Defending Strength} \times \text{Average League Home Goals} \tag{3}$$

$$\text{Away Goals } \lambda = \text{Away Attacking Strength} \times \text{Home Defending Strength} \times \text{Average League Away Goals} \tag{4}$$
To provide more clarity, we will use the Crystal Palace (H) vs. West Brom (A) game from the last game week of the 2018 Premier League season as an example. Below is a calculation of both teams' attacking and defending strengths along with our final λ value for each team:

Crystal Palace average goals scored and goals allowed at home, respectively: 1.5 and 1.5
West Brom average goals scored and goals allowed when away, respectively: 0.56 and 1.39
Average league goals scored and goals allowed at home, respectively: 1.52 and 1.15

$$\text{CRY Home Att Strength} = \frac{1.5}{1.52} = 0.99 \qquad \text{WBA Away Att Strength} = \frac{0.56}{1.15} = 0.48$$

$$\text{CRY Home Def Strength} = \frac{1.5}{1.15} = 1.31 \qquad \text{WBA Away Def Strength} = \frac{1.39}{1.52} = 0.91$$

$$\text{CRY } \lambda = 0.99 \times 0.91 \times 1.52$$

$$\text{WBA } \lambda = 0.48 \times 1.31 \times 1.15$$

Crystal Palace λ Value ≈ 1.37

West Brom λ Value ≈ 0.73
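The worked example can be reproduced with a few lines of Python. This is a minimal sketch of equations (1) to (4), not the authors' code; the function and dictionary key names are hypothetical. It relies on one observation from the numbers above: the league's goals allowed at home (1.15) equal the league's goals scored away, and the league's goals scored at home (1.52) equal the league's goals allowed away.

```python
def goal_lambdas(home_team, away_team, league_home_goals, league_away_goals):
    """Return (home lambda, away lambda) from per-game averages, per equations (1)-(4)."""
    # Attacking strength: team scoring rate relative to the league rate for that venue.
    home_attack = home_team["scored"] / league_home_goals
    away_attack = away_team["scored"] / league_away_goals
    # Defending strength: team conceding rate relative to the league rate for that venue.
    # Goals conceded at home across the league are the goals scored away, and vice versa.
    home_defence = home_team["allowed"] / league_away_goals
    away_defence = away_team["allowed"] / league_home_goals
    home_lambda = home_attack * away_defence * league_home_goals
    away_lambda = away_attack * home_defence * league_away_goals
    return home_lambda, away_lambda


crystal_palace_home = {"scored": 1.5, "allowed": 1.5}
west_brom_away = {"scored": 0.56, "allowed": 1.39}

cry_lambda, wba_lambda = goal_lambdas(
    crystal_palace_home, west_brom_away,
    league_home_goals=1.52, league_away_goals=1.15,
)
print(round(cry_lambda, 2), round(wba_lambda, 2))
```

Because the sketch does not round the intermediate strengths, its results can differ in the last digit from a hand calculation that multiplies the two-decimal strengths.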
2.2. λ Values based on Expected Goals (xG)

The second model we used to generate λ values is exactly like the first model, except that we used expected goals (xG) rather than goals. Expected goals is a popular metric in soccer today that represents the number of goals expected to be scored based on the location and manner in which each shot is taken. By substituting xG in place of goals, our formulas now look like this:

$$\text{Attacking Strength} = \frac{\text{xG Per Game For The Team}}{\text{xG Per Game Across The League}} \tag{5}$$

$$\text{Defending Strength} = \frac{\text{xG Allowed Per Game For The Team}}{\text{xG Allowed Per Game Across The League}} \tag{6}$$
The game used as an example earlier (CRY vs. WBA) would have the following calculation for the λ values under this model:

Crystal Palace average xG scored and xG allowed at home, respectively: 1.73 and 1.2
West Brom average xG scored and xG allowed when away, respectively: 0.77 and 1.37
Average league xG scored and xG allowed at home, respectively: 1.39 and 1.08

$$\text{CRY Home Att Strength} = \frac{1.73}{1.39} = 1.24 \qquad \text{WBA Away Att Strength} = \frac{0.77}{1.08} = 0.71$$

$$\text{CRY Home Def Strength} = \frac{1.2}{1.08} = 1.11 \qquad \text{WBA Away Def Strength} = \frac{1.37}{1.39} = 0.98$$

$$\text{CRY } \lambda = 1.24 \times 0.98 \times 1.39$$

$$\text{WBA } \lambda = 0.71 \times 1.11 \times 1.08$$

Crystal Palace λ Value ≈ 1.70

West Brom λ Value ≈ 0.85
2.3. λ Values based on Linear Regression

The third model we used to predict λ was linear regression. We used the cumulative statistics of a team and the team's opponent to develop the model. The linear regression predicts the expected number of home and away goals using the metrics of both teams. These metrics include:

1. Goals
2. Shots
3. Shots on Target
4. Fouls
5. Corners
6. Yellow Cards
7. Red Cards

The λ values generated by this model for the Crystal Palace vs. West Brom game we used as an example earlier would be:

Crystal Palace λ Value ≈ 1.54
West Brom λ Value ≈ 0.42
2.4. λ Values based on a Random Forest Regression

This model predicts λ values in the same manner as the third (Linear Regression) model, except that it uses a Random Forest regression rather than a linear regression. The metrics used by the model to make predictions are the same as those used by the Linear Regression model.

The λ values generated by this model for the Crystal Palace vs. West Brom game we used as an example earlier would be:

Crystal Palace λ Value ≈ 1.70
West Brom λ Value ≈ 0.38
3. Generating Probabilities for Every Event

3.1. Using λ Values to Generate Probabilities for a Home Win, Away Win, and Draw

Now that we have our λ values, we can use the Poisson distribution to calculate our probabilities. Below is the formula the Poisson distribution uses to predict the probability of 𝑥 events:

$$P(x) = \frac{e^{-\lambda} \times \lambda^{x}}{x!} \tag{7}$$
For example, if we wanted to predict the probability of a team scoring 3 goals, 𝑥 would take the value of 3. To calculate each outcome's total probability, we used the above formula to calculate the probability of every possible scoreline from 0-0 to 5-5. Using simple probability rules, the probability of each scoreline can be calculated in the following manner:

$$\text{Prob of Scoreline} = P(\text{Home Goals}) \times P(\text{Away Goals}) \tag{8}$$

So, the probability of a 3-2 scoreline would be the probability of the home team scoring three goals times the probability of the away team scoring two goals. We can show the probability of each scoreline as a “Score Line Matrix”, as seen below:

The rows correspond to the number of goals scored by West Brom (Away) and the columns correspond to the number of goals scored by Crystal Palace (Home). The number within each cell is the probability of that scoreline occurring. As discussed earlier, the likelihood of each scoreline is calculated by multiplying the two teams' probabilities. Based on the matrix, we can see that the model believes the most likely scoreline is 1–0, which has a 0.21238 (0.31056 × 0.68386) chance of occurring.
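The Score Line Matrix can be rebuilt from equations (7) and (8) with a short Python sketch. The function names are hypothetical. The λ values 1.70 and 0.38 are the Random Forest values from Section 2.4; they are used here because they reproduce the 0.31056 and 0.68386 factors quoted above.

```python
import math


def poisson_pmf(x, lam):
    """Equation (7): probability of exactly x goals given an expected rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def score_line_matrix(home_lambda, away_lambda, max_goals=5):
    """Equation (8) for every scoreline; rows are away goals, columns are home goals."""
    return [
        [poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
         for h in range(max_goals + 1)]
        for a in range(max_goals + 1)
    ]


matrix = score_line_matrix(home_lambda=1.70, away_lambda=0.38)

# Locate the most likely scoreline as (away goals, home goals).
cells = [(a, h) for a in range(6) for h in range(6)]
most_likely = max(cells, key=lambda cell: matrix[cell[0]][cell[1]])
away_goals, home_goals = most_likely
print(f"{home_goals}-{away_goals}", round(matrix[away_goals][home_goals], 5))
```

Multiplying the two marginal probabilities treats the home and away goal counts as independent, which is the assumption built into equation (8).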
The total probability of each event is the sum of the probabilities of all the scorelines that correspond to that outcome. Below are the formulas we used for this:

$$\text{Home Win} = \sum \text{Probabilities of Scorelines with Home Team Winning} \tag{9}$$

$$\text{Away Win} = \sum \text{Probabilities of Scorelines with Away Team Winning} \tag{10}$$

$$\text{Draw} = \sum \text{Probabilities of Scorelines ending with a Draw} \tag{11}$$
For each of our models, here are the probabilities generated for the Crystal Palace vs. West Brom game:

| Model | Home Win Probability (Crystal Palace Win) | Away Win Probability (West Brom Win) | Draw Probability |
|---|---|---|---|
| Goals | 0.52 | 0.198 | 0.297 |
| xG | 0.567 | 0.186 | 0.239 |
| Linear Regression | 0.649 | 0.098 | 0.248 |
| Random Forest | 0.696 | 0.077 | 0.22 |

We considered the scorelines from 0-0 to 5-5 to cover enough outcomes to generate an accurate probability for each event in the game. Below is a distribution of the total probability for every game for each model. Ideally, we want the total probability to be equal to 1, as that would mean all outcomes are covered.

If we look at the distributions of the total probability for each model, we can see that an overwhelming majority have a sum over 0.9, with most being equal to at least 0.99. These are also from games beginning in match week 5; hence, as the season progresses, we anticipate the distribution moving even closer to 1. Hence, the Poisson distribution we used with each λ value we generated accurately accounts for each outcome in the game.
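The outcome sums in equations (9) to (11) and the total-probability check can be sketched in Python as follows. The function names are hypothetical. The inputs are the Random Forest λ values from Section 2.4, so the output can be compared with the Random Forest probabilities reported for this game.

```python
import math


def poisson_pmf(x, lam):
    """Equation (7): probability of exactly x goals given an expected rate lam."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def outcome_probabilities(home_lambda, away_lambda, max_goals=5):
    """Sum scoreline probabilities from 0-0 to max_goals-max_goals by outcome."""
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_lambda) * poisson_pmf(a, away_lambda)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


home_win, draw, away_win = outcome_probabilities(1.70, 0.38)
total = home_win + draw + away_win  # falls short of 1 because scores above 5 are cut off
print(round(home_win, 3), round(away_win, 3), round(draw, 3), round(total, 3))
```

The gap between `total` and 1 is the probability mass of scorelines with more than five goals for either side, which is why low-scoring matchups sit closest to 1.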
4. Model Calibration (Brier Score)

We calibrated the model using a Brier Score to determine which model generated the most accurate λ value. The Brier Score is a score function that helps determine the accuracy of any probabilistic model. The Brier Score for a set of predictions is given by the formula below, where $p_i$ is the predicted probability of event $i$, $o_i$ is its observed outcome (1 if the event occurred, 0 otherwise), and $n$ is the number of predictions:

$$BS = \frac{1}{n} \sum_{i=1}^{n} (p_i - o_i)^2 \tag{12}$$

Essentially, we are measuring whether the outcomes we predict with a certain probability occur in that proportion. The best-performing model would have observed frequencies that correspond to the respective probability interval. For example, among all the events for the rest of the season that we predicted to occur with a probability between 0.6 and 0.7, we would want to be correct around 65% (0.65) of the time. Theoretically, calibration results after each game week should improve as the models have more data. We wanted to identify which model is best, and at which game week the marginal improvement becomes minimal, in order to understand the ideal time to begin using the best-performing model. Based on our calibration results, we determined that the best model for estimating λ was the Random Forest model after game week 13. This is the model we use when evaluating our model's success.

The calibration results were calculated after each game week, with the best calibration result shown in the graphs below. The graphs showcase all 4 models and how well they do in predicting the home team and the away team to win. These are graphs of the calibration results after game week 13. The dashed red line is what perfect calibration would look like and provides a reference for comparison. The model closest to the red line is the best model in terms of predictions. For both home wins and away wins, the Random Forest is the best model.
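Equation (12) and the interval check described above can be sketched in a few lines of Python. The function names and the four sample predictions are made up for illustration and are not results from the paper.

```python
def brier_score(probabilities, outcomes):
    """Equation (12): mean squared gap between predicted probability and 0/1 outcome."""
    if not probabilities or len(probabilities) != len(outcomes):
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)


def observed_rate_in_bin(probabilities, outcomes, low, high):
    """Share of events that occurred among predictions with low <= p < high."""
    hits = [o for p, o in zip(probabilities, outcomes) if low <= p < high]
    return sum(hits) / len(hits) if hits else None


# Illustrative predictions (hypothetical): predicted home-win probability and result.
predicted = [0.65, 0.62, 0.68, 0.30]
happened = [1, 1, 0, 0]

score = brier_score(predicted, happened)
rate = observed_rate_in_bin(predicted, happened, 0.6, 0.7)
```

A lower Brier Score is better: 0 means every prediction was certain and correct, while always predicting 0.5 for a binary event scores 0.25. For a well-calibrated model, `rate` for the 0.6 to 0.7 bin should approach 0.65 as the number of predictions grows.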
5. Bankroll Management System (Kelly Criterion)

The Kelly Criterion is a result from probability theory, often used by investors, that we incorporate into this model. The goal of the method is to maximize profit while accounting for the risk associated with a lost bet. This is done by maximizing the expected logarithm of the potential ending bankrolls after the bet is placed. Many online sources choose to simplify the math behind the Kelly Criterion into a generic formula that looks like this:

$$\frac{(O \times P_W) - P_L}{O} = B \tag{13}$$

where $O$ = odds, $P_W$ = probability of the bet winning, $P_L$ = probability of the bet losing, and $B$ = percentage of bankroll to bet.
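As a quick illustration of equation (13), the sketch below evaluates the simplified formula in Python. It assumes that O is the net payout per unit staked (decimal odds minus 1), which is the usual convention for this formula but is not spelled out in the text; the function names and the odds conversion helper are hypothetical.

```python
def american_to_net_odds(american):
    """Convert American odds (e.g. +100, -150) to net payout per unit staked."""
    if american > 0:
        return american / 100.0
    return 100.0 / abs(american)


def kelly_fraction(net_odds, p_win):
    """Equation (13): fraction of the bankroll to stake on a two-outcome bet."""
    p_loss = 1.0 - p_win
    return (net_odds * p_win - p_loss) / net_odds


# A bet at +100 odds with a 60% chance of winning.
fraction = kelly_fraction(american_to_net_odds(+100), p_win=0.60)
print(round(fraction, 4))
```

A negative result means the bet has no edge at those odds and should be skipped. The formula only covers bets with two outcomes, which is the limitation the text turns to next.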
While this formula does involve some principles of the Kelly Criterion, it does not reflect the theory itself. In addition, a soccer match has three outcomes, which complicates the math of this simplified two-outcome formula. Our goal is simply to maximize the expected logarithm of our potential bankroll according to the Kelly Criterion. Below is our optimization for the Crystal Palace vs. West Brom example discussed earlier. The inputs are the odds and probabilities of each possible result as well as the starting bankroll. Using this information, we can calculate our ending bankroll for each possible event and then take the logarithm of that. Lastly, the “objective” (KCO Value) is the probability-weighted average of the logarithms:

$$KCO = (\text{Home Win Prob} \times \log(HW)) + (\text{Away Win Prob} \times \log(AW)) + (\text{Draw Prob} \times \log(DW)) \tag{14}$$

where $HW$ = ending bankroll after a home win, $AW$ = ending bankroll after an away win, and $DW$ = ending bankroll after a draw.
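A minimal Python sketch of equation (14) follows. It is not the authors' R optimization: it uses base-10 logarithms (so that a $100 bankroll with no bet gives exactly 2), decimal odds, and a one-dimensional search over whole-dollar home bets instead of optimizing all three bet amounts at once. The odds and probabilities are hypothetical and chosen to match a +100, 60% example.

```python
import math


def kco_value(bets, decimal_odds, probs, bankroll=100.0):
    """Equation (14): probability-weighted average of log10(ending bankroll)."""
    total_staked = sum(bets.values())
    value = 0.0
    for outcome, p in probs.items():
        # All stakes are paid; only the winning outcome's stake returns stake * odds.
        ending = bankroll - total_staked + bets[outcome] * decimal_odds[outcome]
        value += p * math.log10(ending)
    return value


# Hypothetical inputs: home side at +100 (decimal 2.0) with a 60% win probability.
decimal_odds = {"home": 2.0, "draw": 3.5, "away": 4.0}
probs = {"home": 0.6, "draw": 0.2, "away": 0.2}


def home_only(amount):
    return {"home": float(amount), "draw": 0.0, "away": 0.0}


no_bet = kco_value(home_only(0), decimal_odds, probs)

# Simplified search: whole-dollar bets on the home side only.
best_home_bet = max(range(0, 100), key=lambda b: kco_value(home_only(b), decimal_odds, probs))
best_value = kco_value(home_only(best_home_bet), decimal_odds, probs)
oversized = kco_value(home_only(60), decimal_odds, probs)
```

With these inputs the search lands on the same stake as the two-outcome Kelly formula, and an oversized $60 bet scores below the no-bet value of 2, which is the behavior the objective is designed to penalize. A full implementation would search over the home, draw, and away bets jointly, which is what allows hedging bets to appear.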
Above is an image consisting of the inputs. Using this information and R, we maximize the objective cell by changing the “Bet Amount” cells. The image below is what a typical output would look like. Based on these odds and probabilities, the model suggests that a bet of $22.34 on Crystal Palace to win and a bet of $1.65 on West Brom to win are the optimal bets that maximize long-term profitability. The second bet is a form of hedging that limits the loss to some degree if the main bet fails (The Lines, 2020). The result of the game was 2–0 in favor of Crystal Palace; hence we would have profited $16.22 on this game.
To further understand how the objective cell interacts with different bet amounts, we can use the graph below:

The above graph shows how the objective cell interacts with different bet amounts for a sample game (blue) in which the team being bet on has +100 betting odds and a 60% chance to win the game. It follows a roughly quadratic form that varies in shape depending on the betting odds and probability of winning. The left side of the curve indicates bet amounts that are profitable, but not as profitable as higher bet amounts. The right side of the curve indicates bet amounts that are high-risk and would prohibit long-term success. The orange line indicates the objective number (KCO Value) if no bet is placed. In other words, bets for which the blue line is below the orange line should not be placed. As mentioned before, our model essentially looks to find the maximum of the blue line (Haghani and Dewey, 2020). This method of risk management and profit maximization was shown to be very effective in a study conducted by Victor Haghani and Richard Dewey.

Before moving on to our results, it is important to point out two more aspects of this methodology. The first is how to interpret the “objective” number produced by the optimization. As mentioned earlier, the objective number is the probability-weighted average of the logarithms of all possible ending bankrolls. Since we assumed a starting bankroll of $100, the objective cell when placing no bet is 2, since there is a 100% chance of ending with $100 and log10(100) = 2. After we find the optimal bet, the objective cell becomes a representation of our confidence in profitability, with a KCO value over 2 indicating much more confidence than a value below 2. Lastly, there are some games in which a bet might be placed on a team to win as well as a smaller bet on one of the other outcomes. This occurs as a form of hedging that reduces the potential loss if our main bet does not hit.

6. Model Evaluation

6.1. Interpreting KCO Values as Risk Estimators

Before analyzing our model's results, it is important to understand the KCO value associated with each bet (Game Week 5 onwards). Below is a density plot of the KCO values associated with each bet:
As mentioned earlier, a KCO value of 2 results when no bet is placed. We can see that the peak of the distribution is at a value extremely close to 2. This makes sense, as oddsmakers are extremely effective at setting odds, and hence a majority of the model's suggestions are to not bet anything. The values on the left side of the distribution are bets that are deemed “risky”, while bets on the right side of the distribution carry less risk. We can also see a weak but positive relationship between the success of a bet and the bet's KCO value. We can see this relationship below, with Net Income being the profit or loss from a bet (Game Week 13 onwards):

We can see that as the KCO value increases, the net income of a bet also tends to increase. We can also see that the proportion of bets that make a profit is greater when the KCO value is greater than 2. This weak positive correlation between Net Income and KCO Value shows us that the KCO Value can be used as a risk estimator. It also suggests that if we filter the model's bets based on the KCO Value, we can increase our profit percentage. Theoretically, if we only place bets with a KCO value greater than 2 or 2.1, our profit percentage will improve. We explore this hypothesis further in the next section.
6.2. Success of the Model in terms of Profit

In this section, we observe how successful the model was in terms of overall profit. It is important to note that the Kelly Criterion suggests different amounts to bet for each game based on the risk; hence we decided to use profit percentage as an indicator of the model's success in addition to simple profit. As mentioned earlier, we used a bankroll of $100 for each game so that each bet could be made independently of the success of any other bet. The first aspect we looked at was the relationship between profit percentage and the KCO value. If we look at the 2019 La Liga season, we can see that as we increase our filter for the KCO value in increments of 0.01, the profit percentage also steadily increases. In addition, the number of bets suggested steadily decreases in this case. This is highlighted below:
We then apply this to every league, looking at profit percentage for 3 categories of bets based on their KCO Value: greater than 2, greater than 2.05, and greater than 2.1.
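The filtering step can be sketched in Python as follows. Profit percentage is taken to be profit divided by amount risked, which is consistent with the figures reported in this paper. The bet records below are invented purely to show the mechanics and are not results from the model; the function and field names are hypothetical.

```python
def profit_percentage(bets, threshold):
    """Profit as a percentage of the amount risked, for bets with KCO above threshold."""
    selected = [bet for bet in bets if bet["kco"] > threshold]
    risked = sum(bet["risked"] for bet in selected)
    profit = sum(bet["net_income"] for bet in selected)
    if risked == 0:
        return None, 0
    return 100.0 * profit / risked, len(selected)


# Invented bet records (illustration only).
bets = [
    {"kco": 2.003, "risked": 5.0, "net_income": -5.0},
    {"kco": 2.02, "risked": 12.0, "net_income": 9.0},
    {"kco": 2.06, "risked": 20.0, "net_income": -20.0},
    {"kco": 2.07, "risked": 25.0, "net_income": 30.0},
    {"kco": 2.12, "risked": 40.0, "net_income": 36.0},
]

results = {t: profit_percentage(bets, t) for t in (2.0, 2.05, 2.1)}
for threshold, (pct, count) in results.items():
    print(threshold, count, None if pct is None else round(pct, 2))
```

Raising the threshold trades the number of bets for the quality of each bet, which is the trade-off examined in the rest of this section.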
We can clearly see that for every season, the overall profit percentage improves drastically as we select bets with higher KCO values. Every orange bar is higher than the league's respective blue bar, and every purple bar is higher than the league's respective orange and blue bars outside the Premier League. While bets with a KCO value greater than 2.1 are highly successful in terms of profit percentage, the number of such bets is small. The Bundesliga in 2019 had only 26 bets with a KCO value greater than 2.1, which was the most we saw in a season. While the profit percentage is higher at that threshold, the overall profit is higher when placing bets with a KCO value greater than 2 or 2.05. Hence, if the goal is profit maximization, a user could place bets with a KCO value greater than 2 or 2.05. Below is a graph highlighting the amounts risked and won for these bets:
The KCO value is a unique aspect of the model, as it can be tailored to an individual's preference for how risky they would like to be with their bets. To provide a holistic explanation, below is a table highlighting the results for the model:

| League | Amount Risked (KCO Value Greater than 2) | Amount Profited (KCO Value Greater than 2) | Amount Risked (KCO Value Greater than 2.1) | Amount Profited (KCO Value Greater than 2.1) |
|---|---|---|---|---|
| Bundesliga 2018 | $3251.95 | $1273.59 | $454.73 | $616.80 |
| Bundesliga 2019 | $4705.58 | $7780.69 | $1720.42 | $5014.22 |
| La Liga 2018 | $4633.20 | $1363.87 | $1098.77 | $703.33 |
| La Liga 2019 | $5517.66 | $2847.05 | $1215.85 | $2555.30 |
| Ligue 1 2018 | $4727.76 | $1812.51 | $894.89 | $1117.37 |
| Ligue 1 2019 | $4136.08 | $1555.22 | $674.37 | $519.70 |
| Premier League 2018 | $5261.52 | $3483.21 | $1148.28 | $933.22 |
| Premier League 2019 | $5464.42 | $2060.52 | $897.53 | $176.17 |
| Serie A 2018 | $4519.20 | $796.81 | $1178.63 | $1200.40 |
| Serie A 2019 | $4070.81 | $804.77 | $704.14 | $625.49 |
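The profit percentages implied by the table can be recomputed with a short Python sketch. The figures are copied from the table; the variable names and the definition of profit percentage as amount profited divided by amount risked are assumptions, although that definition reproduces the 291.45% quoted for the 2019 Bundesliga at KCO greater than 2.1.

```python
# (league, risked KCO>2, profited KCO>2, risked KCO>2.1, profited KCO>2.1)
results_table = [
    ("Bundesliga 2018", 3251.95, 1273.59, 454.73, 616.80),
    ("Bundesliga 2019", 4705.58, 7780.69, 1720.42, 5014.22),
    ("La Liga 2018", 4633.20, 1363.87, 1098.77, 703.33),
    ("La Liga 2019", 5517.66, 2847.05, 1215.85, 2555.30),
    ("Ligue 1 2018", 4727.76, 1812.51, 894.89, 1117.37),
    ("Ligue 1 2019", 4136.08, 1555.22, 674.37, 519.70),
    ("Premier League 2018", 5261.52, 3483.21, 1148.28, 933.22),
    ("Premier League 2019", 5464.42, 2060.52, 897.53, 176.17),
    ("Serie A 2018", 4519.20, 796.81, 1178.63, 1200.40),
    ("Serie A 2019", 4070.81, 804.77, 704.14, 625.49),
]


def pct(profit, risked):
    """Profit as a percentage of the amount risked."""
    return 100.0 * profit / risked


# Per-league profit percentage at each threshold.
by_league = {
    league: (round(pct(p2, r2), 2), round(pct(p21, r21), 2))
    for league, r2, p2, r21, p21 in results_table
}

# Column totals across all ten league seasons.
total_risked_2 = round(sum(row[1] for row in results_table), 2)
total_profit_2 = round(sum(row[2] for row in results_table), 2)
total_risked_21 = round(sum(row[3] for row in results_table), 2)
total_profit_21 = round(sum(row[4] for row in results_table), 2)

for league, (pct_2, pct_21) in by_league.items():
    print(f"{league}: {pct_2}% (KCO > 2), {pct_21}% (KCO > 2.1)")
```

Summing the columns this way also gives a quick consistency check between the table and the totals quoted in the text.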
If we were to use our model for the past 2 years, placing bets that had a KCO value greater than 2, we would profit $23,778.24 from betting on 1164 games. If we only placed bets with a KCO value greater than 2.05, we would profit $17,450.94 from betting on 387 games. Lastly, if we only placed bets with a KCO value greater than 2.1, we would risk $9,987.61 and profit $13,462.00 (the column totals in the table above) from betting on 161 games.

In order to fully understand the importance of the KCO value, we can look at the amount of money we would win or lose without implementing the KCO aspect. If we were to simply bet on the outcome with the highest probability for each game, over the 10 seasons we would profit $62,340. While this seems like a large amount, we would be risking over $237,800. This yields a profit percentage of 26.22%, which is not very good compared to the model with the KCO element. There is a significantly larger return on investment when using the KCO aspect within the model, and hence we believe it is extremely important to use. Moreover, as discussed earlier, the different KCO intervals allow a user to tailor the model to their preference for how risky they would like to be. This is also a big advantage provided by the KCO aspect of the model.

7. Future Research
7.1. Bivariate and Zero-Inflated Poisson Distributions

It would be extremely interesting to repeat this process with a bivariate or zero-inflated Poisson distribution. A bivariate distribution provides joint probabilities for two variables, allowing every combination to be accounted for, and a bivariate Poisson distribution additionally allows the two counts to be correlated. Hence, by using Home Goals and Away Goals as the two variables, the distribution would provide probabilities for each combination of home goals and away goals. In a similar manner to what we do in this paper, we can sum the probabilities for each corresponding event.

A zero-inflated Poisson distribution is similar to a Poisson distribution; however, it is primarily used when an excess of zeros is expected within the data. Considering that outcomes with 0 goals scored by a team tend to occur more often than a standard Poisson distribution predicts, a zero-inflated Poisson distribution may be better suited to predicting probabilities.
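For reference, the standard zero-inflated Poisson probability mass function can be sketched in Python as below. It mixes a point mass at zero, with weight π, with an ordinary Poisson distribution. The value π = 0.05 is an arbitrary illustration and not an estimate from the paper's data; λ = 1.37 is borrowed from the goals-based example in Section 2.1.

```python
import math


def poisson_pmf(x, lam):
    """Standard Poisson probability of exactly x events."""
    return math.exp(-lam) * lam ** x / math.factorial(x)


def zero_inflated_poisson_pmf(x, lam, pi):
    """Zero-inflated Poisson: extra probability pi of a structural zero."""
    base = (1.0 - pi) * poisson_pmf(x, lam)
    return pi + base if x == 0 else base


lam, pi = 1.37, 0.05  # pi is a hypothetical inflation weight
plain_zero = poisson_pmf(0, lam)
inflated_zero = zero_inflated_poisson_pmf(0, lam, pi)
total = sum(zero_inflated_poisson_pmf(k, lam, pi) for k in range(50))
print(round(plain_zero, 4), round(inflated_zero, 4))
```

Setting π to 0 recovers the ordinary Poisson distribution, so the model used in this paper is a special case and π could be fitted alongside λ.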
7.2. Factoring Lineup Selections Within Our Predictions

A major limitation of our model is that it does not account for injuries and lineup changes. If we can factor in the strength of lineups using a holistic statistic, the accuracy of our predictions could improve drastically. An example would be using the recently created Advanced RPM metric by the Syracuse University Soccer Analytics Club. By using each player's Advanced RPM value within our model, each lineup's strength would be accounted for. This factors in which players are playing for each team, allowing us to create even more accurate predictions.
7.3. Adjusting for Different Leagues

We can see a decent variation in success between the leagues we looked at. While all of our models are profitable and yield a great amount of profit, there is possible room for improvement by adjusting for the talent distribution within each league. For example, the Premier League is considered one of the more competitive leagues; if we can factor that into our model, the suggested bets and overall profit from the model could be significantly improved.
8. Conclusion

When past data is used appropriately to predict game outcome probabilities and funds are then allocated efficiently to minimize risk, it is clear that there is money to be made betting on European soccer. Using our predictive model together with the Kelly Criterion, we found a large amount of success. Filtering on KCO values in each respective league helps us maximize our return on investment when betting. We found the most success and the greatest profit percentage for games where the KCO value was greater than 2.1, with the profit percentage rising as high as 291.45% in the Bundesliga in 2019. As discussed above, there are always improvements that can be made to our model. We hope to expand our model and continue to grow those profit percentages, finding the most accurate and profitable way to predict European soccer match outcomes.
References

[1] How to calculate Poisson distribution for football betting. (n.d.). Retrieved 2020, from https://help.smarkets.com/hc/en-gb/articles/115001457989-How-to-calculate-Poissondistribution-for-football-betting
[2] List of Competitions. (n.d.). Retrieved 2020, from https://fbref.com/en/comps/
[3] The real Kelly Criterion: A critical analysis of the popular staking method. (2017). Retrieved 2020, from https://www.pinnacle.com/en/betting-articles/Betting-Strategy/the-real-kellycriterion/HZKJTFCB3KNYN9CJ
[4] Football Results, Statistics & Soccer Betting Odds Data. (n.d.). Retrieved 2020, from https://www.football-data.co.uk/data.php
[5] How To Hedge A Bet: Sports Betting Strategy Explained. (n.d.). Retrieved 2020, from https://www.thelines.com/betting/hedge
[6] Haghani, Victor and Dewey, Richard. Rational Decision-Making under Uncertainty: Observed Betting Patterns on a Biased Coin (October 19, 2016). Retrieved 2020, from https://papers.ssrn.com/sol3/papers.cfm?abstract_id=2856963