regression to the mean at the masters golf …economics-files.pomona.edu › garysmith › econ190...

Regression to the Mean at The Masters Golf Tournament A comparative analysis of regression to the mean on the PGA tour and at the Masters Tournament Kevin Masini Pomona College Economics 190

Upload: others

Post on 29-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Regression to the Mean at The Masters Golf …economics-files.pomona.edu › GarySmith › Econ190 › Econ190 2018...one of the most famous golf courses in the world. The Masters



1. Introduction

Every sport involves elements of luck and skill. Even on the PGA Tour, which is considered the highest level of golf, scores and winners are often determined by a fortuitous bounce onto the green or an unlucky kick into a hazard. Because golf is such a game of inches, there is an imperfect correlation between player performance and skill. This imperfect correlation can be seen in all sports, and is especially evident in the game of golf. This is why we see so many different winners on the PGA Tour and why it is so difficult for players to win multiple tournaments in a given season and even throughout a player’s career. The aforementioned imperfect correlation leads to a phenomenon known as regression to the mean.

1.1 Regression to the Mean

Regression to the mean is the phenomenon where someone who performs toward an extreme one year is likely to perform closer to the mean the following year. Regression to the mean can be seen in many different aspects of life, but is especially noticeable in sports. It was first observed in 1886 when Sir Francis Galton studied the relationship between the heights of parents and their children (Galton, 1886). This inaugural work has led to further research on the phenomenon. A well-known example of regression to the mean is the “sophomore slump”. The sophomore slump is where a player who has a particularly exceptional rookie season shows a decline in their second season. This is very much the definition of regression to the mean. A rookie who had an exceptional season likely outperformed their true ability and will regress towards the mean the following year. Likewise, a player who underperforms in their first season will likely perform better in their second season.
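The sophomore-slump argument can be illustrated with a small simulation. The sketch below is not part of the paper's analysis: it assumes that performance is the sum of a fixed skill term and an independent luck term, both normally distributed, and that higher values mean better performance (the reverse of golf scoring). The function names, the number of players, and the spreads are all hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_seasons(n_players=1000, skill_sd=1.0, luck_sd=1.0, seed=0):
    """Simulate two seasons under the assumption performance = skill + luck.

    Skill is fixed per player; luck is redrawn independently each season.
    """
    rng = random.Random(seed)
    skills = [rng.gauss(0.0, skill_sd) for _ in range(n_players)]
    year1 = [s + rng.gauss(0.0, luck_sd) for s in skills]
    year2 = [s + rng.gauss(0.0, luck_sd) for s in skills]
    return year1, year2


def top_decile_means(year1, year2):
    """Mean year-1 and year-2 performance of the players in the year-1 top 10%."""
    order = sorted(range(len(year1)), key=lambda i: year1[i], reverse=True)
    top = order[: len(year1) // 10]
    mean1 = statistics.mean(year1[i] for i in top)
    mean2 = statistics.mean(year2[i] for i in top)
    return mean1, mean2


year1, year2 = simulate_seasons()
m1, m2 = top_decile_means(year1, year2)
print(f"top decile: year 1 mean = {m1:.2f}, year 2 mean = {m2:.2f}")
```

Under these assumptions the year-1 standouts remain above average in year 2, but by a smaller margin, because their skill carries over while their year-1 luck does not.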

1.2 The Masters

Each season there are nearly 50 PGA Tour events. Of these tournaments there are four major tournaments (majors). The four majors are viewed as the most important tournaments each year. Of the four, The Masters Tournament is the only one played at the same course every year. The Masters was first played in 1934 and typically has a field of eighty to one hundred of the best golfers in the world. Each year The Masters is played at Augusta National, one of the most famous golf courses in the world.

The Masters has been played at Augusta National 73 times; of those 73, 47 have been won by multiple-time winners. That is, people who have won at least twice account for nearly two-thirds of the victories at Augusta. That means there have been 26 one-time winners at The Masters. Trevor Immelman won the tournament in 2008 as one of his only two wins on the PGA Tour. Furthermore, he has only finished in the top 10 twice in his fifteen appearances at Augusta. This is a rare occurrence at The Masters. Typically, fans see familiar names atop the leaderboard each year. For example, Phil Mickelson has finished in the top 10 at The Masters in fourteen of his twenty-four professional starts, winning three times. To put that into perspective, Phil has finished in the top 10 in 58% of his Masters starts compared to 34% of his PGA Tour starts. Similar to Mickelson, many players seem to ‘show up’ at The Masters every year. Whether it be the course, the fact that many players tailor their schedule around the tournament, or some other reason, it seems that certain players show less regression to the mean from year to year at The Masters. It is because of this that I hypothesize that we will see less regression to the mean at The Masters than is seen during the entire PGA Tour season. This holds both from year to year and from round to round.

2. Literature Review

Regression to the mean is studied in a number of different areas, with sports being one of the main focuses. When it comes to sports, a player’s performance can be modeled by a combination of luck and skill. Essentially, each athlete has a base skill level and then has different levels of luck on a given day or during a given season. In terms of golf, we see these fluctuations in luck more often than in the typical sport. In Thinking, Fast and Slow (2011), Kahneman offers a simple model of luck and skill, which is as follows:

success = talent + luck

great success = a little more talent + a lot of luck

This simple model offers insight on regression to the mean in golf and how to intuitively understand the fluctuations in players’ scores. Think of the first two rounds of a golf tournament. Say that the average score is par, or a 72. One would expect that a player that shot a 65 has above-average skill, but also experienced above-average luck. This player is likely to be successful on the second day, but probably less successful because they will not be as lucky as they were on the first day (Kahneman, 2011). Kahneman does a good job of describing the theory behind regression to the mean and more specifically luck and skill in golf, but does not offer any data on the subject.
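The 65-versus-72 example can be made concrete with the standard linear-regression prediction. The sketch below is an illustration rather than anything Kahneman or this paper computes: it assumes both rounds share the same field mean and the same spread of scores, so that the predicted second-round score is the field mean plus the round-to-round correlation times the first-round deviation. The function name and the correlation value of 0.3 are hypothetical.

```python
def predicted_second_round(first_score, field_mean, round_correlation):
    """Predict the next score by shrinking the first-round deviation toward the mean.

    Assumes both rounds have the same mean and standard deviation, in which
    case the regression slope equals the round-to-round correlation.
    """
    return field_mean + round_correlation * (first_score - field_mean)


# Correlation 1.0 = scores driven only by skill, 0.0 = only by luck,
# 0.3 = a hypothetical mix of the two.
for r in (1.0, 0.3, 0.0):
    prediction = predicted_second_round(65, 72, r)
    print(f"correlation {r:.1f}: predicted second round = {prediction:.1f}")
```

With a correlation of 0.3, the player who opened with a 65 would be predicted to shoot about 69.9: still better than the field, but much closer to par than the first round.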

Connolly and Rendleman (2008, 2009) use this model of luck and skill, but offer more insights on the direct result that it has on golfers. They discovered that the winner of a normal PGA Tour event experiences roughly 2.5 strokes per round of abnormally favorable random variation in scoring. Broadie and Rendleman (2015) went deeper in their analysis of luck and skill at all levels of golf by looking at how players’ performance changed from the first round to the second round of tournaments. They split players into two groups based on their first-round performance, Group 1 being players in the top half and Group 2 being players in the bottom half. They then looked at how players in each group performed in the second round. They found that Group 1 as a collective performed much worse on the second day while Group 2 showed much improvement. This test showed clear evidence of regression to the mean between the first two rounds of professional golf tournaments. Their analysis also looked at how different skill levels are affected by luck and skill. They discovered that as you decrease the skill level of golfers from professionals to amateurs to your everyday country club golfer, the variation in scores is more likely to be due to skill rather than luck when the players are less skilled. This is known as the paradox between luck and skill.

Schall and Smith (2000) looked at regression to the mean in professional baseball players. Their analysis did not focus on the model of luck and skill, but used a very similar model for player performance. They did a season-by-season analysis of batting averages and earned run averages, standardized each season to have a mean of zero and a standard deviation of 1. They found that there was an imperfect correlation in performance from one year to the next. Because performance is imperfectly measured, players’ batting averages and earned run averages regress towards the mean.


3. Data

This paper utilizes data obtained from the PGA Tour’s ShotLink database. The database has data on the overall results of tournaments as well as shot-by-shot data for every shot hit in competition play. The PGA Tour has hundreds of volunteers at each tournament to help with the collection of the shot-by-shot data. They use this shot-by-shot data to run analyses on players and tournaments to offer insight into how players individually and as a group perform on a number of different layers of skill sets.

In terms of this analysis, the shot-by-shot data is not necessary. This paper utilizes player scores during the first two rounds at The Masters Tournament as well as average first- and second-round scores for players throughout the entire season. Scores from the third and fourth rounds are not used as they occur after a number of players are “cut” from the tournament. Data was pulled for the ten-year stretch from 2008 until 2017.

4. Methodology

This analysis differs from previous analyses in that it is a comparative analysis between the PGA Tour season and The Masters Tournament. I look to see if there is a significant difference in how players regress to the mean at The Masters compared to throughout the season. Regression to the mean is looked at from year-to-year as well as from round-to-round in a given year. A typical professional golf tournament consists of four rounds of tournament play with poorer performing players being cut following the second round. This paper focuses on the first two rounds of the tournament in order to include every player in the field for a given tournament. In order to see how players perform from one round to the next, this study uses a test very similar to the one performed by Broadie and Rendleman (2015). The second part of the analysis is to see how players perform across seasons. In order to run this analysis this paper will use a model similar to that used by Schall and Smith (2000).

4.1 Round-By-Round Analysis

The round-by-round analysis compares how players perform from one round to the next during the PGA Tour season and at The Masters. In each case, players are assigned to one of two groups after the first round of play. The top half (the players who shot the lowest scores) is placed in Group 1, and the bottom half is placed in Group 2. Then the average second-round score is computed for the same groups.

There are several different factors that go into the grouping of players. Players in the first group may simply be more skilled than those in the second group. Or, it could be that the first group just experienced more favorable random variation, also known as “luck”. If it was only the skill of the player that determined the groups, one would expect that the players from Group 1 would have a second-round average score roughly the same number of strokes better than Group 2 as they did in the first round. If luck was the only factor in the first round, then one would expect that the two groups would have averages that are close to equal in the second round. Finally, if a combination of luck and skill is what determines scores, then one would expect that the difference between second-round scores would be smaller than the difference was for first-round scores. The differences between groups are then compared between the PGA Tour season and The Masters. This comparison can be quantified by looking at the correlation between differences.
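The two-group test described above can be sketched in a few lines. This is an illustrative reading of the procedure, not the code used for the paper: the function name is hypothetical, the scores are made-up numbers rather than ShotLink data, and an odd-sized field would put the extra player in Group 2.

```python
import statistics


def group_gap_by_round(round1, round2):
    """Split players at the first-round median and return (gap_r1, gap_r2).

    Each gap is the Group 2 mean minus the Group 1 mean, so a positive gap
    means Group 1 (the first-round top half) scored lower, i.e. better.
    """
    order = sorted(range(len(round1)), key=lambda i: round1[i])
    half = len(order) // 2
    group1, group2 = order[:half], order[half:]

    def mean_of(scores, members):
        return statistics.mean(scores[i] for i in members)

    gap_r1 = mean_of(round1, group2) - mean_of(round1, group1)
    gap_r2 = mean_of(round2, group2) - mean_of(round2, group1)
    return gap_r1, gap_r2


# Hypothetical scores for eight players (same player order in both lists).
round1 = [66, 68, 70, 71, 73, 74, 76, 78]
round2 = [70, 69, 72, 71, 72, 73, 71, 74]
gap_r1, gap_r2 = group_gap_by_round(round1, round2)
print(f"first-round gap = {gap_r1:.2f}, second-round gap = {gap_r2:.2f}")
```

A second-round gap close to the first-round gap would point to skill alone, a gap near zero to luck alone, and anything in between to a mix of the two, which is the pattern in this made-up example (6.5 strokes shrinking to 2.0).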


4.2 Year-To-Year Analysis

In order to compare player scores from different years, performance can be standardized by finding the difference between a player’s performance from a given year and the mean performance for all players during said year. This difference is then divided by the standard deviation of performance across all players for the season.

Following the work of Schall and Smith (2000), a player’s performance for a given year is determined by an expected value (x), which can be thought of as the player’s skill level or true ability. The player’s actual performance then differs from their true ability by a random term (E) that has an expected value of zero and is independent of skill as well as the random term’s value in other seasons. This then gives us the following equation:

𝑌 = 𝑥 + 𝐸

Once players’ scores are standardized, players’ performance can be compared from year-to-year and between the PGA Tour season and The Masters.
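The standardization step and the year-to-year comparison can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's code: the function names are hypothetical, the scoring averages are invented, and the population standard deviation is used because the paper does not say whether the population or sample version was applied.

```python
import math
import statistics


def standardize(scores):
    """Convert one season's scores to z-scores (mean 0, standard deviation 1)."""
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)  # assumption: population standard deviation
    return [(s - mean) / sd for s in scores]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scoring averages for the same five players in two seasons.
season_a = [69.8, 70.4, 70.9, 71.3, 72.1]
season_b = [70.5, 70.2, 71.0, 70.8, 71.6]
z_a, z_b = standardize(season_a), standardize(season_b)
r = pearson(z_a, z_b)
print(f"year-to-year correlation r = {r:.2f}, R-squared = {r * r:.2f}")
```

With both seasons standardized, the slope of a regression of one season on the other equals the correlation r, so any r below 1 implies that a player's expected standardized score next season is closer to zero than this season's.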

5. Results

Analyses of the past 10 seasons show that regression to the mean at The Masters is not significantly different than it is during the PGA Tour season. If anything, there is more regression to the mean at The Masters than during the season. When looking at the difference between player score and the average score, the R-squared value at The Masters for the 2015 and 2016 seasons is .105. This is compared with a value of .185 for the PGA Tour season. One can see that while both values are low, the R-squared for The Masters is noticeably lower than during the PGA Tour season.

When looking from round-to-round in 2015, the PGA Tour season shows, as expected, regression to the mean with an R-squared value of .131. The Masters showed an even smaller value. The R-squared for The Masters in 2015 is .00034, showing nearly no relationship between first- and second-round scores of players. This seems to show the paradox of luck and skill, which has been seen in previous works.

This lack of correlation between the scores of players between rounds is evident in the round-by-round analysis using two groups. Table 1a below shows that the groups converge towards the mean in the second round. This gives solid evidence confirming the work of Broadie and Rendleman (2015), saying that a combination of luck and skill is what leads to total performance in professional golf. Furthermore, there was no significant difference between the groups at The Masters and during the regular PGA Tour season. During the PGA Tour season, players in the first group still have a lower score than those in the second group in the second round. This is not true for The Masters. At The Masters we see that the first group has a slightly negative correlation between the first and second rounds. Regression to the mean is so severe that Group 1 scores worse than the second group during the second round at The Masters. This seems to suggest that deviation in scores between groups at The Masters is caused solely by luck.

When comparing the correlation of first- and second-round scores between the different groups, one sees very little correlation for both groups. Maybe the most interesting part is the manner in which correlations fluctuate from year to year, as can be seen in Table 1b. For example, in 2015 Group 1 saw a fairly significant positive correlation both during The Masters (.24) and during the season (.44), while the group’s correlation was nearly zero for all other seasons. Group 2, on the other hand, showed a positive correlation in 2016 during the season (.28) and a negative correlation of similar size at The Masters (-.22). The fact that the correlations are typically close to zero, and that they fluctuate year by year and group by group, goes to show just how random golf can be.

Looking at the correlation between rounds for the entire field at both The Masters and during the PGA season over the past 10 years further reveals the randomness between rounds. The PGA season is much more consistent than The Masters, with correlations fluctuating between .29 and .51 over the past 10 years. On the other hand, The Masters fluctuates from .08 to .47 over the same years. The PGA season has a higher correlation between rounds in 8 of the 10 seasons, again suggesting less regression to the mean during the season than during The Masters (Figure 1).

I then split players into two groups based on their average score on tour over the past four years. Group 1 consists of the top half of players of the period and Group 2 consists of the bottom half. The point of this was to split players into groups based on their true ability in order to determine if better players regress to the mean less than less-skilled players, Group 1 being the better players and Group 2 being the less-skilled players. I then looked at how each group performed from the first to the second round at The Masters and during the entire PGA Tour season. I found that the players in Group 1 played the first round of The Masters nearly half a stroke better than the second round over the last three tournaments. This is compared to them shooting .15 strokes better in the first round during the entire season over the past three years. On the other side, the second group shot nearly half a stroke better in the second round of The Masters than the first. This is compared to scoring slightly better in the second round throughout the PGA Tour season. These larger differences between rounds at The Masters provide further evidence of more regression to the mean at The Masters than during the PGA Tour season.

Table 1a: round-by-round comparison

While this test did not show any difference in regression to the mean between different skill groups, it did show that the groups performed much differently from round to round. The test shows evidence that the more skilled players on tour play better in the first round than the second round and vice versa for less skilled players. This could be because the worse players have to play better to make the cut, or it could be caused by some other reason.

Table 1b: round-by-round correlation


6. Conclusion

Analyses show that there is not a significant difference in regression to the mean between The Masters Tournament and the PGA Tour season. This is apparent in both the round-to-round and the year-to-year analyses. It is of note that the number of observations is low because the average golf tournament has fewer than one hundred players.

One thing that is not controlled for in the round-by-round analysis is differing weather conditions. Players typically have one round in the morning and one round in the afternoon during the first two rounds of a tournament. On occasion there is an extreme difference in playing conditions between the morning and afternoon. This change in weather could be a cause for regression to the mean when looking at a single tournament. It is unlikely that this would be a factor when looking at the entire season.

Figure 1: Round-to-round correlation during PGA season and at The Masters from 2008-2017


The fact that at The Masters players from differing groups score practically the same in the second round reveals that scoring at The Masters is based more on luck than during the PGA season. This could be due to the fact that it is much more difficult to qualify for The Masters than it is for regular events, meaning that the players at The Masters are closer in true ability than they are in a normal tournament.

If players at The Masters show more regression to the mean than during the season, then why is it that players like Phil Mickelson seem to perform better at The Masters? One explanation could be that Mickelson and other players simply match up well with Augusta. It is seen in other tournaments that players play better at certain courses. It could be that Mickelson just so happens to have a game that fits well with one of the most prestigious courses in the world.


7. References

(1) Broadie, Mark, and Richard Rendleman. “Are the Official World Golf Rankings Biased?” http://www.columbia.edu/~mnb2/Broadie/Assets/owgr_20120507_broadie_rendleman.pdf, 7 May 2012.

(2) Connolly, Robert A. and Richard J. Rendleman, Jr., 2008, “Skill, Luck and Streaky Play on the PGA Tour,” Journal of the American Statistical Association, 103 (March): 74-88.

(3) Connolly, Robert A. and Richard J. Rendleman, Jr., 2012, “What it Takes to Win on the PGA Tour (If Your Name is ‘Tiger’ or If It Isn’t),” Interfaces, November-December, 42(6): 554-576.

(4) Galton, F. (1886), “Regression Towards Mediocrity in Hereditary Stature,” Journal of the Anthropological Institute, 15, 246-263.

(5) Kahneman, Daniel. Thinking, Fast and Slow. Farrar, Straus and Giroux, 2013.

(6) Past Winners, 2018. www.masters.com/en_US/discover/past_winners.html.

(7) PGA. “What Is ShotLink Intelligence.” PGA Tour, 2005, www.pgatour.com/stats/shotlinkintelligence/overview.html.

(8) Teddy Schall & Gary Smith (2000), “Do Baseball Players Regress toward the Mean?”, The American Statistician, 54:4, 231-235.


8. Graphs and Figures

Figure 2: PGA round 1 comparison 2016-2017

Figure 3: Masters round 1 comparison 2016-2017

Figure 4: PGA round-to-round 2017

Figure 5: Masters round-to-round 2017

Table 2: PGA Tour groups for round comparison

Table 3: Masters groups for round comparison