wolfram wk2014

Upload: daselknam

Post on 03-Jun-2018


TRANSCRIPT

  • 8/12/2019 Wolfram WK2014


Check out Etienne's updated predictions from Thursday, June 26 here.

The FIFA World Cup is underway. From June 12 to July 13, 32 national football teams play against each other to determine the FIFA world champion for the next four years. Who will succeed? Experts and fans all have their opinions, but is it possible to answer this question in a more scientific way? Football is an unpredictable sport: few goals are scored, the supposedly weaker team often manages to win, and referees make mistakes. Nevertheless, by investigating the data of past matches and using the new machine learning functions of the Wolfram Language, Predict and Classify, we can attempt to predict the outcome of matches.

The first step is to gather data. FIFA results will soon be accessible from Wolfram|Alpha, but for now we have to do it the hard way: scrape the data from the web. Fortunately, many websites gather historical data (www.espn.co.uk, www.rsssf.com, www.11v11.com, etc.) and all the scraping and parsing can be done with Wolfram Language functions. We first stored web pages locally using URLSave and then imported these pages using Import[myfile, "XMLObject"] (and Import[myfile, "Hyperlinks"] for the links). Using XML objects allows us to keep the structure of the page, and the content can be parsed using Part and pattern-matching functions such as Cases. After the scraping, we cleaned and interpreted the data: for example, we had to infer the country from a large number of cities and used Interpreter to do so:

From scraping various websites, we obtained a dataset of about 30,000 international matches of 203 teams from 1950 to 2014 and 75,000 players. Loaded into the Wolfram Language, its size is about 200 MB of data. Here is a match and a player example stored in a Dataset:

Matches include score, date, location, competition, players, referee, etc. along with players' birth date, height, weight, number of selections in national teams, etc. However, the dataset contains missing elements: most players have missing characteristics, for example. Fortunately, machine learning functions such as Predict and Classify can handle missing data automatically.

Before starting to construct a predictive model, let's compute some amusing statistics about football matches and players.

The mean number of goals per match is 2.8 (which corresponds to one goal every 30 minutes on average). Here is the distribution of this variable:

Predicting Who Will Win the World Cup with Wolfram Language

June 20, 2014, by Etienne Bernard


It can be roughly approximated by a PoissonDistribution with mean 2.8, which tells us that the probability rate for a goal to happen is about the same in most matches. Another interesting analysis is the evolution of the mean number of goals per match from the 1950s to present day:

We see that in the '50s, almost four goals were scored on average, while sadly it is only about 2.5 goals per match nowadays. As a result, the probability for teams to tie is now higher (almost 25% end in draws now, against 20% in the '50s).
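The Poisson approximation mentioned earlier (mean of about 2.8 goals per match) can be made concrete with a few lines of code. This is a minimal sketch in Python rather than the Wolfram Language of the post; the name `poisson_pmf` is an illustrative choice, and nothing here reproduces the author's own computation.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Probability of observing exactly k goals under a Poisson model."""
    return math.exp(-mean) * mean**k / math.factorial(k)

MEAN_GOALS = 2.8  # mean number of goals per match quoted in the text

# Probabilities of 0..10 goals in a match under the Poisson approximation.
goal_probs = [poisson_pmf(k, MEAN_GOALS) for k in range(11)]

for k, p in enumerate(goal_probs):
    print(f"{k:2d} goals: {p:.3f}")
```

Comparing these values with the empirical histogram of goals per match is one simple way to judge how rough the approximation is.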

Here are the evolutions of the (estimated) probabilities to win when teams are playing in their home country and when they are playing away:

The effect of playing at home is important: teams have about a 50% chance of winning when they are at home, while only about one chance in four when they are away! A naive predicting strategy might then be to always predict the victory of the home team. But there is not always a home team: for this World Cup, the only home team is Brazil.

Let's now analyze what we can determine about players. Here is the average player height for matches played in a given year:


As expected, players tend to be taller (matching the growth of the entire population). However, they have not gotten heavier (at least not in the last 30 years); in fact, they are getting thinner. Here is their average Body Mass Index (BMI, computed as weight/height²) as a function of time:

We can see that players' average BMI first increased from 23 kg/m² to 24 kg/m², then stayed roughly the same for about a decade, and has since been steadily decreasing, down to under 23 kg/m² in 2014. It is hard to interpret the reasons for this behavior, though one could argue that in modern football, speed and agility are preferred over impact skills.

Let's now dive into the predictions of football matches. In order to predict the winning probabilities of the World Cup, we need to be able to predict the results of individual matches. Predicting the exact score would be interesting, but it is not necessary for our problem. Instead we prefer predicting whether the first team will win (labeled Team1), the second team will win (labeled Team2), or the match will end in a draw (labeled Draw). We thus want a classifier for the classes Team1, Team2, and Draw.

A first classifier would be to pick a class randomly with a uniform distribution, which would give 33% accuracy. To do better, we can use some of the statistical information we gathered earlier on: for example, we know that only 23% of matches are tied, so we could then predict either Team1 or Team2 at random, which would give 38.5% accuracy. To improve upon these naive baselines, we need to start using information about matches and teams, that is, to extract "features" and use them in machine learning algorithms.
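The two baseline figures follow from simple arithmetic, spelled out here as a short Python sketch (Python is used for illustration only; the post itself works in the Wolfram Language). The only input is the 23% draw rate quoted above.

```python
DRAW_RATE = 0.23  # share of tied matches quoted in the text

# Baseline 1: pick one of the three classes uniformly at random.
# Whatever the true class is, the guess matches it with probability 1/3.
uniform_accuracy = 1 / 3

# Baseline 2: never predict Draw, pick Team1 or Team2 with equal probability.
# Draws are always missed; each non-draw match is guessed right half the time.
no_draw_accuracy = (1 - DRAW_RATE) / 2

print(f"uniform guess: {uniform_accuracy:.1%}")
print(f"never predict Draw: {no_draw_accuracy:.1%}")
```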

With our dataset, we can construct many features in order to feed machine learning algorithms: the number of goals scored in previous matches, the fact that a team plays at home, etc. These algorithms try to find statistical patterns in these features, which will be used to predict the outcome of matches. With the new functions Classify and Predict, we don't have to worry about how these algorithms work or which one to choose, but only about which features we want to give them. In our problem, we want to predict classes, and thus we will use the Classify function.

We saw in the previous analyses that when teams are playing in their country they have a greater chance of winning. This effect is also present for continents (although in a much less important way). We thus construct a first classifier that uses features indicating whether teams play in their own country or continent. The Country feature will be set to Team1 if the first team plays in their own country, Team2 if the second team plays in their own country, and Neutral if both teams play away. Same goes for the Continent feature (when both teams are from the same continent, the feature is also set to Neutral). Our dataset uses associations to have named features; here is a sample of it:

In order to assess the quality of our classifier, we split the dataset into a training set and a test set, which is composed of the 2000 most recent matches (the dataset is sorted by date here):
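As a rough illustration of the two steps just described, the following Python sketch derives the Country feature and then holds out the most recent matches as a test set. It is not the author's Wolfram Language code: the helper names, the record layout, and the three example matches are assumptions made for the sake of the example.

```python
from datetime import date

def location_feature(team1_home, team2_home, venue):
    """Return "Team1", "Team2" or "Neutral" depending on who plays at home."""
    if team1_home == team2_home:
        return "Neutral"  # e.g. both teams come from the same continent
    if venue == team1_home:
        return "Team1"
    if venue == team2_home:
        return "Team2"
    return "Neutral"  # both teams play away

def chronological_split(examples, test_size):
    """Sort by date and hold out the most recent `test_size` examples."""
    ordered = sorted(examples, key=lambda e: e["date"])
    if test_size <= 0:
        return ordered, []
    return ordered[:-test_size], ordered[-test_size:]

# Illustrative records (not taken from the post's dataset):
# (date, team1 country, team2 country, host country, result).
raw = [
    (date(2014, 3, 5), "Spain", "Italy", "Spain", "Team1"),
    (date(1998, 7, 12), "Brazil", "France", "France", "Team2"),
    (date(2010, 7, 11), "Netherlands", "Spain", "South Africa", "Team2"),
]
examples = [
    {"date": d, "Country": location_feature(t1, t2, host), "result": result}
    for d, t1, t2, host, result in raw
]

# The post holds out the 2000 most recent matches; one is enough here.
training_set, test_set = chronological_split(examples, test_size=1)
```

The same `location_feature` helper would serve for the Continent feature by passing continent names instead of country names. Splitting by date rather than at random keeps future matches out of the training set.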

We can now train the classifier with a simple command:


With this dataset, the k-nearest neighbors algorithm has been selected by Classify. We can now evaluate the classification performance on the test set:

We obtain about 48% accuracy, which roughly corresponds to the 50% accuracy when always predicting a home win (except that the test set also contains matches played in neutral locations).

Let's now add a very valuable feature: the Elo ratings of teams. Originally developed for chess, the Elo rating system has been adapted for football (see "World Football Elo Ratings"). This system rates teams according to how good they are. The rating has a probabilistic interpretation: if $D = \mathrm{Elo}_{\mathrm{team1}} - \mathrm{Elo}_{\mathrm{team2}}$, then the predicted probability for team1 to win is $P(D) = 1/(1 + 10^{-D/400})$.

The Elo rating of all teams starts at 1500 (this value is arbitrary). After a match is played by a given team, their Elo rating is updated according to the formula $\mathrm{Elo}_{\mathrm{new}} = \mathrm{Elo}_{\mathrm{old}} + K\,(r - P(D))$, where $P(D)$ is the probability for the team to win, $r$ is a variable marked 1 if the team won, 0 if they lost, and 0.5 for a draw, and $K$ is a coefficient that depends on the match type and the difference of goals. Here is an implementation of the rating update in the Wolfram Language:

where matchWeight gives a weight depending on the competition (60 for World Cup finals, 20 for friendly matches, etc.). Here are the computed Elo ratings with our dataset (restricted to matches before the World Cup):

and the time evolution of Elo ratings for some selected teams:
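To make the Elo update rule concrete, here is a minimal Python sketch. It assumes the standard Elo expected-score formula $P(D) = 1/(1 + 10^{-D/400})$ and the update $\mathrm{Elo}_{\mathrm{new}} = \mathrm{Elo}_{\mathrm{old}} + K\,(r - P(D))$. It is not the author's Wolfram Language implementation: the function names are illustrative, and $K$ is passed in directly rather than derived from the match type and goal difference.

```python
def expected_score(elo_team: float, elo_opponent: float) -> float:
    """P(D) = 1 / (1 + 10^(-D/400)), with D the rating difference."""
    d = elo_team - elo_opponent
    return 1.0 / (1.0 + 10.0 ** (-d / 400.0))

def update_elo(elo_team: float, elo_opponent: float, result: float, k: float) -> float:
    """Return the team's new rating: Elo_new = Elo_old + K (r - P(D)).

    result is 1 for a win, 0.5 for a draw and 0 for a loss.
    k is supplied by the caller; how it depends on the competition and
    on the goal difference is left out of this sketch.
    """
    return elo_team + k * (result - expected_score(elo_team, elo_opponent))

# Two teams at the arbitrary starting rating of 1500; the first one wins
# a match weighted with K = 60 (the weight the World Football Elo Ratings
# use for World Cup finals).
new_winner = update_elo(1500, 1500, result=1.0, k=60)
new_loser = update_elo(1500, 1500, result=0.0, k=60)
print(new_winner, new_loser)
```

Because both teams use the same $K$ and their expected scores sum to 1, the points gained by one team equal the points lost by the other.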


We then compute, before each match, the Elo ratings of both teams and add them as features. Here is a training example:

Again we train a classifier and test its accuracy:

This time, Classify chose the logistic regression method. With this new classifier, about 58.9% of test set examples are correctly classified, which is a great improvement upon the previous classifier. In matches where draws are forbidden (in the knockout phase, for example), this classifier obtains 75.7% accuracy.

Let's now add some extra features that we think are relevant in order to build a better classifier. Adding features carries a risk of overfitting (that is, poor generalization of our prediction to new examples). Fortunately, Classify has automatic regularization methods to avoid overfitting, so we should not be too concerned about that. We choose to add four extra features for each team:

• goal average of the last three matches
• mean age of players
• mean number of national selections of players
• mean Body Mass Index of players

Here is a training example of the dataset:

Let's now train our final classifier:

The logistic regression has again been used. We now generate a ClassifierMeasurements[...] object in order to query various performance results:

We now have 58.3% accuracy on the test set. In knockout-type matches, this classifier gives 76.5% accuracy. As we can see, it is only a marginal improvement on the previous classifier. This confirms how powerful the Elo rating feature is, and it is a sign that, from now on, accuracy percentages will be hard to improve. However, we have to keep in mind that our dataset contained many missing values for these extra features.

Let's now have a look at the confusion matrix for the classification on the test set:

This matrix shows the counts $c_{ij}$ of class $i$ examples classified as class $j$. The rows represent the true classes while the columns represent the predicted classes. For example, we can read that amongst the matches won by Team1, two have been classified as Draw, between 600 and 700 as Team1, and the remainder as Team2. Interestingly, the classifier decides to predict Draw very rarely. This is due to the low proportion of tied matches (only 23%), but it does not mean the classifier excludes the possibility of draws; here are the classification probabilities on an example:
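Returning to the confusion matrix described above, the following Python sketch shows how the counts $c_{ij}$ are built and read. It is an illustration only: the labels are made up, and the post obtains its matrix from ClassifierMeasurements rather than from code like this.

```python
CLASSES = ["Team1", "Team2", "Draw"]

def confusion_matrix(true_labels, predicted_labels, classes=CLASSES):
    """Return counts[i][j]: examples of class i that were classified as class j."""
    index = {label: n for n, label in enumerate(classes)}
    counts = [[0] * len(classes) for _ in classes]
    for truth, guess in zip(true_labels, predicted_labels):
        counts[index[truth]][index[guess]] += 1
    return counts

# Made-up labels, only to show how the matrix is read.
truth = ["Team1", "Team1", "Team2", "Draw", "Team1", "Draw"]
guess = ["Team1", "Team2", "Team2", "Team1", "Team1", "Team2"]
matrix = confusion_matrix(truth, guess)

# Accuracy is the diagonal (correct predictions) over the total.
accuracy = sum(matrix[i][i] for i in range(len(CLASSES))) / len(truth)

for label, row in zip(CLASSES, matrix):
    print(f"{label:>5}: {row}")
print(f"accuracy: {accuracy:.2f}")
```

In this toy example the Draw column is empty, which mirrors the behavior noted in the text: a classifier can assign draws a real probability and still almost never choose Draw as its most likely class.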

Is it possible to improve upon this classifier? Certainly, but we will probably need more and better-quality data. It would be interesting to have access to national championship results, infer players' skills, how players interact together, etc. With our data, the prospects for improvement seem limited, so we will thus continue using this classifier to predict World Cup matches.

Our goal is to predict the probabilities for each team to access a given stage of the competition (round of 16, quarter-finals, semi-finals, finals, and victory). We must infer these probabilities from the outcome probabilities of individual matches given by the classifier. One way to do so would be to compute the probabilities for all possible World Cup results. Unfortunately, the number of possible configurations grows exponentially with the number of matches; it will thus be very slow to compute. Instead, we will simulate World Cup results through Monte Carlo simulations: for each match, we randomly pick one of the outcomes (with RandomChoice) according to their distribution. We can then simulate the development of many imaginary World Cups and count how many times a given team reached a given stage.
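The core of the Monte Carlo idea, drawing an outcome according to its probability and counting how often it occurs, can be sketched in Python as follows. Here `random.choices` plays the role that RandomChoice plays in the post; the match probabilities are hypothetical and the function names are illustrative.

```python
import random
from collections import Counter

def simulate_match(probabilities, rng):
    """Draw one outcome ("Team1", "Team2", "Draw") according to its probability."""
    outcomes = list(probabilities)
    weights = [probabilities[o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def estimate_outcome_frequencies(probabilities, trials, seed=0):
    """Repeat the random draw many times and return the observed frequencies."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    counts = Counter(simulate_match(probabilities, rng) for _ in range(trials))
    return {outcome: counts[outcome] / trials for outcome in probabilities}

# Hypothetical classifier output for a single match.
match_probs = {"Team1": 0.6, "Team2": 0.15, "Draw": 0.25}
frequencies = estimate_outcome_frequencies(match_probs, trials=20_000)
print(frequencies)
```

With enough trials the observed frequencies approach the input probabilities. The tournament simulation applies the same counting to whole brackets, where enumerating every configuration would be too slow.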

We first compute the features associated with each team (continent, Elo rating, mean age, etc.). Here are the features for Brazil:

Using this, we construct a function converting the features of both teams into features used by the classifier:


In the group stage, a victory is three points, a draw one point, and a defeat zero points. Only the first and second teams qualify. Here is a function that simulates the qualified teams for the round of 16:

As we cannot compute goal averages, if two teams have an equal number of points, their order is chosen randomly.
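The group-stage rules above (3/1/0 points, top two qualify, ties on points broken at random) can be sketched in Python as below. This is not the author's function: `match_probabilities` is a hypothetical stand-in for the classifier, and the toy predictor at the end exists only to make the example deterministic.

```python
import random
from itertools import combinations

POINTS = {"win": 3, "draw": 1, "loss": 0}

def simulate_group(teams, match_probabilities, rng):
    """Simulate a round-robin group and return the two qualified teams.

    match_probabilities(a, b) must return a dict with the probabilities of
    "Team1" (a wins), "Team2" (b wins) and "Draw".
    """
    points = {team: 0 for team in teams}
    for a, b in combinations(teams, 2):
        probs = match_probabilities(a, b)
        outcome = rng.choices(list(probs), weights=list(probs.values()), k=1)[0]
        if outcome == "Team1":
            points[a] += POINTS["win"]
        elif outcome == "Team2":
            points[b] += POINTS["win"]
        else:
            points[a] += POINTS["draw"]
            points[b] += POINTS["draw"]
    # Ties on points are broken at random: shuffle first, then rely on the
    # stability of sorted() to keep that random order among equal scores.
    order = list(teams)
    rng.shuffle(order)
    ranking = sorted(order, key=lambda team: points[team], reverse=True)
    return ranking[:2]

def always_first(a, b):
    """Toy stand-in for the classifier: the first-listed team always wins."""
    return {"Team1": 1.0, "Team2": 0.0, "Draw": 0.0}

qualified = simulate_group(["A", "B", "C", "D"], always_first, random.Random(1))
print(qualified)
```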

We then code a function that simulates a knockout round from a list of countries. To do so, we use the option ClassPriors in order to tell the classifier that the probability of Draw in this phase is 0:
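One simple way to picture a draw-free prediction is to drop the Draw probability and rescale the other two so they sum to 1. The Python sketch below does exactly that. It is a simplification and an assumption on our part: it matches what changing the class priors does only if the relative priors of Team1 and Team2 are left unchanged, and it does not call Classify or ClassPriors.

```python
import random

def knockout_probabilities(probabilities):
    """Remove the Draw outcome and rescale the remaining probabilities to sum to 1."""
    remaining = {k: v for k, v in probabilities.items() if k != "Draw"}
    total = sum(remaining.values())
    return {k: v / total for k, v in remaining.items()}

def simulate_knockout_match(team1, team2, probabilities, rng):
    """Return the winner of a match that cannot end in a draw."""
    probs = knockout_probabilities(probabilities)
    return team1 if rng.random() < probs["Team1"] else team2

# Hypothetical three-class prediction for one match.
three_way = {"Team1": 0.45, "Team2": 0.30, "Draw": 0.25}
two_way = knockout_probabilities(three_way)
winner = simulate_knockout_match("Brazil", "Chile", three_way, random.Random(0))
print(two_way, winner)
```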

We can now have our full simulation function:


Here is one simulation and the corresponding plot of the tournament tree:

We can now perform many trials and count how many times each team reaches a given level of the competition.

After performing 100,000 simulations, here is what we obtained for winning probabilities:

As one might expect, Brazil is the favorite, with a probability to win of 42.5%. This striking result is due to the fact that Brazil has both the highest Elo ranking and plays at home. Spain and Germany follow and are the most serious challengers, with about 21.5% and between 15% and 16% probability to win, respectively. There is almost 80% chance that one of these teams will win the World Cup according to our model.

Let's now look at the probabilities to get out of the group phase:

This ranking follows the ranking of final victory. There are some interesting things to note: while Germany and Argentina have about the same probability to get out of their group, Germany is more than three times as likely to win. This is partly due to the fact that Germany has strong opponents in its group (Portugal, USA, and Ghana), while Argentina is in quite a weak group.

Finally, here are plots of the probabilities to reach each stage of the competition for the nine favorite teams:

We can see the domination of Europe and South America in football.

At the time of writing (June 17), some matches have already been played. Let's see how our classifier would have predicted them:


From the first 15 matches, 11 have been correctly classified, which gives 73.3% accuracy. This is higher than expected; we have been lucky. We will report the final accuracy on all the matches after the World Cup is over.

So what else can we do with this classifier? Besides being disappointed that our favorite team has little chance of winning, one straightforward application is for betting. How could we do that? Let's say that we just want to bet on the result of matches (Team1 wins, Team2 wins, or Draw). The naive approach would be to bet on the outcome predicted by the classifier, but this is not the best strategy. What we really want is to maximize our gain according to the probabilities predicted by the classifier and the bookmaker odds. In order to do so, we can use the option UtilityFunction, which sets the utility function of the classifier. This function defines our utility for each pair of actual/predicted classes. In order to make a decision, the classifier maximizes the expected utility. By default, the utility is 1 when an example is correctly classified, and 0 otherwise; therefore, the most likely class is predicted. In our case, the utility should be our money gain: if we do the correct prediction, it will be the betting odds for the corresponding outcome, and otherwise it will be 0. Here is how we can construct such a utility function using associations:
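The decision rule described above, choosing the prediction with the highest expected utility, can be sketched in Python with dictionaries standing in for associations. The probabilities and odds below are made up for illustration and are not those of the post; the function names are likewise assumptions.

```python
CLASSES = ["Team1", "Draw", "Team2"]

def betting_utility(odds):
    """Utility of predicting `predicted` when `actual` happens.

    A correct bet pays the odds for that outcome; a wrong bet pays 0.
    """
    return {
        (actual, predicted): (odds[predicted] if actual == predicted else 0.0)
        for actual in CLASSES
        for predicted in CLASSES
    }

def best_decision(probabilities, utility):
    """Pick the prediction with the highest expected utility."""
    def expected(predicted):
        return sum(probabilities[a] * utility[(a, predicted)] for a in CLASSES)
    return max(CLASSES, key=expected)

# Default 0/1 utility: the most likely class is chosen.
zero_one = {(a, p): float(a == p) for a in CLASSES for p in CLASSES}

# Made-up probabilities and decimal odds, for illustration only.
probs = {"Team1": 0.30, "Draw": 0.25, "Team2": 0.45}
odds = {"Team1": 4.0, "Draw": 3.0, "Team2": 2.0}

likely = best_decision(probs, zero_one)
bet = best_decision(probs, betting_utility(odds))
print(likely, bet)
```

With these numbers the most likely outcome is Team2, yet the best bet is Team1, because a less likely outcome with long enough odds has a higher expected payout.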

Now let's say that the odds of Switzerland vs. France (June 20) are:

• Switzerland: [illegible]
• Draw: 3.3
• France: [illegible]

The predicted probabilities are:

And the predicted outcome is that France will win:

However, if we add the betting odds in the utility, the decision is the opposite:

It thus seems reasonable to bet on Switzerland. Now, should we blindly follow the decision of the classifier? Well, there are some counterarguments. First, this method does not take into account our risk aversion: it will choose the maximum expected utility no matter what the risks are. This strategy is winning in the long run, but might lead to severe loss of money at a given time. We also have to consider the quality of the predictions: are they better than bookmakers' odds? Betting odds reflect what people think, and people often put feelings into their bet (e.g. they have a tendency to bet for their favorite team). In that sense, a cold machine learning algorithm will perform better. On the other hand, many bettors already use algorithms to bet and they are probably more sophisticated than this one. So use at your own risk!


26 Comments

Very interesting! However, some of the teams, e.g. Spain or Cameroon, have already been disqualified! Is there any way to update the algorithm to take info like this into account?

Posted by Monte Carlo June 2[illegible], 2014 at [illegible] am

Thank you for your comment! These predictions were done before the first match; we will publish a follow-up post after the group phase with updated predictions.

Posted by The Wolfram Team June 2[illegible], 2014 at [illegible] am

Cameroon is still in champ

Posted by chris June 2[illegible], 2014 at [illegible] pm

Great!

But Spain 2nd by your model; guess you better rerun the model. BTW would be great to use the new Wolfram Cloud for this assessment

Posted by Hans-Gerlach Woudboer June 2[illegible], 2014 at [illegible] am


Taking the 4 top teams, the probability that all 4 would survive the first round is only ~56% (by my estimate from the bar graph). Thus not anomalous for one of the 4 to be out. Of course given that one would be out, there was a priori only a 25% chance it would be Spain. But that could be said of any of the 4.

Posted by Rick Baartman June 22, 2014 at [illegible] pm

Sweet job guys!

Posted by theguyundertheskylight June 20, 2014 at 10:54 am

Nice job integrating many features of Mathematica 10. Spain should have read this blog entry, however, before deciding to exit the tournament with woeful performance.

Posted by Seth Chandler June 20, 2014 at [illegible] pm

Great work Etienne & Tali!!!

Posted by Mike Sollami June 20, 2014 at [illegible] pm

Lol[illegible]spain

Posted by Sobres June 20, 2014 at [illegible] pm

Could you provide the raw dataset?

Posted by Fernando Meyer June 20, 2014 at [illegible] pm

Thank you for your comment!


How the Classifier function decides what method to use? Thanks.

Posted by H June 20, 2014 at [illegible] pm

Thank you for your comment! In its current state, the Classify function first uses the number of examples, number of features, type of data etc. to determine possible models. Then, the best model is selected by cross validation: the models are trained on a part of the data, and tested on another part (the operation might be repeated using a different data split to improve the statistical relevance).

Posted by The Wolfram Team June 2[illegible], 2014 at 12:10 pm

Great Article!

However this kind of prediction can't evaluate beforehand strong teams that didn't play well historically, like Costa Rica and even Chile. Eager to see next post!

Posted by Eduardo Sanzio June 21, 2014 at [illegible] pm

Is the notebook you made for this post available for download?

Posted by Cameron June 21, 2014 at [illegible] pm

Thank you for your comment! Unfortunately, the notebook for this post is not available for download.

Posted by The Wolfram Team June 2[illegible], 2014 at 12:06 pm

Impressive model! I must say its exactitude is impressive, for Spain which had 21.5% chances of final victory is eliminated after two games.

Posted by Jean McDaugh June 22, 2014 at [illegible] am

I really don't understand how the US is supposed to perform better at all than Uruguay, Italy, Russia or Mexico. Even Japan, who everybody knows as a sure loser in the first round, scores better than Russia, Mexico and Costa Rica (who seem to have a good team as of late). I'm guessing that there is some sort of confounding effect: some of the processed variables may be irrelevant or not weigh at all as much as you think they do.

Then again Spain was quickly eliminated after performing much worse than expected, so statistics can only predict so much in the end. Anyhow, I guess that Brazil, [...] France, England, Uruguay or Costa Rica is a good bet for the final. Teams from outside Latin America or Europe are out almost by default, barring the odd African surprise.

Not that I follow football (boring) but near everybody around does, so in the end you get some info that may be missing in the algorithm.


Posted by Maju June 22, 2014 at 4:43 pm

Soccer world cups happen once every 4 years, they are quick and not very thorough, and national teams have a high turnover rate, so there is very little reason to build statistics on the history of "Italy" or "Spain", as all teams are way too different from one tournament to the next.

The fact that your predictions have been defied so consistently, in my view, lends more credibility to soccer as a proper sport, where psychological resilience, rehearsals and athletic condition (if we could measure them meaningfully) should be much better predictors for a match's outcome than historical analyses.

For instance, the emergence of Spain as a serial trophy winner over the last few years has been explained with their peculiar style of playing ("tiki taka"), which used to confuse opponents. Apparently, the surprise effect just faded over the years, they did not evolve fast enough, and now what once was confusing has become too predictable. This is very logical and straightforward, it has a predictably huge impact on results, but it is just noise for a purely historical analysis.

Posted by Andrea [illegible]ini June 23, 2014 at [illegible] pm

You said: "for this World Cup, the only home team is Brazil."

This is not true. There are lots (I'm talking crowds of 30-40,000) of ticket paying fans that travelled from Argentina, Chile, Colombia and Uruguay, not counting those that already live in Brazil. These teams will have the home advantage against anyone (except when playing against Brazil). Just watch any of their games.

Posted by Rodrigo June 24, 2014 at [illegible] pm

Any algorithm and any sort of data can not predict who will win the world cup.

According to your graph, the winning probability of Spain is 2nd among 32 teams.

Just see big favorites Spain already knocked out of tournament after playing 2 matches.
