© 2016 Semantic Visions. All rights reserved. www.semantic-visions.com
2016 U.S. Presidential Election Big Data Analysis | December 15, 2016
We undertook the following analysis with the aim of retesting the thesis that there exists a close correlation between public opinion and a critical amount of content in online media, which was the subject of our initial analysis.
Starting Points
This study leverages two starting points repeatedly proven by Semantic Visions. Judging from the Big Data, which is transformed into Smart Data in our semantic system, two principal factors influence the election result:
a) Frequency of mentions of the respective candidates (this essentially amounts to the extent of the media profile of the candidates). Candidates with a significantly lower media profile do not have a chance of success, whereas candidates with a significantly higher number of mentions in the media have far higher chances.
b) When the media profiles of candidates are relatively equal, Sentiment Balance is the decisive factor, with trends thereof playing an important role in the final weeks and days prior to the vote.
Types of Sources Analyzed
In addition to articles from established online media sources, Semantic Visions also collects and analyzes content of web pages which publish news reports focused on specific topics: politics, the economy, business, security and science, among others. Generally speaking, the authors of such articles and analyses publish facts, detailed information and answers to questions such as “who, what, when, where, why and how”.
Logically structured informative articles of this type contain an average of 3,100 characters. When processed by Semantic Visions' semantic analytical system, such articles provide much more informative content for analysis than simple tweets, which often lack logical structure and which have an average length of between 70 and 120 characters (source: MIT - Massachusetts Institute of Technology).
For Semantic Visions, online social networks are an indivisible part of cyberspace, but provide only a limited amount of information useful for the purposes of deeper analysis. We monitor online social networks including Facebook and Twitter more effectively by using the collective knowledge and intelligence of hundreds of thousands of editors and authors of articles who decide what is important and what is not.
However, in order to better understand the results of the US presidential election we conducted additional reverse analysis of Twitter, which produced some surprising results.
1. Input Data
Period of data collection and analysis: March 1, 2016 – November 7, 2016
Number of English-language documents analyzed: 116,291,957
Number of sources monitored: 277,604
2. Methodology
Throughout the analysis period Semantic Visions processed over 232 million documents in 11 languages.
This report focuses on English-language sources only, and all documents acquired were semantically processed in the Semantic Visions system. The output semantic metadata enabled us to conduct thorough analysis of the documents which were relevant for our purposes; in this case the subject being the US presidential election, which we detected with our predefined semantic concept.
The report also comprises quantitative analysis based upon the total of so-called “fragments” about the individual candidates. Sentences and phrases in close proximity to the person subject to analysis qualify as fragments. Several fragments can be found in a single document and therefore the quantity of fragments is more relevant than the quantity of documents.
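The idea of counting fragments rather than documents can be illustrated with a minimal Python sketch. It is not the Semantic Visions system: the alias table, the naive sentence splitter and the rule "one fragment per sentence that names a candidate" are all assumptions made for illustration, since the report does not publish how proximity to a person is determined.

```python
import re
from collections import Counter

# Hypothetical alias table (assumption): the report does not describe
# how mentions of a candidate are actually matched.
ALIASES = {
    "Trump": ("donald trump", "trump"),
    "Clinton": ("hillary clinton", "clinton"),
}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extract_fragments(text, aliases=ALIASES):
    """Return (candidate, sentence) pairs, one per sentence naming a candidate.

    Assumption: a "fragment" is approximated by a whole sentence that
    contains one of the candidate's aliases.
    """
    fragments = []
    for sentence in split_sentences(text):
        lowered = sentence.lower()
        for candidate, names in aliases.items():
            if any(re.search(r"\b" + re.escape(n) + r"\b", lowered) for n in names):
                fragments.append((candidate, sentence))
    return fragments


def count_fragments(documents):
    """Total fragments per candidate across all documents."""
    counts = Counter()
    for doc in documents:
        for candidate, _ in extract_fragments(doc):
            counts[candidate] += 1
    return counts


# Invented sample documents, for illustration only.
docs = [
    "Trump spoke in Ohio. Clinton answered a day later. Trump replied at once.",
    "The weather was mild. Voters queued early.",
]
print(count_fragments(docs))
```

In this toy input the first document yields three fragments and the second none, which shows why a fragment count can diverge from a document count.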
3. Analysis
The analysis period includes the primaries of the two main political parties in the US, the Democrats and Republicans. The primaries of both parties culminated in the conventions of both parties, which were held in July 2016 and at which the presidential candidates of both parties were nominated. The candidates for Vice President were also nominated at the conventions (it is traditional practice for the conventions to nominate the vice presidential candidates proposed by the nominated presidential candidate). The following part of the analysis was the pre-election campaign (including detailed analysis of the presidential debates) and Election Day.
The Primaries
The four-month marathon of the parties' primaries began on February 1, 2016 with the caucuses of both the Republican and Democratic parties in Iowa, which was the first real comparison of strengths. The primaries are conducted differently by each party and also have different procedures on the state level. The aim of the primaries is to select delegates who will vote for the party's presidential candidate at the respective party conventions in July. As such the aspiring delegates proclaim their support for their candidate of choice. In addition to the delegates, so-called superdelegates also vote for the presidential candidates at the conventions and the latter are free to vote for the candidate of their choice at their own discretion. Superdelegates include congressmen, governors and other party functionaries. Of the two main parties it is the superdelegates of the Democratic Party who have a greater influence in selecting their party's presidential candidate.
In this year's primaries, the Republicans had more candidates, though from the outset there were three clear favorites: Donald Trump, Ted Cruz and Marco Rubio. With the Democrats, there were two main contenders: Hillary Clinton and Bernie Sanders.
Already in the first week of the primaries in Iowa the real chances of the individual candidates became apparent and following the primaries in New Hampshire, the first candidates dropped out of the race and declared their support for a party colleague still in the race.
The next milestone in the primaries was so-called Super Tuesday, March 1, 2016, when 15 states selected their candidate of choice.
Graph 1 – Number of Fragments - Donald Trump, Ted Cruz and Marco Rubio (March 1, 2016 – July 21, 2016)
The Republicans Carly Fiorina and Jeb Bush had already dropped out in February and as a result of Super Tuesday, Ben Carson renounced his candidacy. On the basis of the documents analyzed, it is evident that media coverage of a candidate falls considerably after renouncing their candidacy. In terms of amount of media coverage, Donald Trump was the leader among Republicans for the whole period of the primaries.
Graph 2 – Sentiment Balance - Donald Trump, Ted Cruz and Marco Rubio (March 1, 2016 – July 21, 2016)
Despite the fact that the results of sentiment analysis show overall coverage of Donald Trump was negative, while especially Ted Cruz enjoyed more positive coverage, Cruz was ultimately unsuccessful and renounced his candidacy on May 4, 2016.
The battle for the presidential candidacy among the Democrats was limited to two candidates, Hillary Clinton and Bernie Sanders. The Democratic Party primaries were considerably closer than those of the Republicans and were not decided until the final stages when Sanders eventually dropped out on June 17, 2016.
Graph 3 – Number of Fragments - Hillary Clinton and Bernie Sanders (March 1, 2016 – July 21, 2016)
From this graph it is evident that Hillary Clinton received greater media coverage than Sanders, but for the primaries overall, this advantage was not as large as that of Donald Trump compared to his Republican rivals.
Graph 4 – Sentiment Balance - Hillary Clinton and Bernie Sanders (March 1, 2016 – July 21, 2016)
The primaries conclude with the party conventions where the parties' presidential candidates are elected.
Republican Convention
July 18 – July 21, 2016 (Cleveland, Ohio)
Candidate for the Presidency of the USA – Donald John Trump
Candidate for Vice President – Mike Pence
Democratic Convention
July 25 – July 28, 2016 (Philadelphia, Pennsylvania)
Candidate for the Presidency of the USA – Hillary Diane Rodham Clinton
Candidate for Vice President – Timothy Michael Kaine
Other Candidates
Other candidates campaigned for the US presidency but with little chance of success:
Gary Johnson – Libertarian Party
Jill Stein – Green Party
Darrell Castle – Constitution Party
Evan McMullin – Independent
This analysis focuses on the candidates of the two main political parties in the USA - Donald Trump and Hillary Clinton.
Hillary Clinton vs. Donald Trump
From the analysis we can observe that quantity of media coverage was a decisive factor. From the beginning of March through to Election Day, Donald Trump's media presence was significantly higher than that of Hillary Clinton. And in the preceding primaries this factor proved decisive.
This trend can also be observed during the campaign proper following the national party conventions. In terms of quantity of media coverage, Hillary Clinton trailed Donald Trump for the entire campaign except for the final days, when both candidates received roughly the same amount of coverage.
For the entire period analyzed from March 1, 2016 to November 7, 2016, Donald Trump received almost twice as much media coverage as Hillary Clinton.
Graph 5 - Clinton vs. Trump – Total Amount of Fragments (March 1, 2016 – November 7, 2016)
The following graph displays the development of media coverage of the two main candidates over the entire analysis period.
Graph 6 - Clinton vs. Trump – Media Coverage According to Number of Fragments for Individual Months and Total (March 1, 2016 – November 7, 2016)
The following graphs illustrate the development of positive and negative sentiment and the resulting Sentiment Balance (percentage difference between positive and negative sentiment).
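One possible reading of Sentiment Balance is sketched below in Python. The report defines it only as the percentage difference between positive and negative sentiment, so the choice of denominator (all fragments, including neutral ones) and the function name `sentiment_balance` are assumptions rather than the published formula.

```python
def sentiment_balance(positive, negative, neutral=0):
    """Percentage-point difference between positive and negative shares.

    Assumption: each share is taken relative to all fragments in the
    period (positive + negative + neutral). The report does not state
    the denominator it uses.
    """
    total = positive + negative + neutral
    if total == 0:
        return 0.0  # no fragments in the period, so no measurable balance
    return 100.0 * (positive - negative) / total


# 30 positive, 45 negative and 25 neutral fragments give 30% - 45% = -15 points.
print(sentiment_balance(30, 45, 25))  # -15.0
```

Under this reading a negative value means negative fragments outnumber positive ones, which matches how the graphs speak of sentiment falling into negative figures.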
Graph 7 - Clinton vs. Trump – Negative Sentiment (March 1, 2016 – November 7, 2016)
During the national party conventions both main parties experienced a large growth in positive sentiment, but a week after the conventions ended, the sentiment returned to previous levels. A similar scenario is identifiable during the presidential debates but in this case there was a growth in negative sentiment.
As for the “positive sentiment peaks” for Donald Trump, we can identify the period around May 4, 2016 when he was first named as the leading Republican candidate. The case was similar for Hillary Clinton around June 8, 2016, when she was tipped as the victor of the Democratic Party primaries. Both candidates were officially nominated as their parties' candidates at their respective conventions in July.
Graph 8 - Clinton vs. Trump – Positive Sentiment (March 1, 2016 – November 7, 2016)
For the entire analysis period Donald Trump received more mentions, but with a few exceptions he received more negative sentiment than Hillary Clinton. Here we can observe an objective reflection of the fraught nature of the election campaign in which the supporters of both candidates were extremely critical of the opposition.
Graph 9 - Clinton vs. Trump – Sentiment Balance (March 1, 2016 – November 7, 2016)
Sentiment analysis returned positive values for both Hillary Clinton and Donald Trump during the Democratic Party and Republican Party conventions respectively, and also for the latter at the beginning of September when studies first emerged about his potential victory. At that point various polls and studies indicated that pre-election preferences had evened out.
Development of the Pre-Election Campaign
The three debates between the main candidates and the one between the two vice-presidential candidates are integral elements of the US election campaign. Candidates polling over 15% participate in the debates, though in this year's campaign only Donald Trump and Hillary Clinton passed this threshold.
First Presidential Debate
September 26, 2016 - Hofstra University, Hempstead, New York
The debate was hosted by Lester Holt and the candidates responded to questions concerning national security, the future course of the USA, and the prosperity of the USA. According to the mainstream media, Hillary Clinton won this debate. The results of our analysis including “long-tail” web news yielded a similar result, corresponding with the predominant opinion of mainstream media analysts and commentators.
Graph 10 - Clinton vs. Trump – 1st Debate – Sentiment Balance by Hour
The first debate pushed sentiment for both candidates into negative figures. Twenty-four hours later, however, the impact of the debate subsided and sentiment returned to pre-debate values, albeit with a slight rise in positive sentiment for Hillary Clinton and a slightly more negative sentiment for Donald Trump. In this regard Hillary Clinton can be considered the winner of the first debate.
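The hourly view and the "return to pre-debate values" can be sketched in Python as follows. This is a minimal illustration under stated assumptions: fragments arrive as timestamped sentiment labels, the baseline is the mean hourly balance before the debate, and "returned" means coming back within a tolerance of that baseline. The function names, the label values and the tolerance of 2 points are hypothetical and do not come from the report.

```python
from collections import defaultdict
from datetime import datetime


def hourly_balance(fragments):
    """Bucket (timestamp, label) pairs by hour and compute a balance per hour.

    label must be 'pos', 'neg' or 'neu' (assumed labels). The balance is the
    positive share minus the negative share, in percentage points.
    """
    buckets = defaultdict(lambda: {"pos": 0, "neg": 0, "neu": 0})
    for ts, label in fragments:
        hour = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour][label] += 1
    series = {}
    for hour, c in sorted(buckets.items()):
        total = c["pos"] + c["neg"] + c["neu"]  # never zero for an existing bucket
        series[hour] = 100.0 * (c["pos"] - c["neg"]) / total
    return series


def hours_to_recover(series, debate_start, tolerance=2.0):
    """Hours after debate_start until the balance is back near the baseline.

    Returns None if there is no pre-debate data, no dip, or no recovery.
    """
    before = [v for h, v in series.items() if h < debate_start]
    if not before:
        return None
    baseline = sum(before) / len(before)
    dipped = False
    for hour, value in sorted(series.items()):
        if hour < debate_start:
            continue
        if abs(value - baseline) > tolerance:
            dipped = True
        elif dipped:
            return int((hour - debate_start).total_seconds() // 3600)
    return None


# Invented sample data, for illustration only.
start = datetime(2016, 9, 26, 21)
sample = [
    (datetime(2016, 9, 26, 19, 10), "pos"), (datetime(2016, 9, 26, 19, 40), "neg"),
    (datetime(2016, 9, 26, 20, 5), "pos"), (datetime(2016, 9, 26, 20, 30), "neg"),
    (datetime(2016, 9, 26, 21, 15), "neg"), (datetime(2016, 9, 26, 21, 20), "neg"),
    (datetime(2016, 9, 26, 22, 10), "neg"), (datetime(2016, 9, 26, 22, 50), "pos"),
    (datetime(2016, 9, 26, 22, 55), "neg"),
    (datetime(2016, 9, 26, 23, 5), "pos"), (datetime(2016, 9, 26, 23, 30), "neg"),
]
series = hourly_balance(sample)
print(hours_to_recover(series, start))
```

The same bucketing works at minute resolution by truncating timestamps to the minute instead of the hour.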
Taking a closer look we can analyze how each issue discussed influenced sentiment during the course of the debate.
Graph 11 - Clinton vs. Trump – 1st Debate – Sentiment Balance by Minute
Taking as an example the subject of national security and related cyber-security, which was raised in the 62nd minute of the debate, we can see from the graph that the subject caused a growth in negative sentiment towards both candidates.
Second Presidential Debate
October 9, 2016 – Washington University, St. Louis, Missouri
The second debate, hosted by Martha Raddatz and Anderson Cooper, was highly fraught, with Donald Trump being forced to face criticism arising from the publication of recordings of his vulgar comments about women, while Hillary Clinton had to face accusations about her using a private email server for work purposes when she was Secretary of State. Other subjects included healthcare reform, taxation, national security and the threat of cyber-attacks. According to the media, Hillary Clinton was again the winner.
The second presidential debate again pushed both candidates' sentiment ratings into negative figures. We can deduce that this debate caused reactions before it began, which in turn indicates a level of tense anticipation greater than prior to the first debate. Sentiment returned more or less to pre-debate levels again after 24 hours.
Graph 12 - Clinton vs. Trump – 2nd Debate – Sentiment Balance by Hour
We can observe that in the second debate several subjects caused greater reactions. The sentiment “peak” around the 30th minute relates to the reopening of the issue of Hillary Clinton's private email server, which correlates to the growth in negative sentiment towards her.
Graph 13 - Clinton vs. Trump – 2nd Debate – Sentiment Balance by Minute
Third Presidential Debate
October 19, 2016 – University of Nevada, Paradise, Nevada
The debate was hosted by Chris Wallace. The main subject was immigration. Another key issue was Hillary Clinton's emails published shortly beforehand by Wikileaks, which she attempted to deflect by criticizing Vladimir Putin and Russia. Donald Trump, however, used this to criticize Hillary Clinton's foreign policy when she was Secretary of State.
A key element in this debate was the change in tone from Donald Trump, who avoided personal attacks on Hillary Clinton and instead emphasized that she had been in politics for 30 years already and thus had had plenty of time to implement her program. Donald Trump managed to coherently formulate his main message to voters: calling for reform in Washington, he presented himself as the force required to bring change to politics. Media polls, however, again indicated that Hillary Clinton was the winner of the debate.
Graph 14 - Clinton vs. Trump – 3rd Debate – Sentiment Balance by Hour
The development of the graph of resulting sentiment indicates that Donald Trump was portrayed more negatively than Hillary Clinton by the media both during the debate and in the immediate period thereafter. Nevertheless, we can deduce that in the space of one day Donald Trump's ratings returned to levels on a par with those of Hillary Clinton.
Graph 15 - Clinton vs. Trump – 3rd Debate – Sentiment Balance by Minute
In the third debate we can also identify subjects which had a greater influence upon sentiment. For example, we can show the segment after the 45th minute when host Chris Wallace reopened the issue of the recording of Trump's vulgar comments about women, which resulted in a growth in negative sentiment towards the latter. The discussion about the invasion of Iraq at around the 70th minute had a similarly large impact on sentiment in media reports.
Eve of Elections
In the final days following the third debate Hillary Clinton gained positive sentiment ratings.
However, following the announcement by the Director of the FBI of the reopening of the investigation into her use of a private email server for work purposes, her sentiment ratings again fell. Several days thereafter, growth in Clinton's positive sentiment ratings resumed and again reached a positive aggregate.
Graph 16 - Clinton vs. Trump – Sentiment Results (October 15, 2016 – November 7, 2016)
A rise in positive sentiment for Donald Trump can also be observed in the last 14 days of the campaign. Although he did not attain a positive aggregate in this period, the rising trend of sentiment towards him is clear.
Election Day
Both candidates began Election Day with close positive and negative sentiment ratings. A fundamental shift occurred shortly after 19:00 EDT when a very sharp growth in positive sentiment in the media for Donald Trump began.
Graph 17 - Clinton vs. Trump – Election Day – Positive Sentiment by Hour
Graph 18 - Clinton vs. Trump – Election Day – Sentiment Balance – by Hour
This had a major impact upon overall sentiment, which followed a similar pattern: while the development of negative sentiment was similar for both Hillary Clinton and Donald Trump, the growth in positive sentiment for Trump was crucial.
Graph 19 - Clinton vs. Trump – Sentiment Results (September 1, 2016 – November 9, 2016 UTC) - Including Post-Election
Social Media
In his campaign Donald Trump effectively declared war on traditional media. From the outset of the battle for the White House the media favored Hillary Clinton, and Trump reacted by making greater use of social media.
In our analysis of social media networks we focused on Twitter and individual tweets which the candidates posted from their accounts in the final four weeks prior to the election. In all, approximately 850 tweets were sent from each of the main candidates' accounts. These tweets were analyzed on the basis of frequency of words and phrases used.
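A frequency analysis of words and phrases of this kind could look roughly like the Python sketch below. It is an illustrative approximation, not the procedure used for the report: the tokenizer, the choice of two-word phrases and the sample strings (invented placeholders, not real tweets) are all assumptions.

```python
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase the tweet, drop URLs and keep runs of letters and apostrophes."""
    tweet = re.sub(r"https?://\S+", " ", tweet.lower())
    return re.findall(r"[a-z']+", tweet)


def ngram_counts(tweets, n=2):
    """Count every run of n consecutive tokens across all tweets.

    Note: runs may span sentence boundaries inside a tweet, which is a
    simplification of this sketch.
    """
    counts = Counter()
    for tweet in tweets:
        tokens = tokenize(tweet)
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Invented placeholder strings, not actual tweets by either candidate.
sample_tweets = [
    "Thank you Ohio! Join me tomorrow.",
    "Join me in Florida. Thank you!",
    "Thank you Nevada https://example.com",
]
phrase_counts = ngram_counts(sample_tweets, n=2)
print(phrase_counts.most_common(2))
```

Setting `n=1` gives single-word frequencies, while `n=3` or higher would be needed to surface longer slogans.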
Significant words and phrases most used by the candidates:
Graph 20 - Clinton vs. Trump – Frequency of Words and Phrases in Tweets in Last 4 Weeks of Campaign
The statistics for the most-used words and phrases show that Donald Trump's campaign opted for positive messages such as “Join me”, “Thank you” and “Make America Great Again”, unlike Hillary Clinton, whose core message was “Don't vote for Donald Trump”.
We consider that both candidates used Twitter as their primary tool for communicating their messages to voters directly, without passing through the traditional media. Passing through traditional media naturally leads to a degree of distortion and, to a significant extent, to the imposition of the opinions of journalists or the leanings of a given media outlet.
4. Conclusions
On the basis of the graphs and information presented above we are able to draw the following conclusions:
- When monitoring a large quantity of online news sources, in aggregate they publish articles almost as quickly as Twitter conversations develop.
- To effectively analyze the presidential debates, which typically play a major role in the election campaign, it is important to monitor sentiment not only during the course of the debates, but also the “reverberations” lasting a number of hours thereafter. This is because the debates take place in evening hours, and detailed analysis and typically more detailed reports are not published before the following morning.
- Our results based purely on Big Data from the presidential debates correlate with the conclusions of analysts and commentators within the mainstream media.
- Contrary to the widely reported conclusions of analysts, Hillary Clinton appeared as more inconsistent and divisive than Donald Trump.
- Following the debates, the sentiment balance for both candidates soon returned to its pre-debate values; this indicates that in this year's election the debates did not have any fundamental influence upon the final result.
- From the examples chosen it transpires that scandals such as the reopening of the FBI investigation into Hillary Clinton's private email server and the publication of the sexist recording of Donald Trump only have a short-term influence on sentiment towards both candidates. While these scandals resulted in a growth in negative sentiment, the effect was short-term and the sentiment soon returned to prior levels.
- Under Semantic Visions' methodology (whereby, in the case of a very large quantity of similar mentions, the resulting sentiment and the trend thereof in the final stage of the campaign is the decisive factor), our analytical data, which takes the USA as a single entity (as opposed to a model based on the results in individual states), indicated that Hillary Clinton would win by a slim margin. And indeed she did win the popular vote by over 2.5 million votes, although she lost the battle for the White House due to the system based on the Electoral College.
- Further analysis of Twitter activity by Donald Trump and Hillary Clinton shows the fundamental difference in style and content of the two candidates; in our opinion this difference greatly helped Trump to win the election. While Hillary Clinton's tweets were for the most part aimed against Donald Trump (her tweets were essentially negative), by contrast Donald Trump's tweets were more positive and his core message was “Join me and make America great again”.
- We believe that the voters generally prefer the bearer of a positive message, and that this was also the reason why Donald Trump triumphed in the U.S. Presidential Election.