Microsimulation, Big Data and Predictive Analytics · Dec 20, 2016 · Mark Birkin, Professor of Spatial...
TRANSCRIPT
Mark Birkin Professor of Spatial Analysis and Policy, School of Geography, University of Leeds Director, ESRC Consumer Data Research Centre Director, Leeds Institute for Data Analytics
Microsimulation, Big Data and Predictive Analytics
• National statistical authorities
• UK Government departments
• Research Institutes
• World-wide data archives
• Identifying suitable data
• Negotiating access
• Identifying a safe data access setting
Administrative Data Research Network
Phase 1 Phase 2 Phase 3
Social Media Data & Third Sector Data
Further announcements soon
ESRC Big Data Network
Multilateral data sharing
Case studies & academic publications
Metadata & provenance
Business engagement & awareness raising
Providers
Partners
Prospects
Participants
CDRC Data Partners
CDRC Data Partners
Health Simulation
Spatial Microsimulation (2011)
Gender x Age x Ethnicity
Gender x Age x Illness
Gender x Age x NS-SEC
Gender x Age x Car ownership
ELSA Wave 5
Hybrid Microsimulation (2011 to 2031)
Adjusted Projections
Gender x Age x Ethnicity
ETHPOP 2011 to 2031 Projections
CHD; stroke; diabetes; cancer; respiratory illness; arthritis and depression
Hazard Model
2011 Census
ELSA Waves 1 to 6
Take account of future ethnic composition of the local authority population
Clark S., Birkin M., Heppenstall A. (2014) Subregional estimates of morbidities in the English elderly population, Health & Place.
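The slide lists census constraint tables (Gender x Age x ...) alongside a survey source (ELSA) but does not say how the two are combined. One common technique for this kind of spatial microsimulation is iterative proportional fitting (IPF), which reweights survey respondents until their weighted totals match the small-area constraints. The sketch below is a minimal illustration under that assumption, not the method used in the cited study. The function name `ipf_weights`, the four toy respondents and the target counts are all hypothetical.

```python
def ipf_weights(survey, constraints, iterations=50, tol=1e-9):
    """Reweight survey records so weighted totals match each constraint.

    survey: list of dicts, one per respondent.
    constraints: {attribute: {category: target count for the area}}.
    Assumes every respondent's category appears in the matching constraint.
    """
    weights = [1.0] * len(survey)
    for _ in range(iterations):
        max_change = 0.0
        for attr, targets in constraints.items():
            # Current weighted total in each category of this attribute.
            totals = {cat: 0.0 for cat in targets}
            for w, person in zip(weights, survey):
                totals[person[attr]] += w
            # Scale each weight so the category total hits its target.
            for i, person in enumerate(survey):
                cat = person[attr]
                if totals[cat] > 0:
                    factor = targets[cat] / totals[cat]
                    max_change = max(max_change, abs(factor - 1.0))
                    weights[i] *= factor
        if max_change < tol:
            break  # all constraints already satisfied
    return weights


# Hypothetical toy survey: four respondents with an illness flag.
survey = [
    {"gender": "F", "age": "65-74", "ill": True},
    {"gender": "F", "age": "75+", "ill": False},
    {"gender": "M", "age": "65-74", "ill": False},
    {"gender": "M", "age": "75+", "ill": True},
]
# Hypothetical census-style targets for one small area (100 people).
constraints = {
    "gender": {"F": 60, "M": 40},
    "age": {"65-74": 55, "75+": 45},
}

weights = ipf_weights(survey, constraints)
# Small-area morbidity estimate: weighted count of respondents who are ill.
estimated_ill = sum(w for w, p in zip(weights, survey) if p["ill"])
print(round(estimated_ill, 2))
```

The fitted weights say how many people in the area each respondent stands for, so any survey variable that the census does not record (here the illness flag) can be estimated locally as a weighted sum.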
Health (2)
Health (3)
Health (4): App Data
Weekly activity readings from the Bounts app (chart annotated with two "Clocks change" markers; key not preserved)
Transport: Cycling
Blue = commuter cycling potential
Green = travel to school cycling potential
Source: http://rpubs.com/RobinLovelace/245696
Transport: Trains
Transport: Trains
Journey to work
Transport: Journey Planning
Infrastructure
Infrastructure
Zuo et al (2014) Geospatial Information Science, 17, 3, 153-169.
Energy
Retailing/Consumption
Retailing/Consumption
Recap on Examples
“a new method of pushing forward the frontiers of knowledge, enabled by new technologies for gathering, manipulating, analyzing and displaying data”
The Fourth Paradigm…
Thousand years ago, science was empirical, describing natural phenomena
Last few hundred years, a theoretical branch, using models, generalizations
Last few decades, a computational branch, simulating complex phenomena
Today, data exploration (eScience), synthesizing theory, experiment and computation with advanced data management and statistics → new algorithms!
(Alex Szalay)
Predictive Analytics
• Data is growing, but what about your ability to make decisions based on those huge volumes of data?
• Success will depend on how quickly you can discover insights from all that data and use those insights to drive better actions across your entire organization
• That’s where predictive analytics, data mining, machine learning and decision management come into play.
– Predictive analytics helps assess what will happen in the future.
– Data mining looks for hidden patterns in data that can be used to predict future behaviour. Businesses, scientists and governments have used this approach for years to transform data into proactive insights.
– Decision management turns those insights into actions that are used in your operational processes.
– (sas.com)
A Note on Synthesis
• Two definitions of Synthetic:
– devised, arranged, or fabricated for special situations to imitate or replace usual realities
– made by combining different substances: not natural
• The second is relevant as well as the first!
Three problems for MSM
• Calibration
– and validation?
– especially in real-time
• Behaviour
– From demographic structure to activities and impacts
• Predictive analytics
– robustness and relevance
• Microsimulation needs Big Data!
Three problems for Big Data
• Representation bias
– Inferences for the population, not the sample
• Synthesis
– Context is always crucial
• Privacy, confidentiality, ownership and trust
– Conflict between granularity and disclosure
• Big Data needs Microsimulation!!
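The representation-bias point ("inferences for the population, not the sample") can be illustrated with post-stratification: records from a skewed source, such as an app whose users are mostly young, are reweighted so that each group counts in proportion to its known population size. This is a minimal sketch of one standard correction, not a method stated in the slides. The function names, age groups, activity values and population counts are all hypothetical.

```python
from collections import Counter


def poststratify(sample_groups, population_counts):
    """Return one weight per record: population count / sample count of its group."""
    sample_counts = Counter(sample_groups)
    return [population_counts[g] / sample_counts[g] for g in sample_groups]


def weighted_mean(values, weights):
    """Mean of values where each record counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical app sample: three younger users, one older user.
groups = ["16-34", "16-34", "16-34", "65+"]
activity = [10.0, 12.0, 11.0, 3.0]  # made-up weekly activity readings

# Hypothetical population in which both groups are the same size.
population = {"16-34": 500, "65+": 500}

weights = poststratify(groups, population)
raw_mean = sum(activity) / len(activity)
adjusted_mean = weighted_mean(activity, weights)
print(raw_mean, round(adjusted_mean, 2))
```

The raw mean reflects the app's user base, while the adjusted mean estimates the population value. The correction only works for groups that appear in the sample and whose population totals are known, which is where census-constrained synthetic populations can help.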
Conclusions
• Strong synergy between MSM and Big Data
• Dynamics over multiple scales increasingly important
• Consider synthetic populations as synthesising as well as synthesised