Big Data for Development: Opportunities and Challenges, Summary Slide Deck


Post on 13-Dec-2014







Summary points from UN Global Pulse White Paper "Big Data for Development: Opportunities & Challenges." See:


  • 1. Big Data for Development: Challenges and Opportunities. May 2012. United Nations Global Pulse. Download the full paper

2. Innovations in technology and greater affordability of digital devices have ushered in an Age of Big Data, an umbrella term for the explosion in the quantity and diversity of high-frequency digital data.
    • Big Data comes from everywhere: sensors used to gather climate information, social media, digital pictures and videos posted online, transaction records of online purchases, and cell phone GPS signals.
    • Real-time data can provide a snapshot of collective behavior changes at high frequency, at high degrees of granularity and from a wide range of angles, which was never before possible.
    • This data can be analyzed in real time, which ultimately creates the potential for more efficient and effective decision making.

3. It is time for the development community and policymakers around the world to recognize and seize this historic opportunity to address 21st-century challenges, including the effects of global volatility, climate change and demographic shifts, with 21st-century tools.

4. I. OPPORTUNITY
The world is experiencing a data revolution, or data deluge. The stock of digital data is expected to increase 44 times between 2007 and 2020, doubling every 20 months.

5.
    • Relevance: Even in the developing world there is an increasing number of mobile phones and a growing volume of social media and internet traffic.
    • Intent: With increasing global volatility, policymakers must respond more quickly and more efficiently to crises; knowing what is happening, in real time, can help policymakers respond to mitigate the effects of disasters.
    • Capacity: Big Data analytics can be used to discover trends in large digital data sets.
    • A Growing Body of Evidence: Social-science research has shown that Twitter, mobile phone data analysis, and Google Analytics can reveal issues and trends of concern to global development, such as disease outbreaks or mobility patterns.

6. 
A loose taxonomy of types of digital data sources that are relevant to global development:
    • Data Exhaust: Passively collected mobile phone data, transactional data, purchases, web searches, etc., and/or real-time data collected by UN agencies, NGOs and other aid organizations to monitor their projects (e.g. stock levels or school attendance). These digital services create networked sensors of human behavior.
    • Online Information: Web content such as news media and social media interactions (e.g. blogs, Twitter), news articles, obituaries, e-commerce, job postings. This approach considers web usage and content as a sensor of human intent, sentiments, perceptions and want.
    • Physical Sensors: Satellite or infrared imagery of changing landscapes, traffic patterns, light emissions, urban development and topographic changes, etc. This approach focuses on remote sensing of changes in human activity.
    • Citizen Reporting or Crowd-sourced Data: Information actively produced or submitted by citizens through mobile phone-based surveys, hotlines, user-generated maps, etc. While not passively produced, this is a key information source for verification and feedback.

7. Big Data analytics refers to tools and methodologies that aim to transform massive quantities of raw data into data about data for analytical purposes. Some kinds of new digital data sources, particularly transactional records such as the number of microfinance loan defaults, number of text messages sent, or mobile phone-based food vouchers activated, are as close as it gets to indisputable, hard data.

8. II. CHALLENGES

9.
    • Privacy: The most sensitive issue, with conceptual, legal and technological implications.
    • Access and Sharing: While some new data sources which may be relevant for development are accessible on the open web, there is still a great deal of data that is privately held by corporations and not available for analytical purposes.
    • Analysis & Interpretation: What type of data is being analyzed?
      Is the sample representative of the population? How do we avoid misunderstanding relationships within data (for example, correlation does not necessarily mean causation)?
    • Anomaly Detection: Characterizing (ab)normality in human ecosystems is very challenging. We must develop ways to characterize and detect socioeconomic anomalies in context.

10. Because privacy is a pillar of democracy, we must remain alert to the possibility that it might be compromised by the rise of new technologies, and put in place all necessary safeguards. Another challenge lies partly in the fact that a significant share of the new data sources in question reflects perceptions, intentions, and desires. Understanding the mechanisms by which people express perceptions, intentions and desires, as well as how they differ between regions or linguistic cultures and change over time, is hard.

11. III. APPLICATION
Changes in the number of Tweets mentioning the price of rice and actual food price inflation (official statistics) in Indonesia proved to be closely correlated in a recent collaborative research project between Global Pulse and social media analytics company Crimson Hexagon.

12. It is important to know your data. Big Data for Development are certainly not perfect data, but their value is tremendous if they are both properly understood and analyzed.
    • The promise of Big Data for Development is, and will be, best fulfilled when its limitations, biases and features are adequately understood and taken into account when interpreting data.
    • Properly analysed, Big Data offers the opportunity for an improved understanding of human behaviour that can support the field of global development through 1) early warning, 2) real-time awareness and 3) real-time feedback.

13. Contextualization is key:
1) Data context. Indicators should not be interpreted in isolation.
If one is concerned with anomaly detection, it is not so much the occurrence of one seemingly unusual fact or trend that should be concerning, but that of two, three or more.
2) Cultural context. Knowing what is normal in a country or regional context is a prerequisite for recognising anomalies. Cultural practices vary widely around the world and these differences certainly extend to the digital world. There is a deeply ethnographic dimension in using Big Data. Different populations use services in different ways, and have different norms about how they communicate publicly about their lives.

14. Download the full paper
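
The data-context point in slide 13 (concern is warranted when two or more indicators look unusual together, not just one) can be illustrated with a minimal sketch. This is not a method described in the paper: the z-score test, the threshold of 2 standard deviations, the function names and every number below are assumptions chosen for illustration. The indicator names only echo examples mentioned in the slides.

```python
from statistics import mean, stdev


def flag_unusual(history, latest, threshold=2.0):
    """Return the indicators whose latest value lies more than
    `threshold` sample standard deviations from their historical mean.
    (A simple z-score rule, assumed here purely for illustration.)"""
    flagged = []
    for name, past in history.items():
        sigma = stdev(past)
        if sigma == 0:
            continue  # a constant history gives no scale to compare against
        z = abs(latest[name] - mean(past)) / sigma
        if z > threshold:
            flagged.append(name)
    return flagged


def joint_alert(history, latest, min_indicators=2, threshold=2.0):
    """Raise an alert only when at least `min_indicators` indicators
    are unusual at the same time, rather than reacting to a single one."""
    flagged = flag_unusual(history, latest, threshold)
    return len(flagged) >= min_indicators, flagged


# Hypothetical toy values, not real data.
history = {
    "food_price_tweets": [100, 104, 98, 101, 97, 102],
    "loan_defaults": [10, 12, 11, 9, 10, 11],
    "school_attendance": [90, 91, 89, 90, 92, 90],
}
latest_single = {"food_price_tweets": 150, "loan_defaults": 11, "school_attendance": 90}
latest_joint = {"food_price_tweets": 150, "loan_defaults": 20, "school_attendance": 90}

print(joint_alert(history, latest_single))  # (False, ['food_price_tweets'])
print(joint_alert(history, latest_joint))   # (True, ['food_price_tweets', 'loan_defaults'])
```

In the first case only one indicator is out of range, so no alert is raised. In the second, two move together and the alert fires. What counts as "normal" history would in practice depend on the cultural context described in point 2 of slide 13.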