R. A. Longhorn presentation at Taiwan Open Data Forum, Taipei, 9 July 2014
DESCRIPTION
Big Data Meets Open Data: Challenges and Issues presentation of Roger Longhorn, Operations & Communications Manager, GSDI Association, delivered at the Taiwan Open Data Forum, 9 July 2014 in Taipei
TRANSCRIPT
OPEN DATA meets BIG DATA: Issues and Challenges
Roger Longhorn
Operations & Communications Director, GSDI Association
Founder Member, IGS
2014 Open Data Forum, Taipei 2
The Presentation
What is Open Data?
Open Data Challenges
What is Big Data?
Big Data Key Challenges
Research needs
09 July 2014
What is Open Data?
Open Definition from the Open Knowledge Foundation:
• sets out principles that define “openness” in relation to data and content,
• precisely defines “open” in the terms “open data” and “open content”,
• ensures interoperability (shared access) between different collections of open material.
http://opendefinition.org/okd/ http://okfn.org/
What is Open Data?
“A piece of data or content is open if anyone is free to use, reuse, and redistribute it —
subject only, at most, to the requirement to attribute and/or share-alike.”
http://opendefinition.org/od/ http://okfn.org/
OKN’s Open Data Definition
The Open Knowledge Foundation’s definition covers:
• Access
• Redistribution
• Reuse
• Absence of Technological Restriction
• Attribution
• Integrity
• No Discrimination Against Persons or Groups
• Distribution of License
• License Must Not Be Specific to a Package
• License Must Not Restrict the Distribution of Other Works
http://opendefinition.org/od/
Open Data Census
Global Census Facts at 2014
Number of countries = 70
Number of datasets = 700
Number of open datasets = 84
Percentage open = 12%
From the Open Data Index: https://index.okfn.org/
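The percentage on this slide follows directly from the two dataset counts. A minimal sketch of the arithmetic in Python, using only the figures quoted above (the variable names are illustrative and not part of the original slides):

```python
# Figures as quoted on the slide (Open Data Index, 2014).
countries = 70
datasets = 700
open_datasets = 84

# Share of the surveyed datasets that qualify as open.
percentage_open = 100 * open_datasets / datasets

# Average number of datasets surveyed per country.
datasets_per_country = datasets / countries

print(f"Percentage open: {percentage_open:.0f}%")
print(f"Datasets surveyed per country: {datasets_per_country:.0f}")
```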
G8 Open Data Charter
Principle 1: Open Data by Default
Principle 2: Quality and Quantity
Principle 3: Usable by All
Principle 4: Releasing Data for Improved Governance
Principle 5: Releasing Data for Innovation
Open Data Challenges
1. What data should be made public?
2. How to make data publicly ‘open’?
3. How to efficiently implement and monitor Open Data policy?
4. How to judge the effectiveness of an Open Data policy?
What Data Should Be Public?
1. Economic drivers
• Recent studies reveal the value to economies of opening up public datasets for unrestricted use, including commercial use.
2. Principles for governance of society
• Reactive versus proactive release of government data?
• Privacy concerns
• Existing regulations
Making Data Publicly ‘Open’
1. Agreeing data (& service) standards
• … and introducing them.
2. Setting appropriate policies
• … and enacting them.
3. Promulgating regulations
• … and enforcing them.
Lessons learned from the EU’s PSI Re-use Directive(s) (2003 and 2013)
Monitoring Policy
1. Monitoring Open Data
• Should you monitor Open Data policy?
• Can you monitor Open Data policy?
2. Implementing policy
• Voluntary v. mandatory
• Regulations?
• Handling infringements
3. Technology
• For monitoring and reporting
Judging Effectiveness
1. How to judge the effectiveness of a government’s Open Data policy?
2. Defining ‘effectiveness’
• Benefits for government, society and businesses
• Cost-Benefit Analysis – feasible?
• Identifying tangible v. intangible benefits
3. What ‘indicators of success’ to use
• Some will be financial
• Many (intangibles) will be difficult to measure
What Is Big Data?
“2,500,000,000,000,000,000 bytes (2.5 x 10^18) of data are created every day!” (2012)
or
“8,000,000,000,000,000,000 bytes (7 exabytes) of new data were stored globally by enterprises in 2010”
Source: McKinsey Global Institute
Big Data is BIG!
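The two figures on this slide can be put on a common scale. The sketch below is an illustration only and not part of the original slides; it assumes that the 8,000,000,000,000,000,000 figure is a byte count, and it compares the decimal exabyte (10^18 bytes) with the binary exbibyte (2^60 bytes). Under the binary convention, 7 units come to roughly 8 x 10^18 bytes, which may be how the two numbers on the slide relate.

```python
EXABYTE = 10**18   # decimal (SI) exabyte, in bytes
EXBIBYTE = 2**60   # binary exbibyte, in bytes

# Figures as quoted on the slide.
created_per_day_2012 = 2_500_000_000_000_000_000  # bytes
stored_2010 = 8_000_000_000_000_000_000           # assumed here to be bytes


def to_units(n_bytes: int, unit: int) -> float:
    """Express a byte count as a multiple of the given unit."""
    return n_bytes / unit


print("Created per day (2012), in exabytes:", to_units(created_per_day_2012, EXABYTE))
print("Stored in 2010, in exabytes:", to_units(stored_2010, EXABYTE))
print("Stored in 2010, in exbibytes:", round(to_units(stored_2010, EXBIBYTE), 2))
print("Bytes in 7 exbibytes:", 7 * EXBIBYTE)
```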
The Big Data Landscape
The 5 Dimensions of Big Data
Value of Big Data
• In 2012, the world-wide Big Data market reached US$11.59 billion (exceeding previous forecasts).
• For 2013, a growth rate of over 60% was predicted, leading to a global Big Data market value of US$18.1 billion.
• For 2012-2017, a 31% Compound Annual Growth Rate (CAGR) was calculated.
• The global Big Data market is predicted to exceed US$47 billion by 2017.
Sources:
Jeff Kelly et al.: Big Data Vendor Revenue and Market Forecast 2012-2017 (2013)
International Data Corporation (IDC): Worldwide Big Data Technology and Services 2012-2016 Forecast (2012)
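The market figures on this slide are linked by the compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is an illustration using only the values quoted above; the function names are hypothetical, and because the quoted figures are rounded and drawn from more than one forecast, the computed values need not match the quoted percentages exactly.

```python
def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate between two values."""
    return (end / start) ** (1 / years) - 1


def project(start: float, rate: float, years: int) -> float:
    """Value after compounding `rate` annually for `years` years."""
    return start * (1 + rate) ** years


# Figures as quoted on the slide, in US$ billion.
market_2012 = 11.59
market_2017 = 47.0   # the slide says "exceed", so treat this as a lower bound
quoted_cagr = 0.31

# Rate implied by growing from the 2012 value to the 2017 value in 5 years.
implied_cagr = cagr(market_2012, market_2017, 5)

# 2017 value obtained by compounding the quoted 31% rate from 2012.
projected_2017 = project(market_2012, quoted_cagr, 5)

print(f"Implied CAGR 2012-2017: {implied_cagr:.1%}")
print(f"2017 market at 31% CAGR: US${projected_2017:.1f} billion")
```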
Value of Big Data
• Big Data is “the next frontier for innovation, competition and productivity”.
• The impact of Big Data provides huge potential for competition and growth for individual companies.
• The right use of Big Data can increase productivity, innovation, and competitiveness for entire sectors and economies.
Source: McKinsey Global Institute, Big Data: The next frontier for innovation, competition and productivity
Big Data Challenges
Three Big Data Challenges
• Data Challenges
• Process Challenges
• Management Challenges
Data Challenges
• Volume
• Variety
• Velocity
• Veracity
• Data discovery
• Quality and relevance
• Data comprehensiveness
• Personally identifiable information
• Data dogmatism
• Scalability
Process Challenges
• Capturing data
• Aligning data from different sources
• Transforming the data into a form suitable for analysis
• Modelling it, either mathematically or via simulation
• Understanding the output
– visualizing and sharing the results,
– how to display complex analytics on a mobile device.
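The process steps listed on this slide can be illustrated end to end on a toy scale. The sketch below is a hypothetical example using only the Python standard library: the two sources, their station ids and all values are invented for illustration, and a simple least-squares line stands in for the modelling step.

```python
# 1. Capture: two hypothetical sources, keyed by station id.
temperatures_f = {"A": 50.0, "B": 68.0, "C": 86.0, "D": 59.0}
energy_use_kwh = {"A": 30.0, "B": 20.0, "C": 10.0, "E": 40.0}

# 2. Align: keep only the stations present in both sources.
stations = sorted(temperatures_f.keys() & energy_use_kwh.keys())

# 3. Transform: convert Fahrenheit to Celsius so the units suit the analysis.
xs = [(temperatures_f[s] - 32.0) * 5.0 / 9.0 for s in stations]
ys = [energy_use_kwh[s] for s in stations]


# 4. Model: fit y = slope * x + intercept by ordinary least squares.
def fit_line(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = fit_line(xs, ys)

# 5. Understand the output: a one-line summary that is easy to share.
summary = (
    f"{len(stations)} stations aligned; "
    f"energy use changes by {slope:.1f} kWh per degree Celsius"
)
print(summary)
```

Stations that appear in only one source ("D" and "E" here) drop out at the alignment step, which is one small instance of the interoperability problem that arises when datasets from different custodians are combined.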
Management Challenges
• Skills development
• Data privacy
• Security & Governance
• Tracking how the data is used, transformed and derived
• Ethical issues
– ensuring that data is used correctly
– abiding by its intended uses and relevant laws
• Managing its lifecycle
Big Data meets Open Data
• Identifying the ‘right’ Big Data to provide as Open Data
– Why is this data needed?
– Who needs it?
– How can it be processed and used?
• Overcoming data access and connectivity challenges
– Especially relating to interoperability issues for multiple Big Data datasets (including those collected in real time)
– Especially if these are not all fully open or follow different Open Data policies
Big Data meets Open Data
• Making best use of Big Data
– Working across multiple functions (IT, engineering, finance, procurement)
– Overcoming the fragmented ownership of Big Data (custodianship, IPR, licensing, etc.)
• Resolving security concerns
– Data protection (for data owners)
– Privacy (for personal data)
– Potential misuse (which can raise liability issues)
The Research Agenda
• European Big Data Value Strategic Research & Innovation Agenda
– “The objective of the SRIA is to describe the main research challenges and needs for advancing Big Data Value in Europe in the next 5 to 10 years.”
• USA - Big Data Research Initiative
– “cross-agency plans and research efforts to extract knowledge and insights from large and complex collections of digital data.”
• NSF to direct efforts to
– develop new methods to derive knowledge from data;
– construct new infrastructure to manage, curate and serve data to communities; and
– forge new approaches for associated education and training.
Thank You!
Roger Longhorn
Operations & Communications Director, GSDI Association
Founder Member, IGS