Evolution of Big Data in USA YANG, Haiqin 2013-04-22



Evolution of Big Data in USA

YANG, Haiqin
2013-04-22


Outline

• Birth: 1880 US census
• Adolescence: Big Science
• Modern Era: Big Business
• Future Landscape
• Conclusion


The First Big Data Challenge

• 1880 census
• 50 million people
• Age, gender (sex), occupation, education level, number of insane people in household


The First Big Data Solution

• Hollerith Tabulating System

• Punched cards – 80 variables

• Used for 1890 census
• Tabulation took 6 weeks instead of the 7+ years needed for the 1880 census
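The tabulating idea above can be sketched in a few lines. This is only an illustration, not a model of the Hollerith machine: the card fields and values below are hypothetical, and each "card" is assumed to hold one person's answers. Counting is done in a single pass per field, which is roughly what mechanical tabulation replaced hand tallying with.

```python
from collections import Counter

# Hypothetical census cards: one card per person, one entry per recorded variable.
cards = [
    {"sex": "F", "occupation": "farmer"},
    {"sex": "M", "occupation": "farmer"},
    {"sex": "F", "occupation": "teacher"},
    {"sex": "M", "occupation": "farmer"},
]


def tabulate(cards, field):
    """Count how many cards carry each value of one field, in a single pass."""
    return Counter(card[field] for card in cards)


by_occupation = tabulate(cards, "occupation")
by_sex = tabulate(cards, "sex")

print(by_occupation["farmer"])  # 3
print(by_sex["F"])              # 2
```

Adding another variable to a card only adds another key; the counting pass itself does not change.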


Manhattan Project (1942 - 1946)

• $2 billion (approx. $26 billion in 2013 dollars)

• Catalyst for “Big Science”


Space Program (1960s)

• Began in late 1950s

• An active area for Big Data nowadays


Adolescence: Big Science


Big Science

• The International Geophysical Year
  – An international scientific project
  – Lasted from Jul. 1, 1957 to Dec. 31, 1958
• A synoptic collection of observational data on a global scale
• Implications
  – Big budgets, Big staffs, Big machines, Big laboratories


Summary of Big Science

• Laid the foundation for ambitious projects
  – International Biological Program
  – Long Term Ecological Research Network
• The International Biological Program ended in 1974
• Many participants viewed it as a failure
• Nevertheless, it was a success
  – Transformed the way data are processed
  – Realized original incentives
  – Provided a renewed legitimacy for synoptic data collection


Lessons from Big Science

• Spawned new Big Data projects
  – Weather prediction
  – Physics research (supercollider data analytics)
  – Astronomy images (planet detection)
  – Medical research (drug interaction)
  – …

• Businesses latched onto its techniques, methodologies, and objectives


Modern Era: Big Business


Big Science vs. Big Business

• Common
  – Need technologies to work with data
  – Use algorithms to mine data
• Big Science
  – Source: experiments and research conducted in controlled environments
  – Goals: to answer questions, or prove theories
• Big Business
  – Source: transactional in nature, with little control
  – Goals: to discover new opportunities, measure efficiencies, uncover relationships


Current Status

• IDC reports
  – 2.7 billion terabytes in 2012, up 48 percent from 2011
  – 8 billion terabytes in 2015
• Sources
  – Structured corporate databases
  – Unstructured data from webpages, blogs, social networking messages, …
  – Countless digital sensors
• Business sectors
  – Retailers: Walmart, Kohl's
  – Logistics companies: UPS
  – Telecommunication: AT&T, T-Mobile
  – …
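The IDC figures quoted above imply a few numbers that are easy to check. The following is a rough sketch, assuming decimal (SI) units, so that one billion terabytes equals one zettabyte, and assuming steady compound growth between 2012 and 2015. The variable names are illustrative only.

```python
TB_PER_ZB = 1e9  # 1 ZB = 10^21 bytes = one billion TB (10^12 bytes each)

vol_2012_tb = 2.7e9          # 2.7 billion terabytes in 2012 (from the slide)
growth_2011_to_2012 = 0.48   # "up 48 percent from 2011"
vol_2015_tb = 8e9            # 8 billion terabytes projected for 2015

# Volume in 2011 implied by the 48 percent increase.
vol_2011_tb = vol_2012_tb / (1 + growth_2011_to_2012)

# Compound annual growth rate implied by the 2012 and 2015 figures.
years = 2015 - 2012
implied_cagr = (vol_2015_tb / vol_2012_tb) ** (1 / years) - 1

print(f"2011: {vol_2011_tb / TB_PER_ZB:.2f} ZB")
print(f"2012: {vol_2012_tb / TB_PER_ZB:.2f} ZB")
print(f"Implied annual growth 2012-2015: {implied_cagr:.1%}")
```

Under these assumptions the 2011 volume comes out at roughly 1.8 zettabytes, and the projection corresponds to growth of about 44 percent per year, slightly below the 48 percent reported for 2011 to 2012.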


Understanding of Big Data (1)

• An avalanche of available data, increasing exponentially

• Google CEO Eric Schmidt said: “Every two days we create as much information as we did from the dawn of civilization up until 2003. That’s something like five exabytes of data.”

• Farnam Jahanian kicked off a May 1, 2012 briefing, calling data “a transformative new currency for science, engineering, education, and commerce.”


Understanding of Big Data (2)

• Farnam Jahanian (NSF): “Big Data is characterized not only by the enormous volume of data but also by the diversity and heterogeneity of the data and the velocity of its generation.”

• Nuala O’Connor Kelly (GE): “it’s the volume and velocity and variety of data… to achieve new results for …”

• Nick Combs (EMC): “It’s needle in a haystack or connecting the dots.”

• Arvind Krishna (IBM) added the fourth V:
  – Veracity: data in doubt
  – Describes 'contradictory data,' or noisy data
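As a small illustration of what "contradictory data" can mean in practice, the sketch below flags keys that were reported with more than one distinct value. The records, field meanings, and the function name `find_contradictions` are hypothetical, and real veracity checks would also need to handle noise, units, and timestamps.

```python
from collections import defaultdict

# Hypothetical records from different sources: (customer_id, reported_age).
records = [("c1", 34), ("c2", 51), ("c1", 34), ("c2", 15), ("c3", 27)]


def find_contradictions(records):
    """Return keys that appear with more than one distinct value."""
    values = defaultdict(set)
    for key, value in records:
        values[key].add(value)
    return {key: sorted(vals) for key, vals in values.items() if len(vals) > 1}


contradictions = find_contradictions(records)
print(contradictions)  # {'c2': [15, 51]}
```

Exact duplicates such as the two `c1` records are not flagged, since they agree with each other.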


Implications

• Big Science ?
  – Big budgets, Big staffs, Big machines, Big laboratories
• Farnam Jahanian (NSF)
  – To drive the creation of new IT products and services
  – To accelerate the pace of discovery in almost every science and engineering (SE) discipline
  – To solve the nation’s most pressing challenges
• Response: $200 million Big Data R&D initiative in 2012
  – Advances in foundational techniques and technologies
  – Cyberinfrastructure to manage, curate, and serve data to SE research and education communities
  – New approaches to education and workforce development
  – Nurturance of new types of collaborations


Future Landscape


Data Bases’ View

DB space 2000 - 2010

• Before
  – OLTP / operational
  – BI / reporting
• After
  – Scalable nonrelational (“nosql”)
  – OLTP / operational
  – BI / reporting


Big Medicine

• Information
  – Related people: Patients, service providers, nurses, physicians, hospital administrators, government, insurance agencies
  – A mixture of structured and unstructured data
• Technologies
  – Dashboard technologies and analytics, business intelligence, clinical intelligence, revenue cycle management intelligence
• Other factors
  – Decision support, ease of information accessibility, quality of care, physician-patient relationship


Changes in Algorithms

• Efficiency vs. Effectiveness
• Flexible learning algorithms to remove bias
• Big Data is at an evolutionary juncture to improve/replace human judgment
• Businesses are seeing the value, but are thwarted by the cost of storage, slower processing speeds, and the flood of the data themselves.


Big Data at NASA

• NASA Open Government Plan ver. 2
  – Managing and processing
  – Storage
  – Archiving and Distribution
  – Analysis
  – Visualization
  – Commercial cloud computing services

• Strategy: push from top down and bottom up


Conclusions

• The first challenge
• The first solution
• What is the adolescent age?
• What is the modern era?
• What are the characteristics?
• What is the future landscape?
• What does NASA do?

(Diagram labels: Census, Big Science, Big Business)


References

• Frank J. Ohlhorst, Big Data Analytics: Turning Big Data into Big Money, Wiley, 2012.
• 1880 census: http://www.1880census.com/
• Herman Hollerith: http://en.wikipedia.org/wiki/Herman_Hollerith
• Manhattan Project: http://en.wikipedia.org/wiki/Manhattan_Project
• Space exploration: http://en.wikipedia.org/wiki/Space_exploration
• Big Science: http://en.wikipedia.org/wiki/Big_Science
• IBM Research: http://ibmresearchalmaden.blogspot.hk/2011/09/ibm-research-almaden-centennial.html
• NASA: http://open.nasa.gov/blog/2012/10/04/what-is-nasa-doing-with-big-data-today/