Introduction to Big Data for LABDUG

Introduction to Big Data, Daniel D. Gutierrez, Data Scientist, AMULET Analytics, March 2014

Post on 24-Jun-2015

DESCRIPTION

My slides for the 1st meeting of the LA Big Data User Group meetup event: "Introduction to Big Data"

TRANSCRIPT

Page 1: Introduction to Big Data for LABDUG

Introduction to Big Data

Daniel D. Gutierrez, Data Scientist

AMULET Analytics

March 2014

Page 2: Introduction to Big Data for LABDUG

Page 3: Introduction to Big Data for LABDUG

Not Everyone Likes the “Big Data” Hype

Page 4: Introduction to Big Data for LABDUG

Volume is a Big Reason for Big Data

Page 5: Introduction to Big Data for LABDUG

Page 6: Introduction to Big Data for LABDUG

The Economist, February 27, 2010: profiled “Big Data”

Page 7: Introduction to Big Data for LABDUG

Page 8: Introduction to Big Data for LABDUG

What is Big Data?

Big Data

– “large data sets so big that commonly-used software tools are unable to capture, curate, manage, and process the data within a tolerable elapsed time.”

Apache Hadoop

Hadoop Dominates Big Data market

– Used widely by some of the world's largest websites, such as Facebook, eBay, Amazon and Yahoo

– Moving into the enterprise

– Created by Doug Cutting and Mike Cafarella, with much of the early development done at Yahoo!

Page 9: Introduction to Big Data for LABDUG

Page 10: Introduction to Big Data for LABDUG

Characteristics of Big Data

Component Parts

Big Data is facilitated by Data Science

Data Science is facilitated by Machine Learning

Machine Learning is a confluence of disciplines: computer science, mathematical statistics, probability theory, visualization, etc.

What is the “New” Part of Big Data?

“Big” is new: there is more data to manage than ever before

Traditional data content is now coupled with internal and external sources of unstructured data via social media

New forms of analysis such as sentiment and credibility analysis

Bubble Brewing?

The Internet bubble burst circa 2000. Will it occur again?

A bubble may occur, but not because of Big Data

Page 11: Introduction to Big Data for LABDUG

Applications for Big Data

Smarter Healthcare

Multi-channel sales

Financial Services

Log Analysis

Homeland Security

Traffic Control

Telecom

Search Quality

Manufacturing

Trading Analytics

Fraud and Risk

Retail: Churn

“Big Data is the definitive source of competitive advantage across all industries. For those organizations that understand and embrace the new reality of Big Data, the possibilities for new innovation, improved agility, and increased profitability are nearly endless.”

Source: Wikibon 2012

Page 12: Introduction to Big Data for LABDUG

Page 13: Introduction to Big Data for LABDUG

© 2014 AMULET Analytics. All rights reserved.

Page 14: Introduction to Big Data for LABDUG

Thank you!

Follow me: @AMULETAnalytics

Contact me: [email protected]

www.amuletanalytics.com