Big Data – Trends to Watch · media.govtech.net/.../2012/...to_big_data_netapp.pdf

Big Data – Trends to Watch, Bill Peterson, NetApp, September, 2012

TRANSCRIPT

Page 1: Big Data – Trends to Watch · media.govtech.net/.../2012/...to_Big_Data_NetApp.pdf · Despite the hype, Big Data is not new and is more than just analytics! (Many agencies and private

Big Data – Trends to Watch

Bill Peterson

NetApp

September, 2012

Page 2

Bill Peterson · @thebillp · bill@thebillp

Page 3

What I hope to accomplish today...

Page 4

...and avoid this.

Page 5

What is “Big Data”?

“Big Data” refers to datasets whose volume, speed and complexity are beyond the ability of typical tools to capture, store, manage and analyze.

Coined by Francis Diebold, professor of economics at the University of PA, in 2000, when “Big” meant Gigabytes / day¹

[Diagram: Volume, Speed, Complexity]

Page 6

Quantifying The Big Data Challenge

Estimated size of the digital universe in 2020: 60 Zettabytes

5 Billion smart phones

30 Billion pieces of new content to Facebook per month

Data types: Sensors, Video, Music, Location, Weblogs

Growth Over the Next Decade:

– Servers (Phys/VM): 10x
– Data/Information: 50x
– #Files: 75x
– IT Professionals: <1.5x

Source: Gantz, John and Reinsel, David, “Extracting Value from Chaos”, IDC IVIEW, June 2011, page 4.
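As a rough illustration of the growth figures on this slide, the Python sketch below divides each growth multiplier by the projected growth in IT staff. The names `growth` and `per_person_growth` are hypothetical, and the staff figure is treated as exactly 1.5x even though the slide gives it only as an upper bound ("<1.5x"), so the results are lower bounds.

```python
# Growth multipliers over the next decade, as quoted on the slide (IDC, 2011).
growth = {
    "servers": 10.0,
    "data": 50.0,
    "files": 75.0,
    # Assumption: the slide says "<1.5x"; 1.5 is used here as an upper bound.
    "it_professionals": 1.5,
}


def per_person_growth(metric: str) -> float:
    """Minimum growth in `metric` handled per IT professional."""
    return growth[metric] / growth["it_professionals"]


for metric in ("servers", "data", "files"):
    factor = per_person_growth(metric)
    print(f"{metric}: at least {factor:.1f}x more per IT professional")
```

On these numbers, each IT professional would be responsible for at least about 33 times more data than today, which is the gap the slide is pointing at.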

Page 7

The Big Data Push

[Figure: workloads plotted by Data Structure (High to Low) against Performance, from Small Block, Random I/O (100s KIOPS) to Large Block, Sequential I/O (100s GB/sec). Workloads shown: Tier-1 BP OLTP, Tier-2 BP OLTP, DSS/DW, Collaboration, App Dev, Web Infra, NoSQL, Columnar DBs, IT Infra, Content Repositories, Sat Ground Stations, FMV, DVS, HPC, Tech Comp, Home Dirs]

Page 8

What Does This Mean to You?

[Chart: Business or Mission Velocity over time, 2010 to 2020. At the Inflection Point the curve splits: Information Becomes a Propellant to the Organization, or Data Becomes a Burden to IT Infrastructure]

You are also at an Inflection Point: You also have a decision to make, as “business as usual” may not cut it!

Page 9

Dispelling the Misconceptions About Big Data

Page 10

Big Data Is NOT New

30 PB of New Data Annually

Page 11

Big Data = Big Analytics = Hadoop?

– That’s What The Media Hype Implies, but it is NOT true!
– Traditional analytics (BI/DSS/DW) dominates the analytics market
– Like other technologies vying to gain broad adoption in Enterprise IT (e.g., Traditional Analytics, HPC & Cloud), it shows promise

Market sizes:

– Hadoop: $77 M
– Analytics: $35 B
– HPC: $29 B
– Cloud: $23 B
– BPaaS: $87 B
– Enterprise IT: $3.6T
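To put the market sizes on this slide in proportion, the following Python sketch computes one market as a percentage of another. The dictionary `market` and the helper `share` are hypothetical names; the dollar figures are the ones quoted on the slide.

```python
# Market sizes as quoted on the slide, in US dollars.
market = {
    "Hadoop": 77e6,
    "Analytics": 35e9,
    "Enterprise IT": 3.6e12,
}


def share(part: str, whole: str) -> float:
    """Return market[part] as a percentage of market[whole]."""
    return 100.0 * market[part] / market[whole]


print(f"Hadoop vs. Analytics: {share('Hadoop', 'Analytics'):.2f}%")
print(f"Analytics vs. Enterprise IT: {share('Analytics', 'Enterprise IT'):.2f}%")
```

On these figures Hadoop is a fraction of one percent of the analytics market, which supports the slide's point that Hadoop is not the same thing as Big Analytics.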

Page 12

Page 13

Why Are Decision Support Systems Important?

[Diagram: A Business, with Customers, Management, and Products & Services; OLTP, DW, $$$$]

DSS enables businesses to run “Closed Loop”, ultimately improving their business through the use of feedback mechanisms.

Page 14

Big Analytics – An Emerging Market

– Cloud & Cyber
– Open Source Distributors
– Integration Services
– Legacy DBs
– NoSQL / Column DBs
– Middleware & Apps
– Compute, Storage, Network

Page 15

Analytics & Enterprise Apps Environment

[Diagram: layered environment. Labels: Reporting/Dashboard/Visualization; Applications; Analytics; OLAP; ETL; OLTP; Data Management; Storage File Systems; Storage Data Management; Shared Storage Infrastructure; Other Storage (All other storage, i.e. internal DAS); Management. Data Sources: Mobile Devices, Sensors, Logs, Location/GPS, Content Repositories]

Page 16

What Does Hadoop Look Like Today?

Runs on a collection of cheap, commodity servers, in a distributed, shared nothing architecture

Two key components:

– HDFS: Hadoop Distributed File System
– MapReduce: Programming model for processing and generating large datasets

[Diagram: HDFS with NameNode and Secondary NameNode; MapReduce with JobTracker; DataNodes / TaskTrackers]
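The MapReduce programming model named on this slide can be illustrated with a small word count. This is a single-process Python sketch of the map, shuffle and reduce steps only; it does not use Hadoop or any Hadoop API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by their key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, shuffle and reduce in sequence over the input lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["big data is not new", "big analytics is not hadoop"]))
```

In a real cluster the map and reduce calls would run on many nodes in parallel, each working on the blocks of data it holds; the sketch keeps everything in one process to show only the shape of the model.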

Page 17

Ethernet’s Relentless March

Data will be growing by 50x, but bandwidth only by 10x!

[Chart: bandwidth in MB/Sec (log scale, 0.1 to 100000) over Time, 1990 to 2016, for SCSI/FCP, Infiniband, ATM, FDDI and Ethernet; iSCSI, FCIP, FCoE, iWarp and pNFS are marked]

Page 18

Why Should You Care?

It’s the Value of your data

Top line revenue
– Leverage their data assets into business advantage

5 Billion Records
Anywhere, Anytime
Faster time to market
50% Increase in Revenue

Bottom line savings
– Lower the cost of compliance
– Manage ever growing data efficiently

Over 1PB of data
Growth of 175% YOY
90 days of data within 24 hours of a failure

Page 19

AutoSupport: Hadoop Use Case at NetApp

“Call-home” service for all NetApp® systems
Foundation of NetApp proactive support strategies
Machine-generated data doubles every 16 months

CHALLENGE: 4 weeks to run a query on 24 billion unstructured records
BENEFITS: Time reduced from 4 weeks to 10.5 hours

CHALLENGE: Impossible to run a query on 240 billion unstructured records
BENEFITS: Previously impossible, now achievable in just 18 hours

NETAPP SOLUTION: 10-node Hadoop Cluster w/ shared Storage

“NetApp ASUP is a mission-critical application”
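The AutoSupport figures on this slide can be turned into a speedup and a throughput number. The Python sketch below is a back-of-the-envelope calculation only; the helper names are hypothetical, and it assumes "4 weeks" means four full 7-day weeks of wall-clock time.

```python
HOURS_PER_WEEK = 7 * 24  # assumption: "4 weeks" means continuous wall-clock time


def speedup(before_hours: float, after_hours: float) -> float:
    """How many times faster the query runs after the change."""
    return before_hours / after_hours


def records_per_hour(records: float, hours: float) -> float:
    """Average number of records processed per hour."""
    return records / hours


# 24 billion records: 4 weeks before, 10.5 hours on the Hadoop cluster.
print(f"Speedup: {speedup(4 * HOURS_PER_WEEK, 10.5):.0f}x")

# Throughput for both queries quoted on the slide.
small = records_per_hour(24e9, 10.5)
large = records_per_hour(240e9, 18)
print(f"24 billion records:  {small / 1e9:.1f} billion records/hour")
print(f"240 billion records: {large / 1e9:.1f} billion records/hour")
```

Under that assumption the first query is about 64 times faster, and the larger query sustains a higher average rate than the smaller one.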

Page 20

Analytics of Tomorrow

Traditional & Big Analytics side-by-side for years to come

Hadoop moves to shared, virtualized infrastructure, for better efficiency and ease of management:

– Hadoop remains logically distributed, shared nothing, but runs on a virtualized shared everything architecture (e.g., FlexPod for VMware + eSeries)
– Same as above, except Hadoop becomes logically shared everything, as HDFS is replaced by a parallel file system (e.g., Lustre Cluster, StorNext or GPFS)

Enterprise class resiliency (no SPoF) and reliability with HPC-like performance (no need for triplicas)

Use of a single copy of data for the map phase (higher storage utilization)

Natural intersection with Cloud (Analytics as a Service)
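The storage-utilization point on this slide ("no need for triplicas", "single copy of data") comes down to simple arithmetic. The Python sketch below compares three copies of a dataset with one; the 100 TB dataset size and the function names are hypothetical, and the calculation ignores RAID or any other protection overhead that shared storage would add.

```python
def raw_capacity_needed(dataset_tb: float, copies: int) -> float:
    """Raw capacity (TB) needed to hold `copies` full copies of a dataset."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return dataset_tb * copies


def utilization(copies: int) -> float:
    """Fraction of raw capacity that holds unique data."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return 1.0 / copies


DATASET_TB = 100.0  # hypothetical dataset size, for illustration only

for copies in (3, 1):
    raw = raw_capacity_needed(DATASET_TB, copies)
    print(f"{copies} copies: {raw:.0f} TB raw, {utilization(copies):.0%} utilization")
```

With three copies only a third of the raw capacity holds unique data, which is the utilization gap the slide attributes to triple replication.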

Page 21

Summary

Despite the hype, Big Data is not new and is more than just analytics! (Many agencies and private companies have struggled with Big Data for decades)

Analytics: Traditional BI/DSS analytics still dominate. Importance of newer NoSQL & Columnar DB applications, enabled by MapReduce, will grow with the growth of multi-structured data

Big Data applications, such as Hadoop, will need to adopt shared, virtualized infrastructure (and its management benefits) if they are to be widely adopted by Enterprise IT

Page 22

YOU’VE GOT

I’VE GOT rambling responses that sound like

@thebillp or [email protected]