Utilizing Big Data Analytics with Hadoop Fern Halper @fhalper TDWI Research Director for Advanced Analytics April 17, 2014


Source PDF: download.101com.com/pub/tdwi/Files/SAS 041714.pdf



Sponsor


Speakers

Fern Halper, Research Director for Advanced Analytics, TDWI

Tapan Patel, Product Marketing Manager, SAS


Agenda

• The evolving big data ecosystem

• Status of big data, analytics, and Hadoop

• Considerations for getting started


An evolving ecosystem


• Hadoop

• Big data

• Advanced analytics

• In-memory


Examining the pieces: Big Data


• Social

• M2M/IoT

• Text

• Mobile/Location

• Volume

• Formats


70% of those respondents using or currently using predictive analytics are utilizing big data (source: TDWI Predictive Analytics Best Practices Report, 2014)


Examining the pieces: Analytics

The Analytics Spectrum

• Excel

• Dashboards and Reports

• Other BI

• Visualization

• Advanced Analytics


Advanced Analytics


Advanced analytics provides algorithms for complex analysis of either structured or unstructured data. It includes sophisticated statistical models, machine learning, text analytics, advanced visualization, and other advanced data mining techniques.


Examining the pieces: Hadoop

• HDFS/MapReduce

• Schema on read

• Ecosystem of tools

• Commercial distributions
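
The "schema on read" bullet can be illustrated with a minimal sketch in plain Python (not Hadoop code). It assumes raw delimited text is stored untouched and that each reader supplies its own schema when parsing. The sample data, column names, and the helper `read_with_schema` are hypothetical.

```python
import csv
import io

# Raw records are stored exactly as they arrived; nothing is enforced on write.
RAW_LINES = "2014-04-17,web,42\n2014-04-17,mobile,17\n2014-04-18,web,5\n"

# A schema here is a list of (column name, parser) pairs chosen by the reader.
SCHEMA_TYPED = [("day", str), ("channel", str), ("clicks", int)]
SCHEMA_TEXT = [("day", str), ("channel", str), ("clicks", str)]


def read_with_schema(raw_text, schema):
    """Parse raw delimited text into dicts, applying the schema at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        rows.append({name: parse(value) for (name, parse), value in zip(schema, fields)})
    return rows


# The same stored bytes are interpreted two different ways by two readers.
typed_rows = read_with_schema(RAW_LINES, SCHEMA_TYPED)
text_rows = read_with_schema(RAW_LINES, SCHEMA_TEXT)
total_clicks = sum(row["clicks"] for row in typed_rows)
print(total_clicks)
```

The design point is that changing the schema requires no rewrite of the stored data, only a different reader.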


In-memory analytics

• Performance

• Interactivity


Status: Evolving architectures


Source: (TDWI Evolving Data Warehouse Architectures In the Age of Big Data, 2014) n=1688 responses

What technical issues or practices are driving change in your DW architecture?

Select all that apply.


Status: Big data pieces


Status: Analytics pieces


Considerations


• Defining the problem

• Data preparation

• Analyzing the data

• Making it work (i.e., the team)

• Governance


Data preparation

• ETL vs. ELT

• Data quality

• Metadata
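
The ETL vs. ELT distinction can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` as a stand-in data store, and the table names, sample rows, and helper functions are hypothetical. Both paths produce the same cleaned table; they differ in where the transformation runs.

```python
import sqlite3

# Hypothetical raw feed: (customer, amount as untidy text).
RAW = [("alice", " 10.5 "), ("bob", "3"), ("alice", "4.5")]


def etl(raw):
    """ETL: transform in the application, then load only the cleaned rows."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE sales (customer TEXT, amount REAL)")
    cleaned = [(name, float(amount.strip())) for name, amount in raw]
    db.executemany("INSERT INTO sales VALUES (?, ?)", cleaned)
    return db


def elt(raw):
    """ELT: load the raw rows first, then transform inside the data store."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE raw_sales (customer TEXT, amount TEXT)")
    db.executemany("INSERT INTO raw_sales VALUES (?, ?)", raw)
    db.execute(
        "CREATE TABLE sales AS "
        "SELECT customer, CAST(TRIM(amount) AS REAL) AS amount FROM raw_sales"
    )
    return db


def totals(db):
    """Total amount per customer from the cleaned table."""
    query = "SELECT customer, SUM(amount) FROM sales GROUP BY customer ORDER BY customer"
    return db.execute(query).fetchall()


etl_totals = totals(etl(RAW))
elt_totals = totals(elt(RAW))
print(etl_totals, elt_totals)
```

In the ELT path the raw table remains available afterwards, so a different transformation can be applied later without reloading the source.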


Data exploration


• Query

• Visualization

• Descriptive statistics
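
As a small illustration of the descriptive statistics step, the sketch below summarizes a made-up sample with Python's standard `statistics` module. The data and the `summary` dictionary are hypothetical; the point is that a quick summary can surface issues such as an outlier before any modeling.

```python
import statistics

# Hypothetical sample: daily order counts, with one unusually large day.
orders = [12, 15, 11, 15, 40, 13, 14]

summary = {
    "count": len(orders),
    "min": min(orders),
    "max": max(orders),
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
}

for name, value in summary.items():
    print(name, value)
```

Here the mean sits noticeably above the median, which is a hint that the value 40 deserves a closer look.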


Analysis


• Data mining

– Supervised

– Unsupervised

• Other analytics
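
The supervised/unsupervised split can be sketched with two tiny pure-Python routines. These are illustrative stand-ins chosen for brevity (a nearest-centroid classifier and a one-dimensional two-cluster k-means), not techniques named in the slides. The data and function names are hypothetical, and `two_means` assumes the input contains at least two distinct values.

```python
def fit_centroids(values, labels):
    """Supervised: labels are given, so learn one mean per known class."""
    sums, counts = {}, {}
    for value, label in zip(values, labels):
        sums[label] = sums.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def predict(centroids, value):
    """Assign a new value to the class whose mean is closest."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values, iterations=10):
    """Unsupervised: no labels, so discover two groups (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # simple deterministic starting centers
    for _ in range(iterations):
        near_low = [v for v in values if abs(v - low) <= abs(v - high)]
        near_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(near_low) / len(near_low)
        high = sum(near_high) / len(near_high)
    return low, high


spend = [5.0, 7.0, 6.0, 50.0, 55.0, 60.0]
segment = ["low", "low", "low", "high", "high", "high"]

centroids = fit_centroids(spend, segment)   # uses the known labels
predicted = predict(centroids, 48.0)        # classify an unseen value
cluster_centers = two_means(spend)          # ignores the labels entirely
print(centroids, predicted, cluster_centers)
```

On this sample both approaches find the same two group centers; the difference is that only the supervised version needed the `segment` labels to get there.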


Operationalize


• Business process

• In-database scoring
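
In-database scoring means applying a fitted model where the data lives rather than extracting the data first. The sketch below is a generic illustration using Python's built-in `sqlite3`, not a depiction of any vendor product. The linear model, its coefficients, and the `customers` table are all hypothetical.

```python
import sqlite3

# Hypothetical linear model fitted elsewhere:
# score = intercept + age_coef * age + visits_coef * visits
COEFFICIENTS = {"intercept": 1.0, "age_coef": 0.5, "visits_coef": 2.0}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, age REAL, visits REAL)")
db.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 30.0, 2.0), (2, 40.0, 0.0), (3, 20.0, 5.0)],
)

# The model is expressed as SQL, so the arithmetic runs inside the database
# and only the resulting scores are returned to the caller.
SCORE_SQL = """
    SELECT id, :intercept + :age_coef * age + :visits_coef * visits AS score
    FROM customers
    ORDER BY id
"""
scores = db.execute(SCORE_SQL, COEFFICIENTS).fetchall()
print(scores)
```

The same idea scales to larger stores: the rows stay in place and the scoring expression travels to them.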


Skills


• Computing

• Analytic modeling

• Creative thinker

• Communicator


Big Data:

The Big Data Maturity Model


Poll Question

Are you making use of Hadoop for advanced analytics?

• Yes

• No, but we’re thinking about it

• No, and no plans to do so

• Don’t know


Copyright © 2014, SAS Institute Inc. All rights reserved.

UTILIZING BIG DATA ANALYTICS WITH HADOOP

TAPAN PATEL, PRODUCT MARKETING MANAGER, SAS


DATA TO DECISION LIFECYCLE

COMPETITIVE ADVANTAGE

• PREPARE DATA

• EXPLORE DATA

• DEVELOP MODELS

• DEPLOY & MONITOR


ACCESS TO HADOOP

1. Push some of SAS processing to Hadoop

Diagram labels: SAS SERVER, HiveQL, HADOOP

Key Offerings:

• SAS/Access to Hadoop

• SAS/Access to Cloudera Impala


EMBEDDED PROCESS FRAMEWORK

2. Push SAS processing to Hadoop with MapReduce

Diagram labels: SAS SERVER, SAS Data Step & DS2, HADOOP

Key Offerings:

• SAS Scoring Accelerator for Hadoop

• SAS Data Quality Accelerator for Hadoop

• SAS Code Accelerator for Hadoop

• SAS Data Management


SAS® IN-MEMORY ANALYTICS AND HADOOP

3. In-memory processing; use Hadoop for storage persistence and commodity computing

Diagram labels:

• SAS® LASR ANALYTIC SERVER (SAS® IN-MEMORY), HADOOP

• WEB CLIENTS, APPLICATIONS

• ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social

• Data Discovery and Visualization, Statistics and Predictive Analytics, Data Management, Text Analytics


SAS® VISUAL STATISTICS

INTERACTIVE PREDICTIVE ANALYTICS

• EXPLORE AND DISCOVER

• PREDICT AND REFINE

• DEPLOY AND MONITOR



SAS® IN-MEMORY STATISTICS FOR HADOOP

WHAT IS IT

• Provides a single interactive programming environment for Hadoop to perform:

– analytical data manipulation

– variable transformations

– exploratory analysis

– statistical modeling and machine learning

– integrated modeling comparison and scoring

• Takes advantage of distributed in-memory computing optimized for analytical workloads

• MANIPULATE DATA

• EXPLORE DATA

• DEVELOP MODELS

• SCORE


SAS® IN-MEMORY STATISTICS FOR HADOOP

PRODUCT DEMONSTRATION


Questions?


Download a free copy of the report

• Download the report as a PDF file at: http://tdwi.org/research/2014/03/checklist-utilizing-big-data-analytics-with-hadoop

Feel free to distribute the PDF file of any TDWI Checklist Report


Contact Information

If you have further questions or comments:

Fern Halper, TDWI [email protected]

Tapan Patel, SAS [email protected]