Utilizing Big Data Analytics with Hadoop

Fern Halper @fhalper

TDWI Research Director for Advanced Analytics

April 17, 2014

Sponsor

Speakers

Fern Halper, Research Director for Advanced Analytics, TDWI

Tapan Patel, Product Marketing Manager, SAS

Agenda

• The evolving big data ecosystem

• Status of big data, analytics, and Hadoop

• Considerations for getting started

An evolving ecosystem

• Hadoop

• Big data

• Advanced analytics

• In-memory

Examining the pieces: Big Data

• Social

• M2M/IoT

• Text

• Mobile/Location

• Volume

• Formats

70% of those respondents using or currently using predictive analytics are utilizing big data (source: TDWI Predictive Analytics Best Practices Report, 2014)

Examining the pieces: Analytics

The Analytics Spectrum

• Excel

• Dashboards and Reports

• Other BI

• Visualization

• Advanced Analytics

Advanced Analytics

Advanced analytics provides algorithms for complex analysis of either structured or unstructured data. It includes sophisticated statistical models, machine learning, text analytics, advanced visualization, and other advanced data mining techniques.

Examining the pieces: Hadoop

• HDFS/MapReduce

• Schema on read

• Ecosystem of tools

• Commercial distributions
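
The "schema on read" bullet above can be illustrated with a minimal sketch. It assumes raw records are stored untouched and that a schema is applied only when the data is read; the sample lines, the `SCHEMA` mapping, and the `read_with_schema` helper are hypothetical and use plain Python rather than any Hadoop API.

```python
import json

# Raw records are stored exactly as they arrive (hypothetical sample data).
RAW_LINES = [
    '{"user": "a", "amount": "12.50", "channel": "mobile"}',
    '{"user": "b", "amount": "7", "extra": "ignored"}',
    'not json at all',
]

# The schema lives with the reader, not with the stored data (assumed fields).
SCHEMA = {"user": str, "amount": float}


def read_with_schema(lines, schema):
    """Parse raw lines lazily, yielding only rows that fit the schema."""
    for line in lines:
        try:
            record = json.loads(line)
            yield {name: cast(record[name]) for name, cast in schema.items()}
        except (ValueError, KeyError, TypeError):
            # Skip rows that cannot be interpreted under this schema.
            continue


rows = list(read_with_schema(RAW_LINES, SCHEMA))
print(rows)
```

A different reader could apply a different schema to the same stored lines, which is the practical difference from a schema enforced at load time.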

In-memory analytics

• Performance

• Interactivity

Status: Evolving architectures

Source: (TDWI Evolving Data Warehouse Architectures In the Age of Big Data, 2014) n=1688 responses

What technical issues or practices are driving change in your DW architecture?

Select all that apply.

Status: Big data pieces

Status: Analytics pieces

Considerations

• Defining the problem

• Data preparation

• Analyzing the data

• Making it work (i.e., the team)

• Governance

Data preparation

• ETL vs. ELT

• Data quality

• Metadata
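
The ETL vs. ELT distinction above can be sketched as follows. This is an illustration under stated assumptions, not the approach of any product in this deck: an in-memory SQLite database stands in for the target data store, and the `RAW` rows, table names, and `clean` helper are hypothetical.

```python
import sqlite3

# Hypothetical source rows: untidy names and one unparseable amount.
RAW = [("  Alice ", "10.5"), ("BOB", "3"), ("carol", "n/a")]


def clean(name, amount):
    """Application-side transformation used by the ETL path."""
    try:
        return name.strip().lower(), float(amount)
    except ValueError:
        return None  # reject rows whose amount is not numeric


def etl(conn):
    # ETL: transform in the application first, then load only clean rows.
    conn.execute("CREATE TABLE sales_etl (name TEXT, amount REAL)")
    rows = [r for r in (clean(n, a) for n, a in RAW) if r is not None]
    conn.executemany("INSERT INTO sales_etl VALUES (?, ?)", rows)


def elt(conn):
    # ELT: load the raw rows untouched, then transform inside the store.
    conn.execute("CREATE TABLE sales_raw (name TEXT, amount TEXT)")
    conn.executemany("INSERT INTO sales_raw VALUES (?, ?)", RAW)
    conn.execute(
        "CREATE TABLE sales_elt AS "
        "SELECT lower(trim(name)) AS name, CAST(amount AS REAL) AS amount "
        "FROM sales_raw WHERE amount GLOB '[0-9]*'"
    )


conn = sqlite3.connect(":memory:")
etl(conn)
elt(conn)
etl_rows = conn.execute("SELECT * FROM sales_etl ORDER BY name").fetchall()
elt_rows = conn.execute("SELECT * FROM sales_elt ORDER BY name").fetchall()
print(etl_rows, elt_rows)
```

Both paths end with the same clean table. The ELT path also keeps the raw rows in the store, so a later transformation can be rerun without going back to the source.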

Data exploration

• Query

• Visualization

• Descriptive statistics
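
As a small illustration of the descriptive statistics step, the sketch below summarizes a hypothetical sample using Python's standard `statistics` module. The `values` list and the `describe` helper are assumptions made for the example.

```python
import statistics

# Hypothetical sample with one extreme value.
values = [12.0, 15.0, 11.0, 14.0, 90.0]


def describe(xs):
    """Return a few common summary statistics for a list of numbers."""
    return {
        "count": len(xs),
        "min": min(xs),
        "max": max(xs),
        "mean": statistics.mean(xs),
        "median": statistics.median(xs),
        "stdev": statistics.stdev(xs),  # sample standard deviation
    }


summary = describe(values)
print(summary)
```

Comparing the mean with the median is a quick exploratory check: here the single large value pulls the mean well above the median.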

Analysis

• Data mining

– Supervised

– Unsupervised

• Other analytics

Operationalize

• Business process

• In-database scoring
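
In-database scoring means the model is evaluated where the data lives instead of exporting rows to a separate scoring process. A minimal sketch, assuming a linear model that has already been trained: the coefficients, the `customers` table, and the `scoring_sql` helper are hypothetical, and SQLite stands in for the production database.

```python
import sqlite3

# Hypothetical coefficients of an already-trained linear model.
INTERCEPT = -1.0
WEIGHTS = {"age": 0.02, "visits": 0.5}


def scoring_sql(table):
    """Render the model as a SQL expression so the database computes scores."""
    terms = " + ".join(f"{weight} * {column}" for column, weight in WEIGHTS.items())
    return f"SELECT id, {INTERCEPT} + {terms} AS score FROM {table} ORDER BY id"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, age REAL, visits REAL)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 50.0, 2.0), (2, 30.0, 0.0)],
)

# Only the scores leave the database; the input rows stay where they are.
scores = conn.execute(scoring_sql("customers")).fetchall()
print(scores)
```

The design point is that the model travels to the data as a query, which avoids moving large tables out of the database just to score them.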

Skills

• Computing

• Analytic modeling

• Creative thinker

• Communicator

Big Data: The Big Data Maturity Model

Poll Question

Are you making use of Hadoop for advanced analytics?

• Yes

• No, but we’re thinking about it

• No, and no plans to do so

• Don’t know

Copyright © 2014, SAS Institute Inc. All rights reserved.

UTILIZING BIG DATA ANALYTICS WITH HADOOP

TAPAN PATEL, PRODUCT MARKETING MANAGER, SAS

DATA TO DECISION LIFECYCLE

• PREPARE DATA

• EXPLORE DATA

• DEVELOP MODELS

• DEPLOY & MONITOR

COMPETITIVE ADVANTAGE

ACCESS TO HADOOP

1. Push some of SAS processing to Hadoop

Diagram labels: SAS SERVER, Hive QL, HADOOP

Key Offerings:

• SAS/Access to Hadoop

• SAS/Access to Cloudera Impala

EMBEDDED PROCESS FRAMEWORK

2. Push SAS processing to Hadoop with MapReduce

Diagram labels: SAS SERVER, SAS Data Step & DS2, HADOOP

Key Offerings:

• SAS Scoring Accelerator for Hadoop

• SAS Data Quality Accelerator for Hadoop

• SAS Code Accelerator for Hadoop

• SAS Data Management

SAS® IN-MEMORY ANALYTICS AND HADOOP

3. In-memory processing; use Hadoop for storage persistence and commodity computing

Diagram labels: SAS® LASR ANALYTIC SERVER, SAS® IN-MEMORY, HADOOP, WEB CLIENTS, APPLICATIONS, ERP, SCM, CRM, Images, Audio and Video, Machine Logs, Text, Web and Social

• Data Discovery and Visualization

• Statistics and Predictive Analytics

• Data Management

• Text Analytics

SAS® VISUAL STATISTICS

INTERACTIVE PREDICTIVE ANALYTICS

• EXPLORE AND DISCOVER

• PREDICT AND REFINE

• DEPLOY AND MONITOR

SAS® IN-MEMORY STATISTICS FOR HADOOP

WHAT IS IT

• Provides a single interactive programming environment for Hadoop to perform:

– analytical data manipulation

– variable transformations

– exploratory analysis

– statistical modeling and machine learning

– integrated modeling comparison and scoring

• Takes advantage of distributed in-memory computing optimized for analytical workloads

• MANIPULATE DATA

• EXPLORE DATA

• DEVELOP MODELS

• SCORE

SAS® IN-MEMORY STATISTICS FOR HADOOP

PRODUCT DEMONSTRATION

Questions?

Download a free copy of the report

• Download the report as a PDF file at: http://tdwi.org/research/2014/03/checklist-utilizing-big-data-analytics-with-hadoop

Feel free to distribute the PDF file of any TDWI Checklist Report

Contact Information

If you have further questions or comments:

Fern Halper, TDWI fhalper@tdwi.org

Tapan Patel, SAS Tapan.Patel@sas.com
