Data Wrangling on Hadoop - Olivier de Garrigues, Trifacta

Hadoop User Group London: Data Wrangling on Hadoop, September 8, 2016. Olivier de Garrigues, EMEA Solutions Lead



Creating radical productivity for people who analyze data.

JEFFREY HEER Co-Founder & CXO

VISUALIZATION

JOE HELLERSTEIN Co-Founder & CSO

BIG DATA

SEAN KANDEL Co-Founder & CTO

HUMAN-COMPUTER INTERACTION


3,000+ Companies 10,000+ Users

What is Data Wrangling?


QUESTION ANALYZE INSIGHT DISCOVER STRUCTURE CLEANSE ENRICH VALIDATE PUBLISH

The Bridge Between Raw Data & Analysis


Ingestion Storage Processing

ANALYSIS & VISUALIZATION

LOB CLEANING ENRICHMENT DISTILLATION STRUCTURING DISCOVERY

End-User Capabilities

IT GOVERNANCE INTEGRATION AVAILABILITY SCALABILITY SECURITY

Technical Capabilities

Conventional Approaches Inhibit User Empowerment

Hand-Coding Technical Workflow Mapping

Trifacta Approach: It’s All About The Experience

Interact Predict

Preview

Data Wrangling for Financial Fraud

TRIFACTA

DATA WRANGLING WORKFLOW


Sample Scale Up

Refine Sample

Results

Identify/Register Data

1. Predictive Interaction

2. Consume

Schedulers

Monitor and Adjust

3. Schedule

Visualization & Analysis

Secure Access

Ingestion Processing Storage

ANALYSIS & CONSUMPTION


Discover Structure Clean Enrich Distill

LOB

IT

News: Topics, Time

Trades: Tickers, Date

Emails: Recipients, Topics

Phone Logs: Call Details, Recipients

Corporations: Company Relations

Individuals

Financial Services use case: Trader Fraud

Data Wrangling Benefits

➔  Empower the people who know the data best

➔  Accelerate time to value

➔  Lower business risk with more accurate data

➔  Unlock innovation using a wider variety of data