Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

PENTAHO Simplifying Analytics Architecture for Big Data 22nd Feb, 2016 Presenter - Sandeep Khuperkar [email protected] Presenter – Sameer Goswami [email protected]

Upload: ashnikbiz

Post on 13-Apr-2017

Page 1: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

PENTAHO – Simplifying Analytics Architecture for Big Data

22nd Feb, 2016

Presenter - Sandeep Khuperkar [email protected]

Presenter – Sameer Goswami [email protected]

Page 2: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Data Pipeline

Confidential information, for internal use only

Page 3: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Making the Big Data Blend Easy and a Reality

Page 4: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Telco Customer Experience Analytics

Page 5: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Pentaho Product Overview

Page 6: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Pentaho Product Components

Pentaho Data Integration

Pentaho Dashboards

Pentaho Data Mining / Predictive Analytics

Pentaho Enterprise and Interactive Reports

Pentaho for Big Data MapReduce & Instaview

Pentaho Analyzer

Page 7: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

❯ Simple, easy-to-use visual data exploration

❯ Web-based thin client; in-memory caching

❯ Rich library of interactive visualizations

• Geo-mapping, heat grids, scatter plots, bubble charts, line over bar and more

• Pluggable visualizations

❯ Java ROLAP engine to analyze structured and unstructured data, with SQL dialects for querying data from RDBMSs

❯ Pluggable cache integrating with leading caching architectures: Infinispan (JBoss Data Grid) & Memcached

Pentaho Interactive Analysis & Data Discovery

Highly Flexible Advanced Visualizations

Page 8: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

❯ Web-based thin client

❯ Drag & drop, easy-to-use

❯ Supports any database and data model

❯ Simple, powerful query capabilities for business users

• Filtering, formatting, group summary

❯ Powerful function library for calculated columns

❯ Sharing and distribution

❯ Row-level security

❯ Localization capabilities

Pentaho Interactive Reporting

Simple Ad Hoc Reporting for Business Users

Page 9: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

• Rich desktop report designer, pixel perfect reports

• High-volume, highly-formatted enterprise reporting

• Run locally or publish to the server

• Broad data support – relational, big data, flat files, PDI transformations

• Output options include HTML, Excel, CSV, PDF and RTF

• 100% Java

• Embeddable / white-labeling

Pentaho Enterprise Reporting

Highly Polished Reports for Scalable Distribution

Page 10: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Dashboard Designer

• Web-based thin client

• Simple, drag & drop, easy-to-use

• Template-based dashboard creation

• Design rich interactivity with data

• Mash up all Pentaho and 3rd-party content

Dashboard Framework

• Create highly customized dashboards & interactive web applications

• Collection of visualization and filter control widgets

• Extensible (Java, JavaScript)

Pentaho Dashboards

Self-Service for Business Users | Customizable for App Developers

Page 11: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Pentaho Data Integration

Easy to Use, Highly Scalable

• Graphical ETL designer

• Data agnostic

• Structured, unstructured, web services, packaged apps (Google, SAS, SFDC, etc.), big data sources, traditional sources, JSON, XML, HL7, etc.

• Batch, low-latency & real-time processing

• Scale-out architecture, deployable to PDI clusters, Hadoop clusters

• 100% Java engine; plug-in architecture for extensibility

• Workflow, alerting, monitoring

Integration, Manipulation & Enrichment

Use Cases:

Classic ETL – data warehouse creation, population & maintenance

Information Delivery – extraction from multiple data sources, transformation and streaming to a report

MapReduce Applications – implementing “code-free” transformation pipelines within Hadoop

Extensibility – adding 3rd-party functionality that automatically works within any of the above use cases.
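To make the MapReduce use case concrete, here is a minimal plain-Python sketch of the map, shuffle, and reduce phases that a visual, "code-free" pipeline is meant to spare the user from writing by hand. It is only an illustration of the pattern: it uses no Pentaho or Hadoop API, and the function names and the `region,status` call-record format are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(records: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Emit (region, 1) for every dropped call in 'region,status' records."""
    for line in records:
        region, status = line.strip().split(",")
        if status == "dropped":
            yield region, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Group emitted values by key, as the framework would between phases."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Sum the grouped values for each key."""
    return {key: sum(values) for key, values in groups.items()}


def dropped_calls_by_region(records: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence on an in-memory list of records."""
    return reduce_phase(shuffle(map_phase(records)))


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the presentation.
    sample = ["north,ok", "north,dropped", "south,dropped", "north,dropped"]
    print(dropped_calls_by_region(sample))
```

On a real cluster the map and reduce functions would run in parallel across nodes; the sketch runs them in one process only to show how the pieces fit together.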

Page 12: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Pentaho Big Data Analytics

Accelerate the time to big data value

• Full continuity from data access to decisions – complete data integration & analytics for any big data store

• Faster development, faster runtime – visual development, distributed execution

• Instant and interactive analysis – no coding and no ETL required

Page 13: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Major sponsor of the open source project Weka

Data exploration/visualization, model construction and export, preliminary evaluation

Numerous classification/regression and clustering algorithms

Integration with Pentaho Data Integration

• Import 3rd-party models using Predictive Model Markup Language (PMML)

• Operationalize models inside or outside of a Hadoop Cluster

• Incorporate algorithms into Pentaho visual interface; store and version models using the Pentaho repository

Pentaho Predictive Analytics

Full Predictive Analytics Lifecycle Support

Page 14: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Leverage new/enhanced features

Page 15: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Inline Model Editing

Model Shared with other users

Page 16: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Data Service and Power Blending

Page 17: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Data Lineage Analysis

• Understanding data origins each time it’s executed

• What happens to it over time

• Where data moves
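As a rough illustration of the lineage questions listed above (where data came from on each run, and where it moved), the following Python sketch records one entry per executed step and walks those entries back to the original sources. It is not Pentaho's lineage feature or API; the `LineageLog` class, its methods, and the dataset names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List, Set


@dataclass
class LineageLog:
    """Hypothetical in-memory lineage record: output dataset -> producing steps."""

    edges: Dict[str, List[dict]] = field(default_factory=dict)

    def record(self, run_id: str, step: str, inputs: Iterable[str], output: str) -> None:
        """Note that `step` read `inputs` and wrote `output` during run `run_id`."""
        self.edges.setdefault(output, []).append(
            {"run_id": run_id, "step": step, "inputs": list(inputs)}
        )

    def origins(self, dataset: str) -> Set[str]:
        """Return the datasets with no recorded producer that feed `dataset`."""
        sources: Set[str] = set()
        seen: Set[str] = set()
        stack = [dataset]
        while stack:
            current = stack.pop()
            if current in seen:
                continue  # guard against revisiting shared or cyclic inputs
            seen.add(current)
            entries = self.edges.get(current, [])
            if not entries:
                sources.add(current)  # nothing produced it, so it is a source
            for entry in entries:
                stack.extend(entry["inputs"])
        return sources


if __name__ == "__main__":
    log = LineageLog()
    log.record("run-1", "blend", ["crm.csv", "cdr.json"], "blended")
    log.record("run-1", "aggregate", ["blended"], "report")
    print(log.origins("report"))
```

Keeping the `run_id` on every entry is what would let a reader compare lineage between executions, which is the "each time it's executed" point on the slide.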

Page 18: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Visual MapReduce

Page 19: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Lifecycle Management

Minimize disruption between major versions

• Content backup & restore

• Support for backward compatible components (Spring, Java)

• Additional effort on upgrade transparency for 5.x users

Scope of capability: Backup and restore all content within the enterprise repository

• Data sources

• Schedules

• Reports and report outputs

• Transformations

• Metastore

Page 20: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Enhanced Enterprise Security

Page 21: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Putting Big Data to Work

Page 22: Pentaho Google Hangout - Simplifying Analytics Architecture for Big Data

Please write to us on [email protected]

Follow us on

Find this presentation on our YouTube/SlideShare
