Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Upload: hortonworks

Post on 21-Jan-2018

Page 1: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

© Hortonworks Inc. 2011 – 2017. All Rights Reserved

Hortonworks Premier Inside Out – Introducing Data Science Experience (DSX)

Page 2: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Presenters

Page 3: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

IBM + Hortonworks = Unlocking Actionable Insights

Hortonworks
• #1 Pure Open Source Hadoop Distribution
• 1000+ customers and 2100+ ecosystem partners
• Employs the original architects, developers and operators of Hadoop from Yahoo!
• Best-in-class 24x7 customer support
• Leading professional services and training

IBM
• #1 Data Science Platform (Source: Gartner)
• OpenPOWER performance leadership
• Flexible, software defined storage
• #1 SQL Engine for complex, analytical workloads
• Leader in On-premise and Hybrid Cloud solutions

Page 4: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Data Science Lifecycle

Page 5: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Next Generation Data Science Problems: Multiple data sources & clusters

Data Scientists
• Where is the data I need to answer the business questions?

Data Engineers
• How do I move that data into a central repository?
• How do I transform and cleanse that data?

Page 6: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Next Generation Data Science Problems: Too many tools and technologies

Data Scientists
• How do I learn the latest library/technique?
• I don’t (want to) know Hadoop/Hive etc.
• How do I bring my familiar R/Python library to the new data science platform?

Page 7: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Next Generation Data Science Problems: Socializing insights is challenging

Data Scientists
• How do I collaborate and share my work with others in the organization?

Business Analyst
• How do I move that data into a central repository?
• What is the best visualization to tell my story?

Page 8: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Next Generation Data Science Problems: Going from prototype to production is cumbersome

Data Scientists
• I created this awesome Machine Learning Model. How do I put it into production?

Data Scientists/Data Engineers
• How are my Machine Learning Models performing, and how do I improve them?

Page 9: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Data Science Experience

• Explore & Learn
• Model & Evaluate
• Deploy & Predict
• Monitor & Measure

The leading data science platform that allows you to easily collaborate across teams, use the top open source tools and scale at the speed your business requires.

Page 10: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Data Science Solution

Community
• Find tutorials and datasets
• Connect with Data Scientists
• Ask questions
• Read articles and papers
• Fork and share projects

Open Source
• Code in Scala/Python/R/SQL
• Zeppelin & Jupyter Notebooks
• RStudio IDE and Shiny
• Apache Spark
• Your favorite libraries

Scale & Enterprise Security
• Data Science at Scale
• Run Spark Jobs on HDP Cluster
• Secure Hadoop Support
• Ranger Atlas Support for Data
• Support for ABAC

Model Management
• Data Shaping Pipeline UI
• Auto-data preparation & modeling
• Advanced Visualizations
• Model management & deployment
• Documented Model APIs

Data Science Experience

Page 11: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Enterprise Data Science At Scale

• Enterprise: Secured, governed and managed
• Tools: Leverage your favorite tools, technologies and libraries
• Deployment: From pilot to production
• Data: Build models using all the data

Page 12: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

DEMO

Page 13: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Demo Scenario: Sensors monitoring Trucks

• Stored long-term sensor data about various truckers' driving behavior
• New sensor data coming from trucks as they are driving in various conditions
• Predict driving violations before they happen
• Alert the driver | manager
• Business monitors the driver performance
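
The scenario above (train on stored sensor data, score new events, alert on predicted violations) can be sketched in a few lines. This is only an illustrative sketch: the deck does not say which features or model the demo uses, so the feature names (`speed_mph`, `foggy`), the sample records, the 0.5 alert threshold, and the plain-Python logistic regression are all assumptions. In the setup the slides describe, this logic would presumably live in a DSX notebook running against data on the HDP cluster.

```python
import math

# Hypothetical historical sensor records: (speed_mph, foggy, violation).
# These values are invented for illustration; they are not from the demo.
HISTORY = [
    (55, 0, 0), (60, 0, 0), (62, 1, 0), (65, 0, 0),
    (70, 1, 1), (75, 0, 1), (80, 1, 1), (85, 0, 1),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def features(speed_mph, foggy):
    # Bias term, speed centered and scaled, fog flag.
    return [1.0, (speed_mph - 70.0) / 10.0, float(foggy)]


def train(history, epochs=2000, lr=0.1):
    """Fit logistic regression weights with batch gradient descent."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for speed, foggy, label in history:
            x = features(speed, foggy)
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - label
            grad = [g + error * xi for g, xi in zip(grad, x)]
        weights = [w - lr * g / len(history) for w, g in zip(weights, grad)]
    return weights


def violation_risk(weights, speed_mph, foggy):
    """Predicted probability of a violation for one sensor reading."""
    x = features(speed_mph, foggy)
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)))


def trucks_to_alert(weights, events, threshold=0.5):
    """Return IDs of trucks whose predicted risk reaches the threshold."""
    return [truck_id for truck_id, speed, foggy in events
            if violation_risk(weights, speed, foggy) >= threshold]


# Train on the stored data, then score newly arriving (hypothetical) events.
weights = train(HISTORY)
new_events = [("truck-1", 90, 1), ("truck-2", 50, 0)]
print(trucks_to_alert(weights, new_events))
```

The alert threshold is the knob a business owner would most likely tune: lowering it warns drivers earlier at the cost of more false alarms.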

Page 14: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Demo Flow: Insights from Data Science to Production

Data Scientists
• Where is the data I need to answer the business questions?

Business Users
• Where are the insights & predictions from the data?

HDP Cluster

Knox

Admins
• How do I meet SLA, Performance, .., Feature needs?

Page 15: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Demo Scenario: Problems Solved

• Data Scientists collaborate and learn new tools & frameworks

• Choice of tools, notebooks and languages

• Run favorite notebook on all data in the HDP Cluster

• Deploy the model to production

• Leverage the production model to deliver insights to business

• Monitor models and retrain models as new data comes in

Page 16: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

DSX with HDP Roadmap: Summary plan

DSX install with Ambari
• DSX Ambari Install, DSX in HDP, Improve Enterprise readiness
• Install DSX with Ambari, DSX runs on YARN node-labeled nodes, Ranger, Atlas integration for Model Management, SSO

Deeper DSX YARN integration
• Improve YARN integration, Model Scoring on YARN
• DSX scales on all YARN nodes, Model Scoring and Notebooks run on YARN

Page 17: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Q & A

Page 18: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Customer Briefings coming to a City Near You!

• 24 OCT: Silicon Valley
• 25 OCT: Salt Lake City
• 26 OCT: Dallas
• 1 NOV: Chicago
• 2 NOV: Toronto
• 7 NOV: Tysons
• 8 NOV: New York City
• 9 NOV: Boston

Page 19: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Join us for a Meetup Session

Enterprise Data Science at Scale Meetup

• Silicon Valley 10/30
• San Francisco 11/14
• Chicago 11/08 (*)
• Dallas 11/09 (*)
• Toronto 11/09 (*)
• NYC 11/15 (*)
• Washington DC 11/16 (*)
• London 11/24 (*)
• Boston 12/01 (*)

(*) Tentative

Page 20: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Page 21: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Thank You

Page 22: Enterprise Data Science at Scale Meetup - IBM and Hortonworks - Oct 2017

Announcement 13 Jun 2017: IBM and Hortonworks extend partnership to bring Data Science to HDP

Great Data + Great Data Science = Great Decisions

• IBM chooses Hortonworks Data Platform (HDP®) as their Hadoop distribution
• Hortonworks Data Platform (HDP) combining IBM DSX (Data Science Experience) & IBM Big SQL into new integrated solutions