Pivotal Digital Transformation Forum: Data Science Technical Overview

Upload: pivotal

Post on 24-Jan-2017

Page 1: Pivotal Digital Transformation Forum: Data Science Technical Overview

Data Science A Technical Overview

Dr Carsten Riggselsen Principal Data Scientist Pivotal

Page 2: Pivotal Digital Transformation Forum: Data Science Technical Overview

2 © Copyright 2015 Pivotal. All rights reserved.

Page 3: Pivotal Digital Transformation Forum: Data Science Technical Overview

Pivotal in a Nutshell

Page 4: Pivotal Digital Transformation Forum: Data Science Technical Overview

Platform to Solution Level

Solution Level

•  Ready to be incorporated into scalable, platform-independent applications:
   – Pivotal Cloud Foundry platform-as-a-service
   – Pivotal Labs, gold standard for modern software development
   – World-class Data Science capability

Platform Level

•  Pivotal Big Data Suite is the only enterprise-grade software distribution that contains all elements of the λ architecture
•  Wrapped in a flexible and consumption-based commercial offer
•  Truly enables enterprises to make the best decisions when it matters

Platform Component Level

•  GemFire is the leading in-memory data grid, processing 2 million events per second
•  HAWQ brings 100% ANSI compliant SQL, hundreds of concurrent queries and JDBC & ODBC compliance to link to legacy databases
•  Spring XD streaming workflow builds performant pipelines to consume and consolidate data from a variety of endpoints

Page 5: Pivotal Digital Transformation Forum: Data Science Technical Overview

Digital Transformation

•  Store any type and size of data
•  Discover insights, create analytics algorithms
•  Deploy analytic apps and automation at scale

Page 6: Pivotal Digital Transformation Forum: Data Science Technical Overview

Value of Data and Information

[Figure: value of data and of information plotted against time (µs, ms, s, hour, day, month, year, yr+), annotated with Spring XD, Pivotal GemFire and Pivotal HD.]

Page 7: Pivotal Digital Transformation Forum: Data Science Technical Overview

What We Do

•  Translate business problems into mathematical/statistical problems
•  Combine appropriate Data Science techniques
•  Infer and estimate parameters of statistical models
•  Rely heavily on Pivotal’s stack and OSS

Page 8: Pivotal Digital Transformation Forum: Data Science Technical Overview

Data Science Toolkit

PLATFORM: Pivotal Big Data Suite, Pivotal HD, Pivotal Greenplum Database, Pivotal CF®

KEY TOOLS / KEY LANGUAGES: SQL, Spring XD

Page 9: Pivotal Digital Transformation Forum: Data Science Technical Overview

REALTIME DASHBOARD: Destination Prediction

Page 10: Pivotal Digital Transformation Forum: Data Science Technical Overview

Understanding Driving Behavior

•  Same car
•  Same roads
•  Different drive styles

[Figure: vehicle speed (km/h, 0 to 60) against distance (m, 0 to 1,400).]

Page 11: Pivotal Digital Transformation Forum: Data Science Technical Overview

[Diagram labels: OBD/Bluetooth, Spring XD, Pivotal HD, Realtime Evaluation, Batch Training, Data Persistence]

Page 12: Pivotal Digital Transformation Forum: Data Science Technical Overview

Example: Pivotal’s Connected Car (λ Arch)

[Diagram labels: Spring XD, Pivotal HD, Pivotal GemFire; Speed Layer, Serving Layer, Batch Layer]
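The three layers named on this slide can be illustrated with a minimal in-memory sketch. It is not the connected-car implementation and uses no Spring XD, Pivotal HD or GemFire APIs; every class and function name below (`SpeedEvent`, `BatchLayer`, `SpeedLayer`, `ServingLayer`, `ingest`, `run_batch`) is hypothetical, and the per-vehicle mean speed is just an assumed example query. It only shows the general λ pattern: every event goes to both an append-only master dataset and a speed layer, the batch layer periodically recomputes a full view, and the serving layer merges the batch view with whatever the batch run has not absorbed yet.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeedEvent:
    """Hypothetical sensor reading; field names are assumptions."""
    vehicle_id: str
    timestamp: int      # seconds since start of recording
    speed_kmh: float


class BatchLayer:
    """Recomputes a complete view from the append-only master dataset."""

    def __init__(self):
        self.master = []          # every event ever ingested
        self.view = {}            # vehicle_id -> (sum of speeds, count)
        self.covered_until = -1   # newest timestamp included in the view

    def append(self, event):
        self.master.append(event)

    def recompute(self):
        totals = defaultdict(lambda: [0.0, 0])
        for e in self.master:
            totals[e.vehicle_id][0] += e.speed_kmh
            totals[e.vehicle_id][1] += 1
        self.view = {k: (s, n) for k, (s, n) in totals.items()}
        self.covered_until = max((e.timestamp for e in self.master), default=-1)


class SpeedLayer:
    """Holds only the events that the batch view does not cover yet."""

    def __init__(self):
        self.recent = []

    def update(self, event):
        self.recent.append(event)

    def expire(self, covered_until):
        self.recent = [e for e in self.recent if e.timestamp > covered_until]

    def totals(self, vehicle_id):
        speeds = [e.speed_kmh for e in self.recent if e.vehicle_id == vehicle_id]
        return sum(speeds), len(speeds)


class ServingLayer:
    """Answers queries by merging the batch view with the speed-layer view."""

    def __init__(self, batch, speed):
        self.batch = batch
        self.speed = speed

    def mean_speed(self, vehicle_id):
        b_sum, b_n = self.batch.view.get(vehicle_id, (0.0, 0))
        s_sum, s_n = self.speed.totals(vehicle_id)
        n = b_n + s_n
        return (b_sum + s_sum) / n if n else None


def ingest(event, batch, speed):
    """Fan each incoming event out to both layers."""
    batch.append(event)
    speed.update(event)


def run_batch(batch, speed):
    """Rebuild the batch view, then drop what it now covers from the speed layer."""
    batch.recompute()
    speed.expire(batch.covered_until)


if __name__ == "__main__":
    batch, speed = BatchLayer(), SpeedLayer()
    serving = ServingLayer(batch, speed)
    ingest(SpeedEvent("car-1", 0, 30.0), batch, speed)
    ingest(SpeedEvent("car-1", 1, 50.0), batch, speed)
    run_batch(batch, speed)
    ingest(SpeedEvent("car-1", 2, 40.0), batch, speed)  # not yet in the batch view
    print(serving.mean_speed("car-1"))
```

Keeping sums and counts rather than means in both views is what makes the merge in `mean_speed` exact; a real deployment would replace the Python lists with a distributed store and a stream processor.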

Page 13: Pivotal Digital Transformation Forum: Data Science Technical Overview

Video Recording

Page 14: Pivotal Digital Transformation Forum: Data Science Technical Overview

Pivotal’s Scalable Video Analytics Architecture

Page 15: Pivotal Digital Transformation Forum: Data Science Technical Overview

How We Do It

•  Agile principles and practices
•  Data Science enablement
•  Collaboration is key

Page 16: Pivotal Digital Transformation Forum: Data Science Technical Overview

What Agile Data Science Means

•  Agile is an iterative approach that lets you quickly change the kind of analysis you are doing, depending on what the data is telling you.
•  Frequent interactions and pairing with the customer keep the project on track and reduce its risk.

[Diagram: Agile Data Science, with stages labelled Scoping, Data Review, Feature Engineering, Feature Review, Model Building, Model Evaluation, Operationalization]

Page 17: Pivotal Digital Transformation Forum: Data Science Technical Overview

Frequent Feedback Removes Risk

Page 18: Pivotal Digital Transformation Forum: Data Science Technical Overview

Conclusions

•  Pivotal’s software stack enables Data Science
•  Real-time, interactive and batch operations
•  Different architectures can be realized
•  Data Science is still a matter of thinking
•  Collaborate and interact

Page 19: Pivotal Digital Transformation Forum: Data Science Technical Overview

Digital Transformation Forum

Disrupt or Be Disrupted

19 OCTOBER · BMW WELT EVENT CENTRE · MUNICH