Becoming data-driven companies: making the adoption of analytics pervasive in every organization. 11/04/2019 - PADOVA


Page 2:

SPEAKERS

Carlo Arioli, EMEA Marketing Manager @ [email protected], +39-346-2256423

Gianluigi Viganò, EMEA Presales @ [email protected], +39-335-7483447

Page 3:

Time for the real title

2005

Page 4:

6:15 am Morning Run

9:00 am Trip to Work

11:00 am Entertain

1:30 pm Booking trip, ’cause of Ads

3:00 pm Shopping

4:01 pm Alert received from my Bank

7:00 pm Dinner with Bio

9:00 pm Relax & Enjoy

Page 5:

[Diagram keywords: Data Driven, Efficiency, Strategic, Speed, Projects, Volume, Costs, Tools/Process, Silos / Skills, Value, Outcomes]

Page 6:

The «Data-Driven Company» value chain

DATA x ANALYTICS x IT x PEOPLE x PROCESSES = Value captured

Multiplicative: only as good as the weakest link

Strategy and (Analytic) Vision

Data analytics governance

Technical foundations

Business foundation

Diagram elements: New data sources, Volume, Descriptive analytics, Classical predictive statistics, Advanced machine learning, Cognitive modeling, Horizontal scalability, Analytical program languages, Speed of analytics, Cultural change, Data-enabled decision making, Role profiles, Analytics talents, Adaptation of business processes, Automation of business processes, Agile processes, Optimization, Orchestration of data, Data security, Unstructured data, Corporate IT stacks, Organization, Cross-functionality, Cloud workloads

Source: «Achieving business impact with data», McKinsey Digital

Page 7:

You Don’t Need Big Data — You Need the Right Analytics …for all

* RIGHT = in the right place, accessible to the right people in the right way and at the right time, to help make the right business decisions at the right cost

What if? Pervasive data-enabled decision making

Page 8:

Point of view: NO compromise

The industry’s only infrastructure-agnostic, unified advanced analytics platform

• Analyze in the Right Place
• Strong Reliable Performance at Exabyte Scale
• In-Database Analytics & Machine Learning
• Freedom from Underlying Infrastructure

Exabyte*-proven scale

5-1000x faster query response, certified

* 1 EB = 1 000 000 000 GB

Page 9:

How?

6xC
• Column-Oriented
• Cluster based (MPP)
• Compression & encoding (TCO)
• Cloud proven with Eon mode
• Complementary to Open Source
• Compliant with ANSI SQL

Why 6? 6 is a perfect number, according to number theory!

Page 10:

Vertica: Technical Value Proposition

• Advanced Analytics & ML: rich SQL analytics and in-database machine learning
• Extremely fast, scalable & cost effective: columnar DB with Massively Parallel Processing (MPP) architecture
• Easy to use & develop: standard SQL, certified integration with all BI & ETL tools
• Streaming Analytics: with Kafka
• Extended Data Science: Spark integration, Java, C, R & Python & V-Python
• Analyze on existing Data Lakes: analyze data in place with SQL on Hadoop and SQL on Amazon S3
• Certified Multi-Cloud: Azure, AWS & Google

Page 11:

Vertica Unified Analytics with Open Architecture

• Data Transformation
• Messaging & ETL
• BI & Visualization
• ODBC, JDBC, ADO.NET
• SQL
• User-Defined Functions: R, Java, Python, C++
• Geospatial, Real-Time, Text Analytics, Event Series, Time Series, Pattern Matching, Machine Learning, Regression
• Row/Column Security, Masking, FPE, LDAP, Kerberos
• ROS, JSON, CSV

Page 12:

Column Oriented

1 column = 1 file on disk (or more)

Ideal for load-/read-intensive workloads with dramatic reduction of disk I/O

Only reads the columns involved in the query from disk instead of every row and column

Reads and writes in very large block sizes

SELECT avg(price)
FROM tickstore
WHERE symbol = 'AAPL'
AND date = '5/06/09'

Column Store - Reads 3 columns

Row Store - Reads all columns

[Figure: a wide tickstore table. The row store reads every column of every row; the column store reads only the symbol, date and price columns. Sample values shown:]

symbol  price   date     exchange
AAPL    143.74  5/05/09  NQDS
AAPL    143.75  5/06/09  NYSE
BBY     37.03   5/05/09  NYSE
BBY     37.13   5/06/09  NYSE
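The difference between the two layouts can be made concrete with a minimal Python sketch. It is only an illustration of the idea on the slide, not Vertica's storage engine: the rows are the four sample values from the figure, the column name `exchange` is an assumption (the slide shows only the values), and "fields read" is a stand-in for disk I/O.

```python
from statistics import mean

# Sample rows from the slide's figure; "exchange" is an assumed column name.
ROWS = [
    {"symbol": "AAPL", "date": "5/05/09", "price": 143.74, "exchange": "NQDS"},
    {"symbol": "AAPL", "date": "5/06/09", "price": 143.75, "exchange": "NYSE"},
    {"symbol": "BBY", "date": "5/05/09", "price": 37.03, "exchange": "NYSE"},
    {"symbol": "BBY", "date": "5/06/09", "price": 37.13, "exchange": "NYSE"},
]


def to_columns(rows):
    """Pivot row-oriented records into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def avg_price_row_store(rows, symbol, date):
    """Row layout: every field of every row is touched during the scan."""
    fields_read = sum(len(row) for row in rows)
    prices = [r["price"] for r in rows
              if r["symbol"] == symbol and r["date"] == date]
    return mean(prices), fields_read


def avg_price_column_store(columns, symbol, date):
    """Column layout: only the three columns named in the query are read."""
    needed = ("symbol", "date", "price")
    fields_read = sum(len(columns[name]) for name in needed)
    prices = [p for s, d, p in zip(columns["symbol"], columns["date"],
                                   columns["price"])
              if s == symbol and d == date]
    return mean(prices), fields_read


row_result = avg_price_row_store(ROWS, "AAPL", "5/06/09")
col_result = avg_price_column_store(to_columns(ROWS), "AAPL", "5/06/09")
```

Both functions return the same average, but the column layout touches 12 fields instead of 16. With the many extra columns drawn in the figure, the gap would widen accordingly.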

Page 13:

Compression & Encoding

Compression results (ratio):
• CDR: 8:1
• Consumer: 30:1
• Marketing: 20:1
• Network Logs: 60:1
• SNMP: 20:1
• Trading, IoT (float): 5:1
• IoT (int): 10:1
• Clickstream: 10:1

Just-In-Time Decoding
• Disk: encoding + compression
• Network: encoded blocks + optional LZO
• Buffer pool: de-compress only
• Engine: encoded blocks
• Results decoded just-in-time
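The slide does not say which encodings are used, so the sketch below takes run-length encoding as one illustrative choice, assuming a sorted, low-cardinality column. It also shows the "engine works on encoded blocks" idea: a count is answered directly from the runs, and values are expanded only when results are needed. The data and function names are hypothetical.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, sum(1 for _ in run)) for value, run in groupby(values)]


def rle_decode(runs):
    """Expand (value, run_length) pairs back into the original list."""
    return [value for value, count in runs for _ in range(count)]


def count_equal(runs, target):
    """Count matching values on the encoded runs, without decoding them."""
    return sum(count for value, count in runs if value == target)


# Hypothetical sorted column: 1000 NQDS values followed by 3000 NYSE values.
column = sorted(["NQDS", "NYSE", "NYSE", "NYSE"] * 1000)
runs = rle_encode(column)
values_per_run = len(column) / len(runs)
nyse_count = count_equal(runs, "NYSE")
```

Here 4000 values collapse into 2 runs. This is a ratio of values to runs, not a byte-level compression ratio, so it should not be compared with the figures on the slide.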

Page 14:

Cluster Based

Vendor-agnostic, MPP, Shared-Nothing, Scale-Out

Physical Rack Servers
• 8-40 cores
• 8-16 GB / core
• 24x HDD / SSD
• Linux (RHEL/Ubuntu/Oracle…)

Virtual / Cloud Servers
• 8-64 vCPU
• 4-8 GB / vCPU
• SAN / S3
• Linux (RHEL/Ubuntu/AWS…)

Network
• Client to Vertica Cluster: Intranet, 1/10 GbE
• Private Network: 10 GbE
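A shared-nothing MPP cluster can be pictured as nodes that each own a private slice of the data and return only small partial results. The Python sketch below is a generic illustration under that assumption. The CRC32-based placement and all function names are hypothetical and are not taken from the slides or from Vertica.

```python
from zlib import crc32


def node_for(key, node_count):
    """Pick a node by hashing the key (CRC32 is stable across runs)."""
    return crc32(key.encode("utf-8")) % node_count


def distribute(rows, node_count):
    """Give each node its own private slice of the rows (shared-nothing)."""
    nodes = [[] for _ in range(node_count)]
    for symbol, price in rows:
        nodes[node_for(symbol, node_count)].append((symbol, price))
    return nodes


def local_partial(node_rows):
    """Each node computes (sum, count) over its local rows only."""
    prices = [price for _, price in node_rows]
    return sum(prices), len(prices)


def global_average(partials):
    """The initiator merges the small partial results into the final answer."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    return total / count


rows = [("AAPL", 143.74), ("AAPL", 143.75), ("BBY", 37.03), ("BBY", 37.13)]
nodes = distribute(rows, node_count=3)
average = global_average([local_partial(n) for n in nodes])
```

Only the `(sum, count)` pairs cross the network, which is why adding nodes scales the scan work without moving the raw rows.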

Page 15:

Platform agnostic …

The data storage decisions you make today won’t impact your ability to execute in the future

Access one unified analytics engine and license across all infrastructure choices

SQL Database + Query Engine + Analytics & ML

Choose your Deployment, Choose your Consumption, Choose your Cloud

On Prem, Hybrid Cloud, Hardware Agnostic, Compute, Storage

Page 16:

… including Eon

First generation: using the cloud as a data center (IaaS)
Amazon, Microsoft Azure, Google Cloud

Second generation: separation of compute and storage
Amazon S3

Page 17:

Cultural change

Today, ‘Right Time’ is… (Near) Real Time…

NOW

• 6K concurrent analysts
• RT distance price discrimination
• Car testing
• I don’t need NRT / I do need RT

Page 18:

Analytic maturity journey: what leaders do better

1. Ultra-fast ad-hoc dashboards & analytics
2. TCO-effective E-DWH & reporting
3. Easy enterprise-scale predictive analytics
4. Automated, complex predictive analytics

Communicate simply

Build strong capabilities

Process & Technology Metrics

Source: McKinsey, “The need to lead in data and analytics”

Page 19:

What if? Less Guess work!

“When we did the first queries, they were done so fast, we thought they were broken.”
- Michael Relich, Guess

• 1 hr → 3.6 sec
• 8 hrs (overnight) → < 30 sec

• Better sales tracking and customer service in stores.
• Improved merchandise allocation and distribution across locations.

Page 20:

Take the GUESS work out: what? why? what now?

Page 21:

Customer 360 you want to «pay for»: self-service, «ad-hoc», ultra-fast

Catch Media’s cloud-based B2B Analytics & Engagement Intelligence Platform enables content owners and distributors to keep their consumers satisfied and loyal by understanding consumer behavior and acting upon it at the right time

Mission: The Right Analytics

Powered by

Page 22:

Analytics & in-DB Machine Learning Process Flow

DS cycle @ columnar MPP speed, with SQL at every stage

• Business Understanding (Machine Learning): Speed, ANSI SQL, Scalability, Massively Parallel Processing, Deploy Anywhere
• Data Analysis & Understanding: Statistical Summary, Time Series, Sessionize, Pattern Matching, Date/Time Algebra, Window/Partition, Date Type Handling, Sequences, and more…
• Data Preparation: Outlier Detection, Normalization, Imbalanced Data Processing, Sampling, Missing Value Imputation, and more…
• Modeling: Support Vector Machines, Random Forests, Logistic Regression, Linear Regression, Ridge Regression, Naive Bayes, Cross Validation, and more…
• Evaluation: Model-level Stats, ROC Tables, Error Rate, Lift Table, Confusion Matrix, R-Squared, MSE
• Deployment: In-Database Scoring, Speed, Scale, Security

80% / 20%
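Several of the Evaluation items named on this slide have short textbook definitions. The sketch below spells out four of them in plain Python. It assumes binary labels encoded as 0/1 for the classification metrics, and it is not the in-database implementation, which the slides do not show.

```python
def confusion_matrix(actual, predicted):
    """Return (tp, fp, fn, tn) for binary labels encoded as 0/1."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn


def error_rate(actual, predicted):
    """Share of predictions that differ from the actual label."""
    wrong = sum(1 for a, p in zip(actual, predicted) if a != p)
    return wrong / len(actual)


def mse(actual, predicted):
    """Mean squared error of a regression model."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 minus residual sum of squares over total sum of squares."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical labels and predictions, for illustration only.
labels, guesses = [1, 0, 1, 1, 0, 0], [1, 0, 0, 1, 1, 0]
y, y_hat = [1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]
```

On the sample data, two of the six classifications are wrong, and the single regression miss of 1.0 gives an MSE of 0.25.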

Page 23:

Leveraging Open Source … at scale

Building Predictive Analytics into the Core of Vertica

Page 24:

Simpler Data Science at MPP speed & scale

Style Prediction?

Page 25:

Predictive Analytics: what will?

Personalized Mobile Messages

Online Personalized Recommendation

Page 26:

Style Prediction Automation

Dialogue with data: SQL

Page 27:

Prepare your next IoT evolution

Speed & Integration: Predictive Maintenance
«We calculated 17 different statistical functions on 2 billion data points in less than a minute, which is faster than our previous system or any other system I’m aware of would have taken just to retrieve the data»
https://youtu.be/IZkkoy5ZT1M

TCO: Predictive Maintenance
«A significant reduction in the operational cost (351% ROI)»
https://youtu.be/QZ5vWqblVXU

Easy to do: Wearables for B2C
«The agility of Vertica is core for … a non-IT organization like Suunto»
https://youtu.be/BTIee0tYq9E

Page 28:

IoT Predictive Maintenance: listen to data-driven leaders

Philips Aims for Zero Unplanned Downtime with Predictive Analytics

Featuring: Dr. Mauro Barbieri, Senior Scientist, Philips Research

Vision, Organization, Technology

https://www.brighttalk.com/webcast/8913/351928

Page 29:

Time to Results

“First of all, it’s a very fast database, actually the fastest I’ve been working with … and the speed is not only in querying the data, it is also in loading the data. You can answer complex queries on hundreds of billions of rows …

However, there is one more aspect: the learning curve for the development organization and the consumers of the data. Time to market and development cost are extremely important, especially in this domain. If you want to develop new features fast, which means making new predictive models and finding out which ones work and which do not, you need to be able to load, process and integrate data fast, to make datasets available at higher speed … and that’s what Vertica allowed us to do.

… And on top of it, it supports standard SQL, it can be interfaced to everything and deployed everywhere. All engineers know SQL to some degree, so there is a very low threshold for people to start using the data, and when they start using the data, they realize the value and they are happy to contribute, and that’s the value of Vertica.”

Dr. Mauro Barbieri, Senior Scientist, Philips Research

Page 30:

Cross-functional, accurate, agile & transparent

• Thousands of live MRIs
• Trillions of data points
• NRT dashboarding
• Predictive models

Page 31:

Vertica-based IoT reference architecture

Rental charges

Power consumption

Machine sensors

Access logs

Optional Spark/Storm inclusion - convert all currencies to $

Geospatial to track usage and failures by location

Machine learning to categorize, classify, and predict

Multidimensional to aggregate by dimensions

Real-time dashboards

To Ops as parts replacement recommendations

To Finance as lease buyout recommendations

Time Series to interpolate missing values. Event Series Joins to blend feeds.

Log Text Analytics and Pattern Match to understand errors

Join with data in several other data ponds in many formats

Many live streams in many formats

CEF
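The "Time Series to interpolate missing values" step in this architecture can be sketched as gap filling onto a regular grid. The Python example below uses linear interpolation and assumes that readings are sorted and that their timestamps fall on multiples of the step. The function name and sample data are hypothetical, and the slides do not show how the product itself expresses this.

```python
def fill_gaps(readings, step):
    """Linearly interpolate readings onto a regular grid of `step` seconds.

    `readings` is a list of (timestamp_seconds, value) sorted by timestamp.
    """
    filled = []
    for (t0, v0), (t1, v1) in zip(readings, readings[1:]):
        t = t0
        while t < t1:
            # Position of t between the two surrounding readings, 0.0 to 1.0.
            fraction = (t - t0) / (t1 - t0)
            filled.append((t, v0 + fraction * (v1 - v0)))
            t += step
    filled.append(readings[-1])
    return filled


# Hypothetical sensor feed with readings missing at t=20 and t=30.
sensor = [(0, 10.0), (10, 12.0), (40, 18.0)]
regular = fill_gaps(sensor, step=10)
```

The two missing points are filled with values on the straight line between the readings at t=10 and t=40, roughly 14.0 and 16.0.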

Page 32:

Flex Zone

Semi-structured data, Flex Table, instant view with Vertica, BI executive dashboards

Structure on demand or “schema on need”. Mitigate the volatility and variety of machine data

Avro and CSV are now added to the growing list of open-source parsers
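"Schema on need" can be illustrated with a small Python sketch: records are loaded as key/value maps without declaring columns, and a typed view is projected only when a question requires it. This is a conceptual analogy with hypothetical data and function names, not the Flex Table implementation.

```python
import json

# Hypothetical machine events whose fields vary from record to record.
RAW_EVENTS = [
    '{"device": "pump-1", "temp": 71.5}',
    '{"device": "pump-2", "temp": 69.0, "vibration": 0.02}',
    '{"device": "door-7", "event": "open"}',
]


def load_flexible(lines):
    """Keep each record as a key/value map; no schema is declared up front."""
    return [json.loads(line) for line in lines]


def discover_keys(records):
    """List every key seen so far, in first-seen order."""
    keys = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)
    return keys


def materialize(records, columns):
    """Project only the requested columns; absent keys become None."""
    return [tuple(record.get(column) for column in columns)
            for record in records]


records = load_flexible(RAW_EVENTS)
keys = discover_keys(records)
view = materialize(records, ["device", "temp"])
```

New fields such as `vibration` show up in `keys` without any reload, and a record that lacks a requested column simply yields `None` in the view.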

Page 33:

Vertica Unified Analytics follows your maturity journey

Use cases: Fast ad-hoc dashboards & analytics; TCO-effective E-DWH & reporting; NRT data-driven custom apps; Enterprise-scale predictive analytics; Automated, complex predictive analytics; RT analytics

Data sources: OLTP transactional data, Ext DWH, geospatial, sensors, events, probes, logs & text

Ingestion: batch + micro-batch + stream; traditional ETL; Vertica copy; Kafka message bus; Vertica Kafka Connector (speed); real-time CEP connector

Vertica ingestion formats:
• JSON
• DELIMITED
• PAIR DELIMITED
• AVRO
• CSV
• CEF
• REGEX & SDK

Storage and processing: HDFS (months / years); Vertica HOT LAYER (columnar / MPP / compression); Vertica Flex Table; VSQL-on-Hadoop; Vertica in-DB ML; UDx (C++, Java)

Data access / export: JDBC/ODBC, REST API, Web Services

Data visualization & mining layer: off-the-shelf tools (Logi, Power BI, MicroStrategy, Tableau, QlikView), data science collab tools

Page 34:

The Lastminute speech: simplicity!

• Data Science
• BI and DWH
• Clickstream Analytics
• Campaign Management
• CRM Optimization

«Vertica stands out for the ease with which it integrated into the existing BI ecosystem … and for the efficient horizontal scalability with which it handles the growth of data and analytics at lastminute.com group … because it is considered to have a more effective pricing model for the organization’s objectives»

CIO, Lastminute Group

Page 35:

Try it free (up to 1 TB)!

FREE Vertica Community Edition: www.vertica.com/try

Page 36:

Takeaways: accelerate “data-driven” transformation

Take advantage of advances in tools for the modern data pipeline
• Deploy a fast core “unified” analytics engine to leverage what is already in place and to scale up easily across the enterprise
• Embrace “governed” self-service analytics with more raw and ad-hoc data and the explosion of external data
• Employ machine learning and automation at scale with enterprise-wide simplicity, accuracy & org transparency

+

Mobilize the organization
• Democratize data access
• Focus on 1 to 2 areas in the organization with defined use cases
• Change workflows and extend skills to leverage automated analytics
• Launch a cultural transformation through training, competitions, and communications

= Big impact from simpler enterprise-wide analytics

Source: adapted from McKinsey “Getting big impact from big data”

Page 37:

Data Analytics without limits

See more at: vertica.com