1-3 keller pivotal apps & data @ dell emc forum vienna...
GLOBAL SPONSORS
Creating Customer Value with Cloud Native Apps & Analytics
© Copyright 2017 Pivotal Software, Inc. All rights Reserved.
September 2017
Martin [email protected]
DIGITALIZATION CHANGES EVERYTHING.
WHEN WILL POLITICS CHANGE?
Delivering information in context..
..personalized..
..in real-time
“Companies need to learn how to catch people or things in the act of doing something and affect the outcome”
Great software companies leverage analytics and insights – how do they accomplish that?
Open Source Innovation
Parallel Processing
Cloud Native Continuous Delivery
Loosely-coupled Microservices
Data Science and Machine Learning
Smart Data Driven Apps: Logistics
Pivotal 2017
Important Capabilities
• Ability to store and integrate volumes of data from multiple sources
• Moving beyond basic business intelligence and reporting to more sophisticated data science and predictive modeling techniques
• System must deliver insights about likely next actions in a form advisors can consume and act on
• Results of these actions must be fed back into the system to continually improve the predictive models
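The feedback loop in the last bullet can be sketched as online learning: the observed outcome of each recommended action becomes a new training example. This is a minimal sketch, assuming a small logistic model updated by stochastic gradient steps; the class name `NextActionModel`, the two features, and the learning rate are hypothetical and not taken from any Pivotal product.

```python
import math


class NextActionModel:
    """Tiny online logistic regression (hypothetical illustration)."""

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        """Return the estimated probability that the action succeeds."""
        z = self.bias + sum(w * x for w, x in zip(self.weights, features))
        return 1.0 / (1.0 + math.exp(-z))

    def feed_back(self, features, outcome):
        """Update the model with an observed outcome (1 = success, 0 = failure)."""
        error = outcome - self.predict(features)
        self.bias += self.learning_rate * error
        self.weights = [
            w + self.learning_rate * error * x
            for w, x in zip(self.weights, features)
        ]


model = NextActionModel(n_features=2)
before = model.predict([1.0, 0.0])  # untrained model: no preference yet

# Assumed feedback: actions with the first feature set tend to succeed,
# actions with the second feature set tend to fail.
for _ in range(200):
    model.feed_back([1.0, 0.0], outcome=1)
    model.feed_back([0.0, 1.0], outcome=0)

after = model.predict([1.0, 0.0])  # moves toward 1 as feedback accumulates
```

Each call to `feed_back` nudges the model toward the observed result, so predictions improve as more outcomes are recorded.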
Data Architecture Pivotal Inc.
DATA FEEDS
DATA SOURCES
ANALYTIC APPS
Fast Ingest / Pipelining
Pipelines to consume streaming and batch data from various endpoints
Raw Data Landing Zone
Distributed Memory-based Processing
Realtime Data Insights
Statistical Tools
Expert System
Machine Learning
Advanced Analytics / MPP
Hadoop Data Lakes
Massively Parallel Architecture
Public Cloud Data Lakes
Predefined Libraries
Programmatic
GPText
Parallel Configurable Data Load
High Speed Ingestion
Analytical Data to cache
In-Memory Data Grid
Parallel Data Load and External Tables
Pivotal Data Suite
In-DB Predictive Analytics
Data Temperature: Cold, Warm, Hot
PIVOTAL GEMFIRE
PIVOTAL GREENPLUM
(Data Warehouse)
Pivotal Data Suite
PIVOTAL GREENPLUM
Data warehouse database based on open source Greenplum Database
PIVOTAL GEMFIRE
Open source application and transaction data grid based on Apache Geode
Open source data management portfolio
Complete Platform
Mission Critical
Deployment Options
Open Source
Flexible Licensing
Advanced Data Analytics
Complete platform
Based on open source
Deployment options
Hadoop native SQL
Flexible licensing
Advanced data services
Pivotal Data Suite
Open data management portfolio
OSS Support Spring XD & Spring Cloud Data Flow OSS Support PostgreSQL
Playground: Connected Cars
Example application
Connected Car Demo (YouTube link)
CONNECTED CAR
PREDICT THE DESTINATION
PREDICT THE RANGE
Real-time car telematics
• Driving data from the in-car OBD2 port
• In-depth view on driving
• Framework to train models on batch data and use them for real-time prediction
• Predict journey destination and fuel consumption
• App built in collaboration with Pivotal Labs
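The "train on batch data, predict in real time" idea from the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a plain least-squares line relating speed to fuel rate, the function names are hypothetical, and all numbers are synthetic values made up for the example, not data from the demo.

```python
def train_batch(samples):
    """Fit fuel_rate = slope * speed + intercept by ordinary least squares."""
    n = len(samples)
    mean_x = sum(x for x, _ in samples) / n
    mean_y = sum(y for _, y in samples) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in samples)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in samples)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def predict_stream(model, readings):
    """Score each reading as it arrives (a generator stands in for a stream)."""
    slope, intercept = model
    for speed in readings:
        yield slope * speed + intercept


# Synthetic batch data: (speed in km/h, fuel rate in l/h), made up for illustration.
history = [(30.0, 2.0), (60.0, 3.5), (90.0, 5.0), (120.0, 6.5)]
model = train_batch(history)

# Synthetic "live" speeds standing in for OBD2 readings.
live_predictions = list(predict_stream(model, [50.0, 100.0]))
```

The training step runs once over historical data, while `predict_stream` only applies the fitted parameters, which keeps the per-reading cost small enough for real-time use.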
Roads, Cars: 1 to many ✓
Pivotal Offerings
“Companies need to learn how to catch people or things in the act of doing something and affect the outcome”
Data Suite:
• Spring Cloud Data Flow - open source data management
• GemFire: In-Memory Data Grid
• Greenplum: Data Warehouse
Pivotal Cloud Foundry (PCF)
• Industry’s Leading Cloud-Native Platform
Pivotal Container Service (PKS)
• Production-Grade Kubernetes
Spring Boot, Cloud and Data Flow
• Modern Java microservices framework
Pivotal Labs & Data Science
• Build a smart app end-to-end
• Focus on a specific analytical model / data-microservice
Pivotal Cloud Cache Explained
In-memory caching as an on-demand, managed service on PCF
Pivotal Cloud Cache: In-Memory Performance
In-memory performance with cloud-native scalability and availability
Horizontally scalable architecture
High volume transactions
Blazing fast reads
10-100x faster than disk
High Availability
Across application and caching layers
In-memory cloud-native cache
Microservices Need Performance and Scalability
Microservices with large, frequently accessed data sets need a cache layer
Performance and scalability of data
● Add servers to a shared Pivotal Cloud Cache cluster
● Reduces the pressure to scale rigid backing stores
● Enables availability and resilience
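One common way to put a cache layer in front of a rigid backing store is the look-aside pattern: read from the cache first and go to the store only on a miss. The sketch below is a generic illustration, assuming an in-process dictionary as a stand-in for the cache cluster; `LookAsideCache` and `slow_store_lookup` are hypothetical names and not part of the Pivotal Cloud Cache API.

```python
class LookAsideCache:
    """Look-aside caching in front of a slow backing store (illustrative)."""

    def __init__(self, load_from_store):
        self._load_from_store = load_from_store  # called only on a cache miss
        self._entries = {}  # stand-in for a region in a cache cluster
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._entries:
            self.hits += 1
            return self._entries[key]
        self.misses += 1
        value = self._load_from_store(key)
        self._entries[key] = value  # populate so the next read is a hit
        return value

    def invalidate(self, key):
        """Drop an entry, for example after the backing store was updated."""
        self._entries.pop(key, None)


store_calls = []  # records how often the backing store is actually hit


def slow_store_lookup(customer_id):
    """Hypothetical stand-in for a query against the backing store."""
    store_calls.append(customer_id)
    return {"id": customer_id, "name": f"customer-{customer_id}"}


cache = LookAsideCache(slow_store_lookup)
first = cache.get(42)   # miss: loaded from the backing store
second = cache.get(42)  # hit: served from memory
```

Repeated reads of the same key reach the backing store only once, which is where the reduced pressure on the store comes from.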
App Instance 1
Prepackaged for Simple Consumption
• Easy accessibility through Marketplace
• Instant Provisioning
• Bind to apps through easy to use interface
• Common access control and audit trails across services
MySQL
New Relic
Single Sign-On
RabbitMQ
Config Server
Service Directory
Circuit Breaker
Signal Sciences
Crunchy PostgreSQL
AND MORE
Services Marketplace
Pivotal Cloud Cache
Redis
Developers get self-service access to Pivotal Cloud Cache on Pivotal Cloud Foundry
Rio / São Paulo
Web Application
GemFire Cluster
Oracle RAC Mainframe
Web Application
GemFire Cluster
Oracle RAC Mainframe
WAN Data Sync
GemFire
Distributed, in-memory NoSQL data grid for big data apps that need:
• Scale-out performance
• Consistent database operations across globally distributed nodes
• High availability, resilience, and global scale
• Powerful & standards-based developer features
• Easy administration of distributed nodes
Based on Apache Geode (incubating)
Pivotal GemFire – Usage By the Numbers
China Railways
• 5,700 train stations
• 4.6 million tickets per day
• 20 million daily users
• 3TB operational data in-memory
• 40,000 visits per second
• >1,500,000,000 hits per day
Indian Railways
• 7,000 stations
• 23 million passengers daily
• 120,000 concurrent users
• 10,000 transactions per minute
• >1,200,000,000 hits per day
World: ~7,349,000,000
~37% of the world population
Pivotal Greenplum
Run Anywhere, Mature, OSS, Analytical MPP
AN OPEN SOURCE DATA WAREHOUSE
BATTLE TESTED IN PRODUCTION
BUILT FOR DIVERSE ANALYTICAL USE CASES
AVAILABLE ANYWHERE YOU NEED IT
WHAT IS GREENPLUM?
The Pivotal Greenplum Database is…
A Highly-Scalable, Shared-Nothing Database
• Leading MPP architecture, including a patented next-generation optimizer
• Optimized architecture and features for loading and queries
• Start small, scale as needed
• Polymorphic storage, compression, partitioning
A Platform for Advanced Analytics on Any (and All) Data
• Rich ecosystem (SAS, R, BI & ETL tools)
• In-DB Analytics (MADlib, custom, languages: R, Java, Python, Perl, C, C++)
• High degree of SQL completeness so analysts can use a language they know
• Domain: Geospatial, Text processing (GPText)
An Enterprise Ready Platform Capable of Flexing With Your Needs
• Available as needed – either as an appliance or software
• Secures data in-place, in flight, and with authentication to suit
• Capable of managing a variety of mixed workloads
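The shared-nothing MPP idea can be illustrated outside the database: rows are spread across segments by hashing a distribution key, each segment aggregates only its own rows, and the partial results are combined at the end. This is a minimal sketch under stated assumptions; the CRC32 hash, the function names, and the sample rows are illustrative choices and do not reproduce Greenplum's actual hashing or planner behavior.

```python
import zlib
from collections import defaultdict


def segment_for(key, n_segments):
    """Pick a segment from a stable hash of the distribution key."""
    return zlib.crc32(str(key).encode("utf-8")) % n_segments


def distribute(rows, key, n_segments):
    """Place every row on exactly one segment; segments share no data."""
    segments = [[] for _ in range(n_segments)]
    for row in rows:
        segments[segment_for(row[key], n_segments)].append(row)
    return segments


def local_sum(segment_rows, group_col, value_col):
    """Partial aggregate computed by one segment over its own rows only."""
    partial = defaultdict(float)
    for row in segment_rows:
        partial[row[group_col]] += row[value_col]
    return partial


def parallel_sum(segments, group_col, value_col):
    """Combine per-segment partials (run sequentially here for simplicity)."""
    total = defaultdict(float)
    for segment_rows in segments:
        for group, subtotal in local_sum(segment_rows, group_col, value_col).items():
            total[group] += subtotal
    return dict(total)


# Made-up sample rows for illustration.
rows = [
    {"order_id": i, "region": "north" if i % 2 else "south", "amount": float(i)}
    for i in range(1, 9)
]
segments = distribute(rows, key="order_id", n_segments=4)
totals = parallel_sum(segments, group_col="region", value_col="amount")
```

Because every row lives on exactly one segment, the combined result does not depend on how many segments are used, which is what allows such a system to start small and scale out.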
Functions
Linear Systems
• Sparse and Dense Solvers
• Linear Algebra
Matrix Factorization
• Singular Value Decomposition (SVD)
• Low Rank
Generalized Linear Models
• Linear Regression
• Logistic Regression
• Multinomial Logistic Regression
• Ordinal Regression
• Cox Proportional Hazards Regression
• Elastic Net Regularization
• Robust Variance (Huber-White), Clustered Variance, Marginal Effects
Other Machine Learning Algorithms
• Principal Component Analysis (PCA)
• Association Rules (Apriori)
• Topic Modeling (Parallel LDA)
• Decision Trees
• Random Forest
• Conditional Random Field (CRF)
• Clustering (K-means)
• Cross Validation
• Naïve Bayes
• Support Vector Machines (SVM)
• Prediction Metrics
• K-Nearest Neighbors
Descriptive Statistics
Sketch-Based Estimators
• CountMin (Cormode-Muthukrishnan)
• FM (Flajolet-Martin)
• MFV (Most Frequent Values)
Correlation and Covariance
Summary
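To show what a sketch-based estimator does, here is a small Count-Min sketch: it counts item frequencies in fixed memory and may overestimate but never underestimates. This is a standalone illustration and not the in-database implementation; the class name, table sizes, and the SHA-256 based hashing are assumptions chosen for clarity.

```python
import hashlib


class CountMinSketch:
    """Approximate frequency counter using depth x width integer cells."""

    def __init__(self, width=64, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _columns(self, item):
        """Yield one (row, column) cell per row, using a row-salted hash."""
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode("utf-8")).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item, count=1):
        for row, col in self._columns(item):
            self.table[row][col] += count

    def estimate(self, item):
        """Minimum over rows: collisions can only inflate a cell, never shrink it."""
        return min(self.table[row][col] for row, col in self._columns(item))


sketch = CountMinSketch()
for _ in range(5):
    sketch.add("vienna")
for _ in range(2):
    sketch.add("rio")
```

Taking the minimum across rows limits the effect of hash collisions, and the memory used stays at `width * depth` counters no matter how many distinct items are added.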
Utility Modules
• Array and Matrix Operations
• Sparse Vectors
• Random Sampling
• Probability Functions
• Data Preparation
• PMML Export
• Conjugate Gradient
• Stemming
• Sessionization
• Pivot
Inferential Statistics
• Hypothesis Tests
Time Series
• ARIMA
Jan 2017
Path Functions
• Operations on Pattern Matches
Graph
• Single Source Shortest Path
• Page Rank
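As an illustration of the graph functions listed above, PageRank can be computed by repeated redistribution of rank along links (power iteration). This is a minimal sketch, assuming a damping factor of 0.85 and a fixed number of iterations; it is a plain in-memory version written for this example, and the function name and the three-node graph are hypothetical.

```python
def page_rank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping node -> list of targets."""
    nodes = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[node] for node in nodes if not links.get(node))
        new_rank = {
            node: (1.0 - damping) / n + damping * dangling / n for node in nodes
        }
        for source, targets in links.items():
            if targets:
                share = damping * rank[source] / len(targets)
                for target in targets:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Tiny made-up graph: a links to b and c, b links to c, c links back to a.
ranks = page_rank({"a": ["b", "c"], "b": ["c"], "c": ["a"]})
```

In this small graph node `c` receives links from both other nodes and ends up with the highest rank, while the ranks always sum to 1.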
Procedural Languages
• User Defined Types
• User Defined Functions
• User Defined Aggregates
• Import of libraries from open source
Greenplum Geospatial Big Data
Current Key Features:
• Points, Lines, Polygons, Perimeter, Area, Intersection, Contains, Distance, Long/Lat
Spatial Indexes & Bounding Boxes
Round earth calculations
Raster Support
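Round earth calculations measure distance along the surface of a sphere rather than on a flat plane. The sketch below uses the haversine great-circle formula, assuming a spherical Earth with a mean radius of 6371 km; it is a standalone illustration rather than Greenplum's geospatial implementation, and the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (latitude, longitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# A quarter of the equator: from (0, 0) to (0, 90 degrees east).
quarter_equator = haversine_km(0.0, 0.0, 0.0, 90.0)
```

On this model a quarter of the equator equals `EARTH_RADIUS_KM * pi / 2`, which gives a quick sanity check for the formula.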
Integrated Text Analytics
GPText: SQL Warehousing + Text
• Leveraging Apache Solr and GPDB
• 5 years commercial production experience
• MADlib integration for machine learning on text data
• PL/Python and PL/Java integration for Natural Language Processing
Use Cases
• Communications compliance and monitoring
• Customer Sentiment analysis
• Document Search and Query
• Social Media Processing, etc.
Questions? THANK YOU!
Backup
Greenplum Hadoop & Cloud Connectors
Data Temperature: Warm, Hot
Operational Analytics & SQL
Data Lake & Cold Storage
SLA Driven & Iterative
Parallel High Speed SQL Transfer
Warm, Cold
Public & Private Data Lakes
Batch & AdHoc
GemFire Greenplum Connector (GGC)
Data Temperature: Warm, Hot
Custom Apps
App 1
App 2
Data science, analytics & ML
Transactional
Native API
REST / HTTP
Analytical
ANSI SQL
Push Updates
Parallel Configurable Data Load
Transactional data
Write behind
Analytical Data to cache
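The "Write behind" label describes a pattern in which writes complete against the in-memory store first and are forwarded to the analytical store later in batches. The following is a generic sketch of that pattern, assuming a simple size-triggered flush; `WriteBehindCache`, its parameters, and the list standing in for the warehouse are hypothetical and do not represent the connector's API.

```python
from collections import deque


class WriteBehindCache:
    """In-memory store that forwards writes to a slower store in batches."""

    def __init__(self, flush_batch, batch_size=3):
        self._data = {}
        self._pending = deque()  # writes not yet sent downstream
        self._flush_batch = flush_batch
        self._batch_size = batch_size

    def put(self, key, value):
        self._data[key] = value  # the in-memory write completes immediately
        self._pending.append((key, value))
        if len(self._pending) >= self._batch_size:
            self.flush()

    def get(self, key):
        return self._data[key]

    def flush(self):
        """Send all pending writes downstream as one batch."""
        if not self._pending:
            return
        batch = list(self._pending)
        self._pending.clear()
        self._flush_batch(batch)


warehouse_batches = []  # stand-in for the analytical store
cache = WriteBehindCache(warehouse_batches.append, batch_size=2)
cache.put("trip-1", 12.5)
cache.put("trip-2", 7.0)   # second write triggers a batch flush
cache.put("trip-3", 3.2)   # stays pending until the next flush
```

Readers see new values immediately, while the downstream store receives fewer, larger loads; a production setup would also need to handle failures between the write and the flush.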