from spark to ignition: fueling your business on real-time analytics
TRANSCRIPT
Fueling Your Business on Real-Time Analytics
Eric Frenkiel, MemSQL CEO
March 30, 2015 • Las Vegas, NV
From Spark to Ignition:
What’s in Store For This Presentation?
1. MemSQL:
A real-time database for transactions and analytics
2. Spark Use Cases
3. Example: Geospatial Enhancements
MemSQL Snapshot
Experienced Leadership
• Microsoft, Facebook, Oracle, Fusion-io
Inspired by Enterprise architecture gap
A real-time database for transactionsand analytics
• In-memory, distributed, SQL
Broad customer adoption across verticals
Top tier investors
Four Ways Your DBMS is Holding You Back
ETL (Extract, Transform, Load)
Analytic Latency
Synchronization
Copies of data
Source: Gartner Hybrid/Transactional/Analytical Processing Will Foster Opportunities for Dramatic Business Innovation
The Real-Time Database for Transactions and Analytics
Highest Value
Hot DataHigh Value
Cold Data
Analytics
Tra
nsa
ction
s
Data Loading and Queries
Aggregator Nodes
AvailabilityGroup 1
AvailabilityGroup 2
Cluster
Gartner Identifies Emerging Category:
HTAP (Hybrid Transactional/Analytical Processing)
“HTAP will enable business
leaders to perform…much
more advanced and
sophisticated real-time
analysis of their business
data than with traditional
architectures.”
Download at: memsql.com/gartner
Simple
Standard SQL
Transactions and analytics in one database
Behind the firewall or on the cloud
Flexible integrations (Hadoop, Spark, SQL)
Fast
Extremely low-latency queries
Massive parallel transaction capacity
Lock-free, shared-nothing architecture
Scalable
Scales out on cloud and commodity hardware
Deploys to thousands of machines
True linear scaling
Spark Data Processing Framework
Intuitive, concise, and expressive operations needed for analytics
Spark
SQL
Spark
Streaming
Mllib
(machine
learning)
GraphX
(graph)
Apache Spark
MemSQL and Spark Use Cases
Operationalize models built in Spark
Stream and event processing
Live dashboards and automated reports
Extend MemSQL analytics
Operationalize Models Built in Spark
Process in Spark, persist to MemSQL
Go to production and iterate faster
Enterprise
ConsumptionData into Spark
Model Creation Model Persistence
Stream and Event Processing
Structure event data on the fly
Pass to MemSQL for persistent, queryable format
Enterprise
Consumption
Real-time
Streaming Data
Data TransformationPersistent,
Queryable Format
Enterprise
Consumption
Events
KafkaApp
Singer Secor
Spark, MemSQL, Application
Real-Time Analytics at Pinterest
Higher performance event logging
Reliable log transport and storage
Faster query execution on real-time data
Extend MemSQL Analytics
The freshest data for analysis in Spark
Load from MemSQL to Spark and write results on return
Access to Live
Production DataReal-time Replica
Applications,
Data Streams
Interactive
Analytics,
Machine Learning
Live Dashboards and Automated Reports
Serve live dashboards from MemSQL
Run custom reports on live data with Spark
Live
Dashboards
Custom
Reporting
Access to Live
Production Data
SQL Transactions
and Analytics
Geospatial Challenge
Commercial applications now geo-enabled
Location is everywhere
Lots of insight possible
Traditionally geo is processed separately
Real need for integrated geospatial at scale
MemSQL Geospatial
BILLIONS of objects
Sub-second latency
Geo data is first-class citizen
Geo + Simplicity + Speed + Scale
Real-Time Geospatial Location Intelligence
Sample from 170 million taxi trips
Real-time ingest
Concurrent queries in fractions of a second
Unlimited number of geographic views
Simple queries while simultaneously ingesting data