imc summit 2016 breakout - amit golander - the benefits of memory and storage convergence to...
TRANSCRIPT
IMC BENEFITS FROM
MEMORY & STORAGE CONVERGENCE
DR. AMIT GOLANDER PLEXISTOR, CTO
ABSTRACT
In-memory compute gave up on Storage and moved the working set to Memory.
This brings tremendous performance gains, but also:
1. Consumes expensive DRAM resources
2. Puts data at risk
3. Suffers from slow recovery time when power failures occur
…
The big Question:
What will IMC look like when Memory and Storage converge?
[Figure labels: Data set, Working set]
Agenda:
History & The convergence of Memory & Storage
Benefits – Out-of-the-box
Benefits – That require some work
A LONG TIME AGO…
Requirements for Ideal Storage:
1. Low latency reads
2. High volume persistent writes
3. Reasonable cost
4. Transparent & easy to use
[Figure: DRAM, HDD, SSD and Ideal Storage compared on Cost, Latency and Persistency; box label: Big Data Middleware]
Unfortunately such Storage (#2) did not exist
SO MIDDLEWARE DEVELOPERS & USERS COMPROMISED
[Figure: Commit Log, Memory Table, Storage Table, Search acceleration; annotations: Persistent, Pretty Fast, Cheap, Fast]
1. Storage had horrible latency for persistent writes,
but not as bad if written sequentially
2. So IMC middleware compensated by using:
- Sequential writes at the expense of read latency
- Async writes at the risk of data loss
- Caching like crazy at the expense of HW cost (DRAM)
- Write amplification at the expense of HW cost (Storage)
- Compaction at the expense of HW cost (CPU)
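The trade-off between sequential, durable writes and async writes can be sketched in a few lines. This is a minimal illustration, not how any particular middleware implements its commit log: the `CommitLog` class, the length-prefixed record format and the file name are all assumptions made for the example. With `durable=True` every append waits for `fsync`; with `durable=False` the record may still be in the page cache when power is lost.

```python
import os
import tempfile


class CommitLog:
    """Append-only log (hypothetical helper, for illustration only)."""

    def __init__(self, path, durable):
        # O_APPEND makes every write land at the end of the file,
        # which keeps the I/O pattern sequential.
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
        self.durable = durable

    def append(self, record: bytes):
        # Assumed record format: 4-byte big-endian length, then the payload.
        os.write(self.fd, len(record).to_bytes(4, "big") + record)
        if self.durable:
            # Block until the OS reports the data is on stable storage.
            os.fsync(self.fd)
        # Without fsync the write is "async" from a durability point of
        # view: faster, but the record can be lost on a power failure.

    def close(self):
        os.close(self.fd)


def read_records(path):
    """Replay the log from the start, as a recovery pass would."""
    records = []
    with open(path, "rb") as f:
        while True:
            header = f.read(4)
            if len(header) < 4:
                break
            records.append(f.read(int.from_bytes(header, "big")))
    return records


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "commit.log")
        log = CommitLog(path, durable=True)
        for i in range(3):
            log.append(f"put key{i}".encode())
        log.close()
        return read_records(path)
```

Switching `durable` to `False` removes the per-record `fsync`, which is the "async writes at the risk of data loss" compromise listed above.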
Original requirements Vs. IMC reality:
1. Low latency reads
2. High volume (eventual) persistent writes
3. Reasonable cost
4. Transparent & easy to use
WHAT HAS CHANGED?
Memory & Storage are converging:
New HW - Persistent Memory (PM, e.g. NVDIMM-N)
New SW - Software Defined Memory (SDM)
PM+SDM delivers:
1. Low latency reads
2. High volume persistent writes
3. Reasonable cost
4. Transparent & easy to use
[Figure: DRAM, HDD, SSD, PM and SDM compared on Cost, Latency and Persistency]
SDM-ephemeral delivers:
1. Low latency reads
2. High volume persistent* writes
3. Reasonable cost
4. Transparent & easy to use**
* Persistent on orderly shutdowns, not power failures
** Easy to use within shared-nothing architectures
[Figure: DRAM, HDD, SSD and SDM-ephemeral compared on Cost, Latency and Persistency]
HOW TO LEVERAGE SDM?
Scenario I: Existing Middleware – Out of the box
Scenario II: New Middleware / Some work to existing
[Figure: diagram with SDM blocks for each scenario]
OUT OF THE BOX INTEGRATION
[Figure: Plexistor SDM stack on Linux. Interfaces: Virtual Memory, HDFS, POSIX. Layers: Data Services, Plexistor FS (Multi Tier, DAX). Paths: I/O Path, Memory Path. Roles: Fast Storage, Huge Memory. Media: DRAM/PM, FLASH, DISK]
1. Download & Install SDM
2. Mount m1fs
3. Run your application
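To make "run your application" concrete, the sketch below shows the kind of unmodified code that benefits: an ordinary file on the mounted file system, updated in place through `mmap` (the memory path) instead of `read()`/`write()`. The mount command itself is not shown on the slide and is not guessed here. `mount_point`, `table.dat` and the helper names are hypothetical, and a temporary directory stands in for the mount point so the example runs anywhere.

```python
import mmap
import os
import tempfile


def update_in_place(path, offset, payload: bytes):
    """Modify a file through a memory mapping rather than read()/write()."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must not be empty.
        with mmap.mmap(f.fileno(), 0) as m:
            m[offset:offset + len(payload)] = payload
            # Ask the OS to write the mapped changes back to the file.
            m.flush()


def demo(mount_point=None):
    # mount_point is a hypothetical stand-in for wherever the file
    # system is mounted; None falls back to the default temp location.
    with tempfile.TemporaryDirectory(dir=mount_point) as d:
        path = os.path.join(d, "table.dat")
        with open(path, "wb") as f:
            f.write(b"\0" * 4096)
        update_in_place(path, 128, b"hello")
        with open(path, "rb") as f:
            f.seek(128)
            return f.read(5)
```

Nothing in the code is specific to SDM, which is the point of the out-of-the-box scenario: the same POSIX and `mmap` calls work unchanged, and only the file system underneath differs.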
OOB BENEFIT 1: LARGE WORKING SETS
Working set at 2x Memory size:
SDM at 17,000 ops/sec
XFS at 2,000 ops/sec
Performance is highly sensitive to whether the working set size exceeds the aggregated memory size
Working set size is dynamic and hard to predict
Large clusters are expensive
Cassandra v3.0.2
I2.4xlarge instance on AWS
OOB BENEFIT 2: PERSISTENCY
Performance is highly sensitive to persistency/durability requirements
Replication/Mirroring between nodes without persistency is vulnerable to Power Failures
Data loss risk is often not well explained. Confusion leads to wasteful behavior (#copies, Network)
[Figure: "The Traditional Tradeoff", Ops/sec (axis 0 to 180,000) for (B) Balanced vs. (D) Durable configurations; one bar marked *]
MongoDB v3.2
E5-2650v3, CloudSpeed SSD
(*) – This actually writes two persistent copies: in Memory Table and in Commit Log
OOB BENEFIT 3: LONG RE-BUILD TIMES
Nodes occasionally fail in large clusters
Re-builds take many hours to complete
due to extra pressure on the storage layer
[Figure: Clients connected to a cluster of five Couchbase servers, one marked as failed (X)]
Couchbase v4.5 beta
E5-2650v4, CloudSpeed SSD
OOB BENEFIT 4: PREDICTABILITY
No hiccups of the kind caused by separate memory and storage stacks
Highly predictable performance
[Figure: TPS over time]
MySQL v5.6
E5-2680v3, HGST SN150
DB load generator runs at target (not maximal) speed
BENEFITS THAT REQUIRE WORK AT THE MIDDLEWARE LAYER
A lot of potential for Fast Queries & Simplicity
[Figure labels: SDM, Storage, Big Data middleware]
File-level FIO
E5-2650v3, CloudSpeed SSD
EXAMPLE - AMPOOL
• Fast & Standard access throughout the data pipeline
• 56x faster ingest
• 3-4x faster OLTP & OLAP than HBase
• 6x faster Spark than Tachyon
DESIGNING MIDDLEWARE IN THE SDM ERA
1. Realize that you’re a storage/memory billionaire: focus on your business logic
2. Use standard POSIX API and share files between frameworks (polyglot)
3. Use SDM zero-cost Clones (cp --reflink)
4. Rely on SDM Auto-tiering (If you must – hint via fadvise/madvise)
5. Consider relying on SDM Mirroring capabilities
6. Use SDM monitoring tools to understand your resource consumption
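The hinting mentioned in point 4 uses standard OS calls, so a brief sketch may help. It shows `posix_fadvise` on a file descriptor and `madvise` on a memory mapping, both through Python's standard library. How SDM auto-tiering reacts to these hints is not described in the slides, so the example only demonstrates issuing them. The helper names and `segment.dat` are hypothetical, and the `hasattr` guards are there because not every platform exposes these calls.

```python
import mmap
import os
import tempfile


def hint_sequential_scan(fd, length):
    """Advise the kernel that this file range will be read sequentially."""
    if hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, 0, length, os.POSIX_FADV_SEQUENTIAL)
        return True
    return False  # call not available on this platform


def hint_random_access(mapping):
    """Advise the kernel that this mapping will be accessed randomly."""
    if hasattr(mapping, "madvise") and hasattr(mmap, "MADV_RANDOM"):
        # With no start/length arguments the advice covers the whole mapping.
        mapping.madvise(mmap.MADV_RANDOM)
        return True
    return False  # requires Python 3.8+ and OS support


def demo():
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "segment.dat")
        with open(path, "wb") as f:
            f.write(b"x" * 8192)
        with open(path, "r+b") as f:
            hint_sequential_scan(f.fileno(), 8192)
            with mmap.mmap(f.fileno(), 0) as m:
                hint_random_access(m)
                return m[:4]
```

Hints are advisory: the data returned is the same whether or not the kernel or file system acts on them, which matches the slide's "if you must" framing.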
SUMMARY
Memory and Storage have already started converging (SDM)
IMC best practices are no longer the “best”
SDM provides value to IMC out-of-the-box,
but there is even greater opportunity for those willing to integrate
[Figure labels: Efficiency, Simplicity]
Q & A
Free SDM download - www.plexistor.com/download/
White papers - www.plexistor.com/resources/
Blog - www.plexistor.com/blog/
HIGH AVAILABILITY - CLARIFICATION
Almost zero added latency for a 2nd copy, provided that a high-speed RDMA network is in place
Public cloud deployments – Keep using your current HA strategy
On premise deployments – Can substitute most copies with storage redundancy
[Figure: App servers 1 to N, each running Plexistor SDM, connected over High-speed RDMA to Open Bricks 1 to M]
SDM VS. XFS-DAX VS. NVML - CLARIFICATION
[Table: features compared across Plexistor, ext4/xfs DAX and NVML]
Scale Out: Application
Auto Tiering: Application
Snapshots/Clones: Application
Legacy Applications
NVML support
High availability: Application
IT policy hooks
[Figure: an app using mmap and an app using NVML on Linux, over a FS w/ DAX support*, with Virtual Memory and POSIX interfaces and a Memory Path to DRAM/PM]
(*) Who supports DAX:
- Plexistor SDM
- Linux xfs-dax and ext4-dax (WIP)
- MS ReFS-dax (WIP)