self-optimizing caches - snia · static vs. self-optimizing caches. applications have dramatically...

2017 Storage Developer Conference. © CachePhysics, Inc. All Rights Reserved.1

Self-Optimizing Caches

Irfan AhmadCachePhysics


About

Irfan AhmadCachePhysics CofounderCloudPhysics Cofounder

VMware (Kernel, Resource Management),Transmeta,30+ PatentsUniversity of Waterloo

@virtualirfan

CachePhysics

Data Path Monitoring and Modeling SoftwareReal-time Predictive Modeling of Data Access Patterns

Increasing Performance & Cost Efficiency of Existing CachesPowering Next-Generation Self-Learning Caches


Latency versus Cost

•Which apps need low latency?•DRAM high $$ (10x v Flash)•Where is the bottleneck?

Persistent MemoryFLASHHDD

LatencySLAs(ms)

Cost Sensitive$$$$ $$

HFT

Ecommerce

Real-Time Decisions

Analytics

Decision Support

Ad Tech

DRAM


Caches are Critical to Every Applications

Distributed Platform

DBsKV Stores

ContainersApps

ServersCPUs

L1/L2/LLCDRAM

3D XpointFlash

NetworkStorage

…

Network Elastic Cache

Cache PerformanceHit RatioCache Size

65%128GB

Intelligent Cache Management is Non-Existent

• Is this performance good?• Can performance be improved?• How much Cache for App A vs B vs …?• What happens if I add / remove DRAM?• How much DRAM versus Flash?• How to achieve 99%ile latency of X µs?• What if I add / remove workloads?• Is there cache thrashing / pollution?• What if I change cache parameters?

Yet…


Static vs. Self-Optimizing Caches

Applications have dramatically differentoptimal cache size and parameter settings.

Today: parameters chosen based on benchmarking.One size does not fit all.

Static Caches Vulnerable to:• Thrashing, Scan pollution• Gross unfairness, Interference• Unpredictability

⇒ Overprovisioning⇒ Lack of Control

Parameter 1

Parameter 2

Parameter 3

Each point represents optimal cache settingsfor a single application


Learn performance model of applications and cache

Predict the performance of workload as f(cache size, params)

Modeling Performance in Real-TimeLower is better

42 84 128 1700

Cache Size (GB)

Late

ncy

(ms)

Cache PerformanceHit RatioCache Size

65%128GB

12

3


Understanding Cache ModelsLower is better

42 84 128 1700

Cache Size (GB)

Late

ncy

(ms)

12

3

x2 x4 x8 Models help decide useful increments of change.


Understanding Cache Models (2)Lower is better

42 84 128 1700

Cache Size (GB)

Late

ncy

(ms)

12

3

Often, most operating points are highly inefficient.


Understanding Model-based Adaptation

Single Workload. Prediction of performance under different policies.


Sample Models From Production Workloads


LIRS Adaptation Examples


2Q Adaptation Examples


Cliff Removal: New Class of Acceleration

Steer the curve? Interpolate convex hull Need Model (HPCA ’15)

Shadow partitions 𝛼𝛼, 𝛽𝛽 Steer different fractions

of refs to each Emulate cache sizes on

convex hull via hashing

𝛼𝛼

𝛽𝛽


Cliff Reduction Results

69%of potential

gain

48% 38%


Achieving Latency Targets

LatencyTarget (7 ms)

CacheAllocation(>16 GB)

Client target 95th %ile latency is 7 ms

Autoset cache partitions size to 16GB to guarantee avg latency SLOs* Throughput targets

can be implemented similarly

Cache Size (GB)

95th

%ile

Late

ncy

(ms)


Multi-Tier Sizing

* Can model network bandwidth as a function of cache misses from each tier

Tier 0 (DRAM) allocationTier 1 (3D Xpoint)

Network Misses

Tier 2 (Local Flash)Tier 3 (Remote Flash)


Towards a Self-Optimizing Data Path

Remote Tier

Monitoring Auto-Select Policies(dynamic parameters)

Lower is better

42 84 128 1700

Cache Size (GB)

Mis

s R

atio

0.0

0.2

0.4

0.6

0.8

Latency Reduction(Thrashing Remediation)

Latency Guarantees Accurate Tiering

Tier 0 allocation for this clientTier 1 allocation for this clientLatency

Target (7 ms)

CacheAllocation(>16 GB)

Client target IO latency is: 7 msGuarantee avg latency: autosetcache partition to 16GB

Multi-Tenant Isolation

vs

Client 0 Client 1

Part. 0 Part. 1

Results:• Safely quantify

impact of changes

• Often 50-150% cache efficiency improvements ($$)

• Latency SLAs met

• Fewer production fire fights

• Higher consolidation ratios

• Accurate Capacity Planning


[email protected] 650-417-8559 @virtualirfan

CachePhysics

mailto:[email protected]

self-optimizing caches - snia · static vs. self-optimizing caches. applications have dramatically...

Documents