Storage and Performance, WHIPTAIL


STORAGE and PERFORMANCE

Darren Williams, Technical Director, EMEA & APAC

3 TB SQL – 17k IOPS
and Batch – 20k IOPS
and OLTP – 10k IOPS
and… VDI, HPC, Analytics, OLTP, Database

THE PROBLEM WITH PERFORMANCE

Storage decisions have to accelerate workloads, accelerate productivity, and decrease costs, all at once.

[Diagram: resources vs. workload for the same 3 TB dataset. An 11k IOPS workload at 0% writes needs 60 drives; at 13k IOPS and 25% writes it takes 72 drives, or more discs, more cache, or more arrays; at 17k IOPS and 80% writes, 96 drives, or yet more discs, cache, or arrays. A "more assets" problem: more space, energy, and personnel, trading speed and productivity against total costs.]

[Diagram: a demand solution matches resources to workload, a single 12 TB pool serving Batch, Email, and Video while improving scale and total costs.]
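The drive counts in that diagram follow from simple arithmetic: under parity RAID, every host write costs several back-end disk operations, so the required spindle count climbs with write percentage. A minimal sketch; the per-drive IOPS figure and the RAID 5 penalty are assumed example values, so the outputs illustrate the trend rather than reproduce the slide's exact counts:

```python
import math

# How many HDDs a workload needs once the RAID write penalty is applied.
# Both constants are illustrative assumptions, not slide data.
HDD_IOPS = 180       # assumed random IOPS for one 15K RPM drive
RAID5_PENALTY = 4    # one host write = 4 back-end I/Os under RAID 5

def drives_needed(host_iops: int, write_fraction: float) -> int:
    """Drives required after applying the RAID write penalty."""
    reads = host_iops * (1 - write_fraction)
    writes = host_iops * write_fraction
    backend_iops = reads + writes * RAID5_PENALTY
    return math.ceil(backend_iops / HDD_IOPS)

for iops, wf in [(11_000, 0.00), (13_000, 0.25), (17_000, 0.80)]:
    print(f"{iops:,} IOPS at {wf:.0%} writes -> {drives_needed(iops, wf)} drives")
```

The absolute numbers shift with drive speed and RAID level, but the shape is always the same: the write-heavy workload needs far more spindles for the same capacity.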

SINCE 1956, HDDS HAVE DEFINED APPLICATION PERFORMANCE

Speed
• 10s of MB/s data transfer rates
• 100s of write/read operations per second
• ~1 ms latency (0.001 s)

Design
• Motors
• Spindles
• High energy consumption

FLASH ENABLES APPLICATIONS TO WRITE FASTER

Speed
• 100s of MB/s data transfer rates
• 1000s of write or read operations per second
• ~1 µs latency (0.000001 s)

Design
• Silicon
• MLC/SLC NAND
• Low energy consumption
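Taken at face value, those two latency figures imply a thousand-fold difference in time spent waiting on storage. A back-of-envelope sketch, assuming strictly serial I/O, one request at a time (real systems overlap requests):

```python
# Time to service 100,000 random I/Os, one at a time, using the
# order-of-magnitude latencies quoted on the two slides above.
N_IOS = 100_000
HDD_LATENCY_S = 0.001        # ~1 ms per I/O (slide figure)
FLASH_LATENCY_S = 0.000001   # ~1 µs per I/O (slide figure)

print(f"HDD:   {N_IOS * HDD_LATENCY_S:,.1f} s")    # 100.0 s
print(f"Flash: {N_IOS * FLASH_LATENCY_S:,.1f} s")  # 0.1 s
```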

USE OF FLASH – HOST SIDE – PCIE / FLASH DRIVE DAS

• PCIe
– Very fast and low latency
– Expensive per GB
– No redundancy
– CPU/memory stolen from the host

• Flash SATA/SAS
– More cost-effective
– Can't get more than 2 drives per blade
– Unmanaged, can have performance/endurance issues

USE OF FLASH – ARRAY-BASED CACHE / TIERING

• Array flash cache
– Typically read-only
– PVS already caches most reads
– Effectiveness limited by a storage array designed for hard disks

• Automated storage tiering
– "Promotes" hot blocks into the flash tier
– Only effective for reads
– Cache misses still result in "media" reads

USE OF FLASH – FLASH IN THE TRADITIONAL ARRAY

• Flash in a traditional array
– Typically uses SLC or eMLC media
– High cost per GB
– Array is not designed for flash media
– Unmanaged, will result in poor random write performance
– Unmanaged, will result in poor endurance

USE OF FLASH – FLASH IN THE ALL-FLASH ARRAY

• Optimized to sustain high write and read throughput
• High bandwidth and IOPS; low latency
• Multi-protocol
• LUN-tunable performance
• Software designed to enhance lower-cost MLC NAND flash by optimizing high write throughput while substantially reducing wear
• RAID protection and replication


RACERUNNER OS

NAND FLASH FUNDAMENTALS: HDD WRITE PROCESS REVIEW

[Diagram: 4K data blocks, with one rewritten data block.]

A physical HDD rewrites data blocks in place, with virtually limitless write and rewrite capability.
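For contrast with the flash write process on the next slide, a tiny toy model of what "rewrite in place" means on disk (illustrative only):

```python
# On an HDD, rewriting one 4K block is a single in-place operation:
# no erase cycle, and no need to touch neighbouring data.
BLOCK_SIZE = 4 * 1024
disk = bytearray(1024 * BLOCK_SIZE)   # toy model of a disk's block space

def rewrite_block(disk: bytearray, index: int, data: bytes) -> None:
    assert len(data) == BLOCK_SIZE
    disk[index * BLOCK_SIZE:(index + 1) * BLOCK_SIZE] = data  # done

rewrite_block(disk, 7, b"\x01" * BLOCK_SIZE)
```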

STANDARD NAND FLASH ARRAY WRITE I/O

[Diagram: host → fabric (iSCSI / FC / SRP) → HBAs → unified transport → RAID → NAND flash ×8.]

1. A write request from the host passes over the fabric through the HBAs.
2. The write request passes through the transport stack to RAID.
3. The request is written to media.

NAND FLASH FUNDAMENTALS: FLASH WRITE PROCESS

To change any block within a 2MB NAND page:
1. The NAND page contents are read to a buffer.
2. The NAND page is erased (aka "flashed").
3. The buffer is written back with the previous data and any changed or new blocks – including zeroes.
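That read-erase-program cycle is concrete enough to sketch in a few lines. A toy model, assuming the slide's 2MB page and a 4K host block; real controllers batch and schedule these steps:

```python
PAGE_SIZE = 2 * 1024 * 1024   # 2MB NAND page (per the slide)
BLOCK_SIZE = 4 * 1024         # 4K host block

def rewrite_block(page: bytearray, block_index: int, new_data: bytes) -> None:
    assert len(page) == PAGE_SIZE and len(new_data) == BLOCK_SIZE
    # 1. Read the whole page contents to a buffer.
    buffer = bytearray(page)
    # 2. Erase ("flash") the page: on real NAND this is the slow,
    #    non-deterministic step, and it consumes one P/E cycle.
    page[:] = b"\xff" * PAGE_SIZE
    # 3. Merge the changed block into the buffer and program it all back,
    #    previous data, new data, and zeroes alike.
    buffer[block_index * BLOCK_SIZE:(block_index + 1) * BLOCK_SIZE] = new_data
    page[:] = buffer

page = bytearray(PAGE_SIZE)
rewrite_block(page, 3, b"\x01" * BLOCK_SIZE)
# One 4K host write forced a 2MB program: PAGE_SIZE // BLOCK_SIZE = 512x
# write amplification in the unmanaged worst case.
print(PAGE_SIZE // BLOCK_SIZE)
```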

UNDERSTANDING ENDURANCE / RANDOM WRITE PERFORMANCE

• Endurance
– Each cell has physical limits (dielectric breakdown): 2K-5K P/E cycles
– Time to erase a block is non-deterministic (2-6 ms)
– Program time is fairly static, based on geometry
– Failure to control write amplification *will* cause wear-out in a short amount of time
– Desktop workload is one of the worst for write amplification
– Most writes are 4-8KB

• Random write performance
– Write amplification not only causes wear-out issues, it also creates unnecessary delays in small random write workloads
– What is the point of higher-cost flash storage with latency between 2-5ms?
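Those numbers combine into a simple lifetime estimate. A sketch, where capacity and daily write volume are assumed example values; only the 2K-5K P/E range comes from the slide:

```python
# Years until the P/E budget is exhausted, as a function of the write
# amplification factor (WAF = NAND bytes written / host bytes written).
CAPACITY_GB = 1_000           # assumed usable capacity
PE_CYCLES = 3_000             # within the slide's 2K-5K range
HOST_WRITES_GB_PER_DAY = 500  # assumed workload

def lifetime_years(waf: float) -> float:
    total_budget_gb = CAPACITY_GB * PE_CYCLES
    return total_budget_gb / (HOST_WRITES_GB_PER_DAY * waf * 365)

for waf in (1.1, 5, 50):
    print(f"WAF {waf:>4}: {lifetime_years(waf):6.1f} years")
# Uncontrolled write amplification wears the media out in months; keeping
# WAF near 1 is what makes multi-year endurance guarantees possible.
```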

RACERUNNER OS: DESIGN AND OPERATION

[Diagram: host → fabric (iSCSI / FC / SRP) → HBAs → unified transport → RaceRunner Block Translation Layer (alignment | linearization) → data integrity layer → enhanced RAID → NAND SSD ×8.]

1. A write request from the host passes over the fabric through the HBAs.
2. The write request passes through the transport stack to the BTL.
3. Incoming blocks are aligned to the native NAND page size.
4. The request is written to media.
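The alignment/linearization step lends itself to a sketch. The toy block translation layer below stages 4K host writes until it can program one full, aligned NAND page sequentially; nand_program_page is a hypothetical stand-in for a device call, and this illustrates the technique only, not RaceRunner's actual code:

```python
PAGE_SIZE = 2 * 1024 * 1024   # native NAND page size (per the earlier slide)
BLOCK_SIZE = 4 * 1024         # typical host write size
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE

def nand_program_page(page_no: int, data: bytes) -> None:
    """Hypothetical device call: program one full, aligned NAND page."""
    assert len(data) == PAGE_SIZE

class BlockTranslationLayer:
    def __init__(self) -> None:
        self.staged = []     # (lba, data) pairs awaiting a full page
        self.l2p = {}        # logical block address -> (page, slot)
        self.next_page = 0   # pages are programmed strictly in order

    def write(self, lba: int, data: bytes) -> None:
        """Accept one 4K host write; flush only when a full page is staged."""
        assert len(data) == BLOCK_SIZE
        self.staged.append((lba, data))
        if len(self.staged) == BLOCKS_PER_PAGE:
            self._flush_page()

    def _flush_page(self) -> None:
        # One sequential, aligned page program replaces hundreds of random
        # in-place rewrites, avoiding a read-erase-program cycle per block.
        payload = b"".join(data for _, data in self.staged)
        for slot, (lba, _) in enumerate(self.staged):
            self.l2p[lba] = (self.next_page, slot)
        nand_program_page(self.next_page, payload)
        self.next_page += 1
        self.staged = []

btl = BlockTranslationLayer()
for lba in range(BLOCKS_PER_PAGE):   # 512 random 4K writes -> 1 page program
    btl.write(lba, bytes(BLOCK_SIZE))
```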

THE DATA WAITING DAYS ARE OVER

Scalability path:
• ACCELA – 1.5TB-12TB, 250,000 IOPS, 1.9 GB/s bandwidth
• INVICTA – 2-6 nodes, 6TB-72TB, 650,000 IOPS, 7 GB/s bandwidth
• INVICTA INFINITY (Q1/13) – 7-30 nodes, 21TB-360TB, 800,000 – 4 million IOPS, 40 GB/s bandwidth

THE DATA WAITING DAYS ARE OVER

              ACCELA           INVICTA          INVICTA INFINITY
Height        2U               6U-14U           16U-64U
Capacity      1.5TB-12TB       6TB-72TB         21TB-360TB
IOPS          Up to 250K       250K-650K        800K-4M
Bandwidth     Up to 1.9GB/s    Up to 7GB/s      Up to 40GB/s
Latency       120µs            220µs            250µs
Interfaces    2/4/8 Gbit/s FC, 1/10 GbE, InfiniBand (all models)
Protocols     FC, iSCSI, NFS, QDR (all models)
Features      RAID protection & hot sparing, async replication, VAAI, write protection buffer; INVICTA and INFINITY add LUN mirroring and LUN striping
Options       vCenter Plugin / INVICTA Node Kit (ACCELA); vCenter Plugin / INFINITY Switch Kit (INVICTA); vCenter Plugin (INFINITY)

MULTI-WORKLOAD REFERENCE ARCHITECTURE

Workload engines: 8 servers driving one INVICTA array (350,000 IOPS, 3.5 GB/s, 18 TB).

Workload type and demand:
• Dell DVD Store, MS SQL Server – 1,200 transactions per second (continuous): 4,000 IOPS, 0.05 GB/s
• VMware View – 600-desktop boot storm (2:30): 109,000 IOPS, 0.153 GB/s
• SQLIO, MS SQL Server – heavy OLTP simulation, 100% 4K writes (continuous): 86,000 IOPS, 0.350 GB/s
• Batch report simulation – 100% 64K reads (continuous): 16,000 IOPS, 1 GB/s

Total demand: 215,000 IOPS, 1.553 GB/s.
RAID 5 HDD equivalent = 3,800 drives; RAID 10 HDD equivalent = 2,000 drives.

In 2012, Mercury traveled to Barcelona, New York, San Francisco, Santa Clara, and Seattle, demonstrating the ability to accelerate multiple workloads on solid-state storage.
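The slide's totals are easy to verify: summing the four demand rows reproduces the stated aggregate, which sits well inside the INVICTA's rated 350,000 IOPS and 3.5 GB/s.

```python
# Sanity check of the reference-architecture totals (figures from the slide).
workloads = {
    "Dell DVD Store (OLTP)":  (4_000,   0.050),
    "VMware View boot storm": (109_000, 0.153),
    "SQLIO 4K writes":        (86_000,  0.350),
    "Batch 64K reads":        (16_000,  1.000),
}

total_iops = sum(iops for iops, _ in workloads.values())
total_gbs = sum(gbs for _, gbs in workloads.values())
print(total_iops, round(total_gbs, 3))   # 215000 1.553
```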

FASTER DATABASE BENCHMARKING

$13,000 power cost reduction; 35U reduced to 2U.

AMD's systems engineering department needed to bring various database workloads up quickly and efficiently in the Opteron Lab, and to eliminate the time spent performance-tuning disk-based storage systems.

AMD replaced 480 short-stroked hard disk drives with one 6 TB WHIPTAIL array supporting multiple storage protocols:
• 50x reduction in latency
• 40% improvement in database load times
• Improved workload cycle times for the engineering team

WHAT WHIPTAIL CAN OFFER

• Performance
• Cost

Highly experienced: 250+ customers since 2009 for VDI, database, analytics, and more.

Best-in-class performance at the most competitive price:

IOPS ……………… 250K – 4M
Throughput ….. 1.9 GB/s – 40 GB/s
Latency …………. 120µs
Power ……………. 90% less
Floor space ……. 90% less
Cooling ………….. 90% less
Endurance ……. 7.5 yrs guaranteed
Making decisions faster …. POA

THANK YOU

Darren Williams
Email: Darren.williams@whiptail.com
Twitter: @whiptaildarren
