Download - Data centers Scalability Targets -Storage Account

Getting the most out of Windows Azure StorageJoe GiardinoSenior Development LeadWindows Azure Storage

WAD-B406

AgendaScaleService PerformanceClient PerformanceBlob / Tables / QueuesDebuggingFuture DirectionsSummaryQ & A

North America Region Europe Region Asia Pacific Region

S. Central – U.S. Sub-region

W. Europe Sub-region

N. Central – U.S. Sub-region

N. Europe Sub-region

S.E. AsiaSub-region

E. AsiaSub-region

Data centers

Windows Azure Storage

East – U.S. Sub-region

West – U.S. Sub-region

Windows Azure StorageAbstractionsBlobs – File system in the cloudTables – Massively scalable NoSQL storageQueues – Reliable storage and delivery of messagesDisks – Durable NTFS volumes for Windows Azure VMs

Easy client accessEasy to use REST APIs and Client Libraries

.NET, Java, Node.js and PHP, and morehttp://www.windowsazure.com/en-us/documentation/?fb=en-us

Existing NTFS APIs for Windows Azure VM Disks

http://www.windowsazure.com/en-us/documentation/?fb=en-us

http://www.windowsazure.com/en-us/documentation/?fb=en-us

Scalability Targets -Storage AccountStorage Account level targets Applies to accounts created after June 7th 2012

Capacity – Up to 200 TBs

Transactions – Up to 20,000 entities/messages/blobs per second

Bandwidth for a Geo Redundant storage accountIngress - up to 5 GbpsEgress - up to 10 Gbps

Bandwidth for a Locally Redundant storage accountIngress - up to 10 Gbps Egress - up to 15 Gbps

http://blogs.msdn.com/b/windowsazure/archive/2012/06/08/introducing-locally-redundant-storage-for-windows-azure-storage.aspx

http://blogs.msdn.com/b/windowsazure/archive/2012/06/08/introducing-locally-redundant-storage-for-windows-azure-storage.aspx

Scalability Targets – PartitionPartition level TargetsApplies to accounts created after June 7th 2012

Single Queue – Account Name + Queue NameUp to 2,000 messages per second

Single Table Partition – Account Name + Table Name + PartitionKeyUp to 2,000 entities per second

Single Blob – Account Name + Container Name + Blob NameUp to 60 MB/s

Windows Azure Flat NetworkFlat network storage design“Quantum 10” networkNon-blocking 10Gbps based fully meshed networkMove to software based Load Balancer Enables high bandwidth scenarios such as Windows Azure IaaS disks, HPC, Map Reduce etc.

http://blogs.msdn.com/b/windowsazurestorage/archive/2012/11/04/windows-azure-s-flat-network-storage-and-2012-scalability-targets.aspx






DFS Layer

PartitionServer 1

PartitionServer 2

PartitionServer 3

PartitionServer 4

Master System

Master System

VIP

Partition Master

- RangePartition

- Server Load

Legend

Unassign RangePartitionReassign RangePartition

Key Concept: Automatic RangePartition Load Balancing

Load balancing is triggered based on hot Range Partitions or Partition ServersNo data is moved on disk for the reassignment

Only changing the index assignment for the Partition Servers

FE 1 FE 2 FE 3R1R2

Windows Azure Storage – How is it used?Xbox: Uses Blobs, Tables & Queues for applications like Cloud Game Saves, Halo 4, Music, Kinect data collection etc.

SkyDrive: Uses Blobs to store pictures, documents etc.

Bing: Uses Blobs, Tables and Queues to implement an ingestion engine that consumes Twitter and Facebook public status feeds and provides it to Bing search

Skype: Uses Blobs, Tables and Queues for Skype video messaging

Performance

Demos will be using .NET Storage Client 2.1 RCDeployed in a production clusterCompute is co-located with data (XL VM)Storage Account created after June 12’

Client Performance – Beyond 2.0…2.0 provided dramatic performance improvementsInternal client doing heavy Table workloads migrated from 1.7 to 2.0 table service layer

Observed ~3x performance improvement per VMCPU dropped from 90% to 45%

Storage Client 2.1 – focus on performance and fundamentals

Code Coverage – Up to 91% and countingTest Cases – over 960 publicly available unit testsPerformance – Performance features and regular performance tests passes against production tenants for validationStress – Data correctness tests executed under massive loads…

Available Soon @ http://www.nuget.org/packages/WindowsAzure.Storage

http://www.nuget.org/packages/WindowsAzure.Storage

What’s new in 2.1?Async Task methods with support for cancellationByte Array, Text, File upload / download APIs for blobsIQueryable provider for TablesBuffer PoolingMulti-Buffer Memory Stream for consistent performance when buffering unknown length data.NET MD5 now default (~20% faster than invoking native one)Compiled Expressions for Table EntitiesClient LoggingAnd much more...

Performance – What you can doFollow Best practice for each serviceIdentify what you wish to optimize, latency, throughput, cost, plan accordinglyIdentify Scalability targets and Access Patterns

Plan for growth in your service

Use Timeouts & Retry Policies to control Latency & Fault ToleranceUse latest NuGet package when possible

Tied to latest service release

General Client Best Practices (.NET)Disable Nagle for small messages (< 1400 b)

ServicePointManager.UseNagleAlgorithm = false;

Disable Expect 100-Continue ServicePointManager.Expect100Continue = false;

Avoid Large Object Heap when possibleRe-use buffers if possibleUse Async for apps that scale

General Client Best Practices (.NET GC)Target .NET 4.0, but take advantage of .NET 4.5 GC

GC performance is greatly improved, Background GC : http://msdn.microsoft.com/en-us/magazine/hh882452.aspx

Use Server GC when applicablehttp://msdn.microsoft.com/en-us/library/ee787088.aspx#workstation_and_server_garbage_collectionUses one thread & one heap per Core, Collects all heaps simultaneously

For latency sensitive scenarios use SustainedLowLatency GC

GCSettings.LatencyMode = GCLatencyMode.SustainedLowLatency;Will use more memory and do more frequent smaller duration GC’s

http://msdn.microsoft.com/en-us/magazine/hh882452.aspx

http://msdn.microsoft.com/en-us/library/ee787088.aspx#workstation_and_server_garbage_collection

http://msdn.microsoft.com/en-us/library/ee787088.aspx#workstation_and_server_garbage_collection

.NET or Performance Gotchas!ServicePointManager settings must be done before first call to the service

can be done in app config to avoid racy initialization

Set DefaultConnectionLimit > 2ServicePointManager.DefaultConnectionLimit = 100; (Or More)

.NET Memory Stream performance is bad for services

Async PerformanceDynamic buffer size behaviorAvoid ReadByte / WriteByte methods

Unbounded TPL ParallelismParallel.ForEach – limit parallelism at the application layerhttp://msdn.microsoft.com/en-us/library/ee789351.aspx

http://msdn.microsoft.com/en-us/library/ee789351.aspx



General Server Best PracticesLocate Storage accounts close to users

Co-locate compute in same DC (intra-DC bandwidth is free)For wide spread client apps consider using more than one storage account in different physical DCs

Understand Account Scalability targetsAccount Scalability targets are different depending on Geo-ReplicationGeo Redundant: 5 Gbps ingress / 10 Gbps egressLocally Redundant : 10 Gbps ingress / 15 Gbps egress

Enable Logging & Metrics on each storage serviceCan be done via REST, Client API, or PortalEnables clients to self diagnose issues, including performance related onesData can be automatically GC’d according to a user specified retention interval

General Server Best Practices (cont.)Optimize what you send & receive

Blobs: Range reads, Metadata, Head RequestsTables: Upsert, Merge, Projection, Point QueriesQueues: Update Message, Batch size

Distribute load over many partitionsNote: Table Entity Group Transaction (batch) only applies to entities in the same partition

Control Parallelism at the application layer Unbounded Parallelism can lead to slow latencies and throttling

Blend Traffic TypesIn general Blob workloads tend to be IO bound as Table workloads tend to be CPU limited. By blending traffic in roles you can make the most of your compute resources

Blob Performance Best PracticeA blob is a single partition that can provide 60 Mb/sChoose right BlockSize for your scenario

CloudBlockBlob.StreamMinimumReadSizeInBytes / StreamWriteSizeInBytes

Use seekable streams when possibleSmall non seekable streams will result in 2 REST calls when a single atomic operation could have been used

Use Async for scaleDramatically simplified with Task APIs in 2.1 and async / await

2.x client has efficient downloads with fault tolerance

Blobs: Improvements in the 2.1 release?Byte Array upload/download APIs with offset & lengthAsync Stream open.NET MD5 implementation now default

FISMA compliant clients can re-enable it CloudStorageAccount.V1MD5 = false;

Multi-Buffer Memory Stream to provide performant dynamic buffering of dataIBufferManager works great with .NET BufferManager (System.ServiceModel.dll)

Blob Client Improvements vs. 1.7Sync

Upload Latency decreased by 16.5%Download Latency decreased by 23.2%CPU Utilization decreased by 38.5%

AsyncUpload Latency decreased by 5.3%Download Latency decreased by 37.3%CPU Utilization decreased by 35.1%

2.1 vs. 2.0.xReduces CPU by ~13%

Test Setup:XL VM with Collocated Storage30 concurrent workers uploading a total of 7.5 GB of data (each blob is 256 MB)

0

10

20

30

40

50

60

70

80

1.7

2.0.5.1 2.1 RC

Upload Download CPU

Tim

e (

s)

Blobs - ThroughputA single XL VM can achieve 1 Gbps to the blob store. A single partition canprovide 60 MB/s

Upload Multiple blobs concurrently using minimal blob parallelism

Distributes load over many partitionsLonger connections allow TCP windows to growBlobs under 64 MB can be uploaded as a single put, otherwise multiple block requests & a put block list are requiredAvoids “long tail” problem – 11 MB blob with 1MB blocks, parallelism =10

Concurrency Vs. Blob Parallelism

What pattern is faster to upload N Blobs? 1 concurrent upload with P=3030 concurrent uploads with P=1

Let’s find out…

Concurrency Vs. Blob ParallelismXL VM Uploading 512 256MB Blobs using a 1 MB Blocks (128GB)

P=1, C=1 => Averaged ~ 13.23 MB/sP=30, C=1 => Averaged ~ 50.72 MB/sC=30, P=1 => Averaged ~ 96.64 MB/sTest completed almost twice as fast!Single Blob is bound by the limits of a single partition whereas accessing multiple blobs concurrently can scale

0

2000

4000

6000

8000

10000

12000

P=1, C=1

P=30, C =1

P=1, C=30Tim

e (

s)

So where is parallel download blob?It is not provided in the Client Library

Would require much larger memory footprintDoes not provide significant performance improvement

Experiment 1 GB Blob Get on a XL VMAveraged 130.15 MB/s Note: this is over double the scalability target for a single blob, you should not take a dependency on any performance over these targets due the multi-tenant nature of the system.

Non-Seekable Streams

Blob Non-Seekable Stream ResultsUpload 10,000 10KB blobs

Same stream with CanSeek = true, false

Single Put Blob is better for small blobsNon-Seekable streams force BlobStream use, bad for small blobs

Will pre-buffer dataWill force a PutBlockList request after Block UploadSeekable Stream was ~2x faster

0

40

80

120

160

200

240

Seekable

Non Seekable

Tim

e (

s)

Block Read Size

Blob Read Size Results256MB Block Blob with 4 MB blocks

Single Get averaged 84.46 MB/s 4MB Read averaged 26.9 MB/s…64 KB Read averaged 8.69 MB/s

A BlockBlob can contain 50,000 Blocks

4 MB = Max of 195.31GB 2 MB = Max of 97.65GB…128KB = Max of 6.1 GB

0

4

8

12

16

20

24

256 MB

4 MB 2MB 1MB512KB

256KB

128KB

Tim

e (

s)

Tables

Tables: Improvements in the 2.1 release?Efficient IQueryable implementation

Shortcuts for projection and dynamic query construction

Compiled Serializers / De-Serializers (40-50x faster than reflection)More flexible serialization

Ignore Attribute to omit entity propertyExposed default serialization logic to allow users to easily persist 3rd party objects

Multi-Buffer Memory Stream to provide performant dynamic buffering of dataBuffer Pooling

Table Client Improvements vs. 1.7Complex Single Entity Operations

Upload Latency decreased by 17%Download Latency decreased by 29%Delete Latency decreased by 18%CPU Utilization decreased by 39.75%

Similar CPU reductions for Entity Group Transaction

Test Setup:XL VM with Collocated StorageTime to upload / download / delete 150k complex entities via single operations 0

5

10

15

20

25

30

35

40

45

50

1.7

2.0.5.12.1 RC

Upload Download Delete CPU

Tim

e (

s)

Table Performance Best PracticeKey selection is single most important performance decision you will make

Entity Group TransactionScalabilityEfficient queries

Send and receive the smallest amount of data necessary

Use Merge / Upsert when appropriateServer side projection

Use the Table Service Layer over WCF data services

Query continuations are expensive via WCF data servicesAdditional performance optimizations coming in future releases

Query Design & SelectivityUnderstand Continuation tokens

Returned at 1000 entities4 MB payloadServer boundaries

Table Scans are expensive – avoid them for latency sensitive scenariosPerformance of queries

Point Query (PK, RK) - FastestRow Key Range Query(PK, A < RK< D) Row Key Range Query Open ended (PK, RK > C)Multi Partition Query (A <PK < D, RK …)Query on non-indexed column (Timestamp, user column, etc.)

Tables - ThroughputBatch Size (Entity Group Transaction)

Use Batch to perform atomic actions on a given partition (all entities in the same table with the same PK)Up to 100 entitiesUp to 4 MB in sizeDramatically improves throughput and per entity latencies

Insert operations return the entity back to the client Merge, Replace, and Upserts do not.

Less bytes to read => less bytes to parse => lower NIC / CPU => Higher throughput

Tables - LatencyQuery Selectivity

Smaller take count / page sizes for latency sensitive applications

Projection: Only take what you needAvoid expensive serialization / deserialization

Can use DynamicTableEntity to avoid allocating large objects2.1 includes dramatic improvements over 2.0 / 1.7 clients for clients deriving from TableEntity

Batch Size

What is my optimal batch size?

Batch Size ResultsPer entity latency improves by ~7x Less time spent on authenticationUses connections more efficiently

1 5 10 15 20 25 30 40 50 1000123456789

0

20

40

60

80

100

120

140

Latency

Batch Size

Late

ncy /

En

tity

(m

s)

Insert vs. InsertOrReplace

Insert operations return the entity back to the client.Merge, Replace, and Upserts do not.Less bytes to read = less bytes to parse

Cool Features Demo

IQueryable implementationSerialize 3rd party objectsSimplified ProjectionDynamic type safe query constructionCompiled Serializers

Tables ResourcesTables 2.0 Deep Dive

http://blogs.msdn.com/b/windowsazurestorage/archive/2012/11/06/windows-azure-storage-client-library-2-0-tables-deep-dive.aspx

How to get the most out of Table Storagehttp://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/06/how-to-get-most-out-of-windows-azure-tables.aspx Covers key selectionBatch size





http://blogs.msdn.com/b/windowsazurestorage/archive/2010/11/06/how-to-get-most-out-of-windows-azure-tables.aspx



Queues

Queues Performance Best PracticeNagle is important as many messages may be smallDesign application to update messages rather than deleting and inserting a new oneFor scale use multiple queues, see Multi-Queue ExampleUse Message De-queue count to detect malformed data or “poison” messages

Queues - ThroughputTune Message Count for Peek & Get

Consider failure modes, if a worker gets 32 messages but fails immediately 31 messages must also wait to become visible again

Tune Initial Invisibility Time & Visibility TimeoutUse Multiple Queues

Multi-Queue StrategiesCan use 1 Queue per worker

Simple, but does not always make efficient use of workersScaling is difficult as more queues need to be created / drained and deleted to match size of deployment

Round Robin QueueDefine N Queues with a given prefix i.e. myqueue01, myqueue02…Access Queues in round robin fashion, can be tuned to your scenario, i.e. try 4 queues before returning nullMessages need to track their parent queues for updates / deletesCompute workers are fully utilizedLatency may be higher in the case of empty queues as workers may need to check many queues

Queues - LatencyTune Initial Invisibility Time & Visibility Timeout

Shorter Visibility Timeout will require a more responsive service, but can ensure lower latencies, longer visibility timeouts can allow larger amounts of processing to be done before check pointing

Use smaller Message Count for Peek & GetUse Multiple Queues

Get Messages

What is optimal number of messages to get?

Get Messages ResultsPer entity latency improves by ~11.5x Less time spent on authenticationUses connections more efficiently

1 2 4 8 12 16 24 320

5

10

15

20

25

30

LatencyLatency / Entity

Batch Size

Late

ncy (

ms)

Debugging

Client DebuggingOperationContext is your friendAvailable in an overload of every method that makes a request to the serviceAdd custom headers to every requestStores Information for every request madeClient Request IDSend and Receive events

Client Debugging – Logging (2.1 RC)Opt-in or Opt-out model

Set Maximum Verbosity in App.configCan set System wide verbosity via OperationContext.DefaultLogLevelPerform targeted “vertical” logging for a single API call

OperationContext.LogLevelEnable without redeploying

Debugging Demo

Server side - Storage AnalyticsMetrics provides usage data per abstractionLogging provides per request logging, including client trace id’sLogs / Metrics are stored in your storage account

Can specify a retention policy to auto delete after some period of time

Enable via code or the Azure PortalProvides the ability to monitor and diagnose applicationsEspecially useful when utilizing SAS / SAS provider model

Storage Analytics LogsLog records for requests are stored in Windows Azure Blobs

The Log blobs are text files with one log entry per lineEach blob can contain one to many request records

A request typically appears in the log within 15 minutes after it completes execution

Configure the logging levels separately forBlob, Table and Queuesread (GET), write (PUT/POST/MERGE), delete (DELETE) requests or any combination

Best effort logging

Enable Logging via the portalEach transaction type can be configured separately

Each service is configured separately

Use Retention Policy to auto garbage collect old logs

Storage Analytics – Why turn on Metrics?Provides ability to answer commonly asked questions:

How many transactions did my service issue per hour over the past week?How many anonymous Get Blob requests were issued to my storage account?My application is not performing as expected, what is the availability and performance of storage for a given time period?list goes on and on…

Figure out who is accessing what…

Storage Analytics – MetricsTransaction metrics are provided for every 1 hour time interval stored into Windows Azure Tables

Example MetricsTotal TransactionsAvailability% Success, % Network Errors, % Timeout, % Throttled, etc.Average Latency (Application E2E and Storage Server latency)Total IngressTotal Egress

Blob, Table or Queue Summary and per REST API metrics

Capacity metrics provided for only Blobs at this timeUpdated once a day

Capacity and # of objects

Enable Metrics (Monitoring) via the portalMultiple Verbosity levels

Each service is configured separately

Use Retention Policy to auto Garbage Collect old logs

Storage Analytics SummarySeparate Namespace

LogsStored as blobs in separate Blob Container in the storage account being monitoredhttp://account.blob.core.windows.net/$logs/

MetricsStored as entities in a separate metrics Azure Tables in the storage account being monitoredhttp://account.table.core.windows.net/$Metrics*

Isolation$logs and $Metrics have separate resource limits and throttling from the rest of the storage account traffic

CostCapacity to keep the dataTransactions for generating & accessing analytics dataCan use retention policy on both logs and metrics in terms of daysDeleting data via retention policy does not incur transaction cost

http://account.blob.core.windows.net/$logs/

http://account.table.core.windows.net/$Metrics*

Summary

You are the expert on your scenario!Have a scalable design

How many transactions per partition and accountBandwidth per partition and accountWhat is my strategy for growth in the future?

Deploy best practicesBest practices covered for each service can dramatically improve performanceUse latest Storage Client Libraries via NuGet / GitHub (especially if on 1.x)

How we stack upNasuni recently released their “State of the Cloud Storage Industry” Report for 2013 in February

Windows Azure consistently performed better than all other CSPs in the study, which included Amazon S3, Google Cloud Storage, HP Cloud Object Storage, and Rackspace Cloud Files.Windows Azure delivered the best Write/Read/Delete speeds across a variety of file sizes, with the fastest response times and fewest errors.Windows Azure not only outperformed the competition significantly during the raw performance tests, but it was the only CSP to post zero errors during 100 million reads and writes.

To Read Morehttp://www6.nasuni.com/rs/nasuni/images/2013_Nasuni_CSP_Report.pdfhttp://blogs.technet.com/b/stbnewsbytes/archive/2013/02/19/windows-azure-named-cloud-storage-leader.aspx

http://www6.nasuni.com/rs/nasuni/images/2013_Nasuni_CSP_Report.pdf



http://blogs.technet.com/b/stbnewsbytes/archive/2013/02/19/windows-azure-named-cloud-storage-leader.aspx



Future DirectionsCORS (Cross origin resource sharing) Windows Phone LibraryContinuous performance improvementsExpanding support for additional client platformsGeo-Replication for QueuesRead from secondary DC Expanded Reach – New Region build outs

http://blogs.technet.com/b/microsoft_blog/archive/2013/05/22/microsoft-announces-major-expansion-of-windows-azure-services-in-asia.aspx

And more …




ResourcesGetting Started https://www.windowsazure.com/en-us/develop/overview/

Pricing information https://www.windowsazure.com/en-us/pricing/details/

Storage team blogs http://blogs.msdn.com/b/windowsazurestorage/

Windows Azure Storage: What’s Coming, Best Practices, and Internalshttp://channel9.msdn.com/Events/Build/2013/3-541

.NET Clients encountering Port Exhaustion Bloghttp://blogs.msdn.com/b/windowsazurestorage/archive/2013/05/25/net-clients-encountering-port-exhaustion-after-installing-kb2750149.aspx

SOSP Paperhttp://blogs.msdn.com/b/windowsazurestorage/archive/2011/11/20/windows-azure-storage-a-highly-available-cloud-storage-service-with-strong-consistency.aspx

https://www.windowsazure.com/en-us/develop/overview/

https://www.windowsazure.com/en-us/pricing/details/

http://blogs.msdn.com/b/windowsazurestorage/

http://blogs.msdn.com/b/windowsazurestorage/

http://channel9.msdn.com/Events/Build/2013/3-541

http://channel9.msdn.com/Events/Build/2013/3-541

http://blogs.msdn.com/b/windowsazurestorage/archive/2013/05/25/net-clients-encountering-port-exhaustion-after-installing-kb2750149.aspx




http://blogs.msdn.com/b/windowsazurestorage/archive/2011/11/20/windows-azure-storage-a-highly-available-cloud-storage-service-with-strong-consistency.aspx



Questions?

msdn

Resources for Developers

http://microsoft.com/msdn

Learning

Microsoft Certification & Training Resources

www.microsoft.com/learning

TechNet

Resources

Sessions on Demand

http://channel9.msdn.com/Events/TechEd

Resources for IT Professionals

http://microsoft.com/technet

http://microsoft.com/msdn

http://www.microsoft.com/learning

http://channel9.msdn.com/Events/TechEd

http://microsoft.com/technet

Evaluate this session

Scan this QR code to evaluate this session.

© 2013 Microsoft Corporation. All rights reserved. Microsoft, Windows and other product names are or may be registered trademarks and/or trademarks in the U.S. and/or other countries.The information herein is for informational purposes only and represents the current view of Microsoft Corporation as of the date of this presentation. Because Microsoft must respond to changing market conditions, it should not be interpreted to be a commitment on the part of Microsoft, and Microsoft cannot guarantee the accuracy of any information provided after the date of this presentation. MICROSOFT MAKES NO WARRANTIES, EXPRESS, IMPLIED OR STATUTORY, AS TO THE INFORMATION IN THIS PRESENTATION.

Download - Data centers Scalability Targets -Storage Account

Top Related