Windows Azure for Research
Roger Barga, Architect, Cloud Computing Futures, MSR
IEEE e-Science 2010 Conference
7–10 December 2010
The Million Server Datacenter
HPC and Clouds – Select Comparisons
• Node and system architectures
• Communication fabric
• Storage systems and analytics
• Physical plant and operations
• Programming models (rest of tutorial)

HPC Node Architecture
Moore's "Law" favored commodity systems:
• Specialized processors and systems faltered
• "Killer micros" and industry-standard blades led
• Inexpensive clusters now dominate
www.top500.org
HPC Interconnects
• Ethernet for the low end (cost sensitive)
• High-end expectations:
  • Nearly flat networks and very large switches
  • Operating system bypass for low latency (microseconds)
www.top500.org
Modern Data Center Network
[Diagram: the Internet connects through CRs to ARs at Layer 3; below that, Layer 2 switches and load balancers fan out to 20-server racks]
Key:
• CR (L3 Border Router)
• AR (L3 Access Router)
• S (L2 Switch)
• LB (Load Balancer)
• A (20-Server Rack/TOR)
Links: GigE within racks, 10 GigE uplinks
HPC Storage Systems
• Local disk
  • Scratch or non-existent
• Secondary storage
  • SAN and parallel file systems
  • Hundreds of TBs (at most)
• Tertiary storage
  • Tape robot(s)
  • 3–5 GB/s bandwidth
www.nersc.gov
~60 PB capacity
HPC and Clouds – Select Comparisons
• Node and system architectures
• Communication fabric
• Storage systems and analytics
• Physical plant and operations
• Programming models (rest of tutorial)
A Tour Around Windows Azure
References:
• Azure in Action, Manning Press
• Programming Windows Azure, O'Reilly Press
• Bing: Channel 9 Windows Azure
• Bing: Windows Azure Platform Training Kit – November 2010 Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Application Model Comparison
Ad Hoc Application Model:
• Machines running IIS/ASP.NET
• Machines running Windows Services
• Machines running SQL Server
Application Model Comparison
Ad Hoc Application Model:
• Machines running IIS/ASP.NET
• Machines running Windows Services
• Machines running SQL Server
Windows Azure Application Model:
• Web Role instances
• Worker Role instances
• Azure Storage (Blob, Queue, Table)
• SQL Azure
Key Components
Fabric Controller:
• Manages hardware and virtual machines for the service
Compute:
• Web Roles – web application front end
• Worker Roles – utility compute
• VM Roles – custom compute role; you own and customize the VM
Storage:
• Blobs – binary objects
• Tables – entity storage
• Queues – role coordination
• SQL Azure – SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service:
  • Instance count
  • Storage keys
  • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee:
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute on Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services:
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application so they are easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
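The decoupling described above is the work-ticket pattern: the web role enqueues a small "ticket" referencing the work, and a worker role dequeues, processes, and deletes it. A minimal sketch, using Python's in-process `queue.Queue` as a stand-in for an Azure queue (the function names and ticket fields are illustrative, not the Azure SDK):

```python
import queue

work_queue = queue.Queue()

def web_role_submit(blob_name):
    # Enqueue a ticket that points at the real payload (e.g. a blob)
    # rather than the payload itself -- queue messages are size-limited.
    work_queue.put({"blob": blob_name, "action": "convert"})

def worker_role_poll(results):
    # Dequeue one ticket and process it; in Azure the message would stay
    # invisible for a timeout and be explicitly deleted on success.
    ticket = work_queue.get()
    results.append(f"processed {ticket['blob']}")
    work_queue.task_done()  # analogous to RemoveMessage

results = []
web_role_submit("movies/barga.mpg")
worker_role_poll(results)
```

Because the queue sits between the roles, either side can crash or scale independently without breaking the other.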
Key Components – Compute: VM Roles
• Customized role: you own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • Base OS
    • Differences VHD
Application Hosting
'Grokking' the service model:
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef (service definition)
  • ServiceConfiguration.cscfg (service configuration)
GUI
Double-click on the Role Name in the Azure Project
Deploying to the Cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage
Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g. videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites an existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • e.g. http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – the default; requires the account key for access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks, in any order, into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: Big.mpg split into blocks 1–8, uploaded out of order, then committed as Big.mpg]
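A sketch of that commit flow: split data into blocks, give each a fixed-length Base64 ID, then build the commit list. Operation names (Put Block, Put Block List) follow the Azure Storage REST API; no network calls are made here, the code only constructs what those calls would send.

```python
import base64

def make_block_ids(data, block_size):
    # Split into blocks; Block IDs must be Base64 strings and, within a
    # blob, must all be the same length -- hence the zero-padded counter.
    blocks = [data[i:i + block_size] for i in range(0, len(data), block_size)]
    ids = [base64.b64encode(f"block-{n:06d}".encode()).decode()
           for n in range(len(blocks))]
    return blocks, ids

def block_list_body(ids):
    # The XML body for Put Block List; <Latest> means "use the most
    # recently uploaded block with this ID".
    inner = "".join(f"<Latest>{bid}</Latest>" for bid in ids)
    return f'<?xml version="1.0" encoding="utf-8"?><BlockList>{inner}</BlockList>'

blocks, ids = make_block_ids(b"x" * 10, 4)   # 4 + 4 + 2 bytes -> 3 blocks
body = block_list_body(ids)
```

Because the commit is a separate step, blocks can be uploaded in parallel and in any order, matching the out-of-order diagram above.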
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the maximum size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
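Since leasing is REST-only, here is a sketch of the request a lease operation would build. The `comp=lease` parameter and `x-ms-lease-action`/`x-ms-lease-id` headers follow the Azure Storage REST API; the account, container, and blob names are placeholders, and nothing is actually sent (authentication headers are omitted).

```python
def lease_request(account, container, blob, action, lease_id=None):
    # Build the Lease Blob request: PUT against the blob with comp=lease
    # and the desired action (acquire | renew | release | break).
    url = (f"http://{account}.blob.core.windows.net/"
           f"{container}/{blob}?comp=lease")
    headers = {"x-ms-lease-action": action}
    if lease_id:
        # Renew/release/break must present the lease ID returned by acquire.
        headers["x-ms-lease-id"] = lease_id
    return "PUT", url, headers

method, url, headers = lease_request("myacct", "movies", "barga.mpg", "acquire")
```

A successful acquire returns a lease ID, which the holder then passes on every renew, release, or break, and on writes to the locked blob.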
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table name: Movies – entities: Star Wars, Star Trek, Fan Boys
• Table name: Customers – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables: billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
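A minimal in-memory sketch of that entity model: every entity carries PartitionKey, RowKey, and Timestamp, the (PartitionKey, RowKey) pair is the unique key, and entities in the same table may carry different extra properties. The `Table` class is illustrative, not the Azure SDK.

```python
import time

class Table:
    def __init__(self):
        self.rows = {}

    def insert(self, entity):
        # Enforce the required properties named on the slide.
        for required in ("PartitionKey", "RowKey"):
            if required not in entity:
                raise ValueError(f"missing {required}")
        entity.setdefault("Timestamp", time.time())
        # (PartitionKey, RowKey) uniquely identifies an entity.
        self.rows[(entity["PartitionKey"], entity["RowKey"])] = entity

movies = Table()
# Schema can vary per entity within the same table:
movies.insert({"PartitionKey": "Action", "RowKey": "Fast & Furious",
               "ReleaseDate": 2009})
movies.insert({"PartitionKey": "Comedy", "RowKey": "Office Space",
               "Rating": "R"})
```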
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues)
Every data object has a partition key:
• A partition can be served by a single server
• The system load balances partitions based on traffic patterns
• Controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• Single-partition limits may have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey. Entities with the same PartitionKey value are served from the same partition:

  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order – 1             |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name. Every blob and its snapshots are in a single partition:

  Container Name | Blob Name
  image          | annarbor/bighouse.jpg
  image          | foxborough/gillette.jpg
  video          | annarbor/bighouse.jpg

Messages – Queue name. All messages for a single queue belong to the same partition:

  Queue    | Message
  jobs     | Message 1
  jobs     | Message 2
  workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account:
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
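A sketch of the exponential backoff recommended above for '503 Server Busy': retry with a delay that doubles each attempt, capped, with a little jitter. The operation, cap, and jitter factor are illustrative; delays are recorded rather than slept so the flow is visible.

```python
import random

def with_backoff(operation, max_attempts=5, base_delay=0.5, cap=30.0):
    delays = []                      # recorded instead of time.sleep()
    for attempt in range(max_attempts):
        status = operation()
        if status != 503:            # success (or a non-retryable status)
            return status, delays
        # Double the delay each retry, cap it, and add up to 10% jitter
        # so many clients do not retry in lockstep.
        delay = min(cap, base_delay * (2 ** attempt))
        delays.append(delay + random.uniform(0, delay * 0.1))
    return 503, delays

# Simulated service: busy twice, then succeeds.
responses = iter([503, 503, 200])
status, delays = with_backoff(lambda: next(responses))
```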
Partitions and Partition Ranges
Example Movies table (PartitionKey = Category, RowKey = Title):

  PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
  Action                  | Fast & Furious            | …         | 2009
  Action                  | The Bourne Ultimatum      | …         | 2007
  Animation               | Open Season 2             | …         | 2009
  Animation               | The Ant Bully             | …         | 2006
  Comedy                  | Office Space              | …         | 1999
  SciFi                   | X-Men Origins: Wolverine  | …         | 2009
  War                     | Defiance                  | …         | 2008

Initially one server holds the whole key range:
• Server A: Table = Movies, [Min – Max]
As traffic grows, the system splits the range across servers:
• Server A: Table = Movies, [Min – Comedy)
• Server B: Table = Movies, [Comedy – Max]
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics; reduce round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
You get a continuation token at:
• A maximum of 1000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds to execute the query
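A sketch of the required handling: keep issuing the query until the service stops returning a token. `query_page` here simulates a service that returns at most `page_size` rows plus a token (in the real Table service the token travels in `x-ms-continuation-*` headers).

```python
def query_page(rows, start, page_size=1000):
    # Simulated service call: return one page and, if more rows remain,
    # a continuation token (here just the next start offset).
    page = rows[start:start + page_size]
    token = start + page_size if start + page_size < len(rows) else None
    return page, token

def query_all(rows, page_size=1000):
    # Loop until no continuation token is returned -- skipping this loop
    # silently drops everything after the first 1000 rows.
    results, token = [], 0
    while token is not None:
        page, token = query_page(rows, token, page_size)
        results.extend(page)
    return results

data = list(range(2500))     # would need three round trips at 1000/page
fetched = query_all(data)
```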
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Guidance:
• Select PartitionKey and RowKey that help scale
  • Distribute by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries
  • Server busy: partitions are load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a web role calls PutMessage to add messages (Msg 1–4) to a queue; worker roles call GetMessage (with a visibility timeout) to retrieve messages, and RemoveMessage to delete them after processing]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach:
• Each empty poll increases the interval by 2x, up to a maximum (e.g. 60 seconds)
• A successful poll sets the interval back to 1
[Diagram: consumers C1 and C2 polling a queue, with intervals growing toward the 60-second cap during idle periods]
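The policy above can be stated in a few lines: double the polling interval on every empty poll up to a cap, reset to 1 on a hit. Values are in seconds; the cap of 60 matches the slide's diagram.

```python
def next_interval(current, got_message, cap=60):
    # Truncated exponential backoff for queue polling: reset on success,
    # double (up to the cap) on an empty poll.
    if got_message:
        return 1
    return min(cap, current * 2)

# Trace the interval across six empty polls and then one hit:
intervals, interval = [], 1
for got in [False, False, False, False, False, False, True]:
    interval = next_interval(interval, got)
    intervals.append(interval)
```

This keeps latency low when the queue is busy while cutting transaction costs (each GetMessage call is billed) when it is idle.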
Removing Poison Messages
Scenario (producers P1, P2; consumers C1, C2):
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2
13. C1: Delete(Q, msg 1) – msg 1 is removed as a poison message
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use DequeueCount to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
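The poison-message rule from the recap and the scenario above can be sketched as a loop: a message dequeued more than a threshold number of times without being deleted is dropped instead of retried. The in-memory deque stands in for an Azure queue; `dequeue_count` mirrors the service's DequeueCount.

```python
import collections

Message = collections.namedtuple("Message", "body dequeue_count")

def process_queue(bodies, handler, max_dequeue=3):
    queue = collections.deque(Message(b, 0) for b in bodies)
    poison, done = [], []
    while queue:
        msg = queue.popleft()
        msg = msg._replace(dequeue_count=msg.dequeue_count + 1)
        if msg.dequeue_count > max_dequeue:
            poison.append(msg.body)          # give up: poison message
            continue
        try:
            handler(msg.body)
            done.append(msg.body)            # DeleteMessage on success
        except Exception:
            queue.append(msg)                # becomes visible again later
    return done, poison

# "bad" always crashes the handler and is eventually discarded.
def handler(body):
    if body == "bad":
        raise RuntimeError("crash while processing")

done, poison = process_queue(["ok", "bad"], handler)
```

Without the threshold, "bad" would cycle forever, which is exactly the failure mode the DequeueCount check prevents.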
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
• http://blogs.msdn.com/windowsazurestorage
• http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
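The deck points to the .NET Task Parallel Library; as an analogous data-parallel sketch (in Python, for illustration), a pool of workers processes independent items concurrently, keeping one role instance busy instead of splitting underutilized work across extra roles:

```python
from concurrent.futures import ThreadPoolExecutor

def process(item):
    # Stand-in for one independent unit of work (CPU- or IO-bound).
    return item * item

def process_all(items, workers=4):
    # Fan the items out across a worker pool; map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process, items))

results = process_all(range(8))
```

The same shape applies whether the unit of concurrency is a thread, a process, or (in TPL terms) a task.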
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of having idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
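A quick illustration of point 1: gzip-compressing repetitive markup shrinks it dramatically, at the cost of some CPU. The sample markup is made up for the demo.

```python
import gzip

# Repetitive HTML table rows -- the kind of output that compresses well.
html = b"<tr><td>row</td></tr>" * 500

compressed = gzip.compress(html)
restored = gzip.decompress(compressed)     # browsers do this on the fly
ratio = len(compressed) / len(html)        # far below 1.0 for this input
```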
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
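The query-segmentation pattern above can be sketched in a few lines: split the input sequences into fixed-size partitions, fan the partitions out as independent tasks, and merge the per-partition results. The BLAST invocation itself is replaced by a placeholder; partition size and sequence names are illustrative.

```python
def split_queries(sequences, partition_size):
    # Split the input sequences into independent partitions.
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def run_blast(partition):
    # Placeholder for running NCBI BLAST over one partition of queries;
    # in AzureBLAST each partition becomes a task pulled by a worker role.
    return [f"hits({seq})" for seq in partition]

def merge(results_per_partition):
    # The merging task: concatenate per-partition results.
    return [hit for part in results_per_partition for hit in part]

sequences = [f"seq{i}" for i in range(10)]
partitions = split_queries(sequences, 4)        # 3 partitions: 4 + 4 + 2
merged = merge(run_blast(p) for p in partitions)
```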
AzureBLAST Task Flow
A simple split/join pattern: a splitting task fans out many BLAST tasks, and a merging task combines their results.
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI BLAST overhead
  • Data transfer overhead
• Best practice: test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost:
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST Architecture
[Diagram]
• Web Role: web portal and web service for job registration
• Job Management Role: job scheduler and scaling engine, with a job registry kept in an Azure Table
• Worker roles: pull BLAST tasks from a global dispatch queue
• Azure Blob storage: NCBI databases, BLAST databases, temporary data, etc.
• Database-updating role: keeps the NCBI databases current
• Task flow: a splitting task fans out BLAST tasks; a merging task combines results
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
The accepted job is stored into the job registry table:
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query:
• The database is also the input query
• The protein database is large (4.2 GB in size)
• A total of 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
[Diagram: instance counts per deployment, e.g. 50 and 62 VMs per datacenter]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:

3/31/2010 6:14  RD00155D3611B0  Executing the task 251523...
3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0  Executing the task 251553...
3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0  Executing the task 251600...
3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):

3/31/2010 8:22  RD00155D3611B0  Executing the task 251774...
3/31/2010 9:50  RD00155D3611B0  Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0  Execution of task 251895 is done, it took 82 mins
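The analysis the slide describes — pairing "Executing the task N" lines with their "Execution of task N is done" lines and flagging tasks that never completed — can be sketched like this. The log format follows the slide; timestamps are ignored for simplicity.

```python
import re

START = re.compile(r"(\S+) Executing the task (\d+)")
DONE = re.compile(r"(\S+) Execution of task (\d+) is done")

def find_lost_tasks(lines):
    # Collect (node, task) pairs that started and that finished;
    # anything started but never finished is a lost task.
    started, finished = set(), set()
    for line in lines:
        if m := DONE.search(line):
            finished.add((m.group(1), m.group(2)))
        elif m := START.search(line):
            started.add((m.group(1), m.group(2)))
    return started - finished

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
lost = find_lost_tasks(log)   # task 251774 never completed
```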
Surviving System Upgrades
North Europe Data Center: a total of 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group, ~6 nodes in one group

Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

  ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
• ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m⁻²)
• cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
• ρa = dry air density (kg m⁻³)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s⁻¹)
• gs = conductivity of plant stoma air (inverse of rs) (m s⁻¹)
• γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
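A direct transcription of the Penman-Monteith equation above, with illustrative (not measured) input values; variable names mirror the symbols on the slide.

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s, gamma, lambda_v):
    # ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Plausible mid-latitude daytime values (assumed, for illustration only):
et = penman_monteith(delta=145.0,      # Pa/K
                     r_n=150.0,        # W/m^2
                     rho_a=1.2,        # kg/m^3
                     c_p=1005.0,       # J/(kg K)
                     dq=1000.0,        # Pa
                     g_a=0.02,         # m/s
                     g_s=0.01,         # m/s
                     gamma=66.0,       # Pa/K
                     lambda_v=2450.0)  # J/g
```

As expected from the formula, ET grows with net radiation Rn and with the vapor pressure deficit δq.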
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate result sinusoidal tiles
• Simple nearest neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage
Derivation Reduction Stage
Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
…
<PipelineStage> JobStatus
Persist
<PipelineStage> Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
…
Dispatch
<PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
Generic Worker (Worker Role)
…
Dispatch
<PipelineStage> Task Queue
…
<Input> Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
Reprojection Request …
Service Monitor (Worker Role)
Reprojection JobStatus: Persist
Parse & Persist Reprojection TaskStatus
Generic Worker (Worker Role)
…
Job Queue
…
Dispatch
Task Queue
Points to
…
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
• Each entity specifies a single reprojection job request
• Each entity specifies a single reprojection task (i.e., a single tile)
• Query this table to get geo-metadata (e.g., boundaries) for each swath tile
• Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Analysis Reduction Stage
Derivation Reduction Stage
Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
12
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances
Worker Role Instances
Azure Storage: Blob, Queue, Table
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for service
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• "Cloud Layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V, called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Think of it as an automated IT department
• "Cloud Layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V, called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
  • The configuration definition describes the shape of a service:
    • Role types
    • Role VM sizes
    • External and internal endpoints
    • Local storage
  • The configuration settings configure a service:
    • Instance count
    • Storage keys
    • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers / switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current State
  • Goal State
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee:
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other Web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • Base OS
    • Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action. Perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob: inserts a new blob, overwrites the existing blob
  • GetBlob: get whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
Diagram: blocks of Big.mpg uploaded out of order (1 6 8 3 5 4 7 2), then committed as Big.mpg
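A toy sketch of the block-commit idea (this is not the real storage client API; the class and method names are invented): blocks may be staged in any order, and a final block list fixes their order.

```python
class ToyBlockBlob:
    """Minimal model of a block blob: staged blocks stay uncommitted
    until a Put Block List names them in their final order."""
    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes, in any arrival order
        self.committed = b""

    def put_block(self, block_id, data):
        # Blocks can arrive out of order, even in parallel.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Commit: concatenate the named blocks in the given order.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = ToyBlockBlob()
for block_id, chunk in [("03", b"cc"), ("01", b"aa"), ("02", b"bb")]:
    blob.put_block(block_id, chunk)       # out-of-order upload
blob.put_block_list(["01", "02", "03"])   # ordered commit
```

The separation of staging from commit is what lets the real service upload blocks concurrently and retry individual blocks on failure.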
Pages
• Similar to block blobs
• Optimized for random read/write operations, and provide the ability to write to a range of bytes in a blob
• Call Put Blob, set max size; then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table Name: Movies
  • Star Wars, Star Trek, Fan Boys
• Table Name: Customers
  • Brian H. Prince, Jason Argonaut, Bill Gates
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
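Since a table entity is just a property bag, the three required properties are easy to check; this sketch (the helper and sample values are invented) shows an entity whose remaining schema is entirely its own:

```python
REQUIRED_PROPERTIES = {"PartitionKey", "RowKey", "Timestamp"}

def is_valid_entity(entity):
    """Every entity must carry the three required properties;
    any other properties form its per-entity schema."""
    return REQUIRED_PROPERTIES.issubset(entity)

# Hypothetical movie entity; ReleaseDate is application-specific schema.
movie = {"PartitionKey": "Action",
         "RowKey": "The Bourne Ultimatum",
         "Timestamp": "2010-12-07T00:00:00Z",
         "ReleaseDate": 2007}
```

Two entities in the same table could carry completely different extra properties; only the three required ones are fixed.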
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Every data object has a partition key
  • Different for each data type (blobs, entities, queues)
• Partition key is the unit of scale
  • A partition can be served by a single server
  • System load balances partitions based on traffic pattern
  • Controls entity locality
• System load balances
  • Load balancing can take a few minutes to kick in
  • Can take a couple of seconds for a partition to become available on a different server
• "Server Busy"
  • Use exponential backoff on "Server Busy"
  • Our system load balances to meet your traffic needs
  • Single partition limits have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey
  • Entities with the same PartitionKey value are served from the same partition
• Blobs – Container name + Blob name
  • Every blob and its snapshots are in a single partition
• Messages – Queue name
  • All messages for a single queue belong to the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Container Name | Blob Name
image | annarborbighouse.jpg
image | foxboroughgillette.jpg
video | annarborbighouse.jpg

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has written to all three replicas
• Reads are only load balanced to replicas in sync
Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3
Scalability Targets
Storage Account
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A: Table = Movies [Min – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After the partition range splits:

Server A: Table = Movies [Min – Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns
• Distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy: load balance partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
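The "distribute by using a hash as a prefix" tip above can be sketched in a few lines (the bucket count, format, and helper name are arbitrary choices for this illustration):

```python
import hashlib

def prefixed_partition_key(natural_key, buckets=16):
    """Prepend a stable hash bucket to a naturally append-only key
    (e.g. a timestamp) so that writes spread across partitions
    instead of always hitting the last one."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}_{natural_key}"
```

Range queries then need to fan out across all buckets, which is the usual trade-off for avoiding a hot tail partition.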
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout) / RemoveMessage
Msg 2, Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
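The lifecycle above can be modeled in a few lines of plain Python (a toy, not the real queue service or its client library; time is passed in explicitly so the visibility behavior is easy to see without sleeping):

```python
import uuid

class ToyQueue:
    """Toy model of the message lifecycle: GetMessage hides a message for
    `timeout` seconds and hands back a pop receipt; the delete (RemoveMessage)
    requires that receipt. Undeleted messages reappear after the timeout."""
    def __init__(self):
        self._messages = []

    def put_message(self, body):
        self._messages.append({"id": str(uuid.uuid4()), "body": body,
                               "visible_at": 0.0, "pop_receipt": None,
                               "dequeue_count": 0})

    def get_message(self, now, timeout=30):
        for m in self._messages:
            if m["visible_at"] <= now:           # only visible messages
                m["visible_at"] = now + timeout  # hide until timeout passes
                m["pop_receipt"] = str(uuid.uuid4())
                m["dequeue_count"] += 1
                return m
        return None

    def delete_message(self, msg_id, pop_receipt):
        self._messages = [m for m in self._messages
                          if not (m["id"] == msg_id
                                  and m["pop_receipt"] == pop_receipt)]
```

If a worker crashes between get and delete, the message simply becomes visible again, which is exactly the at-least-once guarantee the slides describe.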
Truncated Exponential Back-Off Polling
• Consider a back-off polling approach
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
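A minimal version of this polling policy (the floor and ceiling values are illustrative):

```python
def next_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off: an empty poll doubles the wait,
    up to a ceiling; a successful poll resets it to the floor."""
    if got_message:
        return floor
    return min(current * 2.0, ceiling)
```

Each worker keeps its own interval, so idle workers quickly back off to the ceiling while busy workers poll at the floor.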
Removing Poison Messages
Producers: P1, P2; Consumers: C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
Producers: P1, P2; Consumers: C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
Producers: P1, P2; Consumers: C1, C2
1. Dequeue(Q, 30 sec) → msg 1
2. Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after Dequeue
7. Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after Dequeue
10. C1 restarted
11. Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
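The DequeueCount check that drives the final delete can be expressed as a small guard in the consumer (the threshold and message shape here are illustrative):

```python
POISON_THRESHOLD = 2

def classify(message):
    """Decide what a consumer should do with a dequeued message.
    A message seen more than POISON_THRESHOLD times is presumed to be
    crashing its consumers, and is deleted instead of re-processed."""
    if message["dequeue_count"] > POISON_THRESHOLD:
        return "delete-as-poison"
    return "process"
```

In practice the poison message would typically be logged or moved to a side queue before deletion, so the failure can be diagnosed.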
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: batch messages; use a blob to store message data, with a reference in the message; garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the costs of having idling VMs
Performance vs. Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
Gzip | Minify JavaScript | Minify CSS | Minify Images
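A quick sketch of the first tip, using Python's standard-library gzip module in place of a server-side compression filter (the sample markup is invented):

```python
import gzip

# Repetitive HTML, typical of generated pages.
html = b"<html>" + b"<div>hello azure</div>" * 500 + b"</html>"

compressed = gzip.compress(html)

# Gzipped output is far smaller for repetitive markup,
# cutting both bandwidth and storage costs.
ratio = len(compressed) / len(html)

# Browsers decompress on the fly; the round trip must be lossless.
assert gzip.decompress(compressed) == html
```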
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), Association for Computing Machinery, 21 June 2010.
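The query-segmentation pattern above can be sketched in a few lines. This is a simulation, not AzureBLAST itself: `blast_partition` is a hypothetical stand-in for invoking NCBI-BLAST, and a thread pool stands in for worker-role instances:

```python
from concurrent.futures import ThreadPoolExecutor

def split_sequences(sequences, partition_size):
    """Splitting task: cut the input sequence list into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    """Stand-in for running NCBI-BLAST on one partition (a real worker
    would shell out to the BLAST binary against the database)."""
    return [f"hit:{seq}" for seq in partition]

def run_query_segmented(sequences, partition_size=100):
    """Split, query partitions in parallel, then merge: the AzureBLAST pattern."""
    partitions = split_sequences(sequences, partition_size)
    with ThreadPoolExecutor() as pool:      # each "worker role" takes a partition
        results = pool.map(blast_partition, partitions)
    merged = []                             # merging task joins partial results
    for partial in results:
        merged.extend(partial)
    return merged
```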
AzureBLAST Task-Flow
A simple Split/Join pattern
Leverage the multi-core capability of each instance
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
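The visibilityTimeout tradeoff can be made concrete with a small simulation (this is not the Azure queue SDK; the function and names are invented for illustration):

```python
import time

def process_with_visibility_timeout(task, run_task, visibility_timeout):
    """Illustrates the visibilityTimeout tradeoff (a simulation).

    The timeout should be a good estimate of the task's run time:
    - too small: the message becomes visible again while a worker is
      still busy, so another worker repeats the computation;
    - too large: if the instance fails, the message stays invisible
      needlessly long before another worker can pick it up.
    """
    start = time.monotonic()
    result = run_task(task)
    elapsed = time.monotonic() - start
    # Would this message have reappeared on the queue mid-run?
    repeated_computation_risk = elapsed > visibility_timeout
    return result, repeated_computation_risk
```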
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
Worker | Worker
Worker | Worker
Worker | Worker
Global dispatch queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
…
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry | NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists
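The desktop estimate is easy to sanity-check with back-of-the-envelope arithmetic:

```python
# Estimated single-desktop run time from the sampling runs.
minutes = 3_216_731
minutes_per_year = 60 * 24 * 365
years = minutes / minutes_per_year
# Works out to roughly six years of nonstop single-desktop compute,
# which is why thousands of cloud cores finish in days instead.
```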
Our Approach
• Allocated a total of ~4,000 cores
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[Deployment map: eight deployments of 50 or 62 VMs each across the four datacenters]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6-8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
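This kind of log analysis is straightforward to automate. A minimal sketch (log lines follow the slide's format; the parsing helper is invented):

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def unfinished_tasks(log_text):
    """Return task ids that were started but never reported done."""
    started, finished = set(), set()
    for line in log_text.splitlines():
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

# Task 251774 started at 8:22 but has no matching "done" record:
# something went wrong on that node (e.g., a failure or restart).
```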
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group: this is an update domain
~30 mins
~6 nodes in one group
35 nodes experienced blob-writing failures at the same time
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed, and the job was killed
A reasonable guess: the fault domain mechanism was at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2); Δ = rate of change of saturation specific humidity with air temperature (Pa K-1); λv = latent heat of vaporization (J g-1); Rn = net radiation (W m-2); cp = specific heat capacity of air (J kg-1 K-1); ρa = dry air density (kg m-3); δq = vapor pressure deficit (Pa); ga = conductivity of air (inverse of ra) (m s-1); gs = conductivity of plant stoma, air (inverse of rs) (m s-1); γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
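A direct transcription of the formula above, using the legend's symbols; the default constants follow the slide (γ ≈ 66 Pa/K) and a typical latent-heat value, and all sample inputs are made up purely for illustration:

```python
def penman_monteith_et(delta, rn, rho_a, cp, dq, ga, gs,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET per the slide's formulation:

        ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

    gamma defaults to the slide's ~66 Pa/K; lambda_v ~2450 J/g is a
    typical latent heat of vaporization. Inputs are illustrative only.
    """
    numerator = delta * rn + rho_a * cp * dq * ga
    denominator = (delta + gamma * (1.0 + ga / gs)) * lambda_v
    return numerator / denominator

# Illustrative (made-up) values for a single grid cell:
et = penman_monteith_et(delta=145.0, rn=400.0, rho_a=1.2, cp=1005.0,
                        dq=800.0, ga=0.02, gs=0.01)
```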
ET Synthesizes Imagery, Sensors, Models, and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage | Derivation Reduction Stage | Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks: recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
…
<PipelineStage>JobStatus
Persist
<PipelineStage>Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage>Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
Dispatch <PipelineStage>Task Queue
…
<Input>Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
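The GenericWorker's dequeue-and-retry behavior can be sketched as follows. This is not the actual MODISAzure code: a deque and a dict stand in for the Azure task queue and status table, and the retry limit follows the slide:

```python
from collections import deque

MAX_RETRIES = 3

def run_generic_worker(task_queue, execute, task_status):
    """Sketch of the GenericWorker loop: dequeue a task, execute it,
    retry failures up to 3 times, and keep task status up to date."""
    while task_queue:
        task = task_queue.popleft()
        status = task_status.setdefault(task, {"attempts": 0, "state": "queued"})
        try:
            execute(task)
            status["state"] = "done"
        except Exception:
            status["attempts"] += 1
            if status["attempts"] < MAX_RETRIES:
                task_queue.append(task)       # requeue for another attempt
            else:
                status["state"] = "failed"    # give up after 3 attempts
```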
Example Pipeline Stage Reprojection Service
Reprojection Request…
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Analysis Reduction Stage | Derivation Reduction Stage | Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload + $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 CPU + $60 download
$216 CPU + $1 download + $6 storage
$216 CPU + $2 download + $9 storage
AzureMODIS Service Web Role Portal
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances | Worker Role Instances
Azure Storage: Blob / Queue / Table
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for services
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
  • The configuration definition describes the shape of a service
    • Role types
    • Role VM sizes
    • External and internal endpoints
    • Local storage
  • The configuration settings configure a service
    • Instance count
    • Storage keys
    • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers and switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model
Using queues for reliable messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
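A toy version of this model, with an in-process queue standing in for an Azure queue (the role functions and job strings are invented):

```python
import queue

# The queue is the glue: the web role enqueues work tickets and returns
# quickly; worker roles process them in the background. Either side can
# scale (or fail and restart) independently of the other.
work_queue = queue.Queue()

def web_role_enqueue(request):
    """Web role front end: accept a request, drop a ticket, respond fast."""
    work_queue.put({"job": request})

def worker_role_drain(results):
    """Worker role: pull tickets off the queue and do the heavy lifting."""
    while not work_queue.empty():
        ticket = work_queue.get()
        results.append(ticket["job"].upper())   # stand-in for real work
        work_queue.task_done()
```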
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differencing VHD
  • Azure runs your VM role using:
    • Base OS
    • Differencing VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you
  • Find hardware homes
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
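As an illustrative sketch of how the two files divide responsibilities (element names approximate the 2010-era schema; the service name, role name, and setting shown are invented, not from a real project):

```xml
<!-- ServiceDefinition.csdef: the service's "shape" -->
<ServiceDefinition name="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebFrontEnd" vmsize="Small">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
    <ConfigurationSettings>
      <Setting name="DataConnectionString" />
    </ConfigurationSettings>
  </WebRole>
</ServiceDefinition>

<!-- ServiceConfiguration.cscfg: the values that configure that shape -->
<ServiceConfiguration serviceName="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebFrontEnd">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="DataConnectionString" value="UseDevelopmentStorage=true" />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

The definition declares what settings exist; the configuration supplies their values and the instance count, which is why instance count can change without redeploying the package.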
Service Definition
Service Configuration
GUI
Double-click the role name in the Azure project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage at Massive Scale
Blob
- Massive files, e.g. videos, logs
Drive
- Use standard file system APIs
Tables
- Non-relational, but with few scale limits
- Use SQL Azure for relational data
Queues
- Facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob
    • Inserts a new blob, overwrites the existing blob
  • GetBlob
    • Get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
- Private
  - Default; will require the account key to access
- Full public read
- Public read only
Two Types of Blobs Under the Hood
Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
Big.mpg: 1 6 8 3 5 4 7 2
Big.mpg
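The block-commit semantics can be modeled in a few lines. This is a toy model, not the Azure storage API: blocks may be uploaded in any order (e.g. in parallel), and the committed block list defines the blob's final byte order:

```python
class BlockBlobSketch:
    """Toy model of block-blob semantics: stage blocks in any order,
    then commit a block list that fixes the final order."""
    def __init__(self):
        self.uncommitted = {}   # staged blocks, keyed by block id
        self.blob = b""         # committed blob contents

    def put_block(self, block_id, data):
        """Put Block: stage one block."""
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        """Put Block List: commit staged blocks in the order given."""
        self.blob = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlobSketch()
# Blocks can arrive out of order (e.g. parallel upload of big.mpg chunks)...
blob.put_block("002", b"world")
blob.put_block("001", b"hello ")
# ...but committing ["001", "002"] assembles them in logical order.
blob.put_block_list(["001", "002"])
```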
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a BLOB
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
    • Drive made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in the .cscfg
• Do not share keys; wrap with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
  Table Name: Movies
    Star Wars | Star Trek | Fan Boys
  Table Name: Customers
    Brian H Prince | Jason Argonaut | Bill Gates
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
    • Billions of entities (rows) and TBs of data
    • Can use thousands of servers as traffic grows
  • Highly available & durable
    • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Single partition limits have been reached
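A sketch of the truncated exponential back-off the slide recommends for "Server Busy" responses (the base delay and cap values below are arbitrary choices, and jitter is left out for clarity):

```python
def backoff_delays(attempts, base=0.5, cap=30.0):
    """Truncated exponential back-off: double the delay on each retry,
    but never exceed `cap` seconds."""
    delays = []
    for attempt in range(attempts):
        delays.append(min(cap, base * (2 ** attempt)))
    return delays

# On "Server Busy" (HTTP 503), sleep delays[n] before the nth retry;
# this gives the storage system time to load balance the hot partition.
```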
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition
Container Name | Blob Name
image | annarborbighouse.jpg
image | foxboroughgillette.jpg
video | annarborbighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition
Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: Server 1, Server 2, Server 3, each holding replicas of partitions P1, P2, …, Pn]
Scalability Targets
Storage Account
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Partitions and Partition Ranges
Server B: Table = Movies [Comedy - Max]
Server A: Table = Movies [Min - Comedy)
Server A: Table = Movies [Min - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may return a continuation token because of:
• A maximum of 1,000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds to execute the query
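The safe pattern is to keep issuing the query until no token comes back. A minimal sketch of that loop, with a toy paging server standing in for the Table service (not SDK code):

```python
def query_all(query_fn, ):
    """Drain a query by following continuation tokens.
    `query_fn(token)` returns (rows, next_token); next_token is None
    when the server has nothing more. Mirrors the Azure Table rule
    that a single response carries at most 1,000 rows."""
    rows, token = [], None
    while True:
        page, token = query_fn(token)
        rows.extend(page)
        if token is None:      # no continuation token: we are done
            return rows

# A toy server that pages a list of 2,500 "entities", 1,000 at a time.
DATA = list(range(2500))

def fake_query(token):
    start = token or 0
    page = DATA[start:start + 1000]
    next_token = start + 1000 if start + 1000 < len(DATA) else None
    return page, next_token
```

Note that a response with zero rows can still carry a token (e.g. at a partition range boundary), which is why the loop tests the token, not the page size.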
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• 'Server busy' means load on a single partition has exceeded the limits; the service will load balance partitions to meet traffic needs
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
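The work ticket pattern mentioned above keeps the payload out of the 8 KB message: the real data goes to blob storage, and only a small reference travels through the queue. A sketch with in-memory stand-ins for the blob and queue services (the names and structures here are illustrative, not the Azure API):

```python
import uuid
from collections import deque

blobs = {}        # stand-in for blob storage
queue = deque()   # stand-in for an Azure queue

def submit_job(payload):
    """Work ticket pattern: store the (possibly large) payload in
    blob storage and enqueue only a small reference to it."""
    blob_name = str(uuid.uuid4())
    blobs[blob_name] = payload
    queue.append(blob_name)       # the 'work ticket' is well under 8 KB
    return blob_name

def process_next():
    """Worker: dequeue a ticket, fetch the payload, do the work,
    then clean up so no orphaned blob is left behind."""
    ticket = queue.popleft()
    payload = blobs.pop(ticket)   # garbage-collect the blob when done
    return len(payload)           # stand-in for real processing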
Queue Terminology
Message Lifecycle
(Diagram: a Web Role calls PutMessage to add Msg 1–Msg 4 to the queue; a Worker Role calls GetMessage with a visibility timeout, processes the message, and then calls RemoveMessage.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach: each empty poll increases the polling interval by 2x, up to a maximum (hence "truncated"); a successful poll sets the interval back to 1.
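The policy above fits in a few lines. A sketch of the interval schedule (the base of 1 s and cap of 60 s are illustrative choices, not mandated values):

```python
def backoff_intervals(polls, base=1.0, cap=60.0):
    """Truncated exponential back-off polling: each empty poll doubles
    the wait (capped at `cap` seconds); a successful poll resets it to
    `base`. `polls` is a sequence of booleans (True = message found)."""
    interval, waits = base, []
    for got_message in polls:
        waits.append(interval)
        interval = base if got_message else min(interval * 2, cap)
    return waits
```

A worker would sleep for each interval between GetMessage calls, which keeps transaction costs low on an idle queue while staying responsive once traffic returns.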
Removing Poison Messages
(Diagram: producers P1 and P2 feed the queue; consumers C1 and C2 dequeue messages. Each message carries a DequeueCount.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1 (DequeueCount = 2)
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1 (DequeueCount = 3)
12. DequeueCount > 2, so msg 1 is treated as a poison message
13. C1: DeleteMessage(Q, msg 1)
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use DequeueCount to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB – use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
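The DequeueCount threshold from the recap can be expressed as a small guard in the worker loop. A sketch (the threshold of 3 and the dict-shaped message are illustrative, though Azure does expose a DequeueCount on each message):

```python
MAX_DEQUEUE = 3   # illustrative threshold: give a message a few tries

def handle(queue_message, process, poison_queue):
    """Enforce a dequeue-count threshold: a message that keeps
    reappearing after crashes is 'poison' and is parked aside
    instead of being retried forever. `queue_message` is a dict
    with 'body' and 'dequeue_count'."""
    if queue_message["dequeue_count"] > MAX_DEQUEUE:
        poison_queue.append(queue_message)   # park it for inspection
        return "poisoned"
    process(queue_message["body"])           # must be idempotent
    return "done"
```

Parking poison messages in a side queue (or table) rather than deleting them outright preserves the evidence for debugging the crash they caused.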
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts, measure, and find what is ideal for you
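The size choice above is ultimately arithmetic. A toy model (throughput = cores^efficiency is an assumption for illustration, as is the hypothetical pricing where an N-core VM costs N times the 1-core hourly rate) makes the break-even visible:

```python
def throughput(cores, efficiency):
    """Aggregate throughput of one VM with `cores` cores;
    efficiency < 1 models sub-linear scaling across cores,
    efficiency > 1 super-linear (e.g. a cache/memory benefit)."""
    return cores ** efficiency

def cost_per_unit(cores, efficiency, price_per_core_hour=0.12):
    """Hourly price divided by throughput: lower is better."""
    return (cores * price_per_core_hour) / throughput(cores, efficiency)
```

Under this model, with perfect linear scaling the sizes tie; with sub-linear scaling many small instances win; with super-linear scaling (as AzureBLAST later observes for extra-large instances) one big VM is cheaper per unit of work.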
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network IO intensive, storage IO intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage IO-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure and poor user experience from not having excess capacity against the cost of idling VMs
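One concrete way to automate the trade-off above is to size the worker pool from the queue backlog (the "use message count to scale" tip from the queues recap). Every number below is a made-up illustration, not a recommended setting:

```python
import math

def target_instances(queue_length, msgs_per_instance_per_min=100,
                     min_instances=2, max_instances=20):
    """Pick a worker-instance count from the current queue backlog.
    A floor keeps spare capacity for spikes (VMs take minutes to
    start); a ceiling bounds cost if the queue runs away."""
    needed = math.ceil(queue_length / msgs_per_instance_per_min)
    return max(min_instances, min(needed, max_instances))
```

Running this periodically against the queue's approximate message count, and only acting when the target changes for several samples in a row, avoids thrashing instances up and down.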
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Diagram: uncompressed vs. compressed content – Gzip/minify JavaScript, minify CSS, minify images.)
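Point 1 above is a one-liner in most stacks. A sketch using Python's standard gzip module (the HTML snippet is an arbitrary example); in a real service the framework sets `Content-Encoding: gzip` when the client advertises `Accept-Encoding: gzip`:

```python
import gzip

def gzip_body(body, level=6):
    """Gzip an HTTP response body: trades a little CPU for less
    bandwidth, which is billed separately from compute."""
    return gzip.compress(body, compresslevel=level)

# Repetitive markup (typical of HTML) compresses extremely well.
page = b"<html>" + b"<li>row</li>" * 1000 + b"</html>"
smaller = gzip_body(page)
```

Compression level 6 is the usual balance; level 9 buys little extra at noticeably more CPU, which matters when the CPU is also metered.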
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially; GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input; segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST); needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance
• argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Too large a partition → load imbalance
• Too small a partition → unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: do test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long waiting in case of an instance failure
(Diagram: a Splitting task fans out into BLAST tasks, which feed a Merging task.)
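The Split/Join pattern above reduces to three small functions. A sketch with a stand-in for the BLAST run itself (the sequence names and hit format are invented for illustration):

```python
def split(sequences, partition_size=100):
    """Query segmentation: slice the input sequences into partitions
    (the micro-benchmarks found ~100 sequences per partition worked
    well for AzureBLAST)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    """Stand-in for running NCBI BLAST on one partition; real tasks
    are independent, so they go onto the dispatch queue."""
    return [(seq, "hit-for-" + seq) for seq in partition]

def merge(results):
    """Join step: concatenate per-partition results in order."""
    merged = []
    for part in results:
        merged.extend(part)
    return merged

sequences = ["seq%d" % i for i in range(250)]
partitions = split(sequences)
merged = merge(blast_task(p) for p in partitions)
```

In the real system each partition becomes a queue message, the visibility timeout approximates the task run time, and the merging task runs once all partitions report done.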
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST Architecture
(Diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine, backed by a job registry in Azure Tables; worker roles pull tasks from a global dispatch queue; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a Database Updating Role keeps the databases current. A Splitting task fans the input out into BLAST tasks, and a Merging task combines their results.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
(Diagram: the Web Portal and Web Service handle job registration; the Job Portal feeds the Job Scheduler, Scaling Engine, and Job Registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (42 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 cores
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
(Diagram: deployments of 50-62 nodes each across the datacenters.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group – this is an update domain (~30 mins, ~6 nodes in one group)
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed, and the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)
Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
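As a sketch, the Penman-Monteith formula above is a direct one-liner in code. The default γ ≈ 66 Pa/K follows the slide's definitions; the default λv ≈ 2450 J/g and the sample inputs in the test are assumptions for illustration only:

```python
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2.45e3):
    """ET = (delta*Rn + rho_a*c_p*dq*g_a) /
            ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    delta, gamma, dq in Pa-based units per the slide's definitions;
    lambda_v in J/g (~2450 J/g near 20 C is an assumed default)."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

The interesting part, as the slide notes, is not the formula but estimating ga and gs across a whole catchment, which is what the imagery pipeline feeds.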
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
(Diagram: the AzureMODIS Service Web Role Portal takes requests from scientists into a Request Queue; the Data Collection Stage pulls source imagery from download sites via a Download Queue; a Reprojection Queue feeds the Reprojection Stage; Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; scientific results are available for download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
(Diagram: <PipelineStage> Requests are persisted as <PipelineStage>JobStatus by the MODISAzure Service (Web Role) into the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: Generic Workers (Worker Roles) pull from the <PipelineStage> Task Queue and read <Input>Data Storage; the Service Monitor parses and persists <PipelineStage>TaskStatus.)
Example Pipeline Stage: Reprojection Service
(Diagram: Reprojection Requests enter the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus – each entity specifies a single reprojection job request – then parses and persists ReprojectionTaskStatus – each entity specifies a single reprojection task, i.e., a single tile – and dispatches to the Task Queue for Generic Workers. Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile; query the ScanTimeList table to get the list of satellite scan times that cover a target tile. Reprojection Data Storage and Swath Source Data Storage hold the tiles.)
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage figures:
• Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best Practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Web Role Instances Worker RoleInstances
Azure StorageBlob Queue Table
SQL Azure
Windows Azure Application Model
Key ComponentsFabric Controller
bull Manages hardware and virtual machines for service
Computebull Web Roles
bull Web application front end
bull Worker Rolesbull Utility compute
bull VM Rolesbull Custom compute rolebull You own and customize the VM
Storagebull Blobs
bull Binary objects
bull Tablesbull Entity storage
bull Queuesbull Role coordination
bull SQL Azurebull SQL in the cloud
Key ComponentsFabric Controller
bull Think of it as an automated IT departmentbull ldquoCloud Layerrdquo on top ofbull Windows Server 2008bull A custom version of Hyper-V called the Windows Azure Hypervisor
bull Allows for automated management of virtual machines
Key ComponentsFabric Controller
bull Think of it as an automated IT departmentbull ldquoCloud Layerrdquo on top ofbull Windows Server 2008bull A custom version of Hyper-V called the Windows Azure Hypervisor
bull Allows for automated management of virtual machines
bull Itrsquos job is to provision deploy monitor and maintain applications in data centers
bull Applications have a ldquoshaperdquo and a ldquoconfigurationrdquobull The configuration definition describes the shape of a service
bull Role typesbull Role VM sizesbull External and internal endpointsbull Local storage
bull The configuration settings configures a servicebull Instance countbull Storage keysbull Application-specific settings
Key ComponentsFabric Controller
bull Manages ldquonodesrdquo and ldquoedgesrdquo in the ldquofabricrdquo (the hardware)bull Power-on automation devicesbull Routers Switchesbull Hardware load balancersbull Physical serversbull Virtual servers
bull State transitionsbull Current Statebull Goal Statebull Does what is needed to reach and maintain the goal state
bull Itrsquos a perfect IT employeebull Never sleepsbull Doesnrsquot ever ask for raisebull Always does what you tell it to do in configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components ndash ComputeWeb Roles
Web Front Endbull Cloud web serverbull Web pagesbull Web services
You can create the following typesbull ASPNET web rolesbull ASPNET MVC 2 web rolesbull WCF service web rolesbull Worker rolesbull CGI-based web roles
Key Components ndash ComputeWorker Roles
bull Utility computebull Windows Server 2008bull Background processingbull Each role can define an amount of local storagebull Protected space on the local drive considered volatile
storage bull May communicate with outside servicesbull Azure Storagebull SQL Azurebull Other Web services
bull Can expose external and internal endpoints
Suggested Application ModelUsing queues for reliable messaging
Scalable Fault Tolerant Applications
Queues are the application gluebull Decouple parts of application easier to scale independentlybull Resource allocation different priority queues and backend
serversbull Mask faults in worker roles (reliable messaging)
Key Components ndash ComputeVM Roles
bull Customized Rolebull You own the box
bull How it worksbull Download ldquoGuest OSrdquo to Server 2008 Hyper-Vbull Customize the OS as you need tobull Upload the differences VHDbull Azure runs your VM role usingbull Base OSbull Differences VHD
Application Hosting
lsquoGrokkingrsquo the service modelbull Imagine white-boarding out your service architecture with boxes for
nodes and arrows describing how they communicate
bull The service model is the same diagram written down in a declarative format
bull You give the Fabric the service model and the binaries that go with each of those nodes
bull The Fabric can provision deploy and manage that diagram for you
bull Find hardware home
bull Copy and launch your app binaries
bull Monitor your app and the hardware
bull In case of failure take action Perhaps even relocate your app
bull At all times the lsquodiagramrsquo stays whole
Automated Service ManagementProvide code + service modelbull Platform identifies and allocates resources deploys the service
manages service healthbull Configuration is handled by two files
ServiceDefinitioncsdefServiceConfigurationcscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
bull We can deploy from the portal or from scriptbull VS builds two filesbull Encrypted package of your codebull Your config file
bull You must create an Azure account then a service and then you deploy your code
bull Can take up to 20 minutes bull (which is better than six months)
Service Management API
bullREST based API to manage your servicesbullX509-certs for authenticationbullLets you create delete change upgrade swaphellipbullLots of community and MSFT-built tools around the API- Easy to roll your own
The Secret Sauce ndash The Fabric The Fabric is the lsquobrainrsquo behind Windows Azure
1Process service model1 Determine resource requirements
2 Create role images
2Allocate resources
3Prepare nodes1 Place role images on nodes
2 Configure settings
3 Start roles
4Configure load balancers
5Maintain service health1 If role fails restart the role based on policy
2 If node fails migrate the role based on policy
StorageReplicated Highly Available Load Balanced
Durable Storage At Massive Scale
Blob- Massive files eg videos logs
Drive- Use standard file system APIs
Tables- Non-relational but with few scale limits- Use SQL Azure for relational data
Queues- Facilitate loosely-coupled reliable systems
Blob Features and Functionsbull Store Large Objects (up to 1TB
in size)
bull You can have as many containers and Blobs as you want
bull Standard REST Interfacebull PutBlob
bull Inserts a new blob overwrites the existing blob
bull GetBlobbull Get whole blob or a specific range
bull DeleteBlobbull CopyBlobbull SnapshotBlobbull LeaseBlob
bull Each Blob has an addressbull httpltstorageaccountgtblobcorewindowsnetltContainergtltBlobNamegtbull httpmovieconversionblobcorewindowsnetoriginalsbargampg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
• Each container has an access level:
  • Private (default): requires the account key to access
  • Full public read
  • Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob

Block blobs in more detail:
• You can upload a file in 'blocks'; each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
(Diagram: Big.mpg is uploaded as blocks, e.g. in the order 1 6 8 3 5 4 7 2, then committed in sequence into the final blob Big.mpg.)
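The Put Block / Put Block List flow above can be sketched in a few lines. This is an illustrative in-memory stand-in for the behavior described, not the real REST API; the class and block-id naming are made up.

```python
class BlockBlobSim:
    """In-memory sketch of the block-blob upload flow: stage blocks, then commit a list."""
    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes (GC'd after ~1 week if never committed)
        self.committed = b""

    def put_block(self, block_id, data):
        # Blocks may arrive in any order, possibly uploaded in parallel
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # The commit order chosen here, not the upload order, defines the blob
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlobSim()
for i, chunk in enumerate([b"AAAA", b"BBBB", b"CCCC"]):
    blob.put_block(f"block-{i:05d}", chunk)
blob.put_block_list(["block-00000", "block-00001", "block-00002"])
assert blob.committed == b"AAAABBBBCCCC"
```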
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
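The 512-byte alignment rule can be checked with a couple of small helpers; `aligned_range` and `pad_to_page` are hypothetical names for illustration, not part of any SDK.

```python
PAGE = 512  # page-blob writes must start and end on 512-byte boundaries

def aligned_range(offset: int, length: int) -> bool:
    """Return True if [offset, offset+length) is a valid page-blob write range."""
    return offset % PAGE == 0 and (offset + length) % PAGE == 0

def pad_to_page(data: bytes) -> bytes:
    """Zero-pad a payload so it can be written as whole 512-byte pages."""
    rem = len(data) % PAGE
    return data if rem == 0 else data + b"\x00" * (PAGE - rem)

assert aligned_range(0, 512)          # one whole page
assert not aligned_range(100, 512)    # misaligned start is rejected
assert len(pad_to_page(b"x" * 700)) == 1024
```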
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence: call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Storage hierarchy: Account → Table → Entity
• Account: MovieData
  • Table "Movies" – entities: Star Wars, Star Trek, Fan Boys
  • Table "Customers" – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language

Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
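The schema-free entity model above can be sketched as plain dictionaries carrying the three required properties; the sample entities and the `Director` field are made up for illustration.

```python
# Entities in one table need not share a schema; only three properties are required.
REQUIRED = {"PartitionKey", "RowKey", "Timestamp"}

def validate(entity: dict) -> bool:
    return REQUIRED <= entity.keys()

movies = [
    {"PartitionKey": "Action", "RowKey": "Fast & Furious",
     "Timestamp": "2009-04-01T00:00:00Z", "ReleaseDate": 2009},
    # A second entity in the same table with an extra, differing property
    {"PartitionKey": "Comedy", "RowKey": "Office Space",
     "Timestamp": "1999-02-19T00:00:00Z", "ReleaseDate": 1999, "Director": "Mike Judge"},
]
assert all(validate(e) for e in movies)

# Entities are addressed by the (PartitionKey, RowKey) pair, which must be unique
index = {(e["PartitionKey"], e["RowKey"]): e for e in movies}
assert index[("Comedy", "Office Space")]["ReleaseDate"] == 1999
```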
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues); every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Partitioning controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
"Server Busy" responses mean either:
• The system is load balancing to meet your traffic needs, or
• Single-partition limits have been reached
Use exponential backoff on "Server Busy"
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order-1               |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order-3               |              |                     | $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition

  Container Name | Blob Name
  image          | annarbor/bighouse.jpg
  image          | foxborough/gillette.jpg
  video          | annarbor/bighouse.jpg

• Messages – Queue name: all messages for a single queue belong to the same partition

  Queue    | Message
  jobs     | Message 1
  jobs     | Message 2
  workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Diagram: Server 1, Server 2, and Server 3 each hold a replica of partitions P1, P2, …, Pn.)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

A single server can serve the whole table:
Server A: Table = Movies [Min – Max]

  PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
  Action                  | Fast & Furious           | …         | 2009
  Action                  | The Bourne Ultimatum     | …         | 2007
  …                       | …                        | …         | …
  Animation               | Open Season 2            | …         | 2009
  Animation               | The Ant Bully            | …         | 2006
  …                       | …                        | …         | …
  Comedy                  | Office Space             | …         | 1999
  …                       | …                        | …         | …
  SciFi                   | X-Men Origins: Wolverine | …         | 2009
  …                       | …                        | …         | …
  War                     | Defiance                 | …         | 2008

Under load, the system splits the table into partition ranges across servers:
Server A: Table = Movies [Min – Comedy) – the Action and Animation partitions
Server B: Table = Movies [Comedy – Max] – the Comedy, SciFi, and War partitions
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity Group Transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query returns a continuation token:
• At a maximum of 1,000 rows in a response
• At the end of a partition range boundary
• At a maximum of 5 seconds of query execution
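The rule of thumb is to loop until the service stops handing back a token. A toy paged query makes the loop shape clear; here the token is just an integer offset, unlike the real service's opaque tokens.

```python
# Hypothetical paged query: each call returns up to `page_size` rows plus a
# continuation token (None once the data is exhausted).
DATA = [f"row-{i}" for i in range(2500)]

def query(continuation=0, page_size=1000):
    page = DATA[continuation:continuation + page_size]
    nxt = continuation + page_size
    return page, (nxt if nxt < len(DATA) else None)

# Always loop on the token; a single call would silently miss 1,500 rows here.
rows, token = [], 0
while token is not None:
    page, token = query(token)
    rows.extend(page)
assert len(rows) == 2500
```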
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Guidelines:
• Select a PartitionKey and RowKey that help scale; avoid "append only" patterns – distribute by using a hash etc. as a prefix
• Always handle continuation tokens – expect them for range queries
• "OR" predicates are not optimized – execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy" – the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML and are limited to 8 KB in size
• Commonly use the work-ticket pattern
• Why not simply use a table?
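The work-ticket pattern mentioned above keeps the large payload in blob storage and puts only a small reference in the queue message, which stays well under the 8 KB limit. A sketch with in-memory stand-ins (the names and layout are illustrative):

```python
blobs = {}       # stand-in for blob storage
queue = []       # stand-in for an Azure queue

def submit_job(job_id: str, payload: bytes):
    blobs[f"jobs/{job_id}"] = payload   # store the big data in a blob
    queue.append(job_id)                # enqueue a tiny ticket referencing it

def worker():
    job_id = queue.pop(0)               # GetMessage
    payload = blobs[f"jobs/{job_id}"]   # dereference the ticket
    return job_id, len(payload)         # ...process, then DeleteMessage

submit_job("job-1", b"x" * 100_000)     # far larger than the 8 KB message limit
assert worker() == ("job-1", 100_000)
```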
Queue Terminology
Message Lifecycle
(Diagram: a Web Role calls PutMessage to add Msg 1–4 to the queue; Worker Roles call GetMessage, with a visibility timeout, to dequeue messages, process them, and then call RemoveMessage to delete them.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x, up to a cap
• A successful poll sets the interval back to 1
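A minimal sketch of the truncated back-off policy, assuming a 1-second floor and a 60-second cap (both values illustrative):

```python
MIN_INTERVAL, MAX_INTERVAL = 1, 60  # seconds; the cap is what makes it "truncated"

def next_interval(current: int, got_message: bool) -> int:
    if got_message:
        return MIN_INTERVAL       # reset on success
    return min(current * 2, MAX_INTERVAL)  # double on empty poll, up to the cap

interval, history = MIN_INTERVAL, []
for got in [False, False, False, True, False]:
    interval = next_interval(interval, got)
    history.append(interval)
assert history == [2, 4, 8, 1, 2]   # doubles, then resets on the successful poll
```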
Removing Poison Messages
(Scenario: producers P1, P2 enqueue messages; consumers C1, C2 dequeue with a 30-second visibility timeout.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2, so treat it as a poison message
13. C1: Delete(Q, msg 1)
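The dequeue-count check in the scenario above can be sketched as follows; the threshold of 3 and the in-memory queue are illustrative stand-ins.

```python
MAX_DEQUEUES = 3  # beyond this, a message is considered poison

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def process(queue, handler):
    msg = queue.pop(0)                  # GetMessage
    msg.dequeue_count += 1
    if msg.dequeue_count > MAX_DEQUEUES:
        return "deleted-as-poison"      # delete (or dead-letter) instead of retrying forever
    try:
        handler(msg.body)
        return "done"                   # DeleteMessage on success
    except Exception:
        queue.append(msg)               # simulates the message becoming visible again
        return "requeued"

q = [Message("bad payload")]
def handler(body):
    raise ValueError("cannot parse")    # always fails: a poison message

results = [process(q, handler) for _ in range(4)]
assert results == ["requeued", "requeued", "requeued", "deleted-as-poison"]
```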
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use DequeueCount to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
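The slide recommends .NET 4's Task Parallel Library; the same core idea, sizing a worker pool to the core count so active tasks do not exceed cores, can be sketched in Python:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def cpu_bound(n: int) -> int:
    return sum(i * i for i in range(n))

# Data parallelism: one pool sized to the core count, mapped over the workload.
# (In CPython, use ProcessPoolExecutor for truly CPU-bound work; threads shown
# here to keep the sketch self-contained.)
workloads = [10_000] * 8
with ThreadPoolExecutor(max_workers=os.cpu_count() or 4) as pool:
    results = list(pool.map(cpu_bound, workloads))

assert results == [cpu_bound(10_000)] * 8
```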
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
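The compute-for-bandwidth trade is easy to see with the standard gzip module; the repetitive sample HTML below is made up for illustration.

```python
import gzip

# Web output is typically highly repetitive markup, so it compresses very well.
html = b"<div class='row'>hello</div>" * 500
packed = gzip.compress(html)

assert len(packed) < len(html) // 10          # a >10x reduction on this input
assert gzip.decompress(packed) == html        # lossless round-trip
```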
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST) – needs special result-reduction processing
Large-volume data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
AzureBLAST Task Flow: a simple split/join pattern
• Leverage the multiple cores of one instance: the "-a" argument of NCBI-BLAST, set to 1/2/4/8 for small, medium, large, and extra-large instance sizes
• Task granularity:
  • Large partitions → load imbalance
  • Small partitions → unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
  • Best practice: run test jobs to profile, and set the size to mitigate the overhead
• Value of the visibility timeout for each BLAST task: essentially an estimate of the task run time
  • Too small → repeated computation
  • Too large → unnecessarily long waits in case of instance failure
(Diagram: splitting task → BLAST tasks in parallel → merging task)
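The split/join pattern above can be sketched as follows; `split`, `blast_task`, and `merge` are illustrative stand-ins, not AzureBLAST code.

```python
def split(sequences, partition_size):
    """Split the input sequences into fixed-size partitions (the 'map' side)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    # Stand-in for running NCBI-BLAST on one partition (done in parallel by workers)
    return [f"hit:{seq}" for seq in partition]

def merge(results):
    """Join the per-partition results back into one list (the 'reduce' side)."""
    return [hit for part in results for hit in part]

sequences = [f"seq{i}" for i in range(250)]
partitions = split(sequences, 100)   # the benchmarks below suggest ~100 per partition
assert [len(p) for p in partitions] == [100, 100, 50]
hits = merge(blast_task(p) for p in partitions)
assert len(hits) == 250
```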
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm-cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger worker instances
  • Primarily due to the memory capacity
• Task size / instance size vs. cost
  • The extra-large instance generated the best and most economical throughput
  • Fully utilize the resource
AzureBLAST architecture (diagram):
• Web Role: web portal and web service for job registration
• Job Management Role: job scheduler and scaling engine, with a job registry in Azure Tables
• Worker roles: pull BLAST tasks from a global dispatch queue
• Database-updating role: keeps the NCBI databases current
• Azure Blob storage: NCBI databases, BLAST databases, temporary data, etc.
AzureBLAST Job Portal
An ASP.NET program hosted by a web-role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
(Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
(Diagram: instance counts per deployment: 50, 62, 62, 62, 62, 62, 50, 62.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14  RD00155D3611B0  Executing the task 251523
3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0  Executing the task 251553
3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0  Executing the task 251600
3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. a task failed to complete):
3/31/2010 8:22  RD00155D3611B0  Executing the task 251774
3/31/2010 9:50  RD00155D3611B0  Executing the task 251895
3/31/2010 11:12 RD00155D3611B0  Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~6 nodes in each group
• Each group was offline for ~30 mins
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the Fault Domain is at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

  ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
1. Data collection (map) stage
   • Downloads requested input tiles from NASA FTP sites
   • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
   • Converts source tile(s) to intermediate-result sinusoidal tiles
   • Simple nearest-neighbor or spline algorithms
3. Derivation reduction stage
   • First stage visible to scientists
   • Computes ET in our initial use
4. Analysis reduction stage
   • Optional second stage visible to scientists
   • Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: scientists submit requests through the AzureMODIS Service web-role portal; requests flow through the Request, Download, Reprojection, Reduction 1, and Reduction 2 queues across the data collection, reprojection, derivation reduction, and analysis reduction stages; source imagery comes from the download sites via source metadata, and scientific results are available for download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses it, persists <PipelineStage>TaskStatus, and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role (GenericWorker):
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker instances dequeue the tasks and read from <Input>Data Storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the Job Queue; the Service Monitor persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches tasks to the Task Queue; GenericWorker instances process the tasks against Reprojection Data Storage and Swath Source Data Storage.)
• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by the data scale and the need to run the reduction multiple times
• Storage costs driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per stage (via the AzureMODIS Service web-role portal):
• Data collection: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20–100 workers – $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files, 1800 hours, 20–100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20–100 workers – $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances | Worker Role Instances
Azure Storage (Blob, Queue, Table)
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for service
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components – Fabric Controller
• Think of it as an automated IT department
• "Cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V, called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service:
  • Instance count
  • Storage keys
  • Application-specific settings
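As a rough sketch of how the two files divide the work (the role names, sizes and setting values here are illustrative, not from the deck):

```xml
<!-- ServiceDefinition.csdef: the "shape" of the service -->
<ServiceDefinition name="MyService">
  <WebRole name="WebFrontEnd" vmsize="Small">
    <Endpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </Endpoints>
  </WebRole>
  <WorkerRole name="BackEnd" vmsize="Medium" />
</ServiceDefinition>

<!-- ServiceConfiguration.cscfg: the settings for one deployment -->
<ServiceConfiguration serviceName="MyService">
  <Role name="WebFrontEnd">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="DataConnectionString" value="..." />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

Because the instance count and settings live in the .cscfg, they can be changed without rebuilding or redeploying the application package.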
Key Components – Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware):
  • Power-on automation devices
  • Routers and switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions:
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee:
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services:
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, fault-tolerant applications
Queues are the application glue:
• Decouple parts of the application, so they are easier to scale independently
• Resource allocation: different priority queues and back-end servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differencing VHD
  • Azure runs your VM role using:
    • Base OS
    • Differencing VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy and manage that diagram for you:
  • Find hardware homes
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   • Determine resource requirements
   • Create role images
2. Allocate resources
3. Prepare nodes
   • Place role images on nodes
   • Configure settings
   • Start roles
4. Configure load balancers
5. Maintain service health
   • If a role fails, restart the role based on policy
   • If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob
    • Inserts a new blob, overwrites the existing blob
  • GetBlob
    • Get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
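The address scheme is simple enough to build by hand; a small helper (illustrative, not part of any SDK) that reproduces the slide's example:

```python
def blob_url(account: str, container: str, blob_name: str) -> str:
    """Build a blob address using the
    http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName> scheme."""
    return f"http://{account}.blob.core.windows.net/{container}/{blob_name}"

# The example from the slide:
url = blob_url("movieconversion", "originals", "barga.mpg")
```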
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private
  • Default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
    • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
    • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Figure: blocks 1 6 8 3 5 4 7 2 of Big.mpg are uploaded, then committed in order into Big.mpg]
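A sketch of the block mechanics as a local simulation (not real Put Block / Put Block List calls): blocks carry base64-encoded IDs of equal length, and the commit list, not the upload order, decides the final byte sequence.

```python
import base64

def split_into_blocks(data: bytes, block_size: int):
    """Split a payload into (block_id, chunk) pairs; block IDs are
    base64-encoded and, within one blob, all the same length."""
    blocks = []
    for i in range(0, len(data), block_size):
        block_id = base64.b64encode(f"{i // block_size:08d}".encode()).decode()
        blocks.append((block_id, data[i:i + block_size]))
    return blocks

def commit_block_list(blocks, order):
    """Simulate committing a block list: assemble the blob from block IDs
    in the order given, which need not be the upload order."""
    by_id = dict(blocks)
    return b"".join(by_id[bid] for bid in order)

blocks = split_into_blocks(b"abcdefghij", 4)     # three blocks: abcd, efgh, ij
ids = [bid for bid, _ in blocks]
blob = commit_block_list(blocks, ids)            # commit in upload order
```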
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
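Since page writes must start and end on 512-byte boundaries, callers typically pad their buffers before writing; a minimal helper sketching that:

```python
PAGE = 512  # page-blob writes must align to 512-byte page boundaries

def pad_to_page(data: bytes) -> bytes:
    """Zero-pad a buffer so its length is a multiple of the 512-byte page size."""
    remainder = len(data) % PAGE
    return data if remainder == 0 else data + b"\x00" * (PAGE - remainder)
```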
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
[Figure: a storage Account (MovieData) holds Tables; the table "Movies" holds entities such as Star Wars, Star Trek, Fan Boys; the table "Customers" holds Brian H. Prince, Jason Argonaut, Bill Gates]
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
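In other words, an entity is just a property bag; only the three required properties are fixed, and entities in the same table need not share a schema. A small sketch (entity values are illustrative) of how PartitionKey groups entities:

```python
from collections import defaultdict

# Entities in one table need not share a schema; only PartitionKey,
# RowKey and Timestamp are required (Timestamp omitted here for brevity).
entities = [
    {"PartitionKey": "Action",    "RowKey": "Fast & Furious",       "ReleaseDate": 2009},
    {"PartitionKey": "Action",    "RowKey": "The Bourne Ultimatum", "ReleaseDate": 2007},
    {"PartitionKey": "Animation", "RowKey": "Open Season 2",        "Rating": "PG"},
]

def group_by_partition(entities):
    """Entities with the same PartitionKey value are served from the same partition."""
    partitions = defaultdict(list)
    for e in entities:
        partitions[e["PartitionKey"]].append(e["RowKey"])
    return dict(partitions)
```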
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues)
Every data object has a partition key:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• Single-partition limits may have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Figure: Server 1, Server 2 and Server 3 each hold the partitions P1, P2, …, Pn]
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A – Table = Movies [Min – Max]:

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

When the partition range splits across servers:

Server A – Table = Movies [Min – Comedy):

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B – Table = Movies [Comedy – Max]:

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information

Expect Continuation Tokens – Seriously
A query response may stop early:
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
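Because any of these limits can cut a response short, query code should always loop until no continuation token is returned. A sketch of that loop, where `execute_query` is a stand-in for whatever segmented query API you use (it takes a token, and returns the next rows plus the next token, or `None` when done):

```python
def query_all(execute_query):
    """Drain a segmented query by following continuation tokens."""
    rows, token = execute_query(None)
    results = list(rows)
    while token is not None:          # keep going until no token comes back
        rows, token = execute_query(token)
        results.extend(rows)
    return results

# Fake paged source for illustration: segments of at most 2 rows.
data = list(range(5))
def fake_query(token):
    start = token or 0
    segment = data[start:start + 2]
    next_token = start + 2 if start + 2 < len(data) else None
    return segment, next_token
```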
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
  • Distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries
  • "Server busy": partitions are load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Figure: a Web Role calls PutMessage to add messages to the queue; Worker Roles call GetMessage (with a timeout), and RemoveMessage once the work is done]

PutMessage:
POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

RemoveMessage:
DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
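The lifecycle semantics can be sketched with a toy in-memory model (illustrative only, no real queue service involved): GetMessage hides a message until its visibility timeout expires, and if the consumer never deletes it, the message reappears — which is exactly what gives at-least-once processing.

```python
import itertools

class ToyQueue:
    """In-memory model of the Get/Delete message lifecycle."""
    def __init__(self):
        self._messages = []               # each entry: [id, body, visible_at]
        self._ids = itertools.count()

    def put(self, body, now=0):
        self._messages.append([next(self._ids), body, now])

    def get(self, now, visibility_timeout=30):
        """Return the first visible message and hide it for the timeout."""
        for m in self._messages:
            if m[2] <= now:               # visible at this moment?
                m[2] = now + visibility_timeout
                return m[0], m[1]
        return None

    def delete(self, msg_id):
        """RemoveMessage: only an explicit delete ends the lifecycle."""
        self._messages = [m for m in self._messages if m[0] != msg_id]
```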
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll doubles the polling interval, truncated at a maximum; a successful poll sets the interval back to 1.
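The policy above fits in a few lines; this sketch uses a floor of 1 second and a ceiling of 60 seconds (the ceiling is a tunable choice, not a service requirement):

```python
def next_interval(current, empty_poll, floor=1, ceiling=60):
    """Truncated exponential back-off for queue polling: double the sleep
    on an empty poll (capped at `ceiling` seconds), reset to `floor`
    as soon as a message is found."""
    if empty_poll:
        return min(current * 2, ceiling)
    return floor
```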
Removing Poison Messages
[Figure: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
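The poison-message rule in the recap can be sketched as a guard around the processing step (threshold and dead-letter handling are policy choices, named here for illustration):

```python
MAX_DEQUEUE_COUNT = 3  # the threshold is a policy choice

def handle(message, dequeue_count, process, dead_letter):
    """A message that keeps reappearing has likely crashed its consumers;
    past the threshold, set it aside instead of processing it again."""
    if dequeue_count > MAX_DEQUEUE_COUNT:
        dead_letter(message)   # e.g. log it, or park it in a blob/table
        return "discarded"
    process(message)
    return "processed"
```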
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
• http://blogs.msdn.com/windowsazurestorage
• http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately (Performance vs. Cost)
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • The service choice can make a big cost difference, based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile:
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
[Figure: uncompressed content becomes compressed content via Gzip, minified JavaScript, minified CSS and minified images]
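A quick illustration of the gzip trade-off, using Python's standard library (the HTML payload is made up; repetitive markup compresses dramatically):

```python
import gzip

html = b"<html>" + b"<p>hello world</p>" * 100 + b"</html>"

# Spend a little CPU to shrink what goes over the wire and into storage.
compressed = gzip.compress(html)
ratio = len(compressed) / len(html)   # well under 1 for repetitive markup
```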
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
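Query segmentation is the simplest of these to sketch: split the input sequences into fixed-size partitions that can be queried in parallel and merged afterwards (the 100-sequences-per-partition figure echoes the micro-benchmark result later in the deck):

```python
def partition_queries(sequences, per_partition=100):
    """Split input sequences into fixed-size partitions; each partition
    can be BLASTed independently and the results merged when all finish."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]
```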
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga. "AzureBlast: A Case Study of Developing Science Applications on the Cloud." Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out BLAST tasks, and a merging task combines their results.
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for the small, medium, large and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of the visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting time in case of an instance failure
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost:
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine against a job registry kept in Azure Tables; worker roles pull work from a global dispatch queue; Azure Blob storage holds the NCBI databases, BLAST databases and temporary data; a dedicated role handles database updating]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table:
• Fault tolerance: avoid in-memory states
[Figure: the web portal and web service hand jobs to job registration; the job scheduler and scaling engine work from the job registry]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on a sampling run on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
[Figure: instances per deployment: 50, 62, 62, 62, 62, 62, 50, 62]

End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, the real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
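Finding the anomalies amounts to pairing each "Executing the task N" record with its "Execution of task N is done" record; unpaired tasks never completed. A sketch of that pass over the logs (the sample lines below mirror the records above):

```python
import re

def unfinished_tasks(log_lines):
    """Return the IDs of tasks that were started but never reported done."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

log = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 251523",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
]
```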
Surviving System Upgrades
North Europe Data Center: in total, 34,256 tasks processed
[Figure: all 62 compute nodes lost tasks and then came back in groups of ~6 nodes over ~30 minutes; this is an update domain]

Surviving Storage Failures
West Europe Data Center: 30,976 tasks were completed, and the job was killed
[Figure: 35 nodes experienced blob-writing failures at the same time]
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration or evaporation through plant membranes.

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
…
<PipelineStage> JobStatus Persist
<PipelineStage> Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
…
Dispatch <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
GenericWorker (Worker Role)
…
…
Dispatch <PipelineStage> Task Queue
…
<Input> Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
Reprojection Request …
Service Monitor (Worker Role)
Reprojection JobStatus Persist
Parse & Persist Reprojection TaskStatus
GenericWorker (Worker Role)
…
Job Queue
…
Dispatch
Task Queue
Points to
…
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20–100 workers
5–7 GB, 55K files, 1800 hours, 20–100 workers
<10 GB, ~1K files, 1800 hours, 20–100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
12
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances
Worker Role Instances
Azure Storage (Blob, Queue, Table)
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for the service
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• "Cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Think of it as an automated IT department
• "Cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service:
  • Instance count
  • Storage keys
  • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers / switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's the perfect IT employee
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application, making them easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
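The web-role/worker-role pattern above can be sketched in a few lines. This is an in-process stand-in (the `queue` module here replaces an Azure Queue, and the function names are illustrative): the front end only enqueues a small work ticket, and a worker polls and processes it independently.

```python
import queue

# Stand-in for an Azure Queue; in a real app this would be REST calls
# against the queue service.
work_queue = queue.Queue()

def web_role_enqueue(job_id):
    # The web role records a work ticket, not the work itself,
    # so it stays fast and the worker tier can scale independently.
    work_queue.put({"job": job_id})

def worker_role_poll():
    # A worker role dequeues one ticket and processes it;
    # returns None when the queue is empty.
    try:
        msg = work_queue.get_nowait()
    except queue.Empty:
        return None
    return "processed:{}".format(msg["job"])
```

The decoupling is the point: the producer never waits on the consumer, and either tier can be scaled or restarted without the other noticing.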
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differencing VHD
  • Azure runs your VM role using:
    • Base OS
    • Differencing VHD
Application Hosting
'Grokking' the Service Model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action – perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob – massive files, e.g., videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
(diagram: Big.mpg uploaded as blocks 1 6 8 3 5 4 7 2, then committed in order into Big.mpg)
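The block-blob semantics described above can be captured in a toy model: blocks are uploaded in any order, and a final block list fixes the order that gets committed. The class and method names below mirror the service's Put Block / Put Block List operations but are purely illustrative; this is not the Azure client library.

```python
class BlockBlob:
    """Toy model of block-blob commit semantics."""

    def __init__(self):
        self._uncommitted = {}   # block_id -> bytes, upload order irrelevant
        self._committed = b""

    def put_block(self, block_id, data):
        # Blocks can arrive in any order, even in parallel.
        self._uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # The commit fixes the final order; uncommitted leftovers
        # are eventually garbage collected by the real service.
        self._committed = b"".join(self._uncommitted[b] for b in block_ids)
        self._uncommitted = {}

    def content(self):
        return self._committed
```

The design point this illustrates: because ordering is decided only at commit time, many workers can upload disjoint blocks concurrently and a single Put Block List assembles the result.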
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
Table name: Movies (entities: Star Wars, Star Trek, Fan Boys)
Table name: Customers (entities: Brian H. Prince, Jason Argonaut, Bill Gates)
Hierarchy: Account → Table → Entity
Tables store entities; entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Or the single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name         | CreditCardNumber    | OrderTotal
1                         | Customer         | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1        |              |                     | $35.12
2                         | Customer         | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3        |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
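The partitioning rules above can be condensed into one small helper. This is a sketch of the rules, not the storage service's internals; the function name and the dictionary shape are illustrative.

```python
def partition_key(kind, **names):
    """Return the tuple that identifies an object's partition,
    following the slide's rules for each storage abstraction."""
    if kind == "entity":
        # Entities: table name + PartitionKey value
        return (names["table"], names["partition_key"])
    if kind == "blob":
        # Every blob (and its snapshots) lives in its own partition
        return (names["container"], names["blob"])
    if kind == "message":
        # All messages of one queue share a single partition
        return (names["queue"],)
    raise ValueError("unknown abstraction: %s" % kind)
```

This makes the scaling consequence concrete: two entities in the same table with the same PartitionKey can never be split across servers, while two blobs always can.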
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(diagram: Server 1, Server 2, Server 3, each holding replicas of partitions P1, P2, …, Pn)
Scalability Targets
Storage Account
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges
Initially one server holds the whole table:
Server A: Table = Movies [Min – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After load balancing, the partition range splits across servers:
Server A: Table = Movies [Min – Comedy) – the Action and Animation partitions
Server B: Table = Movies [Comedy – Max] – the Comedy, SciFi, and War partitions
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A continuation token is returned when:
• The response reaches the maximum of 1000 rows
• The query reaches the end of a partition range boundary
• The query reaches the maximum of 5 seconds to execute
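The client-side consequence of the rules above is a loop that keeps issuing the query until no token comes back. The sketch below simulates the server's 1000-row page limit with illustrative function names; a real client would pass the token back through the REST/OData query.

```python
PAGE_LIMIT = 1000  # the per-response row cap from the slide

def query_page(rows, token=0):
    """Simulated server: return one page and a continuation token,
    or None for the token when no rows remain."""
    page = rows[token:token + PAGE_LIMIT]
    next_token = token + PAGE_LIMIT if token + PAGE_LIMIT < len(rows) else None
    return page, next_token

def query_all(rows):
    """The client pattern: always loop until the token is exhausted."""
    results, token = [], 0
    while token is not None:
        page, token = query_page(rows, token)
        results.extend(page)
    return results
```

A client that ignores the token silently sees only the first 1000 rows, which is exactly the bug the slide's "Seriously" is warning about.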
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns – distribute by using a hash etc. as a prefix
• Always handle continuation tokens – expect them for range queries
• "OR" predicates are not optimized – execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries – "Server Busy" means the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
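The "distribute by using a hash as prefix" advice can be shown concretely: prefixing an append-only key (such as a timestamp) with a small hash bucket spreads sequential writes across partitions. The bucket count and key format below are illustrative choices, not anything the platform prescribes.

```python
import hashlib

BUCKETS = 16  # illustrative; pick to match how far you want writes spread

def prefixed_key(natural_key):
    """Prepend a stable hash bucket so sequential keys land in
    different partitions instead of hammering the last one."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % BUCKETS
    return "{:02d}-{}".format(bucket, natural_key)
```

The trade-off is that range queries over the natural key now require one query per bucket, so this suits write-heavy, point-read workloads.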
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout) / RemoveMessage
Msg 2, Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
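The GetMessage/Delete lifecycle shown above can be modeled as a toy queue: a dequeued message becomes invisible for the timeout and reappears if it is not deleted in time. The class is illustrative (the real service works over REST, as in the trace above); the clock is passed in so the behavior is deterministic.

```python
class ToyQueue:
    """Toy model of the visibility-timeout message lifecycle."""

    def __init__(self):
        self._msgs = {}   # message id -> (body, visible_at)
        self._next = 0

    def put(self, body):
        self._msgs[self._next] = (body, 0.0)
        self._next += 1

    def get(self, now, timeout=30.0):
        # Return the first visible message and hide it for `timeout`
        # seconds, mimicking GetMessage; None if nothing is visible.
        for mid, (body, visible_at) in sorted(self._msgs.items()):
            if now >= visible_at:
                self._msgs[mid] = (body, now + timeout)
                return mid, body
        return None

    def delete(self, mid):
        # RemoveMessage: only an explicit delete ends the lifecycle.
        self._msgs.pop(mid, None)
```

The key property to notice: a consumer that crashes mid-processing simply lets the timeout expire, and the message becomes visible to another consumer, which is why processing must be idempotent.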
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
(diagram: consumers C1 and C2 polling the queue, intervals growing 1, 2, … up to 60)
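The polling strategy above fits in one function. The minimum and cap values are illustrative defaults matching the diagram's 1…60 range.

```python
def next_interval(current, got_message, minimum=1.0, maximum=60.0):
    """Truncated exponential back-off for queue polling:
    double the wait on an empty poll (up to a cap), reset on success."""
    if got_message:
        return minimum
    return min(current * 2.0, maximum)
```

In a worker loop, the interval is threaded through each poll: `interval = next_interval(interval, queue_poll_succeeded)` followed by a sleep of `interval` seconds when the poll was empty.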
Removing Poison Messages
(diagram: producers P1 and P2 feeding queue Q; consumers C1 and C2)
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
(diagram: producers P1 and P2 feeding queue Q; consumers C1 and C2)
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
(diagram: producers P1 and P2 feeding queue Q; consumers C1 and C2)
1. Dequeue(Q, 30 sec) → msg 1
2. Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
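The rule applied at steps 12–13 above is simple to express: check the message's dequeue count before processing and drop messages that keep reappearing. The threshold and function shape are illustrative; a real consumer would read DequeueCount from the queue message.

```python
POISON_THRESHOLD = 2  # matches "DequeueCount > 2" in the sequence above

def handle(message, dequeue_count, process):
    """Returns 'deleted', 'processed', or 'requeued'."""
    if dequeue_count > POISON_THRESHOLD:
        # Poison message: delete instead of retrying forever
        # (or move it to a dead-letter store for inspection).
        return "deleted"
    try:
        process(message)
        return "processed"
    except Exception:
        # Do nothing: the message becomes visible again after
        # the visibility timeout and will be retried.
        return "requeued"
```

Without this check, a message whose processing always crashes the consumer would cycle through the queue indefinitely, taking a worker down on every retry.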
Queues Recap
• No need to deal with failures – make message processing idempotent
• Invisible messages result in out-of-order delivery – do not rely on order
• Enforce a threshold on a message's dequeue count – use the dequeue count to remove poison messages
• Messages > 8 KB – use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Dynamically increase/reduce workers – use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
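The slide's data-parallelism advice (Task Parallel Library in .NET 4) has the same shape in Python's standard library. A minimal sketch, with illustrative names; worker count should be tuned to the instance's core count, as the bullets above caution.

```python
from concurrent.futures import ThreadPoolExecutor

def process_all(items, worker, max_workers=4):
    """Apply `worker` to every item using a thread pool;
    results come back in the original order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, items))
```

For example, `process_all(tile_ids, download_tile)` keeps a worker role's cores busy on I/O-bound tasks instead of idling while one request completes at a time.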
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure / poor user experience due to not having excess capacity, and the costs of having idling VMs
(trade-off: performance vs. cost)
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
(diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), Association for Computing Machinery, 21 June 2010
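The query-segmentation pattern reduces to split / parallel query / merge. A minimal in-process sketch in Python (the `run_blast` stand-in and partition size are ours; the real engine shells out to NCBI-BLAST from worker roles and coordinates through Azure queues):

```python
def split_queries(fasta_lines, per_partition):
    """Split FASTA input into partitions of at most `per_partition` sequences."""
    partitions, current, n_seqs = [], [], 0
    for line in fasta_lines:
        if line.startswith(">"):          # each ">" header starts a new sequence
            if n_seqs == per_partition:
                partitions.append(current)
                current, n_seqs = [], 0
            n_seqs += 1
        current.append(line)
    if current:
        partitions.append(current)
    return partitions

def run_blast(partition):
    # Stand-in for one worker-role task; the real task invokes NCBI-BLAST.
    return [line[1:] + ":hit" for line in partition if line.startswith(">")]

def merge_results(per_partition_results):
    # The merging task concatenates per-partition outputs in partition order.
    return [hit for result in per_partition_results for hit in result]

fasta = []
for i in range(7):
    fasta += [">seq%d" % i, "ACGT"]

parts = split_queries(fasta, per_partition=3)       # partitions of 3, 3, 1 sequences
hits = merge_results(run_blast(p) for p in parts)
```

In the service, each partition becomes a queue message consumed by a worker; the merge runs once all partition results are in blob storage.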
AzureBLAST Task Flow: a simple split/join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
• NCBI-BLAST overhead
• Data-transfer overhead
Best practice: test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waits in case of instance failure
[Diagram: a splitting task fans out to parallel BLAST tasks, which feed a merging task]
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine, persisting the job registry in Azure Tables; tasks go onto a global dispatch queue consumed by worker instances; a database updating role refreshes the NCBI databases; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.]
AzureBLAST Job Portal
ASP.NET program hosted by a web-role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences in total to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it manually
[Map: per-deployment instance counts of 50 or 62 nodes across the four datacenters]
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
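The failure pattern (a task that logs "Executing" but never "is done") can be detected mechanically. A minimal Python sketch over records paraphrased from the log excerpt above; the parsing relies only on the two fixed phrases:

```python
def find_incomplete_tasks(log_lines):
    """Return ids of tasks that logged 'Executing' but never 'is done'."""
    started, finished = set(), set()
    for line in log_lines:
        if "Executing the task" in line:
            started.add(line.split("Executing the task", 1)[1].split()[0])
        elif "Execution of task" in line:
            finished.add(line.split("Execution of task", 1)[1].split()[0])
    return started - finished

log = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 251523",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",   # never completes
]
missing = find_incomplete_tasks(log)
```

Running this over all node logs is how the lost-task groupings in the next slides were surfaced.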
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups; this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) from plants
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests through the AzureMODIS Service web-role portal and request queue; the download queue feeds the data collection stage (source imagery download sites, source metadata); the reprojection queue feeds the reprojection stage; reduction 1 and reduction 2 queues feed the derivation reduction and analysis reduction stages; science results are available for scientific results download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks, recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and places it on the <PipelineStage> job queue; the Service Monitor (Worker Role) parses it, persists <PipelineStage>TaskStatus, and dispatches to the <PipelineStage> task queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches the <PipelineStage> task queue to GenericWorker (Worker Role) instances, which read <Input> data storage]
Example Pipeline Stage: Reprojection Service
[Diagram: reprojection requests enter the job queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus (each entity specifies a single reprojection job request), parses and persists ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e., a single tile), and dispatches to the task queue consumed by GenericWorker (Worker Role) instances; workers query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile and the ScanTimeList table for the satellite scan times that cover a target tile, reading swath source data storage and writing reprojection data storage]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage data volumes and costs (as annotated on the pipeline diagram):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3,500 hours, 20-100 workers; $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Application Model Comparison
Ad hoc application model: machines running IIS/ASP.NET, machines running Windows Services, machines running SQL Server
Windows Azure application model: Web Role instances, Worker Role instances, Azure Storage (Blob, Queue, Table), SQL Azure
Key Components
Fabric Controller
• Manages hardware and virtual machines for service
Compute
• Web Roles
• Web application front end
• Worker Roles
• Utility compute
• VM Roles
• Custom compute role; you own and customize the VM
Storage
• Blobs
• Binary objects
• Tables
• Entity storage
• Queues
• Role coordination
• SQL Azure
• SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
• Windows Server 2008
• A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service
• Role types
• Role VM sizes
• External and internal endpoints
• Local storage
• The configuration settings configure a service
• Instance count
• Storage keys
• Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
• Power-on automation devices
• Routers, switches
• Hardware load balancers
• Physical servers
• Virtual servers
• State transitions
• Current state
• Goal state
• Does what is needed to reach and maintain the goal state
• It's a perfect IT employee
• Never sleeps
• Doesn't ever ask for a raise
• Always does what you tell it to do in configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
• Protected space on the local drive, considered volatile storage
• May communicate with outside services
• Azure Storage
• SQL Azure
• Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
• You own the box
• How it works:
• Download "Guest OS" to Server 2008 Hyper-V
• Customize the OS as you need to
• Upload the differencing VHD
• Azure runs your VM role using
• Base OS
• Differencing VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you
• Find hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
• ServiceDefinition.csdef
• ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• Encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
• (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure
1. Process service model
• Determine resource requirements
• Create role images
2. Allocate resources
3. Prepare nodes
• Place role images on nodes
• Configure settings
• Start roles
4. Configure load balancers
5. Maintain service health
• If a role fails, restart the role based on policy
• If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable storage, at massive scale
• Blob: massive files, e.g., videos, logs
• Drive: use standard file-system APIs
• Tables: non-relational, but with few scale limits; use SQL Azure for relational data
• Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
• PutBlob
• Inserts a new blob, overwrites the existing blob
• GetBlob
• Get whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private
• Default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: Big.mpg uploaded as blocks (in the order 1, 6, 8, 3, 5, 4, 7, 2) and committed as Big.mpg]
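The block semantics can be modeled in a few lines. This is a toy in-memory model (ours, not the storage client library or REST API) showing that commit order, not upload order, defines the blob:

```python
import base64

class BlockBlobModel:
    """Toy in-memory model of block-blob semantics: upload blocks in any
    order, then commit an ordered block list to form the readable blob."""
    def __init__(self):
        self.uncommitted = {}      # block id -> data
        self.committed = b""
    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data
    def put_block_list(self, block_ids):
        # The committed blob is the concatenation in block-list order.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)

def block_id(n):
    # Block ids are opaque base64 strings of equal length.
    return base64.b64encode(b"%06d" % n).decode()

blob = BlockBlobModel()
for n, chunk in [(2, b"-part3"), (0, b"part1"), (1, b"-part2")]:   # out of order
    blob.put_block(block_id(n), chunk)
blob.put_block_list([block_id(n) for n in (0, 1, 2)])
```

Against the real service the same flow is Put Block per chunk followed by Put Block List to commit.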
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
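Since leasing is REST-only, a client has to build the request itself. A hedged sketch of the URL and headers involved (the signed Authorization header and version header are omitted; the helper name is ours):

```python
def lease_request(account, container, blob, action, lease_id=None):
    """Build the URL and headers for a REST lease call
    (action is one of: acquire, renew, release, break)."""
    url = ("https://%s.blob.core.windows.net/%s/%s?comp=lease"
           % (account, container, blob))
    headers = {"x-ms-lease-action": action}
    if lease_id is not None:
        headers["x-ms-lease-id"] = lease_id   # required for renew/release/break
    return "PUT", url, headers

method, url, headers = lease_request("movieconversion", "originals",
                                     "barga.mpg", "acquire")
```

The acquire response returns the lease id that must accompany every subsequent write while the lease is held.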
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• Drive made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table name: Movies (entities: Star Wars, Star Trek, Fan Boys)
• Table name: Customers (entities: Brian H. Prince, Jason Argonaut, Bill Gates)
Hierarchy: Account → Table → Entity
Tables store entities; entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available & durable
• Data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST, with any platform or language
Is not relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• A partition can be served by a single server
• The system load-balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server busy
• Use exponential backoff on "Server Busy"
• The system load-balances to meet your traffic needs
• Or single-partition limits have been reached
Partition Keys In Each Abstraction

Entities: TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order - 1 | | | $35.12
2 | Customer | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order - 3 | | | $10.00

Blobs: Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarborbighouse.jpg
image | foxboroughgillette.jpg
video | annarborbighouse.jpg

Messages: Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account
• Capacity: up to 100 TBs
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A: Table = Movies [Min - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After the system splits the partition range across servers:

Server A: Table = Movies [Min - Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load-balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may return a continuation token instead of the full result set:
• Maximum of 1,000 rows in a response
• At the end of a partition-range boundary
• Maximum of 5 seconds to execute the query
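Handling continuation tokens means looping until the service stops returning one, never assuming a single round trip. A sketch with a stand-in paged query (the fake 2-row page cap plays the role of the 1,000-row limit; the real tokens arrive as response headers):

```python
def query_all(query_page):
    """Drain a paged query: `query_page(token)` returns (rows, next_token);
    keep going until no continuation token comes back."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Stand-in service: at most 2 rows per response.
data = ["row%d" % i for i in range(5)]
def fake_page(token):
    start = token or 0
    nxt = start + 2 if start + 2 < len(data) else None
    return data[start:start + 2], nxt

all_rows = query_all(fake_page)
```

Note the token can appear even for tiny result sets (e.g., at a partition-range boundary), so the loop is mandatory, not an optimization.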
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale:
• Avoid "append only" patterns: distribute by using a hash, etc., as a prefix
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "server busy" means partitions are being load-balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work-ticket pattern
• Why not simply use a table?
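The 8 KB limit is one reason for the work-ticket pattern: the queue message carries only a pointer, and the worker fetches the real payload from blob storage. An in-memory sketch (the queue and blob store here are stand-ins, not Azure APIs):

```python
blob_store = {}   # stand-in for Azure Blob storage
queue = []        # stand-in for an Azure queue

def submit_job(blob_name, payload):
    """Producer: park the large payload in blob storage, enqueue a small ticket."""
    blob_store[blob_name] = payload
    ticket = blob_name.encode()
    assert len(ticket) <= 8 * 1024      # the ticket fits the 8 KB message limit
    queue.append(ticket)

def process_one():
    """Worker: dequeue a ticket and fetch the real payload it points to."""
    ticket = queue.pop(0)
    return blob_store[ticket.decode()]

submit_job("videos/big.mpg", b"x" * 100_000)    # 100 KB payload, tiny ticket
result = process_one()
```

The queue stays small and fast while arbitrarily large work items ride along in blob storage.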
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)
RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
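The Get/Delete lifecycle shown in the REST exchange above — a retrieved message is hidden for a visibility timeout and only the Delete (with the pop receipt) removes it — can be modeled with a toy in-memory queue. This is a sketch of the semantics, not the Azure client library:

```python
import uuid

class ToyQueue:
    """Toy model of the two-phase message lifecycle: get() hides a
    message for `timeout` seconds; only delete() (with the pop receipt)
    removes it, so a crashed consumer's message reappears."""
    def __init__(self):
        self.msgs = {}                    # id -> (body, visible_at)

    def put(self, body):
        mid = str(uuid.uuid4())
        self.msgs[mid] = (body, 0.0)
        return mid

    def get(self, timeout, now):
        for mid, (body, visible_at) in self.msgs.items():
            if visible_at <= now:
                self.msgs[mid] = (body, now + timeout)
                return mid, body          # mid doubles as the pop receipt here
        return None

    def delete(self, mid):
        self.msgs.pop(mid, None)
```

Usage: after `get(30, now=0)` the message is invisible at `now=1` but visible again at `now=31` — this is what gives at-least-once delivery.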
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll
increases the interval by 2x;
a successful poll resets the interval back to 1.
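The back-off rule above fits in a few lines; the base interval and cap are illustrative assumptions:

```python
def next_interval(current, got_message, base=1.0, cap=60.0):
    """Truncated exponential back-off: double the polling interval on an
    empty poll (up to `cap`); reset to `base` on a successful one."""
    if got_message:
        return base
    return min(current * 2, cap)
```

The truncation (the cap) is what keeps a long-idle worker from backing off so far that it reacts sluggishly when work finally arrives.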
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
7. GetMessage(Q, 30 s) → msg 1
1. GetMessage(Q, 30 s) → msg 1
5. C1 crashed
11
21
6. msg 1 visible 30 s after Dequeue
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2. Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
7. Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
1. Dequeue(Q, 30 sec) → msg 1
5. C1 crashed
10. C1 restarted
11. Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
2
6. msg 1 visible 30 s after Dequeue
9. msg 1 visible 30 s after Dequeue
30
13
12
13
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
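The poison-message rule from the recap — enforce a threshold on a message's dequeue count — can be sketched as a small dispatch function. The threshold, the `process` callback, and the `dead_letter` sink are all illustrative assumptions:

```python
MAX_DEQUEUE = 3  # assumed threshold; tune to your workload

def handle(msg, dequeue_count, process, dead_letter):
    """If a message keeps crashing its consumer, its dequeue count keeps
    rising; past the threshold, divert it instead of retrying forever."""
    if dequeue_count > MAX_DEQUEUE:
        dead_letter(msg)              # park it for offline inspection
        return "poisoned"
    try:
        process(msg)
        return "done"                 # caller then deletes the message
    except Exception:
        return "retry"                # leave it invisible; it reappears after the timeout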
Windows Azure Storage Takeaways
Data abstractions to build your applications:
Blobs – Files and large objects
Drives – NTFS APIs for migrating applications
Tables – Massively scalable structured storage
Queues – Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using much CPU
• Balance between using up CPU and having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
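The slide's data-parallelism advice (the .NET 4 Task Parallel Library) can be sketched in any language with a worker pool; here is an analogous stdlib sketch in Python, with a made-up `checksum` workload standing in for real per-item work:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def checksum(chunk):
    """Toy per-item workload (an assumption for illustration)."""
    return sum(chunk) % 65521

def parallel_checksums(chunks, workers=None):
    """Data parallelism: fan one function out over a pool of workers."""
    workers = workers or (os.cpu_count() or 4)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(checksum, chunks))
```

As the slide warns, sizing the pool near the core count matters: far more active workers than cores just adds scheduling overhead.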
Finding Good Code Neighbors
• Typically code falls into one or more of these categories:
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Memory Intensive
CPU Intensive
Network I/O Intensive
Storage I/O Intensive
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the costs of having idling VMs
Performance Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
Gzip / Minify JavaScript
Minify CSS
Minify Images
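The gzip point is easy to demonstrate: text-like output (HTML, JSON, logs) is highly repetitive, so a little CPU buys a large cut in stored and transferred bytes. A minimal sketch with an assumed repetitive payload:

```python
import gzip

# Assumed payload: repetitive markup, the best case for gzip.
html = b"<div class='row'>hello cloud</div>\n" * 500

packed = gzip.compress(html)
ratio = len(packed) / len(html)

# Compression must round-trip and shrink the payload substantially.
assert gzip.decompress(packed) == html
assert len(packed) < len(html)
```

Real pages compress less dramatically than this synthetic one, but 60-80% savings on HTML/CSS/JS is common, which hits both bandwidth and storage line items.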
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
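The query-segmentation approach described here is a plain split/join: partition the input sequences, run each partition independently, concatenate the results. A minimal sketch (the partition size is an assumption to be tuned, as the task-granularity slide notes):

```python
def split(sequences, partition_size):
    """Query segmentation: each partition can be BLASTed independently."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge(partial_results):
    """Join step: concatenate per-partition hit lists in input order."""
    return [hit for part in partial_results for hit in part]
```

Database segmentation (the mpiBLAST route) is the harder variant precisely because its join step is not a simple concatenation — hits from different database shards must be re-ranked together.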
AzureBLAST
bull Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
(BLAST databases, temporary data, etc.)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
BLASTed ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
AzureBLAST significantly saved computing timehellip
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (42 GB in size)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
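The "something is wrong" detection above is mechanical: pair each "Executing" line with its "done" line and flag the unpaired starts. A sketch against a sample of the log format (the excerpt below is abbreviated from the slide):

```python
import re

LOG = """3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
"""

def unfinished_tasks(log):
    """Pair each 'Executing' line with its 'done' line; anything left
    unpaired points at a failed or interrupted task."""
    started = set(re.findall(r"Executing the task (\d+)", log))
    done = set(re.findall(r"Execution of task (\d+) is done", log))
    return started - done
```

Run over the full North Europe log, exactly this kind of pairing is what exposed the update-domain restarts and blob-write failures on the next slides.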
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups — this is an update domain
~30 mins
~6 nodes in one group
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" — Irish proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = Rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = Latent heat of vaporization (J g⁻¹)
Rn = Net radiation (W m⁻²)
cp = Specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = Dry air density (kg m⁻³)
δq = Vapor pressure deficit (Pa)
ga = Conductivity of air (inverse of ra) (m s⁻¹)
gs = Conductivity of plant stoma air (inverse of rs) (m s⁻¹)
γ = Psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs, big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
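The Penman-Monteith form above evaluates directly once the inputs are in hand; the sketch below just transcribes the equation, and the numeric inputs in the test are made-up, plausibly scaled assumptions rather than values from the slides:

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lam_v=2.45e6):
    """ET = (Delta*Rn + rho_a*cp*(dq)*ga) / ((Delta + gamma*(1 + ga/gs)) * lam_v)

    gamma defaults to the slide's ~66 Pa/K; lam_v to a typical latent
    heat of vaporization (an assumption, expressed here in J/kg)."""
    return (delta * Rn + rho_a * cp * dq * ga) / \
           ((delta + gamma * (1.0 + ga / gs)) * lam_v)
```

The hard part, as the slide says, is not the formula but estimating ga and gs across a whole catchment — which is what the imagery/sensor pipeline on the following slides exists to feed.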
ET Synthesizes Imagery, Sensors, Models, and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset (30 GB, 960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US year = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (12)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
<PipelineStage> Request
…
<PipelineStage>JobStatus Persist
<PipelineStage> Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
hellip
hellip
Dispatch <PipelineStage> Task Queue
hellip
<Input> Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20–100 workers
5–7 GB, 55K files, 1800 hours, 20–100 workers
<10 GB, ~1K files, 1800 hours, 20–100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns," but tightly coupled low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
12
Application Model Comparison
Machines RunningIIS ASPNET
Machines RunningWindows Services
Machines RunningSQL Server
Ad Hoc Application Model
Web Role Instances
Worker Role Instances
Azure Storage (Blob, Queue, Table)
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for the service
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service:
  • Instance count
  • Storage keys
  • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's the perfect IT employee:
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • Base OS
    • Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find the hardware a home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action — perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
  1. Determine resource requirements
  2. Create role images
2. Allocate resources
3. Prepare nodes
  1. Place role images on nodes
  2. Configure settings
  3. Start roles
4. Configure load balancers
5. Maintain service health
  1. If a role fails, restart the role based on policy
  2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
Blob – Massive files, e.g. videos, logs
Drive – Use standard file system APIs
Tables – Non-relational, but with few scale limits; use SQL Azure for relational data
Queues – Facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – the default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an id
  • Then commit those blocks, in any order, into a blob
  • The final blob is limited to 1 TB and up to 50,000 blocks
  • You can modify a blob by inserting, updating, and removing blocks
  • Blocks live for a week before being GC'd if not committed to a blob
  • Optimized for streaming
Blocks
Bigmpg1 6 8 3 5 4 7 2
Bigmpg
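The two-phase block-blob protocol above — upload blocks under IDs in any order, then commit an ordered block list that defines the blob's content — can be modeled with a toy class. This is a sketch of the semantics, not the Azure storage client:

```python
class ToyBlockBlob:
    """Toy model of block-blob semantics: put_block stages data under an
    id; put_block_list commits an ordered list of ids as the blob."""
    def __init__(self):
        self.uncommitted = {}             # block id -> bytes
        self.content = b""

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data # staged; not yet part of the blob

    def put_block_list(self, ordered_ids):
        # The commit defines the blob as the listed blocks, in list order,
        # regardless of the order they were uploaded in.
        self.content = b"".join(self.uncommitted[i] for i in ordered_ids)
        self.uncommitted.clear()
```

This is why the "Big.mpg" diagram can show blocks arriving out of order: only the committed block list, not upload order, determines the final bytes.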
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a one-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table "Movies" – entities: Star Wars, Star Trek, Fan Boys
• Table "Customers" – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
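To make the required-property rule concrete, here is a toy sketch of building an entity as a plain dictionary (`make_entity` is a hypothetical helper, not the real storage client; in practice the service assigns Timestamp itself):

```python
from datetime import datetime, timezone

def make_entity(partition_key, row_key, **properties):
    """Build a minimal table entity as a dict.

    PartitionKey + RowKey together form the unique key; Timestamp is
    maintained by the service (simulated locally here). Any other
    properties can vary from entity to entity in the same table.
    """
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "Timestamp": datetime.now(timezone.utc),
    }
    entity.update(properties)  # schema can vary per entity
    return entity

movie = make_entity("Action", "Fast & Furious", ReleaseDate=2009)
```

Note that two entities in the same table can carry entirely different property sets, as long as the three required properties are present.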
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality
System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
"Server Busy"
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Single-partition limits have been reached
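The exponential-backoff advice above can be sketched as a small schedule generator. This is an illustrative pattern, not a real client API; the base, cap, and retry count are assumptions you would tune:

```python
import random

def backoff_delays(max_retries=5, base=0.5, cap=30.0):
    """Truncated exponential backoff schedule for 'Server Busy' retries.

    Delays grow as base * 2^attempt, capped at `cap`, with random
    jitter so that many clients do not retry in lockstep.
    """
    delays = []
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        delays.append(delay * random.uniform(0.5, 1.0))  # add jitter
    return delays
```

A caller would sleep for each delay in turn between retries, and give up (or alert) once the schedule is exhausted.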
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name         | CreditCardNumber    | OrderTotal
1                         | Customer         | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1        |              |                     | $35.12
2                         | Customer         | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3        |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync
(Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges
Initially one server holds the whole table:
Server A: Table = Movies [Min – Max]

PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
Action                  | Fast & Furious            | …         | 2009
Action                  | The Bourne Ultimatum      | …         | 2007
…                       | …                         | …         | …
Animation               | Open Season 2             | …         | 2009
Animation               | The Ant Bully             | …         | 2006
…                       | …                         | …         | …
Comedy                  | Office Space              | …         | 1999
…                       | …                         | …         | …
SciFi                   | X-Men Origins: Wolverine  | …         | 2009
…                       | …                         | …         | …
War                     | Defiance                  | …         | 2008

Under load, the system splits the partition range across servers:
Server A: Table = Movies [Min – Comedy)
• Action … Animation rows
Server B: Table = Movies [Comedy – Max]
• Comedy … War rows
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
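Because any of those conditions can end a response early, query code must always loop until no token comes back. A sketch of that drain loop against a simulated paged source (the `fetch_page` callback and its token shape are illustrative, not the real client API):

```python
def query_all(fetch_page):
    """Drain a query that may return continuation tokens.

    `fetch_page(token)` stands in for one table query round trip; it
    returns (rows, next_token), with next_token=None when finished.
    Responses are capped (e.g. 1000 rows), so one call is never enough.
    """
    rows, token = [], None
    while True:
        page, token = fetch_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Simulated paged source: 2500 rows served 1000 at a time.
data = list(range(2500))

def fetch_page(token):
    start = token or 0
    page = data[start:start + 1000]
    nxt = start + 1000 if start + 1000 < len(data) else None
    return page, nxt

result = query_all(fetch_page)  # drains all three pages
```

The common bug this guards against is treating the first page as the full result set, which silently truncates queries once a table grows past the response cap.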
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as a prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
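The work ticket pattern mentioned above keeps large payloads out of the size-limited queue: the data goes to blob storage and only a small reference (the "ticket") travels through the queue. A toy sketch with in-memory stand-ins for the blob store and queue (names and structure are illustrative assumptions):

```python
import uuid

# Simulated stores; in Azure these would be blob storage and a queue.
blob_store = {}
queue = []

def enqueue_work(payload: bytes, limit=8 * 1024):
    """Work ticket pattern: payloads over the queue message limit are
    written to blob storage, and only a reference is enqueued."""
    if len(payload) <= limit:
        queue.append({"inline": payload})
    else:
        blob_name = str(uuid.uuid4())     # ticket: blob name as reference
        blob_store[blob_name] = payload
        queue.append({"blob_ref": blob_name})

enqueue_work(b"small job")
enqueue_work(b"x" * 100_000)  # too big for a queue message
```

A consumer dereferences `blob_ref` to fetch the payload, and deletes the blob once the message is processed (otherwise orphaned blobs accumulate and must be garbage collected).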
Queue Terminology
Message Lifecycle
A Web Role calls PutMessage to add messages (Msg 1, Msg 2, Msg 3, Msg 4) to the Queue. A Worker Role calls GetMessage (with a visibility timeout) to retrieve a message, processes it, and then calls RemoveMessage to delete it.
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
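The lifecycle behind those two REST calls — a message is hidden for a visibility timeout on retrieval, and deleting it requires the pop receipt — can be modeled with a toy in-memory queue. This is a teaching sketch, not the real service:

```python
import time
import uuid

class ToyQueue:
    """Toy model of the queue message lifecycle: get() hides a message
    for a visibility timeout and hands back a pop receipt; delete()
    requires that receipt. If the consumer crashes and never deletes,
    the message simply becomes visible again after the timeout."""
    def __init__(self):
        self.messages = []  # each item: [text, invisible_until, pop_receipt]

    def put(self, text):
        self.messages.append([text, 0.0, None])

    def get(self, timeout=30.0):
        now = time.monotonic()
        for m in self.messages:
            if m[1] <= now:               # currently visible
                m[1] = now + timeout      # hide it for the timeout
                m[2] = str(uuid.uuid4())  # fresh pop receipt
                return m[0], m[2]
        return None

    def delete(self, text, receipt):
        self.messages = [m for m in self.messages
                         if not (m[0] == text and m[2] == receipt)]

q = ToyQueue()
q.put("msg 1")
q.put("msg 2")
text, receipt = q.get()   # msg 1 becomes invisible
q.delete(text, receipt)   # processed: removed for good
```

This is what gives the at-least-once guarantee: a message only disappears permanently when a consumer explicitly deletes it with a valid receipt.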
Truncated Exponential Back Off Polling
Consider a backoff polling approach: each empty poll increases the interval by 2x, up to a maximum; a successful poll sets the interval back to 1.
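The polling rule above fits in a few lines; the floor and ceiling values here are illustrative assumptions:

```python
def next_poll_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential backoff for queue polling.

    Each empty poll doubles the interval up to a ceiling; a successful
    poll resets it to the floor so a busy queue is drained quickly.
    """
    if got_message:
        return floor
    return min(ceiling, current * 2)

interval = 1.0
interval = next_poll_interval(interval, got_message=False)  # doubles to 2.0
interval = next_poll_interval(interval, got_message=False)  # doubles to 4.0
interval = next_poll_interval(interval, got_message=True)   # resets to 1.0
```

The ceiling matters because every poll is a billed transaction: an idle queue polled at the ceiling costs a bounded, predictable amount.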
Removing Poison Messages
Producers P1 and P2 put messages; consumers C1 and C2 process them from queue Q:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. DeleteMessage(Q, msg 1)
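The sequence above shows why a dequeue-count threshold is needed: a message that crashes every consumer would otherwise cycle forever. A sketch of the guard with an in-memory queue (the threshold and dead-letter list are illustrative; the real service exposes DequeueCount on each message):

```python
def process_queue(messages, handler, max_dequeue=3):
    """Poison-message guard: track each message's dequeue count and
    divert it to a dead-letter list once it exceeds the threshold,
    instead of retrying it forever."""
    dead_letters, results = [], []
    pending = [{"body": m, "dequeue_count": 0} for m in messages]
    while pending:
        msg = pending.pop(0)
        msg["dequeue_count"] += 1
        if msg["dequeue_count"] > max_dequeue:
            dead_letters.append(msg["body"])   # poison: stop retrying
            continue
        try:
            results.append(handler(msg["body"]))
        except Exception:
            pending.append(msg)                # becomes visible again

    return results, dead_letters

def handler(m):
    """Stand-in worker that always fails on one message."""
    if m == "bad":
        raise ValueError("cannot process")
    return m.upper()

ok, poison = process_queue(["good", "bad"], handler)
# "good" is processed once; "bad" fails 3 times and is dead-lettered
```

Dead-lettered messages should be logged or parked somewhere a human can inspect them, since they usually indicate a bug or malformed input rather than transient failure.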
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use DequeueCount to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using much CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
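The data-parallelism idea above — apply one operation across a collection, sized to the hardware you actually have — looks like this in Python (in .NET 4 the Task Parallel Library plays this role; this is a stand-in sketch, and `fetch` is a hypothetical task):

```python
from concurrent.futures import ThreadPoolExecutor
import os

def fetch(item):
    """Stand-in for an I/O-bound task (e.g. a storage call)."""
    return item * item

# Data parallelism: apply the same operation across a collection,
# with the pool sized to the cores available on the instance.
with ThreadPoolExecutor(max_workers=os.cpu_count() or 4) as pool:
    results = list(pool.map(fetch, range(10)))
```

For CPU-bound work a process pool (or, in .NET, TPL tasks) avoids contention on a single core; threads suit I/O-bound work where the CPU would otherwise idle.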
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places: sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
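As a quick illustration of the gzip payoff on typical markup (the sample HTML is made up; real pages with repetitive tags compress similarly well):

```python
import gzip

# Repetitive markup, as real HTML tends to be.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)

# Bandwidth is billed on what actually crosses the wire, so the
# compressed size is what matters; the round trip is lossless.
print(len(html), "->", len(compressed))
```

The compute spent compressing is usually far cheaper than the bandwidth saved, which is the trade-off point 2 above is making.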
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
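The query-segmentation approach — split the input, run partitions independently, merge the results — is a plain split/join and can be sketched in a few functions (the `blast_partition` body is a stand-in; actually running NCBI-BLAST is out of scope here):

```python
def split(sequences, partition_size):
    """Query segmentation: chop the input sequence list into
    partitions that independent workers can process in parallel."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    """Stand-in for running NCBI-BLAST on one partition; here we
    just return a (sequence, hit) pair per input sequence."""
    return [(seq, f"hit-for-{seq}") for seq in partition]

def merge(partial_results):
    """Join step: concatenate per-partition results in order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged

seqs = [f"seq{i}" for i in range(10)]
partitions = split(seqs, 3)                       # 4 partitions
results = merge(blast_partition(p) for p in partitions)
```

Because partitions share nothing, each `blast_partition` call maps naturally onto one queue task consumed by one worker role, which is exactly the structure AzureBLAST uses below.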
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: splitting task → BLAST tasks (in parallel) → merging task
Leverage the multi-core capacity of one instance
• Argument "–a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST architecture
• Web Role: web portal and web service for job registration
• Job Management Role: job scheduler and scaling engine, with a job registry in Azure Tables
• Worker roles pull BLAST tasks from a global dispatch queue
• Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.
• A database-updating role keeps the NCBI databases current
Task flow: splitting task → BLAST tasks (in parallel) → merging task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (42 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
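Detecting those anomalies can be automated by pairing "Executing" records with "done" records. A small sketch (the regex assumes log lines shaped like the samples above; sample data here is abbreviated):

```python
import re

LOG_RE = re.compile(r"Execution of task (\d+) is done, it took ([\d.]+) ?mins")

def completed_tasks(lines):
    """Pull (task id -> minutes) out of worker logs.

    Any 'Executing the task' line with no matching 'done' record
    indicates a lost or failed task worth investigating.
    """
    done = {}
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            done[m.group(1)] = float(m.group(2))
    return done

log = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 251523",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",  # never finished
]
print(completed_tasks(log))
```

Cross-referencing the set of started task IDs against the completed ones is how the update-domain and fault-domain incidents on the next slides were spotted.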
Surviving System Upgrades
North Europe data center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group — this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed, and the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" — Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Pipeline flow: Source Imagery Download Sites → Data Collection Stage (Download Queue) → Reprojection Stage (Reprojection Queue) → Derivation Reduction Stage (Reduction 1 Queue) → Analysis Reduction Stage (Reduction 2 Queue) → Scientific Results Download. Scientists submit requests through the AzureMODIS Service Web Role Portal (Request Queue), backed by Source Metadata.
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
Flow: <PipelineStage> Request → MODISAzure Service (Web Role) persists <PipelineStage>JobStatus and queues to the <PipelineStage> Job Queue → Service Monitor (Worker Role) parses & persists <PipelineStage>TaskStatus → dispatches to the <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role (GenericWorker)
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Flow: Service Monitor parses & persists <PipelineStage>TaskStatus → dispatches to the <PipelineStage> Task Queue → GenericWorker (Worker Role) processes tasks against <Input>Data Storage
Example Pipeline Stage: Reprojection Service
• A reprojection request arrives at the Job Queue; each entity specifies a single reprojection job request
• The Service Monitor (Worker Role) persists ReprojectionJobStatus, then parses & persists ReprojectionTaskStatus; each task entity specifies a single reprojection task (i.e., a single tile)
• Tasks are dispatched to the Task Queue, where GenericWorker (Worker Role) instances pick them up
• SwathGranuleMeta – query this table to get geo-metadata (e.g., boundaries) for each swath tile
• ScanTimeList – query this table to get the list of satellite scan times that cover a target tile
• Workers read from Swath Source Data Storage and write to Reprojection Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Per stage (scientists submit via the AzureMODIS Service Web Role Portal / Request Queue):
• Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Windows Azure Application Model
Web Role instances | Worker Role instances | Azure Storage (Blob, Queue, Table) | SQL Azure
Key Components
Fabric Controller
• Manages hardware and virtual machines for the service
Compute
• Web Roles – web application front end
• Worker Roles – utility compute
• VM Roles – custom compute role; you own and customize the VM
Storage
• Blobs – binary objects
• Tables – entity storage
• Queues – role coordination
• SQL Azure – SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
  • The configuration definition describes the shape of a service:
    • Role types
    • Role VM sizes
    • External and internal endpoints
    • Local storage
  • The configuration settings configure a service:
    • Instance count
    • Storage keys
    • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee:
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute on Windows Server 2008
• Background processing
• Each role can define an amount of local storage: protected space on the local drive, considered volatile storage
• May communicate with outside services:
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application, so each is easier to scale independently
• Allocate resources differently, e.g. different priority queues and back-end servers
• Mask faults in worker roles (reliable messaging)
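The decoupling the bullets above describe can be sketched with Python's standard-library queue module; the in-process queue stands in for an Azure queue, and the role names are illustrative only.

```python
import queue
import threading

# The queue is the only thing the two roles share -- they never call each other.
work_queue = queue.Queue()

def web_role(orders):
    """Front end: accepts requests and enqueues work tickets."""
    for order in orders:
        work_queue.put(order)

def worker_role(results):
    """Back end: drains the queue at its own pace."""
    while True:
        try:
            order = work_queue.get(timeout=0.1)
        except queue.Empty:
            break  # queue drained; a real worker would keep polling
        results.append(order.upper())  # stand-in for real processing
        work_queue.task_done()

results = []
web_role(["order-1", "order-2", "order-3"])
t = threading.Thread(target=worker_role, args=(results,))
t.start()
t.join()
```

Because neither side holds a reference to the other, you can scale the worker side (more threads, more VMs) without touching the front end.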
Key Components – Compute: VM Roles
• Customized role: you own the box
• How it works:
  • Download the "Guest OS" image to your own Server 2008 Hyper-V host
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • The base OS
    • The differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action – perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage at Massive Scale
Blobs – massive files, e.g. videos, logs
Drives – use standard file-system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • e.g. http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – the default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob:
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a block ID
  • Size limit: 200 GB per blob
• Page blob:
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'
• Each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Uncommitted blocks live for a week before being garbage-collected
• Optimized for streaming
Blocks
(Diagram: Big.mpg is uploaded as blocks, e.g. in the order 1 6 8 3 5 4 7 2, and then committed into the final Big.mpg blob.)
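A minimal in-memory sketch of the block-blob protocol described above: upload blocks in any order, then commit a block list that fixes their final order. The two functions stand in for the REST operations Put Block and Put Block List; the dictionaries are stand-ins for the storage service.

```python
# In-memory simulation of block-blob semantics (not the real Azure API).
uncommitted = {}   # block_id -> bytes; GC'd after a week if never committed
blobs = {}         # blob_name -> bytes

def put_block(block_id, data):
    """Stage one block; upload order does not matter."""
    uncommitted[block_id] = data

def put_block_list(blob_name, block_ids):
    """Commit: the blob is assembled from blocks in the order given here,
    regardless of the order in which they were uploaded."""
    blobs[blob_name] = b"".join(uncommitted[i] for i in block_ids)

# Upload out of order (e.g. from parallel uploaders)...
put_block("b2", b"world")
put_block("b1", b"hello ")
# ...then commit in the intended order.
put_block_list("big.mpg", ["b1", "b2"])
```

This is why block blobs suit streaming uploads: many workers can push blocks concurrently, and a single cheap commit stitches the file together.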
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a one-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a page blob
  • Example: mount a page blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the page blob
• The drive is made durable through standard page blob replication
• The drive persists as a page blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a page blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted page blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and page blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (page blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (page blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
  Table Name: Movies
    Star Wars, Star Trek, Fan Boys
  Table Name: Customers
    Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities; entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
• Massively scalable tables:
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable:
  • Data is replicated several times
• Familiar and easy-to-use API:
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Tables are not relational. You cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• Use server-side aggregates – no server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Partitioning is different for each data type (blobs, entities, queues)
Every data object has a partition key:
• A partition can be served by a single server
• The system load-balances partitions based on traffic pattern
• The partition key controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load-balances to meet your traffic needs
• "Server Busy" may also mean single-partition limits have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

• Messages – Queue name: all messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync
(Diagram: Server 1, Server 2, and Server 3 each hold a replica of partitions P1, P2, …, Pn)
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
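The exponential backoff the slide calls for can be sketched in a few lines. The `ServerBusy` exception and `flaky_put` operation are stand-ins for an HTTP 503 from the storage service and a real storage call; delays and retry counts are illustrative.

```python
import random
import time

class ServerBusy(Exception):
    """Stand-in for an HTTP 503 'Server Busy' response."""

def with_backoff(op, max_tries=6, base=0.1, cap=5.0):
    """Retry op() with truncated exponential backoff plus jitter."""
    for attempt in range(max_tries):
        try:
            return op()
        except ServerBusy:
            if attempt == max_tries - 1:
                raise  # out of retries -- surface the 503
            delay = min(cap, base * (2 ** attempt))  # 0.1, 0.2, 0.4, ...
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids lockstep retries

calls = {"n": 0}

def flaky_put():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusy()  # busy twice, then succeeds
    return "ok"

result = with_backoff(flaky_put)
```

The jitter matters at scale: without it, a fleet of workers that all saw the same 503 would retry in lockstep and hit the partition again at the same instant.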
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

Partitions and Partition Ranges
A table starts on one server, which serves the full key range:
  Server A: Table = Movies [Min – Max]
As load grows, the system splits the range across servers:
  Server A: Table = Movies [Min – Comedy)
  Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load-balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics, and fewer round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query can stop early and return a continuation token when:
• The maximum of 1,000 rows in a response is reached
• The query reaches a partition range boundary
• The query hits the maximum of 5 seconds to execute
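A correct client therefore loops until no token comes back. The sketch below simulates a server that returns at most 3 rows per call (the real service caps at 1,000); `fake_query` and its integer token are stand-ins for the service and its opaque continuation token.

```python
# Simulated paged table query: the server hands back a page plus an
# optional continuation token; the client loops until the token is gone.

ROWS = [f"entity-{i}" for i in range(8)]
PAGE = 3  # the real service caps responses at 1,000 rows

def fake_query(continuation=0):
    """Stand-in for a table query; the int token is a stand-in for the
    real service's opaque NextPartitionKey/NextRowKey headers."""
    page = ROWS[continuation:continuation + PAGE]
    next_token = continuation + PAGE if continuation + PAGE < len(ROWS) else None
    return page, next_token

def query_all():
    results, token = [], 0
    while token is not None:  # keep going until the server says "done"
        page, token = fake_query(token)
        results.extend(page)
    return results

all_rows = query_all()
```

Code that ignores the token silently processes only the first page, which is exactly the bug the slide is warning about.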
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
  • Distribute load by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy"
  • The system load-balances partitions to meet traffic needs
  • The load on a single partition may have exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
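The work-ticket pattern just mentioned keeps the message under the 8 KB limit by putting the payload in blob storage and the reference in the queue. A minimal sketch, where `blob_store` and `q` are in-memory stand-ins for the two Azure services:

```python
import uuid

blob_store = {}  # stand-in for blob storage
q = []           # stand-in for an Azure queue

def submit_work(payload: bytes):
    """Producer: big data goes to a blob; a tiny ticket goes on the queue."""
    blob_name = f"work/{uuid.uuid4()}"
    blob_store[blob_name] = payload
    q.append({"blob": blob_name})  # the message is just a reference
    return blob_name

def process_next():
    """Consumer: follow the ticket to the blob, process, then clean up."""
    ticket = q.pop(0)
    data = blob_store[ticket["blob"]]
    result = len(data)              # stand-in for real processing
    del blob_store[ticket["blob"]]  # avoid orphaned blobs
    return result

submit_work(b"x" * 100_000)         # far larger than the 8 KB message limit
size = process_next()
```

Note the explicit cleanup step: if consumers crash between dequeue and delete, orphaned blobs accumulate, which is why the later recap slide calls for garbage-collecting them.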
Queue Terminology
Message Lifecycle
(Diagram: a queue holding Msg 1–Msg 4. A Web Role calls PutMessage to enqueue. A Worker Role calls GetMessage with a visibility timeout, processes the message, then calls RemoveMessage.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Backoff Polling
Consider a backoff polling approach:
• Each empty poll increases the polling interval by 2x, up to a maximum
• A successful poll resets the interval back to 1
(Diagram: consumers C1 and C2 polling the queue with growing intervals)
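The polling discipline above reduces to a one-line interval update; the 1-second floor and 60-second ceiling below are illustrative choices, not service requirements.

```python
# Truncated exponential backoff polling: double the sleep after each empty
# poll, cap it, and reset to the minimum as soon as a message arrives.

MIN_INTERVAL, MAX_INTERVAL = 1, 60  # seconds; illustrative bounds

def next_interval(current, got_message):
    if got_message:
        return MIN_INTERVAL            # traffic is flowing -- poll eagerly
    return min(current * 2, MAX_INTERVAL)  # queue is empty -- back off

# Trace the empty-queue case: 1 -> 2 -> 4 -> ... truncated at 60.
intervals = []
interval = MIN_INTERVAL
for _ in range(8):
    interval = next_interval(interval, got_message=False)
    intervals.append(interval)
```

The payoff: idle workers stop burning storage transactions (which are billed per request) on a queue that has nothing for them, yet recover full responsiveness one poll after work reappears.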
Removing Poison Messages
(Diagram sequence, with producers P1 and P2 and consumers C1 and C2:)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2
13. C1: Delete(Q, msg 1) – msg 1 is treated as a poison message and removed
Queues Recap
• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage-collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
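The dequeue-count rule from the recap can be sketched as follows. This simplified model re-enqueues a message immediately instead of modeling the visibility timeout, and the threshold of 3 is illustrative; the point is only the check against `MAX_DEQUEUES`.

```python
MAX_DEQUEUES = 3  # illustrative poison threshold

class SketchQueue:
    def __init__(self):
        self.msgs = []   # each entry: [body, dequeue_count]
        self.dead = []   # poison messages diverted for later inspection

    def put(self, body):
        self.msgs.append([body, 0])

    def get(self):
        """Dequeue, enforcing the poison-message threshold."""
        while self.msgs:
            msg = self.msgs.pop(0)
            msg[1] += 1
            if msg[1] > MAX_DEQUEUES:
                self.dead.append(msg[0])  # poison: stop retrying it
                continue
            # Simplification: in Azure the message would reappear only
            # after the visibility timeout expires.
            self.msgs.append(msg)
            return msg[0]
        return None

q = SketchQueue()
q.put("bad-input")
seen = [q.get() for _ in range(5)]  # consumers keep crashing on this message
```

Without the threshold, a message that crashes every consumer would circulate forever, soaking up one worker at a time.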
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage:
• http://blogs.msdn.com/windowsazurestorage
• http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O completion ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure or poor user experience from not having excess capacity against the cost of idling VMs
Performance Cost
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
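Point 1 is cheap to verify: gzip pays off dramatically on repetitive text output such as HTML or JSON. A quick sketch with Python's standard library:

```python
import gzip

# Repetitive markup, as a typical rendered page fragment would be.
html = b"<div class='row'>item</div>" * 1000

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)  # fraction of original size

# Round-tripping proves nothing is lost -- only bandwidth and storage.
restored = gzip.decompress(compressed)
```

On output like this the compressed body is a tiny fraction of the original, which is bandwidth (and storage) you stop paying for on every response.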
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• One of the most important software tools in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST) – needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model:
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow: A Simple Split/Join Pattern
(Diagram: a splitting task fans out into BLAST tasks run in parallel, followed by a merging task.)
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI BLAST
• Use 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions → load imbalance
• Small partitions → unnecessary overheads (NCBI BLAST startup overhead, data-transfer overhead)
• Best practice: profile with test runs and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long wait in case of an instance failure
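The split/join pattern above can be sketched independently of BLAST itself. Here `match_score` is a toy stand-in for an actual BLAST invocation (longest common prefix against a toy database), and partition size and worker count are the tuning knobs the slide discusses.

```python
from concurrent.futures import ThreadPoolExecutor

DATABASE = ["GATTACA", "CCGGTTAA", "ATATATAT"]

def match_score(query):
    """Toy stand-in for BLAST: best longest-common-prefix score."""
    def lcp(a, b):
        n = 0
        for x, y in zip(a, b):
            if x != y:
                break
            n += 1
        return n
    return query, max(lcp(query, d) for d in DATABASE)

def split(queries, partition_size):
    """Splitting task: cut the input sequences into partitions."""
    return [queries[i:i + partition_size]
            for i in range(0, len(queries), partition_size)]

def blast_partition(partition):
    """One worker's job: query a whole partition."""
    return [match_score(q) for q in partition]

def run(queries, partition_size=2, workers=4):
    partitions = split(queries, partition_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = pool.map(blast_partition, partitions)  # parallel map
    merged = []
    for part in partial:  # merging task: join per-partition results
        merged.extend(part)
    return merged

results = run(["GATT", "CCGG", "ATAT", "TTTT"])
```

The granularity trade-off is visible in `partition_size`: one huge partition serializes the work (load imbalance), while one-sequence partitions pay the per-task startup overhead many times over.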
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST
(Architecture diagram: a Web Role hosts the Web Portal, Web Service, and job registration; a Job Management Role runs the Job Scheduler and Scaling Engine and dispatches work to a global dispatch queue consumed by Worker roles; an Azure Table holds the Job Registry; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a Database Updating Role refreshes the NCBI databases.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
The accepted job is stored in the job registry table:
• Fault tolerance – avoid in-memory state
(Diagram: Web Portal and Web Service feed job registration; the Job Scheduler and Scaling Engine consume jobs from the Job Registry.)
Demonstration
R. palustris as a Platform for H2 Production
Eric Schadt (Sage), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (~700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our approach:
• Allocated a total of ~4,000 cores: 475 extra-large VMs (8 cores per VM) across four datacenters – US (2), Western and Northern Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances occurred, redistributed the load manually
End result:
• Total size of the output is ~230 GB
• The number of total hits is 1,764,579,487
• Started March 25th; the last task completed April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record pairs an "Executing" line with an "is done" line:
3/31/2010 6:14  RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25  RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44  RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02  RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g. the task failed to complete):
3/31/2010 8:22  RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50  RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
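The "something is wrong" detection above is a small pairing exercise: match each "Executing" line with its "is done" line and flag tasks that never finish. A sketch over the sample lines (the regexes assume the log shape shown above):

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started, durations = {}, {}
for line in LOG.splitlines():
    m = re.search(r"Executing the task (\d+)", line)
    if m:
        started[m.group(1)] = line  # remember until we see completion
        continue
    m = re.search(r"Execution of task (\d+) is done, it took ([\d.]+) mins", line)
    if m:
        durations[m.group(1)] = float(m.group(2))
        started.pop(m.group(1), None)

lost_tasks = sorted(started)  # started but never finished -- investigate these
```

Scanning millions of such lines is exactly how the update-domain and fault-domain incidents on the next slides were reconstructed after the fact.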
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
• All 62 compute nodes lost tasks and then came back in groups – this is an update domain in action
• ~30 minutes per group; ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed before the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J g⁻¹)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Penman-Monteith (1964):

    ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
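A direct transcription of the Penman-Monteith formula above; the parameter values in the demo calls are made up for illustration, and unit bookkeeping follows the slide's definitions.

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """ET per Penman-Monteith (1964), exactly as in the formula above.

    delta: Pa/K, r_n: W/m^2, rho_a: kg/m^3, c_p: J/(kg K), dq: Pa,
    g_a, g_s: m/s, gamma: Pa/K, lambda_v: J/g (illustrative default).
    """
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative values only: closing stomata (smaller g_s) should cut ET.
et_open = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1013.0,
                          dq=1000.0, g_a=0.02, g_s=0.01)
et_closed = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1013.0,
                            dq=1000.0, g_a=0.02, g_s=0.002)
```

The stomatal conductivity gs sits in the denominator, which is why it is the "not so simple" input: it must be estimated per land-cover type across the whole catchment before the reduction stage can run.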
ET Synthesizes Imagery Sensors Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage, visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Pipeline diagram: scientists submit requests via the AzureMODIS Service Web Role Portal → Request Queue → Download Queue → Data Collection Stage, pulling from source imagery download sites and source metadata → Reprojection Queue → Reprojection Stage → Reduction 1 Queue → Derivation Reduction Stage → Reduction 2 Queue → Analysis Reduction Stage → scientific results download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door:
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role:
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Flow: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Generic Worker (Worker Role):
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
(Flow: the Service Monitor dispatches to the <PipelineStage> Task Queue; Generic Workers dequeue tasks and read/write <Input> Data Storage.)
Example Pipeline Stage: Reprojection Service
(Flow: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue, which Generic Workers consume against swath source data storage.)
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Stage-by-stage (via the AzureMODIS Service Web Role Portal; 20–100 workers unless noted):
• Data collection: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3,500 hours – $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files, 1,800 hours – $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1,800 hours – $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
12
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances
Worker Role Instances
Azure Storage: Blob, Queue, Table
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for service
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
  • The configuration definition describes the shape of a service
    • Role types
    • Role VM sizes
    • External and internal endpoints
    • Local storage
  • The configuration settings configure a service
    • Instance count
    • Storage keys
    • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging
Scalable Fault Tolerant Applications
Queues are the application glue
• Decouple parts of the application, making them easier to scale independently
• Resource allocation: different priority queues and back-end servers
• Mask faults in worker roles (reliable messaging)
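The decoupling the slide describes can be sketched in a few lines: a front end only enqueues, workers only dequeue, and the two scale independently. This is a minimal in-memory simulation of the pattern, not the Azure Queue API.

```python
import queue
import threading

work_queue = queue.Queue()
results = []
results_lock = threading.Lock()

def web_role_enqueue(job_ids):
    # The front end only enqueues work tickets; it never calls workers directly.
    for job_id in job_ids:
        work_queue.put(job_id)

def worker_role():
    # Workers pull messages at their own pace; adding more worker threads
    # scales the back end without touching the front end.
    while True:
        try:
            job_id = work_queue.get(timeout=0.1)
        except queue.Empty:
            return
        with results_lock:
            results.append(("processed", job_id))
        work_queue.task_done()

web_role_enqueue(range(10))
workers = [threading.Thread(target=worker_role) for _ in range(3)]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

Because the queue also buffers bursts, a slow back end degrades latency rather than dropping requests, which is the fault-masking property the slide refers to.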
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • Base OS
    • Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   • Determine resource requirements
   • Create role images
2. Allocate resources
3. Prepare nodes
   • Place role images on nodes
   • Configure settings
   • Start roles
4. Configure load balancers
5. Maintain service health
   • If a role fails, restart the role based on policy
   • If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob
    • Inserts a new blob, overwrites an existing blob
  • GetBlob
    • Get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private
  • Default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
• Size limit: 200 GB per blob

Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
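The upload/commit split described above can be sketched as a tiny simulation: blocks arrive in any order, and only the committed block list defines the blob. This is an illustrative model of the semantics, not the REST API itself.

```python
import hashlib

class BlockBlob:
    """A toy model of block-blob semantics (illustrative, not the Azure SDK)."""

    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes, staged but not yet visible
        self.committed = b""    # the blob contents after the last commit

    def put_block(self, block_id, data):
        # Blocks may be uploaded in any order, even in parallel.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # The committed block list, not the upload order, defines the blob.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlob()
chunks = [b"part1-", b"part2-", b"part3"]
ids = [hashlib.md5(c).hexdigest() for c in chunks]

# Upload out of order...
for i in (2, 0, 1):
    blob.put_block(ids[i], chunks[i])

# ...then commit in the intended order.
blob.put_block_list(ids)
```

This is why parallel uploaders work well with block blobs: ordering is deferred to the single commit step.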
Blocks
Bigmpg1 6 8 3 5 4 7 2
Bigmpg
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
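The 512-byte alignment rule above is easy to get wrong; a small validation helper makes it concrete. This check is illustrative of the rule the slide states, not of any SDK behavior.

```python
PAGE = 512  # page blobs read and write in 512-byte pages

def check_page_write(offset, length):
    # Sketch of the alignment rule: a page write must start and end on
    # 512-byte boundaries (offset and length both multiples of 512).
    if offset % PAGE or length % PAGE:
        raise ValueError("page writes must align to 512-byte boundaries")
    return True

# Aligned writes pass...
assert check_page_write(0, 512)
assert check_page_write(1024, 4096)
```

A client library would typically pad the final page of a file with zeros to satisfy this constraint.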
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
  Table: Movies (Star Wars, Star Trek, Fan Boys)
  Table: Customers (Brian H. Prince, Jason Argonaut, Bill Gates)
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
    • Billions of entities (rows) and TBs of data
    • Can use thousands of servers as traffic grows
  • Highly available & durable
    • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
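Since a table entity is just a property bag with three mandatory system properties, it can be modeled as a dictionary. A minimal sketch (the helper name is illustrative, not an SDK call) shows how entities with different schemas coexist in one table:

```python
import datetime

def make_entity(partition_key, row_key, **props):
    # Every entity carries Timestamp, PartitionKey, and RowKey;
    # all other properties are free-form and may differ per entity.
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        "Timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    entity.update(props)
    return entity

# Two entities with different schemas can live in the same table.
movie = make_entity("Action", "The Bourne Ultimatum", ReleaseDate=2007)
customer = make_entity("1", "Customer-John Smith", Name="John Smith")
```

PartitionKey + RowKey together form the unique key and the only index, which is why key selection (covered later) dominates table design.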
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues)
Every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
The system load balances
• Use exponential backoff on "Server Busy"
  • The system load balances to meet your traffic needs
  • Single-partition limits have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $3512
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $1000

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition
• Messages – Queue name: all messages for a single queue belong to the same partition
Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: Server 1, Server 2, and Server 3 each hold replicas of partitions P1, P2, …, Pn]
Scalability Targets
Storage account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
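The exponential backoff recommended for '503 Server Busy' can be sketched as a retry wrapper. Names and delay constants are illustrative; delays are computed rather than slept so the sketch stays fast.

```python
import random

def with_backoff(op, max_retries=5, base=0.1, cap=30.0):
    # Retry `op` while it reports 503, doubling the (jittered) delay each
    # attempt up to a cap. `op` returns an HTTP-style status code.
    delays = []
    for attempt in range(max_retries):
        status = op()
        if status != 503:
            return status, delays
        delay = min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
        delays.append(delay)  # a real client would time.sleep(delay) here
    raise RuntimeError("giving up after repeated 503s")

# Simulate a partition that is busy twice, then recovers.
calls = iter([503, 503, 200])
status, delays = with_backoff(lambda: next(calls))
```

The jitter keeps a fleet of workers from retrying in lockstep against the same hot partition.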
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Partitions and Partition Ranges
Server A: Table = Movies [Min – Max]
Server A: Table = Movies [Min – Comedy)
Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
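The consequence of the limits above is that every range query must loop until no token is returned. A minimal paging sketch (the query function simulates the service; real responses carry the token in response headers):

```python
# A simulated table of 2500 rows; any of the three conditions above
# would cap a single response, here modeled as a 1000-row page limit.
DATA = [f"row{i}" for i in range(2500)]

def query_page(token=None, page_size=1000):
    # Stand-in for a table query: returns one page plus a continuation
    # token, or None when the result set is exhausted.
    start = token or 0
    page = DATA[start:start + page_size]
    next_token = start + page_size if start + page_size < len(DATA) else None
    return page, next_token

rows, token, pages = [], None, 0
while True:
    page, token = query_page(token)
    rows.extend(page)
    pages += 1
    if token is None:   # always loop until the token disappears
        break
```

Code that ignores the token silently returns a truncated result set, which is why the slide says "seriously".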
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
  • Distribute by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy"
  • The system load balances partitions to meet traffic needs
  • Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
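The work-ticket pattern mentioned above keeps the queue message small: the payload goes to blob storage, and the message carries only a reference. A minimal sketch with in-memory stand-ins for the two services (names illustrative, not SDK calls):

```python
import uuid

blob_store = {}   # stand-in for blob storage (large payloads)
msg_queue = []    # stand-in for the queue (small tickets)

def submit_work(payload: bytes):
    # Store the big payload in "blob storage" and enqueue only a ticket,
    # which easily fits under the 8 KB message limit.
    ticket = str(uuid.uuid4())
    blob_store[ticket] = payload
    msg_queue.append(ticket)
    return ticket

def process_next():
    # A worker dequeues the ticket, fetches the payload, and garbage-
    # collects the blob once the work is done.
    ticket = msg_queue.pop(0)
    payload = blob_store.pop(ticket)
    return len(payload)

submit_work(b"x" * 1_000_000)   # far larger than the 8 KB message limit
processed = process_next()
```

This also answers "why not a table": queues add visibility timeouts and at-least-once delivery, which tables do not provide.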
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1…4 to the Queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve messages and RemoveMessage to delete them after processing]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the polling interval by 2x
• A successful poll resets the interval back to 1
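The back-off policy above fits in one function: double on an empty poll up to a ceiling, reset on success. Constants are illustrative.

```python
def next_interval(current, got_message, floor=1.0, ceiling=60.0):
    # Truncated exponential back-off for queue polling: each empty poll
    # doubles the interval up to a ceiling; a successful poll resets it.
    if got_message:
        return floor
    return min(ceiling, current * 2)

# Trace the interval through a sequence of poll outcomes.
interval = 1.0
history = []
for got in [False, False, False, True, False]:
    interval = next_interval(interval, got)
    history.append(interval)
```

The truncation (ceiling) matters: without it, a long-idle worker could sleep so long that it reacts very slowly when traffic returns.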
Removing Poison Messages
[Diagram: producers P1, P2 enqueue to queue Q; consumers C1, C2 dequeue]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages
[Diagram: producers P1, P2; consumers C1, C2]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages
[Diagram: producers P1, P2; consumers C1, C2]
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 becomes visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage-collect orphaned blobs
• Use message count to scale: dynamically increase/reduce workers
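The poison-message rule in the recap — drop a message once its dequeue count passes a threshold — can be sketched as a small dispatch function. The threshold and names are illustrative.

```python
MAX_DEQUEUE = 3  # illustrative threshold on a message's dequeue count

def handle(message, dequeue_count, process):
    # If a message keeps reappearing (its consumer crashed every time),
    # its dequeue count grows; past the threshold, treat it as poison.
    if dequeue_count > MAX_DEQUEUE:
        return "dead-lettered"   # e.g. park it in a blob for inspection
    try:
        process(message)
        return "deleted"         # normal path: delete after processing
    except Exception:
        return "retried"         # message becomes visible again after timeout

def crashy(_msg):
    # A processor that always fails, like C1/C2 in the diagrams above.
    raise RuntimeError("consumer crash")

outcomes = [handle("msg1", n, crashy) for n in (1, 2, 3, 4)]
```

Without the threshold, a single malformed message can pin a worker in an infinite crash-retry loop.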
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
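The deck's concrete recommendation is .NET 4's Task Parallel Library; the same data-parallel idea, sized to the VM's core count, can be sketched in Python with the standard thread pool:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def work(chunk):
    # A CPU-ish unit of work applied to one slice of the data.
    return sum(x * x for x in chunk)

data = list(range(10_000))
cores = os.cpu_count() or 1

# Data parallelism: one chunk per core, matching the slide's advice to
# keep active workers in line with the number of cores.
chunks = [data[i::cores] for i in range(cores)]
with ThreadPoolExecutor(max_workers=cores) as pool:
    total = sum(pool.map(work, chunks))
```

The analogous TPL call would be `Parallel.ForEach` or PLINQ over the chunks; the sizing principle (workers ≈ cores) is the same.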
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
  • Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off: the risk of failure or poor user experience from not having excess capacity vs. the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
Pipeline: Uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → Compressed content
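The payoff of step 1 is easy to demonstrate with the standard library: for typical repetitive markup, gzip cuts both the bytes stored and the bytes sent over the wire. The sample content is illustrative.

```python
import gzip

# Repetitive HTML, the best case for gzip but representative of real markup.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)  # fraction of the original size
```

Because bandwidth and storage are both billed per byte, this one-line transformation reduces two line items at once, at a small CPU cost (the "trade compute for storage" point in step 2).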
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 to 1000 CPU hours
• Sequence databases are growing exponentially
  • GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
  • 100 nodes means peak storage demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
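The input-segmentation strategy above is a classic split/join: partition the query sequences, run each partition independently, merge the hits. A minimal sketch with a stand-in for the BLAST invocation (`blast_stub` is illustrative, not NCBI-BLAST):

```python
def split(sequences, per_partition=100):
    # Fixed-size partitions; the per-partition size is a tuning knob
    # (the micro-benchmarks later pick 100 sequences per partition).
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]

def blast_stub(partition):
    # Stand-in for running BLAST on one partition of query sequences.
    return [f"hit:{seq}" for seq in partition]

sequences = [f"seq{i}" for i in range(250)]
partitions = split(sequences)

# Each partition is independent (pleasingly parallel); here they run
# serially, but each could be one queue message for one worker.
merged = [hit for p in partitions for hit in blast_stub(p)]
```

Database segmentation (the mpiBLAST approach) is harder precisely because the join step is no longer a simple concatenation: hit scores must be re-ranked across database fragments.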
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern
• Leverage the multiple cores of one instance
  • The "-a" argument of NCBI-BLAST
  • Set to 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partitions: load imbalance
  • Small partitions: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
  • Best practice: test runs to profile, and set the size to mitigate the overhead
• Value of the visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: an unnecessarily long period of waiting in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
…
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
…
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry
NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
…
Merging Task
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB in size)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM) in four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working-instance time should be 6-8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe Data Center: in total 34,256 tasks processed
All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed before the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = Rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = Latent heat of vaporization (J/g)
Rn = Net radiation (W m⁻²)
cp = Specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = Dry air density (kg m⁻³)
δq = Vapor pressure deficit (Pa)
ga = Conductivity of air (inverse of ra) (m s⁻¹)
gs = Conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = Psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)
Penman-Monteith (1964)
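As a sketch, the combination equation above can be coded directly; the sample input values below are illustrative only, not taken from the MODIS pipeline:

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv).

    Units as on the slide: Rn in W m^-2, λv in J/g, γ ≈ 66 Pa K^-1.
    """
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Hypothetical daytime values, chosen only to exercise the formula:
et = penman_monteith_et(delta=145.0, r_n=400.0, rho_a=1.2,
                        c_p=1013.0, dq=800.0, g_a=0.02, g_s=0.01)
print(et > 0)   # → True
```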
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate result sinusoidal tiles
• Simple nearest neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
[Pipeline diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal (Request Queue) and download scientific results. The Data Collection Stage pulls from the Download Queue and Source Imagery Download Sites, using Source Metadata; the Reprojection Stage, Derivation Reduction Stage, and Analysis Reduction Stage are driven by the Reprojection Queue, Reduction 1 Queue, and Reduction 2 Queue, producing science results.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage>JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage>TaskStatus → Dispatch → <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: Service Monitor (Worker Role) → Parse & Persist <PipelineStage>TaskStatus; Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role) → <Input>Data Storage]
Example Pipeline Stage: Reprojection Service
[Diagram: Reprojection Request → Service Monitor (Worker Role) → Persist ReprojectionJobStatus (each entity specifies a single reprojection job request) → Job Queue → Parse & Persist ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e. a single tile) → Dispatch → Task Queue → GenericWorker (Worker Role) → Reprojection Data Storage. Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile; query the ScanTimeList table to get the list of satellite scan times that cover a target tile; workers read from Swath Source Data Storage.]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6 month project duration
• Small with respect to the people costs, even at graduate student rates
Per-stage figures (data size / files / compute / workers / cost):
• Data Collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20-100 workers: $420 cpu, $60 download
• Derivation Reduction: 5-7 GB, 55K files, 1800 hours, 20-100 workers: $216 cpu, $1 download, $6 storage
• Analysis Reduction: <10 GB, ~1K files, 1800 hours, 20-100 workers: $216 cpu, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest scale computer centers ever constructed and have the potential to be important to both large and small scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data parallel applications and can support many interesting "programming patterns", but tightly coupled low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Tables Recap
- Queues: Their Unique Role in Building Reliable Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
12
Application Model Comparison
Ad Hoc Application Model: Machines Running IIS / ASP.NET; Machines Running Windows Services; Machines Running SQL Server
Windows Azure Application Model: Web Role Instances; Worker Role Instances; Azure Storage (Blob, Queue, Table); SQL Azure
Key Components
Fabric Controller
• Manages hardware and virtual machines for service
Compute
• Web Roles: web application front end
• Worker Roles: utility compute
• VM Roles: custom compute role; you own and customize the VM
Storage
• Blobs: binary objects
• Tables: entity storage
• Queues: role coordination
• SQL Azure: SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "Cloud Layer" on top of:
• Windows Server 2008
• A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
• Role types
• Role VM sizes
• External and internal endpoints
• Local storage
• The configuration settings configure a service:
• Instance count
• Storage keys
• Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
• Power-on automation devices
• Routers / switches
• Hardware load balancers
• Physical servers
• Virtual servers
• State transitions
• Current State
• Goal State
• Does what is needed to reach and maintain the goal state
• It's a perfect IT employee:
• Never sleeps
• Doesn't ever ask for a raise
• Always does what you tell it to do in configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web Front End
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
• Protected space on the local drive, considered volatile storage
• May communicate with outside services
• Azure Storage
• SQL Azure
• Other Web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging
Scalable, Fault Tolerant Applications
Queues are the application glue
• Decouple parts of the application, easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
• You own the box
• How it works:
• Download "Guest OS" to Server 2008 Hyper-V
• Customize the OS as you need to
• Upload the differences VHD
• Azure runs your VM role using:
• Base OS
• Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy and manage that diagram for you:
• Find hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• Encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
• (which is better than six months)
Service Management API
• REST based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap…
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and Blobs as you want
• Standard REST interface
• PutBlob: inserts a new blob, overwrites the existing blob
• GetBlob: get whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each Blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top level folder
• Has an unlimited capacity
• Can only contain BLOBs
Each container has an access level:
• Private (default; will require the account key to access)
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: Big.mpg uploaded as blocks 1 6 8 3 5 4 7 2, then committed into the blob Big.mpg]
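The block semantics above (stage blocks under IDs, then commit a block list that fixes their order) can be modeled with a toy class; the names are hypothetical, not the real Storage Client API:

```python
class ToyBlockBlob:
    """Toy model of block-blob semantics: stage blocks, then commit a block list."""

    def __init__(self):
        self._staged = {}        # uncommitted blocks, keyed by block id
        self._committed = []     # ordered list of (block_id, data)

    def put_block(self, block_id, data):
        # Staged only: not part of the blob until committed (and in the real
        # service, garbage-collected after about a week if never committed).
        self._staged[block_id] = data

    def put_block_list(self, block_ids):
        # The commit list fixes block order, regardless of upload order.
        self._committed = [(b, self._staged[b]) for b in block_ids]
        self._staged.clear()

    def content(self):
        return b"".join(data for _, data in self._committed)

blob = ToyBlockBlob()
blob.put_block("b2", b"world")          # blocks may arrive in any order
blob.put_block("b1", b"hello ")
blob.put_block_list(["b1", "b2"])       # commit decides the final order
print(blob.content())                   # → b'hello world'
```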
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1 minute exclusive write lock on a BLOB
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• Drive made durable through standard Page Blob replication
• Drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table Name: Movies (Star Wars, Star Trek, Fan Boys)
• Table Name: Customers (Brian H. Prince, Jason Argonaut, Bill Gates)
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available & durable
• Data is replicated several times
• Familiar and easy to use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server side joins between tables
• Create custom indexes on the tables
• No server side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
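A toy model of the constraints above: every entity must carry PartitionKey, RowKey, and Timestamp, entities are addressed by the key pair, and otherwise the schema may vary per entity. Class and method names are illustrative, not the real Table service API:

```python
import time

class ToyTable:
    """Toy Azure-style table: entities keyed by (PartitionKey, RowKey)."""

    def __init__(self):
        self._rows = {}

    def insert(self, entity):
        # The required properties; everything else is free-form per entity.
        for required in ("PartitionKey", "RowKey"):
            if required not in entity:
                raise ValueError(f"missing {required}")
        entity.setdefault("Timestamp", time.time())
        self._rows[(entity["PartitionKey"], entity["RowKey"])] = entity

    def point_query(self, partition_key, row_key):
        # A point query (both keys supplied) is the most efficient lookup.
        return self._rows.get((partition_key, row_key))

movies = ToyTable()
movies.insert({"PartitionKey": "Action", "RowKey": "Fast & Furious",
               "ReleaseDate": 2009})
movies.insert({"PartitionKey": "Comedy", "RowKey": "Office Space",
               "ReleaseDate": 1999, "Director": "Mike Judge"})  # extra column is fine
print(movies.point_query("Action", "Fast & Furious")["ReleaseDate"])  # → 2009
```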
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• Controls entity locality
Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Our system load balances to meet your traffic needs
• Single partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $3512
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $1000
Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition
Container Name | Blob Name
image | annarbor bighouse.jpg
image | foxborough gillette.jpg
video | annarbor bighouse.jpg
Messages – Queue Name
• All messages for a single queue belong to the same partition
Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage Account
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Initially one server holds the whole range:
• Server A: Table = Movies [Min - Max]
As load grows, the range is split across servers:
• Server A: Table = Movies [Min - Comedy)
• Server B: Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may stop early and return a continuation token at:
• A maximum of 1000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds to execute the query
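The rule above (keep issuing requests until no continuation token comes back) can be sketched against a fake paged query; the function names and paging mechanics are illustrative:

```python
def fake_paged_query(rows, continuation=0, page_size=1000):
    """Return at most page_size rows plus a continuation token (or None)."""
    page = rows[continuation:continuation + page_size]
    end = continuation + page_size
    next_token = end if end < len(rows) else None
    return page, next_token

def fetch_all(rows):
    # Always loop on the continuation token: any single call may return a
    # partial page (row cap, partition range boundary, or execution time cap).
    results, token = [], 0
    while token is not None:
        page, token = fake_paged_query(rows, token)
        results.extend(page)
    return results

data = list(range(2500))          # forces three pages: 1000 + 1000 + 500
print(len(fetch_all(data)))       # → 2500
```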
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Avoid "append only" patterns; distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on "Server busy"
• The system load balances partitions to meet traffic needs
• Or the load on a single partition has exceeded the limits
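One mitigation named above (prefixing keys with a stable hash so sequential "append only" keys spread across partitions) can be sketched as follows; the bucket count and key format are arbitrary choices:

```python
import hashlib

def spread_partition_key(natural_key, buckets=16):
    """Prefix the natural key with a stable hash bucket so sequential keys
    (timestamps, auto-increment ids) land on different partitions."""
    digest = hashlib.md5(natural_key.encode()).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}-{natural_key}"

# Sixty consecutive timestamps now scatter across several bucket prefixes:
keys = [spread_partition_key(f"2010-12-07T10:00:{s:02d}") for s in range(60)]
prefixes = {k.split("-", 1)[0] for k in keys}
print(len(prefixes) > 1)   # → True
```

The trade-off: range queries over the natural key must now fan out across all buckets, which is why the slide pairs this with "expect continuation tokens for range queries".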
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add messages (Msg 1 … Msg 4) to the Queue; Worker Roles call GetMessage (with a visibility timeout) to dequeue messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
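The polling scheme above can be sketched as a small helper: double the interval on each empty poll, reset on success, truncate at a cap. The base and cap values are arbitrary choices:

```python
def next_poll_interval(current, got_message, base=1.0, cap=60.0):
    """Truncated exponential back-off: 2x on empty poll, reset on success."""
    if got_message:
        return base                    # a successful poll resets the interval
    return min(current * 2, cap)       # an empty poll doubles it, up to the cap

interval = 1.0
for _ in range(8):                     # eight consecutive empty polls
    interval = next_poll_interval(interval, got_message=False)
print(interval)                                        # → 60.0 (truncated)
print(next_poll_interval(interval, got_message=True))  # → 1.0 (reset)
```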
Removing Poison Messages
Scenario, with producers P1/P2 and consumers C1/C2 on queue Q:
1. C1 calls GetMessage(Q, 30 s) and receives msg 1
2. C2 calls GetMessage(Q, 30 s) and receives msg 2
3. C2 consumes msg 2
4. C2 calls DeleteMessage(Q, msg 2)
5. C1 crashes
6. msg 1 becomes visible again 30 s after its dequeue
7. C2 calls GetMessage(Q, 30 s) and receives msg 1
8. C2 crashes
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 is restarted
11. C1 calls GetMessage(Q, 30 s) and receives msg 1
12. msg 1's DequeueCount is now > 2
13. C1 calls DeleteMessage(Q, msg 1), removing it as a poison message
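The sequence above ends with the consumer deleting msg 1 once its dequeue count passes a threshold. A minimal consumer-side sketch of that check, with illustrative names and threshold rather than the real queue API:

```python
def process(queue_entry, handler, max_dequeues=3):
    """Delete a message as poison once it has been dequeued too many times."""
    queue_entry["dequeue_count"] += 1
    if queue_entry["dequeue_count"] > max_dequeues:
        return "deleted-as-poison"     # stop retrying a message that keeps killing workers
    try:
        handler(queue_entry["body"])
        return "processed"
    except Exception:
        return "will-retry"            # message becomes visible again after the timeout

def crashing_handler(body):
    raise RuntimeError("worker crash while handling " + body)

msg = {"body": "msg1", "dequeue_count": 0}
outcomes = [process(msg, crashing_handler) for _ in range(4)]
print(outcomes)  # → ['will-retry', 'will-retry', 'will-retry', 'deleted-as-poison']
```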
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order delivery
Use DequeueCount to remove poison messages
• Enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message
• For messages > 8 KB
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
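The deck's concurrency examples are .NET-specific (IO Completion Ports, the Task Parallel Library); as a rough analogue, a data-parallel sketch sized to the core count might look like this in Python, where the workload function is a stand-in:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def score(item):
    # Stand-in for a per-item unit of work (data parallelism).
    return item * item

# Match the worker count to available cores rather than oversubscribing.
workers = os.cpu_count() or 1
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(score, range(10)))
print(results[:4])   # → [0, 1, 4, 9]
```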
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the costs of having idling VMs
Performance vs. Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
(Diagram: uncompressed content → compressed content, via Gzip/minify JavaScript, minify CSS, minify images)
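Point 1 is easy to try from any language; a small Python sketch of gzipping a response body (in practice web servers and frameworks usually do this via configuration):

```python
import gzip

def compress_response(body: bytes) -> bytes:
    # Gzip the response body; clients advertise support via the
    # Accept-Encoding: gzip request header
    return gzip.compress(body)

html = b"<html>" + b"hello world " * 500 + b"</html>"
compressed = compress_response(html)
# Repetitive markup compresses very well, cutting both bandwidth
# and storage costs
savings = 1 - len(compressed) / len(html)
```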
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile inside and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large-volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
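Query segmentation amounts to chunking the FASTA input by sequence; a rough Python sketch (a hypothetical helper for illustration, not AzureBLAST's actual splitter):

```python
def split_fasta(text, seqs_per_partition):
    # Group FASTA records (each starting with a '>' header line) into
    # partitions of N sequences; each partition becomes one parallel task.
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    return ["\n".join(records[i:i + seqs_per_partition])
            for i in range(0, len(records), seqs_per_partition)]

parts = split_fasta(">a\nAAA\n>b\nCCC\n>c\nGGG", 2)
```

Because each partition is queried independently against the same database, the only coordination needed is merging the per-partition results at the end.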
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow: a simple split/join pattern
Leverage the multiple cores of one instance
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
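The visibilityTimeout guidance above can be folded into a small rule of thumb; a hypothetical sketch based on profiling runs (the safety factor and cap are assumptions, not values from the slides):

```python
def choose_visibility_timeout(sample_runtimes_mins, safety_factor=1.5,
                              cap_mins=120):
    # Long enough that a healthy worker finishes before the message
    # reappears (avoiding repeated computation), short enough that a
    # failed instance's task becomes visible to another worker quickly.
    worst = max(sample_runtimes_mins)
    return min(cap_mins, worst * safety_factor)
```

For example, if profiling runs took 10, 19, and 17 minutes, this picks a 28.5-minute timeout.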
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
Worker
Worker
Worker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry
NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
  • Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time...
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB in size)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place...
Understanding Azure by analyzing logs
A normal log record should be:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
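Finding the "something is wrong" cases can be automated by pairing start and done records; a minimal Python sketch over log lines in the format shown above:

```python
import re

def find_incomplete_tasks(log_lines):
    # Pair 'Executing the task N' with 'Execution of task N is done';
    # tasks that start but never finish indicate a failure or restart.
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)
```

Run over the abnormal excerpt above, this flags task 251774, which started at 8:22 but never logged a completion.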
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups; this is an update domain
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and then the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.
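Transcribed numerically, the Penman-Monteith formula might look like the sketch below (symbols follow the slide's variable list; the default γ uses the slide's ≈66 Pa/K, while the λv default is an assumed value the slide does not supply):

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    # ET = (Delta*Rn + rho_a*cp*(dq)*ga) / ((Delta + gamma*(1 + ga/gs)) * lambda_v)
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```

The structure makes the slide's point visible: the tricky terms are the conductivities ga and gs, which vary across a catchment and drive the ratio in the denominator.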
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks - recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables

(Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage>JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage>TaskStatus → Dispatch → <PipelineStage> Task Queue)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a GenericWorker (Worker Role)
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

(Diagram: Service Monitor (Worker Role) → Parse & Persist <PipelineStage>TaskStatus; Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role) → <Input>Data Storage)
Example Pipeline Stage: Reprojection Service

(Diagram: Reprojection Request → Service Monitor (Worker Role) → Persist ReprojectionJobStatus → Job Queue → Parse & Persist ReprojectionTaskStatus → Dispatch → Task Queue → GenericWorker (Worker Role) → Swath Source Data Storage, Reprojection Data Storage, ScanTimeList, SwathGranuleMeta)

• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9, Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault-Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
12
Application Model Comparison
Machines Running IIS / ASP.NET
Machines Running Windows Services
Machines Running SQL Server
Ad Hoc Application Model
Web Role Instances
Worker Role Instances
Azure Storage: Blob, Queue, Table
SQL Azure
Windows Azure Application Model
Key Components
Fabric Controller
• Manages hardware and virtual machines for services
Compute
• Web Roles
  • Web application front end
• Worker Roles
  • Utility compute
• VM Roles
  • Custom compute role; you own and customize the VM
Storage
• Blobs
  • Binary objects
• Tables
  • Entity storage
• Queues
  • Role coordination
• SQL Azure
  • SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service
  • Instance count
  • Storage keys
  • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging

Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differencing VHD
  • Azure runs your VM role using:
    • Base OS
    • Differencing VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, ...
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage, At Massive Scale
Blob
- Massive files, e.g., videos, logs
Drive
- Use standard file system APIs
Tables
- Non-relational, but with few scale limits
- Use SQL Azure for relational data
Queues
- Facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob
    • Inserts a new blob, overwrites the existing blob
  • GetBlob
    • Gets a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
- Private
  - Default; will require the account key to access
- Full public read
- Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
    • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
    • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

(Diagram: big.mpg assembled from blocks 1, 6, 8, 3, 5, 4, 7, 2)
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount the Page Blob as X:\
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
  Table Name: Movies
    Entities: Star Wars, Star Trek, Fan Boys
  Table Name: Customers
    Entities: Brian H. Prince, Jason Argonaut, Bill Gates

Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
    • Billions of entities (rows) and TBs of data
    • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language

Is not relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
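A sketch of what "schema-free plus three required properties" means in practice (plain Python dicts standing in for REST entities; property names follow the slides' movie example):

```python
from datetime import datetime, timezone

def movie_entity(category, title, release_date):
    # Every Azure Table entity carries PartitionKey, RowKey, and Timestamp;
    # all other properties are schema-free and can vary per entity.
    return {
        "PartitionKey": category,   # unit of scale and locality
        "RowKey": title,            # unique within the partition
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        "ReleaseDate": release_date,
    }

e = movie_entity("Action", "Fast & Furious", 2009)
```

Together, PartitionKey + RowKey act as the primary key, and the PartitionKey choice determines which server serves the entity, which is why the later key-selection slides matter.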
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Our system load balances to meet your traffic needs
• Single partition limits may have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has written to all three replicas
• Reads are only load balanced to replicas in sync

(Diagram: partitions P1, P2, ..., Pn replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage Account
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | ...       | 2009
Action                  | The Bourne Ultimatum     | ...       | 2007
...                     | ...                      | ...       | ...
Animation               | Open Season 2            | ...       | 2009
Animation               | The Ant Bully            | ...       | 2006
...                     | ...                      | ...       | ...
Comedy                  | Office Space             | ...       | 1999
...                     | ...                      | ...       | ...
SciFi                   | X-Men Origins: Wolverine | ...       | 2009
...                     | ...                      | ...       | ...
War                     | Defiance                 | ...       | 2008

Server A: Table = Movies [Min - Max]
After the range is split for load balancing:
Server A: Table = Movies [Min - Comedy)
Server B: Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
  • Distribute by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries
  • Server busy
  • Load balance partitions to meet traffic needs
  • Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology / Message Lifecycle
From the diagram: a web role enqueues work with PutMessage; the queue holds the messages (Msg 1 … Msg 4); worker roles pull work with GetMessage (with a visibility timeout) and, once processing succeeds, call RemoveMessage.
PutMessage:
POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DeleteMessage:
DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
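The key semantic in the lifecycle above is that GetMessage hides a message for the visibility timeout rather than removing it; only an explicit delete (with the pop receipt) removes it. A toy in-memory model of that behavior (not the Azure API; times are passed in explicitly for clarity):

```python
import time


class InMemoryQueue:
    """Toy model of queue visibility-timeout semantics.

    get() hides a message for `timeout` seconds instead of removing it,
    so a crashed consumer's message reappears automatically; delete()
    is the only operation that removes a message for good.
    """

    def __init__(self):
        self._msgs = {}      # msg_id -> (body, visible_at)
        self._next_id = 0

    def put(self, body):
        self._msgs[self._next_id] = (body, 0.0)
        self._next_id += 1

    def get(self, timeout, now=None):
        now = time.monotonic() if now is None else now
        for mid, (body, visible_at) in sorted(self._msgs.items()):
            if visible_at <= now:
                # Hide the message until the timeout elapses.
                self._msgs[mid] = (body, now + timeout)
                return mid, body
        return None

    def delete(self, mid):
        self._msgs.pop(mid, None)
```

The returned `mid` plays the role of the pop receipt: the caller needs it to delete the message after processing.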
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x, up to a maximum; a successful poll sets the interval back to 1.
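The back-off rule above fits in one function; `initial` and `maximum` are tunables (1 s and 60 s here are illustrative, not prescribed):

```python
def next_interval(current, hit, initial=1.0, maximum=60.0):
    """Truncated exponential back-off for queue polling.

    Each empty poll doubles the wait, truncated at `maximum`;
    a successful poll resets the wait to `initial`.
    """
    if hit:
        return initial
    return min(current * 2, maximum)
```

A polling loop would sleep `next_interval(...)` seconds between GetMessage calls, so an idle queue costs at most one transaction per `maximum` seconds.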
Removing Poison Messages
Scenario from the diagrams (producers P1, P2; consumers C1, C2; queue Q holds msg 1 and msg 2):
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2: treat it as a poison message
13. Delete(Q, msg 1)
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use DequeueCount to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
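The poison-message rule from the recap is a small guard at the top of the worker loop. A sketch, where `process`, `dead_letter`, and `delete` are hypothetical callables for the app's own handling (the threshold of 3 is illustrative):

```python
MAX_DEQUEUE = 3


def handle(msg, process, dead_letter, delete):
    """Poison-message guard: a message that keeps reappearing has
    crashed its consumers repeatedly, so pull it aside (dead-letter it)
    instead of retrying it forever."""
    if msg["DequeueCount"] > MAX_DEQUEUE:
        dead_letter(msg)
        delete(msg)
        return "poisoned"
    process(msg)
    delete(msg)      # only after successful processing
    return "processed"
```

`DequeueCount` is the property the slides rely on: the service increments it each time the message is dequeued, so it counts prior failed attempts.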
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library.

More info on Windows Azure Storage:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
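The slides' concrete tool is the .NET Task Parallel Library; as a language-neutral sketch of the data-parallelism idea (fan one job per item out across a pool sized to the instance's cores), here is a Python analogue using a thread pool. The worker count is an assumption: match it to the VM size (e.g., 8 for an extra-large instance).

```python
from concurrent.futures import ThreadPoolExecutor


def parallel_map(func, items, workers=8):
    """Data parallelism: apply func to every item concurrently.

    workers should roughly match the core count of the role instance;
    oversubscribing far beyond the cores mostly adds scheduling
    overhead for CPU-bound work.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))   # preserves input order
```

Task parallelism is the same machinery with heterogeneous callables submitted via `pool.submit` instead of one function mapped over data.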
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile; e.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places; sending fewer things also means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Diagram: uncompressed content vs. compressed content: gzip/minify JavaScript, minify CSS, minify images.)
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input; segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST); needs special result-reduction processing

Large volumes of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern: split the input sequences; query the partitions in parallel; merge the results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations: batch job management; task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans the input out into many BLAST tasks, and a merging task joins their results.

Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead; data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead

Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure
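The split/join pattern above can be sketched in a few lines; the partition size is a tunable set by profiling (names here are illustrative, not from the AzureBLAST code):

```python
def split_input(sequences, per_partition=100):
    """Splitting task: chunk the input sequences into fixed-size
    partitions, one work ticket per partition. Larger partitions risk
    load imbalance; smaller ones pay repeated startup overhead."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]


def merge_results(partial_results):
    """Merging task: join the per-partition outputs back into one
    result set, in partition order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged
```

In the real pipeline each partition would be enqueued as a queue message and the merge would run once all partitions report done.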
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity

Task size / instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST Architecture
Condensed from the architecture diagram:
• Web Role: web portal and web service; handles job registration
• Job Management Role: job scheduler and scaling engine; dispatches work through a global dispatch queue
• Worker roles: run the splitting task, the BLAST tasks, and the merging task
• Database updating Role: keeps the NCBI databases current
• Azure Table: job registry
• Azure Blob: NCBI databases, BLAST databases, temporary data, etc.
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID

The accepted job is stored in the job registry table:
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

BLASTed ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divide the 10 million sequences into multiple segments: each segment is submitted to one deployment as one job for execution, and each segment consists of smaller partitions
• When the load is imbalanced, redistribute it manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should be:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups; this is an update domain:
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
• ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m⁻²)
• cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
• ρa = dry air density (kg m⁻³)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s⁻¹)
• gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
• γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; a big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
• 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

Pipeline flow (from the diagram): scientists submit requests through the AzureMODIS Service web role portal into the request queue; tiles move from the source imagery download sites through the download queue (data collection stage), the reprojection queue (reprojection stage), and the reduction 1 and reduction 2 queues (derivation and analysis reduction stages); scientific results are then available for download.

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the web role front door: it receives all user requests and queues each request to the appropriate download, reprojection, or reduction job queue
• The Service Monitor is a dedicated worker role: it parses all job requests into tasks (recoverable units of work); the execution status of all jobs and tasks is persisted in tables

Diagram: a <PipelineStage> request reaches the MODISAzure Service (web role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (worker role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a worker role (GenericWorker):
• Dequeues tasks created by the Service Monitor from the <PipelineStage> task queue
• Reads and writes the <Input> data storage
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
From the diagram: a reprojection request lands in the job queue, where each entity specifies a single reprojection job request. The Service Monitor (worker role) persists ReprojectionJobStatus, parses the job into tasks, and persists ReprojectionTaskStatus, where each entity specifies a single reprojection task (i.e., a single tile). Tasks are dispatched to the task queue and picked up by GenericWorker roles, which query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile, query the ScanTimeList table for the list of satellite scan times that cover a target tile, read the swath source data storage, and write the reprojection data to storage.
Costs for 1 US Year ET Computation
• Computational costs are driven by the data scale and the need to run the reduction multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage figures (reading the pipeline diagram in order):
• Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20-100 workers: $420 CPU, $60 download
• Derivation reduction: 5-7 GB, 55K files, 1800 hours, 20-100 workers: $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20-100 workers: $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
Key Components

Fabric Controller
• Manages hardware and virtual machines for the service

Compute
• Web Roles: web application front end
• Worker Roles: utility compute
• VM Roles: custom compute role; you own and customize the VM

Storage
• Blobs: binary objects
• Tables: entity storage
• Queues: role coordination
• SQL Azure: SQL in the cloud
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "Cloud Layer" on top of Windows Server 2008 and a custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service: role types; role VM sizes; external and internal endpoints; local storage
• The configuration settings configure a service: instance count; storage keys; application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware): power-on automation devices; routers; switches; hardware load balancers; physical servers; virtual servers
• State transitions: current state; goal state; does what is needed to reach and maintain the goal state
• It's a perfect IT employee: never sleeps; doesn't ever ask for a raise; always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services

You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute on Windows Server 2008
• Background processing
• Each role can define an amount of local storage: protected space on the local drive, considered volatile storage
• May communicate with outside services: Azure Storage; SQL Azure; other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, fault-tolerant applications: queues are the application glue.
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role: you own the box
• How it works: download the "Guest OS" to Server 2008 Hyper-V; customize the OS as you need to; upload the differences VHD; Azure runs your VM role using the base OS plus the differences VHD
Application Hosting
'Grokking' the Service Model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files: ServiceDefinition.csdef and ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files: an encrypted package of your code, and your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model: determine resource requirements; create role images
2. Allocate resources
3. Prepare nodes: place role images on nodes; configure settings; start roles
4. Configure load balancers
5. Maintain service health: if a role fails, restart the role based on policy; if a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable storage at massive scale:
• Blobs: massive files, e.g., videos, logs
• Drives: use standard file-system APIs
• Tables: non-relational, but with few scale limits; use SQL Azure for relational data
• Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob: inserts a new blob, overwrites an existing blob
  • GetBlob: get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs

Each container has an access level:
• Private (the default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood

Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks; each block is identified by a Block ID
• Size limit: 200 GB per blob

Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages; each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob

Blocks
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

(Diagram: Big.mpg split into blocks 1, 6, 8, 3, 5, 4, 7, 2, then committed as Big.mpg.)
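The client side of the block upload above is just chunking plus ID generation; a sketch of those two steps (it stops short of the Put Block / Put Block List REST calls themselves):

```python
import base64


def split_into_blocks(data: bytes, block_size: int):
    """Carve a large payload into blocks; each would be sent with one
    Put Block call, and a final Put Block List commits them (in any
    order) into a single blob."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def make_block_ids(n: int):
    """Block IDs must be base64 strings of equal length within a blob;
    a zero-padded counter is a common convention (an assumption here,
    not mandated by the service)."""
    return [base64.b64encode(f"{i:08d}".encode("ascii")).decode("ascii")
            for i in range(n)]
```

Because uncommitted blocks are garbage-collected after a week, an interrupted upload can simply be resumed or abandoned without cleanup.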
Pages
• Similar to block blobs, but optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and handle the error if it doesn't exist
Table Structure
Account: MovieData
• Table "Movies" — entities: Star Wars, Star Trek, Fan Boys
• Table "Customers" — entities: Brian H. Prince, Jason Argonaut, Bill Gates

Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables: billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available & durable: data is replicated several times
• Familiar and easy-to-use API: WCF Data Services and OData, .NET classes and LINQ, REST with any platform or language
Is not relational — cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
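An entity is therefore just a property bag with two required keys you choose plus a service-maintained timestamp. A sketch of constructing one (plain Python dict; in practice the storage client or OData layer does this for you):

```python
from datetime import datetime, timezone

def make_entity(partition_key, row_key, **props):
    """Every table entity carries PartitionKey and RowKey (chosen by
    you) plus a Timestamp maintained by the service; any extra
    properties can vary freely between entities in the same table."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # The service stamps this on write; set here only to illustrate.
        "Timestamp": datetime.now(timezone.utc).isoformat(),
    }
    entity.update(props)
    return entity

movie = make_entity("Action", "Fast & Furious", ReleaseDate=2009)
```

Note how two movies in the same table could carry entirely different extra properties — that is the "schema can vary" point above.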
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
• Partitioning is different for each data type (blobs, entities, queues); every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• The partition key controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• A busy response means a single partition's limits have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey; entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

• Blobs – Container name + Blob name; every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

• Messages – Queue Name; all messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync

(Diagram: partitions P1 … Pn each replicated across Server 1, Server 2, and Server 3.)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
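The recommended response to '503 Server Busy' can be captured in a small retry wrapper. A minimal sketch (the exception type is a stand-in for whatever your storage client raises on a 503):

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for the storage client's '503 Server Busy' error."""

def with_backoff(op, max_attempts=6, base=0.1, cap=30.0):
    """Retry op() with truncated exponential backoff plus jitter —
    the suggested reaction when a partition's limits are hit."""
    for attempt in range(max_attempts):
        try:
            return op()
        except ServerBusyError:
            # Delay doubles each attempt, truncated at `cap`,
            # with jitter so clients don't retry in lockstep.
            delay = min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
            time.sleep(delay)
    raise RuntimeError("gave up after %d attempts" % max_attempts)
```

Combined with splitting hot data across more partition keys (or more storage accounts), this keeps transient throttling from surfacing as user-visible failures.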
PartitionKey (Category) | RowKey (Title)       | Timestamp | ReleaseDate
Action                  | Fast & Furious       | …         | 2009
Action                  | The Bourne Ultimatum | …         | 2007
…                       | …                    | …         | …
Animation               | Open Season 2        | …         | 2009
Animation               | The Ant Bully        | …         | 2006
PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
Comedy                  | Office Space              | …         | 1999
…                       | …                         | …         | …
SciFi                   | X-Men Origins: Wolverine  | …         | 2009
…                       | …                         | …         | …
War                     | Defiance                  | …         | 2008
PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
Action                  | Fast & Furious            | …         | 2009
Action                  | The Bourne Ultimatum      | …         | 2007
…                       | …                         | …         | …
Animation               | Open Season 2             | …         | 2009
Animation               | The Ant Bully             | …         | 2006
…                       | …                         | …         | …
Comedy                  | Office Space              | …         | 1999
…                       | …                         | …         | …
SciFi                   | X-Men Origins: Wolverine  | …         | 2009
…                       | …                         | …         | …
War                     | Defiance                  | …         | 2008
Partitions and Partition Ranges
Initially one server holds the whole key range:
• Server A: Table = Movies [Min – Max]
After the system splits the range for load balancing:
• Server A: Table = Movies [Min – Comedy)
• Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics; reduce round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• A token is returned at the end of a partition range boundary
• Maximum of 5 seconds to execute the query
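Because a response can stop early for any of these reasons, every table query loop has to drain continuation tokens. A generic sketch of that loop (the `execute_query` callable stands in for whatever client call you use):

```python
def query_all(execute_query):
    """Drain a table query that may return continuation tokens.

    execute_query(token) -> (rows, next_token); next_token is None
    when the result set is exhausted. The service returns at most
    1,000 rows per response and may also stop early at a partition
    boundary or at the 5-second execution limit."""
    token = None
    results = []
    while True:
        rows, token = execute_query(token)
        results.extend(rows)
        if token is None:
            return results
```

The common bug this prevents: treating a short first page as the complete result set.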
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Distribute load by using a hash etc. as a prefix
• Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server Busy" means the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together; tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML and are limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
(Diagram: a Web Role calls PutMessage to add Msg 1 … Msg 4 to the queue; Worker Roles call GetMessage with a visibility timeout, process the message, and then call RemoveMessage once processing completes.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll doubles the interval (up to a cap), and a successful poll resets the interval back to 1.
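The interval rule above is one line of code; the sketch below makes it concrete (the 1-second floor and 60-second ceiling are illustrative choices):

```python
def next_poll_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off for queue polling:
    an empty poll doubles the interval (up to the ceiling);
    a successful poll resets it to the floor."""
    if got_message:
        return floor
    return min(ceiling, current * 2)
```

The payoff: an idle worker converges to one cheap poll per minute instead of hammering the queue (and paying per transaction), while a busy worker stays responsive.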
Removing Poison Messages
(Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use DequeueCount to remove poison messages
• Messages > 8 KB: use a blob to store the message data with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
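Putting the recap together — idempotent processing, delete-on-success, and a DequeueCount threshold for poison messages — a worker loop body looks roughly like this (the queue/message objects are hypothetical stand-ins for the storage client's):

```python
MAX_DEQUEUE = 3  # illustrative threshold

def process_one(queue, handler):
    """One iteration of a worker loop. A message whose dequeue count
    exceeds the threshold is treated as poison and deleted (or parked
    in a dead-letter store) instead of being retried forever."""
    msg = queue.get_message(visibility_timeout=30)
    if msg is None:
        return                      # queue empty; caller backs off
    if msg.dequeue_count > MAX_DEQUEUE:
        queue.delete_message(msg)   # poison: stop retrying it
        return
    handler(msg)                    # must be idempotent by design
    queue.delete_message(msg)       # done: remove before timeout expires
```

If the worker crashes between `handler` and `delete_message`, the message reappears after the visibility timeout — which is exactly why the handler must be idempotent.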
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU against having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
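In the deck's context you would reach for the Task Parallel Library; the same data-parallel idea — map one operation over a collection using a pool sized to the instance's cores — looks like this in Python (used for the runnable examples here):

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(doc):
    return len(doc.split())

docs = ["a b c", "d e", "f"]

# Data parallelism: the same operation applied to every element,
# scheduled across a worker pool (size it to the core count).
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(word_count, docs))

print(counts)  # [3, 2, 1]
```

Task parallelism is the same executor with *different* callables submitted via `pool.submit`, which is how you keep all cores of a large instance busy instead of paying for idle ones.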
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand the application's storage profile and how storage billing works
• Make service choices based on your app profile (e.g., SQL Azure has a flat fee while Windows Azure Tables charges per transaction)
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
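The compute-for-storage trade in point 2 is easy to see with a standard-library sketch: compress a repetitive JSON payload before writing it to blob storage or returning it to the browser:

```python
import gzip
import json

# A large, repetitive JSON payload — typical web-app output.
payload = json.dumps(
    {"rows": [{"id": i, "name": "item-%d" % i} for i in range(1000)]}
).encode("utf-8")

# Spend a little CPU to shrink what you store and send.
compressed = gzip.compress(payload)

# Round-trips losslessly; all modern browsers gunzip on the fly.
assert gzip.decompress(compressed) == payload
print(len(payload), "->", len(compressed))
```

For static content you can compress once at upload time and serve the compressed bytes directly, paying the CPU cost only once.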
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially; GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST): needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100× larger than the input
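The query-segmentation approach is a plain split/join: cut the input sequence list into partitions, BLAST each partition independently, then concatenate the hit lists. A minimal sketch of that structure (the actual BLAST invocation is omitted):

```python
def split_queries(sequences, partition_size):
    """Query segmentation: cut the input sequence list into
    partitions that worker roles can BLAST independently."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(partial_results):
    """Join step: concatenating per-partition hit lists suffices when
    the database (not the query set) is shared by all workers."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged
```

Database segmentation (the mpiBLAST style) is harder precisely because the per-segment hits for one query must then be re-ranked against each other, not just concatenated.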
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern: split the input sequences, query partitions in parallel, merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations: batch job management; task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multiple cores of one instance:
• argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
(Task flow: splitting task → BLAST tasks in parallel → merging task)
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler, global dispatch queue, and scaling engine; Worker Roles execute tasks; a database-updating role refreshes the NCBI databases; Azure Tables hold the job registry, and Azure Blobs hold the BLAST databases, temporary data, etc. Task flow: splitting task → BLAST tasks in parallel → merging task.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored into the job registry table:
• Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 cores across 475 extra-large VMs (8 cores per VM) in four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST; each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments; each is submitted to one deployment as one job for execution; each segment consists of smaller partitions
• When load imbalances appear, redistribute the load manually
(Diagram: per-deployment instance counts of roughly 50–62 VMs each.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
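Detecting the "something is wrong" case can be automated by pairing start and completion records and reporting tasks that never finished. A sketch (the log format is paraphrased from the slides; adjust the patterns to the real records):

```python
import re

# Start/done patterns matching the record shapes shown above.
START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")

def unfinished_tasks(lines):
    """Return task IDs that were started but never reported done —
    e.g., the task was running when its node was rebooted."""
    started, finished = set(), set()
    for line in lines:
        m = START.search(line)
        if m:
            started.add(m.group(1))
        m = DONE.search(line)
        if m:
            finished.add(m.group(1))
    return started - finished

log = [
    "RD001 Executing the task 251523",
    "RD001 Execution of task 251523 is done, it took 10.9 mins",
    "RD001 Executing the task 251774",
]
print(unfinished_tasks(log))  # {'251774'}
```

Running this per node and bucketing the failures by time is essentially how the upgrade- and storage-failure patterns on the next slides become visible.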
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups — this is an update domain at work:
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed before the job was killed.
35 nodes experienced blob-writing failures at the same time — a reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" — Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.
Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
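The Penman-Monteith formula itself is a one-line computation once the inputs are reduced; the data pipeline exists to produce those inputs per pixel. A direct transcription (γ defaults to the ≈66 Pa/K constant from the variable list; the λv default of ~2260 J/g for water is an assumption for illustration):

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lambda_v=2260.0):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv),
    with units as given in the variable list above."""
    return (delta * Rn + rho_a * cp * dq * ga) / (
        (delta + gamma * (1.0 + ga / gs)) * lambda_v)
```

The hard part the slides call out — estimating the conductivities ga and gs across a catchment — happens upstream of this function.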
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: the AzureMODIS Service web role portal feeds a request queue, download queue, reprojection queue, and two reduction queues; source imagery comes from the download sites, and scientists download the scientific results.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door: it receives all user requests and queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role: it parses all job requests into tasks — recoverable units of work — and persists the execution status of all jobs and tasks in Tables
(Diagram: <PipelineStage> requests flow through the MODISAzure Service (Web Role) into <PipelineStage> job queues; the Service Monitor (Worker Role) persists <PipelineStage>JobStatus, parses and persists <PipelineStage>TaskStatus, and dispatches to <PipelineStage> task queues.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: Generic Workers (Worker Roles) pull from the <PipelineStage> task queues and read from <Input> data storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the job queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus — each entity specifies a single reprojection job request — then parses and persists ReprojectionTaskStatus — each entity specifies a single reprojection task, i.e., a single tile — and dispatches to the task queue consumed by Generic Workers.)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
• Reprojection data and swath source data are held in storage
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Approximate per-stage figures (data volume, files, compute, workers — cost):
• Data collection: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers — $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20–100 workers — $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files, 1800 hours, 20–100 workers — $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20–100 workers — $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of Windows Server 2008 and a custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
Key Components: Fabric Controller (continued)
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service: role types, role VM sizes, external and internal endpoints, local storage
• The configuration settings configure a service: instance count, storage keys, application-specific settings
Key Components: Fabric Controller (continued)
• Manages "nodes" and "edges" in the "fabric" (the hardware): power-on automation devices, routers, switches, hardware load balancers, physical servers, virtual servers
• State transitions: current state and goal state; it does what is needed to reach and maintain the goal state
• It's a perfect IT employee: never sleeps, doesn't ever ask for a raise, always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute on Windows Server 2008
• Background processing
• Each role can define an amount of local storage: protected space on the local drive, considered volatile storage
• May communicate with outside services: Azure Storage, SQL Azure, other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application to scale them independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
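The glue pattern the slides keep returning to is the work ticket: the web role enqueues a small ticket (a blob name, not the payload itself, since messages are capped at 8 KB), and a worker role dequeues it and fetches the real data from blob storage. A toy sketch with in-process stand-ins (names are illustrative):

```python
import queue

tickets = queue.Queue()                   # stands in for an Azure queue
blob_store = {"job-42": b"big payload"}   # stands in for blob storage

def web_role_submit(blob_name):
    # The ticket carries only a reference — easily under the 8 KB cap.
    tickets.put({"blob": blob_name})

def worker_role_step():
    ticket = tickets.get()
    data = blob_store[ticket["blob"]]     # fetch the actual work item
    return len(data)                      # ... process it ...

web_role_submit("job-42")
print(worker_role_step())  # 11
```

Because the two roles share nothing but the queue, either side can be scaled, restarted, or re-prioritized independently — the decoupling the bullets above describe.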
Key Components – Compute: VM Roles
• Customized role: you own the box
• How it works: download the "Guest OS" to Server 2008 Hyper-V, customize the OS as you need to, upload the differences VHD
• Azure runs your VM role using the base OS plus the differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
• Find a hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action — perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
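As an illustration, a minimal ServiceConfiguration.cscfg might look like the sketch below. The role name, setting names, and account values are hypothetical; the namespace is the one the Visual Studio tooling generates for this file.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- ServiceConfiguration.cscfg: instance count and settings per role -->
<ServiceConfiguration serviceName="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebRole1">
    <Instances count="2" />
    <ConfigurationSettings>
      <!-- hypothetical application-specific setting -->
      <Setting name="DataConnectionString"
               value="DefaultEndpointsProtocol=https;AccountName=myaccount;AccountKey=..." />
    </ConfigurationSettings>
  </Role>
</ServiceConfiguration>
```

The matching ServiceDefinition.csdef declares the roles, endpoints, and setting names; the .cscfg supplies the values, so configuration can change without redeploying the package.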
Service Definition
Service Configuration
GUI
Double-click on the role name in the Azure project
Deploying to the cloud
• We can deploy from the portal or from script
• Visual Studio builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certificates for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   • Determine resource requirements
   • Create role images
2. Allocate resources
3. Prepare nodes
   • Place role images on nodes
   • Configure settings
   • Start roles
4. Configure load balancers
5. Maintain service health
   • If a role fails, restart the role based on policy
   • If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob: massive files, e.g. videos, logs
• Drive: use standard file-system APIs
• Tables: non-relational, but with few scale limits; use SQL Azure for relational data
• Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob: inserts a new blob, overwrites the existing blob
  • GetBlob: get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
• Each container has an access level:
  • Private (the default): requires the account key for access
  • Full public read
  • Public read-only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
[Diagram: Big.mpg uploaded as blocks out of order (1, 6, 8, 3, 5, 4, 7, 2), then committed in sequence as Big.mpg]
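The upload-blocks-then-commit flow can be sketched in a few lines. This is a simulation of the semantics, not a call into the real REST API: in-memory dicts stand in for the service, and the block-ID scheme (fixed-width, Base64-encoded, as the service requires) is one reasonable choice.

```python
import base64

def split_into_blocks(data: bytes, block_size: int):
    """Split a payload into {block_id: bytes} plus the natural order,
    as a client would before issuing Put Block calls. Block IDs must
    be Base64-encoded and of equal length within a blob."""
    blocks, order = {}, []
    for i in range(0, len(data), block_size):
        block_id = base64.b64encode(f"{i // block_size:08d}".encode()).decode()
        blocks[block_id] = data[i:i + block_size]
        order.append(block_id)
    return blocks, order

def commit_block_list(blocks, block_list):
    """Simulate Put Block List: the committed blob is the concatenation
    of blocks in the order given, which need not be upload order."""
    return b"".join(blocks[bid] for bid in block_list)

payload = b"0123456789" * 5
blocks, order = split_into_blocks(payload, 16)
assert commit_block_list(blocks, order) == payload
```

Because the commit order is independent of upload order, blocks can be uploaded in parallel (as in the Big.mpg diagram) and assembled at the end.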
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table "Movies": Star Wars, Star Trek, Fan Boys
• Table "Customers": Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account > Table > Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST, with any platform or language

Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
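A sketch of the entity shape, in Python rather than the .NET classes the service ships with. The helper name is hypothetical, and the Timestamp is faked locally (in the real service it is system-maintained); the point is that the three system properties always exist while everything else can vary per entity.

```python
import datetime

def make_entity(partition_key: str, row_key: str, **properties):
    """Build a table-entity-like dict: PartitionKey + RowKey uniquely
    identify the entity; any extra properties may differ from entity
    to entity within the same table."""
    entity = {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # System-maintained in the real service; faked here.
        "Timestamp": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }
    entity.update(properties)
    return entity

movie = make_entity("Action", "The Bourne Ultimatum", ReleaseDate=2007)
customer = make_entity("1", "Customer", Name="John Smith")  # different schema, same idea
assert movie["PartitionKey"] == "Action"
```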
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
• Partitioning is different for each data type (blobs, entities, queues)
• Every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• The partition key controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• Or a single-partition limit has been reached
Partition Keys In Each Abstraction
• Entities: TableName + PartitionKey. Entities with the same PartitionKey value are served from the same partition.

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order-1 | | | $35.12
2 | Customer | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order-3 | | | $10.00

• Blobs: container name + blob name. Every blob and its snapshots are in a single partition.
• Messages: queue name. All messages for a single queue belong to the same partition.
Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync
[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges
Example: a Movies table with PartitionKey = Category and RowKey = Title:

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

Initially a single server can hold the whole table:
• Server A: Table = Movies, range [Min - Max]
As load grows, the system splits the table across servers along partition-key boundaries:
• Server A: Table = Movies, range [Min - Comedy)
• Server B: Table = Movies, range [Comedy - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics, and fewer round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• Token returned at the end of a partition-range boundary
• Maximum of 5 seconds to execute the query
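Because any of the three limits above can truncate a response, query code must loop on the continuation token. A sketch of that loop, with a fake paged source standing in for the table service (the function names are illustrative, not the Storage Client Library API):

```python
def query_all(query_page):
    """Drain a paged query. `query_page` takes a continuation token
    and returns (rows, next_token), with next_token=None on the last
    page -- the shape the table service's 1000-row/5-second limits
    impose on every range query."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Fake paged source: 2500 rows served 1000 at a time.
DATA = list(range(2500))
def fake_page(token):
    start = token or 0
    nxt = start + 1000 if start + 1000 < len(DATA) else None
    return DATA[start:start + 1000], nxt

assert query_all(fake_page) == DATA
```

Code that forgets this loop silently processes only the first page, which is why the slide says "seriously."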
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns: distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on "Server Busy"
• The system load balances partitions to meet traffic needs
• Or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
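The work ticket pattern mentioned above can be sketched as follows: the large payload goes to blob storage, and the queue message carries only a small reference, the "ticket", comfortably under the 8 KB limit. In-memory containers stand in for the blob and queue services; the function names are illustrative.

```python
# Stand-ins for the blob and queue services.
blob_store = {}
queue = []

def submit_job(job_id: str, payload: bytes):
    """Producer side: store the real data in a blob, enqueue a ticket."""
    blob_name = f"jobs/{job_id}"
    blob_store[blob_name] = payload            # big payload lives in blob storage
    queue.append({"job": job_id, "blob": blob_name})  # ticket stays tiny (<8 KB)

def worker_step():
    """Consumer side: dequeue a ticket, fetch the payload by reference."""
    ticket = queue.pop(0)
    data = blob_store[ticket["blob"]]
    return ticket["job"], len(data)

submit_job("42", b"x" * 100_000)   # far larger than the 8 KB message limit
assert worker_step() == ("42", 100_000)
```

Remember to garbage-collect orphaned blobs whose tickets were processed or lost, a point the recap slide returns to.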
Queue Terminology

Message Lifecycle
[Diagram: a web role calls PutMessage to add messages (Msg 1 … Msg 4) to a queue; worker roles call GetMessage (with a visibility timeout) to retrieve messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
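The GetMessage/DeleteMessage semantics in the REST exchange above can be modeled in a few lines: a dequeued message is not removed, it merely becomes invisible for the visibility timeout, and reappears if the consumer never deletes it. This toy class is a simulation of those semantics, not the real client library.

```python
import time

class VisibilityQueue:
    """Toy model of at-least-once delivery with a visibility timeout."""
    def __init__(self):
        self.messages = []              # each entry: [visible_at, body]

    def put(self, body):
        self.messages.append([0.0, body])

    def get(self, timeout, now=None):
        """Return the first visible message and hide it for `timeout` s."""
        now = time.monotonic() if now is None else now
        for m in self.messages:
            if m[0] <= now:
                m[0] = now + timeout    # invisible until the timeout expires
                return m
        return None

    def delete(self, msg):
        """RemoveMessage: only an explicit delete removes the message."""
        self.messages.remove(msg)

q = VisibilityQueue()
q.put("msg 1")
m = q.get(timeout=30, now=0.0)
assert m[1] == "msg 1"
assert q.get(timeout=30, now=10.0) is None          # hidden while "processing"
assert q.get(timeout=30, now=31.0)[1] == "msg 1"    # consumer crashed: it reappears
```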
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the polling interval by 2x, up to a cap
• A successful poll resets the interval back to 1
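A minimal sketch of the back-off schedule (the function name and the 60-second cap are illustrative choices; pick values to suit your workload):

```python
def backoff_intervals(polls, base=1.0, cap=60.0):
    """Truncated exponential back-off: every empty poll doubles the
    sleep interval up to `cap`; a successful poll resets it to `base`.
    `polls` is a sequence of booleans (True = a message was found)."""
    interval, out = base, []
    for got_message in polls:
        out.append(interval)
        interval = base if got_message else min(interval * 2, cap)
    return out

# Six empty polls ramp 1, 2, 4, 8, 16, 32, then the cap; a hit resets to 1.
assert backoff_intervals([False] * 6 + [True] + [False]) == [1, 2, 4, 8, 16, 32, 60, 1]
```

This keeps idle workers from hammering the queue (each GetMessage is a billable transaction) while still reacting quickly once traffic resumes.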
Removing Poison Messages
[Diagram: producers P1, P2 enqueue messages; consumers C1, C2 dequeue them]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
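Steps 11-13 above are the poison-message guard: once a message's dequeue count exceeds a threshold, delete it instead of retrying forever. A simulated sketch (threshold and structure are illustrative; a real worker might also park the poison message in a blob or table for offline inspection):

```python
MAX_DEQUEUE = 2

def drain(queue, handler):
    """Process a queue; messages whose handler keeps crashing are
    removed once their DequeueCount exceeds MAX_DEQUEUE."""
    removed_poison = []
    while queue:
        msg = queue.pop(0)
        msg["DequeueCount"] += 1            # the service tracks this per dequeue
        if msg["DequeueCount"] > MAX_DEQUEUE:
            removed_poison.append(msg["body"])   # delete, don't retry
            continue
        try:
            handler(msg["body"])
        except Exception:
            queue.append(msg)               # visibility timeout expired: reappears
    return removed_poison

q = [{"body": "good", "DequeueCount": 0},
     {"body": "poison", "DequeueCount": 0}]

def handler(body):
    if body == "poison":
        raise RuntimeError("crash while processing")

assert drain(q, handler) == ["poison"]
```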
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
• http://blogs.msdn.com/windowsazurestorage
• http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O completion ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure and poor user experience from not having excess capacity against the cost of idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
Pipeline: Uncompressed content → Gzip → Minify JavaScript → Minify CSS → Minify images → Compressed content
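The CPU-for-bytes trade in point 1 is easy to see with the standard library (Python's gzip module here, just to illustrate the ratio; a web role would enable IIS compression rather than call gzip by hand):

```python
import gzip

# Repetitive markup, typical of generated HTML, compresses dramatically.
html = b"<html><body>" + b"<div class='row'>cell</div>" * 500 + b"</body></html>"
compressed = gzip.compress(html)

assert len(compressed) < len(html) // 10     # order-of-magnitude smaller on the wire
assert gzip.decompress(compressed) == html   # lossless round trip
```

Fewer bytes stored and fewer bytes transferred both show up directly on the bill, which is why compression pays twice.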
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700 to 1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
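The query-segmentation approach above is a plain split/join: partition the input sequences, fan the partitions out as independent tasks, then concatenate per-partition results. A sketch with stand-in functions (the 100-sequences-per-partition default anticipates the micro-benchmark finding later in the deck; `run_partition` is a placeholder for invoking NCBI-BLAST):

```python
def split_queries(sequences, partition_size=100):
    """Split input sequences into fixed-size partitions (the 'map' side)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def run_partition(partition):
    """Stand-in for a worker role running NCBI-BLAST on one partition."""
    return [f"hit:{seq}" for seq in partition]

def merge(per_partition_results):
    """Join step: concatenate results; no cross-partition reduction needed."""
    return [hit for part in per_partition_results for hit in part]

seqs = [f"seq{i}" for i in range(250)]
parts = split_queries(seqs)
assert len(parts) == 3 and len(parts[0]) == 100
assert len(merge(run_partition(p) for p in parts)) == 250
```

Database segmentation (the mpiBLAST approach) is harder precisely because its join step is not a simple concatenation: hits from database shards must be re-ranked together.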
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow
A simple split/join pattern.
Leverage the multiple cores of one instance:
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for the small, medium, large and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: test runs to profile, then set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
[Diagram: a splitting task fans out into many parallel BLAST tasks, which feed a merging task]
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size and instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST Architecture
[Diagram: a Web Role (web portal and web service) performs job registration; a Job Management Role (job scheduler plus scaling engine) dispatches tasks to worker roles through a global dispatch queue; Azure Tables hold the job registry and NCBI database metadata; Azure Blob storage holds the BLAST databases, temporary data, etc.; a database-updating role refreshes the NCBI databases]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID.
The accepted job is stored into the job registry table:
• Fault tolerance: avoid in-memory state
[Diagram: the job portal (web portal and web service) registers jobs; the job scheduler and scaling engine persist them in the job registry]

Demonstration
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
[Diagram: instance counts per deployment: 50, 62, 62, 62, 62, 62, 50, 62]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, the real working instance time should be 6-8 days
  • Look into the log data to analyze what took place
Understanding Azure by Analyzing Logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups; this is an update domain at work:
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
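The Penman-Monteith formula is a straightforward pointwise computation once the inputs are assembled; the data reduction, not the arithmetic, is the hard part. A sketch, with purely illustrative input values (units follow the slide's definitions; λv is given here in J/kg rather than J/g):

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lam_v=2.45e6):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv).
    gamma ~ 66 Pa/K (psychrometric constant); lam_v ~ 2.45e6 J/kg
    (latent heat of vaporization). All inputs per the slide's symbols."""
    numerator = delta * Rn + rho_a * cp * dq * ga
    denominator = (delta + gamma * (1.0 + ga / gs)) * lam_v
    return numerator / denominator

# Illustrative values only, not a validated catchment calculation.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2,
                     cp=1005.0, dq=1000.0, ga=0.02, gs=0.01)
assert et > 0.0
```

In MODISAzure this one-liner sits at the end of the pipeline: the map stages exist to produce per-pixel values of Rn, δq, ga and gs from imagery, sensor and field data.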
ET Synthesizes Imagery, Sensors, Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS service web role portal; work flows through the download, reprojection, reduction 1 and reduction 2 queues, pulling source imagery from download sites and source metadata, and producing scientific results for download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, the recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: Generic Workers (Worker Roles) pull from the <PipelineStage> task queue, read <input> data storage, and persist <PipelineStage>TaskStatus]
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request arrives at the Service Monitor, which persists ReprojectionJobStatus (each job-queue entity specifies a single reprojection job request), then parses and persists ReprojectionTaskStatus entries (each entity specifies a single reprojection task, i.e. a single tile) and dispatches them to the task queue. Generic Workers (Worker Roles) pull tasks, query the SwathGranuleMeta table for geo-metadata (e.g. boundaries) for each swath tile, query the ScanTimeList table for the list of satellite scan times that cover a target tile, read swath source data storage, and write reprojection data storage.]
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage figures (via the AzureMODIS service web role portal):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers ($50 upload, $450 storage)
• Reprojection stage: 400 GB, 45K files, 3,500 hours, 20-100 workers ($420 CPU, $60 download)
• Derivation reduction stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers ($216 CPU, $1 download, $6 storage)
• Analysis reduction stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers ($216 CPU, $2 download, $9 storage)
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Key Components: Fabric Controller
• Think of it as an automated IT department
• A "cloud layer" on top of:
  • Windows Server 2008
  • A custom version of Hyper-V called the Windows Azure Hypervisor
• Allows for automated management of virtual machines
• Its job is to provision, deploy, monitor, and maintain applications in data centers
• Applications have a "shape" and a "configuration"
• The configuration definition describes the shape of a service:
  • Role types
  • Role VM sizes
  • External and internal endpoints
  • Local storage
• The configuration settings configure a service:
  • Instance count
  • Storage keys
  • Application-specific settings
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware):
  • Power-on automation devices
  • Routers, switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions:
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's the perfect IT employee:
  • Never sleeps
  • Never asks for a raise
  • Always does what you tell it to do in the configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services:
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application, so it's easier to scale them independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using:
    • The base OS
    • The differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find hardware homes
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action – perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap…
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g. videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – gets the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private
  • The default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks, in any order, into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
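The stage-then-commit flow above can be sketched as a small in-memory model. This is illustrative only (the real service is driven through the Put Block / Put Block List REST operations); the `BlockBlob` class and `block_id` helper are hypothetical names for this sketch.

```python
import base64

class BlockBlob:
    """In-memory sketch of a block blob: stage blocks, then commit a block list."""
    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; GC'd if never committed
        self.committed = []     # ordered (block_id, bytes) pairs forming the blob

    def put_block(self, block_id, data):
        # Blocks can be uploaded in any order, even in parallel.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Committing an ordered list of IDs defines the final blob content.
        self.committed = [(bid, self.uncommitted[bid]) for bid in block_ids]
        self.uncommitted.clear()

    def content(self):
        return b"".join(data for _, data in self.committed)

def block_id(n):
    # Block IDs are base64-encoded; within a blob they must be equal length.
    return base64.b64encode(b"block-%06d" % n).decode()

blob = BlockBlob()
chunks = [b"AAAA", b"BBBB", b"CCCC"]
for i, chunk in enumerate(chunks):
    blob.put_block(block_id(i), chunk)
blob.put_block_list([block_id(i) for i in range(len(chunks))])
assert blob.content() == b"AAAABBBBCCCC"
```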
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a one-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
• The drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
  Table Name: Movies
    Entities: Star Wars, Star Trek, Fan Boys
  Table Name: Customers
    Entities: Brian H. Prince, Jason Argonaut, Bill Gates
Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
"Server Busy":
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• It may mean single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Servers 1, 2, and 3 each hold a copy of partitions P1, P2, …, Pn)
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A – Table = Movies [Min – Max]:

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After the partition range splits across servers:

Server A – Table = Movies [Min – Comedy):

PartitionKey (Category) | RowKey (Title)       | Timestamp | ReleaseDate
Action                  | Fast & Furious       | …         | 2009
Action                  | The Bourne Ultimatum | …         | 2007
…                       | …                    | …         | …
Animation               | Open Season 2        | …         | 2009
Animation               | The Ant Bully        | …         | 2006

Server B – Table = Movies [Comedy – Max]:

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
Continuation tokens appear on:
• A maximum of 1000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds to execute the query
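The client-side pattern is a simple loop: keep issuing the query, passing back the returned token, until no token comes back. A minimal sketch, with `fake_query` standing in for a REST table query (the name and paging shape are illustrative, not the real API):

```python
def fake_query(rows, token=None, page_size=1000):
    """Stand-in for a table query: returns at most page_size rows plus a token."""
    start = token or 0
    page = rows[start:start + page_size]
    next_token = start + page_size if start + page_size < len(rows) else None
    return page, next_token

def query_all(rows):
    """Loop until the service stops returning a continuation token."""
    results, token = [], None
    while True:
        page, token = fake_query(rows, token)
        results.extend(page)
        if token is None:       # tokens can also appear at partition-range
            break               # boundaries or the 5-second execution limit
    return results

data = list(range(2500))
assert query_all(data) == data  # 3 round trips: 1000 + 1000 + 500 rows
```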
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns
  • Distribute by using a hash, etc., as a prefix
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy"
  • The system load balances partitions to meet traffic needs
  • Load on a single partition may have exceeded the limits
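The "distribute by using a hash as a prefix" advice can be sketched as follows: prefixing an append-only natural key (e.g. a timestamp) with a short hash bucket spreads writes across partitions instead of hammering the last one. The bucket count and key format here are illustrative choices, not a prescribed scheme:

```python
import hashlib

NUM_BUCKETS = 16  # illustrative; pick based on expected write throughput

def partition_key(natural_key):
    """Prefix an append-only key with a stable hash bucket to spread load."""
    digest = hashlib.md5(natural_key.encode()).hexdigest()
    bucket = int(digest, 16) % NUM_BUCKETS
    return "%02d_%s" % (bucket, natural_key)

# Sixty consecutive timestamps land in many different partitions,
# instead of all appending to the same "latest" partition.
keys = [partition_key("2010-12-07T10:%02d" % m) for m in range(60)]
buckets = {k.split("_", 1)[0] for k in keys}
assert len(buckets) > 1
```

A range query over a time window then has to fan out across the buckets (one query per bucket), which is the usual trade-off of this pattern.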
WCF Data Services
• Use a new context for each logical operation
  • AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
A Web Role calls PutMessage to add messages (Msg 1 … Msg 4) to the queue. A Worker Role calls GetMessage with a visibility timeout to dequeue a message; while the timeout runs, the message is invisible to other consumers. After processing, the Worker Role calls RemoveMessage to delete it for good.
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x (up to a cap); a successful poll resets the interval back to 1.
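The back-off rule above can be sketched in a few lines. The interval values are illustrative; only the doubling-and-reset shape comes from the slide:

```python
MIN_INTERVAL = 1.0   # seconds between polls after a hit (illustrative)
MAX_INTERVAL = 60.0  # truncation cap on the back-off (illustrative)

def next_interval(current, got_message):
    """Double the poll interval on an empty poll; reset on a successful one."""
    if got_message:
        return MIN_INTERVAL
    return min(current * 2, MAX_INTERVAL)

# Simulate three empty polls, a hit, then another empty poll.
interval, history = MIN_INTERVAL, []
for got in [False, False, False, True, False]:
    interval = next_interval(interval, got)
    history.append(interval)
assert history == [2.0, 4.0, 8.0, 1.0, 2.0]
```

In a real worker role the loop would sleep for `interval` seconds between GetMessage calls, trading queue-transaction cost against dispatch latency.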
Removing Poison Messages
Producers (P1, P2) put messages on queue Q; consumers (C1, C2) process them:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1) – msg 1 is removed as a poison message
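The sequence above can be sketched as a small simulation: a message whose consumer keeps crashing reappears after each visibility timeout, and is deleted once its dequeue count exceeds a threshold. The queue model here is a plain list; names and the threshold are illustrative:

```python
MAX_DEQUEUE_COUNT = 2  # threshold from the scenario above (illustrative)

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def process(queue, handler, poison):
    """Drain the queue, quarantining messages that repeatedly fail."""
    while queue:
        msg = queue.pop(0)
        msg.dequeue_count += 1
        if msg.dequeue_count > MAX_DEQUEUE_COUNT:
            poison.append(msg)     # delete / quarantine the poison message
            continue
        try:
            handler(msg)           # success would be followed by DeleteMessage
        except Exception:
            queue.append(msg)      # visibility timeout expires: msg reappears

def crashy_handler(msg):
    raise RuntimeError("consumer crashed on " + msg.body)

q, poison = [Message("msg1")], []
process(q, crashy_handler, poison)
assert len(poison) == 1 and poison[0].dequeue_count == 3
```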
Queues Recap
• No need to deal with failures – make message processing idempotent
• Invisible messages result in out-of-order delivery – do not rely on order
• Enforce a threshold on a message's dequeue count – use the dequeue count to remove poison messages
• Messages > 8 KB – use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Dynamically increase/reduce workers – use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
• http://blogs.msdn.com/windowsazurestorage
• http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• The fundamental choice – fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It's pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• A common mistake – splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
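As a minimal sketch of the idea (fan work out over a pool sized to the VM's core count, analogous to using the Task Parallel Library in a worker role), here is a thread-pool version; `handle` is a stand-in for per-item work:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def handle(item):
    """Stand-in for per-item work (an I/O call, an alignment task, ...)."""
    return item * item

items = list(range(100))
# Size the pool to the instance's core count so the whole VM is used.
workers = os.cpu_count() or 4
with ThreadPoolExecutor(max_workers=workers) as pool:
    results = list(pool.map(handle, items))

assert results == [i * i for i in items]
```

For CPU-bound Python work a process pool (`ProcessPoolExecutor`) would be the closer analogue; the thread pool shown here matches the I/O-heavy case the slide describes.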
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• It is a trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
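To make point 1 concrete, here is a small sketch of gzipping a text payload before it goes over the wire; the sample HTML is made up, but the savings on repetitive markup are typical:

```python
import gzip

# A repetitive text payload, typical of generated HTML.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)

# Text-heavy content usually shrinks dramatically, and the round
# trip is lossless, so the browser reconstructs the exact bytes.
assert len(compressed) < len(html) // 10
assert gzip.decompress(compressed) == html
```

In a web role this would be done by enabling dynamic compression in IIS rather than by hand, but the bandwidth arithmetic is the same.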
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model:
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out to many BLAST tasks running in parallel, followed by a merging task.
• Leverage the multi-core capability of one instance
  • Argument "-a" of NCBI-BLAST
  • 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
• Task granularity:
  • Large partitions → load imbalance
  • Small partitions → unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
  • Best practice: use test runs to profile, and set the size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task:
  • Essentially an estimate of the task run time
  • Too small → repeated computation
  • Too large → an unnecessarily long waiting period in case of instance failure
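The split/join pattern above can be sketched as follows. `blast_task` is a stand-in for running NCBI-BLAST over one partition (in a real run, each partition would be one queued task picked up by a worker role); the partition size follows the micro-benchmark finding quoted below:

```python
PARTITION_SIZE = 100  # sequences per task, per the micro-benchmarks

def split(sequences, size=PARTITION_SIZE):
    """Splitting task: cut the input into fixed-size partitions."""
    return [sequences[i:i + size] for i in range(0, len(sequences), size)]

def blast_task(partition):
    """Stand-in for one BLAST task over a single partition."""
    return ["hit:" + seq for seq in partition]

def merge(partial_results):
    """Merging task: concatenate results from all partitions."""
    return [hit for part in partial_results for hit in part]

seqs = ["seq%04d" % i for i in range(250)]
partitions = split(seqs)
assert len(partitions) == 3            # 100 + 100 + 50 sequences
hits = merge(blast_task(p) for p in partitions)
assert len(hits) == len(seqs)
```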
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm-cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger-size worker instances
  • Primarily due to the memory capability
• Task size / instance size vs. cost
  • The extra-large instance generated the best and most economical throughput
  • Fully utilizes the resource
AzureBLAST
Architecture: a Web Role (web portal, web service, job registration) and a Job Management Role (job scheduler, scaling engine) coordinate Worker Roles through a global dispatch queue. Azure Tables hold the job registry; Azure Blobs hold the NCBI databases, BLAST databases, temporary data, etc.; a Database Updating Role keeps the reference databases current. Each job follows the task flow: splitting task → BLAST tasks (in parallel) → merging task.
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state
(The job portal sits alongside the web service and job registration in the Web Role; the job scheduler and scaling engine consume jobs from the job registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB in size)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling, running on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
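The analysis above amounts to pairing each "Executing the task N" record with its "is done" record; unpaired tasks flag a failure or a node that went away mid-task. A minimal sketch (the log format matches the records shown; the function name is ours):

```python
import re

def unfinished_tasks(log_lines):
    """Return IDs of tasks that started but never logged completion."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
assert unfinished_tasks(log) == {"251774"}
```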
Surviving System Upgrades
North Europe Data Center: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

Penman-Monteith (1964)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration, or evaporation through plant membranes, by plants.
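The Penman-Monteith form can be written directly as a function. The input values below are made up for illustration only (they are plausible magnitudes, not field data); units follow the definitions above:

```python
GAMMA = 66.0  # psychrometric constant (Pa/K), per the definitions above

def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs, lambda_v, gamma=GAMMA):
    """ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)"""
    numerator = delta * Rn + rho_a * cp * dq * ga
    denominator = (delta + gamma * (1.0 + ga / gs)) * lambda_v
    return numerator / denominator

# Illustrative inputs only: Δ=145 Pa/K, Rn=400 W/m², ρa=1.2 kg/m³,
# cp=1005 J/(kg·K), δq=1000 Pa, ga=0.02 m/s, gs=0.01 m/s, λv=2.45e6 J/kg.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, cp=1005.0,
                     dq=1000.0, ga=0.02, gs=0.01, lambda_v=2.45e6)
assert et > 0.0
```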
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
1. Data collection (map) stage
   • Downloads requested input tiles from NASA FTP sites
   • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
   • Converts source tile(s) to intermediate-result sinusoidal tiles
   • Simple nearest-neighbor or spline algorithms
3. Derivation reduction stage
   • First stage visible to the scientist
   • Computes ET in our initial use
4. Analysis reduction stage
   • Optional second stage visible to the scientist
   • Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues request to appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
… <PipelineStage> JobStatus Persist
<PipelineStage> Job Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
…
Dispatch <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage> TaskStatus
GenericWorker (Worker Role)
…
…
Dispatch <PipelineStage> Task Queue
…
<Input> Data Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
Reprojection JobStatus Persist
Parse & Persist Reprojection TaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e. a single tile)
Query this table to get geo-metadata (e.g. boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and need to run reduction multiple times
• Storage costs driven by data scale and 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 cpu, $60 download
$216 cpu, $1 download, $6 storage
$216 cpu, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds as amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press; Programming Windows Azure, O'Reilly Press; Bing: Channel 9 Windows Azure; Bing: Windows Azure Platform Training Kit – November Update; http://research.microsoft.com/azure; xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Key Components: Fabric Controller
• Manages "nodes" and "edges" in the "fabric" (the hardware)
  • Power-on automation devices
  • Routers / switches
  • Hardware load balancers
  • Physical servers
  • Virtual servers
• State transitions
  • Current state
  • Goal state
  • Does what is needed to reach and maintain the goal state
• It's a perfect IT employee
  • Never sleeps
  • Doesn't ever ask for a raise
  • Always does what you tell it to do in configuration definition and settings
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging
Scalable Fault Tolerant Applications
Queues are the application glue
• Decouple parts of application, easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using
    • Base OS
    • Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy and manage that diagram for you
  • Find hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action. Perhaps even relocate your app
  • At all times the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If role fails, restart the role based on policy
   2. If node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob: inserts a new blob, overwrites the existing blob
  • GetBlob: get whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (default; will require the account key to access)
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
• You can upload a file in 'blocks'
  • Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
[Diagram: Big.mpg split into blocks 1 6 8 3 5 4 7 2, committed back into Big.mpg]
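The block-blob pattern above (split into identified blocks, commit an ordered block list) can be sketched in a few lines. This is an in-memory stand-in to show the mechanics, not the Azure Storage API; the block size and ID format are invented for the example:

```python
# Illustrative sketch of the block-blob pattern: split a payload into blocks,
# give each a base64 block ID, then "commit" an ordered block list as the blob.
import base64

BLOCK_SIZE = 4  # tiny for illustration; real block blobs use much larger blocks


def split_into_blocks(data: bytes, size: int = BLOCK_SIZE):
    blocks = {}   # block_id -> bytes (blocks may be uploaded in any order)
    order = []    # the order we will commit them in
    for i in range(0, len(data), size):
        block_id = base64.b64encode(f"block-{i:08d}".encode()).decode()
        blocks[block_id] = data[i:i + size]
        order.append(block_id)
    return blocks, order


def commit_block_list(blocks, block_list):
    # The committed block list, not upload order, defines the final blob.
    return b"".join(blocks[bid] for bid in block_list)


blocks, order = split_into_blocks(b"Big.mpg contents here")
blob = commit_block_list(blocks, order)
```

Committing a different block list yields a different blob, which is how inserts, updates and removals of blocks work.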
Pages
• Similar to block blobs
• Optimized for random read/write operations, and provide the ability to write to a range of bytes in a blob
• Call Put Blob to set max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount Page Blob as X:
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues)
Every data object has a partition key
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality
Partition key is unit of scale
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for partition to be available on a different server
System load balances
• Use exponential backoff on "Server Busy"
  • Our system load balances to meet your traffic needs
  • Single partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with same PartitionKey value served from same partition
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer – John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer – Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00
Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition
Messages – Queue Name
• All messages for a single queue belong to the same partition
Container Name Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg
Queue Message
jobs Message1
jobs Message2
workflow Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
Server 1 | Server 2 | Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When limit is hit, app will see '503 Server Busy'; applications should implement exponential backoff
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
hellip hellip hellip hellip
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
Partitions and Partition Ranges
Server BTable = Movies[Comedy - Max]
Server ATable = Movies[Min - Comedy)
Server ATable = Movies
[Min - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may return a continuation token when:
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
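The loop for handling continuation tokens can be sketched generically: keep reissuing the query, passing the token back, until no token is returned. `paged_query` below is an invented stand-in for a table query that returns at most one page of rows, not a real storage call:

```python
# Sketch of the continuation-token loop. paged_query simulates a service that
# returns at most page_size rows plus a token for the remainder.

def paged_query(rows, token=None, page_size=1000):
    start = token or 0
    page = rows[start:start + page_size]
    next_token = start + page_size if start + page_size < len(rows) else None
    return page, next_token


def query_all(rows, page_size=1000):
    results, token = [], None
    while True:
        page, token = paged_query(rows, token, page_size)
        results.extend(page)
        if token is None:   # no continuation token: the result set is complete
            break
    return results
```

Stopping after the first page is the classic bug this slide warns about: a response with fewer than 1000 rows can still carry a token.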
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns → distribute by using a hash etc. as prefix
Always handle continuation tokens → expect continuation tokens for range queries
"OR" predicates are not optimized → execute the queries that form the "OR" predicates as separate queries
Implement back-off strategy for retries → server busy: load balance partitions to meet traffic needs; load on single partition has exceeded the limits
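The "distribute by using a hash as prefix" tip can be sketched as follows. The bucket count and key format are invented for the illustration; the point is only that an append-only key (like a timestamp) gets a stable, spread-out prefix so writes hit many partitions:

```python
# Sketch: prepend a short, stable hash bucket to an append-only key so writes
# spread across partitions instead of hammering the last one.
import hashlib

NUM_BUCKETS = 16  # illustrative; tune to your partition count


def partition_key_for(row_key: str) -> str:
    digest = hashlib.md5(row_key.encode()).hexdigest()
    bucket = int(digest, 16) % NUM_BUCKETS
    return f"{bucket:02d}-{row_key}"
```

Range queries now need one query per bucket (fanned out in parallel), which is the trade-off for avoiding the hot tail partition.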
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if entity is already being tracked
• Point query throws an exception if resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
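The Get/Delete lifecycle above can be modeled in a few lines: GetMessage hides a message for a visibility timeout instead of deleting it, and only an explicit DeleteMessage with the pop receipt removes it. This is a toy in-memory model to show the semantics, not the Azure queue API:

```python
# Minimal model of the queue message lifecycle: visibility timeout on get,
# explicit delete with pop receipt. Time is passed in explicitly for clarity.
import time
import uuid


class ToyQueue:
    def __init__(self):
        self._messages = []

    def put_message(self, body):
        self._messages.append({"id": str(uuid.uuid4()), "body": body,
                               "visible_at": 0.0, "pop_receipt": None})

    def get_message(self, timeout=30.0, now=None):
        now = time.time() if now is None else now
        for msg in self._messages:
            if msg["visible_at"] <= now:
                msg["visible_at"] = now + timeout      # invisible until timeout
                msg["pop_receipt"] = str(uuid.uuid4())  # fresh receipt per get
                return msg
        return None

    def delete_message(self, msg_id, pop_receipt):
        self._messages = [m for m in self._messages
                          if not (m["id"] == msg_id and m["pop_receipt"] == pop_receipt)]
```

If the worker crashes before deleting, the message simply reappears after the timeout, which is exactly the at-least-once guarantee the slides describe.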
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll resets the interval back to 1.
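The truncated back-off rule is tiny when written out; a minimal sketch (the 1 s minimum and 60 s cap are illustrative choices):

```python
# Truncated exponential back-off polling: double the interval on each empty
# poll, cap it, and reset to the minimum after a successful poll.

MIN_INTERVAL = 1.0   # seconds; reset here on success
MAX_INTERVAL = 60.0  # truncation cap


def next_interval(current, got_message):
    if got_message:
        return MIN_INTERVAL
    return min(current * 2.0, MAX_INTERVAL)
```

A worker would sleep `next_interval(...)` seconds between GetMessage calls, so an idle queue costs few transactions while a busy one is polled every second.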
Removing Poison Messages
Producers (P1, P2) and consumers (C1, C2) on queue Q:
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1
12
11
12
62
C1
C2
Removing Poison Messages
1. Dequeue(Q, 30 s) → msg 1
2. Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
Queues Recap
• No need to deal with failures → make message processing idempotent
• Invisible messages result in out-of-order delivery → do not rely on order
• Enforce threshold on message's dequeue count → use dequeue count to remove poison messages
• Messages > 8 KB → use blob to store message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Dynamically increase/reduce workers → use message count to scale
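The dequeue-count guard from the walkthrough can be sketched as a tiny dispatch function; the threshold matches the slide's "DequeueCount > 2", while the dead-letter list stands in for wherever you park poison messages:

```python
# Sketch of poison-message handling: if a message keeps reappearing, assume it
# is poison, move it aside for offline inspection, and stop retrying it.

MAX_DEQUEUE_COUNT = 2  # threshold from the slide ("DequeueCount > 2")


def handle(message, process, dead_letter):
    if message["dequeue_count"] > MAX_DEQUEUE_COUNT:
        dead_letter.append(message)   # park it instead of crashing workers forever
        return "dead-lettered"
    process(message)                  # must be idempotent: may run more than once
    return "processed"
```

Combined with idempotent processing, this caps the damage a single bad message can do to a worker pool.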
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – split up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if number of active processes exceeds number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
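The slide's data-parallel vs. task-parallel distinction (made there with the .NET 4 TPL) can be sketched with a stdlib thread pool in Python; the worker function and task lambdas are invented placeholders:

```python
# Data parallelism (same function over many items) next to task parallelism
# (different independent tasks) using one shared thread pool.
from concurrent.futures import ThreadPoolExecutor


def process_tile(tile_id):
    # Stand-in for per-item work (e.g. reprojecting one tile).
    return tile_id * tile_id


with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: map one function across a collection.
    squares = list(pool.map(process_tile, range(8)))

    # Task parallelism: unrelated tasks submitted side by side.
    download = pool.submit(lambda: "downloaded")
    reproject = pool.submit(lambda: "reprojected")
    results = (download.result(), reproject.result())
```

Either way, the goal is the one the slide states: keep the cores of the VM you are already paying for busy.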
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between risk of failure / poor user experience due to not having excess capacity, and the costs of having idling VMs
Performance vs. Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
[Chart: uncompressed vs. compressed content – Gzip, minified JavaScript, minified CSS, minified images]
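The compute-for-storage trade-off in point 2 is easy to see with the stdlib; the sample page content here is invented for illustration:

```python
# Spend CPU once on compression to shrink the payload that storage and
# bandwidth are billed on.
import gzip

page = b"<html><body>" + b"repetitive content " * 200 + b"</body></html>"
compressed = gzip.compress(page)
ratio = len(compressed) / len(page)   # fraction of original size
restored = gzip.decompress(compressed)
```

Text with any repetition (HTML, JSON, logs) typically shrinks dramatically, which is why gzipping output content pays off on both the storage and bandwidth lines of the bill.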
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 ~ 1000 CPU hours
• Sequence databases growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result reduction processing
Large volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
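The query-segmentation split/join pattern can be sketched generically. The partition size echoes the micro-benchmark finding quoted later (~100 sequences per partition), and `blast_partition` is an invented stand-in for running NCBI-BLAST on one partition via a queued worker task:

```python
# Query-segmentation sketch: partition the input sequences, fan the partitions
# out (here just a loop; in the real system, queued worker tasks), then merge
# the per-partition hit lists.

PARTITION_SIZE = 100  # sequences per partition, per the micro-benchmarks


def split(sequences, size=PARTITION_SIZE):
    return [sequences[i:i + size] for i in range(0, len(sequences), size)]


def blast_partition(partition):
    # Stand-in for running NCBI-BLAST over one partition of sequences.
    return [f"hit:{seq}" for seq in partition]


def merge(results_per_partition):
    merged = []
    for hits in results_per_partition:
        merged.extend(hits)
    return merged


partitions = split([f"seq{i}" for i in range(250)])
all_hits = merge(blast_partition(p) for p in partitions)
```

Choosing the partition size is the real design knob: too large and one slow partition dominates (load imbalance), too small and per-task overhead swamps the useful work.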
AzureBLAST Task-Flow
A simple Split/Join pattern
Leverage multi-core of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data transferring overhead
Best practice: test runs to profile, and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long period of waiting in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• against all NCBI non-redundant proteins: completed in 30 min
• against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• the database is also the input query
• the protein database is large (4.2 GB in size)
• 9,865,668 sequences to be queried in total
• theoretically 100 billion sequence comparisons
Performance estimation
• based on sample runs on one extra-large Azure instance
• would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
• each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• each segment is submitted to one deployment as one job for execution
• each segment consists of smaller partitions
• When load imbalance occurs, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• but based on our estimates, the real working instance time should be 6-8 days
• look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain was at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)

Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• downloads requested input tiles from NASA FTP sites
• includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• converts source tile(s) to intermediate-result sinusoidal tiles
• simple nearest-neighbor or spline algorithms
Derivation reduction stage
• first stage visible to the scientist
• computes ET in our initial use
Analysis reduction stage
• optional second stage visible to the scientist
• enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS Service web-role portal; a request queue feeds a download queue (data collection stage, pulling from the source imagery download sites), a reprojection queue (reprojection stage), and reduction 1 and reduction 2 queues (derivation and analysis reduction stages); source metadata is tracked throughout, and scientific results are available for download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• receives all user requests
• queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
• parses all job requests into tasks, recoverable units of work
• execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; Generic Worker roles consume the queue and read from <Input> data storage]
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request flows into the job queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, where each entity specifies a single reprojection job request, parses and persists ReprojectionTaskStatus, where each entity specifies a single reprojection task (i.e., a single tile), and dispatches to the task queue consumed by Generic Worker roles; workers query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile, and the ScanTimeList table to get the list of satellite scan times that cover a target tile, reading from swath source data storage and writing to reprojection data storage]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
[Diagram: the four-stage pipeline annotated with scale and cost figures]
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Creating a New Project
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end
• cloud web server
• web pages
• web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
• protected space on the local drive, considered volatile storage
• May communicate with outside services
• Azure Storage
• SQL Azure
• other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• decouple parts of the application so they are easier to scale independently
• resource allocation: different priority queues and backend servers
• mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
• you own the box
• How it works:
• download the "Guest OS" to Server 2008 Hyper-V
• customize the OS as you need to
• upload the differencing VHD
• Azure runs your VM role using
• the base OS
• the differencing VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you
• Find a hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double-click on the Role Name in the Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• an encrypted package of your code
• your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
• (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap…
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure.
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob: massive files, e.g., videos, logs
Drive: use standard file-system APIs
Tables: non-relational, but with few scale limits; use SQL Azure for relational data
Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
• PutBlob: inserts a new blob, overwrites the existing blob
• GetBlob: get the whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private: the default; requires the account key for access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob
• targeted at streaming workloads
• each blob consists of a sequence of blocks
• each block is identified by a Block ID
• size limit: 200 GB per blob
Page blob
• targeted at random read/write workloads
• each blob consists of an array of pages
• each page is identified by its offset from the start of the blob
• size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: Big.mpg uploaded as out-of-order blocks 1-8, then committed into the blob in order]
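A minimal sketch of the block-upload bookkeeping: splitting data into blocks, assigning equal-length Base64 block IDs, and building the Put Block List body that commits them in the chosen order. The helper names and the 4 MB block size are illustrative, not part of any SDK:

```python
import base64

def make_block_id(index):
    # Block IDs must be Base64-encoded; keeping them equal length within
    # a blob is required, so we zero-pad the index.
    return base64.b64encode(f"block-{index:06d}".encode()).decode()

def block_list_xml(block_ids):
    # Body of a Put Block List request: commits the named blocks, in order.
    inner = "".join(f"<Latest>{bid}</Latest>" for bid in block_ids)
    return f'<?xml version="1.0" encoding="utf-8"?><BlockList>{inner}</BlockList>'

def split_into_blocks(data, block_size=4 * 1024 * 1024):
    # Chop the payload into fixed-size blocks for parallel upload.
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]

blocks = split_into_blocks(b"x" * (9 * 1024 * 1024))   # 9 MB -> 3 blocks
ids = [make_block_id(i) for i in range(len(blocks))]
xml = block_list_xml(ids)
```

Each block would be sent with a Put Block request, and the XML above as the final Put Block List; until that commit, the blocks are invisible and subject to the one-week garbage collection mentioned above.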
Pages
• Similar to block blobs
• Optimized for random read/write operations, and provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• use existing NTFS APIs to access a durable drive
• durability and survival of data on application failover
• enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• example: mount a Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• the drive is made durable through standard Page Blob replication
• the drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
[Diagram: an Account (MovieData) contains Tables (Movies, holding Star Wars, Star Trek, and Fan Boys; Customers, holding Brian H. Prince, Jason Argonaut, and Bill Gates), and Tables contain Entities]
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• billions of entities (rows) and TBs of data
• can use thousands of servers as traffic grows
• Highly available and durable
• data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST, with any platform or language
Is not relational
Cannot:
• create foreign-key relationships between tables
• perform server-side joins between tables
• create custom indexes on the tables
• no server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
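The three mandatory properties can be checked with a few lines of code; `validate_entity` is a hypothetical helper, and the entity here is just a Python dict standing in for a table row:

```python
REQUIRED = {"PartitionKey", "RowKey", "Timestamp"}

def validate_entity(entity):
    """Check that a table entity carries the three mandatory properties."""
    missing = REQUIRED - entity.keys()
    if missing:
        raise ValueError(f"entity missing required properties: {sorted(missing)}")
    return True

movie = {
    "PartitionKey": "Action",              # groups entities into one partition
    "RowKey": "Fast & Furious",            # unique within the partition
    "Timestamp": "2009-04-03T00:00:00Z",   # maintained by the service
    "ReleaseDate": 2009,                   # extra properties can vary per entity
}
ok = validate_entity(movie)
```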
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• a partition can be served by a single server
• the system load balances partitions based on traffic pattern
• controls entity locality
The system load balances
• load balancing can take a few minutes to kick in
• it can take a couple of seconds for a partition to become available on a different server
Server busy
• use exponential backoff on "Server Busy"
• the system load balances to meet your traffic needs
• single-partition limits have been reached
Partition Keys In Each Abstraction
Entities: TableName + PartitionKey
• entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs: Container name + Blob name
• every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages: Queue Name
• all messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync
[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account
• capacity: up to 100 TB
• transactions: up to a few thousand requests per second
• bandwidth: up to a few hundred megabytes per second
Single queue/table partition
• up to 500 transactions per second
Single blob partition
• throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff
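A sketch of the truncated exponential backoff called for above; the retry count and delay bounds are illustrative choices, not service-mandated values:

```python
def backoff_delays(max_retries=6, base=0.5, cap=30.0):
    """Truncated exponential backoff schedule for '503 Server Busy' retries:
    double the delay on each attempt, but never exceed the cap."""
    delays = []
    for attempt in range(max_retries):
        delays.append(min(cap, base * (2 ** attempt)))
    return delays

schedule = backoff_delays()   # delays (seconds) to sleep between retries
```

In a real client you would sleep for `schedule[attempt]` after each 503 (often with random jitter added) and give up once the schedule is exhausted.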
Partitions and Partition Ranges
The Movies table is keyed by PartitionKey (Category) and RowKey (Title), with Timestamp and ReleaseDate properties:

PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
Action                  | Fast & Furious            | …         | 2009
Action                  | The Bourne Ultimatum      | …         | 2007
…                       | …                         | …         | …
Animation               | Open Season 2             | …         | 2009
Animation               | The Ant Bully             | …         | 2006
…                       | …                         | …         | …
Comedy                  | Office Space              | …         | 1999
…                       | …                         | …         | …
SciFi                   | X-Men Origins: Wolverine  | …         | 2009
…                       | …                         | …         | …
War                     | Defiance                  | …         | 2008

Initially Server A serves the whole table, Movies [Min - Max]; under load the range splits, e.g., Server A serves Movies [Min - Comedy) and Server B serves Movies [Comedy - Max].
Key Selection: Things to Consider
Scalability
• distribute load as much as possible
• hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed
• avoid frequent large scans
• parallelize queries
• point queries are most efficient
Entity group transactions
• transactions across a single partition
• transaction semantics, and reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may return a continuation token:
• maximum of 1000 rows in a response
• at the end of a partition range boundary
• maximum of 5 seconds to execute the query
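The safe pattern is to loop until no token comes back. A minimal sketch, with `fetch_page` standing in for whatever query call returns a (rows, continuation_token) pair:

```python
def query_all(fetch_page):
    """Drain a paged query; fetch_page takes the previous continuation
    token (None to start) and returns (rows, next_token)."""
    rows, token = [], None
    while True:
        page, token = fetch_page(token)
        rows.extend(page)
        if token is None:      # no token means the result set is complete
            break
    return rows

# Fake paged source standing in for a table query: three pages of results.
pages = {None: ([1, 2], "t1"), "t1": ([3], "t2"), "t2": ([4, 5], None)}
all_rows = query_all(lambda tok: pages[tok])
```

Code that stops after the first response silently drops everything past the first 1000 rows, the first partition boundary, or the 5-second limit.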
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• avoid "append only" patterns: distribute by using a hash etc. as a prefix
Always handle continuation tokens
• expect continuation tokens for range queries
"OR" predicates are not optimized
• execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "server busy" means either the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
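One way to apply the hash-prefix tip: derive the PartitionKey by prefixing the natural, append-only key (here a date) with a hash bucket, so writes spread across partitions instead of hammering the newest one. The bucket count and helper name are illustrative:

```python
import hashlib

def hashed_partition_key(natural_key, buckets=16):
    """Prefix an append-only key with a deterministic hash bucket so
    consecutive keys land in different partitions."""
    digest = hashlib.md5(natural_key.encode()).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}-{natural_key}"

# A month of daily keys now scatters across the bucket prefixes.
keys = {hashed_partition_key(f"2010-12-{d:02d}") for d in range(1, 31)}
```

Range queries then become one query per bucket, which is exactly the "parallelize queries" trade-off from the key-selection slide.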
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• tight coupling leads to brittleness
• this decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• messages must be serializable as XML
• limited to 8 KB in size
• commonly use the work ticket pattern
• Why not simply use a table?
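The work ticket pattern keeps the queue message small: the message carries only a pointer to a blob holding the real payload. A sketch with hypothetical names (the account, container, and blob shown are made up):

```python
import json
import uuid

def make_work_ticket(container, blob_name):
    """Build a small queue message referencing a blob payload."""
    ticket = {
        "ticket_id": str(uuid.uuid4()),   # lets the worker log/track the unit of work
        "blob_url": f"http://myaccount.blob.core.windows.net/{container}/{blob_name}",
    }
    body = json.dumps(ticket)
    # Queue messages are limited to 8 KB, which a pointer easily fits.
    assert len(body.encode()) <= 8 * 1024
    return body

ticket = json.loads(make_work_ticket("inputs", "partition-0001.fasta"))
```

The worker dequeues the ticket, fetches the blob, does the work, and only then deletes the message, so a crash before completion lets another worker retry.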
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add messages (Msg 1-4) to a queue; a Worker Role calls GetMessage (with a visibility timeout) to dequeue a message, processes it, and calls RemoveMessage to delete it]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• each empty poll increases the polling interval by 2x
• a successful poll sets the interval back to 1
[Diagram: consumers C1 and C2 polling a queue at intervals that double after each empty poll]
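The polling rule fits in a few lines; the floor and ceiling values here are illustrative:

```python
def next_poll_interval(current, got_message, floor=1, ceiling=60):
    """Empty poll: double the interval, up to a ceiling.
    Successful poll: reset the interval to the floor."""
    if got_message:
        return floor
    return min(ceiling, current * 2)

intervals, current = [], 1
for got in [False, False, False, True, False]:
    current = next_poll_interval(current, got)
    intervals.append(current)
```

Three empty polls stretch the interval to 8, one message resets it to 1, and the next empty poll doubles it again, so idle queues stop burning transactions while busy queues are drained promptly.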
Removing Poison Messages
[Diagram: producers P1 and P2 enqueue messages 1 and 2 (each with dequeue count 0); consumers C1 and C2 dequeue them: 1. C1: GetMessage(Q, 30 s) → msg 1; 2. C2: GetMessage(Q, 30 s) → msg 2; both messages become invisible for 30 seconds and their dequeue counts become 1]
[Diagram, continued: 3. C2 consumed msg 2; 4. DeleteMessage(Q, msg 2); 5. C1 crashed; 6. msg 1 becomes visible again 30 s after its dequeue; 7. C2: GetMessage(Q, 30 s) → msg 1, whose dequeue count is now 2]
[Diagram, continued: 8. C2 crashed; 9. msg 1 becomes visible again 30 s after its dequeue; 10. C1 restarted; 11. C1: Dequeue(Q, 30 s) → msg 1; 12. its DequeueCount > 2, so 13. Delete(Q, msg 1): the poison message is removed once its dequeue count exceeds the threshold]
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
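The dequeue-count threshold from the recap, sketched in a few lines; the threshold value and the dead-letter list are illustrative stand-ins for whatever store you route poison messages to:

```python
MAX_DEQUEUE = 3

def handle(message, dead_letter):
    """Route a message that keeps failing to a dead-letter store instead
    of retrying it forever; otherwise let the worker process it."""
    if message["dequeue_count"] > MAX_DEQUEUE:
        dead_letter.append(message)   # park it for offline inspection
        return "dead-lettered"
    return "process"

dead = []
fresh = handle({"id": 1, "dequeue_count": 1}, dead)
poison = handle({"id": 2, "dequeue_count": 4}, dead)
```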
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• may not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• in networking code, correct usage of NT I/O completion ports will let the kernel schedule the precise number of threads
• in .NET 4, use the Task Parallel Library
• data parallelism
• task parallelism
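The same data-parallel idea, sketched with Python's thread pool rather than the Task Parallel Library the slide names; `align` is a made-up stand-in for a per-item unit of work:

```python
from concurrent.futures import ThreadPoolExecutor

def align(sequence):
    # Stand-in for one unit of work (e.g. one BLAST query alignment).
    return sequence.upper()

sequences = ["acgt", "ttag", "gcat", "aacc"]

# Data parallelism: the same operation applied across the input set,
# with the pool keeping all workers (and thus the VM's cores) busy.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(align, sequences))
```

Task parallelism is the complementary shape: distinct operations (download, reproject, reduce) submitted to the pool as separate futures.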
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs
Performance vs. cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• e.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Chart: relative sizes of uncompressed vs. compressed content – gzip, minified JavaScript, minified CSS, minified images.)
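The payoff of point 1 is easy to demonstrate with the standard library. The sample payload below is invented, but repetitive markup of the kind web apps emit compresses dramatically:

```python
import gzip

# A hypothetical response body; HTML/JSON with repeated markup compresses well.
body = ("<div class='item'><span>result</span></div>" * 200).encode("utf-8")

compressed = gzip.compress(body)

print(len(body), len(compressed))           # compressed is a small fraction of the original
assert gzip.decompress(compressed) == body  # lossless: browsers decompress on the fly
```

Less data stored and less data on the wire, at the cost of a little CPU on a VM you are already paying for.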
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, aggregate storage traffic could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern.
• Leverage the multiple cores of one instance
  • The "-a" argument of NCBI-BLAST
  • 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partitions: load imbalance
  • Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
  • Best practice: use test runs to profile, and set the partition size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: unnecessarily long waiting period in case of instance failure
(Task flow: splitting task → BLAST tasks in parallel → merging task.)
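The split/join pattern above can be sketched generically. Here the BLAST step is replaced by a stand-in function, and partitions are processed in parallel; partition size is the granularity knob the slide discusses:

```python
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size):
    # Splitting task: break the input into fixed-size partitions.
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    # Stand-in for running NCBI-BLAST over one partition of query sequences.
    return [seq.upper() for seq in partition]

def merge(results):
    # Merging task: concatenate per-partition results in input order.
    return [hit for part in results for hit in part]

sequences = ["acgt", "ttga", "ccat", "gggc", "atat"]
partitions = split(sequences, 2)       # task granularity: 2 sequences per partition

with ThreadPoolExecutor() as pool:     # query partitions in parallel
    results = list(pool.map(blast_task, partitions))

merged = merge(results)
print(merged)
```

In AzureBLAST the pool is replaced by worker role instances pulling partitions from a queue, but the split/process/merge structure is the same.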
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm-cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger worker instances
  • Primarily due to the memory capability
• Task size/instance size vs. cost
  • Extra-large instances generated the best and most economical throughput
  • Fully utilize the resource
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal and the web service for job registration; a Job Management Role runs the job scheduler and scaling engine, tracking jobs in a job registry held in Azure Tables; worker instances pull tasks – splitting task → BLAST tasks in parallel → merging task – from a global dispatch queue; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a database-updating role refreshes the databases.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
• Submit jobs
• Track job status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state
(Diagram: job portal → job registration web service → job registry table → job scheduler and scaling engine.)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage); Sam Phattarasukol (Harwood Lab, UW)
• Blasted ~5000 proteins (700K sequences)
  • Against all NCBI non-redundant proteins: completed in 30 min
  • Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB in size)
• 9,865,668 sequences in total to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 cores
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
(Map: instance counts per deployment across the four datacenters.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups – this is an update domain
  • ~30 mins
  • ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
• 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
• Data collection (map) stage
  • Downloads requested input tiles from NASA FTP sites
  • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
• Reprojection (map) stage
  • Converts source tile(s) to intermediate-result sinusoidal tiles
  • Simple nearest-neighbor or spline algorithms
• Derivation reduction stage
  • First stage visible to scientists
  • Computes ET in our initial use
• Analysis reduction stage
  • Optional second stage visible to scientists
  • Enables production of science analysis artifacts such as maps, tables, and virtual sensors

(Pipeline diagram: scientists submit requests through the AzureMODIS Service web role portal; a request queue feeds a download queue for the data collection stage, which pulls from source imagery download sites; a reprojection queue feeds the reprojection stage; Reduction 1 and Reduction 2 queues feed the derivation and analysis reduction stages; source metadata is kept alongside, and scientific results are available for download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction job queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

(Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role.
• Generic Worker (Worker Role)
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; Generic Worker instances dequeue tasks and read/write <Input>Data storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the job queue; the Service Monitor persists ReprojectionJobStatus – each entity specifies a single reprojection job request – then parses and persists ReprojectionTaskStatus – each entity specifies a single reprojection task, i.e., a single tile – and dispatches to the task queue. Generic Workers query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile, and the ScanTimeList table for the list of satellite scan times that cover a target tile, then read swath source data storage and write reprojection data storage.)
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage workloads and costs (from the pipeline diagram, via the AzureMODIS Service web role portal):
• Data collection: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20–100 workers – $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files, 1800 hours, 20–100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20–100 workers – $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Windows Azure Compute
Key Components – Compute: Web Roles
Web front end:
• Cloud web server
• Web pages
• Web services
You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using Queues for Reliable Messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue:
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differencing VHD
  • Azure runs your VM role using:
    • Base OS
    • Differencing VHD
Application Hosting
'Grokking' the Service Model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action – perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the Cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes
  • (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
  1. Determine resource requirements
  2. Create role images
2. Allocate resources
3. Prepare nodes
  1. Place role images on nodes
  2. Configure settings
  3. Start roles
4. Configure load balancers
5. Maintain service health
  1. If a role fails, restart the role based on policy
  2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (default) – requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks; each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages; each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
(Diagram: Big.mpg split into blocks 1–8, uploaded out of order, then committed in sequence to form Big.mpg.)
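The stage-then-commit protocol can be mimicked in a few lines: stage blocks under IDs in any order, then commit an ordered block list to define the blob. An in-memory sketch of these semantics, not the real storage API:

```python
import base64

class BlockBlobSketch:
    """In-memory mimic of Put Block / Put Block List semantics."""

    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; GC'd if never committed
        self.blob = b""

    def put_block(self, block_id, data):
        # Blocks may be uploaded in any order, even in parallel.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Committing an ordered list of IDs defines the blob's content.
        self.blob = b"".join(self.uncommitted[bid] for bid in block_ids)
        self.uncommitted.clear()

ids = [base64.b64encode(bytes([i])).decode() for i in range(4)]
blob = BlockBlobSketch()
for i in [2, 0, 3, 1]:                  # upload out of order
    blob.put_block(ids[i], f"chunk{i}".encode())
blob.put_block_list(ids)                # commit in the intended order
assert blob.blob == b"chunk0chunk1chunk2chunk3"
```

Because the commit is a separate step, large uploads can be parallelized and retried per block without corrupting the blob.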
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size; then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table: Movies – entities: Star Wars, Star Trek, Fan Boys
• Table: Customers – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities; entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables: billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is Not Relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• Single-partition limits may have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

• Messages – Queue name: all messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges

Initially one server holds the whole table – Server A: Table = Movies [Min – Max]:

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

As traffic grows, the system splits the table across partition ranges.

Server A: Table = Movies [Min – Comedy):
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]:
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics and reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
A query returns a continuation token:
• At a maximum of 1000 rows in a response
• At the end of a partition range boundary
• At a maximum of 5 seconds to execute the query
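A robust client therefore loops until no token comes back. A sketch of the paging loop against a stand-in query function (the real client library surfaces this through query continuations):

```python
def query_page(rows, token=None, page_size=1000):
    """Stand-in for a table query: returns (page, continuation_token)."""
    start = token or 0
    page = rows[start:start + page_size]
    next_token = start + page_size if start + page_size < len(rows) else None
    return page, next_token

def query_all(rows):
    # Always loop on the continuation token; never assume one response.
    results, token = [], None
    while True:
        page, token = query_page(rows, token)
        results.extend(page)
        if token is None:
            return results

entities = [{"RowKey": str(i)} for i in range(2500)]
assert len(query_all(entities)) == 2500   # 3 pages: 1000 + 1000 + 500
```

Code that stops after the first page silently drops data, which is why the slide says "seriously."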
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select PartitionKey and RowKey that help scale
  • Distribute load by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries
  • Server busy: either partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
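The "hash as a prefix" tip counters append-only patterns such as monotonically increasing keys (timestamps), which pile all writes onto one partition. A sketch; the bucket count and key layout here are arbitrary choices:

```python
import hashlib

def bucketed_partition_key(natural_key, buckets=16):
    # Prefix the natural key with a stable hash bucket so that
    # monotonically increasing keys spread across many partitions.
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}_{natural_key}"

# Timestamps that would otherwise all land in one hot partition:
keys = [bucketed_partition_key(f"2010-12-07T10:00:{i:02d}") for i in range(100)]
prefixes = {k.split("_")[0] for k in keys}
print(sorted(prefixes))   # writes now spread across multiple partitions
```

The trade-off is that range queries over the natural key must now fan out across the buckets.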
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
(Diagram: a Web Role calls PutMessage to add messages (Msg 1–4) to the queue; Worker Roles call GetMessage with a timeout, which makes the message invisible to other consumers, and RemoveMessage to delete it once processed.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
(Diagram: consumers C1 and C2 polling the queue at growing intervals.)
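The polling rule is simple enough to state as a pure function; the cap value below is an arbitrary choice:

```python
def next_poll_interval(current, got_message, base=1.0, cap=60.0):
    """Truncated exponential back-off for queue polling."""
    if got_message:
        return base                   # success: reset to the base interval
    return min(current * 2.0, cap)    # empty poll: double, up to the cap

interval = 1.0
for _ in range(8):                    # a run of empty polls
    interval = next_poll_interval(interval, got_message=False)
print(interval)                       # truncated at the cap instead of 256.0
interval = next_poll_interval(interval, got_message=True)
print(interval)                       # back to 1.0 after a successful poll
```

This keeps transaction costs down on an idle queue (every GetMessage is billed) while staying responsive when work arrives.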
Removing Poison Messages
(Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue with a 30 s visibility timeout.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
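The dequeue-count threshold in the walk-through can be sketched as follows (in-memory stand-ins; in the real service the queue tracks the dequeue count on each GetMessage):

```python
MAX_DEQUEUE_COUNT = 3  # illustrative threshold

class Message:
    """In-memory stand-in for a queue message with its dequeue count."""
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def handle(msg, process, poison_store):
    """Consume one delivery of a message; once its dequeue count exceeds the
    threshold, park it as a poison message instead of processing it again."""
    msg.dequeue_count += 1              # the queue service tracks this per GetMessage
    if msg.dequeue_count > MAX_DEQUEUE_COUNT:
        poison_store.append(msg)        # set aside for offline inspection
        return "poisoned"
    process(msg.body)                   # may crash; message then reappears on the queue
    return "done"
```

Without the threshold, a message whose processing always crashes the consumer would circulate forever; with it, the message is diverted after a bounded number of redeliveries.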
Queues Recap
• Make message processing idempotent; no need to deal with failures
• Do not rely on order; invisible messages result in out-of-order delivery
• Enforce a threshold on a message's dequeue count; use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage-collect orphaned blobs
• Dynamically increase/reduce workers; use the message count to scale
Windows Azure Storage Takeaways

Data abstractions to build your applications:
Blobs – files and large objects
Drives – NTFS APIs for migrating applications
Tables – massively scalable structured storage
Queues – reliable delivery of messages

Easy to use via the Storage Client Library.

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
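The TPL's distinction between data and task parallelism can be sketched with a thread pool; this is a rough Python analogue using `concurrent.futures` rather than .NET, purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def data_parallel(func, items, workers=4):
    """Data parallelism: apply the same function to every item in parallel."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))

def task_parallel(tasks, workers=4):
    """Task parallelism: run different functions concurrently."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(t) for t in tasks]
        return [f.result() for f in futures]
```

In both cases the pool size plays the role the scheduler plays in the TPL: sized to the cores, it keeps the whole VM busy without oversubscribing it.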
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app's profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • The service choice can make a big cost difference, depending on your app's profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs

Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places.
Sending fewer things also means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
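Gzipping responses only when the client advertises support can be sketched in a few lines; the function name and header handling here are illustrative, not a specific framework's API:

```python
import gzip

def maybe_gzip(body: bytes, accept_encoding: str):
    """Gzip the response body when the client supports it, trading a little
    CPU for smaller bandwidth and storage bills; otherwise pass it through."""
    if "gzip" in accept_encoding.lower():
        return gzip.compress(body), {"Content-Encoding": "gzip"}
    return body, {}
```

For typical HTML, JavaScript, and CSS the compressed body is a small fraction of the original, which is exactly the bandwidth saving the slide is after.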
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing

It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing

Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
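Segmenting the input is simple because FASTA records are independent; a sketch of the split step (partition size and function name are illustrative; the merge step would concatenate the per-partition outputs):

```python
def split_fasta(text: str, seqs_per_partition: int):
    """Split a FASTA file into partitions of N sequences each, so partitions
    can be queried against the database in parallel and results merged later."""
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">") and current:
            records.append("\n".join(current))  # close the previous record
            current = []
        current.append(line)
    if current:
        records.append("\n".join(current))
    return ["\n".join(records[i:i + seqs_per_partition])
            for i in range(0, len(records), seqs_per_partition)]
```

Each partition becomes one task on the queue; because queries never touch each other, no coordination is needed until the merge.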
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow: a simple split/join pattern

Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST startup overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead

Value of the visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure

(Task flow: a splitting task fans out to parallel BLAST tasks, which feed a merging task.)
Micro-Benchmarks Inform Design

Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability

Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST Architecture

(Architecture diagram: a Web Role exposes the web portal, web service, and job registration; a Job Management Role runs the job scheduler and scaling engine; worker instances pull work from a global dispatch queue; a database updating role keeps the NCBI databases current. Azure Tables hold the job registry, and Azure Blobs hold the BLAST databases, temporary data, etc. Within each job, a splitting task fans out to parallel BLAST tasks whose outputs feed a merging task.)
AzureBLAST Job Portal

An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state

(Diagram: the portal's web service handles job registration; the job scheduler and scaling engine pick jobs up from the job registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it manually

(Diagram: per-deployment instance counts across the four datacenters.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place
Understanding Azure by analyzing logs

A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
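Spotting tasks that started but never finished can be automated by pairing the "Executing" and "is done" records; a sketch (the regexes follow the log format shown above, and the function name is illustrative):

```python
import re

EXEC_RE = re.compile(r"Executing the task (\d+)")
DONE_RE = re.compile(r"Execution of task (\d+) is done")

def incomplete_tasks(log_lines):
    """Return task IDs that were started but never logged as done,
    e.g. because the instance failed or was rebooted mid-task."""
    started, finished = set(), set()
    for line in log_lines:
        if m := EXEC_RE.search(line):
            started.add(m.group(1))
        elif m := DONE_RE.search(line):
            finished.add(m.group(1))
    return sorted(started - finished)
```

Run over the full job logs, this is how the lost tasks in the upgrade and storage-failure incidents below would surface.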
Surviving System Upgrades

North Europe Data Center: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups (~6 nodes per group, ~30 mins apart). This is an update domain at work.

Surviving Storage Failures

West Europe Data Center: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration or evaporation through plant membranes by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

• Lots of inputs; big data reduction
• Some of the inputs are not so simple
• Estimating resistance/conductivity across a catchment can be tricky
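The Penman-Monteith formula itself is a single expression once the inputs are in hand; a sketch (the default values for γ and λv are illustrative approximations, and the sample inputs in the usage below are invented, not from the MODIS data):

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET: (delta*Rn + rho_a*cp*dq*ga) / ((delta + gamma*(1 + ga/gs)) * lambda_v).
    Units follow the slide's variable definitions; gamma ~ 66 Pa/K and
    lambda_v ~ 2260 J/g are rough defaults for illustration."""
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

The hard part, as the slide notes, is not this arithmetic but producing defensible values of ga and gs across a whole catchment from imagery and sensor data.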
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
• 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Pipeline diagram: scientists submit requests through the AzureMODIS web role portal; the request queue feeds the data collection stage, which pulls source imagery from the download sites; dedicated reprojection, reduction 1, and reduction 2 queues drive the reprojection, derivation reduction, and analysis reduction stages; source metadata is tracked along the way, and science results are available for download.)

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Persists the execution status of all jobs and tasks in Tables

(Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches onto the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a GenericWorker (Worker Role):
• Dequeues tasks created by the Service Monitor from the <PipelineStage> Task Queue
• Retries failed tasks 3 times
• Maintains all task status

(Diagram: GenericWorker instances dequeue from the <PipelineStage> Task Queue and read from <Input> data storage.)
Example Pipeline Stage: Reprojection Service

(Diagram: a reprojection request is queued as a job; each Job Queue entity specifies a single reprojection job request. The Service Monitor (Worker Role) persists ReprojectionJobStatus, parses the job into tasks, persists ReprojectionTaskStatus, and dispatches onto the Task Queue, where each entity specifies a single reprojection task, i.e., a single tile. GenericWorker (Worker Role) instances query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile, query the ScanTimeList table for the list of satellite scan times that cover a target tile, and read swath source data from storage to produce the reprojection data.)
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage resources and costs (from the pipeline diagram):
• Data collection stage: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers. Costs: $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20–100 workers. Costs: $420 CPU, $60 download
• Derivation reduction stage: 5–7 GB, 55K files, 1800 hours, 20–100 workers. Costs: $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20–100 workers. Costs: $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and they have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Key Components – Compute: Web Roles

Web front end:
• Cloud web server
• Web pages
• Web services

You can create the following types:
• ASP.NET web roles
• ASP.NET MVC 2 web roles
• WCF service web roles
• Worker roles
• CGI-based web roles

Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
  • Protected space on the local drive, considered volatile storage
• May communicate with outside services
  • Azure Storage
  • SQL Azure
  • Other web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging

Scalable, Fault-Tolerant Applications

Queues are the application glue:
• Decouple parts of the application; easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download the "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
  • Azure runs your VM role using the base OS + the differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
  • At all times, the 'diagram' stays whole
Automated Service Management

Provide code + service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  • ServiceDefinition.csdef
  • ServiceConfiguration.cscfg
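A minimal sketch of what the two files contain (the service, role, and setting names here are invented for illustration; the definition file fixes the shape of the service, while the configuration file holds values that can change without redeploying):

```xml
<!-- ServiceDefinition.csdef: roles, endpoints, and the names of settings -->
<ServiceDefinition name="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="WebFrontEnd">
    <InputEndpoints>
      <InputEndpoint name="HttpIn" protocol="http" port="80" />
    </InputEndpoints>
    <ConfigurationSettings>
      <Setting name="DataConnectionString" />
    </ConfigurationSettings>
  </WebRole>
  <WorkerRole name="BackgroundWorker" />
</ServiceDefinition>

<!-- ServiceConfiguration.cscfg: instance counts and setting values -->
<ServiceConfiguration serviceName="MyService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceConfiguration">
  <Role name="WebFrontEnd">
    <Instances count="2" />
    <ConfigurationSettings>
      <Setting name="DataConnectionString" value="UseDevelopmentStorage=true" />
    </ConfigurationSettings>
  </Role>
  <Role name="BackgroundWorker">
    <Instances count="1" />
  </Role>
</ServiceConfiguration>
```

Bumping the `Instances count` in the .cscfg is how a deployed service is scaled out without touching the package.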
GUI
Double-click on the role name in the Azure project.

Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric

The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   • Determine resource requirements
   • Create role images
2. Allocate resources
3. Prepare nodes
   • Place role images on nodes
   • Configure settings
   • Start roles
4. Configure load balancers
5. Maintain service health
   • If a role fails, restart the role based on policy
   • If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage at Massive Scale
Blob – massive files, e.g., videos, logs
Drive – use standard file-system APIs
Tables – non-relational, but with few scale limits (use SQL Azure for relational data)
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites an existing blob
  • GetBlob – get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  e.g., http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs

Each container has an access level:
• Private – the default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood

Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a block ID
• Size limit: 200 GB per blob

Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• You can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

(Diagram: big.mpg is uploaded as 8 blocks, possibly out of order, then committed into one blob.)
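The split-then-commit flow can be sketched without any network calls; in the real REST API the per-block upload is Put Block and the commit is Put Block List, while here the names and the 4 MB default are illustrative:

```python
import base64

BLOCK_SIZE = 4 * 1024 * 1024  # e.g. 4 MB per block (illustrative)

def make_block_list(data: bytes, block_size: int = BLOCK_SIZE):
    """Split a payload into blocks, each with a base64 block ID as used by
    Put Block; the resulting ID list is what Put Block List would commit.
    IDs are zero-padded so every ID in the blob has the same length."""
    blocks = []
    for i in range(0, len(data), block_size):
        block_id = base64.b64encode(f"{i // block_size:08d}".encode()).decode()
        blocks.append((block_id, data[i:i + block_size]))
    return blocks

def commit(blocks):
    """Reassemble the blob in the committed order (any order is allowed)."""
    return b"".join(chunk for _, chunk in blocks)
```

Because blocks can be uploaded in parallel and committed in any order, this is also how large uploads are parallelized and resumed.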
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud

A Windows Azure Drive is a Page Blob:
• Example: mount a Page Blob as X:\
  http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure

An account holds tables, and a table holds entities. For example, account "MovieData" might contain a table "Movies" (entities: Star Wars, Star Trek, Fan Boys) and a table "Customers" (entities: Brian H. Prince, Jason Argonaut, Bill Gates).

Account → Table → Entity

Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational

Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning

Understanding partitioning is key to understanding performance.

Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
• Controls entity locality

The partition key is the unit of scale:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern

The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server

Server Busy:
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• A single partition's limits have been reached
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync

(Diagram: partitions P1…Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets

Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second

Single queue/table partition:
• Up to 500 transactions per second

Single blob partition:
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges

Initially a single server holds the entire table:

Server A: Table = Movies [Min – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

As traffic grows, the system splits the table across partition ranges on different servers:

Server A: Table = Movies [Min – Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider

Scalability
• PartitionKey is critical for scalability
• Distribute load as much as possible
• Hot partitions can be load balanced

Query Efficiency & Speed
• Point queries are most efficient
• Avoid frequent large scans
• Parallelize queries

Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
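Because any of the three conditions above can produce a continuation token, query code must always loop until no token is returned. A minimal sketch, where `query_page` is a hypothetical stand-in for one REST round trip against the Table service:

```python
def query_all_entities(query_page):
    """Drain a table query that may return continuation tokens.

    `query_page(token)` stands in for one service call: it returns
    (entities, next_token), where next_token is None when the result
    set is exhausted. The service may hand back a token for any of
    the reasons on the slide: the 1,000-row limit, a partition range
    boundary, or the 5-second execution limit.
    """
    results, token = [], None
    while True:
        entities, token = query_page(token)
        results.extend(entities)
        if token is None:   # no continuation token: we are done
            return results

# Toy paged source: three pages of fake entities keyed by token.
pages = {None: (["e1", "e2"], "t1"), "t1": (["e3"], "t2"), "t2": (["e4"], None)}
print(query_all_entities(lambda t: pages[t]))  # ['e1', 'e2', 'e3', 'e4']
```

The key point is that an empty page with a token is still not the end: only a missing token terminates the loop.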
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

• Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns: distribute by using a hash etc. as a prefix
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "server busy" means the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
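The work ticket pattern mentioned above keeps queue messages small: the message carries only a reference (a "ticket") to the real payload, which lives in blob storage, so the 8 KB message limit never bites. A minimal sketch with in-memory stand-ins for the blob container and the queue (the Azure calls would replace the dict and list operations):

```python
import json
import uuid

blob_store = {}   # stand-in for a blob container
queue = []        # stand-in for an Azure queue

def submit_job(payload: bytes) -> str:
    """Producer: store the payload in a blob, enqueue a small work ticket."""
    blob_name = str(uuid.uuid4())
    blob_store[blob_name] = payload
    ticket = json.dumps({"blob": blob_name})  # well under the 8 KB limit
    queue.append(ticket)
    return blob_name

def process_next() -> bytes:
    """Consumer: dequeue a ticket, fetch the payload it points to."""
    ticket = json.loads(queue.pop(0))
    return blob_store[ticket["blob"]]

submit_job(b"render frame 42")
print(process_next())  # b'render frame 42'
```

In a real deployment the consumer would also delete the message and, eventually, garbage-collect the orphaned blob (see the Queues Recap later in the deck).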
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1–4 to the queue; Worker Roles call GetMessage (with a visibility timeout) to receive messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll resets the interval back to 1

[Diagram: consumers C1 and C2 polling the queue with intervals growing from 1 toward 60]
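The back-off rule above fits in a few lines. A sketch, assuming intervals in seconds with an initial value of 1 and a ceiling of 60 (the ceiling is what makes the back-off "truncated"):

```python
def next_poll_interval(current: float, got_message: bool,
                       initial: float = 1.0, ceiling: float = 60.0) -> float:
    """Truncated exponential back-off for queue polling.

    Each empty poll doubles the interval, truncated at `ceiling`;
    a successful poll resets it to `initial`.
    """
    if got_message:
        return initial
    return min(current * 2, ceiling)

# Empty polls: 1 -> 2 -> 4 -> 8 -> 16 -> 32 -> 60 -> 60 (truncated),
# then a successful poll resets the interval to 1.
i = 1.0
for _ in range(7):
    i = next_poll_interval(i, got_message=False)
print(i)                                         # 60.0
print(next_poll_interval(i, got_message=True))   # 1.0
```

The same shape works for retrying "Server Busy" responses from table and blob storage, as recommended earlier in the deck.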
Removing Poison Messages

[Diagram: producers P1, P2 and consumers C1, C2 around a queue holding messages with dequeue counts]

Scenario 1 – normal consumption:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2

Scenario 2 – consumer crash:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1

Scenario 3 – poison message removed via dequeue count:
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
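The poison-message rule in the recap reduces to one guard on the dequeue count. A sketch, where `process` and `dead_letter` are hypothetical callbacks supplied by the application and the threshold of 3 is an illustrative choice:

```python
MAX_DEQUEUE_COUNT = 3  # illustrative threshold; tune per workload

def handle(message, dequeue_count, process, dead_letter):
    """Poison-message guard from the recap above.

    If a message has already been dequeued more than the threshold,
    assume it is poisonous: remove it (here: divert to `dead_letter`)
    instead of processing it again. Otherwise process it; if `process`
    raises, the message simply becomes visible again after its timeout.
    """
    if dequeue_count > MAX_DEQUEUE_COUNT:
        dead_letter(message)   # remove the poison message from rotation
        return "dead-lettered"
    process(message)
    return "processed"

poisoned, done = [], []
print(handle("m1", 1, done.append, poisoned.append))  # processed
print(handle("m2", 4, done.append, poisoned.append))  # dead-lettered
```

Storing dead-lettered messages (for example, in a blob or table) preserves them for later diagnosis instead of silently discarding them.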
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You are paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
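The .NET 4 advice above is language-agnostic: data parallelism means applying the same operation to every item and letting a pool map work onto workers. A sketch of that shape, here in Python with a thread pool rather than the Task Parallel Library, purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(x: int) -> int:
    """Stand-in for a CPU- or I/O-heavy per-item task."""
    return x * x

items = list(range(10))

# Data parallelism: the same operation applied to every item,
# with the pool deciding how work maps onto worker threads.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(expensive, items))

print(results)  # [0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
```

Task parallelism is the dual shape: submit heterogeneous functions with `pool.submit(...)` and gather their futures, instead of mapping one function over a collection.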
Finding Good Code Neighbors
• Typically, code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you are scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure or a poor user experience from not having excess capacity, and the cost of idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs

[Diagram: uncompressed content vs. compressed content – gzip, minify JavaScript, minify CSS, minify images]
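Point 1 is easy to see in miniature: repetitive markup (the common case for generated HTML or JSON) compresses dramatically, and the round trip is lossless. A sketch using Python's standard gzip module:

```python
import gzip

# A repetitive payload, typical of generated HTML or JSON output.
body = b"<li>row</li>" * 1000  # 12,000 bytes

compressed = gzip.compress(body)

# Gzip shrinks repetitive text dramatically; the browser decompresses
# on the fly, so only the small form crosses the wire.
print(len(body), len(compressed))

# Lossless round trip: the client recovers the exact original bytes.
assert gzip.decompress(compressed) == body
```

In a real web role this would be handled by the web server's response compression rather than hand-rolled, but the bandwidth arithmetic is the same.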
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing

Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage load could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern

Leverage the multi-core capability of one instance
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes

Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead

Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long period of waiting in case of instance failure

[Diagram: splitting task → BLAST tasks (in parallel) → merging task]
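The split/join pattern above can be sketched in a few lines. This is an illustrative stand-in, not the AzureBLAST code: the 100-sequences-per-partition default is the figure the micro-benchmarks below found to work best, and the per-partition BLAST step is faked as an identity pass.

```python
def split_sequences(sequences, partition_size=100):
    """Splitting task: divide the input into fixed-size partitions.

    Partition size is the granularity knob from the slide: too large
    risks load imbalance, too small pays per-task overhead twice.
    """
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(partial_results):
    """Merging task: concatenate per-partition hit lists."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged

seqs = [f"seq{i}" for i in range(250)]
parts = split_sequences(seqs)  # 3 partitions: 100 + 100 + 50

# Each partition would be queued for a worker to BLAST in parallel;
# here the per-partition work is simply passed through.
print(len(parts), len(merge_results(parts)))  # 3 250
```

In the real system, each partition becomes a queue message (a work ticket pointing at a blob), and the merging task runs once all partition results have landed.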
Micro-Benchmarks Inform Design

Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to memory capacity

Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST Architecture

[Diagram: a Web Role hosts the Web Portal and Web Service (job registration); a Job Management Role contains the Job Scheduler and Scaling Engine, with the Job Registry and NCBI database metadata in Azure Tables; a global dispatch queue feeds the Worker instances; a Database Updating Role refreshes the NCBI databases; Azure Blob storage holds the BLAST databases, temporary data, etc.; each job flows as splitting task → BLAST tasks → merging task]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state

[Diagram: Job Portal → Web Portal / Web Service → job registration → Job Registry → Job Scheduler → Scaling Engine]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 cores of instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment was submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances occurred, the load was redistributed manually

[Diagram: deployments of 50 and 62 instances spread across the datacenters]
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs

A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
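The Penman-Monteith equation itself is a one-liner once the inputs are in hand; the hard part, as the slide says, is estimating the conductivities across a catchment. A direct transcription, with illustrative (not field-measured) inputs; the λv default of 2260 J/g is the usual value for water and is an assumption here:

```python
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET as written on the slide:

        ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)

    Units follow the slide (γ ≈ 66 Pa/K; λv in J/g).
    """
    numerator = delta * Rn + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative inputs only: Δ=145 Pa/K, Rn=400 W/m², ρa=1.2 kg/m³,
# cp=1005 J/(kg·K), δq=800 Pa, ga=0.02 m/s, gs=0.01 m/s.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2,
                     c_p=1005.0, dq=800.0, g_a=0.02, g_s=0.01)
print(et > 0)  # True
```

In the pipeline that follows, this per-pixel computation is the reduction stage applied across terabytes of satellite tiles.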
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a request queue feeds the data collection stage (pulling from source imagery download sites, with source metadata recorded), a reprojection queue feeds the reprojection stage, and the Reduction 1 and Reduction 2 queues feed the derivation and analysis reduction stages; a download queue delivers scientific results back to the scientists]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The MODISAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue

• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Role) dequeue tasks and read/write <Input> Data Storage]
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Role) process tasks against Reprojection Data Storage, the ScanTimeList and SwathGranuleMeta tables, and Swath Source Data Storage]

• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation

• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Key Components – Compute: Worker Roles
• Utility compute
• Windows Server 2008
• Background processing
• Each role can define an amount of local storage
• Protected space on the local drive, considered volatile storage
• May communicate with outside services
• Azure Storage
• SQL Azure
• Other Web services
• Can expose external and internal endpoints
Suggested Application Model: Using queues for reliable messaging

Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of your application, so they are easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
• You own the box
• How it works:
• Download the "Guest OS" to Server 2008 Hyper-V
• Customize the OS as you need to
• Upload the differences VHD
• Azure runs your VM role using the base OS plus the differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
• Find a hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• An encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage at Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file-system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
• PutBlob: inserts a new blob, overwrites the existing blob
• GetBlob: get a whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
e.g., http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs

Each container has an access level:
• Private (default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood

Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob

Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• Each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• You can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

[Diagram: blocks of Big.mpg uploaded in the order 1, 6, 8, 3, 5, 4, 7, 2, then committed in order as Big.mpg]
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
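The 512-byte alignment rule above is the one clients most often trip over: a page write must start on a 512-byte boundary and cover a whole number of pages. A sketch of the validation a client might perform before issuing a Put Page request (illustrative only, not part of any SDK):

```python
PAGE_SIZE = 512  # page blobs operate on 512-byte boundaries

def is_valid_page_range(start: int, length: int) -> bool:
    """A page write must start on a 512-byte boundary and cover a
    whole number of pages (the alignment rule from the slide)."""
    return start % PAGE_SIZE == 0 and length > 0 and length % PAGE_SIZE == 0

print(is_valid_page_range(0, 512))     # True
print(is_valid_page_range(512, 1024))  # True
print(is_valid_page_range(100, 512))   # False: unaligned start
```

In practice, this means padding application data out to 512-byte multiples before writing, and trimming the padding on read.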
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud

• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table Name: Movies – Star Wars, Star Trek, Fan Boys
• Table Name: Customers – Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available & durable
• Data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST – with any platform or language

Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Partitioning is different for each data type (blobs, entities, queues); every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
"Server Busy"
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order – 1             |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name: every blob and its snapshots are in a single partition

  Container Name | Blob Name
  image          | annarbor/bighouse.jpg
  image          | foxborough/gillette.jpg
  video          | annarbor/bighouse.jpg

Messages – Queue name: all messages for a single queue belong to the same partition

  Queue    | Message
  jobs     | Message 1
  jobs     | Message 2
  workflow | Message 1
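The locality rule above — everything sharing a partition key is served from the same partition — can be sketched with an illustrative hash. This is an assumption-laden toy, not Azure's placement algorithm (which is dynamic and traffic-driven); it only shows the invariant that one key always maps to one server, so co-partitioned data can be read together and transacted together.

```python
import hashlib

def partition_for(partition_key: str, n_servers: int) -> int:
    """Toy placement: map a partition key to one of n servers.
    Azure's real placement is dynamic; the invariant illustrated here
    is only that the same key always lands on the same server."""
    digest = hashlib.md5(partition_key.encode("utf-8")).hexdigest()
    return int(digest, 16) % n_servers

# Entities sharing a PartitionKey are always co-located...
assert partition_for("Movies|Action", 3) == partition_for("Movies|Action", 3)
# ...while distinct keys are free to land anywhere in range.
assert 0 <= partition_for("Movies|Comedy", 3) < 3
```

The design consequence is the one the slides stress: pick partition keys that spread traffic across many values, because a single key can never be served by more than one server.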
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
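The retry discipline recommended above can be sketched as a small wrapper. `ServerBusyError` is a hypothetical stand-in for an HTTP 503 from the storage service; the doubling-with-cap schedule and jitter are the standard pattern, not a specific Azure SDK API.

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for a '503 Server Busy' response from storage."""

def with_backoff(op, max_attempts=6, base=0.5, cap=30.0):
    """Retry op() with truncated exponential backoff.
    Sleeps ~0.5s, 1s, 2s, ... (jittered, capped) between attempts."""
    for attempt in range(max_attempts - 1):
        try:
            return op()
        except ServerBusyError:
            delay = min(base * (2 ** attempt), cap)
            time.sleep(delay * random.uniform(0.5, 1.0))  # jitter avoids retry storms
    return op()  # final attempt lets the error propagate to the caller
```

Jitter matters at scale: if every worker retries on the same schedule, the retries themselves arrive as a synchronized burst and keep the partition busy.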
Partitions and Partition Ranges
A Movies table keyed by PartitionKey (Category) and RowKey (Title), with Timestamp and ReleaseDate properties:

  PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
  Action                  | Fast & Furious           | …         | 2009
  Action                  | The Bourne Ultimatum     | …         | 2007
  …                       | …                        | …         | …
  Animation               | Open Season 2            | …         | 2009
  Animation               | The Ant Bully            | …         | 2006
  …                       | …                        | …         | …
  Comedy                  | Office Space             | …         | 1999
  …                       | …                        | …         | …
  SciFi                   | X-Men Origins: Wolverine | …         | 2009
  …                       | …                        | …         | …
  War                     | Defiance                 | …         | 2008

Initially the whole table is served from one server:
  Server A: Table = Movies [Min – Max]
As traffic grows, the system splits the key range across servers:
  Server A: Table = Movies [Min – Comedy)
  Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information

Expect Continuation Tokens – Seriously
A query returns a continuation token when it hits:
• a maximum of 1,000 rows in a response
• the end of a partition range boundary
• a maximum of 5 seconds to execute the query
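Because any of those three limits can fire, client code must always loop until the server stops handing back a token. A minimal, library-agnostic sketch (here `query_fn` is a hypothetical stand-in for a table-client call that returns a page of rows plus the next token, or `None` when the scan is complete):

```python
def query_all(query_fn, query_filter):
    """Drain a table query that may return continuation tokens.
    query_fn(filter, token) -> (rows, next_token_or_None); stand-in for
    a storage-client call that can stop early at 1,000 rows, at a
    partition boundary, or after 5 seconds."""
    rows, token = query_fn(query_filter, None)
    results = list(rows)
    while token is not None:       # keep asking until the server is done
        rows, token = query_fn(query_filter, token)
        results.extend(rows)
    return results
```

The common bug this guards against: treating the first page as the whole result set, which silently drops rows the moment a range query crosses a partition boundary.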
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Distribute load by using a hash etc. as a prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on "Server Busy"
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
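The work ticket pattern referenced above can be sketched as follows. `put_blob` and `put_message` are hypothetical stand-ins for storage-client calls: the queue message carries either the small payload itself or, when the 8 KB limit would be exceeded, just a "ticket" naming the blob that holds the real data.

```python
MESSAGE_LIMIT = 8 * 1024  # queue messages are limited to 8 KB

def enqueue_work(put_blob, put_message, job_id: str, payload: bytes):
    """Work-ticket pattern: large payloads go to blob storage and the
    queue message carries only a reference. put_blob/put_message are
    stand-ins for storage client calls."""
    if len(payload) <= MESSAGE_LIMIT:
        put_message(payload)                # small enough to ride in the message
    else:
        blob_name = "tickets/" + job_id
        put_blob(blob_name, payload)        # park the data in a blob...
        put_message(blob_name.encode())     # ...and enqueue just the ticket
```

The consumer does the reverse — dequeue, fetch the blob if the message is a ticket, process, then delete both; the recap slide's reminder to garbage-collect orphaned blobs applies when a worker dies between the two deletes.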
Queue Terminology
Message Lifecycle
[Diagram: Web Role → PutMessage → Queue (Msg 1, Msg 2, Msg 3, Msg 4) → Worker Roles, which call GetMessage (with a visibility timeout) and then RemoveMessage once processing succeeds]
Put a message:
POST http://myaccount.queue.core.windows.net/myqueue/messages

Get a message — response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

Delete the message:
DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
[Diagram: consumers C1 and C2 polling a queue at widening intervals]
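The polling schedule above reduces transaction charges on an idle queue (every GetMessage is a billed transaction, even when it returns nothing). A minimal sketch of the interval rule — the floor and ceiling values are illustrative assumptions, not prescribed by the platform:

```python
def next_poll_interval(current: float, got_message: bool,
                       floor: float = 1.0, ceiling: float = 60.0) -> float:
    """Truncated exponential back-off polling: each empty poll doubles
    the wait (up to a ceiling); a successful poll resets it to the floor."""
    if got_message:
        return floor                 # traffic is flowing again: poll eagerly
    return min(current * 2.0, ceiling)  # queue looks idle: back off, but cap it
```

A worker loop would sleep `interval` seconds between GetMessage calls and feed each outcome back through this function.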
Removing Poison Messages
[Diagram: producers P1 and P2 feeding a queue; consumers C1 and C2 dequeuing with a 30-second visibility timeout]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB – use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
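The poison-message rule from the walkthrough and the recap can be sketched as a consumer-side guard. `process`, `delete_message`, and `quarantine` are hypothetical callables standing in for the real handler and storage-client calls; the threshold of 2 mirrors the `DequeueCount > 2` check in the diagram.

```python
POISON_THRESHOLD = 2  # matches the 'DequeueCount > 2' check in the walkthrough

def handle(msg: dict, process, delete_message, quarantine):
    """Consume one dequeued message. A message whose dequeue count has
    exceeded the threshold keeps crashing its consumers, so it is
    quarantined and deleted instead of being processed yet again."""
    if msg["dequeue_count"] > POISON_THRESHOLD:
        quarantine(msg)        # e.g. log it or park it in a dead-letter table
        delete_message(msg)    # stop it from reappearing forever
        return
    process(msg)               # idempotent processing, per the recap
    delete_message(msg)        # only delete after successful processing
```

Deleting after processing (not before) is what makes the at-least-once guarantee work: a crash between the two steps just makes the message visible again.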
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance is not limited to one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure and poor user experience from not having excess capacity against the cost of idling VMs
Balance: performance vs. cost
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content]
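The payoff of gzipping output is easy to demonstrate with the standard library — repetitive markup like HTML compresses dramatically, and the round-trip is lossless, which is why every byte saved here is a byte saved on the bandwidth bill:

```python
import gzip

def compress_response(body: bytes) -> bytes:
    """Gzip an outgoing response body; all modern browsers can
    decompress on the fly (signalled via Content-Encoding: gzip)."""
    return gzip.compress(body)

html = b"<html>" + b"<li>hello azure</li>" * 500 + b"</html>"
packed = compress_response(html)
assert len(packed) < len(html)            # repetitive markup shrinks well
assert gzip.decompress(packed) == html    # lossless round-trip
```

In a real web role this sits behind content negotiation: only send gzipped bytes when the request's Accept-Encoding includes gzip.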
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
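The query-segmentation approach above — the one AzureBLAST takes — starts with nothing more than chopping the input sequence list into fixed-size partitions that workers can process independently. A minimal sketch (the 100-sequences-per-partition default is the value the deck's micro-benchmarks later identify as the sweet spot):

```python
def split_queries(sequences, per_partition=100):
    """Query segmentation: chop input sequences into fixed-size
    partitions that worker roles can BLAST in parallel. Partition size
    trades load balance (small) against per-task overhead (large)."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]
```

Because each partition is searched against the same database, the partial result files can simply be concatenated (or merged by score) in a final join step — the "special result reduction" is only needed when the database itself is segmented.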
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga. "AzureBlast: A Case Study of Developing Science Applications on the Cloud." In Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple Split/Join pattern: splitting task → BLAST tasks (in parallel) → merging task
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST Architecture
[Diagram: a Web Role (web portal and web service) performs job registration into a Job Management Role (job scheduler plus scaling engine), which dispatches tasks onto a global dispatch queue consumed by Worker instances. An Azure Table holds the Job Registry and NCBI database metadata; Azure Blob storage holds the BLAST databases, temporary data, etc.; a Database Updating Role keeps the NCBI databases current.]
Task flow: splitting task → BLAST tasks (in parallel) → merging task
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
[Diagram: Job Portal (web portal, web service) → job registration → Job Scheduler, Scaling Engine, Job Registry]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment was submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances appeared, the load was redistributed manually
[Diagram: instance counts per deployment – 50, 62, 62, 62, 62, 62, 50, 62]

End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
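Detecting the "something is wrong" case above is a simple log-mining exercise: a healthy task produces a matched "Executing"/"is done" pair, so any task ID that starts but never finishes flags a failed or pre-empted instance. A sketch, assuming the log line format shown in the slide:

```python
import re

START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")

def unfinished_tasks(log_lines):
    """Scan worker logs for tasks that started but never reported
    completion -- the signature of a failed or pre-empted instance."""
    started, finished = set(), set()
    for line in log_lines:
        if m := START.search(line):
            started.add(m.group(1))
        elif m := DONE.search(line):
            finished.add(m.group(1))
    return started - finished
```

Run over the sample lines above, this flags task 251774 (started at 8:22, never completed) while 251895 pairs up and is ignored.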
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
All 62 compute nodes lost tasks and then came back in groups — this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, then the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" — Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration — evaporation through plant membranes — by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
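The Penman-Monteith formula above translates directly into code; this sketch just encodes the algebra with the slide's symbols, leaving unit handling to the caller (the defaults for λv and γ are illustrative order-of-magnitude values, not calibrated constants).

```python
def penman_monteith(delta, rn, rho_a, cp, dq, ga, gs,
                    lam_v=2.45e6, gamma=66.0):
    """Penman-Monteith (1964):
        ET = (delta*Rn + rho_a*cp*(dq)*ga) / ((delta + gamma*(1 + ga/gs)) * lam_v)
    Symbols follow the slide; consistent units are the caller's
    responsibility. Defaults for lam_v and gamma are illustrative."""
    numerator = delta * rn + rho_a * cp * dq * ga
    denominator = (delta + gamma * (1.0 + ga / gs)) * lam_v
    return numerator / denominator
```

In the MODISAzure pipeline this arithmetic is the easy part; the hard part, as the slide notes, is producing the conductivities ga and gs for every cell of a catchment from imagery, sensor, and model inputs.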
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; Request Queue → Download Queue → Data Collection Stage (pulling from source imagery download sites and source metadata) → Reprojection Queue → Reprojection Stage → Reduction 1 Queue → Derivation Reduction Stage → Reduction 2 Queue → Analysis Reduction Stage → scientific results download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue → Service Monitor (Worker Role), which parses & persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]

MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: Service Monitor (Worker Role) dispatches to the <PipelineStage> Task Queue → Generic Worker (Worker Role) → <Input> Data Storage]
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request reaches the Service Monitor (Worker Role), which persists ReprojectionJobStatus, enqueues to the Job Queue, parses & persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Roles) consume tasks, reading Swath Source Data Storage and writing Reprojection Data Storage]
• Each job-status entity specifies a single reprojection job request
• Each task-status entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per stage (requests entering via the AzureMODIS Service Web Role Portal):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Suggested Application Model: Using queues for reliable messaging
Scalable, Fault-Tolerant Applications
Queues are the application glue
• Decouple parts of the application, making them easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role
• You own the box
• How it works:
• Download the "Guest OS" to Server 2008 Hyper-V
• Customize the OS as you need to
• Upload the differences VHD
• Azure runs your VM role using the base OS plus the differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
• Find a hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action – perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• An encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)

Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
• Determine resource requirements
• Create role images
2. Allocate resources
3. Prepare nodes
• Place role images on nodes
• Configure settings
• Start roles
4. Configure load balancers
5. Maintain service health
• If a role fails, restart the role based on policy
• If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file-system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block Blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page Blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'
• Each block has an ID
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
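The block/commit flow above can be sketched locally. This is a toy model of the Put Block / Put Block List semantics, not the real storage API; the class and method names are illustrative:

```python
# Sketch of block-blob semantics: upload blocks in any order, then commit
# an ordered block list. Names are illustrative, not the real storage API.
class BlockBlob:
    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; GC'd if never committed
        self.committed = b""

    def put_block(self, block_id: str, data: bytes) -> None:
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids: list) -> None:
        # The final blob is the concatenation of blocks in the order listed,
        # which need not be the order they were uploaded in.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlob()
blob.put_block("002", b"world")       # uploaded out of order
blob.put_block("001", b"hello ")
blob.put_block_list(["001", "002"])   # the commit decides the order
print(blob.committed)                 # b'hello world'
```

The point of the two-phase design is that a large upload can proceed in parallel, out of order, and with retries, while the commit remains a single atomic operation.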
Blocks
Big.mpg → blocks 1, 6, 8, 3, 5, 4, 7, 2 → committed blob Big.mpg
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 4 MB in size
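The alignment rule above is easy to get wrong; a minimal sketch of a range validator (an illustrative helper, not part of any SDK):

```python
# Sketch: validate a Put Page range against the 512-byte alignment rule
# described above. Purely illustrative helper, not part of any SDK.
PAGE_SIZE = 512

def valid_page_range(start: int, length: int) -> bool:
    """A page-blob write must start and end on 512-byte boundaries."""
    return start % PAGE_SIZE == 0 and length > 0 and length % PAGE_SIZE == 0

print(valid_page_range(0, 512))      # True
print(valid_page_range(1024, 4096))  # True
print(valid_page_range(100, 512))    # False: start is not aligned
```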
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
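Since leasing is REST-only, a sketch of what the acquire request looks like. The header names follow the storage REST docs as I recall them (`x-ms-lease-action`); the account, container, and blob names are made up, and the request is only built, never sent (a real call also needs a signed Authorization header):

```python
# Sketch of acquiring a blob lease over REST. The request is constructed
# but not sent; names are illustrative and the auth header is omitted.
import urllib.request

url = ("http://myaccount.blob.core.windows.net/"
       "mycontainer/myblob?comp=lease")
req = urllib.request.Request(url, method="PUT")
req.add_header("x-ms-lease-action", "acquire")   # acquire|renew|release|break
req.add_header("x-ms-version", "2009-09-19")

print(req.get_method())                      # PUT
print(req.get_header("X-ms-lease-action"))   # acquire
```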
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• Drive made durable through standard Page Blob replication
• Drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
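The FetchAttributes() guidance above amounts to "probe and treat the error as not-found". A hedged sketch of the pattern; the container object here is a stand-in, not the real StorageClient library:

```python
# Sketch of the guidance above: there is no Exists() call, so probe with
# a FetchAttributes-style metadata call and treat the error as 'not found'.
# The container/exception types are stand-ins, not the real client library.
class ContainerNotFound(Exception):
    pass

def container_exists(container) -> bool:
    try:
        container.fetch_attributes()   # cheap metadata round trip
        return True
    except ContainerNotFound:
        return False

class FakeContainer:
    def __init__(self, exists): self._exists = exists
    def fetch_attributes(self):
        if not self._exists:
            raise ContainerNotFound()

print(container_exists(FakeContainer(True)))   # True
print(container_exists(FakeContainer(False)))  # False
```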
Table Structure
Account MovieData
Star Wars / Star Trek / Fan Boys
Table Name Movies
Brian H. Prince / Jason Argonaut / Bill Gates
Table Name Customers
Account
Table
Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
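A minimal sketch of such an entity, using a plain dict. The helper and column names are illustrative; in practice the service assigns Timestamp server-side, and the varying extra columns illustrate the "schema can vary" point:

```python
# Sketch: every table entity carries the three required properties above,
# plus any application-defined columns (schema can vary per entity).
from datetime import datetime, timezone

def make_entity(partition_key: str, row_key: str, **columns) -> dict:
    return {
        "PartitionKey": partition_key,
        "RowKey": row_key,
        # Illustrative only: the real service sets Timestamp on the server.
        "Timestamp": datetime.now(timezone.utc).isoformat(),
        **columns,
    }

movie = make_entity("Action", "Fast & Furious", ReleaseDate=2009)
other = make_entity("Action", "The Bourne Ultimatum")  # different columns, same table
print(movie["PartitionKey"], movie["RowKey"])
```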
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Different for each data type (blobs, entities, queues)
Every data object has a partition key
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality
Partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
System load balances
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• "Server Busy" also means single-partition limits have been reached
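The "exponential backoff on Server Busy" advice above can be sketched as a retry wrapper. The exception type and operation are stand-ins; a real client would trigger this on an HTTP 503:

```python
# Sketch of retry-with-exponential-backoff: double the delay after each
# 'server busy' failure. ServerBusy stands in for an HTTP 503 response.
import time

class ServerBusy(Exception):
    pass

def with_backoff(op, retries=5, base_delay=0.1, sleep=time.sleep):
    delay = base_delay
    for attempt in range(retries):
        try:
            return op()
        except ServerBusy:
            if attempt == retries - 1:
                raise                  # out of retries: surface the error
            sleep(delay)
            delay *= 2                 # exponential growth between attempts

# Fake operation that is busy twice, then succeeds.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusy()
    return "ok"

print(with_backoff(flaky, sleep=lambda s: None))  # ok
```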
Partition Keys In Each Abstraction
• Entities with the same PartitionKey value are served from the same partition
Entities – TableName + PartitionKey
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00
• Every blob and its snapshots are in a single partition
Blobs – Container name + Blob name
• All messages for a single queue belong to the same partition
Messages – Queue Name
Container Name Blob Name
image annarbor/bighouse.jpg
image foxborough/gillette.jpg
video annarbor/bighouse.jpg
Queue Message
jobs Message1
jobs Message2
workflow Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
Server 1 | Server 2 | Server 3 – each holds replicas of partitions P1, P2, …, Pn
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Partitions and Partition Ranges
Server A: Table = Movies [Min – Max]
Server A: Table = Movies [Min – Comedy)
Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A continuation token is returned on:
• Maximum of 1000 rows in a response
• The end of a partition range boundary
• Maximum of 5 seconds to execute the query
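The token-handling loop the slide insists on can be sketched as follows. The paging function here is a stub standing in for a table query that returns (rows, continuation-token-or-None):

```python
# Sketch of the continuation-token pattern: keep reissuing the query with
# the returned token until the service stops handing one back.
def query_all(query_page):
    results, token = [], None
    while True:
        page, token = query_page(token)   # (rows, next_token or None)
        results.extend(page)
        if token is None:                 # no token -> no more pages
            return results

# Stub service that returns at most 2 rows per call.
DATA = ["r1", "r2", "r3", "r4", "r5"]
def fake_page(token):
    start = token or 0
    page = DATA[start:start + 2]
    nxt = start + 2 if start + 2 < len(DATA) else None
    return page, nxt

print(query_all(fake_page))  # ['r1', 'r2', 'r3', 'r4', 'r5']
```

Code that ignores the token silently sees only the first page, which is why the slide says "seriously".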
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns
• Distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy: the system load balances partitions to meet traffic needs, or load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout) / RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
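The Get/Delete lifecycle above hinges on the visibility timeout. A toy in-memory model of the semantics (not the real service; the clock is simulated so the behavior is easy to see):

```python
# Toy model of the message lifecycle: GetMessage hides a message for the
# visibility timeout; if it is not deleted in time (e.g. the worker
# crashed), it becomes visible again for another worker.
class ToyQueue:
    def __init__(self):
        self.messages = {}        # id -> (body, invisible_until)
        self.clock = 0.0          # simulated time in seconds

    def put(self, mid, body):
        self.messages[mid] = (body, 0.0)

    def get(self, timeout=30.0):
        for mid, (body, until) in self.messages.items():
            if until <= self.clock:
                self.messages[mid] = (body, self.clock + timeout)
                return mid, body
        return None               # nothing visible right now

    def delete(self, mid):        # requires the id (cf. pop receipt)
        self.messages.pop(mid, None)

q = ToyQueue()
q.put("m1", "work item")
first = q.get(timeout=30.0)     # a worker takes the message...
print(q.get())                  # None: message is invisible to others
q.clock += 31.0                 # ...worker crashed, timeout elapsed
print(q.get())                  # the message is visible again
```

This is exactly why the slides say a message "can be processed at least once": the redelivery after a crash is the feature, and DeleteMessage is what ends the cycle.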
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x; a successful poll resets the interval back to 1.
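The policy above fits in a few lines; the base and cap values here are illustrative:

```python
# Sketch of truncated exponential back-off polling: every empty poll
# doubles the sleep interval up to a cap; a successful poll resets it.
def next_interval(current, got_message, base=1.0, cap=60.0):
    if got_message:
        return base                    # reset on success
    return min(current * 2, cap)       # double, truncated at the cap

i = 1.0
for _ in range(8):                     # eight empty polls in a row
    i = next_interval(i, got_message=False)
print(i)                                   # capped at 60.0
print(next_interval(i, got_message=True))  # back to 1.0
```

Unlike the retry back-off used for "Server Busy", this loop runs forever; the truncation keeps an idle worker from backing off into uselessness.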
Removing Poison Messages
Producers: P1, P2 – Consumers: C1, C2
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 sec after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 sec after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB – use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
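The poison-message rule in the recap can be sketched as a receipt-time check. The message shape and threshold are illustrative (the dequeue count itself is supplied by the queue service):

```python
# Sketch of the poison-message rule: check the message's dequeue count on
# receipt and give up once it crosses a threshold, instead of reprocessing
# a message that keeps crashing its consumer. 'msg' is a plain dict here.
MAX_DEQUEUE = 2

def handle(msg, process, dead_letter):
    if msg["dequeue_count"] > MAX_DEQUEUE:
        dead_letter.append(msg)        # park it for later inspection
        return "dropped"
    process(msg)
    return "processed"

dead = []
ok = handle({"body": "fine", "dequeue_count": 1}, lambda m: None, dead)
bad = handle({"body": "poison", "dequeue_count": 3}, lambda m: None, dead)
print(ok, bad, len(dead))  # processed dropped 1
```

This mirrors step 12 in the sequence above (DequeueCount > 2 → Delete): without the check, a poison message circulates between workers forever.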
Windows Azure Storage Takeaways
Data abstractions to build your applications:
Blobs – files and large objects
Drives – NTFS APIs for migrating applications
Tables – massively scalable structured storage
Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
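The deck recommends the .NET Task Parallel Library; as an analogue in a different language, here is a minimal data-parallel sketch using Python's standard-library thread pool (the worker function is a stand-in for real per-item work):

```python
# Analogue of TPL-style data parallelism using Python's stdlib thread pool.
from concurrent.futures import ThreadPoolExecutor

def expensive(x):
    return x * x   # stand-in for per-item work

with ThreadPoolExecutor(max_workers=4) as pool:
    # map distributes the items across the pool and preserves input order.
    results = list(pool.map(expensive, range(8)))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```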
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs
Performance vs. cost
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places.
Sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
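A quick way to see the gzip trade-off from point 1: typical HTML output is highly repetitive, so it compresses well (the markup below is a made-up sample):

```python
# Quick check of the gzip advice: repetitive markup compresses well.
import gzip

page = (b"<html><body>"
        + b"<div class='row'>hello</div>" * 200   # made-up repetitive HTML
        + b"</body></html>")
packed = gzip.compress(page)

# Compression costs CPU (point 2's trade-off) but saves bandwidth.
print(len(page), len(packed), len(packed) < len(page))
```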
Uncompressed Content
Compressed Content
Gzip / Minify JavaScript
Minify CSS / Minify Images
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
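The "segment the input" step above is just fixed-size partitioning of the query sequences. A sketch (the sequence names are made up; the micro-benchmarks later in the deck suggest ~100 sequences per partition):

```python
# Sketch of query segmentation: split input sequences into fixed-size
# partitions that workers can BLAST independently.
def segment(sequences, per_partition=100):
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]

seqs = [f"seq{i}" for i in range(250)]      # made-up sequence IDs
parts = segment(seqs)
print(len(parts), [len(p) for p in parts])  # 3 [100, 100, 50]
```

Each partition becomes one work-ticket on the dispatch queue; the merge step concatenates the per-partition result files.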
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern
Leverage multi-core on one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
Best practice: do test runs to profile, and set partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: unnecessarily long wait in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
…
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry
NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory state
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB in size)
• A total of 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be
Otherwise, something is wrong (e.g., a task failed to complete)
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe Data Center; 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups – this is an update domain
~30 mins
~6 nodes in one group
Surviving Storage Failures
West Europe Datacenter; 30,976 tasks completed, and the job was killed
35 nodes experienced blob-writing failure at the same time
A reasonable guess: the fault domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dry. (Irish proverb)
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and transpiration, or evaporation through plant membranes, by plants.
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset (30 GB, 960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
<PipelineStage> Request
… <PipelineStage>JobStatus
Persist <PipelineStage> Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
…
Dispatch <PipelineStage> Task Queue
…
<Input> Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
Reprojection Request…
Service Monitor (Worker Role)
Persist ReprojectionJobStatus
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20–100 workers
5–7 GB, 55K files, 1800 hours, 20–100 workers
<10 GB, ~1K files, 1800 hours, 20–100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Scalable Fault Tolerant Applications
Queues are the application glue:
• Decouple parts of the application, so they are easier to scale independently
• Resource allocation: different priority queues and backend servers
• Mask faults in worker roles (reliable messaging)
Key Components – Compute: VM Roles
• Customized role: you own the box
• How it works:
• Download the "Guest OS" to Server 2008 Hyper-V
• Customize the OS as you need to
• Upload the differences VHD
• Azure runs your VM role using the base OS plus the differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
• Find a hardware home
• Copy and launch your app binaries
• Monitor your app and the hardware
• In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + the service model:
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from a script
• VS builds two files:
• An encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob: massive files, e.g. videos, logs
Drive: use standard file system APIs
Tables: non-relational, but with few scale limits; use SQL Azure for relational data
Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
• PutBlob: inserts a new blob, overwrites the existing blob
• GetBlob: gets the whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://&lt;storageaccount&gt;.blob.core.windows.net/&lt;Container&gt;/&lt;BlobName&gt;
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (the default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
• Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
• You can upload a file in "blocks"
• Each block has an ID
• Then commit those blocks, in any order, into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• You can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
[Diagram: Big.mpg uploaded as numbered blocks, then committed in order to form the final Big.mpg blob]
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:
• http://&lt;accountname&gt;.blob.core.windows.net/&lt;containername&gt;/&lt;blobname&gt;
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table Name: Movies (entities: Star Wars, Star Trek, Fan Boys)
• Table Name: Customers (entities: Brian H. Prince, Jason Argonaut, Bill Gates)
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available and durable
• Data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST, with any platform or language
Is not relational. You cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• Use server-side aggregates; there is no server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
"Server Busy" responses mean either the system is load balancing to meet your traffic needs, or single-partition limits have been reached:
• Use exponential backoff on "Server Busy"
Partition Keys In Each Abstraction
• Entities: TableName + PartitionKey. Entities with the same PartitionKey value are served from the same partition.

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order-1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order-3 | | | $10.00

• Blobs: Container name + Blob name. Every blob and its snapshots are in a single partition.

Container Name | Blob Name
image | annarborbighouse.jpg
image | foxboroughgillette.jpg
video | annarborbighouse.jpg

• Messages: Queue Name. All messages for a single queue belong to the same partition.

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account:
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff.
Partitions and Partition Ranges
Example Movies table (PartitionKey = Category, RowKey = Title):

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

Initially a single server holds the entire range:
Server A: Table = Movies [Min - Max]

Under load, the system splits the partition range across servers:
Server A: Table = Movies [Min - Comedy)
Server B: Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics and fewer round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
A query returns a continuation token when it hits any of:
• The maximum of 1,000 rows in a response
• The end of a partition range boundary
• The maximum of 5 seconds to execute the query
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale:
• Distribute by using a hash, etc. as a prefix
• Avoid "append only" patterns
Always handle continuation tokens:
• Expect continuation tokens for range queries
"OR" predicates are not optimized:
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries; "server busy" means either:
• Partitions are being load balanced to meet traffic needs, or
• The load on a single partition has exceeded the limits
WCF Data Services:
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1-4 to the Queue; Worker Roles call GetMessage (with a visibility timeout) and RemoveMessage]

Sample REST exchange:

POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;QueueMessagesList&gt;
  &lt;QueueMessage&gt;
    &lt;MessageId&gt;5974b586-0df3-4e2d-ad0c-18e3892bfca2&lt;/MessageId&gt;
    &lt;InsertionTime&gt;Mon, 22 Sep 2008 23:29:20 GMT&lt;/InsertionTime&gt;
    &lt;ExpirationTime&gt;Mon, 29 Sep 2008 23:29:20 GMT&lt;/ExpirationTime&gt;
    &lt;PopReceipt&gt;YzQ4Yzg1MDIGM0MDFiZDAwYzEw&lt;/PopReceipt&gt;
    &lt;TimeNextVisible&gt;Tue, 23 Sep 2008 05:29:20 GMT&lt;/TimeNextVisible&gt;
    &lt;MessageText&gt;PHRlc3Q+dGdGVzdD4=&lt;/MessageText&gt;
  &lt;/QueueMessage&gt;
&lt;/QueueMessagesList&gt;

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll doubles the polling interval, and a successful poll resets the interval back to 1.
[Diagram: consumers C1 and C2 polling the queue, with intervals growing 1, 2, … up to a cap of 60]
Removing Poison Messages
[Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue with a 30 s visibility timeout]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2

Removing Poison Messages (2)
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after the dequeue
7. C2: GetMessage(Q, 30 s) → msg 1

Removing Poison Messages (3)
8. C2 crashed
9. msg 1 becomes visible again 30 s after the dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. msg 1's DequeueCount &gt; 2: treat it as a poison message
13. Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages &gt; 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU against having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• There is a trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile:
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content passes through Gzip, JavaScript minification, CSS minification, and image minification to become compressed content]
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input; segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST); needs special result-reduction processing
Large data volumes:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern:
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out BLAST tasks, and a merging task joins their results.
Leverage the multiple cores of one instance:
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for the small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of the visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of an instance failure
[Diagram: Splitting task → BLAST task, BLAST task, … → Merging task]
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size and instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
[Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine, persisting the job registry in an Azure Table; tasks are dispatched through a global dispatch queue to Worker instances; a Database Updating Role refreshes the NCBI databases; Azure Blobs hold the BLAST databases, temporary data, etc.; each job follows the splitting task → BLAST tasks → merging task flow]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored in the job registry table:
• Fault tolerance: avoid in-memory state
[Diagram: the web portal and web service feed job registration; the job scheduler and scaling engine work off the job registry]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs:
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST; each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution; each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
[Map: deployments of 50-62 instances spread across the datacenters]
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
[Map: deployments of 50-62 instances spread across the datacenters]
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups: this is an update domain.
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and then the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; a big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
1. Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
3. Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
4. Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS Service Web Role portal to a request queue; the data collection stage pulls from source imagery download sites via the download queue; the reprojection, derivation reduction, and analysis reduction stages communicate through the reprojection, reduction 1, and reduction 2 queues; source metadata and science results are held in storage for download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door:
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role:
• Parses all job requests into tasks, the recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a &lt;PipelineStage&gt; Request enters the MODISAzure Service (Web Role), which persists &lt;PipelineStage&gt;JobStatus and enqueues to the &lt;PipelineStage&gt; Job Queue; the Service Monitor (Worker Role) parses and persists &lt;PipelineStage&gt;TaskStatus and dispatches to the &lt;PipelineStage&gt; Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role (the GenericWorker):
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists &lt;PipelineStage&gt;TaskStatus; GenericWorker (Worker Role) instances pull from the &lt;PipelineStage&gt; Task Queue and read from &lt;Input&gt; Data Storage]
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches tasks to the Task Queue, which is consumed by GenericWorker (Worker Role) instances reading Reprojection Data Storage and Swath Source Data Storage]
• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage scale and cost (as labeled on the pipeline diagram):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, &lt;10 workers: $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3,500 hours, 20-100 workers: $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers: $216 CPU, $1 download, $6 storage
• Analysis reduction stage: &lt;10 GB, ~1K files, 1,800 hours, 20-100 workers: $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Key Components – Compute: VM Roles
• Customized role
  • You own the box
• How it works:
  • Download "Guest OS" to Server 2008 Hyper-V
  • Customize the OS as you need to
  • Upload the differences VHD
• Azure runs your VM role using:
  • Base OS
  • Differences VHD
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you:
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
• At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
• The platform identifies and allocates resources, deploys the service, and manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certificates for authentication
• Lets you create, delete, change, upgrade, swap…
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage at Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file-system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (default): requires the account key to access
• Full public read
• Public read-only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
• You can upload a file in 'blocks'
• Each block has an ID
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Blocks
(Diagram: blocks 1–8 of Big.mpg uploaded out of order, then committed as Big.mpg)
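The upload-out-of-order, commit-in-order model above can be sketched as follows. This is a minimal in-memory illustration of the block-blob idea, not the real Azure client library: the BlockBlob class, put_block/put_block_list names, and block_id helper are all hypothetical stand-ins.

```python
import base64

class BlockBlob:
    """Hypothetical in-memory model of a block blob (not the Azure SDK)."""
    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; blocks may arrive in any order
        self.committed = b""

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Commit: the ordered ID list defines the final blob contents.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

def block_id(n):
    # Block IDs are opaque strings; Base64 of a fixed-width index is a common choice.
    return base64.b64encode(f"{n:08d}".encode()).decode()

blob = BlockBlob()
data = b"abcdefgh"
chunks = [data[i:i + 2] for i in range(0, len(data), 2)]
for n in (1, 3, 0, 2):                       # uploaded out of order
    blob.put_block(block_id(n), chunks[n])
blob.put_block_list([block_id(n) for n in range(4)])  # committed in order
assert blob.committed == data
```

The commit step is what makes modification cheap: replacing one block and re-committing the list rewrites only that block, not the whole blob.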
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a page blob
  • Example: mount a page blob as X:
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the page blob
• The drive is made durable through standard page blob replication
• The drive persists as a page blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a page blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted page blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and page blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (page blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (page blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table name: Movies (entities: Star Wars, Star Trek, Fan Boys)
• Table name: Customers (entities: Brian H. Prince, Jason Argonaut, Bill Gates)
Account → Table → Entity
Tables store entities; entity schema can vary within the same table
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load-balances partitions based on traffic patterns
• Controls entity locality
System load balancing:
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server busy:
• Use exponential backoff on "Server Busy"
• Either the system is load-balancing to meet your traffic needs
• Or single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey:
• Entities with the same PartitionKey value are served from the same partition
  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order – 1             |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order – 3             |              |                     | $10.00
Blobs – Container name + Blob name:
• Every blob and its snapshots are in a single partition
  Container Name | Blob Name
  image          | annarborbighouse.jpg
  image          | foxboroughgillette.jpg
  video          | annarborbighouse.jpg
Messages – Queue name:
• All messages for a single queue belong to the same partition
  Queue    | Message
  jobs     | Message1
  jobs     | Message2
  workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync
(Diagram: partitions P1…Pn replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
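The "503 Server Busy plus exponential backoff" advice above can be sketched as a generic retry wrapper. This is an illustration, not the storage client library: ServerBusyError, with_backoff, and flaky_put are all hypothetical names standing in for a 503 response and a storage call.

```python
import random
import time

class ServerBusyError(Exception):
    """Hypothetical stand-in for an HTTP 503 'Server Busy' response."""

def with_backoff(op, retries=6, base=0.5, cap=30.0):
    # Retry `op` on 503, doubling the wait each attempt up to `cap`,
    # with jitter so many clients don't retry in lock-step.
    for attempt in range(retries):
        try:
            return op()
        except ServerBusyError:
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))
    raise RuntimeError("server still busy after retries")

# Example: an operation that is throttled twice, then succeeds.
calls = {"n": 0}
def flaky_put():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusyError()
    return "ok"

assert with_backoff(flaky_put, base=0.01) == "ok"
```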
Partitions and Partition Ranges
A Movies table with PartitionKey = Category and RowKey = Title:
  PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
  Action                  | Fast & Furious           | …         | 2009
  Action                  | The Bourne Ultimatum     | …         | 2007
  Animation               | Open Season 2            | …         | 2009
  Animation               | The Ant Bully            | …         | 2006
  Comedy                  | Office Space             | …         | 1999
  SciFi                   | X-Men Origins: Wolverine | …         | 2009
  War                     | Defiance                 | …         | 2008
Initially a single server can serve the whole table:
  Server A: Table = Movies [Min – Max]
As traffic grows, the system splits the table by partition-key range:
  Server A: Table = Movies [Min – Comedy)
  Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query may return a continuation token instead of the full result set:
• Maximum of 1,000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
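The safe client pattern is a loop that keeps requesting pages until no token comes back. A minimal sketch, using a hypothetical FakeTable in place of the real table service (the query signature here is illustrative, not the SDK's):

```python
def query_all(table, page_size=1000):
    # Drain a query by always honoring continuation tokens: the service
    # may stop early (1,000-row cap, partition range boundary, or
    # 5-second budget) even when more rows match.
    results, token = [], None
    while True:
        page, token = table.query(token, page_size)
        results.extend(page)
        if token is None:
            return results

class FakeTable:
    """Hypothetical table service paging through matching rows."""
    def __init__(self, rows):
        self.rows = rows
    def query(self, token, page_size):
        start = token or 0
        end = min(start + page_size, len(self.rows))
        next_token = end if end < len(self.rows) else None
        return self.rows[start:end], next_token

t = FakeTable(list(range(2500)))        # 2,500 matches -> 3 pages
assert len(query_all(t)) == 2500
```

Forgetting this loop is the classic bug: the code works in testing (small result sets) and silently truncates results in production.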
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale:
• Distribute by using a hash, etc., as a prefix
• Avoid "append only" patterns
Always handle continuation tokens:
• Expect continuation tokens for range queries
"OR" predicates are not optimized:
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on server busy:
• Either the system is load-balancing partitions to meet traffic needs
• Or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
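The work-ticket pattern mentioned above keeps the queue message small by storing the real payload in blob storage and enqueuing only a reference. A minimal in-memory sketch (the blobs dict and ticket names are hypothetical stand-ins for blob storage and an Azure queue):

```python
import queue
import uuid

blobs = {}            # stand-in for blob storage
work = queue.Queue()  # stand-in for an Azure queue

def submit(payload: bytes):
    """Producer: big data goes to a blob; a tiny ticket goes on the queue."""
    ticket = str(uuid.uuid4())
    blobs[ticket] = payload      # payload can exceed the 8 KB message cap
    work.put(ticket)             # the message is just the reference
    return ticket

def worker():
    """Consumer: dereference the ticket, do the work, clean up."""
    ticket = work.get()
    payload = blobs[ticket]
    result = payload.upper()     # placeholder for the real processing
    del blobs[ticket]            # garbage-collect the blob once consumed
    return result

submit(b"render frame 42")
assert worker() == b"RENDER FRAME 42"
```

This also answers "why not a table": queues add visibility timeouts and at-least-once dispatch semantics that a table would force you to build yourself.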
Queue Terminology
Message Lifecycle
(Diagram: a Web Role calls PutMessage to enqueue Msg 1–4; a Worker Role calls GetMessage with a visibility timeout, processes the message, then calls RemoveMessage)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll doubles the polling interval, up to a ceiling
• A successful poll sets the interval back to 1
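The truncated back-off rule above fits in a few lines. The function name and the 1 s / 60 s floor and ceiling are illustrative choices, not prescribed values:

```python
def next_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Empty poll: double the interval (truncated at `ceiling`).
    Successful poll: snap back to the minimum interval."""
    if got_message:
        return floor
    return min(ceiling, current * 2)

i = 1.0
for _ in range(8):                    # eight empty polls in a row
    i = next_interval(i, False)       # 1 -> 2 -> 4 -> ... -> 60 (truncated)
assert i == 60.0
assert next_interval(i, True) == 1.0  # success resets to the floor
```

The truncation matters: without the ceiling, an idle queue would push the interval so high that new work sits unnoticed; without the reset, a busy queue would be polled too slowly.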
Removing Poison Messages
(Diagrams: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue with a 30 s visibility timeout)
Scenario 1 – normal processing:
1. GetMessage(Q, 30 s) → msg 1 (C1)
2. GetMessage(Q, 30 s) → msg 2 (C2)
Scenario 2 – a consumer crashes:
1. GetMessage(Q, 30 s) → msg 1 (C1)
2. GetMessage(Q, 30 s) → msg 2 (C2)
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1 (C2)
Scenario 3 – a poison message keeps crashing consumers:
1. Dequeue(Q, 30 s) → msg 1 (C1)
2. Dequeue(Q, 30 s) → msg 2 (C2)
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 s) → msg 1 (C2)
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 s) → msg 1 (C1)
12. DequeueCount > 2
13. Delete(Q, msg 1)
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages can arrive out of order: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
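The dequeue-count threshold from the scenarios above can be sketched as a consumer loop. Everything here is a hypothetical in-memory model (Msg, FakeQueue, pump), not the storage client; the real service tracks DequeueCount for you.

```python
class Msg:
    def __init__(self, body):
        self.body, self.dequeue_count = body, 0

class FakeQueue:
    """Hypothetical in-memory queue that counts dequeues per message."""
    def __init__(self, bodies):
        self.msgs = [Msg(b) for b in bodies]
    def get_message(self, visibility_timeout=30):
        if not self.msgs:
            return None
        m = self.msgs[0]
        m.dequeue_count += 1
        return m
    def delete_message(self, msg):
        self.msgs.remove(msg)

MAX_DEQUEUE = 3

def pump(q, handler):
    """One iteration of a worker: drop poison messages past the threshold."""
    msg = q.get_message()
    if msg is None:
        return False
    if msg.dequeue_count > MAX_DEQUEUE:
        q.delete_message(msg)          # poison: remove (or quarantine/log)
        return True
    try:
        handler(msg)                   # handler must be idempotent
        q.delete_message(msg)
    except Exception:
        pass                           # message becomes visible again later
    return True

q = FakeQueue(["bad"])
def always_fails(m):
    raise RuntimeError("crash")
while pump(q, always_fails):
    pass
assert q.msgs == []                    # the poison message was removed
```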
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up the CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from lacking excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip; minify JavaScript, CSS, and images → compressed content)
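The compute-for-bandwidth tradeoff in steps 1–2 is easy to see with the standard gzip library; repetitive markup (the usual case for generated HTML) compresses dramatically:

```python
import gzip

# Highly repetitive HTML, typical of templated pages.
html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)

assert len(compressed) < len(html) // 10       # big reduction on repetitive text
assert gzip.decompress(compressed) == html     # lossless round trip
```

You pay a little CPU on every response and save on every byte stored and transferred, which is the direction the billing model rewards.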
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large-volume data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model:
  • Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of the visibility timeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
(Diagram: splitting task → BLAST tasks run in parallel → merging task)
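The split/join flow above can be sketched in a few lines. This is an illustration of the pattern only: blast_task is a stand-in for running NCBI-BLAST on one partition, and the 100-sequences-per-partition choice mirrors the micro-benchmark result reported on the next slide.

```python
from concurrent.futures import ThreadPoolExecutor

def split(sequences, per_partition=100):
    """Query segmentation: cut the input into fixed-size partitions."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]

def blast_task(partition):
    """Stand-in for one BLAST run over one input partition."""
    return [f"hit:{seq}" for seq in partition]

def merge(results):
    """Join: concatenate per-partition results in order."""
    return [hit for part in results for hit in part]

sequences = [f"seq{i}" for i in range(350)]
parts = split(sequences)                          # 100+100+100+50
with ThreadPoolExecutor(max_workers=4) as pool:   # workers run in parallel
    results = list(pool.map(blast_task, parts))
hits = merge(results)
assert len(parts) == 4 and len(hits) == 350
```

In AzureBLAST the pool of threads becomes a pool of worker-role instances fed from a queue, but the split → parallel map → merge shape is the same.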
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine against a job registry kept in Azure Tables; tasks are dispatched through a global dispatch queue to worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a database-updating role refreshes the NCBI databases; each job follows the splitting task → parallel BLAST tasks → merging task flow)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
(Diagram: web portal and web service → job registration → job scheduler → job registry → scaling engine)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB in size)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances occur, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working-instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group
• ~6 nodes in one group
• 35 nodes experienced blob-writing failures at the same time
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and then the job was killed
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J g⁻¹)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stomata (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)   (Penman-Monteith, 1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.
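The Penman-Monteith formula above is a single arithmetic expression per pixel; a direct sketch follows. The numeric inputs are illustrative mid-latitude daytime values, not data from the slide, and λv is taken in J/kg (2.45 MJ/kg near 20 °C) so the result comes out in kg m⁻² s⁻¹ of water.

```python
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2.45e6):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

    delta, gamma in Pa/K; Rn in W/m²; rho_a in kg/m³; c_p in J/(kg·K);
    dq in Pa; g_a, g_s in m/s; lambda_v in J/kg.
    """
    latent_heat_flux = (delta * Rn + rho_a * c_p * dq * g_a) \
                       / (delta + gamma * (1.0 + g_a / g_s))   # W/m²
    return latent_heat_flux / lambda_v                         # kg m⁻² s⁻¹

# Illustrative inputs (assumed values, not from the MODISAzure data):
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, c_p=1004.0,
                     dq=1000.0, g_a=0.02, g_s=0.01)
assert et > 0   # on the order of 1e-4 kg m⁻² s⁻¹ for these inputs
```

In the pipeline, the hard part is not this arithmetic but assembling Δ, δq, ga, and gs per tile from the imagery, sensor, and field datasets listed on the next slide.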
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage, visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: scientists submit requests via the AzureMODIS service web role portal → request queue → download queue → data collection stage (source imagery download sites) → reprojection queue → reprojection stage → reduction 1 queue → derivation reduction stage → reduction 2 queue → analysis reduction stage → scientific results download; source metadata is kept in tables)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, i.e. recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
[Diagram: <PipelineStage> Requests flow to the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and queues work to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)
• All work actually done by a Worker Role

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Role) dequeue tasks and read/write <Input> Data Storage.]

• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service

[Diagram: Reprojection Requests are parsed by the Service Monitor (Worker Role), which persists ReprojectionJobStatus (each entity specifies a single reprojection job request) via the Job Queue, and ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e. a single tile) dispatched through the Task Queue to Generic Workers. Workers query the SwathGranuleMeta table for geo-metadata (e.g. boundaries) for each swath tile, and the ScanTimeList table for the list of satellite scan times that cover a target tile, reading from Swath Source Data Storage and writing Reprojection Data Storage.]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

[Cost figures by pipeline stage, via the AzureMODIS Service Web Role Portal:
Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1420]
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Application Hosting
'Grokking' the service model
• Imagine white-boarding out your service architecture with boxes for nodes and arrows describing how they communicate
• The service model is the same diagram written down in a declarative format
• You give the Fabric the service model and the binaries that go with each of those nodes
• The Fabric can provision, deploy, and manage that diagram for you
  • Find a hardware home
  • Copy and launch your app binaries
  • Monitor your app and the hardware
  • In case of failure, take action; perhaps even relocate your app
• At all times the 'diagram' stays whole
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
  ServiceDefinition.csdef
  ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • Encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, ...
• Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage At Massive Scale
Blobs: massive files, e.g. videos, logs
Drives: use standard file system APIs
Tables: non-relational, but with few scale limits (use SQL Azure for relational data)
Queues: facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob: inserts a new blob, overwrites the existing blob
  • GetBlob: get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  e.g. http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs

Each container has an access level:
• Private (default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
  • Targeted at streaming workloads
  • Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
  • Size limit: 200 GB per blob
• Page blob
  • Targeted at random read/write workloads
  • Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
  • Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'; each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

[Diagram: blocks of Big.mpg uploaded out of order, then committed into the final blob.]
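The two-phase block upload can be sketched with a toy in-memory model (this is not the Azure REST API; the class and method names are illustrative):

```python
class ToyBlockBlob:
    """Mimics the two-phase block blob upload: stage blocks by id,
    then commit an ordered block list to materialize the blob."""

    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; GC'd if never committed
        self.blob = b""

    def put_block(self, block_id, data):
        # Blocks may arrive in any order, e.g. parallel uploads.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # The commit names the blocks in their final order.
        self.blob = b"".join(self.uncommitted[b] for b in block_ids)

b = ToyBlockBlob()
b.put_block("b2", b"world")      # uploaded out of order
b.put_block("b1", b"hello ")
b.put_block_list(["b1", "b2"])   # committed in the intended order
```

The same shape underlies the real service: uncommitted blocks are invisible until the block list is committed, which is what makes parallel, out-of-order uploads safe.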
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a BLOB
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • Drive made durable through standard Page Blob replication
  • Drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure

[Diagram: a storage Account (MovieData) contains Tables; the table "Movies" holds Star Wars, Star Trek, Fan Boys, and the table "Customers" holds Brian H. Prince, Jason Argonaut, Bill Gates. Each Table contains Entities.]

Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST, with any platform or language

Is not relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance

Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• Controls entity locality

Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern

System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server

Server Busy
• Use exponential backoff on "Server Busy"
• Our system load balances to meet your traffic needs
• Means single-partition limits have been reached
Partition Keys In Each Abstraction
• Entities: TableName + PartitionKey; entities with the same PartitionKey value are served from the same partition

  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order-1               |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order-3               |              |                     | $10.00

• Blobs: Container name + Blob name; every blob and its snapshots are in a single partition

  Container Name | Blob Name
  image          | annarbor/bighouse.jpg
  image          | foxborough/gillette.jpg
  video          | annarbor/bighouse.jpg

• Messages: Queue Name; all messages for a single queue belong to the same partition

  Queue    | Message
  jobs     | Message 1
  jobs     | Message 2
  workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync

[Diagram: partitions P1, P2, ..., Pn each replicated across Server 1, Server 2, and Server 3.]
Scalability Targets
Storage Account
• Capacity: up to 100 TBs
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second

Single Queue/Table Partition
• Up to 500 transactions per second

Single Blob Partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
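The recommended reaction to '503 Server Busy' can be sketched as a small retry helper (a toy: `ServerBusyError` stands in for the real 503 response, and the delays are kept short for illustration):

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for a '503 Server Busy' response from storage."""

def with_backoff(op, max_retries=6, base=0.01):
    """Retry op() with truncated exponential backoff plus jitter,
    as suggested when single-partition limits are hit."""
    for attempt in range(max_retries):
        try:
            return op()
        except ServerBusyError:
            # Delay doubles each attempt; jitter avoids retry stampedes.
            time.sleep(base * (2 ** attempt) + random.uniform(0, base))
    raise RuntimeError("gave up after %d retries" % max_retries)

calls = {"n": 0}
def flaky():
    # Fails twice with 'server busy', then succeeds.
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusyError()
    return "ok"

result = with_backoff(flaky)
```

The growing delay gives the system time to load balance the hot partition onto another server before the client hammers it again.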
[Example Movies table, shown whole and split across partition ranges:

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | ...       | 2009
Action                  | The Bourne Ultimatum     | ...       | 2007
...                     | ...                      | ...       | ...
Animation               | Open Season 2            | ...       | 2009
Animation               | The Ant Bully            | ...       | 2006

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | ...       | 1999
...                     | ...                      | ...       | ...
SciFi                   | X-Men Origins: Wolverine | ...       | 2009
...                     | ...                      | ...       | ...
War                     | Defiance                 | ...       | 2008

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | ...       | 2009
Action                  | The Bourne Ultimatum     | ...       | 2007
...                     | ...                      | ...       | ...
Animation               | Open Season 2            | ...       | 2009
Animation               | The Ant Bully            | ...       | 2006
...                     | ...                      | ...       | ...
Comedy                  | Office Space             | ...       | 1999
...                     | ...                      | ...       | ...
SciFi                   | X-Men Origins: Wolverine | ...       | 2009
...                     | ...                      | ...       | ...
War                     | Defiance                 | ...       | 2008]
Partitions and Partition Ranges
• Initially: Server A serves Table = Movies [Min - Max]
• After load balancing: Server A serves Table = Movies [Min - Comedy); Server B serves Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query response may stop short, returning a continuation token, for any of these reasons:
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
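A client must therefore loop until no token comes back. The drain loop can be sketched against a toy paginated query (the Azure client library exposes the token differently; `fake_query` only simulates the 1000-row limit):

```python
def query_all(query_page):
    """Drain a paginated query: keep issuing requests while the
    server returns a continuation token."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Toy server: 2500 rows, at most 1000 per response, offset as the token.
DATA = list(range(2500))

def fake_query(token):
    start = token or 0
    page = DATA[start:start + 1000]
    nxt = start + 1000 if start + 1000 < len(DATA) else None
    return page, nxt

result = query_all(fake_query)
```

Stopping after the first response would silently drop rows, which is why the slide says "seriously": the token can appear even for queries that "should" fit in one response, e.g. at a partition boundary.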
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

• Select PartitionKey and RowKey that help scale: distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "Server busy" means partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits

WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
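The work ticket pattern mentioned above sidesteps the 8 KB message limit: the payload goes to blob storage and only a small reference travels through the queue. A toy sketch (dicts and lists stand in for the blob and queue services):

```python
import uuid

blob_store = {}   # stand-in for blob storage
queue = []        # stand-in for an Azure queue

def enqueue_work(payload: bytes):
    """Store the large payload as a blob; enqueue only its name
    (the 'work ticket'), which easily fits in 8 KB."""
    blob_name = str(uuid.uuid4())
    blob_store[blob_name] = payload
    queue.append(blob_name)

def process_next():
    ticket = queue.pop(0)
    data = blob_store.pop(ticket)   # also garbage-collects the blob
    return len(data)

enqueue_work(b"x" * 100_000)        # far larger than 8 KB
n = process_next()
```

In a real system the consumer would delete the blob only after the message is successfully processed, and orphaned blobs would be garbage collected separately (as the Queues Recap slide advises).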
Queue Terminology

Message Lifecycle
[Diagram: a Web Role calls PutMessage to add messages (Msg 1..4) to the Queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve messages and RemoveMessage to delete them once processed.]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
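That polling rule fits in a few lines (the cap value and function name are illustrative):

```python
def next_poll_interval(current, got_message, cap=60.0):
    """Truncated exponential back-off polling: double the interval
    on an empty poll (up to cap), reset to 1 on success."""
    if got_message:
        return 1.0
    return min(current * 2, cap)

# Simulate a consumer: three empty polls, one hit, one empty poll.
intervals = []
cur = 1.0
for hit in [False, False, False, True, False]:
    cur = next_poll_interval(cur, hit)
    intervals.append(cur)
# intervals grows 2, 4, 8, snaps back to 1 on the hit, then 2 again
```

The truncation (the cap) keeps a long-idle worker from backing off so far that it reacts sluggishly when work finally arrives.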
Removing Poison Messages

[Diagram: producers P1, P2 and consumers C1, C2 on a queue; each message carries a dequeue count.]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 becomes visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
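The final two steps are the poison-message guard: the dequeue count exposes messages that keep crashing their consumers. A minimal sketch of that check (the threshold and the "deleted" outcome are illustrative; a real system might divert the message to a dead-letter store instead):

```python
import collections

Message = collections.namedtuple("Message", "body dequeue_count")

def handle(msg, process, max_dequeue=2):
    """Poison-message guard: if a message has already been dequeued
    more than max_dequeue times, delete it instead of reprocessing."""
    if msg.dequeue_count > max_dequeue:
        return "deleted"
    process(msg.body)
    return "processed"

seen = []
ok = handle(Message("good", 1), seen.append)       # normal message
poison = handle(Message("bad", 3), seen.append)    # crashed consumers twice+
```

Without this threshold, a message whose processing reliably crashes the worker would circulate forever, eating a 30-second visibility timeout on every pass.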
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages

Easy to use via the Storage Client Library

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
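The deck's example is the .NET Task Parallel Library; the data-parallel shape it describes looks the same in any language. A Python analogue with a thread pool (the `score` function is a made-up stand-in for real per-item work):

```python
from concurrent.futures import ThreadPoolExecutor

def score(seq):
    # Stand-in for one unit of work, e.g. processing one query segment.
    return sum(ord(c) for c in seq)

sequences = ["ACGT", "GGCC", "TTAA", "ACGC"]

# Data parallelism: the same operation applied across input partitions,
# with the pool sizing the concurrency to the available workers.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(score, sequences))
```

`pool.map` preserves input order in its results, which makes the later "merge" step of a split/join pipeline trivial.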
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between performance and cost: the risk of failure or poor user experience from not having excess capacity, vs. the cost of idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

[Diagram: uncompressed content vs. content compressed via Gzip, minified JavaScript, minified CSS, and minified images.]
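The payoff of point 1 is easy to demonstrate; repetitive markup compresses extremely well (the HTML snippet below is invented for illustration):

```python
import gzip

# A page dominated by repeated markup, as most HTML output is.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)   # typically a few percent here

# The round trip is lossless: the browser decompresses on the fly.
restored = gzip.decompress(compressed)
```

Fewer bytes on the wire means lower bandwidth charges, and often lower storage charges too if the compressed form is what gets cached.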
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 ~ 1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result reduction processing

Large volume data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple Split/Join pattern: a splitting task fans out into BLAST tasks, and a merging task joins their results.

Leverage multi-core on one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI-BLAST startup overhead, data transfer overhead)
• Best practice: test runs to profile, and set the size to mitigate the overhead

Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
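The split/join skeleton above can be sketched in a few lines (`blast_stub` is a made-up stand-in for running NCBI-BLAST over one partition; the partition size of 100 echoes the micro-benchmark result):

```python
def split(sequences, partition_size=100):
    """Query segmentation: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_stub(partition):
    # Stand-in for one BLAST task over one partition of queries.
    return [f"hit:{s}" for s in partition]

def merge(partial_results):
    """Join step: concatenate per-partition results when all are done."""
    return [r for part in partial_results for r in part]

seqs = [f"seq{i}" for i in range(250)]
partitions = split(seqs)                           # 3 partitions: 100, 100, 50
merged = merge(blast_stub(p) for p in partitions)  # in the cloud, each
                                                   # partition is a queued task
```

In AzureBLAST the middle step runs on worker instances pulled from a queue rather than in-process, but the split and merge boundaries are exactly these.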
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability

Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST

[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine, backed by an Azure Table holding the Job Registry and NCBI databases; a global dispatch queue feeds the Worker instances; a Database updating Role refreshes the Azure Blob storage holding the BLAST databases, temporary data, etc. Tasks follow the split/join flow: a splitting task fans out into BLAST tasks, and a merging task joins the results.]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs

Authentication/authorization based on Live ID

The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time...
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists.

Our Approach
• Allocated a total of ~4000 cores
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually

[Map: VM counts per deployment across the datacenters.]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6~8 days
• Look into the log data to analyze what took place...
Understanding Azure by analyzing logs

A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
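The "something is wrong" check amounts to pairing start records with done records; a sketch over the abnormal log excerpt above:

```python
import re

LOG = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

# Tasks that started vs. tasks that reported completion.
started = set(re.findall(r"Executing the task (\d+)", LOG))
finished = set(re.findall(r"Execution of task (\d+) is done", LOG))

# Anything started but never finished was lost, e.g. to an instance failure.
never_finished = sorted(started - finished)
```

Run over the full experiment logs, this kind of pairing is what surfaced the update-domain and fault-domain events on the next slides.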
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total

[Chart: all 62 compute nodes lost tasks and then came back in groups of ~6 nodes, each outage lasting ~30 mins; this is an update domain.]
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky:
- Lots of inputs: big data reduction
- Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.
ET Synthesizes Imagery, Sensors, Models and Field Data
- NASA MODIS imagery source archives: 5 TB (600K files)
- FLUXNET curated sensor dataset: 30 GB (960 files)
- FLUXNET curated field dataset: 2 KB (1 file)
- NCEP/NCAR: ~100 MB (4K files)
- Vegetative clumping: ~5 MB (1 file)
- Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four Stage Image Processing Pipeline
Data collection (map) stage
- Downloads requested input tiles from NASA ftp sites
- Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
- Converts source tile(s) to intermediate result sinusoidal tiles
- Simple nearest neighbor or spline algorithms
Derivation reduction stage
- First stage visible to the scientist
- Computes ET in our initial use
Analysis reduction stage
- Optional second stage visible to the scientist
- Enables production of science analysis artifacts such as maps, tables, virtual sensors
[Architecture diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal via a Request Queue; a Download Queue feeds the Data Collection Stage from the Source Imagery Download Sites (using Source Metadata); a Reprojection Queue feeds the Reprojection Stage; Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; scientists download the scientific results.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
- The ModisAzure Service is the Web Role front door
  - Receives all user requests
  - Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
- The Service Monitor is a dedicated Worker Role
  - Parses all job requests into tasks – recoverable units of work
  - Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage> JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage> TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)
- All work is actually done by a Worker Role
  - Dequeues tasks created by the Service Monitor
  - Retries failed tasks 3 times
  - Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage> TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue tasks and read <Input> Data Storage.]
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request reaches the Service Monitor (Worker Role), which persists ReprojectionJobStatus via the Job Queue, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Roles) consume the tasks, reading from Swath Source Data Storage and writing to Reprojection Data Storage.]
- Each ReprojectionJobStatus entity specifies a single reprojection job request
- Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile)
- Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
- Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
- Computational costs driven by data scale and the need to run reductions multiple times
- Storage costs driven by data scale and the 6-month project duration
- Small with respect to the people costs, even at graduate student rates
[Annotated pipeline diagram, with per-stage figures:]
Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers: $420 cpu, $60 download
Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers: $216 cpu, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers: $216 cpu, $2 download, $9 storage
Total: $1420
Observations and Experience
- Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
- Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
- Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
- They provide valuable fault tolerance and scalability abstractions
- Clouds act as an amplifier for familiar client tools and on-premise compute
- Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
- Getting started steps for developers
- Available research services
- Use cases on Azure for research
- Event announcements
- Detailed tutorials
- Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
- Simple benchmarks illustrating basic performance for compute and storage services
- Benchmarks for reference algorithms
- Best practice tips
- Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
'Grokking' the service model
- Imagine white-boarding out your service architecture, with boxes for nodes and arrows describing how they communicate
- The service model is the same diagram, written down in a declarative format
- You give the Fabric the service model and the binaries that go with each of those nodes
- The Fabric can provision, deploy, and manage that diagram for you:
  - Find a hardware home
  - Copy and launch your app binaries
  - Monitor your app and the hardware
  - In case of failure, take action; perhaps even relocate your app
  - At all times, the 'diagram' stays whole
Automated Service Management
Provide code + service model
- Platform identifies and allocates resources, deploys the service, manages service health
- Configuration is handled by two files:
  - ServiceDefinition.csdef
  - ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
- We can deploy from the portal or from script
- VS builds two files:
  - An encrypted package of your code
  - Your config file
- You must create an Azure account, then a service, and then you deploy your code
- Deployment can take up to 20 minutes (which is better than six months)
Service Management API
- REST-based API to manage your services
- X509 certs for authentication
- Lets you create, delete, change, upgrade, swap, …
- Lots of community and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   - Determine resource requirements
   - Create role images
2. Allocate resources
3. Prepare nodes
   - Place role images on nodes
   - Configure settings
   - Start roles
4. Configure load balancers
5. Maintain service health
   - If a role fails, restart the role based on policy
   - If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable storage, at massive scale:
- Blob: massive files, e.g., videos, logs
- Drive: use standard file system APIs
- Tables: non-relational, but with few scale limits (use SQL Azure for relational data)
- Queues: facilitate loosely-coupled, reliable systems
Blob Features and Functions
- Store large objects (up to 1 TB in size)
- You can have as many containers and blobs as you want
- Standard REST interface:
  - PutBlob: inserts a new blob, overwrites the existing blob
  - GetBlob: get the whole blob or a specific range
  - DeleteBlob
  - CopyBlob
  - SnapshotBlob
  - LeaseBlob
- Each blob has an address:
  http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  e.g., http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
- Similar to a top-level folder
- Has an unlimited capacity
- Can only contain blobs
Each container has an access level:
- Private (the default): requires the account key to access
- Full public read
- Public read only
Two Types of Blobs Under the Hood
Block blob
- Targeted at streaming workloads
- Each blob consists of a sequence of blocks
- Each block is identified by a Block ID
- Size limit: 200 GB per blob
Page blob
- Targeted at random read/write workloads
- Each blob consists of an array of pages
- Each page is identified by its offset from the start of the blob
- Size limit: 1 TB per blob
Blocks
- You can upload a file in 'blocks'; each block has an id
- Then commit those blocks, in any order, into a blob
- Final blob limited to 1 TB and up to 50,000 blocks
- Can modify a blob by inserting, updating, and removing blocks
- Blocks live for a week before being GC'd if not committed to a blob
- Optimized for streaming
[Diagram: big.mpg uploaded as blocks 1-8, out of order, then committed as big.mpg]
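The block-commit discipline above can be modeled in a few lines. This is a toy in-memory illustration of the semantics (staged blocks, then a commit list that fixes the order), not the Azure storage API; the class and method names are invented.

```python
# Toy model of a block blob: blocks are uploaded in any order under an id,
# and the commit list determines the final blob layout. Uncommitted blocks
# would eventually be garbage-collected in the real service.

class BlockBlob:
    def __init__(self):
        self.uncommitted = {}   # block id -> bytes, staged but not yet visible
        self.committed = []     # ordered list of (block id, bytes)

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Commit blocks in the order given, regardless of upload order.
        self.committed = [(bid, self.uncommitted[bid]) for bid in block_ids]
        self.uncommitted.clear()

    def content(self):
        return b"".join(data for _, data in self.committed)

blob = BlockBlob()
for bid, chunk in [("b2", b"world"), ("b1", b"hello ")]:  # uploaded out of order
    blob.put_block(bid, chunk)
blob.put_block_list(["b1", "b2"])  # the commit order defines the blob
```

The same mechanism is what allows parallel uploads: workers can push blocks concurrently, and a single commit at the end assembles them.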
Pages
- Similar to block blobs
- Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
- Call Put Blob to set the max size, then call Put Page
- All pages must align to 512-byte page boundaries
- Writes to page blobs happen in-place and are immediately committed to the blob
- The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
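A minimal sketch of those page-blob rules: the maximum size is declared up front, and writes land in place at 512-byte-aligned offsets. This is a local illustration of the semantics only; the real service exposes them via the Put Blob / Put Page REST operations.

```python
# Sketch of page-blob semantics: fixed declared size, in-place writes at
# 512-byte-aligned offsets. Illustrative only, not the Azure API.

PAGE = 512

class PageBlob:
    def __init__(self, max_size):
        if max_size % PAGE:
            raise ValueError("max size must be a multiple of 512 bytes")
        self.data = bytearray(max_size)  # sparse in the real service

    def put_page(self, offset, payload):
        # Both the offset and the payload length must align to page boundaries.
        if offset % PAGE or len(payload) % PAGE:
            raise ValueError("writes must align to 512-byte page boundaries")
        self.data[offset:offset + len(payload)] = payload  # in-place, committed

blob = PageBlob(4 * PAGE)
blob.put_page(2 * PAGE, b"x" * PAGE)  # random-access write to the third page
```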
BLOB Leases
- Creates a 1-minute exclusive write lock on a blob
- Operations: Acquire, Renew, Release, Break
- Must have the lease id to perform operations
- Can check the LeaseStatus property
- Currently can only be done through REST
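The lease discipline (a one-minute exclusive write lock identified by a lease id) can be illustrated locally. This is a sketch of the rules, not the REST protocol; the class, error strings, and `now` parameter (used to make time explicit) are invented for the example.

```python
# Local illustration of blob lease rules: while a lease is active, writes must
# present the current lease id; after expiry, writes succeed again.
import time
import uuid

LEASE_SECONDS = 60

class LeasedBlob:
    def __init__(self):
        self.lease_id = None
        self.lease_expires = 0.0

    def _lease_active(self, now):
        return self.lease_id is not None and now < self.lease_expires

    def acquire(self, now=None):
        now = time.time() if now is None else now
        if self._lease_active(now):
            raise RuntimeError("409: lease already present")
        self.lease_id = str(uuid.uuid4())
        self.lease_expires = now + LEASE_SECONDS
        return self.lease_id

    def write(self, lease_id, now=None):
        now = time.time() if now is None else now
        if self._lease_active(now) and lease_id != self.lease_id:
            raise RuntimeError("412: lease id mismatch")
        return "ok"

blob = LeasedBlob()
lid = blob.acquire(now=0.0)
```

Renew would push `lease_expires` forward; Release would clear `lease_id`; Break would clear it without requiring the id.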
Windows Azure Drive
- Provides a durable NTFS volume for Windows Azure applications to use
  - Use existing NTFS APIs to access a durable drive
  - Durability and survival of data on application failover
  - Enables migrating existing NTFS applications to the cloud
- A Windows Azure Drive is a Page Blob
  - Example: mount a Page Blob as X:\
    http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  - All writes to the drive are made durable to the Page Blob
  - The drive is made durable through standard Page Blob replication
  - The drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
- Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
- Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
- Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
- Get Mounted Drives: returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
- Unmount Drive: unmounts the drive and frees up the drive letter
- Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
- Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
- Manage connection strings/keys in .cscfg
- Do not share keys; wrap them with a service
- Have a strategy for accounts and containers
- You can assign a custom domain to your storage account
- There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
- Table Name: Movies (entities: Star Wars, Star Trek, Fan Boys)
- Table Name: Customers (entities: Brian H. Prince, Jason Argonaut, Bill Gates)
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
- Provides structured storage
- Massively scalable tables
  - Billions of entities (rows) and TBs of data
  - Can use thousands of servers as traffic grows
- Highly available and durable
  - Data is replicated several times
- Familiar and easy-to-use API
  - WCF Data Services and OData
  - .NET classes and LINQ
  - REST, with any platform or language
Is not relational
Cannot:
- Create foreign key relationships between tables
- Perform server-side joins between tables
- Create custom indexes on the tables
- No server-side Count(), for example
All entities must have the following properties:
- Timestamp
- PartitionKey
- RowKey
Windows Azure Queues
- Queues are performance efficient, highly available, and provide reliable message delivery
- Simple, asynchronous work dispatch
- Programming semantics ensure that a message can be processed at least once
- Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
- Different for each data type (blobs, entities, queues)
- Controls entity locality
The partition key is the unit of scale
- A partition can be served by a single server
- The system load balances partitions based on traffic pattern
The system load balances
- Load balancing can take a few minutes to kick in
- It can take a couple of seconds for a partition to become available on a different server
On "Server Busy", use exponential backoff; either the system is load balancing to meet your traffic needs, or single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
- Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
- Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue name
- All messages for a single queue belong to the same partition

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
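The three partitioning rules above can be written down directly. This is an illustrative sketch (invented function names and sample data, mirroring the tables on this slide), useful for reasoning about which objects can land on the same server.

```python
# Sketch of how each abstraction derives its partition key, per the slide:
# entities by table name + PartitionKey, blobs by container + blob name,
# queue messages by queue name alone.

def entity_partition(table, entity):
    return (table, entity["PartitionKey"])

def blob_partition(container, blob_name):
    return (container, blob_name)  # a blob and its snapshots share a partition

def message_partition(queue_name):
    return (queue_name,)           # all messages in one queue share a partition

customers = [
    {"PartitionKey": "1", "RowKey": "Customer-John Smith"},
    {"PartitionKey": "1", "RowKey": "Order-1"},
    {"PartitionKey": "2", "RowKey": "Customer-Bill Johnson"},
]
partitions = {entity_partition("Customers", e) for e in customers}
```

Note the consequence for queues: a single queue is a single partition, which is why per-queue throughput is capped and busy systems spread work across multiple queues.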
Replication Guarantee
- All Azure Storage data exists in three replicas
- Replicas are created as needed
- A write operation is not complete until it has been written to all three replicas
- Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage account
- Capacity: up to 100 TBs
- Transactions: up to a few thousand requests per second
- Bandwidth: up to a few hundred megabytes per second
Single queue/table partition
- Up to 500 transactions per second
Single blob partition
- Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
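The recommended client reaction to '503 Server Busy' is retry with exponential backoff. A minimal sketch, with an invented `with_backoff` helper and a simulated server standing in for the storage service (delay values are illustrative):

```python
# Retry on 503 with exponentially growing delays. Illustrative only; a real
# client would sleep between attempts instead of just recording the delays.

def with_backoff(request, max_attempts=5, base_delay=0.5):
    """Retry `request` on 503, doubling the delay each time."""
    delays = []
    for attempt in range(max_attempts):
        status, body = request()
        if status != 503:
            return body, delays
        delays.append(base_delay * (2 ** attempt))  # 0.5, 1, 2, 4, ... seconds
        # time.sleep(delays[-1]) would go here in a real client
    raise RuntimeError("server still busy after retries")

# Simulated server: busy twice, then succeeds.
responses = iter([(503, None), (503, None), (200, "entity data")])
body, delays = with_backoff(lambda: next(responses))
```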
Partitions and Partition Ranges
A table is partitioned by ranges of PartitionKey. Initially one server holds the full range:

Server A: Table = Movies [Min - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

Under load, the system splits the range across servers:
Server A: Table = Movies [Min - Comedy), holding the Action and Animation rows
Server B: Table = Movies [Comedy - Max], holding the Comedy, SciFi, and War rows
Key Selection: Things to Consider
Scalability
- Distribute load as much as possible
- Hot partitions can be load balanced
- PartitionKey is critical for scalability
Query efficiency and speed
- Avoid frequent large scans
- Parallelize queries
- Point queries are most efficient
Entity group transactions
- Transactions across a single partition
- Transaction semantics, and reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A response can stop short and return a continuation token when:
- The maximum of 1000 rows in a response is reached
- The query hits the end of a partition range boundary
- The maximum of 5 seconds to execute the query elapses
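Because of those limits, every table query has to be written as a loop that keeps following the continuation token until none is returned. A local stand-in for the service makes the pattern concrete (the `query`/`query_all` names and the integer token are invented for illustration; the real token is opaque):

```python
# Paging loop over a simulated table service that returns at most 1000 rows
# per response plus a continuation token (None when the result set is done).

ROWS = [f"row-{i}" for i in range(2500)]
PAGE_LIMIT = 1000

def query(continuation=0):
    """Return up to 1000 rows and the next continuation token."""
    page = ROWS[continuation:continuation + PAGE_LIMIT]
    nxt = continuation + len(page)
    return page, (nxt if nxt < len(ROWS) else None)

def query_all():
    results, token = [], 0
    while token is not None:          # always handle continuation tokens
        page, token = query(token)
        results.extend(page)
    return results

all_rows = query_all()
```

Code that ignores the token silently processes only the first 1000 rows, which is the bug this slide is warning about.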
Tables Recap
- Efficient for frequently used queries
- Supports batch transactions
- Distributes load
- Select a PartitionKey and RowKey that help scale
  - Distribute by using a hash, etc. as a prefix
- Avoid "append only" patterns
- Always handle continuation tokens
  - Expect continuation tokens for range queries
- "OR" predicates are not optimized
  - Execute the queries that form the "OR" predicates as separate queries
- Implement a back-off strategy for retries
  - Server busy: either the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
- Use a new context for each logical operation
- AddObject/AttachTo can throw an exception if the entity is already being tracked
- A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
- Want roles that work closely together, but are not bound together
  - Tight coupling leads to brittleness
  - Decoupling can aid in scaling and performance
- A queue can hold an unlimited number of messages
  - Messages must be serializable as XML
  - Limited to 8 KB in size
  - Commonly use the work ticket pattern
- Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1…Msg 4 to the Queue; Worker Roles call GetMessage (with a visibility timeout) to dequeue messages, and RemoveMessage to delete each one once processed.]
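The lifecycle in the diagram can be modeled locally: GetMessage hides a message for the visibility timeout rather than deleting it, and only an explicit RemoveMessage completes the cycle; if the consumer crashes, the message reappears. This is an illustrative sketch with invented names, not the queue API.

```python
# In-memory model of the GetMessage / RemoveMessage lifecycle with a
# visibility timeout. Time is passed explicitly to keep the sketch deterministic.

class Queue:
    def __init__(self):
        self.messages = []                # entries of [visible_at, body]

    def put(self, body):
        self.messages.append([0.0, body])

    def get(self, now, timeout=30.0):
        for entry in self.messages:
            if entry[0] <= now:           # message is visible
                entry[0] = now + timeout  # hide it for the visibility timeout
                return entry
        return None

    def remove(self, entry):
        self.messages.remove(entry)       # explicit delete completes the cycle

q = Queue()
q.put("msg 1")
q.put("msg 2")
first = q.get(now=0.0)        # worker A dequeues msg 1, hidden until t=30
second = q.get(now=0.0)       # worker B dequeues msg 2
q.remove(second)              # worker B finishes; msg 2 is gone for good
reappeared = q.get(now=31.0)  # worker A crashed; msg 1 is visible again
```

This is exactly the "at least once" guarantee mentioned above: a crashed consumer's message is redelivered, so processing must be idempotent.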
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
- Consider a backoff polling approach: each empty poll increases the polling interval by 2x
- A successful poll resets the interval back to 1
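That polling discipline is a few lines of code. The sketch below is illustrative (invented names; interval values chosen arbitrarily): each empty poll doubles the sleep interval up to a cap (hence "truncated"), and a successful poll resets it to the minimum.

```python
# Truncated exponential backoff for queue polling: double on empty polls,
# cap at a maximum, reset on success.

MIN_INTERVAL = 1.0
MAX_INTERVAL = 60.0

def next_interval(current, got_message):
    if got_message:
        return MIN_INTERVAL                   # reset on a successful poll
    return min(current * 2, MAX_INTERVAL)     # double, but truncate at the cap

# Simulate a run: five empty polls, then a message, then one more empty poll.
intervals, current = [], MIN_INTERVAL
for got in [False, False, False, False, False, True, False]:
    current = next_interval(current, got)
    intervals.append(current)
```

The point is cost: polling an empty queue is a billable storage transaction, so idle workers should poll less and less often.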
Removing Poison Messages
[Sequence diagrams: producers P1, P2 and consumers C1, C2 sharing one queue]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2
13. C1: Delete(Q, msg 1)
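The recovery sequence above hinges on the per-message dequeue count: a message that keeps reappearing because it crashes its consumers is eventually deleted instead of retried. A local sketch of that consumer-side rule (invented names; the threshold of 2 matches step 12 above):

```python
# Poison-message handling: track the dequeue count and give up past a threshold.

POISON_THRESHOLD = 2

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def process(msg, handler, dead_letters):
    msg.dequeue_count += 1
    if msg.dequeue_count > POISON_THRESHOLD:
        dead_letters.append(msg.body)   # remove the poison message for good
        return "deleted"
    try:
        handler(msg.body)
        return "done"
    except Exception:
        return "will retry"             # message becomes visible again later

def crashing_handler(body):
    raise RuntimeError("simulated consumer crash")

dead = []
msg = Message("msg 1")
outcomes = [process(msg, crashing_handler, dead) for _ in range(3)]
```

Parking the failed body in a dead-letter store (here just a list) keeps it available for offline diagnosis instead of silently discarding it.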
Queues Recap
- Make message processing idempotent: then there is no need to deal with failures
- Do not rely on order: invisible messages result in out-of-order delivery
- Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
- For messages > 8 KB: use a blob to store the message data, with a reference in the message
  - Batch messages
  - Garbage collect orphaned blobs
- Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
- Blobs: files and large objects
- Drives: NTFS APIs for migrating applications
- Tables: massively scalable structured storage
- Queues: reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
- Having the correct VM size can make a big difference in costs
- The fundamental choice: larger, fewer VMs vs. many smaller instances
- If you scale better than linearly across cores, larger VMs could save you money
- It is pretty rare to see linear scaling across 8 cores
- More instances may provide better uptime and reliability (more failures are needed to take your service down)
- The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
- 1 role instance == 1 VM running Windows
- 1 role instance != one specific task for your code
- You're paying for the entire VM, so why not use it?
- Common mistake: splitting code into multiple roles, each not using up its CPU
- Balance using up the CPU vs. having free capacity in times of need
- There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
- Spin up additional processes, each with a specific task or as a unit of concurrency
  - May not be ideal if the number of active processes exceeds the number of cores
- Use multithreading aggressively
  - In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  - In .NET 4, use the Task Parallel Library
    - Data parallelism
    - Task parallelism
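The deck's examples are .NET, but the same pool-based data-parallelism pattern translates directly: size a worker pool to the machine rather than dedicating a whole role instance to each task. An illustrative Python sketch (the `handle` function is a stand-in for a real unit of work; CPU-bound work would use a process pool instead of threads):

```python
# Data parallelism with a fixed-size worker pool, sized to the instance.
from concurrent.futures import ThreadPoolExecutor

def handle(item):
    # Stand-in for a unit of work (e.g., converting one tile or one request).
    return item * item

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(handle, range(8)))  # order-preserving map
```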
Finding Good Code Neighbors
- Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
- Find code that is intensive with different resources to live together
- Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
- Monitor your application and make sure you're scaled appropriately (not over-scaled)
- Spinning VMs up and down automatically is good at large scale
- Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
- Being too aggressive in spinning down VMs can result in poor user experience
- Trade off the risk of failure / poor user experience from not having excess capacity against the cost of idling VMs (performance vs. cost)
Storage Costs
- Understand your application's storage profile and how storage billing works
- Make service choices based on your app profile
  - E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  - The service choice can make a big cost difference based on your app profile
- Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
- Bandwidth costs are a huge part of any popular web app's billing profile
- Sending fewer things over the wire often means getting fewer things from storage
- Saving bandwidth costs often leads to savings in other places
- Sending fewer things means your VM has time to do other tasks
- All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   - All modern browsers can decompress on the fly
   - Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   - Use Portable Network Graphics (PNGs)
   - Crush your PNGs
   - Strip needless metadata
   - Make all PNGs palette PNGs
[Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content]
Best Practices Summary
- Doing 'less' is the key to saving costs
- Measure everything
- Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
- The most important software in bioinformatics
- Identifies similarity between bio-sequences
Computationally intensive
- Large number of pairwise alignment operations
- A BLAST run can take 700 ~ 1000 CPU hours
- Sequence databases are growing exponentially
- GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
- Segment the input
  - Segment processing (querying) is pleasingly parallel
- Segment the database (e.g., mpiBLAST)
  - Needs special result-reduction processing
Large volume of data:
- A normal BLAST database can be as large as 10 GB
- With 100 nodes, the peak storage bandwidth demand could reach 1 TB
- The output of BLAST is usually 10-100x larger than the input
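The input-segmentation approach can be shown in miniature: split the query sequences, "query" each partition independently, and merge the per-partition results. The scoring function below is a trivial stand-in for running NCBI BLAST, and all names and data are invented for illustration.

```python
# Query-segmentation split/join in miniature: each partition is an independent,
# pleasingly parallel unit of work, and results are concatenated at the end.

def split(sequences, partition_size):
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition, database):
    # Stand-in for running BLAST on one partition against the database:
    # here we just count exact matches instead of computing alignments.
    return [(seq, sum(1 for d in database if d == seq)) for seq in partition]

def merge(partial_results):
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged

database = ["ATG", "GGC", "ATG"]
queries = ["ATG", "TTT", "GGC", "ATG"]
partitions = split(queries, 2)
hits = merge(blast_partition(p, database) for p in partitions)
```

In AzureBLAST each partition becomes a work-ticket message on a queue, and the merge is a final reduction task.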
AzureBLAST
- A parallel BLAST engine on Azure
- Query-segmentation, data-parallel pattern:
  - Split the input sequences
  - Query the partitions in parallel
  - Merge the results together when done
- Follows the general suggested application model: Web Role + Queue + Worker
- With three special considerations:
  - Batch job management
  - Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out to many BLAST tasks, whose outputs feed a merging task.
Leverage the multiple cores of one instance
- Argument "-a" of NCBI-BLAST
- 1, 2, 4, 8 for the small, medium, large, and extra-large instance sizes
Task granularity
- Too large a partition: load imbalance
- Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data transfer overhead)
- Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of the visibilityTimeout for each BLAST task
- Essentially an estimate of the task run time
- Too small: repeated computation
- Too large: an unnecessarily long wait in case of an instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
- Benefit of the warm cache effect
- 100 sequences per partition is the best choice
Instance size vs. performance
- Super-linear speedup with larger worker instances
- Primarily due to the memory capacity
Task size / instance size vs. cost
- The extra-large instance generated the best and most economical throughput
- Fully utilizes the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal and Web Service, which handle job registration; a Job Management Role runs the Job Scheduler and Scaling Engine and records jobs in the Job Registry (Azure Table); a global dispatch queue feeds the Worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc., maintained by a database updating Role. A splitting task fans out to parallel BLAST tasks, which feed a merging task.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
- Submit jobs
- Track a job's status and logs
- Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
- Fault tolerance: avoid in-memory states
[Diagram: the Job Portal's Web Portal / Web Service performs job registration into the Job Registry, feeding the Job Scheduler and Scaling Engine]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB size)
• In total, 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
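The headline estimate is a one-line unit conversion; as a check (numbers from the slide, conversion only):

```python
# 3,216,731 single-desktop minutes converted to years.
minutes = 3_216_731
years = minutes / (60 * 24 * 365)   # about 6.1 years on one desktop
```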
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
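The anomaly-hunting step amounts to pairing "Executing" records with "done" records; a minimal Python sketch (the parsing is mine, on an abbreviated form of the slide's log format):

```python
def find_incomplete(log_lines):
    """Return task ids that have an 'Executing' record but no 'done' record."""
    started, done = set(), set()
    for line in log_lines:
        parts = line.split()
        if "Executing the task" in line:
            started.add(parts[-1])
        elif "Execution of task" in line:
            done.add(parts[parts.index("task") + 1])
    return started - done

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done",
]
missing = find_incomplete(log)   # task 251774 never completed
```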
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group: this is an update domain
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed and the job was killed
35 nodes experienced blob writing failure at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2)
Δ = Rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = Latent heat of vaporization (J g-1)
Rn = Net radiation (W m-2)
cp = Specific heat capacity of air (J kg-1 K-1)
ρa = Dry air density (kg m-3)
δq = Vapor pressure deficit (Pa)
ga = Conductivity of air (inverse of ra) (m s-1)
gs = Conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = Psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.
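Transcribed into code, the combination equation above is a single expression; the sample inputs below are made-up round numbers chosen only to exercise the formula (they are not field data, and unit handling is left to the caller):

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lam_v=2450.0):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)."""
    return (delta * Rn + rho_a * cp * dq * ga) / (
        (delta + gamma * (1.0 + ga / gs)) * lam_v)

# Illustrative inputs only: Δ in Pa/K, Rn in W/m², δq in Pa, ga/gs in m/s.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, cp=1013.0,
                     dq=1000.0, ga=0.02, gs=0.01)
```

A quick sanity check on the structure: increasing net radiation Rn should increase ET, all else equal.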
ET Synthesizes Imagery, Sensors, Models and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate result sinusoidal tiles
• Simple nearest neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage
Derivation Reduction Stage
Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues request to appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
…
<PipelineStage>JobStatus Persist
<PipelineStage>Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist
<PipelineStage>TaskStatus
…
Dispatch
<PipelineStage>Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist
<PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
Dispatch
<PipelineStage>Task Queue
…
<Input>Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
Reprojection Request
…
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist
ReprojectionTaskStatus
GenericWorker (Worker Role)
…
Job Queue
…
Dispatch
Task Queue
Points to
…
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and need to run reduction multiple times
• Storage costs driven by data scale and 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Analysis Reduction Stage
Derivation Reduction Stage
Reprojection Stage
Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload, $450 storage
Reprojection: 400 GB, 45K files, 3500 hours, 20-100 workers: $420 cpu, $60 download
Derivation reduction: 5-7 GB, 55K files, 1800 hours, 20-100 workers: $216 cpu, $1 download, $6 storage
Analysis reduction: <10 GB, ~1K files, 1800 hours, 20-100 workers: $216 cpu, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large and small scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data parallel applications, and can support many interesting "programming patterns", but tightly coupled low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds as amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research, Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault Tolerant Applications
- Key Components – Compute: VM Roles
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST: Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models and Field Data
- MODISAzure: Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Demonstration (2)
Automated Service Management
Provide code + service model
• Platform identifies and allocates resources, deploys the service, manages service health
• Configuration is handled by two files:
ServiceDefinition.csdef
ServiceConfiguration.cscfg
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• Encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
• (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure
1. Process service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If role fails, restart the role based on policy
   2. If node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage At Massive Scale
Blob – Massive files, e.g. videos, logs
Drive – Use standard file system APIs
Tables – Non-relational, but with few scale limits; use SQL Azure for relational data
Queues – Facilitate loosely-coupled, reliable systems
Blob Features and Functions
• Store Large Objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
• PutBlob
• Inserts a new blob, overwrites the existing blob
• GetBlob
• Get whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
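The addressing scheme above is mechanical enough to build with string formatting; a tiny helper using the slide's example values (the helper itself is illustrative, not part of any SDK):

```python
def blob_url(account, container, blob_name):
    """Build a blob address: http://<account>.blob.core.windows.net/<container>/<blob>."""
    return f"http://{account}.blob.core.windows.net/{container}/{blob_name}"

url = blob_url("movieconversion", "originals", "barga.mpg")
```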
Containers
• Similar to a top level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
- Private
  - Default, will require the account key to access
- Full public read
- Public read only
Two Types of Blobs Under the Hood
• Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
• Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• Each block has an id
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: Big.mpg assembled from blocks 1 6 8 3 5 4 7 2]
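The put-block / commit-block-list behavior above can be modeled in memory; this is a sketch of the semantics (a toy class, not the storage client library): blocks upload in any order, and the commit's ordered block list fixes the blob's content.

```python
class BlockBlob:
    """In-memory model of block-blob put/commit semantics."""
    def __init__(self):
        self.uncommitted = {}   # block id -> bytes
        self.content = b""

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Commit: assemble the blob from the named blocks, in the given order.
        self.content = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()   # uncommitted leftovers would be GC'd

blob = BlockBlob()
for bid, chunk in [("01", b"hello "), ("02", b"world"), ("03", b"!")]:
    blob.put_block(bid, chunk)
blob.put_block_list(["01", "02", "03"])
```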
Pages
• Similar to block blobs
• Optimized for random read/write operations and provide the ability to write to a range of bytes in a blob
• Call Put Blob, set max size; then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
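The 512-byte alignment rule is easy to get wrong; a minimal validator for it (a hypothetical helper, not an SDK function):

```python
PAGE = 512  # page blobs are addressed in 512-byte pages

def valid_page_write(start, length):
    """A page-blob write must start on a page boundary and span whole pages."""
    return start % PAGE == 0 and length > 0 and length % PAGE == 0

ok = valid_page_write(0, 1024)      # aligned: covers pages 0 and 1
bad = valid_page_write(100, 512)    # misaligned start offset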
BLOB Leases
• Creates a 1 minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to drive are made durable to the Page Blob
• Drive made durable through standard Page Blob replication
• Drive persists even when not mounted, as a Page Blob
Windows Azure Drive API
• Create Drive – Creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – Returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – Unmounts the drive and frees up the drive letter
• Snapshot Drive – Allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – Provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
Table Name: Movies
  Star Wars
  Star Trek
  Fan Boys
Table Name: Customers
  Brian H. Prince
  Jason Argonaut
  Bill Gates
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides Structured Storage
• Massively Scalable Tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly Available & Durable
• Data is replicated several times
• Familiar and Easy to use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST – with any platform or language

Is not relational
Can not:
• Create foreign key relationships between tables
• Perform server side joins between tables
• Create custom indexes on the tables
• No server side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
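To make the shape concrete: every entity carries the three system properties, and anything the service does not do server-side (such as Count()) falls to the client. A minimal in-memory stand-in (dictionaries, not the real table client; the movie data echoes the later partitioning example):

```python
def make_entity(partition_key, row_key, **props):
    """Entity = required system properties plus arbitrary schema-free properties."""
    return {"PartitionKey": partition_key, "RowKey": row_key,
            "Timestamp": "2010-12-07T00:00:00Z", **props}

table = [
    make_entity("Action", "Fast & Furious", ReleaseDate=2009),
    make_entity("Action", "The Bourne Ultimatum", ReleaseDate=2007),
    make_entity("Comedy", "Office Space", ReleaseDate=1999),
]
# No server-side Count(): counting a partition means scanning client-side.
action_count = sum(1 for e in table if e["PartitionKey"] == "Action")
```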
Windows Azure Queues
• Queues are performance efficient, highly available and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality
Partition key is unit of scale
System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for partition to be available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Our system load balances to meet your traffic needs
• Single partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00
Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition
Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg
Messages – Queue Name
• All messages for a single queue belong to the same partition
Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, Server 3]
Scalability Targets
Storage Account
• Capacity – Up to 100 TBs
• Transactions – Up to a few thousand requests per second
• Bandwidth – Up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

Server A: Table = Movies [Min - Max]
After load balancing splits the partition range:
Server A: Table = Movies [Min - Comedy)
Server B: Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information

Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
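In code, "expect continuation tokens" means the client must loop; here is the pattern in miniature (a toy paging function, with the 1000-row cap kept as the page size):

```python
def query(rows, token=0, page=1000):
    """Return at most `page` rows plus a token marking where to resume."""
    chunk = rows[token:token + page]
    next_token = token + page if token + page < len(rows) else None
    return chunk, next_token

rows = list(range(2500))
got, token = [], 0
while token is not None:          # keep querying until no continuation token
    chunk, token = query(rows, token)
    got.extend(chunk)
```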
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns
• Distribute by using a hash etc. as prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement back-off strategy for retries
• Server busy:
• Load balance partitions to meet traffic needs
• Load on single partition has exceeded the limits

WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• Point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?

Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)
RemoveMessage
Msg 2
Msg 1
Worker Role
Msg 2
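The lifecycle in the diagram hinges on the visibility timeout: GetMessage hides a message rather than removing it, and only an explicit delete removes it for good. Modeled with an explicit clock (a toy queue, not the storage client; the message id doubles as the pop receipt here):

```python
class Queue:
    """In-memory model of Put/Get/Delete with a visibility timeout."""
    def __init__(self):
        self.msgs = {}          # id -> (body, invisible_until)
        self.next_id = 0

    def put(self, body):
        self.msgs[self.next_id] = (body, 0)
        self.next_id += 1

    def get(self, now, timeout=30):
        for mid, (body, until) in sorted(self.msgs.items()):
            if until <= now:                     # visible?
                self.msgs[mid] = (body, now + timeout)  # hide, don't remove
                return mid, body
        return None

    def delete(self, mid):
        self.msgs.pop(mid, None)

q = Queue()
q.put("work item")
mid, _ = q.get(now=0)
reappeared = q.get(now=10)     # still invisible, nothing returned
visible_again = q.get(now=40)  # consumer never deleted it, so it's redelivered
```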
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
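Those two rules, plus a cap (the "truncated" part), fit in one function; a sketch with a 60-second cap (the cap value is illustrative):

```python
def next_interval(interval, got_message, cap=60):
    """Double the polling interval on an empty poll (up to cap); reset on success."""
    if got_message:
        return 1
    return min(interval * 2, cap)

interval = 1
history = []
for got in [False, False, False, False, False, False, True]:
    interval = next_interval(interval, got)
    history.append(interval)
# intervals: 2, 4, 8, 16, 32, 60 (truncated), then back to 1
```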
Removing Poison Messages
Producers P1, P2 → Queue → Consumers C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
Producers P1, P2 → Queue → Consumers C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
Producers P1, P2 → Queue → Consumers C1, C2
1. Dequeue(Q, 30 sec) → msg 1
2. Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
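The consumer-side rule from the sequence above: every redelivery bumps the message's dequeue count, and once the count exceeds a threshold the consumer deletes the message instead of retrying the work. A sketch (the counting is done locally here for illustration; the real service tracks DequeueCount on the message):

```python
def handle(msg, dequeue_counts, threshold=2):
    """Process a delivered message, or delete it once it proves to be poison."""
    dequeue_counts[msg] = dequeue_counts.get(msg, 0) + 1
    if dequeue_counts[msg] > threshold:
        return "deleted"      # poison message: stop retrying
    return "processed"

counts = {}
outcomes = [handle("msg1", counts) for _ in range(3)]  # third delivery trips the threshold
```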
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order delivery
Use dequeue count to remove poison messages
• Enforce a threshold on a message's dequeue count
Messages > 8 KB
• Use a blob to store message data, with a reference in the message
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – Files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – Massively scalable structured storage
• Queues – Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts, measure, and find what is ideal for you
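The trade-off can be put in numbers; the hourly rates below are hypothetical, chosen only so that eight 1-core instances cost the same as one 8-core VM: unless scaling across the large VM's cores is close to linear, the small instances deliver more throughput per dollar.

```python
def throughput_per_dollar(units, rate, scaling=1.0):
    """Compute units of work delivered per unit cost, given a scaling efficiency."""
    return units * scaling / rate

small = throughput_per_dollar(units=8, rate=8 * 0.12)       # 8 x 1-core VMs
xl_sublinear = throughput_per_dollar(8, 0.96, scaling=0.7)  # one 8-core VM, 70% scaling
xl_linear = throughput_per_dollar(8, 0.96, scaling=1.0)     # break-even only if linear
```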
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if number of active processes exceeds number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network IO-intensive, storage IO-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage IO-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between risk of failure/poor user experience due to not having excess capacity, and the costs of having idling VMs
Performance vs. Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Uncompressed Content → Gzip, Minify JavaScript, Minify CSS, Minify Images → Compressed Content
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result reduction processing
Large volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern
Leverage multi-core within one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
(Task-flow diagram: a Splitting task fans out into BLAST tasks, which feed a Merging task.)
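The visibilityTimeout trade-off can be made concrete with a toy model of the queue's semantics (a sketch for illustration, not the real storage client):

```python
import time

class VisibilityQueue:
    """Toy model of Azure Queue visibility-timeout semantics: a message
    fetched with get() becomes invisible for the timeout, and reappears
    if the consumer does not delete() it in time (e.g., the worker crashed)."""

    def __init__(self):
        self._msgs = {}    # message id -> (body, time it becomes visible)
        self._next_id = 0

    def put(self, body):
        self._msgs[self._next_id] = (body, 0.0)
        self._next_id += 1

    def get(self, visibility_timeout):
        now = time.monotonic()
        for mid, (body, visible_at) in list(self._msgs.items()):
            if visible_at <= now:
                # Hide the message instead of removing it.
                self._msgs[mid] = (body, now + visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        self._msgs.pop(mid, None)
```

A timeout shorter than the true task run time makes a healthy task's message reappear and be recomputed; a timeout much longer than the run time delays recovery when an instance actually fails.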
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST (architecture)
(Diagram: a Web Role hosts the Web Portal and Web Service for job registration. A Job Management Role runs the Job Scheduler and Scaling Engine, dispatching work through a global dispatch queue to pools of worker instances. An Azure Table holds the Job Registry; Azure Blob storage holds the NCBI databases (BLAST databases, temporary data, etc.), kept current by a Database Updating Role. Each job runs as a split/join pipeline: a Splitting task fans out BLAST tasks, and a Merging task combines their results.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory state
(Diagram: the Job Portal sits in the Web Role alongside the Web Service and job registration; behind it, the Job Scheduler, Scaling Engine, and Job Registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5000 proteins (~700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 cores: 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record looks like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
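A sketch of the kind of log analysis used here: find tasks that logged an "Executing" record but never a matching "done" record (illustrative Python):

```python
import re

def unfinished_tasks(log_lines):
    """Return the ids of tasks that were started ('Executing the task N')
    but never logged a matching 'Execution of task N is done' record."""
    started, done = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return started - done
```

Running this over the records above would flag task 251774, which started at 8:22 but never completed on that node.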
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups (~6 nodes per group, over ~30 mins): this is an update domain at work.

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J g⁻¹)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
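A direct transcription of the Penman-Monteith formula for illustration (argument names follow the symbol list; consistent units are the caller's responsibility, with λv here in J g⁻¹ as defined above):

```python
def penman_monteith_et(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET: (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv).
    Defaults: γ ≈ 66 Pa/K and λv ≈ 2450 J/g, per the symbol list."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / \
           ((delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

The pipeline's job is to supply these inputs (radiation, humidity, conductivities) per pixel from the imagery and sensor datasets below, so the scalar formula itself is the easy part.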
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Pipeline diagram: the AzureMODIS Service web-role portal receives scientists' requests on a Request Queue; the Data Collection stage pulls source imagery from download sites via the Download Queue; the Reprojection stage, Derivation Reduction stage, and Analysis Reduction stage are driven by the Reprojection, Reduction 1, and Reduction 2 queues, consulting Source Metadata; scientists download the science results.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the web-role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated worker role
  • Parses all job requests into tasks, i.e., recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> request arrives at the MODISAzure Service (web role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (worker role) parses the job, persists <PipelineStage>TaskStatus, and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a GenericWorker (worker role)
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; GenericWorker instances dequeue the tasks and read <Input> data storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a Reprojection Request enters the Job Queue, where each entity specifies a single reprojection job request. The Service Monitor (worker role) persists ReprojectionJobStatus, parses the job, persists ReprojectionTaskStatus, and dispatches to the Task Queue, where each entity specifies a single reprojection task (i.e., a single tile). GenericWorker instances dequeue the tasks; each task points to the SwathGranuleMeta table, queried for geo-metadata (e.g., boundaries) for each swath tile, and the ScanTimeList table, queried for the list of satellite scan times that cover a target tile. Workers read Swath Source Data Storage and write Reprojection Data Storage.)
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

(Annotated pipeline diagram, with costs per stage as drawn:)

Stage                | Data                    | Compute                          | Cost
Data Collection      | 400-500 GB, 60K files   | 10 MB/sec, 11 hours, <10 workers | $50 upload, $450 storage
Reprojection         | 400 GB, 45K files       | 3500 hours, 20-100 workers       | $420 CPU, $60 download
Derivation Reduction | 5-7 GB, 55K files       | 1800 hours, 20-100 workers       | $216 CPU, $1 download, $6 storage
Analysis Reduction   | <10 GB, ~1K files       | 1800 hours, 20-100 workers       | $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: "Channel 9 Windows Azure"
Bing: "Windows Azure Platform Training Kit – November Update"
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Service Definition
Service Configuration
GUI
Double click on Role Name in Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
  • An encrypted package of your code
  • Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API; easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: replicated, highly available, and load balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob: inserts a new blob, overwrites the existing blob
  • GetBlob: get the whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
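The addressing scheme is simple enough to sketch (illustrative helper, not part of any SDK; public containers can be fetched with a plain HTTP GET, while private ones need a signed request):

```python
def blob_url(account, container, blob_name):
    """Build the REST address for a blob, mirroring the pattern above:
    http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>"""
    return f"http://{account}.blob.core.windows.net/{container}/{blob_name}"
```

For example, `blob_url("movieconversion", "originals", "barga.mpg")` reproduces the slide's sample address.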
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
  • Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
  • Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
  • Each block has an ID
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
(Diagram: Big.mpg uploaded as blocks 1 6 8 3 5 4 7 2, then committed in order into Big.mpg.)
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount the Page Blob as X:\
    • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
  • All writes to the drive are made durable to the Page Blob
    • The drive is made durable through standard Page Blob replication
  • The drive persists even when not mounted
Windows Azure Drive API
• Create Drive: creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache: allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive: takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives: returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive: unmounts the drive and frees up the drive letter
• Snapshot Drive: allows the client application to create a backup of the drive (Page Blob)
• Copy Drive: provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table: Movies (entities: Star Wars; Star Trek; Fan Boys)
• Table: Customers (entities: Brian H. Prince; Jason Argonaut; Bill Gates)
The hierarchy is Account → Table → Entity.
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST, with any platform or language
Is not relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
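A sketch of what this means for an entity (a plain dict for illustration, not the real SDK types; the property values are made up):

```python
# A hypothetical entity in a Movies table: the three required properties
# plus any user-defined ones; the schema may vary entity to entity.
movie = {
    "PartitionKey": "Action",             # chosen by you; the unit of scale
    "RowKey": "The Bourne Ultimatum",     # unique within the partition
    "Timestamp": "2010-12-07T00:00:00Z",  # maintained by the service
    "ReleaseDate": 2007,                  # user-defined property
}

REQUIRED = {"PartitionKey", "RowKey", "Timestamp"}
assert REQUIRED <= movie.keys()
```

PartitionKey + RowKey together form the entity's unique key, which is why their selection (covered below) matters so much for scalability.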
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
"Server Busy"
• Use exponential backoff on "Server Busy"
• Returned while the system load balances to meet your traffic needs, or when single-partition limits have been reached
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets
Storage account
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
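The '503 Server Busy' advice can be sketched as follows (illustrative Python; the `ServerBusyError` class stands in for an HTTP 503 from the storage service):

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for an HTTP 503 'Server Busy' from the storage service."""

def with_backoff(op, max_retries=5, base=0.1, cap=30.0):
    """Retry op() on 'Server Busy', doubling the wait each attempt
    (truncated exponential backoff, with jitter to avoid lockstep retries)."""
    for attempt in range(max_retries):
        try:
            return op()
        except ServerBusyError:
            delay = min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.0)
            time.sleep(delay)
    return op()  # final attempt; lets the error propagate to the caller
```

The jitter matters at scale: if many workers retry on a fixed schedule they hit the busy partition in lockstep and prolong the overload.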
Partitions and Partition Ranges

Initially one server can hold the whole table.
Server A: Table = Movies, range [Min – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

Under load, the system splits the table between servers along partition-key boundaries:

Server A: Table = Movies, range [Min – Comedy) (the Action and Animation partitions)
Server B: Table = Movies, range [Comedy – Max] (the Comedy, SciFi, and War partitions)
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query returns a continuation token instead of the full result set when it hits any of:
• The maximum of 1000 rows in a response
• The end of a partition range boundary
• The maximum of 5 seconds to execute the query
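The resulting query loop can be sketched like this (the `page` callable is hypothetical, standing in for any query-with-continuation-token API):

```python
def query_all(query, page):
    """Drain a table query that may return continuation tokens.
    `page(query, token) -> (rows, next_token)` is a hypothetical callable;
    a None token signals that the result set is exhausted."""
    rows, token = [], None
    while True:
        batch, token = page(query, token)
        rows.extend(batch)
        if token is None:
            return rows
```

Code that assumes one response holds the whole result set silently drops rows whenever any of the three limits above is hit, which is why the slide says "seriously."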
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as a prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy: partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
Queue Terminology / Message Lifecycle
(Diagram: a Web Role calls PutMessage to add messages (Msg 1 … Msg 4) to the Queue; Worker Roles call GetMessage (with a timeout), process the message, then call RemoveMessage.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll resets the interval back to 1.
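The polling policy can be sketched as a small generator (illustrative):

```python
def poll_intervals(polls, initial=1, cap=60):
    """Yield the wait (in seconds) before each poll: an empty poll doubles
    the interval (truncated at `cap`); a successful poll resets it to
    `initial`. `polls` is an iterable of booleans (True = got a message)."""
    interval = initial
    for got_message in polls:
        yield interval
        interval = initial if got_message else min(cap, interval * 2)
```

For a poll history of empty, empty, hit, empty this produces waits of 1, 2, 4, 1 seconds: idle queues are polled (and billed) less, while busy queues are drained promptly.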
Removing Poison Messages
(Diagram: producers P1 and P2 feed the queue; consumers C1 and C2 dequeue. Each message carries a 30 s visibility timeout and a dequeue count.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1 (its dequeue count is now 2)
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
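The three-slide walkthrough boils down to a worker loop like this sketch (the queue client interface is hypothetical; the threshold matches the walkthrough, which deletes once DequeueCount > 2):

```python
MAX_DEQUEUE = 2  # matches the walkthrough: delete once DequeueCount > 2

def handle_one(queue, process):
    """One iteration of a worker loop with poison-message removal.
    `queue` is a hypothetical client exposing get_message()/delete_message(),
    with a dequeue_count attribute on each returned message."""
    msg = queue.get_message(visibility_timeout=30)
    if msg is None:
        return                      # queue empty
    if msg.dequeue_count > MAX_DEQUEUE:
        queue.delete_message(msg)   # poison: remove instead of retrying forever
        return
    process(msg)                    # if this crashes, the message reappears later
    queue.delete_message(msg)       # success: remove for good
```

Without the dequeue-count check, a message whose processing always crashes the worker would cycle through visibility timeouts forever.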
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• Only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
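The slide's example is .NET 4's Task Parallel Library; as a language-neutral illustration of the data-parallelism idea, the same pattern in Python:

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(fn, items, workers=8):
    """Data parallelism: apply fn to every item using a pool of workers,
    keeping all cores of the (paid-for) VM busy instead of one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))
```

Sizing `workers` to the instance's core count mirrors the earlier BLAST "-a" advice: 1, 2, 4, or 8 depending on VM size.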
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure / poor user experience from not having excess capacity, and the cost of having idling VMs
(Performance vs. cost)
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
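A quick sketch of point 1: for repetitive text output (HTML, JSON), gzip trades a little CPU for much smaller payloads. The sample page below is invented for illustration:

```python
import gzip

# Gzip-compressing repetitive text output before serving or storing it
# shrinks both bandwidth and storage bills.
page = ("<html><body>" + "<p>hello azure</p>" * 500 + "</body></html>").encode("utf-8")
compressed = gzip.compress(page)
print(f"{len(page)} bytes -> {len(compressed)} bytes")
```

In a web role the same effect comes from enabling IIS dynamic compression rather than compressing by hand.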
(Uncompressed content → gzip, minify JavaScript, minify CSS, minify images → compressed content)
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile inside and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage demand could reach 1 TB (100 nodes × a 10 GB database)
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
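The query-segmentation pattern above can be sketched in a few lines. This is not the AzureBLAST code: `fake_blast` is a stand-in for invoking the real NCBI-BLAST binary on one partition.

```python
# Illustrative sketch of query segmentation: split the input sequences,
# process partitions independently, merge the per-partition results.
def split(sequences, partition_size):
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def fake_blast(partition):
    # Placeholder for running BLAST against the database for one partition.
    return [f"hit:{seq}" for seq in partition]

def merge(results):
    return [hit for part in results for hit in part]

sequences = [f"seq{i}" for i in range(10)]
partitions = split(sequences, 3)                 # 4 partitions: 3+3+3+1
hits = merge(fake_blast(p) for p in partitions)  # in AzureBLAST, done by workers
print(len(partitions), len(hits))
```

In the real system each partition becomes a queued task consumed by worker roles, and the merge is a separate merging task.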
AzureBLAST Task-Flow
A simple Split/Join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long wait in case of an instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry
NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM), four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
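The log-analysis idea can be sketched directly: pair each "Executing" record with its "done" record and flag tasks that never completed (the sample log below abbreviates the records shown on the slide):

```python
import re

# Flag tasks that started but never logged a completion record.
log = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
"""

started, finished = set(), set()
for line in log.splitlines():
    m = re.search(r"Executing the task (\d+)", line)
    if m:
        started.add(m.group(1))
    m = re.search(r"Execution of task (\d+) is done", line)
    if m:
        finished.add(m.group(1))

unfinished = started - finished
print(unfinished)
```

Runs that stall or take far longer than the estimate (like the 82-minute task above) show up the same way, by comparing timestamps within each pair.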
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups; this is an update domain
• ~30 mins, ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failure at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
where:
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J g⁻¹)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs, big data reduction
• Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
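The Penman-Monteith formula above transcribes directly into code. The sketch below takes λv in J kg⁻¹ (≈2.45×10⁶; the slide lists J g⁻¹) and uses purely illustrative input values, not calibrated field data:

```python
# Penman-Monteith: ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lambda_v=2.45e6):
    """All inputs in SI units; lambda_v in J/kg (illustrative choice)."""
    numerator = delta * Rn + rho_a * cp * dq * ga
    denominator = (delta + gamma * (1.0 + ga / gs)) * lambda_v
    return numerator / denominator

# Invented sample inputs, roughly plausible magnitudes only.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2,
                     cp=1005.0, dq=1000.0, ga=0.02, gs=0.01)
print(et)
```

In MODISAzure this arithmetic is the easy part; producing consistent Δ, Rn, ga, and gs grids from the imagery and sensor inputs is the bulk of the pipeline.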
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
…
<PipelineStage>JobStatus Persist
<PipelineStage> Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
Dispatch <PipelineStage> Task Queue
…
<Input> Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Service Configuration
GUI
Double-click on the Role Name in the Azure Project
Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files:
• An encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes
• (which is better than six months)
Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure
1. Process the service model
1. Determine resource requirements
2. Create role images
2. Allocate resources
3. Prepare nodes
1. Place role images on nodes
2. Configure settings
3. Start roles
4. Configure load balancers
5. Maintain service health
1. If a role fails, restart the role based on policy
2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file-system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
• PutBlob – inserts a new blob, overwrites the existing blob
• GetBlob – get the whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (default): requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
• Block blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
• Page blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
(Example: Big.mpg uploaded as blocks 1 6 8 3 5 4 7 2, then committed as Big.mpg)
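The block-commit model can be simulated locally. The sketch below mirrors the Put Block / Put Block List operations in name only; it is not the Azure client library:

```python
# Local simulation of the block-blob model: upload named blocks, then
# commit a block list whose order defines the final blob.
uploaded = {}

def put_block(block_id, data):
    uploaded[block_id] = data          # staged but not yet part of any blob

def put_block_list(block_ids):
    # The committed blob is exactly the chosen blocks, in the chosen order.
    return b"".join(uploaded[b] for b in block_ids)

put_block("b1", b"big")
put_block("b2", b".mpg")
put_block("b3", b"-part")
blob = put_block_list(["b1", "b3", "b2"])  # commit in any order, any subset
print(blob)
```

This is why block blobs suit streaming uploads: blocks can arrive out of order and in parallel, and nothing is visible until the block list is committed.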
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
Table Name: Movies (Star Wars, Star Trek, Fan Boys)
Table Name: Customers (Brian H. Prince, Jason Argonaut, Bill Gates)
Account → Table → Entity
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables: billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available and durable: data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST – with any platform or language
Is Not Relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
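A sketch of the entity model described above: every entity carries the three required properties, schemas may vary within one table, and entities sharing a PartitionKey live in the same partition. The sample data is invented:

```python
# Entities are property bags; only PartitionKey, RowKey, and Timestamp
# are required, and other properties can differ entity to entity.
movies = [
    {"PartitionKey": "Action", "RowKey": "Fast & Furious", "ReleaseDate": 2009},
    {"PartitionKey": "Action", "RowKey": "The Bourne Ultimatum", "ReleaseDate": 2007},
    {"PartitionKey": "Comedy", "RowKey": "Office Space", "Rating": "R"},  # different schema
]

# Entities with the same PartitionKey are served from the same partition.
partitions = {}
for e in movies:
    partitions.setdefault(e["PartitionKey"], []).append(e)
print(sorted(partitions))
```

(PartitionKey, RowKey) is effectively the primary key, so the partition grouping above is also the unit of load balancing and of entity-group transactions.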
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• The system load-balances partitions based on traffic patterns
• Controls entity locality
The partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
The system load-balances
• Use exponential backoff on "Server Busy"
• The system load-balances to meet your traffic needs
• "Server Busy" can also mean single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition
PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00
Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition
Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg
Messages – Queue Name
• All messages for a single queue belong to the same partition
Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas in sync
(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges
Initially, Server A serves the whole table: Table = Movies [Min – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
After a split, the partition ranges are spread across servers:
Server A: Table = Movies [Min – Comedy)
Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load-balanced
• PartitionKey is critical for scalability
Query efficiency and speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics and reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• At the end of a partition-range boundary
• Maximum of 5 seconds to execute the query
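Handling continuation tokens correctly is just a loop: reissue the query with the returned token until none comes back. Here is a sketch against a fake server that caps responses at 1,000 rows (the real token is opaque; an integer stands in for it here):

```python
# Fake table server: returns at most 1000 rows plus a continuation token.
ROWS = [f"row{i}" for i in range(2500)]

def query(token=None):
    start = token or 0
    page = ROWS[start:start + 1000]
    next_token = start + 1000 if start + 1000 < len(ROWS) else None
    return page, next_token

# Client side: loop until the server stops handing back a token.
results, token = [], None
while True:
    page, token = query(token)
    results.extend(page)
    if token is None:
        break
print(len(results))
```

Note that a response can also be short with a token still present (partition boundary, 5-second limit), so "fewer than 1,000 rows" is never a safe stopping condition; only the absence of a token is.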
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
• Select a PartitionKey and RowKey that help scale: distribute load by using a hash, etc., as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy": either partitions are being load-balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work-ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)
RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
• Consider a back-off polling approach
• Each empty poll increases the interval by 2x
• A successful poll resets the interval back to 1
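The policy above fits in a few lines; the truncation cap of 64 is an assumption for illustration:

```python
# Truncated exponential back-off polling: every empty poll doubles the
# wait (up to a cap); a successful poll resets it to the minimum.
def next_interval(current, got_message, cap=64):
    if got_message:
        return 1
    return min(current * 2, cap)

interval = 1
history = []
for got in [False, False, False, True, False]:  # simulated poll outcomes
    interval = next_interval(interval, got)
    history.append(interval)
print(history)
```

In a worker role, `interval` would be the sleep time (in seconds, say) between GetMessage calls, so idle queues cost few transactions while busy queues are drained promptly.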
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
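The dequeue-count rule from the recap can be sketched as follows (the threshold, message shape, and `poison_bin` are invented for illustration; the real DequeueCount comes back with each GetMessage response):

```python
# Poison-message handling: if a message's dequeue count exceeds a
# threshold, delete it (optionally setting it aside for inspection)
# instead of letting it crash consumers forever.
MAX_DEQUEUE = 2

def handle(message, poison_bin):
    if message["dequeue_count"] > MAX_DEQUEUE:
        poison_bin.append(message["id"])  # remove instead of reprocessing
        return "deleted"
    return "processed"

poison = []
r1 = handle({"id": "msg1", "dequeue_count": 3}, poison)
r2 = handle({"id": "msg2", "dequeue_count": 1}, poison)
print(r1, r2, poison)
```

This is exactly the scenario in the diagrams above: a message that repeatedly reappears after its visibility timeout is assumed to be killing its consumers and is removed on the third delivery.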
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using much CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
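The slide names .NET-specific tools (I/O completion ports, the .NET 4 Task Parallel Library). As an illustration of the same data-parallel idea in Python with the standard library — the worker count and the `work` function are illustrative, not from the deck:

```python
from concurrent.futures import ThreadPoolExecutor
import os

def work(item):
    # Stand-in for one unit of work (one "task" the role processes).
    return item * item

def run_parallel(items, workers=None):
    # Match the pool size to the core count, per the caveat above about
    # active workers exceeding the number of cores.
    workers = workers or (os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(work, items))
```

`pool.map` preserves input order, so this is the data-parallel case; task parallelism would submit heterogeneous callables with `pool.submit` instead.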
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing – they help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places.
Sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
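The compute-for-size trade-off above, illustrated with Python's standard `gzip` module (the sample payload is made up; level 9 is the slowest/smallest setting):

```python
import gzip

def compress(data: bytes) -> bytes:
    # Trade CPU for size: compresslevel=9 is slowest but smallest.
    return gzip.compress(data, compresslevel=9)

def decompress(data: bytes) -> bytes:
    return gzip.decompress(data)

# Repetitive text (like generated HTML/JS output) compresses very well.
payload = b"<div class='row'>hello</div>" * 1000
packed = compress(payload)
assert decompress(packed) == payload     # lossless round trip
assert len(packed) < len(payload) // 10  # far fewer bytes stored and sent
```

Every byte saved here is saved again on each download, which is why compression pays off on both the storage and bandwidth bills.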
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially – GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST) – needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, peak storage traffic could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern: split the input sequences, query the partitions in parallel, merge the results together when done
• Follows the generally suggested application model: Web Role + Queue + Worker
• With three special considerations: batch job management, and task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern.
Leverage the multiple cores of one instance:
• the "-a" argument of NCBI-BLAST
• set to 1, 2, 4, 8 for small, medium, large and extra-large instance sizes
Task granularity:
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: do test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait before retry if an instance fails
(Task flow: splitting task → BLAST tasks run in parallel → merging task)
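The split/join task flow above, sketched in Python. The `align` function is a stand-in for invoking NCBI-BLAST on one partition (on Azure each partition would be a queued task picked up by a worker); names and the partition size are illustrative.

```python
def split(sequences, partition_size):
    # Splitting task: segment the input query sequences into partitions.
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def align(partition, database):
    # Stand-in for running NCBI-BLAST on one partition against `database`;
    # partitions are independent, so this step is pleasingly parallel.
    return [(seq, database) for seq in partition]

def merge(partial_results):
    # Merging task: concatenate per-partition results once all are done.
    return [hit for part in partial_results for hit in part]

sequences = [f"seq{i}" for i in range(10)]
parts = split(sequences, 3)                     # partitions of 3+3+3+1
results = merge(align(p, "nr") for p in parts)  # "nr" is a placeholder DB name
assert len(results) == len(sequences)
```

In the real system the splitter enqueues one message per partition, workers run `align` concurrently, and the merger fires only after every partition has reported completion.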
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal, web service, job registration, job scheduler and scaling engine; accepted jobs go to a job registry in an Azure Table; tasks are dispatched through a global dispatch queue to worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a job management role and a database-updating role run alongside the workers. Each job flows through a splitting task, parallel BLAST tasks, and a merging task.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored in the job registry table – for fault tolerance, avoid in-memory state.
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment
Discovering homologs:
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 cores: 475 extra-large VMs (8 cores per VM) across four datacenters – US (2), West Europe and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divide the 10 million sequences into multiple segments; each segment is submitted to one deployment as one job for execution, and each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
(Diagram: deployments of 50–62 VMs each.)
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record looks like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
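The analysis described above — pairing each "Executing" record with its "done" record and flagging tasks that never completed — can be sketched like this (log format per the sample; the regexes are assumptions about that format):

```python
import re

EXEC_RE = re.compile(r"(\S+ \S+) (\S+) Executing the task (\d+)")
DONE_RE = re.compile(r"Execution of task (\d+) is done")

def find_incomplete(lines):
    started, finished = {}, set()
    for line in lines:
        m = EXEC_RE.search(line)
        if m:
            started[m.group(3)] = m.group(1)  # task id -> start timestamp
        m = DONE_RE.search(line)
        if m:
            finished.add(m.group(1))
    # Tasks that were started but never reported completion.
    return {tid: ts for tid, ts in started.items() if tid not in finished}

log = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 251523",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
assert find_incomplete(log) == {"251774": "3/31/2010 8:22"}
```

Task 251774 is the anomaly in the sample: it started but never logged completion, which is exactly the signature of the node losses discussed next.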
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups – this is an update domain at work (~30 mins, ~6 nodes in one group).

Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration or evaporation through plant membranes.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J g-1)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Pipeline diagram: scientists submit requests through the AzureMODIS web-role portal; requests flow through a request queue and a download queue into the data collection stage, which pulls from the source imagery download sites; tiles then pass through the reprojection queue to the reprojection stage, the reduction 1 queue to the derivation reduction stage, and the reduction 2 queue to the analysis reduction stage; scientific results are then available for download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service (Web Role) is the front door: it receives all user requests and queues each request to the appropriate download, reprojection, or reduction job queue
• The Service Monitor is a dedicated Worker Role: it parses all job requests into tasks – recoverable units of work
• The execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> request is persisted as <PipelineStage>JobStatus, parsed and persisted as <PipelineStage>TaskStatus, and dispatched to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by Generic Workers (Worker Roles), which:
• Dequeue tasks created by the Service Monitor
• Retry failed tasks 3 times
• Maintain all task status
(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; Generic Workers pull tasks from the queue and read from <Input>Data Storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the job queue; the Service Monitor persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches tasks to the task queue, from which Generic Workers pull work against reprojection data storage and swath source data storage.)
• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by the data scale and the need to run the reduction stages multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage                 | Data / files           | Compute                            | Cost
Data collection       | 400-500 GB, 60K files  | 10 MB/sec, 11 hours, <10 workers   | $50 upload, $450 storage
Reprojection          | 400 GB, 45K files      | 3500 hours, 20-100 workers         | $420 CPU, $60 download
Derivation reduction  | 5-7 GB, 55K files      | 1800 hours, 20-100 workers         | $216 CPU, $1 download, $6 storage
Analysis reduction    | <10 GB, ~1K files      | 1800 hours, 20-100 workers         | $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases of Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
GUI
Double-click on the role name in the Azure project.

Deploying to the cloud
• We can deploy from the portal or from script
• VS builds two files: an encrypted package of your code, and your config file
• You must create an Azure account, then a service, and then you deploy your code
• Deployment can take up to 20 minutes (which is better than six months)
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage, At Massive Scale
• Blobs – massive files, e.g., videos, logs
• Drives – use standard file-system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  e.g., http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private (the default) – requires the account key for access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks; each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages; each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• You can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
(Diagram: Big.mpg assembled by committing blocks that were uploaded out of order.)
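The block semantics above, illustrated with a small in-memory model (a toy stand-in, not the real storage API): blocks may arrive in any order, and the blob's content is fixed only by the order of the committed block list.

```python
class BlockBlob:
    """Toy model of a block blob: staged blocks plus an ordered commit."""
    def __init__(self):
        self.uncommitted = {}  # block id -> bytes (GC'd after ~a week if never committed)
        self.committed = b""

    def put_block(self, block_id, data):
        # Blocks can be uploaded in any order, even in parallel.
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Commit: blob content is the blocks concatenated in the given order.
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlob()
blob.put_block("b2", b"world")   # uploaded out of order
blob.put_block("b1", b"hello ")
blob.put_block_list(["b1", "b2"])
assert blob.committed == b"hello world"
```

Re-committing a different block list is how a block blob is modified: inserting, replacing, or dropping blocks without re-uploading the unchanged ones.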
Pages
• Similar to block blobs, but optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• You must have the lease ID to perform operations
• You can check the LeaseStatus property
• Currently can only be done through REST
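A sketch of the exclusive-write semantics described above, as a toy model (the real operations are the REST calls listed; `LeasedBlob` and its methods are illustrative, and Renew/Break and the 1-minute expiry are omitted):

```python
import itertools

class LeasedBlob:
    """Toy model of blob-lease semantics: while a lease is held,
    writes must present the current lease id."""
    _ids = itertools.count(1)

    def __init__(self):
        self.lease_id = None  # None: no lease held, blob is freely writable
        self.data = b""

    def acquire(self):
        # Acquire: a ~1-minute exclusive write lock in the real service.
        if self.lease_id is not None:
            raise RuntimeError("lease already held")
        self.lease_id = next(self._ids)
        return self.lease_id

    def write(self, data, lease_id=None):
        if self.lease_id is not None and lease_id != self.lease_id:
            raise PermissionError("write requires the current lease id")
        self.data = data

    def release(self, lease_id):
        if lease_id == self.lease_id:
            self.lease_id = None

blob = LeasedBlob()
lid = blob.acquire()
blob.write(b"v1", lease_id=lid)  # only the lease holder may write
blob.release(lid)
```

This is the pattern behind Windows Azure Drive below: the drive holds a lease on its backing page blob so only one VM instance writes at a time.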
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob; the drive is made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – lets an application specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives: the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – lets the client application create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence: call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table "Movies": Star Wars, Star Trek, Fan Boys
• Table "Customers": Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity.
Tables store entities; entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage: massively scalable tables with billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available and durable: data is replicated several times
• Familiar and easy-to-use API: WCF Data Services and OData, .NET classes and LINQ, REST – with any platform or language

Is not relational. You cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• Run server-side aggregates – no server-side Count(), for example
All entities must have the following properties: Timestamp, PartitionKey, RowKey.
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
• The partition key is the unit of scale: it controls entity locality
• A partition can be served by a single server; the system load balances partitions based on traffic pattern
The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs, but single-partition limits may have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey (entities with the same PartitionKey value are served from the same partition):

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order - 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order - 3             |              |                     | $10.00

Blobs – Container name + Blob name (every blob and its snapshots are in a single partition):

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue name (all messages for a single queue belong to the same partition):

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync
(Diagram: partitions P1…Pn replicated across Server 1, Server 2 and Server 3.)
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Movies table (PartitionKey = Category, RowKey = Title):

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

Partitions and Partition Ranges
• Initially a single server holds the whole table: Server A, Table = Movies [Min - Max]
• As load grows, the range is split across servers: Server A, Table = Movies [Min - Comedy); Server B, Table = Movies [Comedy - Max]
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics, and fewer round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Entity group transactions
Expect Continuation Tokens ndash Seriously
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 5 seconds to execute the query
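The loop every table client needs, sketched against a stand-in paged-query function (`query_page` and its integer token are illustrative, not the real API, where the token is an opaque pair of header values):

```python
def query_page(rows, token=None, page_size=1000):
    # Stand-in for a table query: returns up to page_size rows plus a
    # continuation token when more remain (the real service may also
    # return a token at partition boundaries or after 5 s of execution).
    start = token or 0
    page = rows[start:start + page_size]
    next_token = start + page_size if start + page_size < len(rows) else None
    return page, next_token

def query_all(rows, page_size=1000):
    results, token = [], None
    while True:  # always handle continuation tokens
        page, token = query_page(rows, token, page_size)
        results.extend(page)
        if token is None:
            return results

data = list(range(2500))
assert query_all(data) == data  # fetched as 3 pages: 1000 + 1000 + 500
```

The key habit is structural: never treat a single response as the full result set, even when the query "should" return few rows, since a token can appear at any partition boundary.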
Tables Recap
• Efficient for frequently used queries; supports batch transactions; distributes load
• Select a PartitionKey and RowKey that help you scale; avoid "append only" patterns – distribute load by using a hash, etc., as a key prefix
• Always handle continuation tokens – expect them for range queries
• "OR" predicates are not optimized – execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy" – the system load balances partitions to meet traffic needs, and the load on a single partition may have exceeded the limits
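"Distribute by using a hash as prefix" can look like the sketch below. The bucket count and key format are illustrative assumptions; the point is that sequential natural keys (timestamps, incrementing IDs) get spread across partitions instead of appending to one hot partition.

```python
import hashlib

BUCKETS = 16  # illustrative; pick based on expected load

def partition_key(natural_key: str) -> str:
    # Stable hash bucket as a prefix: the same natural key always maps
    # to the same partition, but adjacent keys land in different ones.
    digest = hashlib.md5(natural_key.encode()).hexdigest()
    bucket = int(digest, 16) % BUCKETS
    return f"{bucket:02d}_{natural_key}"

keys = [partition_key(f"2010-12-07T10:00:{i:02d}") for i in range(100)]
assert len({k.split("_")[0] for k in keys}) > 1  # load spread over buckets
```

The trade-off: range queries over the natural key now have to fan out across all buckets (and, per the recap above, each of those sub-queries must still handle continuation tokens).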
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together – tight coupling leads to brittleness; decoupling can aid scaling and performance
• A queue can hold an unlimited number of messages; messages must be serializable as XML and are limited to 8 KB in size
• Commonly used with the work-ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle
(Diagram: a Web Role calls PutMessage to add messages (Msg 1–4) to the queue; Worker Roles call GetMessage with a visibility timeout to receive a message, and RemoveMessage once processing completes.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll resets the interval back to 1
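The polling policy above in code. The cap value is an assumption — "truncated" means the interval stops growing at some maximum rather than doubling forever:

```python
def next_interval(current, got_message, initial=1.0, cap=32.0):
    # Empty poll: double the interval, truncated at `cap`.
    # Successful poll: reset to the initial interval.
    if got_message:
        return initial
    return min(current * 2, cap)

# Empty polls: 1 -> 2 -> 4 -> 8 -> 16 -> 32 -> 32 (truncated at the cap)
interval, seen = 1.0, []
for _ in range(6):
    interval = next_interval(interval, got_message=False)
    seen.append(interval)
assert seen == [2.0, 4.0, 8.0, 16.0, 32.0, 32.0]
assert next_interval(interval, got_message=True) == 1.0
```

A worker loop would sleep `interval` seconds between GetMessage calls, cutting transaction charges on idle queues while staying responsive once traffic returns.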
Removing Poison Messages
(Three diagram builds of the same scenario; producers P1, P2 put messages, consumers C1, C2 process them.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2, so treat it as a poison message
13. C1: Delete(Q, msg 1)
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
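Two of the recap items, idempotent processing and a dequeue-count threshold, combine naturally in one worker loop. An illustrative Python sketch (names are hypothetical; a real worker would read the dequeue count from the queue message itself):

```python
# Illustrative poison-message handling: once a message has been dequeued
# more times than the threshold, set it aside instead of retrying forever.
POISON_THRESHOLD = 3
dead_letter = []  # parked messages, kept for later inspection

def handle_message(msg: dict, process) -> str:
    """msg carries a 'dequeue_count', as an Azure queue message does."""
    if msg["dequeue_count"] > POISON_THRESHOLD:
        dead_letter.append(msg)   # remove from the queue, park for diagnosis
        return "poisoned"
    process(msg["body"])          # must be idempotent: it may run more than once
    return "processed"

status_ok = handle_message({"dequeue_count": 1, "body": "ok"}, process=lambda b: None)
status_bad = handle_message({"dequeue_count": 4, "body": "bad"}, process=lambda b: None)
```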
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
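As a rough, language-neutral illustration of the data-parallel pattern (sketched in Python here rather than the .NET Task Parallel Library; in CPython a CPU-bound workload would use a process pool instead of threads):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def work(n: int) -> int:
    """Stand-in for one independent unit of work."""
    return sum(i * i for i in range(n))

# Data parallelism: one task per input item, scheduled over the available
# cores. A thread pool is shown for brevity; CPU-bound work in CPython
# would use ProcessPoolExecutor, and .NET 4 code would use the TPL.
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    results = list(pool.map(work, [1_000] * 8))
```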
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of having idling VMs
[Diagram: performance vs. cost trade-off.]
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content passes through Gzip, JavaScript minification, CSS minification, and image minification to become compressed content.]
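Point 1 is cheap to verify: repetitive markup compresses dramatically. A small Python sketch using the stdlib gzip module:

```python
import gzip

# Repetitive HTML, typical of generated pages.
html = b"<html><body>" + b"<p>hello world</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)
ratio = len(compressed) / len(html)  # far below 1 for markup like this
```

Every modern browser advertises gzip support via Accept-Encoding, so the server can apply this to all text output.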
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool):
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow
A simple split/join pattern.
Leverage the multiple cores of one instance:
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: profile with test runs and set the partition size to mitigate the overhead
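The query-segmentation split/join pattern can be sketched in a few lines of Python (the partition size and the stand-in per-partition work are illustrative, not AzureBLAST's actual code):

```python
def split(sequences: list, partition_size: int) -> list:
    """Query segmentation: carve the input into independent partitions."""
    return [sequences[i:i + partition_size]
            for i in range((0), len(sequences), partition_size)]

def merge(partial_results: list) -> list:
    """Join: concatenate per-partition results once all tasks finish."""
    return [hit for part in partial_results for hit in part]

seqs = [f"seq{i}" for i in range(10)]
parts = split(seqs, 4)  # 3 partitions: 4 + 4 + 2 sequences

# Each partition would normally be a queued BLAST task on its own worker;
# uppercasing stands in for running BLAST against one partition.
results = merge([[s.upper() for s in p] for p in parts])
```

Partition size is the knob the granularity bullets above describe: fewer, larger partitions reduce per-task overhead but risk load imbalance.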
Value of the visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
[Diagram: a splitting task fans out to many BLAST tasks, which feed a merging task.]
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
[Architecture diagram: a Web Role exposes the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine over the job registry (Azure Table) and dispatches work through a global dispatch queue to pools of worker instances; a database-updating role refreshes the NCBI databases; Azure Blob storage holds the BLAST databases, temporary data, etc. Within a job, a splitting task fans out BLAST tasks to the workers and a merging task combines their results.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
The accepted job is stored into the job registry table:
• Fault tolerance – avoid in-memory state
[Diagram: the job portal fronts the web service and job registration; the job scheduler and scaling engine pick accepted jobs up from the job registry.]
Demonstration
R. palustris as a Platform for H2 Production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When loads become imbalanced, redistribute the load manually
[Diagram: instance counts per deployment, ranging from 50 to 62.]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
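Finding failed tasks is then a matter of pairing "Executing" records with "is done" records. An illustrative Python sketch over lines like those above:

```python
import re

log_lines = [
    "3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...",
    "3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...",
]

started, finished = set(), set()
for line in log_lines:
    if m := re.search(r"Executing the task (\d+)", line):
        started.add(m.group(1))
    if m := re.search(r"Execution of task (\d+) is done", line):
        finished.add(m.group(1))

# Tasks that began but never logged completion: candidates for failure
# (or for a node lost to an upgrade, as the next slides show).
suspect = started - finished
```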
Surviving System Upgrades
North Europe Data Center: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups: this is an update domain.
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe Data Center: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

Penman-Monteith (1964)
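A direct transcription of the Penman-Monteith form into Python (a sketch: the default λv and the sample magnitudes below are illustrative, not validated field values):

```python
def evapotranspiration(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)

    gamma defaults to ~66 Pa/K per the slide; lambda_v (latent heat of
    vaporization, J/g) is an illustrative default, not a measured value."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Illustrative magnitudes only, chosen to exercise the formula:
et = evapotranspiration(delta=145.0, Rn=400.0, rho_a=1.2, c_p=1005.0,
                        dq=800.0, g_a=0.02, g_s=0.01)
```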
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration (evaporation through plant membranes) by plants.
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests through the AzureMODIS Service Web Role portal; the request queue and download queue feed the data collection stage, which pulls from the source imagery download sites and source metadata; the reprojection queue feeds the reprojection stage; the reduction 1 and reduction 2 queues feed the derivation reduction and analysis reduction stages; scientific results are then available for download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue the tasks and read from <Input> Data Storage.]
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request is parsed by the Service Monitor (Worker Role), which persists ReprojectionJobStatus (each entity specifies a single reprojection job request) and ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e., a single tile), then dispatches through the job and task queues to Generic Workers (Worker Roles). Workers query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile, and the ScanTimeList table for the list of satellite scan times that cover a target tile, reading from Swath Source Data Storage and writing Reprojection Data Storage.]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
[Pipeline diagram annotated with per-stage scale and cost:
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage]
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Deploying to the Cloud
• We can deploy from the portal or from script
• VS builds two files:
• An encrypted package of your code
• Your config file
• You must create an Azure account, then a service, and then you deploy your code
• Can take up to 20 minutes (which is better than six months)

Service Management API
• REST-based API to manage your services
• X509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the "brain" behind Windows Azure:
1. Process the service model
• Determine resource requirements
• Create role images
2. Allocate resources
3. Prepare nodes
• Place role images on nodes
• Configure settings
• Start roles
4. Configure load balancers
5. Maintain service health
• If a role fails, restart the role based on policy
• If a node fails, migrate the role based on policy

Storage
Replicated, Highly Available, Load Balanced
Durable Storage, At Massive Scale
• Blob – massive files, e.g., videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
• PutBlob – inserts a new blob, overwrites an existing blob
• GetBlob – get the whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – the default; requires the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in "blocks"; each block has an ID
• Then commit those blocks, in any order, into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[Diagram: blocks of Big.mpg uploaded out of order (1, 6, 8, 3, 5, 4, 7, 2) and committed in sequence to form Big.mpg.]
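The upload/commit protocol can be mimicked with a toy client: stage blocks in any order, then commit an ordered block list. A hypothetical Python sketch (this is not the real Storage Client Library API, just the shape of the protocol):

```python
import base64

class BlockBlob:
    """Toy stand-in for a block blob: staged blocks plus a commit step."""
    def __init__(self):
        self._staged = {}      # block id -> data, uncommitted
        self.committed = b""

    def put_block(self, block_id: str, data: bytes) -> None:
        self._staged[block_id] = data   # blocks may arrive in any order

    def put_block_list(self, ordered_ids: list) -> None:
        # Commit: the blob becomes the staged blocks joined in this order.
        self.committed = b"".join(self._staged[i] for i in ordered_ids)

blob = BlockBlob()
# Block IDs are base64 strings, as in the real service.
ids = [base64.b64encode(bytes([i])).decode() for i in range(3)]
for i, block_id in enumerate(ids):
    blob.put_block(block_id, f"chunk{i}".encode())
blob.put_block_list(ids)   # any commit order is allowed; this one is 0, 1, 2
```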
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:\
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• The drive is made durable through standard Page Blob replication
• The drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letters and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
[Diagram: a storage account (MovieData) contains tables; table "Movies" holds entities such as Star Wars, Star Trek, Fan Boys, and table "Customers" holds entities such as Brian H. Prince, Jason Argonaut, Bill Gates. Hierarchy: Account → Table → Entity.]
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables: billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available and durable: data is replicated several times
• Familiar and easy-to-use API: WCF Data Services and OData; .NET classes and LINQ; REST – with any platform or language

Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load-balances partitions based on traffic patterns
• Controls entity locality
The system load-balances:
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load-balances to meet your traffic needs
• Or: single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey:
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name:
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue name:
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync
[Diagram: partitions P1 … Pn replicated across Server 1, Server 2, and Server 3.]
Scalability Targets
Storage account:
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition:
• Up to 500 transactions per second
Single blob partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff.
Partitions and Partition Ranges

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

Initially Server A serves the entire table: Movies [Min – Max]. As load grows, the range is split: Server A serves Movies [Min – Comedy) and Server B serves Movies [Comedy – Max].
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load-balanced
• PartitionKey is critical for scalability
Query efficiency and speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics, and reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.

Expect Continuation Tokens – Seriously
A response is capped at:
• A maximum of 1000 rows
• The end of a partition-range boundary
• A maximum of 5 seconds to execute the query
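Handling continuation tokens correctly means looping until the service stops returning one. An illustrative Python sketch, with a fake paged query standing in for the table service:

```python
def query_all(fetch_page):
    """Drain a query by always honoring continuation tokens."""
    results, token = [], None
    while True:
        page, token = fetch_page(token)   # a server may return <= 1000 rows
        results.extend(page)
        if token is None:                 # no token means the query is drained
            return results

# A fake three-page server: each call returns up to 2 rows plus the
# continuation token for the next call (None when exhausted).
data = list(range(5))
def fetch_page(token):
    start = token or 0
    page = data[start:start + 2]
    next_token = start + 2 if start + 2 < len(data) else None
    return page, next_token

rows = query_all(fetch_page)
```

Code that stops after the first page silently drops rows, even when far fewer than 1000 rows exist, because a token can also be issued at a partition-range boundary or the 5-second limit.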
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale:
• Avoid "append only" patterns
• Distribute by using a hash, etc., as a prefix
Always handle continuation tokens:
• Expect continuation tokens for range queries
"OR" predicates are not optimized:
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries:
• On "Server Busy": partitions are load-balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
bull Use a new context for each logical operationbull AddObjectAttachTo can throw exception if entity is already being tracked
bull Point query throws an exception if resource does not exist Use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together; tight coupling leads to brittleness
• This can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML, and are limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
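The work ticket pattern can be sketched in plain Python; the `blobs` dict and `queue.Queue` below are stand-ins for Azure blob storage and an Azure queue, not SDK calls:

```python
import queue

blobs = {}                   # stand-in for blob storage
work_queue = queue.Queue()   # stand-in for an Azure queue (messages are tiny)

def submit_job(job_id, payload):
    blob_ref = f"jobs/{job_id}"
    blobs[blob_ref] = payload        # large payload goes into a blob
    work_queue.put(blob_ref)         # only the small "ticket" is enqueued

def worker():
    ticket = work_queue.get()        # GetMessage
    result = blobs[ticket].upper()   # fetch and "process" the payload
    work_queue.task_done()           # RemoveMessage after success
    return result

submit_job(1, "x" * 100_000)         # far larger than the 8 KB message limit
assert worker() == "X" * 100_000
```

The queue message stays well under the 8 KB limit because it carries only a reference; the heavy data lives in a blob.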
Queue Terminology
Message Lifecycle
A Web Role enqueues work with PutMessage (Msg 1 … Msg 4 sit in the Queue); Worker Roles retrieve messages with GetMessage (with a visibility timeout) and delete them with RemoveMessage once processing succeeds.
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
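The truncated exponential back-off rule sketched in Python (the 1 s floor and 60 s cap are illustrative):

```python
def next_poll_interval(current, got_message, lo=1.0, hi=60.0):
    """Truncated exponential back-off: an empty poll doubles the interval
    (capped at `hi`); a successful poll resets it to `lo`."""
    return lo if got_message else min(current * 2, hi)

interval = 1.0
history = []
for got_message in [False, False, False, True, False]:
    interval = next_poll_interval(interval, got_message)
    history.append(interval)

assert history == [2.0, 4.0, 8.0, 1.0, 2.0]
```

This keeps per-transaction polling costs down on an idle queue while still reacting quickly once messages start arriving.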
Removing Poison Messages
Producers (P1, P2) enqueue messages; consumers (C1, C2) process them.
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible again 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible again 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
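The poison-message handling shown above – retry until the dequeue count exceeds a threshold, then remove the message – sketched in pure Python (the message shape and `MAX_DEQUEUE` value are illustrative, not the Azure SDK):

```python
import queue

MAX_DEQUEUE = 3          # threshold on a message's dequeue count
poison = []              # diverted messages for offline inspection

def get_message(q):
    msg = q.get()
    msg["dequeue_count"] += 1        # the service tracks DequeueCount for you
    return msg

def handle(q, msg):
    if msg["dequeue_count"] > MAX_DEQUEUE:
        poison.append(msg)           # give up: remove the poison message
        return
    try:
        raise RuntimeError("crash")  # a message this consumer can never process
    except RuntimeError:
        q.put(msg)                   # visibility timeout expires, msg reappears

q = queue.Queue()
q.put({"body": "bad", "dequeue_count": 0})
while not q.empty():
    handle(q, get_message(q))

assert len(poison) == 1 and poison[0]["dequeue_count"] == MAX_DEQUEUE + 1
```

Without the threshold, a message that always crashes its consumer would circulate forever.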
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
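A language-agnostic sketch of the same two patterns (the deck's examples assume .NET 4's Task Parallel Library; here Python's `concurrent.futures` stands in):

```python
from concurrent.futures import ThreadPoolExecutor

def expensive(n):            # stand-in for real CPU or I/O work
    return n * n

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: the same operation mapped over a collection.
    results = list(pool.map(expensive, range(10)))

    # Task parallelism: independent tasks scheduled side by side.
    f1 = pool.submit(sum, range(100))
    f2 = pool.submit(max, range(100))

assert results == [n * n for n in range(10)]
assert (f1.result(), f2.result()) == (4950, 99)
```

Either way, the goal is the same as on Azure: keep all the cores of the VM you are already paying for busy.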
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure / poor user experience due to not having excess capacity, and the costs of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing – they help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places.
Sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
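As a rough illustration of point 1, Python's gzip module shows how well repetitive markup compresses (the sample page below is made up):

```python
import gzip

# A toy HTML page with the kind of repetition real markup has.
html = b"<html><body>" + b"<p>hello world</p>" * 500 + b"</body></html>"

packed = gzip.compress(html)
ratio = len(packed) / len(html)
print(f"{len(html)} -> {len(packed)} bytes ({ratio:.1%})")

# Repetitive markup compresses very well; round-trips losslessly.
assert len(packed) < len(html) // 10
assert gzip.decompress(packed) == html
```

The compute cost of compressing is usually tiny next to the bandwidth saved on every response.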
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST): needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
  • split the input sequences
  • query partitions in parallel
  • merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations: batch job management; task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out into many BLAST tasks, which a merging task then joins.
Leverage the multi-core capability of one instance:
• argument "-a" of NCBI-BLAST
• 1/2/4/8 for the small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
• Best practice: do test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting in case of instance failure
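The split/join pattern above in miniature (pure Python; `blast_task` is a stand-in for invoking NCBI-BLAST on one partition, and the partition size of 100 is illustrative):

```python
def split(sequences, partition_size=100):
    """Splitting task: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    """Stand-in for running NCBI-BLAST over one partition of queries."""
    return [f"hit:{seq}" for seq in partition]

def merge(results):
    """Merging task: join the per-partition results back together."""
    return [hit for part in results for hit in part]

seqs = [f"seq{i}" for i in range(250)]
partitions = split(seqs)                         # 3 partitions: 100 + 100 + 50
hits = merge(blast_task(p) for p in partitions)
assert len(partitions) == 3 and len(hits) == 250
```

In AzureBLAST each `blast_task` call becomes a queue message picked up by a worker instance, so the partitions run in parallel.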
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
• Web Role: web portal and web service for job registration
• Job Management Role: job scheduler and scaling engine; accepted jobs are recorded in the job registry (an Azure Table) and tasks are dispatched through a global dispatch queue
• Worker roles: execute the task flow (splitting task → BLAST tasks → merging task)
• Database-updating role: keeps the NCBI databases current
• Azure Blob: NCBI databases, BLAST databases, temporary data, etc.
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored in the job registry table:
• Fault tolerance – avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (42 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances occur, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g. task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups – this is an update domain (~6 nodes in one group, ~30 mins apart).
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2); Δ = rate of change of saturation specific humidity with air temperature (Pa K-1); λv = latent heat of vaporization (J/g); Rn = net radiation (W m-2); cp = specific heat capacity of air (J kg-1 K-1); ρa = dry air density (kg m-3); δq = vapor pressure deficit (Pa); ga = conductivity of air (inverse of ra) (m s-1); gs = conductivity of plant stoma, air (inverse of rs) (m s-1); γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
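A sketch evaluating the Penman-Monteith formula above; the input values are illustrative, not measured data, and λv is expressed in J/kg for unit consistency:

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2.45e6):
    """ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)."""
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Illustrative mid-day values: Δ=145 Pa/K, Rn=400 W/m², ρa=1.2 kg/m³,
# cp=1005 J/(kg·K), δq=1000 Pa, ga=0.02 m/s, gs=0.01 m/s
et = penman_monteith_et(145.0, 400.0, 1.2, 1005.0, 1000.0, 0.02, 0.01)
assert et > 0.0   # water is being released, not absorbed

# Sanity check: more net radiation should mean more evapotranspiration.
assert penman_monteith_et(145.0, 800.0, 1.2, 1005.0, 1000.0, 0.02, 0.01) > et
```

In the MODISAzure pipeline this single formula is evaluated per pixel per day, which is what turns a simple equation into a big data-reduction problem.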
ET Synthesizes Imagery, Sensors, Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Pipeline flow: scientists submit requests through the AzureMODIS Service Web Role Portal (request queue); source imagery is fetched from the download sites (download queue) in the data collection stage, then passes through the reprojection stage (reprojection queue), the derivation reduction stage (reduction 1 queue), and the analysis reduction stage (reduction 2 queue), with source metadata tracked throughout; science results are available for scientists to download.
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door:
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role:
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
Flow: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role (GenericWorker):
• Dequeues tasks created by the Service Monitor from the <PipelineStage> Task Queue
• Retries failed tasks 3 times
• Maintains all task status
• Reads from <Input>Data Storage
Example Pipeline Stage: Reprojection Service
A Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus (each entity specifies a single reprojection job request), parses and persists ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e. a single tile), and dispatches to the Task Queue consumed by GenericWorker (Worker Role) instances.
Supporting tables and storage:
• SwathGranuleMeta – query this table to get geo-metadata (e.g. boundaries) for each swath tile
• ScanTimeList – query this table to get the list of satellite scan times that cover a target tile
• Swath Source Data Storage and Reprojection Data Storage hold the source and reprojected tiles
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage figures:
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Service Management API
• REST-based API to manage your services
• X.509 certs for authentication
• Lets you create, delete, change, upgrade, swap, …
• Lots of community- and MSFT-built tools around the API – easy to roll your own
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced
Durable Storage at Massive Scale
• Blob – massive files, e.g. videos, logs
• Drive – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface:
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – default; will require the account key to access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block blob:
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page blob:
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in "blocks"; each block has an ID
• Then commit those blocks in any order into a blob
• The final blob is limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
Example: Big.mpg assembled from blocks 1-8 (shown uploading as 1 6 8 3 5 4 7 2) and committed as Big.mpg.
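A toy model of block upload and commit (pure Python, not the storage API): blocks can arrive in any order; the committed block list defines the blob:

```python
import hashlib

blocks = {}   # uncommitted block store (real uncommitted blocks are GC'd after a week)

def put_block(data: bytes) -> str:
    block_id = hashlib.md5(data).hexdigest()   # any unique block ID works
    blocks[block_id] = data
    return block_id

def put_block_list(block_ids) -> bytes:
    # Committing assembles the blob in the order of the ID list,
    # regardless of the order the blocks were uploaded in.
    return b"".join(blocks[b] for b in block_ids)

payload = b"0123456789" * 8
chunks = [payload[i:i + 16] for i in range(0, len(payload), 16)]
ids = [put_block(c) for c in reversed(chunks)]   # upload out of order
blob = put_block_list(reversed(ids))             # commit in the right order
assert blob == payload
```

Separating upload from commit is what makes parallel and resumable uploads of large streaming files possible.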
Pages
• Similar to block blobs
• Optimized for random read/write operations, and provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap them with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table Name: Movies – entities: Star Wars, Star Trek, Fan Boys
• Table Name: Customers – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Hierarchy: Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables:
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable: data is replicated several times
• Familiar and easy-to-use API:
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is Not Relational
Can not:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
• Controls entity locality
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
"Server Busy":
• Use exponential back-off on "Server Busy"
• The system load balances to meet your traffic needs
• Single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey (entities with the same PartitionKey value are served from the same partition):

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name (every blob and its snapshots are in a single partition):

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue name (all messages for a single queue belong to the same partition):

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Each partition P1, P2, …, Pn is replicated across Server 1, Server 2, and Server 3.)
Scalability Targets
Storage Account:
• Capacity – up to 100 TBs
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition:
• Up to 500 transactions per second
Single Blob Partition:
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential back-off.
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Partitions and Partition Ranges
Server B: Table = Movies [Comedy - Max]
Server A: Table = Movies [Min - Comedy)
Server A: Table = Movies [Min - Max]
Key Selection: Things to Consider

Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability

Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient

Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
A query can return a continuation token:
• At a maximum of 1,000 rows in a response
• At the end of a partition range boundary
• At a maximum of 5 seconds of query execution
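Because a token can arrive for any of the three reasons above — even with an empty page of results — the drain loop should key off the token, not off whether rows came back. A minimal sketch in Python; `query_page` is a placeholder for one round trip against the Table service, returning `(rows, next_token)` with `next_token` as `None` when the result set is exhausted:

```python
def query_all(query_page):
    """Drain a table query that may return continuation tokens.

    `query_page(token)` performs one request and returns (rows, next_token).
    Loop until the service stops handing back a token; a page may be empty
    and still carry a token (e.g. at a partition range boundary).
    """
    token = None
    results = []
    while True:
        rows, token = query_page(token)
        results.extend(rows)
        if token is None:
            return results
```

The same loop shape applies whether the token is carried in response headers (REST) or surfaced by a client library.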
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

Select a PartitionKey and RowKey that help scale; avoid "append only" patterns
• Distribute by using a hash, etc., as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server busy" means partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness; loose coupling via queues can aid scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML and are limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle

[Diagram: a Web Role calls PutMessage to add Msg 1–4 to the queue; Worker Roles call GetMessage (with a visibility timeout) to receive messages, and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x (up to a cap), and a successful poll resets the interval back to the minimum.
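The back-off rule above can be sketched in a few lines of Python. This is a minimal illustration, not Azure SDK code; the interval constants and the `queue_client`/`handle` interfaces are placeholders for your own queue wrapper around the REST GetMessage call:

```python
import time

MIN_INTERVAL = 1.0   # seconds; reset here after a successful poll
MAX_INTERVAL = 60.0  # truncation cap for the back-off

def next_interval(current, got_message):
    """Next polling delay: double on an empty poll (capped), reset on success."""
    if got_message:
        return MIN_INTERVAL
    return min(current * 2, MAX_INTERVAL)

def poll(queue_client, handle, polls):
    """Drive `polls` iterations of truncated exponential back-off polling."""
    interval = MIN_INTERVAL
    for _ in range(polls):
        msg = queue_client.get_message()  # returns a message or None
        if msg is not None:
            handle(msg)
        interval = next_interval(interval, msg is not None)
        if msg is None:
            time.sleep(interval)  # back off only when the queue was empty
```

The truncation cap matters: without it, a long-idle queue would push the polling interval out indefinitely and make the first new message wait arbitrarily long.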
Removing Poison Messages

[Diagram: producers P1 and P2 feed queue Q; consumers C1 and C2 dequeue with a 30-second visibility timeout]

1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
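The final step — deleting a message whose dequeue count has crossed a threshold — is the poison-message guard. A hedged Python sketch of a consumer with that guard; `msg.dequeue_count`, `queue.delete`, and `dead_letter` are placeholder interfaces for your own wrapper (Azure's Queue REST API surfaces the count as `x-ms-dequeue-count`):

```python
POISON_THRESHOLD = 3  # matches the DequeueCount > 2 check in the scenario above

def process_queue_message(msg, queue, handle, dead_letter):
    """Consume one message with poison-message protection.

    Returns True if the message was processed, False if it was removed
    as poison. `handle` may raise; the message then simply becomes
    visible again when its timeout expires.
    """
    if msg.dequeue_count >= POISON_THRESHOLD:
        # The message has already crashed earlier consumers; stash it
        # for offline inspection instead of letting it poison the queue.
        dead_letter(msg)
        queue.delete(msg)
        return False
    handle(msg)          # do the actual work
    queue.delete(msg)    # delete only after successful processing
    return True
```

Deleting only after `handle` succeeds is what gives the at-least-once guarantee; the threshold check is what keeps "at least once" from becoming "forever" for a message that always crashes its consumer.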
Queues Recap
• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Use a blob to store message data, with a reference in the message – for messages > 8 KB; batch messages; garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts, measure, and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up the CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
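The data-parallel idea — split the input and map the same function over the pieces, one unit of concurrency per core — is what the Task Parallel Library's `Parallel.For`/PLINQ give you in .NET. A language-neutral sketch of the same pattern in Python (the chunking scheme and `process_chunk` workload are illustrative choices, not anything prescribed by the slide):

```python
from concurrent.futures import ThreadPoolExecutor
import os

def process_chunk(chunk):
    # Stand-in for real per-item work (parsing, transformation, I/O, ...)
    return sum(x * x for x in chunk)

def parallel_sum_of_squares(data, workers=None):
    """Data parallelism: partition the input and run the same function
    over the partitions concurrently, roughly one worker per core."""
    workers = workers or os.cpu_count() or 2
    size = max(1, len(data) // workers)
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(process_chunk, chunks))  # join the partial results
```

As the slide warns, oversubscription cuts both ways: more workers than cores helps I/O-bound chunks but only adds scheduling overhead for CPU-bound ones.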
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

[Diagram: uncompressed content → (Gzip; minify JavaScript; minify CSS; minify images) → compressed content]
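To see why point 1 pays for itself, it helps to measure: markup and script are highly repetitive, so gzip routinely shrinks them by an order of magnitude. A self-contained illustration with Python's standard `gzip` module (the sample page content is made up for the demonstration):

```python
import gzip

def gzip_payload(text: str) -> bytes:
    """Gzip a response body, trading a little CPU for bandwidth."""
    return gzip.compress(text.encode("utf-8"))

# Repetitive content (HTML, JSON, JavaScript) compresses dramatically.
page = "<div class='row'>item</div>\n" * 1000
compressed = gzip_payload(page)
ratio = len(compressed) / len(page.encode("utf-8"))  # compressed/original size
```

In a real role you would set `Content-Encoding: gzip` (or let IIS dynamic compression do it) rather than compress by hand, but the bandwidth arithmetic is the same.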
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• One of the most important pieces of software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially: GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST): needs special result-reduction processing
Large volumes of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak demand on storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query the partitions in parallel
  • Merge the results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.

AzureBLAST Task Flow: a simple split/join pattern

[Diagram: a splitting task fans out to many parallel BLAST tasks, whose outputs feed a merging task]

Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST: 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: do test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of an instance failure
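The split/join task flow can be sketched compactly. This is an illustrative Python sketch, not the AzureBLAST code: `blast_one_partition` stands in for a worker role consuming one partition's queue message and running NCBI-BLAST on it, and the merge step is a plain concatenation of per-partition hit lists:

```python
def split_sequences(sequences, partition_size):
    """Query segmentation: fixed-size partitions of the input sequences.
    (The micro-benchmarks below settled on ~100 sequences per partition.)"""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def run_blast_job(sequences, blast_one_partition, partition_size=100):
    """Split/join driver for a query-segmentation BLAST job.

    In AzureBLAST each partition becomes a queue message picked up by a
    worker role; here the fan-out is simulated by a sequential loop over
    `blast_one_partition`, a placeholder for the real BLAST invocation.
    """
    partitions = split_sequences(sequences, partition_size)   # split
    results = [blast_one_partition(p) for p in partitions]    # map
    return [hit for part in results for hit in part]          # join/merge
```

Because query segmentation never splits the database, each worker can keep a local copy of it warm across partitions — the source of the cache effect noted in the benchmarks below.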
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST

[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; a global dispatch queue feeds the Worker roles; a Database Updating Role refreshes the NCBI databases; an Azure Table holds the Job Registry; Azure Blob storage holds the BLAST databases, temporary data, etc.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
The accepted job is stored in the Job Registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (~700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
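As a quick sanity check, the quoted minute count converts to single-machine years directly (a back-of-the-envelope calculation, using a 365-day year):

```python
minutes = 3_216_731                   # single-desktop estimate from the sampling runs
years = minutes / (60 * 24 * 365)     # minutes -> hours -> days -> years
# roughly 6.1 years of continuous compute on one machine
```

which is what makes the experiment infeasible without the kind of fan-out described next.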
Our Approach
• Allocated a total of ~4,000 cores: 475 extra-large VMs (8 cores per VM), across four datacenters — US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances occurred, the load was redistributed manually

End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record looks like:

3/31/2010 6:14  RD00155D3611B0  Executing the task 251523...
3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0  Executing the task 251553...
3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0  Executing the task 251600...
3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):

3/31/2010 8:22  RD00155D3611B0  Executing the task 251774...
3/31/2010 9:50  RD00155D3611B0  Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0  Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
• All 62 compute nodes lost tasks and then came back in groups — this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed before the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain was at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" — Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration — evaporation through plant membranes — by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
• ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m⁻²)
• cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
• ρa = dry air density (kg m⁻³)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s⁻¹)
• gs = conductivity of plant stoma air (inverse of rs) (m s⁻¹)
• γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
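Per pixel, the Penman-Monteith formula itself is a one-liner; the pipeline's work is assembling its inputs at scale. A direct transcription in Python, using the symbol list from the slide — the default values for γ and λv are common textbook approximations, not values taken from the MODISAzure pipeline:

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2.45e3):
    """Penman-Monteith evapotranspiration:

        ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)

    delta: d(saturation specific humidity)/dT (Pa/K); r_n: net radiation (W/m^2);
    rho_a: dry air density (kg/m^3); c_p: specific heat of air (J/(kg K));
    dq: vapor pressure deficit (Pa); g_a, g_s: conductivities (m/s);
    gamma ~ 66 Pa/K; lambda_v: latent heat of vaporization (J/g).
    """
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

Note how ga appears in both numerator and denominator: ET does not grow without bound with air conductivity, which is one reason the conductivity estimates the slide calls "tricky" matter so much.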
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal to a Request Queue; a Download Queue drives the Data Collection Stage against the source imagery download sites; a Reprojection Queue drives the Reprojection Stage; Reduction 1 and Reduction 2 Queues drive the Derivation and Analysis Reduction Stages; science results are downloaded by scientists]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks — recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

[Diagram: the Generic Worker (Worker Role) pulls from the <PipelineStage> Task Queue and reads <Input> Data Storage; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus]
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request flows to the Service Monitor (Worker Role), which persists ReprojectionJobStatus — each entity specifies a single reprojection job request — and parses and persists ReprojectionTaskStatus — each entity specifies a single reprojection task, i.e. a single tile; Generic Workers (Worker Roles) dispatch from the Task Queue and read Swath Source Data Storage; the SwathGranuleMeta table is queried for geo-metadata (e.g. boundaries) for each swath tile; the ScanTimeList table is queried for the list of satellite scan times that cover a target tile]
Costs for 1 US Year ET Computation
• Computational costs are driven by the data scale and the need to run reductions multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage                  Data volume                        Compute                       Cost
Data collection        400–500 GB, 60K files, 10 MB/sec   11 hours, <10 workers         $50 upload, $450 storage
Reprojection           400 GB, 45K files                  3500 hours, 20–100 workers    $420 CPU, $60 download
Derivation reduction   5–7 GB, 55K files                  1800 hours, 20–100 workers    $216 CPU, $1 download, $6 storage
Analysis reduction     <10 GB, ~1K files                  1800 hours, 20–100 workers    $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers

Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
The Secret Sauce – The Fabric
The Fabric is the 'brain' behind Windows Azure:
1. Process the service model
   1. Determine resource requirements
   2. Create role images
2. Allocate resources
3. Prepare nodes
   1. Place role images on nodes
   2. Configure settings
   3. Start roles
4. Configure load balancers
5. Maintain service health
   1. If a role fails, restart the role based on policy
   2. If a node fails, migrate the role based on policy
Storage: Replicated, Highly Available, Load Balanced

Durable Storage, At Massive Scale
• Blobs – massive files, e.g. videos, logs
• Drives – use standard file system APIs
• Tables – non-relational, but with few scale limits; use SQL Azure for relational data
• Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
  • PutBlob – inserts a new blob, overwrites the existing blob
  • GetBlob – get a whole blob or a specific range
  • DeleteBlob
  • CopyBlob
  • SnapshotBlob
  • LeaseBlob
• Each blob has an address:
  • http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
  • e.g. http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Unlimited capacity
• Can only contain blobs
Each container has an access level:
• Private – the default; requires the account key for access
• Full public read
• Public read only
Two Types of Blobs Under the Hood
Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks; each block is identified by a Block ID
• Size limit: 200 GB per blob
Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages; each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'; each block has an ID
• Then commit those blocks, in any order, into a blob
• The final blob is limited to 200 GB and up to 50,000 blocks
• You can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming

[Diagram: blocks 1, 6, 8, 3, 5, 4, 7, 2 of Big.mpg are uploaded, then committed in order into the final blob]
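The upload-then-commit protocol maps onto two REST operations, Put Block and Put Block List. A hedged sketch that only *constructs* the request URL and commit body (authentication headers and the actual HTTP send are omitted, and the 8-digit block-ID scheme is an arbitrary choice; block IDs just need to be base64-encoded and of equal length within a blob):

```python
import base64
from urllib.parse import quote

def block_id(index: int) -> str:
    """Base64 block ID; all IDs in one blob must have the same length."""
    return base64.b64encode(f"{index:08d}".encode("ascii")).decode("ascii")

def put_block_url(blob_url: str, index: int) -> str:
    """URL for the Put Block operation (upload one block of data)."""
    return f"{blob_url}?comp=block&blockid={quote(block_id(index), safe='')}"

def block_list_body(n_blocks: int) -> str:
    """Body for Put Block List: commits the uploaded blocks, in the order
    listed here, into the final blob."""
    latest = "".join(f"<Latest>{block_id(i)}</Latest>" for i in range(n_blocks))
    return ('<?xml version="1.0" encoding="utf-8"?>'
            f"<BlockList>{latest}</BlockList>")
```

Because the commit is a separate step, a client can upload blocks in parallel and out of order, then impose the final ordering in the block list — the behavior the diagram above illustrates.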
Pages
• Similar to block blobs; optimized for random read/write operations, with the ability to write to a range of bytes in a blob
• Call Put Blob to set the maximum size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
  • Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount the Page Blob http://<accountname>.blob.core.windows.net/<containername>/<blobname> as X:
  • All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists, as a Page Blob, even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives: the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap access with a service
• Have a strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure

Account: MovieData
• Table Name: Movies — entities: Star Wars, Star Trek, Fan Boys
• Table Name: Customers — entities: Brian H. Prince, Jason Argonaut, Bill Gates

Account → Table → Entity
Tables store entities. Entity schema can vary within the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available and durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language

Is Not Relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• The system load balances partitions based on traffic patterns
• Controls entity locality
The partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
The system load balances
• Use exponential backoff on "Server Busy"
• "Server Busy" means either the system is load balancing to meet your traffic needs, or single-partition limits have been reached
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId)   RowKey (RowKind)        Name           CreditCardNumber      OrderTotal
1                           Customer-John Smith     John Smith     xxxx-xxxx-xxxx-xxxx
1                           Order – 1                                                    $35.12
2                           Customer-Bill Johnson   Bill Johnson   xxxx-xxxx-xxxx-xxxx
2                           Order – 3                                                    $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition
bull All messages for a single queue belong to the same partitionMessages ndash Queue Name
Container Name Blob Name
image annarborbighousejpg
image foxboroughgillettejpg
video annarborbighousejpg
Queue Message
jobs Message1
jobs Message2
workflow Message1
Replication Guarantee
bull All Azure Storage data exists in three replicasbull Replicas are created as neededbull A write operation is not complete until it has
written to all three replicasbull Reads are only load balanced to replicas in
syncServer 1 Server 2 Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
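The backoff guidance above reduces to simple retry logic. A minimal Python sketch (not Azure SDK code; `ServerBusyError` is a stand-in for an HTTP 503 "Server Busy" response):

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for an HTTP 503 'Server Busy' response."""

def with_backoff(request, max_retries=6, base_delay=0.5, max_delay=30.0):
    """Retry a request, backing off exponentially on 'Server Busy'."""
    for attempt in range(max_retries):
        try:
            return request()
        except ServerBusyError:
            # Sleep base * 2^attempt, capped at max_delay, with jitter
            # so many clients do not retry in lockstep
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))
    return request()  # final attempt; any error now propagates
```

The jitter and the cap (truncation) keep a burst of throttled clients from hammering a busy partition in synchronized waves.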
Partitions and Partition Ranges

Server A: Table = Movies [Min – Max]

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After the partition range splits across two servers:

Server A: Table = Movies [Min – Comedy)

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
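Handling continuation tokens amounts to a drain loop. A Python sketch over a hypothetical `query_page(token)` function that returns up to 1000 rows plus an optional token (the real Table service passes tokens via response headers):

```python
def query_all(query_page):
    """Drain a paged query: keep requesting until no continuation token remains.

    query_page(token) returns (rows, next_token); next_token is None on the last page.
    """
    token = None
    results = []
    while True:
        rows, token = query_page(token)
        results.extend(rows)   # a page holds at most 1000 rows
        if token is None:      # no token means the query is complete
            return results

def make_source(pages):
    """Simulated paged source: tokens are just indexes into a page list."""
    def query_page(token):
        i = 0 if token is None else token
        next_token = i + 1 if i + 1 < len(pages) else None
        return pages[i], next_token
    return query_page
```

Note that a token can arrive even when fewer than 1000 rows were returned (partition boundary, 5-second limit), which is why the loop tests only the token, never the row count.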
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement back-off strategy for retries
• Server busy
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
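One way to avoid an append-only hot partition is the hash prefix suggested above: prepend a short, stable hash bucket to the natural key so writes spread across partitions. An illustrative sketch (the bucket count of 16 is an arbitrary choice):

```python
import hashlib

def prefixed_partition_key(natural_key, buckets=16):
    """Spread write load by prefixing the natural key with a stable hash bucket."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    # Zero-padded so keys still sort predictably within a bucket
    return f"{bucket:02d}_{natural_key}"
```

The trade-off: a range query over the natural key order now has to fan out across all buckets, so this suits append-heavy workloads more than scan-heavy ones.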
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
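The work ticket pattern mentioned above keeps queue messages small: the message carries only a reference (the ticket) to the real payload, which lives in blob storage. A Python sketch with in-memory stand-ins for the queue and the blob store:

```python
import json
import queue
import uuid

blob_store = {}             # stand-in for blob storage
work_queue = queue.Queue()  # stand-in for an Azure queue

def submit_work(payload: bytes) -> str:
    """Store the (possibly large) payload in a blob; enqueue a small work ticket."""
    blob_name = str(uuid.uuid4())
    blob_store[blob_name] = payload
    ticket = json.dumps({"blob": blob_name})  # well under the 8 KB message limit
    work_queue.put(ticket)
    return blob_name

def process_next() -> bytes:
    """Worker side: dequeue a ticket, then fetch the payload it points to."""
    ticket = json.loads(work_queue.get())
    return blob_store[ticket["blob"]]
```

This is also the standard answer to "messages > 8 KB": the queue moves tickets, the blob store moves data.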
Queue Terminology

Message Lifecycle
[diagram: a Web Role calls PutMessage to add Msg 1–4 to the queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve Msg 1 and Msg 2, process them, and then call RemoveMessage]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll resets the interval back to 1
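The polling scheme above (double on empty, reset on success, truncate at a ceiling) fits in one function:

```python
def next_poll_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off polling.

    Empty poll -> double the interval (up to ceiling).
    Successful poll -> reset the interval back to the floor.
    """
    if got_message:
        return floor
    return min(ceiling, current * 2)
```

The ceiling bounds worst-case latency for a newly arrived message, while the doubling keeps an idle worker from burning transactions (and money) polling an empty queue.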
Removing Poison Messages
[diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them; each message carries a dequeue count]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1 (dequeue count now 2)
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1 (dequeue count now 3)
12. DequeueCount > 2
13. DeleteMessage(Q, msg 1): the poison message is removed
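The walkthrough above condenses into worker logic that checks the dequeue count before processing. A Python sketch (the `Message` class mirrors the queue service tracking a per-message dequeue count; the threshold of 2 matches the example):

```python
MAX_DEQUEUE = 2   # from the example: DequeueCount > 2 means poison

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def handle(msg, process, dead_letter):
    """Process a message; divert it if it has already failed too many times."""
    msg.dequeue_count += 1            # the service increments this on each dequeue
    if msg.dequeue_count > MAX_DEQUEUE:
        dead_letter.append(msg.body)  # take the poison message out of circulation
        return "dead-lettered"
    process(msg.body)
    return "processed"
```

Without this check, a message whose processing always crashes the worker would reappear after every visibility timeout, forever.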
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order delivery
Use dequeue count to remove poison messages
• Enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message
• Messages > 8 KB
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each barely using its CPU
• Balance using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
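In .NET 4 those two styles are what the Task Parallel Library offers; the same distinction looks like this in Python's `concurrent.futures` (an analogous sketch, not TPL itself):

```python
from concurrent.futures import ThreadPoolExecutor

def square(x):
    return x * x

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: the same operation applied across a collection
    squares = list(pool.map(square, range(8)))

    # Task parallelism: independent, heterogeneous tasks running concurrently
    f1 = pool.submit(sum, range(100))
    f2 = pool.submit(max, [3, 1, 4, 1, 5])
    totals = (f1.result(), f2.result())
```

The pool size plays the role the slide assigns to core count: more outstanding work than workers just queues up.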
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs
Performance vs. cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content]
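Point 1 is cheap to verify with the standard library: gzip typically shrinks repetitive HTML output dramatically, and the browser reconstructs it on the fly.

```python
import gzip

html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)        # what the server would send with Content-Encoding: gzip
restored = gzip.decompress(compressed)  # what the browser reconstructs

assert restored == html
ratio = len(compressed) / len(html)     # repetitive markup compresses very well
```

For bandwidth billing, what matters is `len(compressed)`: every byte not sent is a byte not paid for.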
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes mean the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
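The query-segmentation pattern reduces to split/scatter/merge. A Python sketch of the data flow only; `blast_partition` is a stand-in for a worker running NCBI-BLAST over one partition of queries:

```python
def split(sequences, partition_size):
    """Splitting task: break the input sequences into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    """Stand-in for a worker running BLAST over one partition of queries."""
    return [f"hits({seq})" for seq in partition]

def merge(partial_results):
    """Merging task: concatenate per-partition results in input order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged

# Scatter: in AzureBLAST each partition becomes a queue task for a worker
results = merge([blast_partition(p) for p in split(["s1", "s2", "s3"], 2)])
```

Because each query is independent, the only coordination points are the split at the front and the merge at the back, which is what makes the pattern "pleasingly parallel".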
AzureBLAST Task-Flow
A simple split/join pattern
[diagram: a splitting task fans out into many BLAST tasks, followed by a merging task]

Leverage multi-core within one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data transfer overhead
Best practice: test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[architecture diagram: a Web Role hosts the web portal and web service; a Job Management Role runs job registration, the job scheduler, and the scaling engine against the job registry in Azure Tables; tasks flow through a global dispatch queue to Worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a database updating role refreshes the NCBI databases; work follows the splitting task → BLAST tasks → merging task flow]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
[diagram: the web portal and web service feed job registration; the job scheduler and scaling engine work from the job registry]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[diagram: deployments of 50 and 62 instances spread across the datacenters]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g. a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe Data Center: in total, 34,256 tasks processed
[chart: all 62 compute nodes lost tasks and then came back in a group; this is an update domain]
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·δq·ga) / (λv·(Δ + γ·(1 + ga/gs)))

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
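As a sanity check on the formula, the Penman-Monteith terms plug in directly. A sketch with made-up illustrative inputs (unit handling is simplified; λv defaults to ~2450 J/g, a standard value for the latent heat of vaporization):

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lambda_v=2450.0):
    """ET = (Delta*Rn + rho_a*cp*dq*ga) / (lambda_v*(Delta + gamma*(1 + ga/gs)))"""
    numerator = delta * Rn + rho_a * cp * dq * ga
    denominator = lambda_v * (delta + gamma * (1.0 + ga / gs))
    return numerator / denominator

# Illustrative (made-up) inputs: Delta, Rn, rho_a, cp, dq, ga, gs
et = penman_monteith(145.0, 400.0, 1.2, 1005.0, 800.0, 0.02, 0.01)
```

The point of the pipeline is that ga and gs, the conductivities, are the hard part: they must be derived from imagery, sensor, and field data for every cell of the catchment.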
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[pipeline diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a request queue and download queue drive the data collection stage, which pulls from source imagery download sites using source metadata; a reprojection queue drives the reprojection stage; reduction 1 and reduction 2 queues drive the derivation and analysis reduction stages; scientific results are then downloaded]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
[diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Worker (Worker Role) instances dequeue tasks and read from <Input>Data Storage]
Example Pipeline Stage: Reprojection Service
[diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus and dispatches to the Task Queue; Generic Worker (Worker Role) instances execute tasks against Reprojection Data Storage, drawing on Swath Source Data Storage]
• Each entity in the job status table specifies a single reprojection job request
• Each entity in the task status table specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage (per the pipeline diagram) | Data | Compute | Cost
Data collection | 400-500 GB, 60K files, 10 MB/sec | 11 hours, <10 workers | $50 upload, $450 storage
Reprojection | 400 GB, 45K files | 3500 hours, 20-100 workers | $420 CPU, $60 download
Derivation reduction | 5-7 GB, 55K files | 1800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis reduction | <10 GB, ~1K files | 1800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage

Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault-Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
Storage: Replicated, Highly Available, Load Balanced

Durable Storage, At Massive Scale
Blob – massive files, e.g. videos, logs
Drive – use standard file system APIs
Tables – non-relational, but with few scale limits; use SQL Azure for relational data
Queues – facilitate loosely coupled, reliable systems
Blob Features and Functions
• Store large objects (up to 1 TB in size)
• You can have as many containers and blobs as you want
• Standard REST interface
• PutBlob: inserts a new blob, overwrites the existing blob
• GetBlob: get the whole blob or a specific range
• DeleteBlob
• CopyBlob
• SnapshotBlob
• LeaseBlob
• Each blob has an address:
• http://<storageaccount>.blob.core.windows.net/<Container>/<BlobName>
• http://movieconversion.blob.core.windows.net/originals/barga.mpg
Containers
• Similar to a top-level folder
• Has an unlimited capacity
• Can only contain blobs
Each container has an access level:
- Private (default; will require the account key to access)
- Full public read
- Public read only
Two Types of Blobs Under the Hood
Block Blob
• Targeted at streaming workloads
• Each blob consists of a sequence of blocks
• Each block is identified by a Block ID
• Size limit: 200 GB per blob
Page Blob
• Targeted at random read/write workloads
• Each blob consists of an array of pages
• Each page is identified by its offset from the start of the blob
• Size limit: 1 TB per blob
Blocks
• You can upload a file in 'blocks'
• Each block has an ID
• Then commit those blocks in any order into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
[diagram: big.mpg split into blocks 1-8, committed in order into the blob big.mpg]
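The stage-then-commit flow above can be mimicked with a dictionary of uncommitted blocks and an ordered commit list (an in-memory toy; the real service does this via the Put Block and Put Block List operations):

```python
class BlockBlob:
    """Toy model of a block blob: stage blocks by ID, then commit an ordered list."""

    def __init__(self):
        self.uncommitted = {}   # block_id -> bytes; GC'd after a week if never committed
        self.committed = b""

    def put_block(self, block_id, data):
        self.uncommitted[block_id] = data

    def put_block_list(self, block_ids):
        # Blocks may be committed in any order, independent of upload order
        self.committed = b"".join(self.uncommitted[b] for b in block_ids)
        self.uncommitted.clear()

blob = BlockBlob()
for bid, chunk in [("b1", b"big"), ("b3", b".mpg"), ("b2", b"file")]:
    blob.put_block(bid, chunk)            # upload order: b1, b3, b2
blob.put_block_list(["b1", "b2", "b3"])   # commit in logical order
```

Separating upload from commit is what makes parallel and out-of-order uploads safe: nothing is visible until the block list is committed.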
Pages
• Similar to block blobs
• Optimized for random read/write operations; provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
• Provides a durable NTFS volume for Windows Azure applications to use
• Use existing NTFS APIs to access a durable drive
• Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
• Example: mount a Page Blob as X:\
• http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
• Drive made durable through standard Page Blob replication
• The drive persists as a Page Blob even when not mounted
Windows Azure Drive API
• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of a list of the drive letter and Page Blob URLs for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name to be used as a read/writable drive
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account: MovieData
• Table name: Movies – entities: Star Wars, Star Trek, Fan Boys
• Table name: Customers – entities: Brian H. Prince, Jason Argonaut, Bill Gates
Account → Table → Entity
Tables store entities; entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
• Billions of entities (rows) and TBs of data
• Can use thousands of servers as traffic grows
• Highly available & durable
• Data is replicated several times
• Familiar and easy-to-use API
• WCF Data Services and OData
• .NET classes and LINQ
• REST – with any platform or language

Is Not Relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
bull Programming semantics ensure that a message can be processed at least once
bull Access is provided via REST
Storage PartitioningUnderstanding partitioning is key to understanding
performance
bull Different for each data type (blobs entities queues)Every data object has a
partition key
bull A partition can be served by a single serverbull System load balances partitions based on traffic patternbull Controls entity locality
Partition key is unit of scale
bull Load balancing can take a few minutes to kick inbull Can take a couple of seconds for partition to be available on a
different serverSystem load balances
bull Use exponential backoff on ldquoServer Busyrdquobull Our system load balances to meet your traffic needsbull Single partition limits have been reached
Server Busy
Partition Keys In Each Abstraction

Entities: TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order - 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order - 3 | | | $10.00

Blobs: Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages: Queue name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync
[Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage Account
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges

Server A: Table = Movies [Min - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

When traffic grows, the system splits the range across servers:

Server A: Table = Movies [Min - Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity Group Transactions
• Transactions within a single partition
• Transaction semantics reduce round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
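The "distribute load" advice above is often implemented by prepending a stable hash prefix to a natural key, so that hot key ranges spread across partitions. A minimal Python sketch, assuming a hypothetical helper (`prefixed_partition_key` is not part of any Azure SDK; the bucket count is illustrative):

```python
import hashlib

def prefixed_partition_key(natural_key: str, buckets: int = 16) -> str:
    """Spread hot natural keys across `buckets` partitions by
    prepending a stable hash prefix (hypothetical helper, not an
    Azure storage API)."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    # The prefix is deterministic, so point lookups can recompute it.
    return f"{bucket:02d}_{natural_key}"

print(prefixed_partition_key("2010-12-07"))
```

Because the prefix is a pure function of the key, point queries stay cheap; range scans, however, must fan out across all buckets, which is the trade-off this pattern accepts.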
Expect Continuation Tokens. Seriously.
A continuation token is returned when:
• The response reaches the maximum of 1,000 rows
• The query crosses a partition range boundary
• The query takes more than 5 seconds to execute
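A query loop therefore has to keep requesting pages until no token comes back. The Python sketch below shows the shape of that loop; `query_page` is a stand-in for whatever client call your platform exposes, and the fake server simulates the 1,000-row page limit:

```python
def query_all(query_page):
    """Drain a paged table query by following continuation tokens.
    `query_page(token)` is a hypothetical stand-in for a storage
    client call that returns (rows, next_token)."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:          # no continuation token: done
            return rows

# Simulated server: 2,500 rows delivered at most 1,000 at a time.
DATA = list(range(2500))

def fake_page(token):
    start = token or 0
    page = DATA[start:start + 1000]
    nxt = start + 1000 if start + 1000 < len(DATA) else None
    return page, nxt

print(len(query_all(fake_page)))
```

Note that an empty page with a non-empty token is legal (e.g., at a partition range boundary), which is why the loop tests the token, not the page size.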
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Guidance:
• Select PartitionKey and RowKey values that help scale; avoid "append only" patterns, and distribute load by using a hash or similar as a key prefix
• Always handle continuation tokens; expect them for range queries
• "OR" predicates are not optimized; execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "server busy" means the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Loose coupling can aid scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML and are limited to 8 KB in size
  • Commonly used with the work-ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1-4 to the queue; Worker Roles call GetMessage (with a visibility timeout) to receive messages and RemoveMessage to delete them once processed]
PutMessage:
POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:
HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DeleteMessage:
DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x, up to some maximum; a successful poll sets the interval back to 1.
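The policy above fits in a few lines. A minimal Python sketch (the base interval and cap are illustrative values, not service-mandated constants):

```python
def next_interval(current, success, base=1.0, cap=60.0):
    """Truncated exponential backoff for queue polling: double the
    interval on every empty poll, truncate at `cap`, and reset to
    `base` on a successful dequeue."""
    if success:
        return base
    return min(current * 2, cap)

interval = 1.0
for got_message in [False, False, False, True, False]:
    interval = next_interval(interval, got_message)
    print(interval)
```

Doubling with a cap keeps idle workers from hammering the service (and from paying for transactions) while still reacting within one interval once work appears.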
Removing Poison Messages
[Diagram: producers P1 and P2 feed a queue; consumers C1 and C2 dequeue with a 30 s visibility timeout]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
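The dequeue-count check in steps 12-13 is what stops a poison message from being retried forever. A toy Python model of that consumer loop (the queue, handler, and crash are all simulated; a real consumer would use the storage API's DequeueCount property):

```python
MAX_DEQUEUE = 2  # threshold from the slides: give up when DequeueCount > 2

def process_queue(queue, handler, poison):
    """Work-ticket loop (toy model of a queue consumer).
    Each queue entry is [dequeue_count, body]; a handler exception
    models a visibility-timeout expiry, so the message reappears."""
    while queue:
        msg = queue.pop(0)
        msg[0] += 1                      # message dequeued once more
        if msg[0] > MAX_DEQUEUE:
            poison.append(msg[1])        # poison: remove instead of retrying
            continue
        try:
            handler(msg[1])              # success implies DeleteMessage
        except RuntimeError:
            queue.append(msg)            # crash: message becomes visible again

def handler(body):
    if body == "crashes":
        raise RuntimeError("simulated worker crash")

dead = []
work = [[0, "good"], [0, "crashes"]]
process_queue(work, handler, dead)
print(dead)
```

In production the "poison" branch would typically log the message body somewhere durable before deleting it, so a human can diagnose why it never processed.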
Queues Recap
• Make message processing idempotent: then you do not need special failure handling
• Do not rely on order: invisible messages can result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• For messages larger than 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale: dynamically increase or reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
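The size decision reduces to a per-unit-of-work cost comparison once you have measured your scaling efficiency. A back-of-the-envelope sketch in Python (the hourly rates and the 70% efficiency figure are illustrative assumptions, not real Azure prices or measurements):

```python
def cost_per_unit_work(cores, hourly_rate, scaling_efficiency):
    """Relative cost of one unit of work on a VM with `cores` cores.
    `scaling_efficiency` < 1 models sub-linear scaling across cores.
    Rates are illustrative only."""
    throughput = cores * scaling_efficiency
    return hourly_rate / throughput

small = cost_per_unit_work(1, 0.12, 1.0)    # one small instance
xl = cost_per_unit_work(8, 0.96, 0.7)       # 8 cores, 70% efficient
print(small, xl)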
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not fully using its CPU
• Balance using up CPU against having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
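The data-parallel pattern the TPL enables (same operation applied to independent chunks) maps onto any thread-pool API. A Python analogue of the idea, shown only to illustrate the shape of the pattern, not the .NET API:

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(chunk):
    # CPU-light stand-in for real per-chunk work
    return sum(len(line.split()) for line in chunk)

lines = ["the quick brown fox"] * 1000
chunks = [lines[i:i + 250] for i in range(0, len(lines), 250)]

# Data parallelism: the same operation over each chunk, in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    total = sum(pool.map(word_count, chunks))

print(total)
```

Task parallelism is the dual: different operations (e.g., compress, upload, log) run concurrently, each as its own submitted task on the same pool.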
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• It is a trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference depending on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Saving bandwidth costs often leads to savings in other places:
• Sending fewer things over the wire often means getting fewer things from storage
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
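Repetitive markup is exactly where gzip shines, which is why point 1 pays for both bandwidth and storage. A quick Python demonstration of the size reduction on a repetitive HTML fragment:

```python
import gzip

# Repetitive markup compresses extremely well; this is why gzipping
# all text output cuts both bandwidth and storage costs.
html = ("<tr><td>Defiance</td><td>2008</td></tr>\n" * 500).encode("utf-8")
compressed = gzip.compress(html)

ratio = len(compressed) / len(html)
print(f"{len(html)} -> {len(compressed)} bytes ({ratio:.1%})")
```

The exact ratio depends on the content, but table-like HTML routinely compresses by an order of magnitude or more.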
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile inside and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700 to 1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST): needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage load could reach 1 TB (each node pulling its own copy of the database)
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• A parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model: Web Role + Queue + Worker
• With special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow
A simple split/join pattern.
Leverage the multiple cores of one instance:
• The "-a" argument of NCBI-BLAST
• Set to 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
[Diagram: splitting task → BLAST tasks in parallel → merging task]
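The split/join pattern itself is compact. A Python sketch of the shape of AzureBLAST's query segmentation, with `fake_blast` as a hypothetical stand-in for running NCBI-BLAST on one partition (the real system fans tasks out through a queue to worker roles rather than a local pool):

```python
from concurrent.futures import ThreadPoolExecutor

def fake_blast(partition):
    """Hypothetical stand-in for running NCBI-BLAST over one
    partition of the input; returns one 'hit' per sequence."""
    return [f"hit:{seq}" for seq in partition]

def split_join(sequences, partition_size):
    # Split: fixed-size partitions trade load balance against per-task overhead.
    parts = [sequences[i:i + partition_size]
             for i in range(0, len(sequences), partition_size)]
    # Query partitions in parallel (the worker-role fan-out).
    with ThreadPoolExecutor() as pool:
        results = pool.map(fake_blast, parts)
    # Join: merge partial results once all tasks are done.
    return [hit for part in results for hit in part]

merged = split_join([f"seq{i}" for i in range(350)], partition_size=100)
print(len(merged))
```

The `partition_size` knob is exactly the granularity trade-off discussed above: larger partitions mean fewer tasks but worse load balance, smaller ones mean more per-task overhead.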
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size and instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal, Web Service, and job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; Worker Roles pull work from a global dispatch queue; Azure Tables hold the Job Registry and NCBI database metadata; Azure Blobs hold the BLAST databases, temporary data, etc.; a Database Updating Role refreshes the NCBI databases. Task flow: splitting task → BLAST tasks in parallel → merging task]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track job status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored in the job registry table:
• Fault tolerance: avoid in-memory state
[Diagram: Job Portal → Web Service → job registration → Job Registry table; the Job Scheduler and Scaling Engine process registered jobs]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When loads became imbalanced, the load was redistributed manually

End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
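Spotting the anomaly programmatically is a set difference: any task ID with an "Executing" line but no matching "is done" line never completed. A minimal Python sketch over the log format shown above:

```python
import re

LOG = """3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
"""

def unfinished_tasks(log):
    """Task IDs that logged 'Executing' but never 'is done' - the
    signature of a failed or interrupted instance."""
    started = set(re.findall(r"Executing the task (\d+)", log))
    done = set(re.findall(r"Execution of task (\d+) is done", log))
    return started - done

print(unfinished_tasks(LOG))
```

Run over the full experiment logs, the same two-regex pass is what surfaces the update-domain and fault-domain incidents discussed next.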
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups; this is an update domain at work:
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain was at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.

Penman-Monteith (1964):

ET = (Δ Rn + ρa cp (δq) ga) / ((Δ + γ (1 + ga/gs)) λv)

where:
• ET = water volume evapotranspired (m^3 s^-1 m^-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K^-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m^-2)
• cp = specific heat capacity of air (J kg^-1 K^-1)
• ρa = dry air density (kg m^-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s^-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s^-1)
• γ = psychrometric constant (γ ≈ 66 Pa K^-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
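The formula is trivial to evaluate once the inputs are in hand; the hard part, as the slide notes, is estimating the conductivities. A direct Python transcription (symbol names mirror the slide; the default γ and λv values and the sample inputs in the test are illustrative only):

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET as written on the slide:
    ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    Defaults: gamma ~66 Pa/K (psychrometric constant), lambda_v ~2450 J/g
    (latent heat of vaporization); both are illustrative values."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```

Note the structure: more net radiation (Rn) or a larger vapor pressure deficit (δq) raises ET, while a smaller stomatal conductance (gs) inflates the γ(1 + ga/gs) term in the denominator and suppresses it.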
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a request queue feeds the data collection stage, which pulls source imagery from download sites using source metadata; the reprojection queue feeds the reprojection stage; the reduction 1 and reduction 2 queues feed the derivation and analysis reduction stages; a download queue delivers scientific results to scientists]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, which are recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; Generic Workers (Worker Roles) dequeue tasks and read <Input>Data Storage]
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request is parsed by the Service Monitor (Worker Role), which persists ReprojectionJobStatus and ReprojectionTaskStatus and dispatches tasks via the job and task queues to Generic Workers (Worker Roles), which read swath source data storage and write reprojection data storage]
• Each ReprojectionJobStatus entity specifies a single reprojection job request
• Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run reductions multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage | Data | Compute | Cost
Data collection | 400-500 GB, 60K files, 10 MB/sec | 11 hours, <10 workers | $50 upload, $450 storage
Reprojection | 400 GB, 45K files | 3,500 hours, 20-100 workers | $420 CPU, $60 download
Derivation reduction | 5-7 GB, 55K files | 1,800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis reduction | <10 GB, ~1K files | 1,800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage

Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
bull Provides Structured Storagebull Massively Scalable Tablesbull Billions of entities (rows) and TBs of
databull Can use thousands of servers as traffic
grows
bull Highly Available amp Durablebull Data is replicated several times
bull Familiar and Easy to use APIbull WCF Data Services and ODatabull NET classes and LINQbull REST ndash with any platform or language
Is not relationalCan Not-bull Create foreign key relationships between tablesbull Perform server side joins between tablesbull Create custom indexes on the tablesbull No server side Count() for example
All entities must have the following propertiesbull Timestampbull PartitionKeybull RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
• Partitioning is different for each data type (blobs, entities, queues)
Every data object has a partition key
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• The partition key controls entity locality
Partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
On "Server Busy":
• Use exponential backoff
• The system load balances to meet your traffic needs
• A busy response means single-partition limits have been reached
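The "exponential backoff on Server Busy" advice can be sketched as a small retry wrapper. This is an illustrative sketch, not the storage client library's API; `ServerBusyError` is a hypothetical stand-in for a 503 "Server Busy" response.

```python
import time

class ServerBusyError(Exception):
    """Hypothetical stand-in for a storage 503 "Server Busy" response."""

def with_backoff(operation, max_retries=5, base=0.1, cap=30.0):
    """Retry `operation` with truncated exponential backoff: double the
    delay after each busy response, up to `cap` seconds."""
    for attempt in range(max_retries):
        try:
            return operation()
        except ServerBusyError:
            if attempt == max_retries - 1:
                raise  # give up after the last retry
            time.sleep(min(cap, base * (2 ** attempt)))
```

Any storage call (PutBlob, a table query, GetMessage) can be passed in as `operation`.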
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $3512
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $1000

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarborbighouse.jpg
image          | foxboroughgillette.jpg
video          | annarborbighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync

(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A – Table = Movies [Min - Max]:
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After the partition splits:

Server A – Table = Movies [Min - Comedy):
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

Server B – Table = Movies [Comedy - Max]:
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query response may be incomplete when:
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Distribute load by using a hash etc. as a key prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy: the system load balances partitions to meet traffic needs
• A busy response means the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Loose coupling can aid scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly used with the work-ticket pattern
• Why not simply use a table?
Queue Terminology / Message Lifecycle

(Diagram: a Web Role calls PutMessage to add Msg 1-4 to the queue; Worker Roles call GetMessage (with a timeout) to dequeue messages and RemoveMessage to delete them after processing)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
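The Get/Delete round trip above can be sketched with a toy in-memory queue. This is a hypothetical stand-in for the REST calls, not the real service: a dequeued message only becomes invisible for the timeout window, and is removed for good only when deleted with its pop receipt.

```python
import time
import uuid

class ToyQueue:
    """In-memory sketch of the Azure queue message lifecycle."""
    def __init__(self):
        self.messages = []

    def put(self, text):
        """PutMessage: enqueue a new message."""
        self.messages.append({"id": str(uuid.uuid4()), "text": text,
                              "invisible_until": 0.0, "pop_receipt": None})

    def get(self, timeout=30):
        """GetMessage: return a visible message and hide it for `timeout` s."""
        now = time.time()
        for m in self.messages:
            if m["invisible_until"] <= now:
                m["invisible_until"] = now + timeout
                m["pop_receipt"] = str(uuid.uuid4())
                return m
        return None

    def delete(self, msg_id, pop_receipt):
        """DeleteMessage: permanently remove a message using its pop receipt."""
        self.messages = [m for m in self.messages
                         if not (m["id"] == msg_id and m["pop_receipt"] == pop_receipt)]
```

If the worker crashes before calling `delete`, the message simply reappears after the timeout, which is exactly the at-least-once guarantee described above.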
Truncated Exponential Back Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
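The polling rule above can be sketched as a small loop. `get_message`, `handle`, and `stop` are hypothetical callables standing in for the queue client, the work, and a shutdown signal.

```python
import time

def poll_queue(get_message, handle, max_interval=60.0, stop=lambda: False):
    """Truncated exponential back-off polling: each empty poll doubles
    the sleep interval (capped at max_interval); each successful poll
    resets it to 1 second."""
    interval = 1.0
    while not stop():
        msg = get_message()
        if msg is None:
            time.sleep(interval)                 # empty poll: wait...
            interval = min(interval * 2, max_interval)  # ...and back off
        else:
            handle(msg)
            interval = 1.0                       # success: poll eagerly again
```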
Removing Poison Messages

(Scenario: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them)

1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2
13. C1: Delete(Q, msg 1); the poison message is removed instead of being processed yet again
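Step 12-13 above, enforcing a threshold on the dequeue count, can be sketched as follows. All callables are hypothetical stand-ins for a queue client; the `dequeue_count` field mirrors the service's DequeueCount.

```python
def process_with_poison_check(get_message, handle, delete, max_dequeue=2,
                              dead_letter=None):
    """A message that keeps reappearing (its consumer crashed while
    processing it) is treated as poison and deleted instead of being
    retried forever."""
    msg = get_message()
    if msg is None:
        return None
    if msg["dequeue_count"] > max_dequeue:
        if dead_letter is not None:
            dead_letter(msg)   # optionally keep it aside for inspection
        delete(msg)            # remove the poison message
        return "poisoned"
    handle(msg)
    delete(msg)                # normal lifecycle: process, then delete
    return "processed"
```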
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages can result in out-of-order delivery
Use DequeueCount to remove poison messages
• Enforce a threshold on a message's dequeue count
Messages > 8 KB? Use a blob to store the message data, with a reference in the message
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • Pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
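The data-parallelism pattern above (what Parallel.ForEach / PLINQ provide in .NET 4) can be illustrated in language-neutral terms: apply the same operation to every element of a collection and let a worker pool keep the cores busy. This Python sketch only illustrates the pattern; `transcode` is a hypothetical unit of work.

```python
from concurrent.futures import ThreadPoolExecutor

def transcode(item):
    """Hypothetical stand-in for a unit of work (e.g. converting a file)."""
    return item * item

def process_all(items, workers=8):
    """Data parallelism: fan the same operation out over a collection,
    sized to the instance's core count."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transcode, items))
```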
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
  • Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure/poor user experience from not having excess capacity against the cost of idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Saving bandwidth costs often leads to savings in other places
  • Sending fewer things over the wire often means getting fewer things from storage
  • Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
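The gzip step above is a one-liner in most stacks; this sketch just makes the storage/bandwidth saving concrete on repetitive markup (it assumes the client advertised `Accept-Encoding: gzip`, and is not tied to any web framework).

```python
import gzip

def gzip_payload(text: str) -> bytes:
    """Gzip a response body before storing or sending it."""
    return gzip.compress(text.encode("utf-8"), compresslevel=6)

# Repetitive HTML, the common case for generated pages, compresses very well.
body = "<html>" + "<li>repetitive markup compresses well</li>" * 200 + "</html>"
compressed = gzip_payload(body)
```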
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 ~ 1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months

Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
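Query segmentation, the split/join approach described above, reduces to two small functions: split the input sequences into independent partitions and concatenate the per-partition hit lists. This is a sketch of the pattern, not AzureBLAST's code; 100 sequences per partition is the value the deck's micro-benchmarks later recommend, treated here as a tunable.

```python
def split_queries(sequences, partition_size=100):
    """Split input sequences into partitions that can be BLASTed
    independently (pleasingly parallel)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(partial_results):
    """Join step: with query segmentation (unlike database segmentation)
    merging is just concatenation; no special reduction is needed."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged
```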
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple Split/Join pattern: a splitting task fans out into many BLAST tasks, followed by a merging task

Leverage the multi-core capability of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• The extra-large instance generated the best and the most economical throughput
• Fully utilizes the resource
AzureBLAST Architecture

(Diagram: a Web Role hosts the Web Portal, Web Service, and job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; a global dispatch queue feeds Worker Roles; Azure Tables hold the Job Registry and NCBI database metadata; Azure Blobs hold the BLAST databases, temporary data, etc.; a Database Updating Role refreshes the databases. The task flow remains the split/join pattern: a splitting task, many BLAST tasks, and a merging task.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB size)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6~8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
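Detecting the "something is wrong" case, a task that started but never logged completion, can be sketched with two regular expressions over the log text. The patterns are assumptions about the log format shown above.

```python
import re

def find_unfinished_tasks(log_lines):
    """Return task IDs that logged "Executing the task N" but never a
    matching "Execution of task N is done" line."""
    text = "\n".join(log_lines)
    started = set(re.findall(r"Executing the task (\d+)", text))
    done = set(re.findall(r"Execution of task (\d+) is done", text))
    return sorted(started - done)
```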
Surviving System Upgrades
North Europe Data Center: a total of 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups; this is an update domain at work
  • ~30 mins between groups
  • ~6 nodes in one group

Surviving Storage Failures
West Europe Data Center: 30,976 tasks were completed before the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration, or evaporation through plant membranes, by plants

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
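The Penman-Monteith formula translates directly into code; this sketch mirrors the equation term by term, and the default constants come from the definitions above (the test's sample inputs are illustrative, not from the deck).

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith evapotranspiration.
    delta: d(saturation specific humidity)/dT (Pa/K)
    r_n: net radiation (W/m^2); rho_a: dry air density (kg/m^3)
    c_p: specific heat of air (J/(kg K)); dq: vapor pressure deficit (Pa)
    g_a, g_s: conductivities of air and plant stoma (m/s)
    gamma: psychrometric constant (Pa/K); lambda_v: latent heat (J/g)."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```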
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

(Pipeline diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a request queue feeds the data collection stage, which pulls from source imagery download sites; the reprojection, derivation reduction, and analysis reduction stages are fed by the reprojection, reduction 1, and reduction 2 queues; scientific results are made available through a download queue)

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, which are recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

(Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
• A Generic Worker (Worker Role):
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

(Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue tasks and read/write <Input>Data Storage)
Example Pipeline Stage: Reprojection Service

(Diagram: a Reprojection Request reaches the Service Monitor (Worker Role), which persists ReprojectionJobStatus via the Job Queue and parses & persists ReprojectionTaskStatus, dispatching to the Task Queue; Generic Workers (Worker Roles) read Swath Source Data Storage and write Reprojection Data Storage)

• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage figures (20-100 workers unless noted):
• Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload + $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours; $420 CPU + $60 download
• Derivation reduction: 5-7 GB, 55K files, 1800 hours; $216 CPU + $1 download + $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours; $216 CPU + $2 download + $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Blob Features and Functionsbull Store Large Objects (up to 1TB
in size)
bull You can have as many containers and Blobs as you want
bull Standard REST Interfacebull PutBlob
bull Inserts a new blob overwrites the existing blob
bull GetBlobbull Get whole blob or a specific range
bull DeleteBlobbull CopyBlobbull SnapshotBlobbull LeaseBlob
bull Each Blob has an addressbull httpltstorageaccountgtblobcorewindowsnetltContainergtltBlobNamegtbull httpmovieconversionblobcorewindowsnetoriginalsbargampg
Containers
bull Similar to a top level folderbull Has an unlimited capacitybull Can only contain BLOBs
Each container has an access level- Private
- Default will require the account key to access- Full public read- Public read only
Two Types of Blobs Under the Hood
bull Block Blob bull Targeted at streaming
workloadsbull Each blob consists of a
sequence of blocksbull Each block is identified by a Block
ID
bull Size limit 200GB per blob
bull Page Blob bull Targeted at random
readwrite workloadsbull Each blob consists of an
arrayof pagesbull Each page is identified by its offset
from the start of the blob
bull Size limit 1TB per blob
bull You can upload a file in lsquoblocksrsquobull Each block has an idbull Then commit those blocks in any order into a
blobbull Final blob limited to 1 TB and up to 50000
blocksbull Can modify a blob by inserting updating and
removing blocksbull Blocks live for a week before being GCrsquod if not
committed to a blobbull Optimized for streaming
Blocks
Bigmpg1 6 8 3 5 4 7 2
Bigmpg
Pagesbull Similar to block blobsbull Optimized for random readwrite operations and
provide the ability to write to a range of bytes in a blob
bull Call Put Blob set max size Then call Put Pagebull All pages must align 512-byte page boundariesbull Writes to page blobs happen in-place and are
immediately committed to the blobbull The maximum size for a page blob is 1 TB A
page written to a page blob may be up to 1 TB in size
BLOB Leases
bull Creates a 1 minute exclusive write lock on a BLOB
bull Operations Acquire Renew Release Break
bull Must have the lease id to perform operations
bull Can check LeaseStatus property
bull Currently can only be done through REST
Windows Azure Drive
bull Provides a durable NTFS volume for Windows Azure applications to usebull Use existing NTFS APIs to access a durable
drivebull Durability and survival of data on application failover
bull Enables migrating existing NTFS applications tothe cloud
bullA Windows Azure Drive is a Page Blobbull Example mount Page Blob as X
bull httpltaccountnamegtblobcorewindowsnetltcontainernamegtltblobnamegt
bull All writes to drive are made durable to the Page Blobbull Drive made durable through standard Page Blob
replicationbull Drive persists even when not mounted as a Page
Blob
Windows Azure Drive API
bull Create Drive - Creates a Page Blob formatted as a single partition NTFS volume VHD
bull Initialize Cache ndash Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
bull Mount Drive ndash Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
bull Get Mounted Drives ndash Returns the list of mounted drives It consists of a list of the drive letter and Page Blob URLs for each mounted drive
bull Unmount Drive ndash Unmounts the drive and frees up the drive letter bull Snapshot Drive ndash Allows the client application to create a backup of the
drive (Page Blob) bull Copy Drive ndash Provides the ability to copy a drive or snapshot to another
drive (Page Blob) name to be used as a readwritable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
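The shape of a table entity can be sketched as plain dictionaries; this is a minimal illustration (not the real storage client API) of the three required system properties alongside varying application-defined properties:

```python
REQUIRED_PROPERTIES = {"PartitionKey", "RowKey", "Timestamp"}

def is_valid_entity(entity):
    """True only if the three system properties are present; everything
    else is an application-defined property and may differ per entity."""
    return REQUIRED_PROPERTIES.issubset(entity)

movie = {
    "PartitionKey": "Action",              # groups entities for locality/scale
    "RowKey": "Fast & Furious",            # unique within the partition
    "Timestamp": "2010-12-07T00:00:00Z",   # maintained by the service
    "ReleaseDate": 2009,                   # application-defined
}
customer = {"PartitionKey": "1", "RowKey": "Customer-John Smith",
            "Timestamp": "2010-12-07T00:00:00Z",
            "Name": "John Smith"}          # different schema, same table is fine
```

Note that the two entities carry different application properties; only the three system properties are fixed.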
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance

Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality

The partition key is the unit of scale
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server

"Server Busy"
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Or single-partition limits have been reached
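The "use exponential backoff" advice above can be sketched as a small retry wrapper; a minimal sketch, where `ServerBusyError` is a stand-in for the 503 error a real storage client would surface:

```python
import random

class ServerBusyError(Exception):
    """Stand-in for the HTTP 503 'Server Busy' error a storage client surfaces."""

def call_with_backoff(operation, max_retries=6, base=0.5, cap=30.0,
                      sleep=lambda seconds: None):
    """Retry `operation` on ServerBusyError, doubling the delay each attempt
    (truncated at `cap`), plus jitter so many clients do not retry in lockstep."""
    for attempt in range(max_retries):
        try:
            return operation()
        except ServerBusyError:
            delay = min(cap, base * (2 ** attempt))
            sleep(delay + random.uniform(0, delay / 10))
    raise ServerBusyError("gave up after %d retries" % max_retries)

# Usage: an operation that is busy twice, then succeeds on the third try.
calls = {"n": 0}
def flaky_operation():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusyError()
    return "ok"

result = call_with_backoff(flaky_operation)
```

The injectable `sleep` keeps the sketch testable; production code would simply pass `time.sleep`.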
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync

[Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.]
Scalability Targets

Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second

Single Queue / Table Partition
• Up to 500 transactions per second

Single Blob Partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
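One way to "partition between multiple storage accounts" is to hash object names across a fixed set of accounts; a minimal sketch, where the account names are hypothetical:

```python
import hashlib

# Hypothetical storage account names; each has its own scalability targets,
# so spreading objects across them multiplies the aggregate limits.
ACCOUNTS = ["myappscale0", "myappscale1", "myappscale2", "myappscale3"]

def account_for(blob_name):
    """Deterministically map a blob name to one of several storage accounts.
    The same name always lands on the same account, so reads find their data."""
    digest = hashlib.md5(blob_name.encode("utf-8")).hexdigest()
    return ACCOUNTS[int(digest, 16) % len(ACCOUNTS)]

chosen = account_for("video/annarbor/bighouse.jpg")
```

The caveat is that a fixed modulus makes it painful to add accounts later; that is the usual trade-off of simple hash placement.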
Partitions and Partition Ranges

Server A – Table: Movies [Min – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After the partition range splits:

Server A – Table: Movies [Min – Comedy)

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

Server B – Table: Movies [Comedy – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Key Selection: Things to Consider

Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability

Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient

Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
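Because any of the three conditions above can end a response early, a correct client always loops until the token is absent. A minimal sketch, with `query_page` standing in for a real table query:

```python
def query_all(query_page):
    """Drain a paged query. `query_page(token)` stands in for a table query
    that returns at most 1000 rows plus a continuation token (None when done).
    A page can come back short, or even empty, and still carry a token."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:     # only a missing token means "done"
            return rows

# Simulated server: 2500 matching rows delivered 1000 at a time.
DATA = list(range(2500))

def fake_query(token):
    start = token or 0
    page = DATA[start:start + 1000]
    next_token = start + 1000 if start + 1000 < len(DATA) else None
    return page, next_token

all_rows = query_all(fake_query)
```

Stopping when a page is short or empty, instead of when the token disappears, is the classic bug this slide warns about.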
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns; distribute by using a hash, etc., as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on "Server Busy"
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
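The work ticket pattern mentioned above can be sketched with in-memory stand-ins for a blob container and a queue: the message carries only a small "ticket" (a blob name), while the large payload lives in blob storage, keeping the message well under the 8 KB limit.

```python
blob_store = {}   # stand-in for a blob container
queue = []        # stand-in for an Azure queue

def submit_work(job_id, payload):
    """Store the (possibly large) payload in blob storage, then enqueue a
    tiny work ticket that merely points at it."""
    blob_store[job_id] = payload
    queue.append({"ticket": job_id})

def process_one():
    """A worker dequeues a ticket, fetches the real work via the reference,
    and processes it (here, just measuring it)."""
    msg = queue.pop(0)                    # GetMessage
    payload = blob_store[msg["ticket"]]   # follow the work ticket
    return len(payload)

submit_work("job-1", b"x" * 100_000)      # 100 KB payload, tiny message
processed_bytes = process_one()
```

A real implementation would also delete the message after processing and garbage-collect orphaned blobs, as the recap slide later notes.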
Queue Terminology

Message Lifecycle
[Diagram: a Web Role calls PutMessage to add messages (Msg 1 … Msg 4) to a Queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve messages and RemoveMessage to delete them once processed.]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling

Consider a backoff polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
• The interval is truncated at a maximum (e.g., 60)
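The polling policy above reduces to a one-line interval update; a minimal sketch with illustrative bounds:

```python
def next_poll_interval(current, got_message, minimum=1.0, maximum=60.0):
    """Empty poll -> double the interval (truncated at `maximum`);
    successful poll -> snap back to the minimum."""
    if got_message:
        return minimum
    return min(maximum, current * 2)

# An idle queue backs the poller off 1 -> 2 -> 4 -> ... and holds at 60;
# the first delivered message resets it to 1.
interval = 1.0
for _ in range(10):
    interval = next_poll_interval(interval, got_message=False)
```

This keeps transaction costs low on idle queues (every poll is a billed transaction) while staying responsive under load.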
Removing Poison Messages

[Diagram sequence: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them with a 30-second visibility timeout.]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after the dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after the dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. msg 1's DequeueCount > 2
13. C1: DeleteMessage(Q, msg 1); msg 1 is removed as a poison message
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
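The dequeue-count threshold from the recap can be sketched as a small guard in the worker loop; the threshold value and message shape here are illustrative:

```python
MAX_DEQUEUE_COUNT = 3   # policy choice: how many retries before "poison"

def handle(message, process, dead_letter):
    """If a message keeps reappearing (its dequeue count exceeds the
    threshold), treat it as poison: park it for inspection instead of
    processing it yet again. `message` is a dict sketch of a queue message."""
    if message["dequeue_count"] > MAX_DEQUEUE_COUNT:
        dead_letter(message)   # e.g., log it or copy it to a blob, then delete
        return "removed"
    process(message)
    return "processed"

handled, parked = [], []
fresh = {"id": 1, "dequeue_count": 1}
poison = {"id": 2, "dequeue_count": 4}
outcome_fresh = handle(fresh, process=handled.append, dead_letter=parked.append)
outcome_poison = handle(poison, process=handled.append, dead_letter=parked.append)
```

Without this guard, a message whose processing always crashes the worker would be retried forever, taking a worker down each time it becomes visible.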
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?

• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
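The data-parallelism idea above (the slide's example is .NET's Task Parallel Library) can be illustrated with Python's thread pool: the same operation mapped over independent chunks, with the pool sized to the core count rather than one thread per item.

```python
from concurrent.futures import ThreadPoolExecutor

def checksum(chunk):
    """Stand-in for real per-chunk work (Adler-style modulus, illustrative)."""
    return sum(chunk) % 65521

# Eight independent chunks of work.
chunks = [list(range(i, i + 1000)) for i in range(0, 8000, 1000)]

# Data parallelism: map one function over many chunks on a bounded pool.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(checksum, chunks))
```

The bounded `max_workers` mirrors the slide's caution about exceeding the core count; task parallelism would instead submit different functions as separate futures.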
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure / poor user experience due to not having excess capacity, and the cost of having idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

Uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → Compressed content
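The gzip trade-off (CPU for bandwidth) is easy to see with the standard library; the HTML fragment below is a deliberately repetitive, illustrative payload:

```python
import gzip

# Markup is highly repetitive, so it compresses dramatically.
html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"
compressed = gzip.compress(html)

ratio = len(compressed) / len(html)   # bytes you no longer pay to serve
```

On realistic pages the savings are smaller than on this contrived input, but typically still large enough to dominate the extra CPU cost.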
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
  • GenBank doubled in size in about 15 months

Opportunities for Cloud Computing

It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing

Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
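Input segmentation, the pleasingly parallel option above, is just a chunking step; a minimal sketch with synthetic sequence names:

```python
def segment(sequences, partition_size):
    """Query segmentation: split the input sequences into fixed-size
    partitions that independent workers can BLAST in parallel; the
    per-partition results are simply concatenated (joined) afterwards."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

queries = ["seq%d" % i for i in range(1042)]   # illustrative input set
partitions = segment(queries, 100)             # ~100 sequences per partition
```

Partition size is the tuning knob: too large and slow partitions dominate (load imbalance), too small and per-task overhead dominates, which is exactly the granularity trade-off AzureBLAST profiles below.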
AzureBLAST
• A parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out into many parallel BLAST tasks, followed by a merging task.

Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set the size to mitigate the overhead

Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design

Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability

Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST

[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; tasks are dispatched through a global dispatch queue to pools of Worker instances; a Database Updating Role refreshes the NCBI databases. Azure Tables hold the Job Registry; Azure Blobs hold the NCBI databases, BLAST databases, temporary data, etc. Each job follows the split/join task flow: a splitting task, parallel BLAST tasks, and a merging task.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID

The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB in size)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons!

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances appeared, redistributed the load manually

End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs

A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
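Finding the "something is wrong" cases above amounts to matching "Executing" records against "done" records; a minimal sketch over the anomalous excerpt:

```python
import re

LOG = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def unfinished_tasks(log_text):
    """A task is suspect if its 'Executing the task N' record is never
    matched by a corresponding 'Execution of task N is done' record."""
    started = set(re.findall(r"Executing the task (\d+)", log_text))
    done = set(re.findall(r"Execution of task (\d+) is done", log_text))
    return started - done

suspects = unfinished_tasks(LOG)
```

Run over the full run's logs, this kind of matching is what surfaced the node-loss and blob-failure patterns on the next slides.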
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups; this is an update domain at work
  • ~30 mins per group
  • ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and then the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain was at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
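The Penman-Monteith form above translates directly into code; the numeric inputs below are illustrative sample values (not field data), with λv ≈ 2260 J/g for water and γ ≈ 66 Pa/K as on the slide:

```python
def penman_monteith_et(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET exactly as written above:
    ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    Inputs follow the slide's units (Pa/K, W/m2, kg/m3, J/(kg K), Pa, m/s)."""
    numerator = delta * Rn + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative mid-latitude daytime values.
et = penman_monteith_et(delta=145.0,   # Pa/K, slope near 20 C
                        Rn=400.0,      # W/m2
                        rho_a=1.2,     # kg/m3
                        c_p=1005.0,    # J/(kg K)
                        dq=1000.0,     # Pa vapor pressure deficit
                        g_a=0.02,      # m/s
                        g_s=0.01)      # m/s
```

The hard part in MODISAzure is not this arithmetic but estimating ga and gs across a catchment, which is what drives the big data reduction.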
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest neighbor or spline algorithms

Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Data Collection Stage, which pulls from source imagery download sites and consults source metadata; a Reprojection Queue feeds the Reprojection Stage; Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; a Download Queue delivers scientific results to the scientists.]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks: recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request flows into the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]

MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

[Diagram: Generic Workers (Worker Roles) pull from the <PipelineStage> Task Queue and read/write <Input> Data Storage.]
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus (each entity specifies a single reprojection job request), parses and persists ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e., a single tile), and dispatches to the Task Queue. Generic Workers (Worker Roles) consume tasks, querying the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile and the ScanTimeList table for the list of satellite scan times that cover a target tile, reading from Swath Source Data Storage and writing Reprojection Data Storage.]
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run reductions multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Stage                | Data                   | Compute                           | Cost
Data collection      | 400-500 GB, 60K files  | 10 MB/sec, 11 hours, <10 workers  | $50 upload, $450 storage
Reprojection         | 400 GB, 45K files      | 3500 hours, 20-100 workers        | $420 CPU, $60 download
Derivation reduction | 5-7 GB, 55K files      | 1800 hours, 20-100 workers        | $216 CPU, $1 download, $6 storage
Analysis reduction   | <10 GB, ~1K files      | 1800 hours, 20-100 workers        | $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
bull Enables migrating existing NTFS applications tothe cloud
bullA Windows Azure Drive is a Page Blobbull Example mount Page Blob as X
bull httpltaccountnamegtblobcorewindowsnetltcontainernamegtltblobnamegt
bull All writes to drive are made durable to the Page Blobbull Drive made durable through standard Page Blob
replicationbull Drive persists even when not mounted as a Page
Blob
Windows Azure Drive API
bull Create Drive - Creates a Page Blob formatted as a single partition NTFS volume VHD
bull Initialize Cache ndash Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
bull Mount Drive ndash Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
bull Get Mounted Drives ndash Returns the list of mounted drives It consists of a list of the drive letter and Page Blob URLs for each mounted drive
bull Unmount Drive ndash Unmounts the drive and frees up the drive letter bull Snapshot Drive ndash Allows the client application to create a backup of the
drive (Page Blob) bull Copy Drive ndash Provides the ability to copy a drive or snapshot to another
drive (Page Blob) name to be used as a readwritable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language

Is not relational; cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
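The three required properties and the "schema can vary" rule can be sketched in a few lines. This is an illustrative Python sketch with hypothetical entities (dicts standing in for table rows), not the .NET storage client:

```python
import datetime

# Two entities in the same hypothetical "Movies" table: their schemas differ,
# but every entity carries PartitionKey, RowKey, and Timestamp.
movie = {
    "PartitionKey": "Action",                      # groups related entities
    "RowKey": "Fast & Furious",                    # unique within the partition
    "Timestamp": datetime.datetime(2010, 12, 7),   # maintained by the service
    "ReleaseDate": 2009,
}
short_film = {
    "PartitionKey": "Animation",
    "RowKey": "Open Season 2",
    "Timestamp": datetime.datetime(2010, 12, 7),
    "RuntimeMinutes": 76,                          # property the other entity lacks
}

REQUIRED = {"PartitionKey", "RowKey", "Timestamp"}
for entity in (movie, short_film):
    assert REQUIRED <= entity.keys()               # required properties present
```

Note that (PartitionKey, RowKey) together act as the unique key; all other properties are free-form per entity.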
Windows Azure Queues
• Queues are performance efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.

Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality

Partition key is the unit of scale:
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server

"Server Busy":
• Use exponential backoff on "Server Busy"
• The system may be load balancing to meet your traffic needs
• Or single-partition limits have been reached
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Messages – Queue Name
• All messages for a single queue belong to the same partition
Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Queue    | Message
jobs     | Message 1
jobs     | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync

(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets

Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second

Single Queue/Table Partition
• Up to 500 transactions per second

Single Blob Partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges

Server A: Table = Movies [Min – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After the system splits the partition range across servers:

Server A: Table = Movies [Min – Comedy)

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

Server B: Table = Movies [Comedy – Max]

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Key Selection: Things to Consider

Scalability:
• PartitionKey is critical for scalability
• Distribute load as much as possible
• Hot partitions can be load balanced

Query efficiency & speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient

Entity group transactions:
• Transactions across a single partition
• Transaction semantics; reduce round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously

A continuation token is returned when:
• The response reaches the maximum of 1,000 rows
• The query reaches the end of a partition range boundary
• The query reaches the maximum of 5 seconds to execute
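Handling continuation tokens is just a drain loop. A minimal Python sketch of the pattern; `query_page` is a hypothetical stand-in for a table query call (not the actual storage client API) that returns up to 1,000 rows plus an opaque token, `None` when the result set is exhausted:

```python
def query_all(query_page, filter_expr):
    """Drain a paged query, following continuation tokens until exhausted."""
    rows, token = [], None
    while True:
        page, token = query_page(filter_expr, continuation=token)
        rows.extend(page)
        if token is None:      # no continuation token: all pages consumed
            return rows

# Fake service for illustration: 2,500 matching rows, served 1,000 at a time.
DATA = list(range(2500))

def fake_query_page(filter_expr, continuation=None):
    start = continuation or 0
    page = DATA[start:start + 1000]
    nxt = start + 1000 if start + 1000 < len(DATA) else None
    return page, nxt

assert query_all(fake_query_page, "all") == DATA   # three pages reassembled
```

Forgetting the loop silently truncates results at 1,000 rows, which is why the slide says "seriously".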
Tables Recap
• Efficient for frequently used queries; supports batch transactions; distributes load
• Select PartitionKey and RowKey that help scale
  • Distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens
  • Expect continuation tokens for range queries
• "OR" predicates are not optimized
  • Execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries
  • Server busy: the system may be load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle
(Diagram: a Web Role calls PutMessage to add messages to the queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve a message and RemoveMessage to delete it once processed.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling

Consider a backoff polling approach:
• Each empty poll increases the polling interval by 2x
• A successful poll sets the interval back to 1
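The policy above fits in a small class. This is a minimal sketch, assuming a 1-second initial interval and a 60-second cap (the truncation point); the exact values on the original slide's diagram are assumptions here:

```python
class TruncatedBackoff:
    """Truncated exponential back-off: each empty poll doubles the wait,
    capped at a maximum; a successful poll resets to the initial interval."""

    def __init__(self, initial=1.0, maximum=60.0):
        self.initial, self.maximum = initial, maximum
        self.interval = initial

    def empty_poll(self):
        # Double the wait, but never exceed the cap (the "truncated" part).
        self.interval = min(self.interval * 2, self.maximum)
        return self.interval

    def got_message(self):
        # Success: resume aggressive polling.
        self.interval = self.initial
        return self.interval

b = TruncatedBackoff()
waits = [b.empty_poll() for _ in range(7)]
assert waits == [2.0, 4.0, 8.0, 16.0, 32.0, 60.0, 60.0]   # doubles, then caps
assert b.got_message() == 1.0                             # success resets
```

This keeps transaction costs down when the queue is idle (polls are billed) while staying responsive under load.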
Removing Poison Messages

(Diagram: producers P1, P2 enqueue; consumers C1, C2 dequeue.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
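The sequence above (visibility timeout plus a DequeueCount threshold) can be simulated end to end. `FakeQueue` is an in-memory stand-in for the cloud queue, not the Azure API; a logical clock replaces real sleeping:

```python
class FakeQueue:
    """In-memory queue with visibility timeouts and per-message dequeue counts."""

    def __init__(self):
        self.now = 0.0
        self.msgs = []                              # dicts: body, count, visible_at

    def put(self, body):
        self.msgs.append({"body": body, "count": 0, "visible_at": 0.0})

    def get(self, visibility_timeout=30.0):
        for m in self.msgs:
            if m["visible_at"] <= self.now:
                m["count"] += 1                     # the DequeueCount
                m["visible_at"] = self.now + visibility_timeout
                return m
        return None

    def delete(self, m):
        self.msgs.remove(m)

MAX_DEQUEUE = 2

def process(q, handler):
    msg = q.get()
    if msg is None:
        return
    if msg["count"] > MAX_DEQUEUE:                  # poison: remove, don't retry
        q.delete(msg)
        return
    handler(msg["body"])                            # a crash here -> msg reappears
    q.delete(msg)

q = FakeQueue()
q.put("bad job")
for _ in range(MAX_DEQUEUE):                        # two attempts that "crash"
    q.get()                                         # dequeued, never deleted
    q.now += 30.0                                   # visibility timeout elapses
process(q, handler=lambda body: None)               # third dequeue trips threshold
assert q.msgs == []                                 # poison message removed
```

The key point: a consumer crash never loses a message, and a message that repeatedly defeats its consumers is eventually discarded rather than retried forever.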
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library.

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?

• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
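The deck's recommendation is the .NET Task Parallel Library; the same two patterns can be sketched in Python with `concurrent.futures` (an illustrative analogue, not the deck's code):

```python
from concurrent.futures import ThreadPoolExecutor

# Data parallelism: one operation applied across a collection of inputs.
def checksum(chunk):
    return sum(chunk) % 251

chunks = [list(range(i, i + 100)) for i in range(0, 1000, 100)]
with ThreadPoolExecutor(max_workers=4) as pool:
    sums = list(pool.map(checksum, chunks))        # same function, many inputs

# Task parallelism: independent tasks of different kinds run concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    f_low = pool.submit(min, sums)                 # two different operations
    f_high = pool.submit(max, sums)
    low, high = f_low.result(), f_high.result()

assert len(sums) == 10 and low <= high
```

Either way, the pool size should track the core count the VM size actually provides, per the bullet about active processes exceeding cores.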
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content.)
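The payoff of point 1 is easy to demonstrate with Python's standard `gzip` module; the HTML payload here is made up for illustration:

```python
import gzip

# Repetitive markup, typical of generated HTML.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"
compressed = gzip.compress(html)

# Gzip is lossless, and highly repetitive content shrinks dramatically,
# which is bandwidth (and often storage) you no longer pay for.
assert gzip.decompress(compressed) == html       # round trip is exact
assert len(compressed) < len(html) // 10         # large reduction on this input
```

Real pages compress less than this contrived one, but 60-80% savings on text content are common, and every modern browser inflates the response transparently.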
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST

BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1,000 CPU hours
• Sequence databases are growing exponentially
  • GenBank doubled in size in about 15 months

Opportunities for Cloud Computing

It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing

Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow

A simple split/join pattern:
(Diagram: a splitting task fans out into many BLAST tasks that run in parallel; a merging task joins their results.)

Leverage the multi-core capability of one instance:
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: profile with test runs and set the partition size to mitigate the overhead

Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
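The split/join pattern above is only a few functions. A Python sketch of the query-segmentation shape; `blast_task` is a hypothetical stand-in for invoking NCBI-BLAST on one partition, not the real tool:

```python
def split(sequences, partition_size):
    """Query segmentation: break the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition, database):
    """Stand-in for a worker running BLAST on one partition: here, a toy
    membership test instead of a real alignment."""
    return [(seq, seq in database) for seq in partition]

def merge(partial_results):
    """Join: concatenate per-partition results, preserving input order."""
    return [hit for partial in partial_results for hit in partial]

db = {"ACGT", "TTGA"}
queries = ["ACGT", "GGGG", "TTGA", "CCCC"]
partitions = split(queries, 2)                     # tune size per the profiling advice
results = merge(blast_task(p, db) for p in partitions)
assert [hit for _, hit in results] == [True, False, True, False]
```

In AzureBLAST each partition becomes a work-ticket message on the dispatch queue, and the merge runs once all task results have landed in blob storage.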
Micro-Benchmarks Inform Design

Task size vs. performance:
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability

Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST

(Architecture diagram: a Web Role hosts the web portal, web service, and job registration; accepted jobs are recorded in a job registry (Azure Table). A Job Management Role runs the job scheduler and scaling engine, dispatching work through a global dispatch queue to Worker instances. Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a Database Updating Role keeps the NCBI databases current. A splitting task fans BLAST tasks out to the workers, and a merging task joins the results.)
AzureBLAST Job Portal

An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID

The accepted job is stored in the job registry table:
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs

A normal log record looks like:

3/31/2010 6:14  RD00155D3611B0  Executing the task 251523...
3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25  RD00155D3611B0  Executing the task 251553...
3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44  RD00155D3611B0  Executing the task 251600...
3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22  RD00155D3611B0  Executing the task 251774...
3/31/2010 9:50  RD00155D3611B0  Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0  Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades

North Europe Data Center: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups; this is an update domain:
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures

West Europe Datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ  = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ  = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
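The Penman-Monteith formula itself is a one-liner once the inputs are in hand; the hard part, as noted, is estimating the conductivities. A direct Python transcription, with sample values that are purely illustrative (not field-calibrated data):

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET: (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv).

    Arguments follow the symbol list above; defaults are the psychrometric
    constant (~66 Pa/K) and the latent heat of vaporization (~2260 J/g).
    """
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative inputs: a warm day over vegetation (assumed values).
et = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                     dq=1000.0, g_a=0.02, g_s=0.01)
assert et > 0.0
```

In the MODISAzure pipeline this scalar computation runs per pixel per time step, which is why the reduction stage dominates the compute bill.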
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

(Diagram: scientists submit requests through the AzureMODIS Service web role portal; a request queue feeds a download queue for the data collection stage, which pulls from source imagery download sites; reprojection, reduction 1, and reduction 2 queues drive the reprojection, derivation reduction, and analysis reduction stages; source metadata is consulted throughout, and scientific results are downloaded at the end.)

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks: recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

(Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

(Diagram: Generic Worker (Worker Role) instances pull from the <PipelineStage> task queue, read <Input> data storage, and record <PipelineStage>TaskStatus.)
Example Pipeline Stage: Reprojection Service

(Diagram: a reprojection request enters the job queue; the Service Monitor persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus and dispatches to the task queue; Generic Workers consume tasks, reading swath source data storage and writing reprojection data storage.)

• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload + $450 storage
Reprojection Stage: 400 GB, 45K files, 3,500 hours, 20-100 workers; $420 CPU + $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers; $216 CPU + $1 download + $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers; $216 CPU + $2 download + $9 storage

Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Windows Azure Drive
bull Provides a durable NTFS volume for Windows Azure applications to usebull Use existing NTFS APIs to access a durable
drivebull Durability and survival of data on application failover
bull Enables migrating existing NTFS applications tothe cloud
bullA Windows Azure Drive is a Page Blobbull Example mount Page Blob as X
bull httpltaccountnamegtblobcorewindowsnetltcontainernamegtltblobnamegt
bull All writes to drive are made durable to the Page Blobbull Drive made durable through standard Page Blob
replicationbull Drive persists even when not mounted as a Page
Blob
Windows Azure Drive API
bull Create Drive - Creates a Page Blob formatted as a single partition NTFS volume VHD
bull Initialize Cache ndash Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
bull Mount Drive ndash Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
bull Get Mounted Drives ndash Returns the list of mounted drives It consists of a list of the drive letter and Page Blob URLs for each mounted drive
bull Unmount Drive ndash Unmounts the drive and frees up the drive letter bull Snapshot Drive ndash Allows the client application to create a backup of the
drive (Page Blob) bull Copy Drive ndash Provides the ability to copy a drive or snapshot to another
drive (Page Blob) name to be used as a readwritable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language
Is not relational
Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
• Different for each data type (blobs, entities, queues)
• Every data object has a partition key
  • A partition can be served by a single server
  • System load balances partitions based on traffic pattern
  • Controls entity locality
• Partition key is the unit of scale
  • Load balancing can take a few minutes to kick in
  • Can take a couple of seconds for a partition to become available on a different server
• "Server Busy" means the system is load balancing
  • The system load balances to meet your traffic needs
  • Or single-partition limits have been reached
  • Use exponential backoff on "Server Busy"
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition
• Messages – Queue name: all messages for a single queue belong to the same partition
Container Name | Blob Name
image | annarborbighousejpg
image | foxboroughgillettejpg
video | annarborbighousejpg

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When the limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
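When a partition's limits are hit, the '503 Server Busy' response above should be absorbed by a retry loop with exponential backoff. A minimal sketch in Python (the `ServerBusyError` type is a stand-in for the storage client's 503 exception, not a real SDK class):

```python
import random
import time


class ServerBusyError(Exception):
    """Stand-in for the '503 Server Busy' response described above."""


def with_backoff(op, max_retries=6, base=0.5, cap=30.0):
    """Retry `op` on 'Server Busy', doubling the sleep each attempt
    (with jitter) and truncating it at `cap` seconds."""
    for attempt in range(max_retries):
        try:
            return op()
        except ServerBusyError:
            delay = min(cap, base * (2 ** attempt))
            time.sleep(delay * random.uniform(0.5, 1.0))
    return op()  # final attempt; let the error propagate to the caller
```

The jitter keeps a fleet of workers from retrying in lockstep against the same hot partition.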
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Partitions and Partition Ranges
Server A: Table = Movies [Min – Max]
After the partition splits:
Server A: Table = Movies [Min – Comedy)
Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server busy" means the system is load balancing partitions to meet traffic needs, or load on a single partition has exceeded the limits
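"Always handle continuation tokens" amounts to a drain loop: keep re-issuing the query with the returned token until the service stops handing one back. A generic sketch, where `query_page` is a hypothetical paging callable standing in for the table client (the real service surfaces the token via x-ms-continuation headers):

```python
def query_all(query_page):
    """Drain a table query that may return continuation tokens.

    `query_page(token)` returns (rows, next_token); next_token is None
    once the result set is exhausted. Remember: a token can arrive even
    for small results, e.g. at a partition range boundary or after the
    5-second execution limit.
    """
    token, rows = None, []
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows
```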
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • This can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
(Diagram: Web Role → PutMessage → Queue [Msg 1, Msg 2, Msg 3, Msg 4] → Worker Roles: GetMessage (with timeout), then RemoveMessage)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
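The polling rule above is a one-liner per step. A sketch of the interval calculation (the 1 s floor and 60 s ceiling are illustrative defaults, not prescribed values):

```python
def next_poll_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off for queue polling.

    An empty poll doubles the interval, truncated at `ceiling`;
    a successful poll resets it to `floor`.
    """
    if got_message:
        return floor
    return min(ceiling, current * 2)
```

A worker loop simply sleeps `next_poll_interval(...)` seconds between GetMessage calls, so an idle queue costs only one transaction per minute instead of one per second.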
Removing Poison Messages
Producers: P1, P2   Consumers: C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
Producers: P1, P2   Consumers: C1, C2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
Producers: P1, P2   Consumers: C1, C2
1. Dequeue(Q, 30 s) → msg 1
2. Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
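The walkthrough above boils down to one check in the worker loop: if a message's dequeue count exceeds a threshold, it is poison and must be deleted (or parked) rather than retried forever. A sketch with hypothetical queue objects (`.get`, `.delete`, `.dequeue_count` mirror the semantics shown, not a real SDK surface):

```python
POISON_THRESHOLD = 3  # illustrative; the walkthrough uses DequeueCount > 2


def handle_one(queue, poison_queue, process):
    """One iteration of a worker loop that removes poison messages."""
    msg = queue.get(visibility_timeout=30)
    if msg is None:
        return  # empty poll; caller backs off
    if msg.dequeue_count > POISON_THRESHOLD:
        poison_queue.put(msg.body)  # park for offline inspection
        queue.delete(msg)
        return
    process(msg.body)
    queue.delete(msg)  # delete only after successful processing
```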
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
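The "blob reference" tip for oversized payloads is the work-ticket pattern: write the data to a blob, enqueue only its name, and have the consumer delete the blob after download so no orphans accumulate. A sketch with hypothetical `queue` and `blob_container` stand-ins for the storage clients (method names are illustrative, not SDK calls):

```python
import json
import uuid

MAX_INLINE = 8 * 1024  # the 8 KB queue message limit


def enqueue_payload(queue, blob_container, payload: bytes):
    """Enqueue small payloads inline; spill large ones to a blob."""
    if len(payload) <= MAX_INLINE:
        queue.put(json.dumps({"inline": payload.decode("latin-1")}))
        return
    name = str(uuid.uuid4())
    blob_container.upload(name, payload)
    queue.put(json.dumps({"blob_ref": name}))  # the work ticket


def dequeue_payload(queue, blob_container) -> bytes:
    msg = json.loads(queue.get())
    if "blob_ref" in msg:
        data = blob_container.download(msg["blob_ref"])
        blob_container.delete(msg["blob_ref"])  # GC the spill blob
        return data
    return msg["inline"].encode("latin-1")
```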
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
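The data-parallelism bullet is the simplest to sketch: fan independent work items across a pool sized to the core count, so one role instance keeps all of its (paid-for) cores busy. The slide's example is .NET's Task Parallel Library; the same shape in Python, as a stand-in:

```python
import os
from concurrent.futures import ThreadPoolExecutor


def parallel_map(fn, items, workers=None):
    """Apply `fn` to each item concurrently, defaulting the pool size
    to the machine's core count (data parallelism in the TPL sense)."""
    workers = workers or (os.cpu_count() or 1)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))
```

For CPU-bound Python work a process pool would be the better fit; the point here is the pattern, matching pool size to cores rather than oversubscribing.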
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the costs of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern
• Leverage the multiple cores of one instance
  • Argument "-a" of NCBI-BLAST
  • 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partition → load imbalance
  • Small partition → unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
  • Best practice: do test runs to profile, and set the size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small → repeated computation
  • Too large → unnecessarily long waiting period in case of instance failure
(Task flow: Splitting task → BLAST tasks in parallel → Merging task)
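The split/join pattern above is a few lines of partitioning logic: carve the input sequences into fixed-size partitions, run BLAST on each in parallel, then concatenate the per-partition outputs. A sketch (the 100-sequences-per-partition default anticipates the micro-benchmark result on the next slide; the merge step assumes order-preserving concatenation is an acceptable join, as it is for per-query BLAST output):

```python
def split_query(sequences, partition_size=100):
    """Split step: carve input sequences into fixed-size partitions,
    each of which becomes one queued BLAST task."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]


def merge_results(partial_results):
    """Join step: concatenate per-partition outputs in partition order."""
    merged = []
    for part in partial_results:
        merged.extend(part)
    return merged
```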
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry / NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 cores
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
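Spotting the anomaly above mechanically is a matter of pairing "Executing" records with their matching "is done" records and flagging tasks that never finished. A sketch of that log analysis (the regexes assume the cleaned record format shown above):

```python
import re

START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done, it took ([\d.]+)\s*mins")


def unfinished_tasks(log_lines):
    """Return the ids of tasks that were started but never reported done,
    the 'something is wrong' case in the slide."""
    started, finished = set(), set()
    for line in log_lines:
        m = DONE.search(line)
        if m:
            finished.add(m.group(1))
            continue
        m = START.search(line)
        if m:
            started.add(m.group(1))
    return started - finished
```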
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
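The Penman-Monteith formula above translates directly into code; this is the per-pixel arithmetic the reduction stage repeats across the whole imagery archive. A sketch (the default values for γ and λv are assumptions for illustration; units follow the slide's definitions):

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith (1964):

        ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)

    gamma defaults to ~66 Pa/K (psychrometric constant); lambda_v to
    ~2450 J/g (latent heat of vaporization near 20 °C) – both
    illustrative defaults, not calibrated values.
    """
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```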
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: Scientists → Request Queue → AzureMODIS Service Web Role Portal → Download Queue → Data Collection Stage (Source Imagery Download Sites, Source Metadata) → Reprojection Queue → Reprojection Stage → Reduction 1 Queue → Derivation Reduction Stage → Reduction 2 Queue → Analysis Reduction Stage → Science results → Scientific Results Download)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
(Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage> JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role) ↔ <Input> Data Storage)
Example Pipeline Stage: Reprojection Service
(Diagram: Reprojection Request → Service Monitor (Worker Role) → Persist ReprojectionJobStatus → Job Queue → Parse & Persist ReprojectionTaskStatus → Dispatch → Task Queue → GenericWorker (Worker Role) → Reprojection Data Storage; metadata tables: ScanTimeList, SwathGranuleMeta; Swath Source Data Storage)
• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Data collection stage: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3500 hours, 20–100 workers – $420 CPU, $60 download
Derivation reduction stage: 5–7 GB, 55K files, 1800 hours, 20–100 workers – $216 CPU, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20–100 workers – $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Blocks
• You can upload a file in 'blocks'
• Each block has an id
• Then commit those blocks, in any order, into a blob
• Final blob limited to 1 TB and up to 50,000 blocks
• Can modify a blob by inserting, updating, and removing blocks
• Blocks live for a week before being GC'd if not committed to a blob
• Optimized for streaming
(Diagram: Big.mpg uploaded as blocks 1 6 8 3 5 4 7 2, then committed into Big.mpg)
Pages
• Similar to block blobs
• Optimized for random read/write operations, and provide the ability to write to a range of bytes in a blob
• Call Put Blob to set the max size, then call Put Page
• All pages must align to 512-byte page boundaries
• Writes to page blobs happen in-place and are immediately committed to the blob
• The maximum size for a page blob is 1 TB; a page written to a page blob may be up to 1 TB in size
BLOB Leases
• Creates a 1-minute exclusive write lock on a blob
• Operations: Acquire, Renew, Release, Break
• Must have the lease id to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST
Windows Azure Drive
bull Provides a durable NTFS volume for Windows Azure applications to usebull Use existing NTFS APIs to access a durable
drivebull Durability and survival of data on application failover
bull Enables migrating existing NTFS applications tothe cloud
bullA Windows Azure Drive is a Page Blobbull Example mount Page Blob as X
bull httpltaccountnamegtblobcorewindowsnetltcontainernamegtltblobnamegt
bull All writes to drive are made durable to the Page Blobbull Drive made durable through standard Page Blob
replicationbull Drive persists even when not mounted as a Page
Blob
Windows Azure Drive API
bull Create Drive - Creates a Page Blob formatted as a single partition NTFS volume VHD
bull Initialize Cache ndash Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
bull Mount Drive ndash Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
bull Get Mounted Drives ndash Returns the list of mounted drives It consists of a list of the drive letter and Page Blob URLs for each mounted drive
bull Unmount Drive ndash Unmounts the drive and frees up the drive letter bull Snapshot Drive ndash Allows the client application to create a backup of the
drive (Page Blob) bull Copy Drive ndash Provides the ability to copy a drive or snapshot to another
drive (Page Blob) name to be used as a readwritable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
bull Provides Structured Storagebull Massively Scalable Tablesbull Billions of entities (rows) and TBs of
databull Can use thousands of servers as traffic
grows
bull Highly Available amp Durablebull Data is replicated several times
bull Familiar and Easy to use APIbull WCF Data Services and ODatabull NET classes and LINQbull REST ndash with any platform or language
Is not relationalCan Not-bull Create foreign key relationships between tablesbull Perform server side joins between tablesbull Create custom indexes on the tablesbull No server side Count() for example
All entities must have the following propertiesbull Timestampbull PartitionKeybull RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key:
• Different for each data type (blobs, entities, queues)
The partition key is the unit of scale:
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
The system load balances:
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
Server Busy:
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Or single-partition limits have been reached
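The "exponential backoff on Server Busy" advice can be sketched as a small retry wrapper (a sketch under stated assumptions: `ServerBusy` stands in for an HTTP 503 response, and `with_backoff` is my own helper name, not an SDK call):

```python
import random
import time

class ServerBusy(Exception):
    """Stand-in for an HTTP 503 'Server Busy' response from storage."""

def with_backoff(operation, max_retries=5, base=0.1, cap=5.0, sleep=time.sleep):
    """Retry `operation` with truncated exponential backoff whenever
    the service reports it is busy."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except ServerBusy:
            if attempt == max_retries:
                raise
            # The delay doubles each attempt, truncated at `cap`; jitter
            # keeps many clients from retrying in lockstep.
            sleep(min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0))
```

Injecting `sleep` makes the policy easy to test and to tune separately from the storage calls it wraps.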
Partition Keys In Each Abstraction
• Entities – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

• Blobs – Container name + Blob name: every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarborbighouse.jpg
image | foxboroughgillette.jpg
video | annarborbighouse.jpg

• Messages – Queue name: all messages for a single queue belong to the same partition

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync
(Diagram: Server 1, Server 2, and Server 3 each hold replicas of partitions P1, P2, …, Pn.)
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
Partitions and Partition Ranges

Server A: Table = Movies [Min – Max]

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After the system splits the partition range across servers:

Server A: Table = Movies [Min – Comedy)

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability:
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed:
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions:
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
A query can return a continuation token:
• At a maximum of 1,000 rows in a response
• At the end of a partition range boundary
• At a maximum of 5 seconds of query execution
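Draining a query therefore means looping until the token is gone, not until a page comes back short. A minimal sketch (the `query_page` callable is an assumption standing in for whatever client call returns a page plus a token):

```python
def query_all(query_page):
    """Drain a table query that returns at most 1,000 rows per call.
    `query_page(token)` must return (rows, next_token); next_token is
    None once the result set is exhausted, mirroring the REST contract."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        # A page may be short or even empty (partition-range boundary,
        # 5-second limit) and still carry a token: loop on the token.
        if token is None:
            return rows
```

The key point the code encodes: an empty or short page is not a termination signal; only a missing continuation token is.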
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select a PartitionKey and RowKey that help scale:
• Distribute by using a hash, etc., as a prefix
• Avoid "append-only" patterns
Always handle continuation tokens:
• Expect continuation tokens for range queries
"OR" predicates are not optimized:
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries:
• Server busy means either partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly used with the work-ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout) / RemoveMessage
Msg 2, Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x; a successful poll sets the interval back to 1.
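The doubling/reset rule above fits in one small function (the name `next_poll_interval` and the 60-second ceiling are my choices for illustration):

```python
def next_poll_interval(current, got_message, initial=1.0, maximum=60.0):
    """Truncated exponential back-off polling: each empty poll doubles
    the interval (up to `maximum`); a successful poll resets it to
    `initial` so a busy queue is drained promptly."""
    if got_message:
        return initial
    return min(maximum, current * 2)
```

A worker would call this after every GetMessage, sleeping for the returned interval before polling again; the truncation keeps an idle worker's latency bounded once traffic resumes.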
Removing Poison Messages
Producers P1 and P2 put messages on queue Q; consumers C1 and C2 process them.

Normal dequeue:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2

Consumer crash – the message reappears:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1

Poison message – use the dequeue count:
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible again 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible again 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
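The visibility-timeout and dequeue-count mechanics behind these scenarios can be modeled in a few lines (a toy model, not the service: `ToyQueue` and its method names are mine, and time is passed in explicitly rather than read from a clock):

```python
import heapq

class ToyQueue:
    """Minimal model of Azure queue semantics: GetMessage hides a
    message for `visibility_timeout` seconds instead of deleting it,
    and each dequeue bumps the message's DequeueCount."""
    def __init__(self):
        self._messages = []   # heap of (visible_at, id, body, dequeue_count)
        self._next_id = 0

    def put(self, body, now=0.0):
        heapq.heappush(self._messages, (now, self._next_id, body, 0))
        self._next_id += 1

    def get(self, now, visibility_timeout=30.0):
        if not self._messages or self._messages[0][0] > now:
            return None
        _, mid, body, count = heapq.heappop(self._messages)
        count += 1
        # Reinsert, hidden until the timeout elapses; if the consumer
        # never deletes it, it reappears: at-least-once delivery.
        heapq.heappush(self._messages, (now + visibility_timeout, mid, body, count))
        return {"id": mid, "body": body, "DequeueCount": count}

    def delete(self, mid):
        self._messages = [m for m in self._messages if m[1] != mid]
        heapq.heapify(self._messages)

q = ToyQueue()
q.put("msg 1")
msg = q.get(now=0.0)        # consumer crashes without deleting...
msg = q.get(now=31.0)       # ...so the message is visible again after 30 s
if msg["DequeueCount"] > 2: # poison-message threshold from the slide
    q.delete(msg["id"])
```

Running the poison-message scenario against this model makes the invariant concrete: a message only leaves the queue via an explicit delete, so a crashing consumer can never lose it, only delay it.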
Queues Recap
• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB – use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
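The trade-off can be made concrete with some back-of-the-envelope math (every number here is illustrative, not a real Azure price or throughput; the sub-linear scaling model is an assumption):

```python
import math

def vm_count_and_cost(load, per_core_rate, cores, efficiency, core_hour_price=0.12):
    """Sketch of the small-vs-large VM trade-off. Assume a VM with
    `cores` cores handles per_core_rate * cores * efficiency**(cores-1)
    requests/sec: each extra core is discounted by the scaling
    efficiency, which is what makes big VMs less attractive."""
    rate_per_vm = per_core_rate * cores * efficiency ** (cores - 1)
    vms = math.ceil(load / rate_per_vm)           # enough VMs to carry the load
    return vms, vms * cores * core_hour_price     # (instance count, $/hour)

# At 90% per-core scaling efficiency, eight 1-core VMs undercut 8-core VMs:
small = vm_count_and_cost(load=800, per_core_rate=100, cores=1, efficiency=0.9)
large = vm_count_and_cost(load=800, per_core_rate=100, cores=8, efficiency=0.9)
```

With these made-up numbers the small instances win on cost and give eight failure domains instead of three, which is exactly the uptime argument above; perfect linear scaling (efficiency = 1.0) would erase the gap.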
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using its CPU
• Balance using up the CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
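The data-parallel pattern the Task Parallel Library offers looks the same in any language: split the input, fan the pieces out across a pool sized to the core count, combine the partial results. A minimal sketch (the chunking scheme and `process` work function are placeholders for real per-chunk work):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def process(chunk):
    # Stand-in for per-chunk work (hash, transform, upload, ...).
    return sum(chunk)

def parallel_sum(data, chunks=8):
    """Data parallelism: partition the input, map the pieces across a
    pool sized to the machine's core count, and reduce the results."""
    size = max(1, len(data) // chunks)
    pieces = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
        return sum(pool.map(process, pieces))
```

For CPU-bound Python work a process pool would be the better fit; the structure (partition, map, reduce) is what carries over to the TPL's `Parallel.For` and PLINQ.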
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• There is a trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of having idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • The service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Saving bandwidth costs often leads to savings in other places: sending fewer things over the wire often means getting fewer things from storage, and it means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Chart: compressed vs. uncompressed content for Gzip, minified JavaScript, minified CSS, minified images.)
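A quick way to see what gzip buys before paying bandwidth for a response (the sample payload is invented; real savings depend on how repetitive the content is):

```python
import gzip

def gzip_ratio(payload: bytes) -> float:
    """Compressed size as a fraction of the original: the compute cost
    of gzip is traded for smaller responses and storage."""
    return len(gzip.compress(payload)) / len(payload)

# Repetitive markup (typical of HTML, CSS, JSON) compresses dramatically.
html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"
ratio = gzip_ratio(html)  # well under 10% for a payload like this
```

Already-compressed formats (JPEG, PNG, existing gzip) sit near a ratio of 1.0, which is why the slide treats image minimization as a separate step from gzipping text output.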
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A single BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially: GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
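Query segmentation, the approach AzureBLAST takes below, is attractive precisely because the join step is trivial. A sketch of the split and merge (function names are mine; a real partition would hold FASTA records rather than placeholders):

```python
def split_queries(sequences, partition_size):
    """Query segmentation: cut the input sequence list into fixed-size
    partitions that workers can align against the database
    independently of one another."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(per_partition_hits):
    # Because each partition queried the *whole* database, merging is a
    # plain concatenation -- no reduction step, unlike database
    # segmentation (mpiBLAST), which must re-rank hits across shards.
    return [hit for hits in per_partition_hits for hit in hits]
```

The partition size is the tuning knob discussed under task granularity later: too large and workers idle unevenly, too small and per-task overhead dominates.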
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task Flow
A simple split/join pattern: Splitting task → BLAST task, BLAST task, BLAST task, BLAST task, … → Merging task
• Leverage the multiple cores of one instance
  • Argument "-a" of NCBI-BLAST: 1/2/4/8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partitions: load imbalance
  • Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
  • Best practice: use test runs to profile, and set the partition size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm-cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger worker instances
  • Primarily due to the memory capacity
• Task size / instance size vs. cost
  • The extra-large instance generated the best and most economical throughput
  • Fully utilizes the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
Worker
Worker
Worker
Global dispatch queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Azure Blob (BLAST databases, temporary data, etc.)
Job Registry / NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
The accepted job is stored in the job registry table:
• Fault tolerance – avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences in total to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it manually
End Result
• The total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like this:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
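Detecting the "something is wrong" case mechanically amounts to pairing start and done records (a sketch against the log shape shown above; the patterns assume that exact phrasing):

```python
import re

START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")

def unfinished_tasks(log_lines):
    """Scan worker logs for tasks that logged a start but never logged
    completion -- the signature of a crashed or preempted instance."""
    started, finished = set(), set()
    for line in log_lines:
        if m := DONE.search(line):
            finished.add(m.group(1))
        elif m := START.search(line):
            started.add(m.group(1))
    return sorted(started - finished)
```

Grouping these orphaned tasks by node and timestamp is what exposes the update-domain and fault-domain patterns on the next slides.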
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups: this is an update domain.
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and then the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / (λv·(Δ + γ·(1 + ga/gs)))
where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating the resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
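The Penman-Monteith relation above is a direct pointwise computation once the inputs are in hand (the hard part is producing them); a transcription, with the default γ and λv taken from the definitions and all sample magnitudes merely illustrative:

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET, units as defined above (gamma in Pa/K,
    lambda_v in J/g). The per-pixel reduction stage applies exactly
    this kind of formula across the reprojected imagery."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = lambda_v * (delta + gamma * (1.0 + g_a / g_s))
    return numerator / denominator
```

The conductivities ga and gs are the inputs the slide calls "not so simple": they come from the sensor, vegetation, and climate datasets synthesized on the next slide, not from the imagery alone.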
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
1. Data collection (map) stage
  • Downloads requested input tiles from NASA FTP sites
  • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
  • Converts source tile(s) to intermediate-result sinusoidal tiles
  • Simple nearest-neighbor or spline algorithms
3. Derivation reduction stage
  • First stage visible to the scientist
  • Computes ET in our initial use
4. Analysis reduction stage
  • Optional second stage visible to the scientist
  • Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx

MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • The execution status of all jobs and tasks is persisted in Tables

Flow: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage> JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch to <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

Flow: Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch to <PipelineStage> Task Queue → Generic Worker (Worker Role) → <Input> Data Storage
Example Pipeline Stage: Reprojection Service
Flow: Reprojection Request → Service Monitor (Worker Role), which persists the Reprojection JobStatus and parses & persists the Reprojection TaskStatus → Dispatch to the Task Queue → Generic Worker (Worker Role) → Reprojection Data Storage
• Each Job Queue entity specifies a single reprojection job request
• Each Task Queue entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
• Swath source data storage holds the input imagery
Costs for 1 US Year ET Computation
• Computational costs are driven by the data scale and the need to run the reduction multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
Per-stage figures and costs (from the pipeline diagram):
• Data collection: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3,500 hours, 20–100 workers – $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files, 1,800 hours, 20–100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1,800 hours, 20–100 workers – $216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and they have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com

Contents:
- Windows Azure for Research, Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using Queues for Reliable Messaging
- Scalable, Fault-Tolerant Applications
- Key Components – Compute: VM Roles
- 'Grokking' the Service Model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the Cloud
- Service Management API
- The Secret Sauce – The Fabric
- Durable Storage, at Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is Not Relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys in Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back-Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a Platform for H2 Production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by Analyzing Logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure: Four-Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
Pagesbull Similar to block blobsbull Optimized for random readwrite operations and
provide the ability to write to a range of bytes in a blob
bull Call Put Blob set max size Then call Put Pagebull All pages must align 512-byte page boundariesbull Writes to page blobs happen in-place and are
immediately committed to the blobbull The maximum size for a page blob is 1 TB A
page written to a page blob may be up to 1 TB in size
BLOB Leases
bull Creates a 1 minute exclusive write lock on a BLOB
bull Operations Acquire Renew Release Break
bull Must have the lease id to perform operations
bull Can check LeaseStatus property
bull Currently can only be done through REST
Windows Azure Drive
bull Provides a durable NTFS volume for Windows Azure applications to usebull Use existing NTFS APIs to access a durable
drivebull Durability and survival of data on application failover
bull Enables migrating existing NTFS applications tothe cloud
bullA Windows Azure Drive is a Page Blobbull Example mount Page Blob as X
bull httpltaccountnamegtblobcorewindowsnetltcontainernamegtltblobnamegt
bull All writes to drive are made durable to the Page Blobbull Drive made durable through standard Page Blob
replicationbull Drive persists even when not mounted as a Page
Blob
Windows Azure Drive API
bull Create Drive - Creates a Page Blob formatted as a single partition NTFS volume VHD
bull Initialize Cache ndash Allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
bull Mount Drive ndash Takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
bull Get Mounted Drives ndash Returns the list of mounted drives It consists of a list of the drive letter and Page Blob URLs for each mounted drive
bull Unmount Drive ndash Unmounts the drive and frees up the drive letter bull Snapshot Drive ndash Allows the client application to create a backup of the
drive (Page Blob) bull Copy Drive ndash Provides the ability to copy a drive or snapshot to another
drive (Page Blob) name to be used as a readwritable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
bull Provides Structured Storagebull Massively Scalable Tablesbull Billions of entities (rows) and TBs of
databull Can use thousands of servers as traffic
grows
bull Highly Available amp Durablebull Data is replicated several times
bull Familiar and Easy to use APIbull WCF Data Services and ODatabull NET classes and LINQbull REST ndash with any platform or language
Is not relationalCan Not-bull Create foreign key relationships between tablesbull Perform server side joins between tablesbull Create custom indexes on the tablesbull No server side Count() for example
All entities must have the following propertiesbull Timestampbull PartitionKeybull RowKey
Windows Azure Queues
bull Queue are performance efficient highly available and provide reliable message deliverybull Simple asynchronous work
dispatch
bull Programming semantics ensure that a message can be processed at least once
bull Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance

Every data object has a partition key
• Different for each data type (blobs, entities, queues)

Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality

System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server

Server Busy
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Single-partition limits have been reached
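The "exponential backoff on Server Busy" advice can be sketched as a small retry wrapper. `do_request` is a placeholder for any storage call; the retry count, base delay, and jitter are assumptions, not values from the slide.

```python
import random
import time

# Hedged sketch of exponential backoff on "Server Busy" (HTTP 503).
# `do_request` stands in for any storage operation and must return
# (status_code, body).
def with_backoff(do_request, max_retries=5, base_delay=0.5):
    for attempt in range(max_retries):
        status, body = do_request()
        if status != 503:                          # not "Server Busy"
            return status, body
        # Exponential backoff with jitter: ~0.5s, 1s, 2s, 4s, ...
        delay = base_delay * (2 ** attempt) * (0.5 + random.random())
        time.sleep(delay)
    return status, body                            # give up after max_retries
```

Giving the system a few minutes to load balance a hot partition is exactly what this waiting buys you.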
Partition Keys In Each Abstraction

Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order – 1             |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order – 3             |              |                     | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
Replication Guarantee

• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync

(Diagram: partitions P1, P2, …, Pn each replicated across Server 1, Server 2, and Server 3.)
Scalability Targets

Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second

Single Queue/Table Partition
• Up to 500 transactions per second

Single Blob Partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff.
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Partitions and Partition Ranges

Initially one server holds the whole range:
Server A: Table = Movies [Min – Max]

After load balancing, the range is split across servers:
Server A: Table = Movies [Min – Comedy)
Server B: Table = Movies [Comedy – Max]
Key Selection: Things to Consider

Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability

Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient

Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously

A query can return a continuation token in any of these cases:
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
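The loop that this rule implies can be sketched as follows. `query_page` is a stand-in for one REST round trip against the table service; the real client API is not shown on the slides.

```python
# Sketch of draining a table query that returns continuation tokens.
# `query_page(token)` stands in for one REST round trip; it returns
# (rows, next_token), where next_token is None once the result set ends.
# The service returns at most 1000 rows per response, and may hand back a
# token early at a partition boundary or after 5 seconds of execution --
# so the token must be checked even when fewer than 1000 rows came back.
def query_all(query_page):
    rows, token = query_page(None)
    results = list(rows)
    while token is not None:
        rows, token = query_page(token)
        results.extend(rows)
    return results
```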
Tables Recap

• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

Select a PartitionKey and RowKey that help scale
• Distribute load by using a hash etc. as a prefix

Avoid "append only" patterns

Always handle continuation tokens
• Expect continuation tokens for range queries

"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries

Implement a back-off strategy for retries
• Server busy
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
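The "hash as a prefix" tip can be sketched like this: spread an append-only natural key (e.g. a timestamp) over N buckets so writes do not all land on one hot partition. The bucket count and key shape are illustrative assumptions.

```python
import hashlib

# Sketch of distributing load with a hash prefix on the PartitionKey.
# Append-only keys such as timestamps all hit the last partition; adding
# a stable hash bucket as a prefix spreads writes across `buckets`
# partitions. Readers must then fan queries out over all buckets.
def bucketed_partition_key(natural_key, buckets=16):
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return "{:02d}-{}".format(bucket, natural_key)
```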
WCF Data Services

• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications

• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
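The work-ticket pattern mentioned above can be sketched as: store the large payload as a blob, and enqueue only a small reference ("ticket"). The in-memory `blob_store` and `queue` objects below are stand-ins for the real storage services.

```python
import uuid

# Sketch of the work-ticket pattern for payloads larger than the 8 KB
# message limit: the data goes to blob storage, the queue carries only a
# tiny reference. `blob_store` (dict) and `queue` (list) are stand-ins
# for the blob and queue services.
def enqueue_work(blob_store, queue, payload: bytes):
    ticket = str(uuid.uuid4())
    blob_store[ticket] = payload          # large data goes to blob storage
    queue.append(ticket)                  # the queue message stays tiny
    return ticket

def process_work(blob_store, queue, handler):
    ticket = queue.pop(0)                 # GetMessage
    handler(blob_store[ticket])           # do the work (keep it idempotent)
    del blob_store[ticket]                # garbage collect the orphaned blob
                                          # (DeleteMessage would follow here)
```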
Queue Terminology

Message Lifecycle

(Diagram: a Web Role calls PutMessage to add Msg 1–4 to the queue; a Worker Role calls GetMessage with a visibility timeout to dequeue a message, and RemoveMessage to delete it once processed.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling

Consider a backoff polling approach: each empty poll increases the interval by 2x, up to some maximum; a successful poll sets the interval back to 1.
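The interval rule above fits in a few lines. The minimum of 1 and the cap of 60 (e.g. seconds) are assumptions for illustration; the slide only specifies "doubles on empty, resets on success".

```python
# Sketch of truncated exponential backoff polling: each empty poll
# doubles the interval up to a cap; a successful poll resets it to the
# minimum. Units and the 60-unit cap are illustrative assumptions.
def next_interval(current, got_message, minimum=1, maximum=60):
    if got_message:
        return minimum
    return min(current * 2, maximum)
```

A worker loop would sleep `next_interval(...)` between GetMessage calls, trading latency on an idle queue for fewer billable transactions.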
Removing Poison Messages

(Diagram: producers P1, P2 and consumers C1, C2 working against a queue, with 30-second visibility timeouts. The sequence below merges the three animation steps of the slides.)

1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1) – msg 1 is removed as a poison message
Queues Recap

• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use DequeueCount to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message
  • Batch messages
  • Garbage collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
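The DequeueCount rule from the recap can be sketched as a guard in the worker loop. The threshold of 3 and the dead-letter list are illustrative assumptions; the slides only say "enforce a threshold on a message's dequeue count".

```python
# Sketch of poison-message handling: a message dequeued more than
# `threshold` times is assumed to be repeatedly crashing its consumer,
# so it is diverted to a dead-letter list (here a plain list) instead of
# being processed again.
def handle(message, dequeue_count, process, dead_letter, threshold=3):
    if dequeue_count > threshold:
        dead_letter.append(message)   # remove the poison message from circulation
        return "dead-lettered"
    process(message)                  # normal, idempotent processing
    return "processed"
```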
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size

• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?

• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency

• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
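The slide's advice targets the .NET 4 Task Parallel Library; the same data-parallel idea can be sketched with Python's standard library: fan a function out over a collection using a pool sized to the instance's core count (the pool size of 8 is just the extra-large-instance core count used as an example).

```python
from concurrent.futures import ThreadPoolExecutor

# Data parallelism sketch: apply `fn` to every item using a worker pool.
# Sizing the pool to the core count mirrors the "use the whole VM" tip.
def parallel_map(fn, items, workers=8):
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))   # results keep input order

squares = parallel_map(lambda x: x * x, range(10))
```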
Finding Good Code Neighbors

• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately

• Monitor your application and make sure you're scaled appropriately (not over-scaled)
  • Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs
Storage Costs

• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs

• Bandwidth costs are a huge part of any popular web app's billing profile
• Saving bandwidth costs often leads to savings in other places
  • Sending fewer things over the wire often means getting fewer things from storage
  • Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content

1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Chart: uncompressed vs. compressed content sizes – gzip/minify JavaScript, minify CSS, minify images.)
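Tip 1 is a one-liner with Python's standard library; the sample HTML body below is only there to show that repetitive markup compresses well.

```python
import gzip

# Gzip-compressing output content, as in tip 1. Textual content (HTML,
# JSON, JavaScript) typically shrinks several-fold; browsers advertise
# support via "Accept-Encoding: gzip" and decompress on the fly.
def gzip_body(body: bytes) -> bytes:
    return gzip.compress(body)

html = b"<html>" + b"<p>hello azure</p>" * 500 + b"</html>"
compressed = gzip_body(html)
```

Every byte not sent is bandwidth not billed, which is the point of this section.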
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing

Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
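Input segmentation, the "pleasingly parallel" half of the story, can be sketched as splitting a FASTA file into fixed-size partitions of sequences, each of which can be BLASTed independently. The 100-sequences-per-partition figure echoes the micro-benchmarks later in this deck; the parsing below is deliberately minimal.

```python
# Sketch of query segmentation for BLAST: split FASTA text into
# partitions of `per_partition` sequences. A FASTA record starts with
# ">"; everything up to the next ">" belongs to one sequence.
def split_fasta(text, per_partition=100):
    records = [">" + r for r in text.split(">") if r.strip()]
    return [records[i:i + per_partition]
            for i in range(0, len(records), per_partition)]
```

Each partition would become one queue message (or work ticket) for a worker instance to process.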
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out into many BLAST tasks, followed by a merging task.

Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes

Task granularity
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: do test runs to profile, and set the size to mitigate the overhead

Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design

Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity

Task size/instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST Architecture

(Diagram: a Web Role hosts the Web Portal and Web Service for job registration. A Job Management Role runs the Job Scheduler and Scaling Engine, dispatching work to Worker Roles via a global dispatch queue. Azure Tables hold the job registry; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc. A separate Database Updating Role refreshes the NCBI databases. Task flow: a splitting task fans out into BLAST tasks, followed by a merging task.)
AzureBLAST Job Portal

ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs

Authentication/authorization based on Live ID

The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state

(Diagram: the Job Portal sits in front of the Web Service for job registration; the Job Scheduler and Scaling Engine consume the Job Registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences

"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons

Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually

(Diagram: node counts per deployment – 50, 62, 62, 62, 62, 62, 50, 62.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • Based on our estimates, real working instance time should be 6–8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs

A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
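The anomaly the slide points at – a task that started but never logged a completion line – is easy to detect mechanically. A sketch, assuming the log format shown in the examples above:

```python
import re

# Sketch of scanning worker logs for tasks that were started
# ("Executing the task N") but have no matching completion line
# ("Execution of task N is done"). Returns the set of suspect task IDs.
def unfinished_tasks(lines):
    started, done = set(), set()
    for line in lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return started - done
```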
Surviving System Upgrades

North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups – this is an update domain
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures

West Europe Data Center: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants
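The Penman-Monteith formula above transcribes directly into code; this is the per-pixel computation the derivation stage of the pipeline performs at scale. The default γ and λv values follow the slide's variable list; any sample inputs are illustrative only.

```python
# Direct transcription of the Penman-Monteith formula:
#   ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
# Units follow the slide's variable list (gamma ~ 66 Pa/K; lambda_v in J/g).
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    numerator = delta * Rn + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```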
ET Synthesizes Imagery, Sensors, Models, and Field Data

• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors

(Diagram: the AzureMODIS Service Web Role Portal takes scientists' requests into a Request Queue; the Data Collection Stage pulls source imagery from download sites via a Download Queue; the Reprojection Queue feeds the Reprojection Stage; the Reduction 1 and Reduction 2 Queues feed the Derivation and Analysis Reduction Stages; scientific results are then available for download.)

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

(Diagram: a <PipelineStage> Request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus, then dispatches to the <PipelineStage> Task Queue.)

MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

(Diagram: GenericWorker (Worker Role) instances dequeue from the <PipelineStage> Task Queue and read/write <Input>Data Storage.)
Example Pipeline Stage: Reprojection Service

(Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; GenericWorker roles process the tasks against Reprojection Data Storage and Swath Source Data Storage.)

• Each job-table entity specifies a single reprojection job request
• Each task-table entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation

• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per stage (approximate mapping of the slide's figures to the pipeline stages):
• Data Collection Stage: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20–100 workers – $420 CPU, $60 download
• Derivation Reduction Stage: 5–7 GB, 55K files, 1800 hours, 20–100 workers – $216 CPU, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20–100 workers – $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience

• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
BLOB Leases

• Creates a 1-minute exclusive write lock on a BLOB
• Operations: Acquire, Renew, Release, Break
• Must have the lease ID to perform operations
• Can check the LeaseStatus property
• Currently can only be done through REST

Windows Azure Drive

• Provides a durable NTFS volume for Windows Azure applications to use
  • Use existing NTFS APIs to access a durable drive
  • Durability and survival of data on application failover
• Enables migrating existing NTFS applications to the cloud
• A Windows Azure Drive is a Page Blob
  • Example: mount a Page Blob as X:\
  • http://<accountname>.blob.core.windows.net/<containername>/<blobname>
• All writes to the drive are made durable to the Page Blob
  • The drive is made durable through standard Page Blob replication
  • The drive persists as a Page Blob even when not mounted

Windows Azure Drive API

• Create Drive – creates a Page Blob formatted as a single-partition NTFS volume VHD
• Initialize Cache – allows an application to specify the location and size of the local data cache for all Windows Azure Drives mounted for that VM instance
• Mount Drive – takes a formatted Page Blob and mounts it to a drive letter for the Windows Azure application to start using
• Get Mounted Drives – returns the list of mounted drives; it consists of the drive letter and Page Blob URL for each mounted drive
• Unmount Drive – unmounts the drive and frees up the drive letter
• Snapshot Drive – allows the client application to create a backup of the drive (Page Blob)
• Copy Drive – provides the ability to copy a drive or snapshot to another drive (Page Blob) name, to be used as a read/writable drive
BLOB Guidance
bull Manage connection stringskeys in cscfgbull Do not share keys wrap with a servicebull Strategy for accounts and containersbull You can assign a custom domain to your storage
accountbull There is no method to detect container
existence call FetchAttributes() and detect the error if it doesnrsquot exist
Table Structure
Account MovieData
Star WarsStar TrekFan Boys
Table Name Movies
Brian H PrinceJason ArgonautBill Gates
Table Name Customers
Account
Table
Entity
Tables store entities Entity schema can vary in the same table
Windows Azure Tables
bull Provides Structured Storagebull Massively Scalable Tablesbull Billions of entities (rows) and TBs of
databull Can use thousands of servers as traffic
grows
bull Highly Available amp Durablebull Data is replicated several times
bull Familiar and Easy to use APIbull WCF Data Services and ODatabull NET classes and LINQbull REST ndash with any platform or language
Is not relationalCan Not-bull Create foreign key relationships between tablesbull Perform server side joins between tablesbull Create custom indexes on the tablesbull No server side Count() for example
All entities must have the following propertiesbull Timestampbull PartitionKeybull RowKey
Windows Azure Queues
bull Queue are performance efficient highly available and provide reliable message deliverybull Simple asynchronous work
dispatch
bull Programming semantics ensure that a message can be processed at least once
bull Access is provided via REST
Storage PartitioningUnderstanding partitioning is key to understanding
performance
bull Different for each data type (blobs entities queues)Every data object has a
partition key
bull A partition can be served by a single serverbull System load balances partitions based on traffic patternbull Controls entity locality
Partition key is unit of scale
bull Load balancing can take a few minutes to kick inbull Can take a couple of seconds for partition to be available on a
different serverSystem load balances
bull Use exponential backoff on ldquoServer Busyrdquobull Our system load balances to meet your traffic needsbull Single partition limits have been reached
Server Busy
Partition Keys In Each Abstraction
bull Entities w same PartitionKey value served from same partitionEntities ndash TableName +
PartitionKeyPartitionKey (CustomerId) RowKey
(RowKind)Name CreditCardNumber OrderTotal
1 Customer-John Smith John Smith xxxx-xxxx-xxxx-xxxx
1 Order ndash 1 $3512
2 Customer-Bill Johnson Bill Johnson xxxx-xxxx-xxxx-xxxx
2 Order ndash 3 $1000
bull Every blob and its snapshots are in a single partitionBlobs ndash Container name +
Blob name
bull All messages for a single queue belong to the same partitionMessages ndash Queue Name
Container Name Blob Name
image annarborbighousejpg
image foxboroughgillettejpg
video annarborbighousejpg
Queue Message
jobs Message1
jobs Message2
workflow Message1
Replication Guarantee
bull All Azure Storage data exists in three replicasbull Replicas are created as neededbull A write operation is not complete until it has
written to all three replicasbull Reads are only load balanced to replicas in
syncServer 1 Server 2 Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability Targets
Storage account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
Partitions and Partition Ranges

Server A – Table = Movies [Min - Max]:

PartitionKey (Category) | RowKey (Title)            | Timestamp | ReleaseDate
Action                  | Fast & Furious            | …         | 2009
Action                  | The Bourne Ultimatum      | …         | 2007
…                       | …                         | …         | …
Animation               | Open Season 2             | …         | 2009
Animation               | The Ant Bully             | …         | 2006
…                       | …                         | …         | …
Comedy                  | Office Space              | …         | 1999
…                       | …                         | …         | …
SciFi                   | X-Men Origins: Wolverine  | …         | 2009
…                       | …                         | …         | …
War                     | Defiance                  | …         | 2008

After the partition splits, each server owns a key range:
Server A – Table = Movies [Min - Comedy): the Action … Animation rows
Server B – Table = Movies [Comedy - Max]: the Comedy … War rows
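The range split above can be modeled as a sorted key-range lookup. This is an illustrative sketch of how a range-partitioned table maps a PartitionKey to a serving node, not the actual Azure partition-map implementation; `RANGES` and `server_for` are hypothetical names.

```python
import bisect

# Each entry is (lowest key served, server name), sorted by key.
# After the Movies table splits at "Comedy":
#   [Min - Comedy) -> Server A,  [Comedy - Max] -> Server B
RANGES = [("", "Server A"), ("Comedy", "Server B")]

def server_for(partition_key):
    """Locate the server whose key range contains partition_key."""
    lows = [low for low, _ in RANGES]
    # bisect_right finds the first range whose low key exceeds the key;
    # the range just before it is the one that contains the key.
    return RANGES[bisect.bisect_right(lows, partition_key) - 1][1]
```

For example, `server_for("Action")` falls in the first range and `server_for("War")` in the second, which is why a scan across categories may cross servers and return continuation tokens at the range boundary.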
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query efficiency & speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information

Expect Continuation Tokens – Seriously
A query returns a continuation token at:
• A maximum of 1000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds to execute the query
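A correct client therefore loops until no token comes back. The sketch below assumes a hypothetical `query_page(token)` helper that wraps one REST round trip and returns `(rows, next_token)`, mirroring the Table service's `x-ms-continuation-*` headers; the simulated service stands in for real pages.

```python
def query_all(query_page):
    """Drain a paged table query, following continuation tokens.

    `query_page(token)` is assumed to return (rows, next_token), with
    next_token None once the results are exhausted.
    """
    rows, token = query_page(None)
    results = list(rows)
    while token is not None:          # a token can arrive even with < 1000 rows
        page, token = query_page(token)
        results.extend(page)
    return results

# Simulated service that returns at most 2 rows per call.
DATA = ["r1", "r2", "r3", "r4", "r5"]
def fake_page(token):
    start = token or 0
    page = DATA[start:start + 2]
    nxt = start + 2 if start + 2 < len(DATA) else None
    return page, nxt

all_rows = query_all(fake_page)
```

Note the loop tests the token, not the page size: a short page with a token (e.g. at a partition range boundary or the 5-second limit) still has more data behind it.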
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Distribute by using a hash etc. as a prefix
Avoid "append only" patterns
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• Server busy
• Load balance partitions to meet traffic needs
• Load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle
(Diagram: a Web Role calls PutMessage to add Msg 1 … Msg 4 to the queue; Worker Roles call GetMessage with a visibility timeout to retrieve messages, and RemoveMessage once processing completes.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
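The get/delete lifecycle shown in these requests can be simulated with a tiny in-memory model. This sketch only imitates the semantics (visibility timeout, pop receipt); it is not the queue service API, and `TinyQueue` is a made-up name.

```python
import itertools
import time

class TinyQueue:
    """In-memory sketch of the queue message lifecycle (not the real API)."""
    def __init__(self):
        self._msgs = []                    # [ [visible_at, pop_receipt, text] ]
        self._receipts = itertools.count()

    def put(self, text):
        self._msgs.append([0.0, None, text])

    def get(self, visibility_timeout=30.0, now=None):
        """Return (pop_receipt, text) of the first visible message, hiding it."""
        now = time.monotonic() if now is None else now
        for m in self._msgs:
            if m[0] <= now:
                m[0] = now + visibility_timeout   # invisible until the timeout
                m[1] = next(self._receipts)       # fresh pop receipt per dequeue
                return m[1], m[2]
        return None

    def delete(self, pop_receipt):
        self._msgs = [m for m in self._msgs if m[1] != pop_receipt]

q = TinyQueue()
q.put("work item 1")
receipt, text = q.get(visibility_timeout=30.0, now=0.0)
q.delete(receipt)        # processing succeeded: remove before the timeout expires
```

The key property this models: if the consumer crashes before calling delete, the message simply becomes visible again after the timeout, so no work ticket is ever lost.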
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
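The truncated back-off policy above reduces transaction charges on idle queues while keeping latency low when traffic returns. A minimal sketch (function and parameter names are illustrative):

```python
def next_poll_interval(interval, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off for queue polling:
    empty poll -> double the interval (up to `ceiling`),
    successful poll -> reset to `floor`."""
    return floor if got_message else min(interval * 2, ceiling)

# 1 -> 2 -> 4 -> 8 on empty polls, reset to 1 on a hit, then 2 again.
intervals, interval = [], 1.0
for got in [False, False, False, True, False]:
    interval = next_poll_interval(interval, got)
    intervals.append(interval)
```

The "truncated" part is the ceiling: without it, a long-idle worker could back off so far that it takes minutes to notice new work.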
Removing Poison Messages
Producers P1, P2 enqueue messages; consumers C1, C2 dequeue with a 30-second visibility timeout:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1 (dequeue count now 2)
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1 (dequeue count now 3)
12. DequeueCount > 2
13. C1: Delete(Q, msg 1) – the poison message is removed instead of being retried forever
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order delivery
Use dequeue count to remove poison messages
• Enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message
• For messages > 8 KB
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
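The dequeue-count threshold can be folded into a worker loop like the sketch below. The callback names (`get_message`, `move_to_poison`, etc.) are hypothetical; the real service exposes DequeueCount on each retrieved message.

```python
def process_queue(get_message, handle, delete, move_to_poison, max_dequeue=3):
    """Drain a queue, shunting repeatedly failing messages to a poison queue.

    `get_message()` is assumed to return (msg, dequeue_count) or None.
    """
    while True:
        item = get_message()
        if item is None:
            break
        msg, dequeue_count = item
        if dequeue_count > max_dequeue:
            move_to_poison(msg)       # give up: park it for offline inspection
            delete(msg)
            continue
        try:
            handle(msg)
            delete(msg)               # delete only after successful processing
        except Exception:
            pass                      # leave invisible; it reappears after the timeout

# Simulated queue: "bad" always fails, so its dequeue count keeps climbing.
pending = [("good", 1), ("bad", 1), ("bad", 2), ("bad", 3), ("bad", 4)]
poison, deleted = [], []

def get_message():
    return pending.pop(0) if pending else None

def handle(msg):
    if msg == "bad":
        raise RuntimeError("cannot process")

process_queue(get_message, handle, delete=deleted.append, move_to_poison=poison.append)
```

Without the threshold, the "bad" message would cycle between invisible and visible forever, wasting one visibility-timeout slot per retry.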
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using much CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
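The task-parallel pattern (the slides point to the .NET 4 Task Parallel Library) can be sketched in Python with a worker pool: one independent task per chunk of work, scheduled across the instance's cores. `checksum` is a made-up stand-in for a real unit of work.

```python
import os
from concurrent.futures import ThreadPoolExecutor

def checksum(chunk):
    """Stand-in for one independent unit of work (a 'task')."""
    return sum(chunk) % 251

chunks = [list(range(i, i + 100)) for i in range(0, 1000, 100)]

# Task parallelism: one task per chunk, scheduled over a worker pool
# sized to the VM's core count.
with ThreadPoolExecutor(max_workers=os.cpu_count() or 4) as pool:
    results = list(pool.map(checksum, chunks))
```

For CPU-bound Python work a process pool would be the closer analogue (threads share one interpreter lock); the structure of the pattern is the same either way.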
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity (performance) and the cost of having idling VMs
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
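The bandwidth win from gzipping markup is easy to demonstrate; repetitive HTML/JS/CSS compresses dramatically. The page content here is an arbitrary example, not from the deck:

```python
import gzip

# Repetitive markup (typical of HTML/JS/CSS) compresses extremely well.
page = b"<html><body>" + b"<p>Hello, cloud!</p>" * 200 + b"</body></html>"
compressed = gzip.compress(page)

ratio = len(compressed) / len(page)   # bytes on the wire vs. bytes generated
```

Real pages rarely compress this far, but 60-80% reductions on text content are common, which translates directly into the bandwidth-billing savings described above.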
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 ~ 1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple Split/Join pattern: a splitting task fans the input out to many BLAST tasks, and a merging task joins their results.
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data-transfer overhead
Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
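The query-segmentation (split) step amounts to chunking the input sequences into fixed-size partitions, one per BLAST task. A minimal sketch (the 100-sequences-per-partition figure is the sweet spot reported in the micro-benchmarks below; names here are illustrative):

```python
def partition(sequences, size):
    """Split input sequences into fixed-size partitions (the last may be short)."""
    return [sequences[i:i + size] for i in range(0, len(sequences), size)]

seqs = ["seq%d" % i for i in range(250)]
tasks = partition(seqs, 100)   # one BLAST task per partition
```

This is where the granularity trade-off lives: a larger `size` means fewer tasks and less queue/transfer overhead but worse load balance; a smaller `size` means the opposite.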
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine against a job registry kept in an Azure Table; worker roles pull splitting, BLAST, and merging tasks from a global dispatch queue; Azure Blobs hold the NCBI databases, BLAST databases, and temporary data; a database-updating role keeps the NCBI databases current.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 cores
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6~8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
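Detecting the anomaly programmatically is a matter of diffing "Executing" records against "done" records per task ID. A sketch, assuming the log format shown on the slide:

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started = set(re.findall(r"Executing the task (\d+)", LOG))
finished = set(re.findall(r"Execution of task (\d+) is done", LOG))
lost = started - finished   # tasks that began but never reported completion
```

Applied to the full job logs, this kind of scan is what surfaced the update-domain and fault-domain events on the next two slides.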
Surviving System Upgrades
North Europe Data Center: in total 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe Data Center: 30,976 tasks completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
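The Penman-Monteith formula translates directly into a function. This is an illustrative transcription of the equation only; the sample input values below are arbitrary, not field data, and the λv default assumes ~2450 J/g for the latent heat of vaporization.

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

    Units follow the variable list on the slide; gamma defaults to the
    psychrometric constant (~66 Pa/K).
    """
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Arbitrary example inputs, purely to exercise the formula.
et = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                     dq=800.0, g_a=0.02, g_s=0.01)
```

The computationally hard part of MODISAzure is not this arithmetic but producing consistent gridded inputs (Rn, ga, gs, δq) for every cell, which is what the pipeline stages below do.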
ET Synthesizes Imagery, Sensors, Models and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
(Diagram: scientists submit requests through the AzureMODIS Service Web Role portal; work flows through the download, reprojection, reduction 1, and reduction 2 queues, from the source imagery download sites to scientific results download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus; GenericWorker (Worker Role) instances dispatch from the <PipelineStage> Task Queue and read <Input>Data Storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a Reprojection Request enters the Job Queue; the Service Monitor persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus; GenericWorker instances dispatch from the Task Queue and work against Reprojection Data Storage and Swath Source Data Storage.)
• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage figures (from the pipeline diagram, via the AzureMODIS Service Web Role portal):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
bull Programming semantics ensure that a message can be processed at least once
bull Access is provided via REST
Storage PartitioningUnderstanding partitioning is key to understanding
performance
bull Different for each data type (blobs entities queues)Every data object has a
partition key
bull A partition can be served by a single serverbull System load balances partitions based on traffic patternbull Controls entity locality
Partition key is unit of scale
bull Load balancing can take a few minutes to kick inbull Can take a couple of seconds for partition to be available on a
different serverSystem load balances
bull Use exponential backoff on ldquoServer Busyrdquobull Our system load balances to meet your traffic needsbull Single partition limits have been reached
Server Busy
Partition Keys In Each Abstraction
bull Entities w same PartitionKey value served from same partitionEntities ndash TableName +
PartitionKeyPartitionKey (CustomerId) RowKey
(RowKind)Name CreditCardNumber OrderTotal
1 Customer-John Smith John Smith xxxx-xxxx-xxxx-xxxx
1 Order ndash 1 $3512
2 Customer-Bill Johnson Bill Johnson xxxx-xxxx-xxxx-xxxx
2 Order ndash 3 $1000
bull Every blob and its snapshots are in a single partitionBlobs ndash Container name +
Blob name
bull All messages for a single queue belong to the same partitionMessages ndash Queue Name
Container Name Blob Name
image annarborbighousejpg
image foxboroughgillettejpg
video annarborbighousejpg
Queue Message
jobs Message1
jobs Message2
workflow Message1
Replication Guarantee
bull All Azure Storage data exists in three replicasbull Replicas are created as neededbull A write operation is not complete until it has
written to all three replicasbull Reads are only load balanced to replicas in
syncServer 1 Server 2 Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability TargetsStorage Account
bull Capacity ndash Up to 100 TBsbull Transactions ndash Up to a few thousand requests per secondbull Bandwidth ndash Up to a few hundred megabytes per second
Single QueueTable Partition
bull Up to 500 transactions per second
To go above these numbers partition between multiple storage accounts and partitions
When limit is hit app will see lsquo503 server busyrsquo applications should implement exponential backoff
Single Blob Partition
bull Throughput up to 60 MBs
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
hellip hellip hellip hellip
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
Partitions and Partition Ranges
Server BTable = Movies[Comedy - Max]
Server ATable = Movies[Min - Comedy)
Server ATable = Movies
[Min - Max]
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
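Handling continuation tokens amounts to a drain loop. A minimal sketch, independent of any SDK: `query_page` is a hypothetical callable that takes a token (or None) and returns a page of rows plus the next token.

```python
def fetch_all(query_page):
    """Drain a segmented table query. The service may hand back a
    continuation token even with fewer than 1000 rows, e.g. at a
    partition range boundary or after the 5-second execution limit,
    so loop until the token is None."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:        # no continuation token: query complete
            return rows

# Hypothetical paged source with three pages of results.
def fake_query(token):
    pages = {None: ([1, 2], 'a'), 'a': ([3], 'b'), 'b': ([4, 5], None)}
    return pages[token]
```

Code that stops after the first page silently drops data, which is why the slide says "seriously".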
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
• Avoid "append only" patterns – distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server busy" means the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
• Tight coupling leads to brittleness; loose coupling can aid scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML, and are limited to 8 KB in size
• Commonly used with the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to place messages (Msg 1–4) on a queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve a message and RemoveMessage to delete it once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
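The get/delete lifecycle above can be modeled in a few lines. This toy queue (an assumption for illustration, not the real service) shows the key semantic: GetMessage hides a message for the visibility timeout rather than removing it, and only DeleteMessage (with the pop receipt) removes it for good.

```python
import time

class ToyQueue:
    """Minimal model of the queue message lifecycle."""
    def __init__(self):
        self.messages = {}   # id -> [text, visible_at]
        self._next = 0
    def put(self, text):
        self.messages[self._next] = [text, 0.0]
        self._next += 1
    def get(self, timeout=30.0, now=None):
        """Return (id, text) of the first visible message, hiding it
        for `timeout` seconds; the id acts as the pop receipt."""
        now = time.time() if now is None else now
        for mid, rec in sorted(self.messages.items()):
            if rec[1] <= now:
                rec[1] = now + timeout
                return mid, rec[0]
        return None
    def delete(self, mid):
        self.messages.pop(mid, None)
```

If the worker crashes before calling `delete`, the message simply reappears after the timeout, which is what makes the pattern fault tolerant.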
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x, and a successful poll sets the interval back to 1.
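A sketch of that polling loop, under the assumption that `queue_get` returns a message or None; for clarity it records the intervals it would sleep instead of calling `time.sleep`.

```python
def poll(queue_get, handle, idle_polls=8, base=1.0, cap=60.0):
    """Poll a queue, doubling the interval after each empty poll
    (truncated at `cap`) and resetting to `base` after a hit.
    Returns the intervals used on empty polls, for illustration;
    real code would time.sleep(interval) at each empty poll."""
    interval, used = base, []
    for _ in range(idle_polls):
        msg = queue_get()
        if msg is None:
            used.append(interval)
            interval = min(cap, interval * 2)   # empty poll: back off 2x
        else:
            handle(msg)
            interval = base                     # success: reset to base
    return used
```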
Removing Poison Messages
[Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue with a 30-second visibility timeout]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: DeleteMessage(Q, msg 1) – the poison message is removed
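The dequeue-count check in the sequence above can be sketched as follows. The queue and message classes are toy stand-ins (the fake queue re-delivers an undeleted message, as a visibility timeout would); the threshold of 2 mirrors the walkthrough.

```python
class Msg:
    def __init__(self, text):
        self.text, self.dequeue_count = text, 0

class FakeQueue:
    """Toy stand-in: get() re-delivers the same message (as after a
    visibility timeout) and bumps its dequeue count."""
    def __init__(self, msgs):
        self.msgs = list(msgs)
    def get(self):
        if not self.msgs:
            return None
        m = self.msgs[0]
        m.dequeue_count += 1
        return m
    def delete(self, m):
        self.msgs.remove(m)

def process_one(queue, dead_letter, handler, max_dequeue=2):
    """Pop one message; if it has been dequeued too many times it is
    poison, so divert it to a dead-letter store and delete it."""
    msg = queue.get()
    if msg is None:
        return
    if msg.dequeue_count > max_dequeue:
        dead_letter.append(msg.text)   # poison: take it out of the loop
        queue.delete(msg)
        return
    handler(msg.text)                  # may crash before the delete
    queue.delete(msg)
```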
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
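The blob-reference recommendation for oversized messages is the work ticket pattern in miniature. A sketch with hypothetical `queue_put`/`blob_put` callables standing in for the storage operations:

```python
import uuid

def enqueue_large(queue_put, blob_put, payload, limit=8 * 1024):
    """Work-ticket pattern: payloads over the queue's size limit go to
    blob storage, and only a small reference travels in the message."""
    if len(payload) <= limit:
        queue_put({'inline': payload})
    else:
        key = str(uuid.uuid4())          # blob name for the payload
        blob_put(key, payload)
        queue_put({'blob_ref': key})     # ticket pointing at the data
```

The consumer reverses the steps: fetch the blob named in the ticket, process it, then delete both message and blob (orphaned blobs from crashed consumers need periodic garbage collection).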
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each of which leaves its CPU mostly idle
• Balance using up the CPU against keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
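The slide names the .NET 4 Task Parallel Library; as a cross-language illustration of the same data-parallelism idea, here is a sketch using Python's `concurrent.futures` pool (an analogy, not the TPL itself):

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_map(func, items, workers=4):
    """Data parallelism: apply `func` to every item using a pool of
    worker threads, one task per item, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))
```

The pattern is the same in the TPL (`Parallel.ForEach`, PLINQ): partition the data, fan the work across cores, and let the runtime schedule the threads.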
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile – e.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
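The gzip trade-off is easy to see in a few lines using Python's standard `gzip` module (shown here as an illustration of the principle, not as the role's actual serving code):

```python
import gzip

def compress_response(body: bytes) -> bytes:
    """Gzip a response body before it leaves the role instance:
    a little CPU traded for smaller storage and bandwidth bills."""
    return gzip.compress(body)

# Repetitive markup, like most HTML, compresses very well.
page = b"<html>" + b"hello azure " * 500 + b"</html>"
small = compress_response(page)
```

In a web role the same effect usually comes for free by enabling IIS dynamic/static compression rather than compressing by hand.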
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially; GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST) – needs special result reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern: split the input sequences, query partitions in parallel, merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations: batch job management, and task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern
Leverage the multiple cores of one instance
• The "-a" argument of NCBI-BLAST: 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partitions cause load imbalance; small partitions cause unnecessary overheads (NCBI-BLAST startup, data transfer)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation; too large: an unnecessarily long wait in case of instance failure
[Task-flow diagram: a splitting task fans out into parallel BLAST tasks, whose outputs feed a merging task]
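The split/join task-flow can be sketched as follows. This is a schematic, not AzureBLAST itself: `run_blast` is a hypothetical stand-in for invoking NCBI-BLAST on one partition, and the partition size is the tunable granularity discussed above.

```python
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size=100):
    """Splitting task: cut the input query sequences into partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def run_blast(partition):
    # Hypothetical stand-in for the BLAST task; the real worker would
    # shell out to NCBI-BLAST on the partition and collect its hits.
    return ["hit:" + seq for seq in partition]

def blast_all(sequences, workers=4):
    """Split, query the partitions in parallel, then merge (join)."""
    parts = split(sequences)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(run_blast, parts))
    return [hit for part in partial for hit in part]   # merging task
```

In AzureBLAST the pool is replaced by worker role instances pulling partitions from a queue, but the split/process/merge shape is the same.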
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler, scaling engine, and global dispatch queue; Worker Roles execute the splitting, BLAST, and merging tasks; a database updating role refreshes the NCBI databases; an Azure Table holds the job registry, and Azure Blobs hold the BLAST databases, temporary data, etc.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• An accepted job is stored in the job registry table – fault tolerance by avoiding in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
• Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divide the 10 million sequences into multiple segments; each segment is submitted to one deployment as one job for execution, and consists of smaller partitions
• When load imbalances, redistribute the load manually
[Diagram: per-deployment VM allocation of 50–62 extra-large instances]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
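Spotting the anomaly above is a small log-mining exercise: a task that logged "Executing" but never "done" marks a worker that failed or was rebooted mid-task. A sketch, assuming the log format shown on the slide:

```python
import re

LOG = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def unfinished_tasks(log_text):
    """Return task ids that started but never logged completion."""
    started = set(re.findall(r"Executing the task (\d+)", log_text))
    done = set(re.findall(r"Execution of task (\d+) is done", log_text))
    return sorted(started - done)
```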
Surviving System Upgrades
North Europe datacenter: a total of 34,256 tasks processed
All 62 compute nodes lost tasks and then came back in groups – this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed, then the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
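The Penman-Monteith formula above translates directly into code. The default values for γ and λv below are assumptions for illustration (γ from the slide; λv a typical latent heat of vaporization for water); the caller must supply inputs in consistent units.

```python
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET:
    ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    Defaults: gamma ~ 66 Pa/K (psychrometric constant), lambda_v ~ 2260 J/g
    (assumed value for water); units must match the data sources."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / \
           ((delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

The per-pixel arithmetic is trivial; the hard part, as the slide notes, is producing the conductivity and radiation inputs across a whole catchment.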
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests through the AzureMODIS service web role portal; a request queue and download queue feed the data collection stage, which pulls from the source imagery download sites; the reprojection queue feeds the reprojection stage; reduction 1 and reduction 2 queues feed the derivation and analysis reduction stages, which read source metadata and produce science results for scientific results download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue the tasks and read <Input> Data Storage]
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request is queued to the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Roles) process tasks against Swath Source Data Storage and Reprojection Data Storage]
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Stage                | Data                   | Compute                          | Cost
Data collection      | 400-500 GB, 60K files  | 10 MB/sec, 11 hours, <10 workers | $50 upload + $450 storage
Reprojection         | 400 GB, 45K files      | 3500 hours, 20-100 workers       | $420 CPU + $60 download
Derivation reduction | 5-7 GB, 55K files      | 1800 hours, 20-100 workers       | $216 CPU + $1 download + $6 storage
Analysis reduction   | <10 GB, ~1K files      | 1800 hours, 20-100 workers       | $216 CPU + $2 download + $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Replication Guarantee
bull All Azure Storage data exists in three replicasbull Replicas are created as neededbull A write operation is not complete until it has
written to all three replicasbull Reads are only load balanced to replicas in
syncServer 1 Server 2 Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability TargetsStorage Account
bull Capacity ndash Up to 100 TBsbull Transactions ndash Up to a few thousand requests per secondbull Bandwidth ndash Up to a few hundred megabytes per second
Single QueueTable Partition
bull Up to 500 transactions per second
To go above these numbers partition between multiple storage accounts and partitions
When limit is hit app will see lsquo503 server busyrsquo applications should implement exponential backoff
Single Blob Partition
bull Throughput up to 60 MBs
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
hellip hellip hellip hellip
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
Partitions and Partition Ranges
Server BTable = Movies[Comedy - Max]
Server ATable = Movies[Min - Comedy)
Server ATable = Movies
[Min - Max]
Key Selection Things to Consider
bullDistribute load as much as possiblebullHot partitions can be load balancedbullPartitionKey is critical for scalability
See httpwwwmicrosoftpdccom2009SVC09 and httpazurescopecloudappnet for more information
bull Avoid frequent large scansbull Parallelize queriesbull Point queries are most efficient
bullTransactions across a single partitionbullTransaction semantics amp Reduce round trips
Scalability
Query Efficiency amp Speed
Entity group transactions
Expect Continuation Tokens ndash Seriously
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 5 seconds to execute the query
Tables Recapbull Efficient for frequently used queriesbull Supports batch transactionsbull Distributes load
Select PartitionKey and RowKey that help scale
Avoid ldquoAppend onlyrdquo patterns
Always Handlecontinuation tokens
ldquoORrdquo predicates are not optimized
Implement back-offstrategy for retries
bull Distribute by using a hash etc as prefix
bull Expect continuation tokens for range queries
bull Execute the queries that form the ldquoORrdquo predicates as separate queries
bull Server busybull Load balance partitions to meet traffic needsbull Load on single partition has exceeded the limits
WCF Data Services
bull Use a new context for each logical operationbull AddObjectAttachTo can throw exception if entity is already being tracked
bull Point query throws an exception if resource does not exist Use IgnoreResourceNotFoundException
QueuesTheir Unique Role in Building Reliable Scalable Applicationsbull Want roles that work closely together but are not
bound togetherbull Tight coupling leads to brittlenessbull This can aid in scaling and performance
bull A queue can hold an unlimited number of messagesbull Messages must be serializable as XMLbull Limited to 8KB in sizebull Commonly use the work ticket pattern
bull Why not simply use a table
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
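The pop receipt returned by GET is what authorizes the later DELETE. A short standard-library sketch that pulls the MessageId and PopReceipt out of a response shaped like the one above (body simplified to the elements we need) and builds the matching DELETE URL:

```python
import xml.etree.ElementTree as ET

RESPONSE = """<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>"""

def delete_url(account, queue, xml_body):
    # Parse the GET response and build the DELETE request URL.
    msg = ET.fromstring(xml_body).find("QueueMessage")
    message_id = msg.findtext("MessageId")
    pop_receipt = msg.findtext("PopReceipt")
    return ("http://{0}.queue.core.windows.net/{1}/messages/{2}"
            "?popreceipt={3}").format(account, queue, message_id, pop_receipt)
```

A real client would URL-encode the pop receipt; it is omitted here because this receipt contains no reserved characters.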
Truncated Exponential Back Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
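The policy above is easy to state in code. A sketch in which the sleeps are recorded rather than slept and the queue is scripted, so the loop terminates and can be checked:

```python
def poll_loop(get_message, handle, polls, max_interval=60):
    """Truncated exponential back-off polling.

    Empty poll: double the sleep interval (truncated at max_interval).
    Successful poll: reset the interval back to 1 second.
    `polls` bounds the iterations; in real code this loop runs forever
    and actually sleeps for `interval` seconds on each empty poll.
    """
    interval, sleeps = 1, []
    for _ in range(polls):
        msg = get_message()
        if msg is None:
            sleeps.append(interval)                  # would be time.sleep(interval)
            interval = min(interval * 2, max_interval)
        else:
            handle(msg)
            interval = 1                             # reset on success
    return sleeps

# Queue that is empty for 4 polls, yields one message, then is empty again.
script = iter([None, None, None, None, "work", None])
handled = []
sleeps = poll_loop(lambda: next(script), handled.append, polls=6)
```

The recorded sleeps grow 1, 2, 4, 8 while the queue is empty, then reset to 1 after the message is handled.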
Removing Poison Messages
Producers (P1, P2) → queue → consumers (C1, C2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
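Steps 12–13 are the poison-message guard: a message whose dequeue count exceeds a threshold is deleted (or diverted for inspection) instead of crashing yet another consumer. A sketch with a simulated queue that tracks the dequeue count in-process, the way real Azure queue messages carry a DequeueCount for exactly this purpose:

```python
class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def process(queue, handler, poison, max_dequeue=3):
    """Drain a list of Messages; divert any message dequeued too often.

    Every get increments dequeue_count. Once the count exceeds the
    threshold, the message is deleted into the poison list instead of
    being handed to the (possibly crashing) handler yet again.
    """
    while queue:
        msg = queue[0]
        msg.dequeue_count += 1
        if msg.dequeue_count > max_dequeue:
            poison.append(queue.pop(0))   # delete the poison message
            continue
        try:
            handler(msg)
            queue.pop(0)                  # DeleteMessage on success
        except Exception:
            pass                          # crash: message stays and reappears

poisoned, handled = [], []

def handler(msg):
    if msg.body == "bad":
        raise RuntimeError("consumer crash")
    handled.append(msg.body)

q = [Message("good-1"), Message("bad"), Message("good-2")]
process(q, handler, poisoned)
```

The "bad" message crashes the handler three times, trips the threshold on its fourth dequeue, and is removed without blocking the messages behind it.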
Queues Recap
• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB – batch messages, or use a blob to store the message data with a reference in the message; garbage-collect orphaned blobs
• Use message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using much CPU
• Balance between using up CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
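The Task Parallel Library is .NET-specific, but the data-parallel shape it encourages (the same operation mapped over every item, scheduled across a pool sized to the core count you measured) looks the same in any language. A sketch using Python's standard thread pool; `score` is a made-up per-item task:

```python
from concurrent.futures import ThreadPoolExecutor

def score(seq):
    """Stand-in for a per-item unit of work (e.g. one alignment)."""
    return sum(ord(c) for c in seq)

def parallel_map(items, workers=4):
    # Data parallelism: identical work on independent items, so the
    # pool can schedule them freely; map preserves input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(score, items))

results = parallel_map(["ACGT", "TTTT"])
```

For CPU-bound Python work a process pool (`ProcessPoolExecutor`) is the closer analogue, since threads share one interpreter lock; the calling pattern is identical.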
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory intensive, CPU intensive, network I/O intensive, storage I/O intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity, and the cost of having idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing – they help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
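The bandwidth win from point 1 is easy to demonstrate with the standard library. Exact sizes depend on the zlib version, so the sketch only checks that compression wins heavily on repetitive markup and that the round trip is lossless:

```python
import gzip

# Repetitive HTML, the typical best case for gzip on web output.
html = (b"<html><body>"
        + b"<p>The quick brown fox jumps over the lazy dog.</p>" * 200
        + b"</body></html>")

compressed = gzip.compress(html)   # what you'd serve with Content-Encoding: gzip
ratio = len(compressed) / len(html)

assert gzip.decompress(compressed) == html   # browsers decompress on the fly
```

On markup like this the compressed body is a small fraction of the original, which is bandwidth you do not pay for and the user does not wait for.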
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST) – needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
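Query segmentation, the first of the two strategies above, is a plain split/scatter/gather over the input sequences. A toy sketch of the shape; the `blast` function is a stand-in for invoking NCBI-BLAST on one partition, not a real alignment:

```python
def split(sequences, partition_size):
    """Split the input query sequences into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast(partition):
    """Stand-in for running NCBI-BLAST on one partition of queries."""
    return [(seq, len(seq)) for seq in partition]   # fake per-query 'hits'

def run(sequences, partition_size=100):
    # Scatter: each partition could go to a separate worker role...
    partial = [blast(p) for p in split(sequences, partition_size)]
    # ...gather: merging is trivial because partitions are independent.
    return [hit for part in partial for hit in part]

hits = run(["ACGT", "AC", "ACGTACGT"], partition_size=2)
```

Because each query is scored independently against the database, the merge step is a simple concatenation, which is exactly why this pattern is called "pleasingly parallel".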
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: splitting task → BLAST tasks (in parallel) → merging task
Leverage multi-core on one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
• Web Role: web portal and web service for job registration
• Job Management Role: job scheduler and scaling engine
• Worker roles, fed from a global dispatch queue: splitting task → BLAST tasks → merging task
• Database updating role
• Azure Table: job registry
• Azure Blob: NCBI databases (BLAST databases, temporary data, etc.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
  • Fault tolerance: avoid in-memory states
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g. the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
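Pairing "Executing" lines with their "done" lines mechanically surfaces the lost tasks. A sketch over simplified records (timestamps shortened; the message format mirrors the records above):

```python
import re

LOG = """\
6:14 RD00155D3611B0 Executing the task 251523
6:25 RD00155D3611B0 Execution of task 251523 is done
8:22 RD00155D3611B0 Executing the task 251774
9:50 RD00155D3611B0 Executing the task 251895
11:12 RD00155D3611B0 Execution of task 251895 is done"""

def unfinished_tasks(log):
    """Return ids of tasks that were started but never logged as done."""
    started, done = set(), set()
    for line in log.splitlines():
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return sorted(started - done)

lost = unfinished_tasks(LOG)
```

In the sample, task 251774 is started but never finishes, which is exactly the signature of the instance failures and upgrades discussed on the next slides.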
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
• All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~30 mins per group, ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):
ET = (Δ Rn + ρa cp (δq) ga) / ((Δ + γ (1 + ga/gs)) λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
(Pipeline: source imagery download sites → download queue → data collection stage → reprojection queue → reprojection stage → reduction 1 queue → derivation reduction stage → reduction 2 queue → analysis reduction stage → science results; scientists submit requests and download scientific results through the AzureMODIS Service Web Role Portal and its request queue; source metadata is kept alongside)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
(Flow: <PipelineStage> Request → MODISAzure Service (Web Role) → persist <PipelineStage>JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → parse & persist <PipelineStage>TaskStatus → dispatch to <PipelineStage> Task Queue)
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• GenericWorker (Worker Role) dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Flow: Service Monitor parses & persists <PipelineStage>TaskStatus → dispatch to <PipelineStage> Task Queue → GenericWorker reads <Input> Data Storage)
Example Pipeline Stage: Reprojection Service
(Flow: Reprojection Request → Service Monitor (Worker Role) → persist ReprojectionJobStatus → Job Queue → parse & persist ReprojectionTaskStatus → dispatch → Task Queue → GenericWorker (Worker Role) → Reprojection Data Storage)
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
• Swath source data storage holds the input swaths
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per stage (via the AzureMODIS Service Web Role Portal):
Data collection stage | 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers | $50 upload, $450 storage
Reprojection stage | 400 GB, 45K files, 3500 hours, 20-100 workers | $420 CPU, $60 download
Derivation reduction stage | 5-7 GB, 55K files, 1800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis reduction stage | <10 GB, ~1K files, 1800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds as amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope (2)
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
BLOB Guidance
• Manage connection strings/keys in .cscfg
• Do not share keys; wrap with a service
• Strategy for accounts and containers
• You can assign a custom domain to your storage account
• There is no method to detect container existence; call FetchAttributes() and detect the error if it doesn't exist
Table Structure
Account → Table → Entity
Account: MovieData
• Table "Movies": Star Wars, Star Trek, Fan Boys
• Table "Customers": Brian H. Prince, Jason Argonaut, Bill Gates
Tables store entities. Entity schema can vary in the same table.
Windows Azure Tables
• Provides structured storage
• Massively scalable tables
  • Billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST – with any platform or language

Is not relational. Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
Partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
"Server Busy"
• Use exponential backoff on "Server Busy"
• The system load balances to meet your traffic needs
• Single-partition limits have been reached
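One way to keep a partition from going hot (the "distribute by using a hash as prefix" tip from the recap slides) is to derive the PartitionKey from a stable hash of a natural key, so sequential inserts spread across partition ranges. A sketch; the two-digit bucket prefix is an illustrative convention, not an Azure requirement:

```python
import hashlib

def partition_key(natural_key, buckets=16):
    """Prefix a natural key with a stable hash bucket.

    Sequential keys ('order-0001', 'order-0002', ...) would otherwise
    land in one hot partition; the prefix spreads them over `buckets`
    ranges. md5 (rather than Python's hash()) keeps the bucket stable
    across processes and restarts.
    """
    digest = hashlib.md5(natural_key.encode()).hexdigest()
    bucket = int(digest, 16) % buckets
    return "{0:02d}-{1}".format(bucket, natural_key)

keys = [partition_key("order-%04d" % i) for i in range(1000)]
prefixes = {k.split("-")[0] for k in keys}
```

The trade-off is that range queries over the natural key now require one query per bucket (and continuation-token handling for each), which is the usual price of eliminating an append-only hot spot.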
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message 1
jobs | Message 2
workflow | Message 1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
(Server 1, Server 2, Server 3 each hold the partitions P1, P2, … Pn)
Scalability Targets
Storage account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single queue/table partition
• Up to 500 transactions per second
Single blob partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When the limit is hit, the app will see a '503 Server Busy'; applications should implement exponential backoff
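The recommended reaction to '503 Server Busy' can be wrapped once and reused for every storage call. A sketch in which the retry delays are collected rather than slept, so the behavior is checkable; the doubling schedule is illustrative:

```python
def with_backoff(operation, max_attempts=5, base_delay=1.0):
    """Retry `operation` on 'server busy', doubling the delay each time.

    `operation` raises RuntimeError while the partition is over its
    limits. Returns (result, delays_used); in real code each delay
    would be passed to time.sleep before the next attempt.
    """
    delays = []
    for attempt in range(max_attempts):
        try:
            return operation(), delays
        except RuntimeError:
            if attempt == max_attempts - 1:
                raise                     # out of attempts: surface the error
            delays.append(base_delay * (2 ** attempt))   # 1, 2, 4, 8, ...

# Simulated service that is busy twice, then succeeds.
calls = {"n": 0}
def flaky_put():
    calls["n"] += 1
    if calls["n"] <= 2:
        raise RuntimeError("503 Server Busy")
    return "ok"

result, delays = with_backoff(flaky_put)
```

Adding a random jitter to each delay is a common refinement, since it keeps many throttled clients from retrying in lockstep.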
Partitions and Partition Ranges
Server A – Table = Movies [Min – Max]:

PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After load balancing splits the partition range:

Server A – Table = Movies [Min – Comedy):
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B – Table = Movies [Comedy – Max]:
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
• Distribute load as much as possible – hot partitions can be load balanced; PartitionKey is critical for scalability
• Avoid frequent large scans; parallelize queries; point queries are most efficient
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
bullTransactions across a single partitionbullTransaction semantics amp Reduce round trips
Scalability
Query Efficiency amp Speed
Entity group transactions
Expect Continuation Tokens ndash Seriously
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 5 seconds to execute the query
Tables Recapbull Efficient for frequently used queriesbull Supports batch transactionsbull Distributes load
Select PartitionKey and RowKey that help scale
Avoid ldquoAppend onlyrdquo patterns
Always Handlecontinuation tokens
ldquoORrdquo predicates are not optimized
Implement back-offstrategy for retries
bull Distribute by using a hash etc as prefix
bull Expect continuation tokens for range queries
bull Execute the queries that form the ldquoORrdquo predicates as separate queries
bull Server busybull Load balance partitions to meet traffic needsbull Load on single partition has exceeded the limits
WCF Data Services
bull Use a new context for each logical operationbull AddObjectAttachTo can throw exception if entity is already being tracked
bull Point query throws an exception if resource does not exist Use IgnoreResourceNotFoundException
QueuesTheir Unique Role in Building Reliable Scalable Applicationsbull Want roles that work closely together but are not
bound togetherbull Tight coupling leads to brittlenessbull This can aid in scaling and performance
bull A queue can hold an unlimited number of messagesbull Messages must be serializable as XMLbull Limited to 8KB in sizebull Commonly use the work ticket pattern
bull Why not simply use a table
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST httpmyaccountqueuecorewindowsnetmyqueuemessages
HTTP11 200 OK Transfer-Encoding chunked Content-Type applicationxml Date Tue 09 Dec 2008 210430 GMT Server Nephos Queue Service Version 10 Microsoft-HTTPAPI20
ltxml version=10 encoding=utf-8gt ltQueueMessagesListgt ltQueueMessagegt ltMessageIdgt5974b586-0df3-4e2d-ad0c-18e3892bfca2ltMessageIdgt ltInsertionTimegtMon 22 Sep 2008 232920 GMTltInsertionTimegt ltExpirationTimegtMon 29 Sep 2008 232920 GMTltExpirationTimegt ltPopReceiptgtYzQ4Yzg1MDIGM0MDFiZDAwYzEwltPopReceiptgt ltTimeNextVisiblegtTue 23 Sep 2008 052920GMTltTimeNextVisiblegt ltMessageTextgtPHRlc3Q+dGdGVzdD4=ltMessageTextgt ltQueueMessagegt ltQueueMessagesListgt
DELETEhttpmyaccountqueuecorewindowsnetmyqueuemessagesmessageidpopreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach Each empty poll
increases interval by 2x
A successful sets the interval back to 1
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
- 1 role instance == 1 VM running Windows
- 1 role instance != one specific task for your code
- You're paying for the entire VM, so why not use it?
- Common mistake: splitting code into multiple roles, each not using much CPU
- Balance using up CPU vs. having free capacity in times of need
- There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
- Spin up additional processes, each with a specific task or as a unit of concurrency
  - May not be ideal if the number of active processes exceeds the number of cores
- Use multithreading aggressively
  - In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  - In .NET 4, use the Task Parallel Library
    - Data parallelism
    - Task parallelism
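The same data- vs. task-parallelism split the TPL offers can be illustrated outside .NET; here is a minimal Python sketch using a thread pool (the function names and data are made up for the example).

```python
from concurrent.futures import ThreadPoolExecutor

def normalize(record):
    """One element-wise operation, applied uniformly (data parallelism)."""
    return record.strip().lower()

def count_words(text):
    return len(text.split())

records = ["  Alpha", "BETA  ", " Gamma "]

with ThreadPoolExecutor(max_workers=4) as pool:
    # Data parallelism: the same operation over every element of a dataset.
    cleaned = list(pool.map(normalize, records))

    # Task parallelism: different, independent operations run concurrently.
    f1 = pool.submit(count_words, "the quick brown fox")
    f2 = pool.submit(sorted, cleaned)
    word_count, ordered = f1.result(), f2.result()

print(cleaned)       # ['alpha', 'beta', 'gamma']
print(word_count)    # 4
```

The distinction matters for structuring worker-role code: data parallelism partitions one large input across cores, while task parallelism keeps cores busy with unrelated work items.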
Finding Good Code Neighbors
- Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
- Find code that is intensive with different resources to live together
- Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
- Monitor your application and make sure you're scaled appropriately (not over-scaled)
- Spinning VMs up and down automatically is good at large scale
- Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
- Being too aggressive in spinning down VMs can result in a poor user experience
- Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
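One way to act on the "use message count to scale" advice while respecting the warning about aggressive spin-down is a decision rule with asymmetric behavior: scale up to the backlog immediately, scale down one instance at a time. All thresholds below are assumptions for illustration.

```python
def target_instance_count(queue_length, current, msgs_per_instance=100,
                          min_instances=2, max_instances=20):
    """Pick a worker count from the queue backlog, scaling down gently.

    Thresholds are illustrative. VMs take minutes to boot, so scale up
    eagerly; shed at most one instance per decision to protect the
    user experience when load returns."""
    desired = max(min_instances, -(-queue_length // msgs_per_instance))  # ceil div
    desired = min(desired, max_instances)
    if desired < current:
        return current - 1   # gentle spin-down
    return desired

# A backlog of 750 messages with 3 workers: jump straight to 8 workers.
up = target_instance_count(750, current=3)
# An empty queue with 8 workers: drop only one instance per decision pass.
down = target_instance_count(0, current=8)
```

Running this rule periodically (e.g., once a minute) converges down slowly and up quickly, trading a little idle cost for headroom.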
Storage Costs
- Understand an application's storage profile and how storage billing works
- Make service choices based on your app profile
  - E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  - Service choice can make a big cost difference based on your app profile
- Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Saving bandwidth costs often leads to savings in other places:
- Sending fewer things over the wire often means getting fewer things from storage
- Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
- All modern browsers can decompress on the fly
- Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
- Use Portable Network Graphics (PNGs)
- Crush your PNGs
- Strip needless metadata
- Make all PNGs palette PNGs
(Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
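The payoff of point 1 is easy to demonstrate; the sketch below gzips a repetitive HTML payload, with Python's gzip module standing in for whatever compression layer the web server uses.

```python
import gzip

# Repetitive markup, typical of generated HTML, compresses extremely well.
html = b"<html><body>" + b"<p>Hello, cloud!</p>" * 200 + b"</body></html>"
compressed = gzip.compress(html)

ratio = len(compressed) / len(html)
print(f"{len(html)} -> {len(compressed)} bytes (ratio {ratio:.2%})")
small_enough = len(compressed) < len(html) // 10   # >90% saved on this payload
```

Every byte removed here is saved twice: once on the storage read and once on the wire to the browser.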
Best Practices Summary
- Doing 'less' is the key to saving costs
- Measure everything
- Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
- The most important software in bioinformatics
- Identifies similarity between bio-sequences
Computationally intensive:
- Large number of pairwise alignment operations
- A BLAST run can take 700-1000 CPU hours
- Sequence databases are growing exponentially; GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
- Segment the input; segment processing (querying) is pleasingly parallel
- Segment the database (e.g., mpiBLAST); needs special result-reduction processing
Large volume of data:
- A normal BLAST database can be as large as 10 GB
- 100 nodes means the peak storage bandwidth could reach 1 TB
- The output of BLAST is usually 10-100x larger than the input
AzureBLAST
- Parallel BLAST engine on Azure
- Query-segmentation, data-parallel pattern:
  - Split the input sequences
  - Query partitions in parallel
  - Merge results together when done
- Follows the general suggested application model: Web Role + Queue + Worker
- With special considerations:
  - Batch job management
  - Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans the input out to parallel BLAST tasks, and a merging task joins their results.
Leverage the multiple cores of one instance:
- Argument "-a" of NCBI-BLAST
- Set to 1, 2, 4, or 8 for the small, medium, large, and extra-large instance sizes
Task granularity:
- Too large a partition: load imbalance
- Too small a partition: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
- Best practice: do test runs to profile, and set the size to mitigate the overhead
Value of the visibilityTimeout for each BLAST task:
- Essentially an estimate of the task run time
- Too small: repeated computation
- Too large: an unnecessarily long wait in case of instance failure
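The split/join task flow reduces to a few lines when sketched locally. The blast_task below is a stub for one worker's NCBI-BLAST run; in AzureBLAST the partitions travel through the dispatch queue to separate worker instances rather than a local thread pool.

```python
from concurrent.futures import ThreadPoolExecutor

PARTITION_SIZE = 100   # sequences per partition, the micro-benchmark sweet spot

def split(sequences, size=PARTITION_SIZE):
    """Splitting task: fan the input sequences out into fixed-size partitions."""
    return [sequences[i:i + size] for i in range(0, len(sequences), size)]

def blast_task(partition):
    """Stub for one BLAST task: align every sequence in its partition."""
    return [f"{seq}:hit" for seq in partition]

def merge(partials):
    """Merging task: join the per-partition outputs, preserving order."""
    return [hit for partial in partials for hit in partial]

sequences = [f"seq{i}" for i in range(250)]
partitions = split(sequences)                  # 3 partitions: 100 + 100 + 50
with ThreadPoolExecutor() as pool:             # stand-in for worker instances
    partials = list(pool.map(blast_task, partitions))
hits = merge(partials)
```

The partition size is the tuning knob the slide describes: it trades load balance against per-task overhead.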
Micro-Benchmarks Inform Design
Task size vs. performance:
- Benefit of the warm-cache effect
- 100 sequences per partition is the best choice
Instance size vs. performance:
- Super-linear speedup with larger worker instances
- Primarily due to the memory capacity
Task size / instance size vs. cost:
- The extra-large instance generated the best and the most economical throughput
- Fully utilizes the resources
AzureBLAST
(Diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler and scaling engine, persisting the job registry in an Azure Table; tasks flow through a global dispatch queue to the worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a database-updating role keeps the databases current. Each job follows the split/join task flow: splitting task, parallel BLAST tasks, merging task.)
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance:
- Submit jobs
- Track a job's status and logs
Authentication/authorization based on Live ID.
The accepted job is stored into the job registry table:
- Fault tolerance: avoid in-memory states
(Diagram: the job portal fronts the web service for job registration, the job scheduler, the scaling engine, and the job registry.)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (SAGE); Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
- Against all NCBI non-redundant proteins: completed in 30 min
- Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
- Discover the interrelationships of known protein sequences
"All against all" query:
- The database is also the input query
- The protein database is large (4.2 GB)
- In total, 9,865,668 sequences to be queried
- Theoretically, 100 billion sequence comparisons
Performance estimation:
- Based on sampling runs on one extra-large Azure instance
- Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
Our Approach
- Allocated a total of ~4,000 instances
  - 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
- 8 deployments of AzureBLAST
  - Each deployment has its own co-located storage service
- Divided the 10 million sequences into multiple segments
  - Each segment is submitted to one deployment as one job for execution
  - Each segment consists of smaller partitions
- When load imbalances appear, redistribute the load manually
(Deployment sizes: 50 or 62 instances each.)
End Result
- Total size of the output result is ~230 GB
- The number of total hits is 1,764,579,487
- Started on March 25th; the last task completed on April 8th (10 days of compute)
- But based on our estimates, real working instance time should be 6-8 days
- Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should be:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
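Spotting the "something is wrong" case can be automated: collect the task ids that logged an "Executing" line but never a matching "is done" line. The snippet below uses a small inline sample in the slide's log format; the exact parsing is an assumption about that format.

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def incomplete_tasks(log_text):
    """Return task ids that logged 'Executing' but never 'is done'."""
    started, finished = set(), set()
    for line in log_text.splitlines():
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)

print(incomplete_tasks(LOG))   # ['251774'] -- likely lost to an instance failure
```

Running this over the full logs is how failures like the update-domain and fault-domain events below become visible.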
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed.
- All 62 compute nodes lost tasks and then came back in a group: this is an update domain
- ~30 mins; ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
- 35 nodes experienced blob-writing failures at the same time
- A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration (evaporation through plant membranes) by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

- ET = water volume evapotranspired (m3 s-1 m-2)
- Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
- λv = latent heat of vaporization (J/g)
- Rn = net radiation (W m-2)
- cp = specific heat capacity of air (J kg-1 K-1)
- ρa = dry air density (kg m-3)
- δq = vapor pressure deficit (Pa)
- ga = conductivity of air (inverse of ra) (m s-1)
- gs = conductivity of plant stoma air (inverse of rs) (m s-1)
- γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
- Lots of inputs; big data reduction
- Some of the inputs are not so simple
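The Penman-Monteith formula transcribes directly into code. The input magnitudes below are illustrative assumptions, not field data; the check only exercises the qualitative behavior that more net radiation drives more ET.

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2260.0):
    """ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)

    Units follow the slide's definitions; gamma ~ 66 Pa/K, and the latent
    heat of vaporization of water lambda_v ~ 2260 J/g."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative inputs (assumed magnitudes, not measured values).
base = dict(delta=145.0, rho_a=1.2, c_p=1005.0, dq=800.0, g_a=0.02, g_s=0.01)
et_low = penman_monteith(r_n=200.0, **base)
et_high = penman_monteith(r_n=400.0, **base)
```

In MODISAzure, the hard part is not this arithmetic but estimating ga and gs (and the other inputs) per pixel across a catchment, which is what the imagery pipeline below exists to do.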
ET Synthesizes Imagery, Sensors, Models, and Field Data
- NASA MODIS imagery source archives: 5 TB (600K files)
- FLUXNET curated sensor dataset: 30 GB (960 files)
- FLUXNET curated field dataset: 2 KB (1 file)
- NCEP/NCAR: ~100 MB (4K files)
- Vegetative clumping: ~5 MB (1 file)
- Climate classification: ~1 MB (1 file)
- 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
- Downloads requested input tiles from NASA FTP sites
- Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
- Converts source tile(s) to intermediate-result sinusoidal tiles
- Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
- First stage visible to scientists
- Computes ET in our initial use
Analysis reduction stage:
- Optional second stage visible to scientists
- Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: scientists submit requests through the AzureMODIS Service web role portal; a request queue feeds the download queue for the data collection stage, which pulls imagery from source download sites and records source metadata; the reprojection queue, reduction 1 queue, and reduction 2 queue drive the reprojection, derivation reduction, and analysis reduction stages; science results are available for scientific results download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
- The ModisAzure Service is the Web Role front door
  - Receives all user requests
  - Queues each request to the appropriate Download, Reprojection, or Reduction job queue
- The Service Monitor is a dedicated Worker Role
  - Parses all job requests into tasks – recoverable units of work
  - Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
- All work is actually done by a GenericWorker (Worker Role)
  - Dequeues tasks created by the Service Monitor
  - Retries failed tasks 3 times
  - Maintains all task status
(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus; GenericWorker instances pull from the <PipelineStage> task queue and read <Input> data storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the job queue; the Service Monitor (worker role) persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus, dispatching work to the task queue consumed by GenericWorker instances, which read swath source data storage and write reprojection data.)
- Each job entity specifies a single reprojection job request
- Each task entity specifies a single reprojection task (i.e., a single tile)
- Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
- Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
- Computational costs driven by data scale and the need to run reduction multiple times
- Storage costs driven by data scale and the 6-month project duration
- Small with respect to the people costs, even at graduate-student rates

Approximate per-stage figures (from the pipeline diagram):
- Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
- Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
- Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
- Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
- Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
- Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
- Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
- Clouds provide valuable fault tolerance and scalability abstractions
- Clouds act as an amplifier for familiar client tools and on-premise compute
- Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
- Getting-started steps for developers
- Available research services
- Use cases on Azure for research
- Event announcements
- Detailed tutorials
- Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
- Simple benchmarks illustrating basic performance for compute and storage services
- Benchmarks for reference algorithms
- Best-practice tips
- Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: "Channel 9 Windows Azure"
Bing: "Windows Azure Platform Training Kit" – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700 ~ 1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
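A minimal sketch of the query-segmentation idea described above (function names are illustrative, not AzureBLAST's actual API): split the input sequences into partitions that workers process independently, then merge the per-partition results.

```python
# Sketch of the query-segmentation data-parallel pattern.
def split_queries(sequences, partition_size=100):
    """Yield fixed-size partitions of the input; each becomes one task."""
    for i in range(0, len(sequences), partition_size):
        yield sequences[i:i + partition_size]

def merge_results(per_partition_results):
    """Concatenate per-partition hit lists back into one result set."""
    merged = []
    for hits in per_partition_results:
        merged.extend(hits)
    return merged

# 250 inputs with partition size 100 -> 3 independent tasks (100, 100, 50).
parts = list(split_queries(list(range(250)), partition_size=100))
```

Because each partition is queried against the same database, no cross-partition communication is needed until the final merge, which is what makes this "pleasingly parallel".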
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multi-core capacity of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small → repeated computation
• Too large → unnecessarily long wait in case of instance failure
(Task-flow diagram: a splitting task fans out to many BLAST tasks running in parallel, followed by a merging task.)
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST
(Architecture diagram: a Web Role exposes the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine, dispatching work through a global dispatch queue to pools of Workers; an Azure Table holds the Job Registry; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a Database Updating Role keeps the NCBI databases current. The task flow is the same split/join pattern: a splitting task fans out to parallel BLAST tasks, followed by a merging task.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
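A quick arithmetic check of the quoted estimate, and of the cluster size used later in the experiment:

```python
# Sanity-checking the scale claims: the quoted 3,216,731 minutes of
# single-machine time converts to roughly 6.1 years, and 475 extra-large
# VMs at 8 cores each is 3,800 cores (~4,000 instances' worth of cores).
minutes = 3_216_731
years = minutes / (60 * 24 * 365)   # about 6.1

cores = 475 * 8                     # 3800
```
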
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances appear, redistribute the load manually
(Map: instances allocated per deployment across the datacenters: 50, 62, 62, 62, 62, 62, 50, 62.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6~8 days
  • Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
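Detecting the abnormal case can be automated by pairing "Executing" records with "done" records; a sketch (log format simplified for illustration):

```python
import re

# Sketch: find tasks that started but never logged a completion record --
# the failure signature in the abnormal log excerpt above.
def incomplete_tasks(log_lines):
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

log = [
    "RD00155D3611B0 Executing the task 251523",
    "RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins",
    "RD00155D3611B0 Executing the task 251774",   # no completion record
]
```
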
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups: this is an update domain at work (~6 nodes per group, ~30 mins apart).
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed before the job was killed.
35 nodes experienced blob-write failures at the same time.
A reasonable guess: the fault domain mechanism is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)

Penman-Monteith (1964):

  ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
  ET = water volume evapotranspired (m3 s-1 m-2)
  Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
  λv = latent heat of vaporization (J/g)
  Rn = net radiation (W m-2)
  cp = specific heat capacity of air (J kg-1 K-1)
  ρa = dry air density (kg m-3)
  δq = vapor pressure deficit (Pa)
  ga = conductivity of air (inverse of ra) (m s-1)
  gs = conductivity of plant stoma air (inverse of rs) (m s-1)
  γ = psychrometric constant (γ ≈ 66 Pa K-1)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
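The Penman-Monteith formula transcribes directly into code; this is illustrative only, with plausible mid-day sample values rather than calibrated data (λv is given in J/kg here to match the kg-based density and heat capacity):

```python
# Direct transcription of the Penman-Monteith formula:
#   ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2.45e6):
    """ET from net radiation, humidity gradient, and conductivities (SI units)."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Plausible mid-day inputs: Rn = 400 W/m^2, vapor pressure deficit 1000 Pa,
# aerodynamic and stomatal conductivities ~0.02 and ~0.01 m/s.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, c_p=1004.0,
                     dq=1000.0, g_a=0.02, g_s=0.01)
```

More net radiation should mean more evapotranspiration, which is an easy property to spot-check against the formula.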
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Pipeline diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds a Download Queue for the Data Collection Stage, which pulls from source imagery download sites using source metadata; a Reprojection Queue feeds the Reprojection Stage; Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; scientific results are then available for download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, the recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> Request flows to the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
• The Generic Worker (Worker Role):
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
(Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue tasks and read <Input>Data Storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a Reprojection Request flows to the Service Monitor (Worker Role), which persists ReprojectionJobStatus via the Job Queue, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Roles) consume tasks that point into Reprojection Data Storage, the SwathGranuleMeta table, the ScanTimeList table, and Swath Source Data Storage.)
• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Stage                 | Data & files                    | Compute                    | Cost
Data Collection       | 400-500 GB, 60K files, 10 MB/s  | 11 hours, <10 workers      | $50 upload, $450 storage
Reprojection          | 400 GB, 45K files               | 3500 hours, 20-100 workers | $420 CPU, $60 download
Derivation Reduction  | 5-7 GB, 55K files               | 1800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis Reduction    | <10 GB, ~1K files               | 1800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Windows Azure Tables
• Provides structured storage
  • Massively scalable tables: billions of entities (rows) and TBs of data
  • Can use thousands of servers as traffic grows
• Highly available & durable
  • Data is replicated several times
• Familiar and easy-to-use API
  • WCF Data Services and OData
  • .NET classes and LINQ
  • REST, with any platform or language
Is not relational. Cannot:
• Create foreign key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example
All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
  • Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
Partition key is the unit of scale
• A partition can be served by a single server
• The system load balances partitions based on traffic pattern
• Controls entity locality
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Either the system is load balancing to meet your traffic needs, or single-partition limits have been reached
Partition Keys In Each Abstraction
Entities - TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
1                         | Order-1               |              |                     | $35.12
2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2                         | Order-3               |              |                     | $10.00

Blobs - Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image          | annarbor/bighouse.jpg
image          | foxborough/gillette.jpg
video          | annarbor/bighouse.jpg

Messages - Queue Name
• All messages for a single queue belong to the same partition

Queue    | Message
jobs     | Message1
jobs     | Message2
workflow | Message1
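A minimal sketch of the customer/order key scheme in the entity table above, using plain dicts in place of the real storage client; the helper names are illustrative:

```python
# Sketch: choose keys so a customer's record and its orders share a
# PartitionKey, keeping them in one partition for locality and for
# entity-group transactions.
def customer_row(customer_id, name):
    return {"PartitionKey": str(customer_id),
            "RowKey": f"Customer-{name}",
            "Name": name}

def order_row(customer_id, order_id, total):
    return {"PartitionKey": str(customer_id),
            "RowKey": f"Order-{order_id}",
            "OrderTotal": total}

c = customer_row(1, "John Smith")
o = order_row(1, 1, 35.12)
same_partition = c["PartitionKey"] == o["PartitionKey"]   # True
```
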
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync
(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets
Storage Account
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When the limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff.
Partitions and Partition Ranges

Server A - Table = Movies [Min - Max]:

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

After load balancing, the range splits across two servers:

Server A - Table = Movies [Min - Comedy):

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

Server B - Table = Movies [Comedy - Max]:

PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens - Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
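Any of those limits can cut a query short and hand back a continuation token; the loop every query should use can be sketched like this (`query_page` is a stand-in for the real storage client call):

```python
# Sketch: always drain continuation tokens. A query may stop early at the
# 1000-row cap, a partition boundary, or the 5-second budget, returning a
# token for the remainder.
def query_all(query_page):
    """Keep issuing the query until no continuation token is returned."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Fake paged source standing in for the table service: three pages.
pages = {None: ([1, 2], "t1"), "t1": ([3], "t2"), "t2": ([4, 5], None)}
result = query_all(lambda tok: pages[tok])   # [1, 2, 3, 4, 5]
```

Code that ignores the token silently sees only the first page, which is why the slide stresses this so heavily.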
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Tips:
• Select a PartitionKey and RowKey that help scale; avoid "append only" patterns by distributing with a hash etc. as a prefix
• Always handle continuation tokens; expect them for range queries
• "OR" predicates are not optimized; execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries on "Server Busy": either the system is load balancing partitions to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
Queue Terminology

Message Lifecycle
(Diagram: a Web Role puts messages onto the Queue (Msg 1 … Msg 4); Worker Roles call GetMessage with a timeout, process the message, and then call RemoveMessage.)

PutMessage:

  POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:

  HTTP/1.1 200 OK
  Transfer-Encoding: chunked
  Content-Type: application/xml
  Date: Tue, 09 Dec 2008 21:04:30 GMT
  Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

  <?xml version="1.0" encoding="utf-8"?>
  <QueueMessagesList>
    <QueueMessage>
      <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
      <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
      <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
      <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
      <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
      <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
    </QueueMessage>
  </QueueMessagesList>

RemoveMessage:

  DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the interval by 2x; a successful poll sets the interval back to 1.
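The back-off rule can be sketched in a few lines (the cap of 64 is an illustrative choice, hence "truncated"):

```python
# Sketch: truncated exponential back-off polling. Empty polls double the
# sleep interval up to a cap; a successful dequeue resets it to 1.
def next_interval(current, got_message, cap=64):
    if got_message:
        return 1
    return min(current * 2, cap)

interval = 1
history = []
for got in [False, False, False, True, False]:
    interval = next_interval(interval, got)
    history.append(interval)
# history == [2, 4, 8, 1, 2]
```

This keeps idle workers from hammering the queue (and paying for the transactions) while still reacting quickly once messages start arriving.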
Removing Poison Messages
(Producers P1, P2; consumers C1, C2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (continued)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (continued)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
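The dequeue-count safeguard in the final steps can be sketched as follows (the threshold and the dict-based message are illustrative stand-ins for the real queue client):

```python
# Sketch: remove poison messages by enforcing a dequeue-count threshold,
# instead of letting a bad message crash consumers forever.
MAX_DEQUEUE = 2

def handle(message, process, dead_letter):
    """Delete a message that keeps reappearing; otherwise process it."""
    if message["dequeue_count"] > MAX_DEQUEUE:
        dead_letter.append(message)   # park it for offline inspection
        return "deleted"
    process(message)
    return "processed"

dead = []
poison = {"id": 1, "dequeue_count": 3}
status = handle(poison, process=lambda m: None, dead_letter=dead)  # "deleted"
```
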
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs - files and large objects
• Drives - NTFS APIs for migrating applications
• Tables - massively scalable structured storage
• Queues - reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
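The data-parallel idea carries over to any language; here is a sketch with Python's standard worker pool (the .NET analogue on the slide is the Task Parallel Library):

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch: data parallelism with a worker pool, so one role instance
# keeps all of its cores busy instead of idling on a single thread.
def expensive(x):
    return x * x            # stand-in for real per-item work

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(expensive, range(8)))
```

`pool.map` preserves input order, so the results line up with the inputs even though the work ran concurrently.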
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically ~100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated ~4,000 cores in total: 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalance appeared, the load was redistributed manually
[Deployment map: per-deployment instance counts of 50-62.]
End Result
• Total size of the output result is ~230 GB
• 1,764,579,487 total hits
• Started March 25th; the last task completed April 8th (10 days of compute)
• Based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place
Understanding Azure by analyzing logs

A normal log record looks like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
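Flagging the anomaly above (a task started but never reported done) is mechanical. A sketch, assuming log lines of the form shown with punctuation restored (the `LOG` sample and regexes are illustrative):

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def unfinished_tasks(log_text):
    """Return task ids that were started but never reported done."""
    started, done = set(), set()
    for line in log_text.splitlines():
        if m := re.search(r"Executing the task (\d+)", line):
            started.add(m.group(1))
        if m := re.search(r"Execution of task (\d+) is done", line):
            done.add(m.group(1))
    return sorted(started - done)

print(unfinished_tasks(LOG))   # ['251774']
```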
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost their tasks and then came back in groups of ~6 nodes, roughly 30 minutes apart: this is an update domain at work.
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed before the job was killed.
35 nodes experienced blob-write failures at the same time; a reasonable guess is that a fault domain was involved.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." - Irish proverb
Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

  ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

  ET = water volume evapotranspired (m^3 s^-1 m^-2)
  Δ  = rate of change of saturation specific humidity with air temperature (Pa K^-1)
  λv = latent heat of vaporization (J/g)
  Rn = net radiation (W m^-2)
  cp = specific heat capacity of air (J kg^-1 K^-1)
  ρa = dry air density (kg m^-3)
  δq = vapor pressure deficit (Pa)
  ga = conductivity of air, the inverse of ra (m s^-1)
  gs = conductivity of plant stoma air, the inverse of rs (m s^-1)
  γ  = psychrometric constant (γ ≈ 66 Pa K^-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
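As a direct transcription of the Penman-Monteith formula above (the sample input values are illustrative only, and consistent unit bookkeeping is left to the caller):

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET.
    delta    : d(saturation specific humidity)/dT   [Pa/K]
    r_n      : net radiation                        [W/m^2]
    rho_a    : dry air density                      [kg/m^3]
    c_p      : specific heat capacity of air        [J/(kg K)]
    dq       : vapor pressure deficit               [Pa]
    g_a, g_s : conductivities of air / plant stoma  [m/s]
    gamma    : psychrometric constant (~66 Pa/K)
    lambda_v : latent heat of vaporization          [J/g]
    """
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# illustrative (not measured) inputs for a warm, well-watered canopy
et = penman_monteith_et(delta=145, r_n=400, rho_a=1.2, c_p=1005,
                        dq=1000, g_a=0.02, g_s=0.01)
```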
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
Scale: 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests via the AzureMODIS service web role portal; a request queue feeds a download queue (data collection stage, pulling from source imagery download sites), then a reprojection queue (reprojection stage), then reduction 1 and reduction 2 queues (derivation and analysis reduction stages); source metadata is consulted throughout, and scientific results are downloaded at the end.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The MODISAzure service is the web role front door
  • Receives all user requests
  • Queues each request to the appropriate download, reprojection, or reduction job queue
• The Service Monitor is a dedicated worker role
  • Parses all job requests into tasks - recoverable units of work
  • Persists the execution status of all jobs and tasks in tables

[Diagram: a <PipelineStage> request enters the MODISAzure service (web role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (worker role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.]
MODISAzure Architectural Big Picture (2/2)

• All work is actually done by a GenericWorker worker role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

[Diagram: the Service Monitor (worker role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue, which GenericWorker (worker role) instances consume, reading from <Input> data storage.]
Example Pipeline Stage: Reprojection Service

[Diagram: a reprojection request flows through the job queue to the Service Monitor (worker role), which persists ReprojectionJobStatus and ReprojectionTaskStatus and dispatches tasks to the task queue consumed by GenericWorker (worker role) instances; workers read swath source data storage and write reprojection data storage, consulting the SwathGranuleMeta and ScanTimeList tables.]

• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation

• Computational costs are driven by the data scale and the need to run reductions multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Stage by stage:
• Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers - $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20-100 workers - $420 compute, $60 download
• Derivation reduction: 5-7 GB, 55K files, 1800 hours, 20-100 workers - $216 compute, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20-100 workers - $216 compute, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and they have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press. Programming Windows Azure, O'Reilly Press. Bing: Channel 9 Windows Azure. Bing: Windows Azure Platform Training Kit - November Update. http://research.microsoft.com/azure  xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds - Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components - Compute: Web Roles
- Key Components - Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components - Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce - The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Is not relational
Cannot:
• Create foreign-key relationships between tables
• Perform server-side joins between tables
• Create custom indexes on the tables
• No server-side Count(), for example

All entities must have the following properties:
• Timestamp
• PartitionKey
• RowKey
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple, asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
Storage Partitioning
Understanding partitioning is key to understanding performance.

Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• Controls entity locality

Partition key is the unit of scale
• A partition can be served by a single server
• The system load-balances partitions based on traffic pattern

The system load-balances for you
• Load balancing can take a few minutes to kick in
• It can take a couple of seconds for a partition to become available on a different server
• Use exponential backoff on "Server Busy": either the system is load balancing to meet your traffic needs, or a single partition's limits have been reached
Partition Keys In Each Abstraction

Entities - TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

  PartitionKey (CustomerId) | RowKey (RowKind)      | Name         | CreditCardNumber    | OrderTotal
  1                         | Customer-John Smith   | John Smith   | xxxx-xxxx-xxxx-xxxx |
  1                         | Order - 1             |              |                     | $35.12
  2                         | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
  2                         | Order - 3             |              |                     | $10.00

Blobs - Container name + Blob name
• Every blob and its snapshots are in a single partition

  Container Name | Blob Name
  image          | annarbor/bighouse.jpg
  image          | foxborough/gillette.jpg
  video          | annarbor/bighouse.jpg

Messages - Queue name
• All messages for a single queue belong to the same partition

  Queue    | Message
  jobs     | Message1
  jobs     | Message2
  workflow | Message1
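The three partitioning rules above can be captured in a few lines (a toy model; the function and key names are illustrative, not the storage service's internals):

```python
def partition_key(kind, **names):
    """Derive the partition identity for each storage abstraction.
    Objects with the same partition identity are served by one partition."""
    if kind == "entity":      # table entities: TableName + PartitionKey
        return (names["table"], names["partition_key"])
    if kind == "blob":        # a blob and its snapshots: Container + Blob
        return (names["container"], names["blob"])
    if kind == "message":     # all messages in a queue: Queue name
        return (names["queue"],)
    raise ValueError(f"unknown kind: {kind}")

# all messages in one queue land in a single partition...
assert partition_key("message", queue="jobs") == \
       partition_key("message", queue="jobs")
# ...while two blobs in the same container may be in different partitions
assert partition_key("blob", container="image", blob="a.jpg") != \
       partition_key("blob", container="image", blob="b.jpg")
```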
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load-balanced to replicas that are in sync

[Diagram: Server 1, Server 2, and Server 3 each hold replicas of partitions P1, P2, ..., Pn.]
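The write/read guarantee above can be modeled with a toy in-memory store (this is a conceptual sketch of the contract, not how the storage service is implemented):

```python
class ReplicatedStore:
    """Toy model: a write acknowledges only after all three replicas have
    applied it, so a read from any in-sync replica returns the value."""
    def __init__(self):
        self.replicas = [{}, {}, {}]

    def write(self, key, value):
        for replica in self.replicas:   # not complete until all 3 written
            replica[key] = value
        return True                     # ack only after the loop finishes

    def read(self, key, replica_index):
        # reads are only routed to in-sync replicas, so any index works
        return self.replicas[replica_index][key]

store = ReplicatedStore()
store.write("p1", "data")
assert all(store.read("p1", i) == "data" for i in range(3))
```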
Scalability Targets

Storage account
• Capacity: up to 100 TB
• Transactions: up to a few thousand requests per second
• Bandwidth: up to a few hundred megabytes per second

Single queue/table partition
• Up to 500 transactions per second

Single blob partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions.
When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff.
Partitions and Partition Ranges

Initially a single server may serve the entire table:

Server A: Table = Movies [Min - Max]
  PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
  Action                  | Fast & Furious           | ...       | 2009
  Action                  | The Bourne Ultimatum     | ...       | 2007
  ...                     | ...                      | ...       | ...
  Animation               | Open Season 2            | ...       | 2009
  Animation               | The Ant Bully            | ...       | 2006
  ...                     | ...                      | ...       | ...
  Comedy                  | Office Space             | ...       | 1999
  ...                     | ...                      | ...       | ...
  SciFi                   | X-Men Origins: Wolverine | ...       | 2009
  ...                     | ...                      | ...       | ...
  War                     | Defiance                 | ...       | 2008

Under load, the system splits the partition range across servers:

Server A: Table = Movies [Min - Comedy)
  Action    | Fast & Furious       | ... | 2009
  Action    | The Bourne Ultimatum | ... | 2007
  ...       | ...                  | ... | ...
  Animation | Open Season 2        | ... | 2009
  Animation | The Ant Bully        | ... | 2006

Server B: Table = Movies [Comedy - Max]
  Comedy | Office Space             | ... | 1999
  ...    | ...                      | ... | ...
  SciFi  | X-Men Origins: Wolverine | ... | 2009
  ...    | ...                      | ... | ...
  War    | Defiance                 | ... | 2008
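Range partitioning like this is just an ordered lookup over split points. A sketch with the stdlib's `bisect` (the boundary and server names mirror the example above and are purely illustrative):

```python
import bisect

# one split point between two ranges: [Min - "Comedy") and ["Comedy" - Max]
boundaries = ["Comedy"]
servers = ["Server A", "Server B"]

def server_for(partition_key):
    """Locate the server whose key range contains this PartitionKey."""
    return servers[bisect.bisect_right(boundaries, partition_key)]

print(server_for("Action"))   # Server A  ("Action" < "Comedy")
print(server_for("Comedy"))   # Server B  (the split range is [Comedy - Max])
print(server_for("War"))      # Server B
```

Adding more split points to `boundaries` (with a matching server list) models further splits as traffic grows.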
Key Selection: Things to Consider

Scalability
• Distribute load as much as possible
• Hot partitions can be load-balanced
• PartitionKey is critical for scalability

Query efficiency and speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient

Entity group transactions
• Transactions across a single partition
• Transaction semantics reduce round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens - Seriously
A continuation token can be returned whenever:
• The response already contains the maximum of 1000 rows
• The query reached a partition range boundary
• The query hit the maximum of 5 seconds of execution time
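Because tokens can appear even for small result sets, every table query should run inside a drain loop. A sketch, where `query_page(token) -> (rows, next_token)` is a hypothetical stand-in for one paged table request (not an actual client-library API):

```python
def query_all(query_page):
    """Drain a paged query that may return continuation tokens.
    `query_page(token)` returns (rows, next_token); next_token is None
    when the result set is exhausted. Always loop - a token can show up
    for partition boundaries or the 5-second limit, not just 1000 rows."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# toy server: 2500 rows, at most 1000 returned per request
DATA = list(range(2500))
def fake_query_page(token):
    start = token or 0
    page = DATA[start:start + 1000]
    next_token = start + 1000 if start + 1000 < len(DATA) else None
    return page, next_token

print(len(query_all(fake_query_page)))   # 2500
```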
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

Select a PartitionKey and RowKey that help scale
• Avoid "append only" patterns: distribute load by using a hash or similar as a key prefix
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "Server busy" means either partitions are being load-balanced to meet traffic needs or the load on a single partition has exceeded its limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
• Tight coupling leads to brittleness; decoupling via queues can aid scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML and are limited to 8 KB in size
• Commonly used with the work ticket pattern
• Why not simply use a table?
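The work ticket pattern above can be sketched with stdlib stand-ins for the queue and blob store (`submit`/`worker` and the dict-backed storage are illustrative, not Azure SDK calls):

```python
import queue
import uuid

blob_store = {}               # stands in for blob storage
work_queue = queue.Queue()    # stands in for an Azure queue (8 KB messages)

def submit(payload: bytes) -> str:
    """Park the (possibly large) payload in blob storage and enqueue only
    a small reference to it - the 'work ticket'."""
    blob_name = f"job-{uuid.uuid4()}"
    blob_store[blob_name] = payload
    work_queue.put(blob_name)          # tiny message, well under 8 KB
    return blob_name

def worker() -> bytes:
    ticket = work_queue.get()          # GetMessage
    data = blob_store[ticket]          # fetch the real payload
    result = data.upper()              # ...do the actual work...
    del blob_store[ticket]             # garbage-collect the orphaned blob
    return result

submit(b"a" * 100_000)                 # far larger than one queue message
result = worker()                      # b"A" * 100_000
```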
Queue Terminology and Message Lifecycle

[Diagram: a web role calls PutMessage to add messages (Msg 1..4) to the queue; worker roles call GetMessage with a visibility timeout to dequeue a message and RemoveMessage to delete it once processed.]
PutMessage request:

POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DeleteMessage request:

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x, up to a maximum; a successful poll resets the interval back to 1.
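The back-off rule above fits in a few lines (the base and cap values are illustrative defaults):

```python
def poll_intervals(outcomes, base=1, cap=64):
    """Truncated exponential back-off: each empty poll doubles the sleep
    interval up to `cap`; a successful poll resets it to `base`.
    `outcomes` is a sequence of booleans (True = a message was returned);
    the returned list is the interval used before each poll."""
    interval, schedule = base, []
    for got_message in outcomes:
        schedule.append(interval)
        interval = base if got_message else min(interval * 2, cap)
    return schedule

# six empty polls, then a message, then one more empty poll
print(poll_intervals([False] * 6 + [True, False]))
# [1, 2, 4, 8, 16, 32, 64, 1]
```

In a real worker the interval would feed a sleep between GetMessage calls; the cap ("truncation") keeps a long-idle queue from backing off indefinitely.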
Removing Poison Messages (1/3)

[Diagram: producers P1 and P2 feed a queue of messages (each carrying a dequeue count) consumed by workers C1 and C2.]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2/3)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3/3)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
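Step 12 above - check the dequeue count, and delete rather than reprocess past a threshold - is the whole poison-message guard. A sketch (the callbacks and threshold are illustrative, not a specific client API):

```python
MAX_DEQUEUE_COUNT = 3

def handle(queue_get, queue_delete, process):
    """Poison-message guard: if a message keeps reappearing (its dequeue
    count exceeds the threshold), delete it instead of processing again.
    queue_get() returns (body, dequeue_count) or None."""
    msg = queue_get()
    if msg is None:
        return None
    body, dequeue_count = msg
    if dequeue_count > MAX_DEQUEUE_COUNT:
        queue_delete(body)             # remove the poison message
        return ("poisoned", body)
    process(body)                      # normal path: process, then delete
    queue_delete(body)
    return ("done", body)

deleted = []
status = handle(lambda: ("msg1", 4), deleted.append, lambda body: None)
print(status)    # ('poisoned', 'msg1')
```

A production variant would typically also park the poisoned body somewhere (a "dead letter" table or blob) for later inspection instead of discarding it outright.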
Queues Recap
• Make message processing idempotent: then there is no need to deal with failures specially
• Do not rely on order: invisible messages can result in out-of-order delivery
• Use the dequeue count to remove poison messages: enforce a threshold on each message's dequeue count
• For messages larger than 8 KB, use a blob to store the message data with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale: dynamically increase or reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs - files and large objects
• Drives - NTFS APIs for migrating applications
• Tables - massively scalable structured storage
• Queues - reliable delivery of messages

Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU against having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O completion ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
• Exploit both data parallelism and task parallelism
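The slide's example is .NET's Task Parallel Library; the same task-parallel idea in a language-neutral sketch - one independent task per work item, scheduled across a thread pool (the checksum workload is illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
import hashlib

def checksum(block: bytes) -> str:
    """One independent unit of work per input block."""
    return hashlib.sha256(block).hexdigest()

# eight independent blocks -> eight tasks the pool spreads over its threads
blocks = [bytes([i]) * 1_000_000 for i in range(8)]

with ThreadPoolExecutor(max_workers=8) as pool:
    digests = list(pool.map(checksum, blocks))
```

Note the earlier caveat still applies: more in-flight tasks than cores mostly buys queuing, not speed, for CPU-bound work.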
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure or poor user experience from not having excess capacity against the cost of idling VMs
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app's profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • The service choice can make a big cost difference depending on your profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth often leads to savings in other places
• Sending fewer things also means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

[Diagram: uncompressed content → Gzip, minified JavaScript, minified CSS, minified images → compressed content]
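How much Gzip buys on typical markup is easy to see (the sample HTML is illustrative; real pages with repetitive tags compress similarly well):

```python
import gzip

# repetitive markup, as most generated HTML is
html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)
print(len(html), "->", len(compressed))   # prints original vs. compressed size

# the round trip is lossless - this is the "compute for bandwidth" trade
assert gzip.decompress(compressed) == html
```

Here a few CPU milliseconds shrink both the bytes stored and the bytes billed on every response.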
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications

NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially: GenBank doubled in size in about 15 months
Opportunities for Cloud Computing

It is easy to parallelize BLAST
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST): needs special result-reduction processing

Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means peak storage traffic could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
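The query-segmentation approach sketched above is a plain split/join: partition the input, run each partition independently, merge in order (here `blast_partition` is a stand-in for a real NCBI-BLAST invocation, and 100 sequences/partition echoes the benchmark result later in the deck):

```python
def split(sequences, partition_size=100):
    """Query segmentation: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    """Stand-in for running NCBI-BLAST on one partition on one worker."""
    return [f"hits-for-{seq}" for seq in partition]

def merge(per_partition_results):
    """Join step: concatenate per-partition results in input order."""
    return [hit for part in per_partition_results for hit in part]

seqs = [f"seq{i}" for i in range(250)]
partitions = split(seqs)                              # sizes 100, 100, 50
merged = merge(blast_partition(p) for p in partitions)
print(len(partitions), len(merged))                   # 3 250
```

Because no partition depends on another, the middle step maps directly onto queue-dispatched worker roles.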
AzureBLAST
• A parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query the partitions in parallel
  • Merge the results together when done
• Follows the generally suggested application model: Web Role + Queue + Worker
• With three special considerations, including batch job management and task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern

Leverage the multiple cores of each instance
• The "-a" argument of NCBI-BLAST
• Set to 1, 2, 4, or 8 for small, medium, large, and extra-large instance sizes
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Windows Azure Queues
• Queues are performance-efficient, highly available, and provide reliable message delivery
• Simple asynchronous work dispatch
• Programming semantics ensure that a message can be processed at least once
• Access is provided via REST
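The at-least-once guarantee comes from the visibility-timeout mechanics: a fetched message is hidden, not removed, and reappears unless it is explicitly deleted. A minimal in-memory sketch of these semantics (this is a toy model for illustration, not the Azure queue API):

```python
import time

class ToyQueue:
    """Toy model of Azure-queue-style at-least-once delivery.
    A message fetched with a visibility timeout becomes invisible,
    then reappears unless it is explicitly deleted."""
    def __init__(self):
        self._msgs = {}   # message id -> (body, visible_at)
        self._next = 0

    def put(self, body):
        self._msgs[self._next] = (body, 0.0)
        self._next += 1

    def get(self, visibility_timeout, now=None):
        now = time.time() if now is None else now
        for mid, (body, visible_at) in sorted(self._msgs.items()):
            if visible_at <= now:
                # Hide the message for `visibility_timeout` seconds.
                self._msgs[mid] = (body, now + visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        self._msgs.pop(mid, None)

q = ToyQueue()
q.put("work item")
mid, body = q.get(visibility_timeout=30, now=100.0)
# Consumer crashes without deleting: the message is invisible until t=130...
assert q.get(visibility_timeout=30, now=120.0) is None
# ...then reappears, so it is processed at least once.
mid2, body2 = q.get(visibility_timeout=30, now=131.0)
q.delete(mid2)
```

This is why handlers should be idempotent: the same message body may be handed out more than once.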
Storage Partitioning
Understanding partitioning is key to understanding performance.
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
• Controls entity locality
System load balances
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Our system load balances to meet your traffic needs
• Single partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $35.12
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $10.00

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: partitions P1…Pn replicated across Server 1, Server 2, and Server 3]
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions.
When the limit is hit, the app will see "503 Server Busy"; applications should implement exponential backoff.
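A retry wrapper for the "503 Server Busy" case can be sketched as follows. This is a hedged illustration: `ServerBusyError` and the `op` callable are hypothetical stand-ins for whatever storage call and error your client library surfaces.

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for a storage response of '503 Server Busy'."""

def with_backoff(op, max_retries=6, base=0.5, cap=30.0, sleep=time.sleep):
    """Retry `op` with truncated exponential backoff when the
    service reports it is busy."""
    delay = base
    for attempt in range(max_retries):
        try:
            return op()
        except ServerBusyError:
            if attempt == max_retries - 1:
                raise  # give up after the final attempt
            # Jittered wait, truncated at `cap`; double it each retry.
            sleep(min(cap, delay) * random.uniform(0.5, 1.5))
            delay *= 2

# Simulated partition that is busy twice before accepting the request.
calls = []
def flaky_put():
    calls.append(1)
    if len(calls) < 3:
        raise ServerBusyError
    return "ok"

assert with_backoff(flaky_put, sleep=lambda s: None) == "ok"
assert len(calls) == 3
```

The jitter keeps many clients from retrying in lockstep against the same hot partition.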
Partitions and Partition Ranges

Server A – Table = Movies [Min - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

After the partition range splits under load:

Server A – Table = Movies [Min - Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B – Table = Movies [Comedy - Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
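Since any of these limits can cut a result set short, every table query loop should drain continuation tokens. A sketch of the pattern (here `query_page` is a hypothetical helper standing in for one REST round trip; the real service carries the token in the `x-ms-continuation-NextPartitionKey` / `NextRowKey` response headers):

```python
def query_all(query_page):
    """Drain a table query that may return continuation tokens.
    `query_page(token)` returns (rows, next_token); next_token is
    None once the result set is exhausted."""
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows

# Toy paging source: at most 1000 rows per response, as the service enforces.
data = list(range(2500))
def fake_page(token):
    start = token or 0
    page = data[start:start + 1000]
    nxt = start + 1000 if start + 1000 < len(data) else None
    return page, nxt

assert query_all(fake_page) == data
```

Note that a response can even contain zero rows and still carry a token, so the loop must key off the token, not the page size.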
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns
• Distribute by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server Busy": partitions are load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid in scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: a Web Role puts messages (Msg 1…Msg 4) onto the Queue with PutMessage; Worker Roles fetch them with GetMessage (with a visibility timeout) and, once processed, RemoveMessage]

PutMessage:

POST http://myaccount.queue.core.windows.net/myqueue/messages

GetMessage response:

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

RemoveMessage:

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a back-off polling approach:
• Each empty poll increases the interval by 2x, up to a ceiling
• A successful poll sets the interval back to 1
[Diagram: consumers C1 and C2 polling the queue at growing intervals]
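The polling schedule described above can be sketched as a small function; the 1-second floor and 60-second ceiling here are illustrative assumptions, not service-mandated values:

```python
def poll_intervals(results, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off polling: each empty poll doubles
    the wait (truncated at `ceiling`); a successful poll resets it to
    `floor`. `results` is a sequence of booleans standing in for
    'the queue had a message on this poll'."""
    interval, waits = floor, []
    for got_message in results:
        waits.append(interval)  # wait this long before the poll
        interval = floor if got_message else min(ceiling, interval * 2)
    return waits

# Five empty polls, a hit, then one more empty poll:
# the interval grows 1, 2, 4, 8, 16, 32, then resets to 1 after success.
assert poll_intervals([False]*5 + [True] + [False]) == [1, 2, 4, 8, 16, 32, 1]
```

The ceiling bounds the worst-case latency for picking up new work; the reset keeps a busy queue drained at full speed.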
Removing Poison Messages
[Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue with a 30 s visibility timeout. The slide animates the following sequence:]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2, so msg 1 is treated as a poison message
13. C1: DeleteMessage(Q, msg 1)
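Step 12 above is the key idea: the service increments a message's dequeue count on every GetMessage, so a consumer can spot a message that keeps crashing its handlers and remove it. A sketch of that consumer loop (the four callables are hypothetical stand-ins for queue and storage calls, and the threshold of 3 is an illustrative choice):

```python
POISON_THRESHOLD = 3

def process_queue(get_message, delete_message, handle, quarantine):
    """Dequeue messages; once a message's dequeue count crosses the
    threshold, quarantine and delete it instead of retrying forever."""
    while True:
        msg = get_message()              # the service bumps dequeue_count
        if msg is None:
            return                       # queue drained
        if msg["dequeue_count"] > POISON_THRESHOLD:
            quarantine(msg)              # e.g. log it or park it in a blob
            delete_message(msg)
            continue
        handle(msg)
        delete_message(msg)

# Toy run: one healthy message and one that has already failed 4 times.
pending = [{"body": "good", "dequeue_count": 1},
           {"body": "bad", "dequeue_count": 4}]
handled, parked = [], []
process_queue(lambda: pending.pop(0) if pending else None,
              lambda m: None,
              lambda m: handled.append(m["body"]),
              lambda m: parked.append(m["body"]))
assert handled == ["good"] and parked == ["bad"]
```

Quarantining (rather than silently deleting) preserves the bad message for later diagnosis.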
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order delivery
Use the dequeue count to remove poison messages
• Enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message
• For messages > 8 KB
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
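The slide's advice is framed around the .NET Task Parallel Library; the same data-parallel "map" pattern can be sketched in Python with `concurrent.futures` (a pool of workers sized to the instance's core count, applying one function across many independent inputs):

```python
from concurrent.futures import ThreadPoolExecutor
import os

# A stand-in unit of work: any pure function applied per input chunk.
def checksum(chunk):
    return sum(chunk) % 97

chunks = [list(range(i, i + 1000)) for i in range(0, 8000, 1000)]

# Size the pool to the VM's cores so the instance is not left idle.
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    results = list(pool.map(checksum, chunks))

assert len(results) == 8
assert results == [checksum(c) for c in chunks]
```

(For CPU-bound Python work a process pool would be the better fit; the point here is the pattern of fanning independent work units across all cores, which is what the TPL does for .NET code.)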
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content → Gzip, minified JavaScript, minified CSS, minified images → compressed content]
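A quick sketch of the payoff with Python's standard-library `gzip` (the HTML payload here is made up for illustration):

```python
import gzip

# Repetitive markup, typical of generated HTML.
html = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"

compressed = gzip.compress(html)

# The bytes served (and the egress bandwidth billed) shrink accordingly,
# and the content round-trips losslessly.
assert len(compressed) < len(html) // 10
assert gzip.decompress(compressed) == html
```

In a web role this is usually a one-line configuration (enabling dynamic compression in IIS) rather than hand-written code; the sketch just shows why it pays.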
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
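The query-segmentation pattern boils down to a split/join: cut the input into fixed-size partitions, run each as an independent task, and concatenate the outputs. A minimal sketch (partition size and names are illustrative, not AzureBLAST's actual code):

```python
def split_query(sequences, partition_size):
    """Split the input sequences into fixed-size partitions; each
    partition becomes one independent BLAST task."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge(results_per_partition):
    """Join: concatenate per-partition results once every task is done."""
    merged = []
    for part in results_per_partition:
        merged.extend(part)
    return merged

seqs = [f"seq{i}" for i in range(250)]
parts = split_query(seqs, 100)   # e.g. 100 sequences per partition
assert [len(p) for p in parts] == [100, 100, 50]
assert merge(parts) == seqs      # split/join round-trips the input
```

Choosing the partition size is the interesting knob, as the next slide's granularity discussion makes clear.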
AzureBLAST Task-Flow
A simple Split/Join pattern.
Leverage the multi-core capability of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of an instance failure
[Diagram: a splitting task fans out to many BLAST tasks, which a merging task joins]
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine, tracking jobs in the Job Registry (Azure Table); Worker instances pull BLAST tasks from a global dispatch queue; a Database Updating Role refreshes the NCBI databases; Azure Blob storage holds the BLAST databases, temporary data, etc. A splitting task fans out BLAST tasks that a merging task joins.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID.
An accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
[Diagram: the Web Portal and Web Service feed job registration into the Job Scheduler, Job Portal, Scaling Engine, and Job Registry]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[Map: instance counts per deployment – 50, 62, 62, 62, 62, 62, 50, 62]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6~8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g. a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe data center: in total 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J g-1)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Data Collection Stage, which pulls from Source Imagery Download Sites via a Download Queue and records Source Metadata; a Reprojection Queue feeds the Reprojection Stage; Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; science results are available for Scientific Results Download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> Request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue, from which GenericWorker (Worker Role) instances pull tasks and read/write <Input>Data Storage]
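The GenericWorker's dequeue/execute/retry behavior can be sketched as a small loop. This is an illustration of the pattern, not MODISAzure's actual code; the callables stand in for queue and table operations, and status strings are made up:

```python
MAX_ATTEMPTS = 3

def run_worker(task_queue, execute, record_status):
    """Dequeue a task, run it, and re-enqueue on failure up to 3
    attempts, persisting task status (as the Tables do) each time."""
    while task_queue:
        task = task_queue.pop(0)
        try:
            execute(task)
            record_status(task["id"], "done")
        except Exception:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] < MAX_ATTEMPTS:
                task_queue.append(task)          # retry later
                record_status(task["id"], "retrying")
            else:
                record_status(task["id"], "failed")

statuses = []
def flaky(task):
    if task["id"] == "t2":
        raise RuntimeError("always fails")

run_worker([{"id": "t1"}, {"id": "t2"}], flaky,
           lambda tid, s: statuses.append((tid, s)))
assert statuses == [("t1", "done"), ("t2", "retrying"),
                    ("t2", "retrying"), ("t2", "failed")]
```

Persisting status on every transition is what makes tasks "recoverable units of work": a replacement worker can pick up where a failed one left off.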
Example Pipeline Stage: Reprojection Service
[Diagram: Reprojection Requests enter the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue, from which GenericWorker (Worker Role) instances pull tasks that point into Reprojection Data Storage and Swath Source Data Storage]
• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e. a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g. boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage | Data | Compute | Cost
Data Collection | 400-500 GB, 60K files, 10 MB/sec | 11 hours, <10 workers | $50 upload, $450 storage
Reprojection | 400 GB, 45K files | 3500 hours, 20-100 workers | $420 CPU, $60 download
Derivation Reduction | 5-7 GB, 55K files | 1800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis Reduction | <10 GB, ~1K files | 1800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage
Total | | | $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Storage Partitioning
Understanding partitioning is key to understanding performance
Every data object has a partition key
• Different for each data type (blobs, entities, queues)
• Controls entity locality
Partition key is the unit of scale
• A partition can be served by a single server
• System load balances partitions based on traffic pattern
System load balancing
• Load balancing can take a few minutes to kick in
• Can take a couple of seconds for a partition to become available on a different server
Server Busy
• Use exponential backoff on "Server Busy"
• Returned while the system load balances to meet your traffic needs, or when single-partition limits have been reached
Partition Keys In Each Abstraction
Entities – TableName + PartitionKey
• Entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId) | RowKey (RowKind) | Name | CreditCardNumber | OrderTotal
1 | Customer-John Smith | John Smith | xxxx-xxxx-xxxx-xxxx |
1 | Order – 1 | | | $3512
2 | Customer-Bill Johnson | Bill Johnson | xxxx-xxxx-xxxx-xxxx |
2 | Order – 3 | | | $1000

Blobs – Container name + Blob name
• Every blob and its snapshots are in a single partition

Container Name | Blob Name
image | annarbor/bighouse.jpg
image | foxborough/gillette.jpg
video | annarbor/bighouse.jpg

Messages – Queue Name
• All messages for a single queue belong to the same partition

Queue | Message
jobs | Message1
jobs | Message2
workflow | Message1
Replication Guarantee
• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas in sync
[Diagram: Server 1, Server 2, Server 3, each holding partitions P1, P2, …, Pn]
Scalability Targets
Storage Account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second
Single Queue/Table Partition
• Up to 500 transactions per second
Single Blob Partition
• Throughput up to 60 MB/s
To go above these numbers, partition between multiple storage accounts and partitions
When a limit is hit, the app will see '503 Server Busy'; applications should implement exponential backoff
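The "503 Server Busy" guidance above is usually packaged as a retry wrapper around every storage call. A minimal Python sketch of the idea (the `ServerBusyError` and `flaky` names are illustrative stand-ins, not part of any Azure SDK):

```python
import random
import time

class ServerBusyError(Exception):
    """Stand-in for a 503 Server Busy response."""

def with_backoff(operation, max_retries=6, base_delay=0.1, max_delay=4.0):
    """Retry `operation` with truncated exponential backoff on Server Busy."""
    for attempt in range(max_retries):
        try:
            return operation()
        except ServerBusyError:
            if attempt == max_retries - 1:
                raise
            # Delay doubles each attempt, truncated at max_delay, with jitter
            # so many clients do not retry in lockstep.
            delay = min(base_delay * (2 ** attempt), max_delay)
            time.sleep(delay * random.uniform(0.5, 1.0))

# Example: an operation that fails twice before succeeding.
calls = {"n": 0}
def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ServerBusyError()
    return "ok"

print(with_backoff(flaky))  # → ok
```

The jitter and the cap matter: without them, a hot partition keeps getting hammered at exactly the moment the system is trying to load balance it.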
Partitions and Partition Ranges

Server A: Table = Movies [Min – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006
… | … | … | …
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008

Server A: Table = Movies [Min – Comedy)
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Action | Fast & Furious | … | 2009
Action | The Bourne Ultimatum | … | 2007
… | … | … | …
Animation | Open Season 2 | … | 2009
Animation | The Ant Bully | … | 2006

Server B: Table = Movies [Comedy – Max]
PartitionKey (Category) | RowKey (Title) | Timestamp | ReleaseDate
Comedy | Office Space | … | 1999
… | … | … | …
SciFi | X-Men Origins: Wolverine | … | 2009
… | … | … | …
War | Defiance | … | 2008
Key Selection: Things to Consider
Scalability
• PartitionKey is critical for scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
• Maximum of 1000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
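Because any of these limits can end a response early, a range query must loop until no continuation token comes back. A language-neutral sketch, with a hypothetical `query_page(token)` standing in for the real table query call:

```python
def query_all(query_page):
    """Drain a paged query: keep following continuation tokens until done.

    `query_page(token)` is a stand-in for a table query that returns
    (rows, next_token); next_token is None when there is nothing left.
    """
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:   # no continuation token -> all rows fetched
            return rows

# Simulated server: at most 2 "rows" per response (the real limit is 1000).
data = list(range(5))
def fake_page(token):
    start = token or 0
    page = data[start:start + 2]
    nxt = start + 2 if start + 2 < len(data) else None
    return page, nxt

print(query_all(fake_page))  # → [0, 1, 2, 3, 4]
```

Code that treats the first page as the whole result silently drops rows once a table grows past a partition boundary.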
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale
Avoid "append only" patterns
• Distribute load by using a hash etc. as a prefix
Always handle continuation tokens
• Expect continuation tokens for range queries
"OR" predicates are not optimized
• Execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries
• "Server busy" means partitions are being load balanced to meet traffic needs, or the load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness
• Loose coupling can aid scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML
• Limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[Diagram: Web Role → PutMessage → Queue (Msg 1, Msg 2, Msg 3, Msg 4) → Worker Roles: GetMessage (with timeout), then RemoveMessage]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
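The REST exchange above is the get → process → delete lifecycle: GetMessage hides the message for the visibility timeout and hands back a pop receipt, and DeleteMessage must present that receipt. A toy in-memory model of those semantics (this is an illustration of the protocol, not the Azure client library):

```python
import time
import uuid

class MiniQueue:
    """Toy model of queue semantics: get() hides a message for `timeout`
    seconds and returns a pop receipt; delete() must present that receipt."""
    def __init__(self):
        self.messages = []   # list of dicts

    def put(self, text):
        self.messages.append({"text": text, "visible_at": 0.0, "receipt": None})

    def get(self, timeout=30.0):
        now = time.time()
        for m in self.messages:
            if m["visible_at"] <= now:
                m["visible_at"] = now + timeout   # hide for the timeout window
                m["receipt"] = uuid.uuid4().hex   # fresh pop receipt per dequeue
                return m["text"], m["receipt"]
        return None                               # nothing currently visible

    def delete(self, receipt):
        for m in self.messages:
            if m["receipt"] == receipt:
                self.messages.remove(m)
                return True
        return False   # stale receipt: the message was re-dequeued elsewhere

q = MiniQueue()
q.put("work ticket 1")
text, receipt = q.get(timeout=30)
# ... process the work ticket ...
q.delete(receipt)       # the message is gone only after an explicit delete
print(len(q.messages))  # → 0
```

The key property: if the consumer crashes before the delete, the message reappears after the timeout, which is exactly what the poison-message slides below rely on.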
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x (truncated at a maximum); a successful poll sets the interval back to 1.
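The polling rule can be stated in a few lines; the `floor` and `ceiling` bounds are assumed parameters for illustration:

```python
def next_poll_interval(current, got_message, floor=1.0, ceiling=60.0):
    """Truncated exponential back-off polling:
    empty poll -> double the interval (up to `ceiling`);
    successful poll -> reset to the minimum interval."""
    if got_message:
        return floor
    return min(current * 2, ceiling)

# Eight empty polls in a row: 1 -> 2 -> 4 -> ... capped at the ceiling.
interval = 1.0
for _ in range(8):
    interval = next_poll_interval(interval, got_message=False)
print(interval)                                        # → 60.0
print(next_poll_interval(interval, got_message=True))  # → 1.0
```

This keeps idle workers from burning storage transactions (each empty GetMessage is billed) while still reacting quickly once traffic resumes.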
Removing Poison Messages
[Diagram: producers P1, P2; consumers C1, C2; queue holding messages with dequeue counts]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages
[Diagram: producers P1, P2; consumers C1, C2]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages
[Diagram: producers P1, P2; consumers C1, C2]
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
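Step 12 above is the whole trick: threshold the dequeue count so a message that crashes its consumer every time does not loop forever. A sketch with hypothetical `queue_get` and `dead_letter` hooks (not SDK calls):

```python
def handle(queue_get, process, dead_letter, max_dequeue=3):
    """Guard against poison messages: a message whose dequeue count keeps
    growing (its consumer crashes mid-processing every time) must not be
    retried forever. `queue_get` returns (message, dequeue_count) or None."""
    item = queue_get()
    if item is None:
        return "empty"
    message, dequeue_count = item
    if dequeue_count > max_dequeue:
        dead_letter(message)   # park it (e.g., in a blob/table) for inspection
        return "poisoned"
    process(message)
    return "processed"

dead = []
status = handle(lambda: ("bad ticket", 4), lambda m: None, dead.append)
print(status, dead)  # → poisoned ['bad ticket']
```

Parking, rather than deleting, the poison message preserves the evidence you need to debug why processing kept crashing.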
Queues Recap
Make message processing idempotent
• No need to deal with failures
Do not rely on order
• Invisible messages result in out-of-order processing
Use dequeue count to remove poison messages
• Enforce a threshold on a message's dequeue count
Use a blob to store message data, with a reference in the message
• For messages > 8 KB
• Batch messages
• Garbage collect orphaned blobs
Use message count to scale
• Dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
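The slide names .NET-specific tools (I/O completion ports, the .NET 4 Task Parallel Library); the same data-parallel idea, sketched with Python's standard thread pool purely for illustration:

```python
from concurrent.futures import ThreadPoolExecutor

def checksum(block):
    """Stand-in for per-item work (data parallelism over independent blocks)."""
    return sum(block) % 251

blocks = [list(range(i, i + 100)) for i in range(0, 1000, 100)]

# Fan the independent blocks out across a pool of workers, keeping the
# pool size near the core count so threads do not oversubscribe the VM.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(checksum, blocks))

print(len(results))  # → 10
```

The design point is the same as on the slide: size the worker pool to the cores you are paying for, and let the scheduler keep them all busy.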
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: Uncompressed Content → Gzip, Minify JavaScript, Minify CSS, Minify Images → Compressed Content]
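Point 1 is a one-liner in most stacks; a Python illustration of why gzip pays off on repetitive markup:

```python
import gzip

html = b"<html><body>" + b"<p>hello azure</p>" * 200 + b"</body></html>"

# Gzip-compress a response body; all modern browsers accept
# Content-Encoding: gzip and decompress on the fly.
compressed = gzip.compress(html)

print(len(compressed) < len(html))          # → True
assert gzip.decompress(compressed) == html  # lossless round-trip
```

The CPU spent compressing is billed, but for text-heavy responses it is usually far cheaper than the bandwidth it saves.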
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple Split/Join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of an instance failure
[Diagram: Splitting task → BLAST tasks … → Merging task]
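The Split/Join pattern above, reduced to its skeleton (the `blast_task` body is a stand-in for the real NCBI-BLAST invocation, and the partition size of 100 echoes the micro-benchmark finding on the next slide):

```python
def split(sequences, partition_size=100):
    """Splitting task: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    """Stand-in for one worker's BLAST run over a partition of sequences."""
    return [f"hit:{seq}" for seq in partition]

def merge(partial_results):
    """Merging task: concatenate the per-partition results."""
    return [hit for part in partial_results for hit in part]

sequences = [f"seq{i}" for i in range(250)]
partitions = split(sequences)
results = merge(blast_task(p) for p in partitions)
print(len(partitions), len(results))  # → 3 250
```

In the real system each partition becomes one queue message, so the work tickets, not this driver loop, carry the parallelism.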
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Diagram: Web Role (Web Portal, Web Service, Job registration, Job Scheduler) → global dispatch queue → Worker instances; Job Management Role (Scaling Engine); Database updating Role; Azure Table (Job Registry, NCBI databases); Azure Blob (BLAST databases, temporary data, etc.); Splitting task → BLAST tasks → Merging task]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
[Diagram: Job Portal → Web Service (job registration) → Job Registry; Job Scheduler, Scaling Engine]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western Europe, and Northern Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances occur, redistribute the load manually
[Diagram: per-deployment instance counts: 50, 62, 62, 62, 62, 62, 50, 62]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working-instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
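The "look into log data" step amounts to diffing started vs. completed task IDs. A sketch over the failure example above (the log format mirrors the records shown, cleaned up for parsing):

```python
import re

log = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started = set(re.findall(r"Executing the task (\d+)", log))
finished = set(re.findall(r"Execution of task (\d+) is done", log))

# A task that starts but never logs completion points at a failure
# (instance crash, system upgrade, storage error, ...).
print(sorted(started - finished))  # → ['251774']
```

Run over the full logs, this kind of diff is what surfaced the update-domain and fault-domain patterns on the next slides.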
Surviving System Upgrades
North Europe Data Center: in total 34,256 tasks processed
All 62 compute nodes lost tasks and then came back in groups; this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks completed, and the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
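For reference, the Penman-Monteith relation transcribed term by term into code. The default constants follow the slide's variable list (γ ≈ 66 Pa K-1; λv ≈ 2257 J/g is an assumed typical value), and the sample inputs are illustrative only, not calibrated catchment values:

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2257.0):
    """Penman-Monteith ET:
    ET = (delta*Rn + rho_a*c_p*dq*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    delta: d(saturation specific humidity)/dT (Pa/K), r_n: net radiation (W/m^2),
    rho_a: dry air density (kg/m^3), c_p: specific heat of air (J/(kg*K)),
    dq: vapor pressure deficit (Pa), g_a/g_s: air/stomatal conductivities (m/s)."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative midday values, not real catchment data.
et = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                     dq=800.0, g_a=0.02, g_s=0.01)
print(et > 0)  # → True
```

The point the slide makes stands out in code too: the formula itself is one line, but ga and gs are the hard part, since they must be estimated from imagery, sensors, and models across the whole catchment.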
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage>JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage>TaskStatus → Dispatch → <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: Service Monitor (Worker Role): Parse & Persist <PipelineStage>TaskStatus → Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role) → <Input> Data Storage]
Example Pipeline Stage Reprojection Service
[Diagram: Reprojection Request → Service Monitor (Worker Role): Persist ReprojectionJobStatus, Parse & Persist ReprojectionTaskStatus → Job Queue → Dispatch → Task Queue → GenericWorker (Worker Role) → points to ScanTimeList, SwathGranuleMeta, Reprojection Data Storage]
Each entity specifies a single reprojection job request
Partition Keys In Each Abstraction

• Tables – TableName + PartitionKey: entities with the same PartitionKey value are served from the same partition

PartitionKey (CustomerId)   RowKey (RowKind)       Name          CreditCardNumber      OrderTotal
1                           Customer-John Smith    John Smith    xxxx-xxxx-xxxx-xxxx
1                           Order – 1                                                  $35.12
2                           Customer-Bill Johnson  Bill Johnson  xxxx-xxxx-xxxx-xxxx
2                           Order – 3                                                  $10.00

• Blobs – Container name + Blob name: every blob (and its snapshots) is in a single partition

Container Name   Blob Name
image            annarbor/bighouse.jpg
image            foxborough/gillette.jpg
video            annarbor/bighouse.jpg

• Queues – Queue name: all messages for a single queue belong to the same partition

Queue      Message
jobs       Message1
jobs       Message2
workflow   Message1
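A minimal sketch of how the three abstractions above map data onto partitions. The helper functions are hypothetical illustrations, not part of any Azure SDK:

```python
# Hypothetical helpers showing the partition identity of each abstraction,
# per the table above. Not real SDK calls.

def table_partition(table_name: str, partition_key: str) -> str:
    # All entities sharing a PartitionKey live in one partition.
    return f"{table_name}|{partition_key}"

def blob_partition(container: str, blob_name: str) -> str:
    # A blob and its snapshots form a single partition.
    return f"{container}|{blob_name}"

def queue_partition(queue_name: str) -> str:
    # Every message in a queue shares one partition.
    return queue_name

# Both the Customer row and the Order rows for CustomerId "1"
# map to the same partition, so they can share a transaction:
assert table_partition("Customers", "1") == table_partition("Customers", "1")
```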
Replication Guarantee

• All Azure Storage data exists in three replicas
• Replicas are created as needed
• A write operation is not complete until it has been written to all three replicas
• Reads are only load balanced to replicas that are in sync

(Diagram: partitions P1, P2, …, Pn replicated across Server 1, Server 2, and Server 3.)
Scalability Targets

Storage account
• Capacity – up to 100 TB
• Transactions – up to a few thousand requests per second
• Bandwidth – up to a few hundred megabytes per second

Single queue/table partition
• Up to 500 transactions per second

Single blob partition
• Throughput up to 60 MB/s

To go above these numbers, partition between multiple storage accounts and partitions. When a limit is hit, the app will see "503 Server Busy"; applications should implement exponential back-off.
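The "503 Server Busy" advice can be sketched as a retry loop that doubles its delay after each busy response. `retry_with_backoff` and `do_request` are hypothetical names, not storage client APIs:

```python
# Sketch: exponential back-off when storage returns "503 Server Busy".
# do_request() stands in for one storage call returning (status, body).

def retry_with_backoff(do_request, max_attempts=5, base_delay=0.5):
    """Call do_request() until it stops returning 503, doubling the wait."""
    delay = base_delay
    for _ in range(max_attempts):
        status, body = do_request()
        if status != 503:
            return status, body
        # In real code: time.sleep(delay + random jitter); jitter keeps
        # many clients from retrying in lockstep.
        delay *= 2
    return 503, None
```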
Partitions and Partition Ranges

A single server can serve the entire table:

Server A: Table = Movies [Min – Max]

PartitionKey (Category)   RowKey (Title)             Timestamp   ReleaseDate
Action                    Fast & Furious             …           2009
Action                    The Bourne Ultimatum       …           2007
…                         …                          …           …
Animation                 Open Season 2              …           2009
Animation                 The Ant Bully              …           2006
…                         …                          …           …
Comedy                    Office Space               …           1999
…                         …                          …           …
SciFi                     X-Men Origins: Wolverine   …           2009
…                         …                          …           …
War                       Defiance                   …           2008

Under load, the table is split into partition ranges served by different servers:

Server A: Table = Movies [Min – Comedy)

PartitionKey (Category)   RowKey (Title)             Timestamp   ReleaseDate
Action                    Fast & Furious             …           2009
Action                    The Bourne Ultimatum       …           2007
…                         …                          …           …
Animation                 Open Season 2              …           2009
Animation                 The Ant Bully              …           2006

Server B: Table = Movies [Comedy – Max]

PartitionKey (Category)   RowKey (Title)             Timestamp   ReleaseDate
Comedy                    Office Space               …           1999
…                         …                          …           …
SciFi                     X-Men Origins: Wolverine   …           2009
…                         …                          …           …
War                       Defiance                   …           2008
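The range split above is easy to model: each server owns a half-open range of PartitionKey values, and a lookup walks the sorted split points. The server names are illustrative, not service behavior you can observe directly:

```python
# Sketch of range partitioning as in the Movies example. split_points lists
# the boundaries between ranges; servers[i] owns range i.
import bisect

split_points = ["Comedy"]           # [Min, "Comedy") -> Server A
servers = ["Server A", "Server B"]  # ["Comedy", Max) -> Server B

def server_for(partition_key: str) -> str:
    # bisect_right finds how many boundaries the key is at or past.
    return servers[bisect.bisect_right(split_points, partition_key)]

print(server_for("Action"))   # Server A
print(server_for("Comedy"))   # Server B (boundary key starts the new range)
print(server_for("War"))      # Server B
```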
Key Selection: Things to Consider

Scalability
• PartitionKey is critical for scalability
• Distribute load as much as possible
• Hot partitions can be load balanced

Query efficiency & speed
• Point queries are most efficient
• Parallelize queries
• Avoid frequent large scans

Entity group transactions
• Transactions across a single partition
• Transaction semantics & reduced round trips

See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information.
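One common way to "distribute load as much as possible" is to prefix the natural key with a short, stable hash bucket, so sequential inserts land on different partitions. The helper below is a hypothetical sketch, not an SDK utility:

```python
# Sketch: spread an "append only" key pattern (e.g. date-ordered keys)
# across partitions by prefixing a deterministic hash bucket.
import hashlib

def distributed_partition_key(natural_key: str, buckets: int = 16) -> str:
    """Return e.g. '07-2010-12-09' for natural_key '2010-12-09'."""
    digest = hashlib.md5(natural_key.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}-{natural_key}"
```

Range queries then have to fan out across the buckets, which is the trade-off this pattern accepts in exchange for write scalability.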
Expect Continuation Tokens – Seriously

A query returns a continuation token when it hits any of these limits:
• Maximum of 1,000 rows in a response
• The end of a partition range boundary
• Maximum of 5 seconds to execute the query
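Every table client therefore needs a paging loop that keeps issuing the query until the service stops returning a token. `query_page` below is a hypothetical stand-in for the REST call:

```python
# Sketch of the continuation-token paging loop.
# query_page(token) -> (rows, next_token or None) stands in for one
# table query round trip.

def fetch_all(query_page):
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:       # no continuation token: result set complete
            return rows
```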
Tables Recap

• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

Guidance:
• Select a PartitionKey and RowKey that help scale – distribute load by using a hash etc. as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens – expect them for range queries
• "OR" predicates are not optimized – execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries – "server busy" means the load on a single partition has exceeded the limits; load balance partitions to meet traffic needs
WCF Data Services

• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications

• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling can aid scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly used with the work ticket pattern
• Why not simply use a table?
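The work ticket pattern keeps queue messages small: the message is just a "ticket" pointing at the real work item stored in a blob. The sketch below uses in-memory stand-ins for the blob and queue services; the function names are illustrative:

```python
# Work ticket pattern: enqueue a small reference, keep the large payload
# in blob storage, and have the worker dereference the ticket.
from collections import deque

blob_store = {}     # stand-in for blob storage
queue = deque()     # stand-in for an Azure queue

def submit_work(job_id: str, payload: bytes) -> None:
    blob_store[job_id] = payload    # large payload goes to blob storage
    queue.append(job_id)            # only the small ticket is enqueued

def worker_step() -> bytes:
    job_id = queue.popleft()        # GetMessage
    payload = blob_store[job_id]    # fetch the real work via the ticket
    result = payload.upper()        # ... process ...
    del blob_store[job_id]          # garbage-collect the orphaned blob
    return result                   # DeleteMessage would follow here
```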
Queue Terminology
Message Lifecycle

(Diagram: a Web Role calls PutMessage to add Msg 1-4 to a queue; Worker Roles call GetMessage with a visibility timeout to receive a message, and RemoveMessage to delete it once processed.)
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling

Consider a back-off polling approach: each empty poll doubles the polling interval (truncated at a maximum), and a successful poll resets the interval back to 1.
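The policy above reduces to one small function; the 60-second cap is an illustrative choice, not a service constant:

```python
# Truncated exponential back-off for queue polling: double the interval on
# each empty poll, cap it, and reset to 1 on success. Intervals in seconds.

MAX_INTERVAL = 60   # illustrative cap

def next_interval(interval: int, got_message: bool) -> int:
    if got_message:
        return 1                               # reset on a successful poll
    return min(interval * 2, MAX_INTERVAL)     # double, truncated at the cap
```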
Removing Poison Messages

(Diagram: producers P1 and P2 put messages on queue Q; consumers C1 and C2 poll it.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)

(Diagram: producers P1, P2; consumers C1, C2; queue Q.)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)

(Diagram: producers P1, P2; consumers C1, C2; queue Q.)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
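Steps 11-13 above are the poison-message check: once a message's dequeue count crosses a threshold, delete it instead of reprocessing. A minimal in-memory sketch (the queue and poison store are stand-ins for the real services):

```python
# Poison-message removal via a dequeue-count threshold, mirroring
# steps 11-13 of the scenario above.
from collections import deque

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def get_message(queue, poison, threshold=2):
    """Pop the next message; divert it once DequeueCount exceeds threshold."""
    while queue:
        msg = queue.popleft()
        msg.dequeue_count += 1
        if msg.dequeue_count > threshold:
            poison.append(msg)      # Delete(Q, msg) in the real service;
            continue                # keep a copy aside for inspection
        return msg
    return None
```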
Queues Recap

• Make message processing idempotent – no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library.

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?

• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency

• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
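The deck's examples assume .NET's Task Parallel Library; an analogous data-parallelism sketch in Python, using only the standard library, looks like this:

```python
# Data parallelism: map a per-item function over a collection using a
# pool of worker threads, keeping all of the VM's cores busy.
from concurrent.futures import ThreadPoolExecutor

def process(item: int) -> int:
    return item * item          # stand-in for real per-item work

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, range(8)))   # fan out, collect in order

print(results)   # [0, 1, 4, 9, 16, 25, 36, 49]
```

For CPU-bound Python work a process pool would be the better fit; the thread pool here keeps the sketch minimal.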
Finding Good Code Neighbors

• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately

• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure/poor user experience from not having excess capacity and the cost of idling VMs (performance vs. cost)
Storage Costs

• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs

• Bandwidth costs are a huge part of any popular web app's billing profile
• Saving bandwidth costs often leads to savings in other places:
  • Sending fewer things over the wire often means getting fewer things from storage
  • Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content

1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content)
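Point 2, trading compute for bytes on the wire, is easy to see with the standard library's gzip module on a repetitive HTML-like payload:

```python
# Gzip a repetitive payload and compare sizes; markup compresses very well.
import gzip

payload = b"<div class='row'>hello azure</div>" * 200
compressed = gzip.compress(payload)

print(len(payload), len(compressed))        # compressed is far smaller
assert gzip.decompress(compressed) == payload   # lossless round trip
```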
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input – segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST) – needs special result-reduction processing

Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
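Input segmentation, the pleasingly parallel option above, amounts to chopping a FASTA file into fixed-size partitions of sequences, each becoming one task. The sketch below uses deliberately minimal FASTA parsing; the 100-sequences-per-partition default follows the micro-benchmark result cited later in the deck:

```python
# Query segmentation: split FASTA input into partitions of at most
# per_partition sequences, one partition per parallel BLAST task.

def split_queries(fasta_text: str, per_partition: int = 100):
    sequences, current = [], []
    for line in fasta_text.splitlines():
        if line.startswith(">"):            # FASTA header starts a new record
            if current:
                sequences.append("\n".join(current))
            current = [line]
        elif line.strip():
            current.append(line)
    if current:
        sequences.append("\n".join(current))
    # Chop the sequence list into partitions of at most per_partition records.
    return ["\n".join(sequences[i:i + per_partition])
            for i in range(0, len(sequences), per_partition)]
```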
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow

A simple split/join pattern: a splitting task fans out into many BLAST tasks, and a merging task joins their results.

• Leverage the multiple cores of one instance
  • Argument "-a" of NCBI-BLAST
  • Set to 1/2/4/8 for the small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partitions → load imbalance
  • Small partitions → unnecessary overheads (NCBI-BLAST startup overhead, data transfer overhead)
  • Best practice: profile with test runs and set the partition size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small → repeated computation
  • Too large → an unnecessarily long wait in case of instance failure
Micro-Benchmarks Inform Design

• Task size vs. performance
  • Benefit of the warm cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger worker instances
  • Primarily due to the memory capability
• Task size / instance size vs. cost
  • Extra-large instances generated the best and most economical throughput
  • Fully utilize the resource
AzureBLAST (2)

(Architecture diagram: a Web Role hosts the web portal and web service for job registration; a Job Management Role runs the job scheduler, scaling engine, and database-updating role; worker roles pull splitting, BLAST, and merging tasks from a global dispatch queue; Azure Tables hold the job registry, and Azure Blobs hold the NCBI databases, BLAST databases, temporary data, etc.)
AzureBLAST Job Portal

• An ASP.NET program hosted by a web role instance
  • Submit jobs
  • Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
  • Fault tolerance: avoid in-memory states

(Diagram: the job portal's web service handles job registration into the job registry; the job scheduler and scaling engine consume registered jobs.)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

• Blasted ~5,000 proteins (700K sequences)
  • Against all NCBI non-redundant proteins: completed in 30 min
  • Against ~5,000 proteins from another strain: completed in less than 30 sec
• AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs

• Discover the interrelationships of known protein sequences
• An "all against all" query
  • The database is also the input query
  • The protein database is large (42 GB)
  • In total, 9,865,668 sequences to be queried
  • Theoretically, 100 billion sequence comparisons
• Performance estimation
  • Based on sample runs on one extra-large Azure instance
  • Would require 3,216,731 minutes (6.1 years) on one desktop
• Experiments at this scale are usually infeasible for most scientists
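As a sanity check on the figure above, converting 3,216,731 minutes of serial compute into years:

```python
# Quick arithmetic check of the desktop estimate (365-day years).
minutes = 3_216_731
years = minutes / (60 * 24 * 365)
print(round(years, 1))   # 6.1
```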
Our Approach

• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load is imbalanced, redistribute it manually

(Map: instance counts per deployment across the datacenters – 50, 62, 62, 62, 62, 62, 50, 62.)
End Result

• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • But based on our estimates, the real working instance time should be 6-8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs

A normal log record should be:

  3/31/2010 6:14  RD00155D3611B0  Executing the task 251523
  3/31/2010 6:25  RD00155D3611B0  Execution of task 251523 is done, it took 10.9 mins
  3/31/2010 6:25  RD00155D3611B0  Executing the task 251553
  3/31/2010 6:44  RD00155D3611B0  Execution of task 251553 is done, it took 19.3 mins
  3/31/2010 6:44  RD00155D3611B0  Executing the task 251600
  3/31/2010 7:02  RD00155D3611B0  Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

  3/31/2010 8:22  RD00155D3611B0  Executing the task 251774
  3/31/2010 9:50  RD00155D3611B0  Executing the task 251895
  3/31/2010 11:12 RD00155D3611B0  Execution of task 251895 is done, it took 82 mins
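The "something is wrong" case can be found mechanically by pairing each "Executing the task N" line with its "Execution of task N is done" line; unpaired tasks never completed. The regexes below assume the log format shown above:

```python
# Find tasks that started but never logged completion.
import re

def unfinished_tasks(log_lines):
    started, done = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return started - done
```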
Surviving System Upgrades

North Europe datacenter: in total, 34,256 tasks processed
• All 62 compute nodes lost tasks and then came back in groups – this is an update domain
• ~30 mins per group
• ~6 nodes in one group

Surviving Storage Failures

West Europe datacenter: 30,976 tasks were completed, and the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

    ET = [Δ·Rn + ρa·cp·(δq)·ga] / [(Δ + γ·(1 + ga/gs))·λv]

where
  ET = water volume evapotranspired (m3 s-1 m-2)
  Δ  = rate of change of saturation specific humidity with air temperature (Pa K-1)
  λv = latent heat of vaporization (J/g)
  Rn = net radiation (W m-2)
  cp = specific heat capacity of air (J kg-1 K-1)
  ρa = dry air density (kg m-3)
  δq = vapor pressure deficit (Pa)
  ga = conductivity of air (inverse of ra) (m s-1)
  gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
  γ  = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
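Evaluating the Penman-Monteith expression itself is a one-liner once the inputs are in hand; the hard part, as the slide notes, is estimating the conductivities. The input values below are made-up but physically plausible placeholders, not numbers from the MODISAzure pipeline:

```python
# Direct evaluation of the Penman-Monteith expression above.

def penman_monteith(delta, R_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """ET = [Δ·Rn + ρa·cp·(δq)·ga] / [(Δ + γ·(1 + ga/gs))·λv]"""
    numerator = delta * R_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative inputs only (midday over a vegetated surface, roughly):
et = penman_monteith(delta=145.0, R_n=400.0, rho_a=1.2, c_p=1005.0,
                     dq=1000.0, g_a=0.02, g_s=0.01)
print(et)
```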
ET Synthesizes Imagery, Sensors, Models, and Field Data

• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

1. Data collection (map) stage
   • Downloads requested input tiles from NASA FTP sites
   • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
   • Converts source tile(s) to intermediate-result sinusoidal tiles
   • Simple nearest neighbor or spline algorithms
3. Derivation reduction stage
   • First stage visible to scientists
   • Computes ET in our initial use
4. Analysis reduction stage
   • Optional second stage visible to scientists
   • Enables production of science analysis artifacts such as maps, tables, and virtual sensors

(Diagram: scientists submit requests through the AzureMODIS Service web role portal; a request queue feeds the data collection stage, which pulls from source imagery download sites; download, reprojection, reduction 1, and reduction 2 queues chain the stages, using source metadata; scientists download the science results.)

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

(Diagram: <PipelineStage> requests enter the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role. The Generic Worker (Worker Role):
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

(Diagram: the Service Monitor parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue, from which Generic Workers dequeue tasks and read <Input> data storage.)
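The Generic Worker's dequeue-and-retry behavior can be sketched as a small loop; the queue and status table below are in-memory stand-ins, and the function names are illustrative:

```python
# Generic Worker loop: dequeue a task, run it, and re-enqueue a failing
# task until it has been attempted 3 times, then mark it failed.
from collections import deque

MAX_ATTEMPTS = 3

def worker_loop(task_queue: deque, run_task, task_status: dict):
    while task_queue:
        task_id, attempts = task_queue.popleft()
        try:
            run_task(task_id)
            task_status[task_id] = "done"
        except Exception:
            if attempts + 1 < MAX_ATTEMPTS:
                task_queue.append((task_id, attempts + 1))  # retry later
            else:
                task_status[task_id] = "failed"
```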
Example Pipeline Stage: Reprojection Service

(Diagram: a reprojection request enters the job queue; the Service Monitor persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the task queue; Generic Workers consume tasks and read the swath source data storage.)

• Each ReprojectionJobStatus entity specifies a single reprojection job request
• Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get the geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation

• Computational costs are driven by the data scale and the need to run the reduction multiple times
• Storage costs are driven by the data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3,500 hours, 20-100 workers – $420 CPU, $60 download
Derivation reduction stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience

• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure

• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net

• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Replication Guarantee
bull All Azure Storage data exists in three replicasbull Replicas are created as neededbull A write operation is not complete until it has
written to all three replicasbull Reads are only load balanced to replicas in
syncServer 1 Server 2 Server 3
P1
P2
Pn
P1
P2
Pn
P1
P2
Pn
Scalability TargetsStorage Account
bull Capacity ndash Up to 100 TBsbull Transactions ndash Up to a few thousand requests per secondbull Bandwidth ndash Up to a few hundred megabytes per second
Single QueueTable Partition
bull Up to 500 transactions per second
To go above these numbers partition between multiple storage accounts and partitions
When limit is hit app will see lsquo503 server busyrsquo applications should implement exponential backoff
Single Blob Partition
bull Throughput up to 60 MBs
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
PartitionKey(Category)
RowKey(Title)
Timestamp ReleaseDate
Action Fast amp Furious hellip 2009
Action The Bourne Ultimatum hellip 2007
hellip hellip hellip hellip
Animation Open Season 2 hellip 2009
Animation The Ant Bully hellip 2006
hellip hellip hellip hellip
Comedy Office Space hellip 1999
hellip hellip hellip hellip
SciFi X-Men Origins Wolverine hellip 2009
hellip hellip hellip hellip
War Defiance hellip 2008
Partitions and Partition Ranges
Server BTable = Movies[Comedy - Max]
Server ATable = Movies[Min - Comedy)
Server ATable = Movies
[Min - Max]
Key Selection Things to Consider
bullDistribute load as much as possiblebullHot partitions can be load balancedbullPartitionKey is critical for scalability
See httpwwwmicrosoftpdccom2009SVC09 and httpazurescopecloudappnet for more information
bull Avoid frequent large scansbull Parallelize queriesbull Point queries are most efficient
bullTransactions across a single partitionbullTransaction semantics amp Reduce round trips
Scalability
Query Efficiency amp Speed
Entity group transactions
Expect Continuation Tokens ndash Seriously
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 1000 rows in a response
At the end of partition range boundary
Maximum of 5 seconds to execute the query
Tables Recapbull Efficient for frequently used queriesbull Supports batch transactionsbull Distributes load
Select PartitionKey and RowKey that help scale
Avoid ldquoAppend onlyrdquo patterns
Always Handlecontinuation tokens
ldquoORrdquo predicates are not optimized
Implement back-offstrategy for retries
bull Distribute by using a hash etc as prefix
bull Expect continuation tokens for range queries
bull Execute the queries that form the ldquoORrdquo predicates as separate queries
bull Server busybull Load balance partitions to meet traffic needsbull Load on single partition has exceeded the limits
WCF Data Services
bull Use a new context for each logical operationbull AddObjectAttachTo can throw exception if entity is already being tracked
bull Point query throws an exception if resource does not exist Use IgnoreResourceNotFoundException
QueuesTheir Unique Role in Building Reliable Scalable Applicationsbull Want roles that work closely together but are not
bound togetherbull Tight coupling leads to brittlenessbull This can aid in scaling and performance
bull A queue can hold an unlimited number of messagesbull Messages must be serializable as XMLbull Limited to 8KB in sizebull Commonly use the work ticket pattern
bull Why not simply use a table
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST httpmyaccountqueuecorewindowsnetmyqueuemessages
HTTP11 200 OK Transfer-Encoding chunked Content-Type applicationxml Date Tue 09 Dec 2008 210430 GMT Server Nephos Queue Service Version 10 Microsoft-HTTPAPI20
ltxml version=10 encoding=utf-8gt ltQueueMessagesListgt ltQueueMessagegt ltMessageIdgt5974b586-0df3-4e2d-ad0c-18e3892bfca2ltMessageIdgt ltInsertionTimegtMon 22 Sep 2008 232920 GMTltInsertionTimegt ltExpirationTimegtMon 29 Sep 2008 232920 GMTltExpirationTimegt ltPopReceiptgtYzQ4Yzg1MDIGM0MDFiZDAwYzEwltPopReceiptgt ltTimeNextVisiblegtTue 23 Sep 2008 052920GMTltTimeNextVisiblegt ltMessageTextgtPHRlc3Q+dGdGVzdD4=ltMessageTextgt ltQueueMessagegt ltQueueMessagesListgt
DELETEhttpmyaccountqueuecorewindowsnetmyqueuemessagesmessageidpopreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach Each empty poll
increases interval by 2x
A successful sets the interval back to 1
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages (3)
1. C1: Dequeue(Q, 30 sec) → msg 1
2. C2: Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
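The check in steps 12-13, inspecting the dequeue count before processing, is the core of the poison-message pattern. A minimal sketch (threshold and names are illustrative, not an Azure API):

```python
POISON_THRESHOLD = 3  # arbitrary threshold; tune per workload

def handle(msg, process, dead_letter):
    """Consume one message; divert it if its dequeue count shows it has
    already defeated earlier consumers (the poison-message pattern)."""
    if msg["dequeue_count"] > POISON_THRESHOLD:
        # In a real service, persist to a blob or table for later inspection.
        dead_letter.append(msg)
        return "poisoned"
    process(msg)
    return "processed"
```

Without this guard, a message that crashes every consumer would cycle through the queue forever, each reappearance taking down another worker.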
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
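The size trade-off reduces to simple arithmetic when, as was roughly true of Azure's rate card, an 8-core VM costs the same per hour as eight 1-core VMs. A sketch under that assumption (prices and function names hypothetical):

```python
def pick_vm_strategy(scaling_efficiency_8core):
    """Compare one 8-core VM against eight 1-core VMs (illustrative only).

    Assumes per-core-linear pricing, so both options cost the same per hour
    and throughput per dollar is decided purely by how well the workload
    scales across the 8 cores of the larger VM.

    scaling_efficiency_8core: achieved speedup on 8 cores divided by the
    ideal linear speedup (1.0 = perfectly linear, >1.0 = super-linear)."""
    throughput_large = 8 * scaling_efficiency_8core  # one big VM
    throughput_small = 8 * 1.0                       # eight independent VMs
    return "larger VM" if throughput_large > throughput_small else "more small VMs"
```

Under these assumptions, the larger VM only wins on cost when scaling is better than linear, which matches the slide's observation that this is rare across 8 cores.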
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
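The data-parallel style the Task Parallel Library encourages looks much the same in other languages. A Python sketch using the standard library (for CPU-bound Python work you would swap in ProcessPoolExecutor; the worker function here is a stand-in):

```python
from concurrent.futures import ThreadPoolExecutor
import os

def process_item(x):
    return x * x  # stand-in for real per-item work

def run_data_parallel(items):
    """Data parallelism: apply the same operation to many items, sizing the
    pool to the core count so active workers don't exceed the cores."""
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map preserves input order in the results
        return list(pool.map(process_item, items))
```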
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off: the risk of failure or poor user experience from not having excess capacity vs. the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Saving bandwidth costs often leads to savings in other places:
• Sending fewer things over the wire often means getting fewer things from storage
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content]
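Gzipping output is a one-liner in most stacks. A Python sketch using the standard library (the function name is invented; in a real web app the framework usually sets the Content-Encoding header for you):

```python
import gzip

def gzip_payload(text: str) -> bytes:
    """Gzip a response body; browsers decompress transparently when the
    response carries 'Content-Encoding: gzip'. Level 6 is a common balance
    of compute cost vs. size."""
    return gzip.compress(text.encode("utf-8"), compresslevel=6)
```

On repetitive text (HTML, JSON, JavaScript) the compressed payload is typically a small fraction of the original, which saves both bandwidth and storage.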
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700-1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input; segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST); needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out into many BLAST tasks, followed by a merging task.
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure
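The split/join pattern itself is tiny. A toy sketch (function names invented; a partition size of 100 sequences is used as a plausible default):

```python
def split_query(sequences, partition_size=100):
    """Split step: partition the input sequences into fixed-size chunks;
    each chunk becomes one queued BLAST task."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(per_partition_results):
    """Join step: concatenate per-partition hit lists in partition order."""
    merged = []
    for hits in per_partition_results:
        merged.extend(hits)
    return merged
```

All the real engineering lives around this skeleton: queueing the chunks, surviving worker failures via the visibility timeout, and only running the merge once every partition has reported in.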
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
[Architecture diagram:
• Web Role: web portal, web service, job registration
• Job Management Role: job scheduler, scaling engine, global dispatch queue
• Worker roles: execute the splitting task, the BLAST tasks, and the merging task
• Database updating role
• Azure Table: job registry
• Azure Blob: NCBI databases, BLAST databases, temporary data, etc.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• The full experiment would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
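As a quick sanity check on that estimate, converting the sampled minute count into years of serial compute:

```python
minutes = 3_216_731                  # sampled single-desktop estimate from the text
years = minutes / (60 * 24 * 365)    # minutes -> years (525,600 min per year)
# roughly 6.1 years of serial compute for the full all-against-all run
```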
Our Approach
• Allocated a total of ~4,000 compute cores:
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service (roughly 50-62 extra-large instances per deployment)
• Divided the 10 million sequences into multiple segments:
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place
Understanding Azure by Analyzing Logs
A normal log record pair should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups (~6 nodes per group, over ~30 mins). This is an update domain at work.
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed and the job was killed.
35 nodes experienced blob-writing failures at the same time; a reasonable guess is that the Fault Domain was at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J g-1)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
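Given the inputs, the Penman-Monteith computation itself is a one-liner; the hard part is estimating the conductivities. A sketch of the slide's formulation, with γ ≈ 66 Pa/K and λv ≈ 2450 J/g used only as illustrative defaults:

```python
def penman_monteith_et(delta, R_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET following the slide's formulation.

    delta: d(saturation specific humidity)/dT (Pa/K); R_n: net radiation
    (W/m^2); rho_a: dry air density (kg/m^3); c_p: specific heat of air
    (J/kg/K); dq: vapor pressure deficit (Pa); g_a, g_s: conductivities
    of air and plant stomata (m/s). Defaults are typical textbook values."""
    numerator = delta * R_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```

The formula behaves as expected: more net radiation or a larger vapor pressure deficit raises ET, while closing stomata (smaller g_s) lowers it.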
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; work flows through the request, download, reprojection, reduction 1, and reduction 2 queues across the data collection, reprojection, derivation reduction, and analysis reduction stages, drawing on source imagery download sites and source metadata, with scientific results available for download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door:
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role:
• Parses all job requests into tasks, the recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role.
• The GenericWorker (Worker Role):
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; GenericWorker instances dequeue tasks and read/write <Input> data storage.]
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request is persisted as ReprojectionJobStatus; the Service Monitor (Worker Role) parses it into ReprojectionTaskStatus entities and dispatches work through the job and task queues to GenericWorker (Worker Role) instances.
• Each job-queue entity specifies a single reprojection job request
• Each task-queue entity specifies a single reprojection task (i.e., a single tile)
• The SwathGranuleMeta table is queried for geo-metadata (e.g., boundaries) for each swath tile
• The ScanTimeList table is queried for the list of satellite scan times that cover a target tile
• Workers read from swath source data storage and write to reprojection data storage]
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction stages multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage-by-stage (data, time, workers; cost):
• Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3,500 hours, 20-100 workers; $420 CPU, $60 download
• Derivation reduction: 5-7 GB, 55K files, 1,800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1,800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: "Channel 9 Windows Azure"
Bing: "Windows Azure Platform Training Kit - November Update"
http://research.microsoft.com/azure
xcgngage@microsoft.com
QueuesTheir Unique Role in Building Reliable Scalable Applicationsbull Want roles that work closely together but are not
bound togetherbull Tight coupling leads to brittlenessbull This can aid in scaling and performance
bull A queue can hold an unlimited number of messagesbull Messages must be serializable as XMLbull Limited to 8KB in sizebull Commonly use the work ticket pattern
bull Why not simply use a table
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST httpmyaccountqueuecorewindowsnetmyqueuemessages
HTTP11 200 OK Transfer-Encoding chunked Content-Type applicationxml Date Tue 09 Dec 2008 210430 GMT Server Nephos Queue Service Version 10 Microsoft-HTTPAPI20
ltxml version=10 encoding=utf-8gt ltQueueMessagesListgt ltQueueMessagegt ltMessageIdgt5974b586-0df3-4e2d-ad0c-18e3892bfca2ltMessageIdgt ltInsertionTimegtMon 22 Sep 2008 232920 GMTltInsertionTimegt ltExpirationTimegtMon 29 Sep 2008 232920 GMTltExpirationTimegt ltPopReceiptgtYzQ4Yzg1MDIGM0MDFiZDAwYzEwltPopReceiptgt ltTimeNextVisiblegtTue 23 Sep 2008 052920GMTltTimeNextVisiblegt ltMessageTextgtPHRlc3Q+dGdGVzdD4=ltMessageTextgt ltQueueMessagegt ltQueueMessagesListgt
DELETEhttpmyaccountqueuecorewindowsnetmyqueuemessagesmessageidpopreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach Each empty poll
increases interval by 2x
A successful sets the interval back to 1
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: profile with test runs and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait before another worker picks up the task after an instance failure
[Diagram: Splitting task → BLAST tasks (in parallel) → Merging task]
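The split/join flow above can be sketched as follows; `blast_task` is a hypothetical stand-in for invoking NCBI-BLAST on one partition, not the AzureBLAST code:

```python
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size):
    """Splitting task: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    """Hypothetical stand-in for running NCBI-BLAST on one partition."""
    return [f"hit:{seq}" for seq in partition]

def run_query(sequences, partition_size=100):
    """Split/join: fan partitions out to workers, then merge in order."""
    merged = []
    with ThreadPoolExecutor() as pool:
        for result in pool.map(blast_task, split(sequences, partition_size)):
            merged.extend(result)   # map preserves input order
    return merged
```

In AzureBLAST the fan-out happens through a queue and worker instances rather than an in-process pool, but the shape is the same.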
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine against a Job Registry kept in Azure Tables; a global dispatch queue feeds the Worker roles; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a Database Updating Role keeps the databases fresh; a Splitting task fans out to parallel BLAST tasks whose outputs are combined by a Merging task]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table (fault tolerance: avoid in-memory state)
[Diagram: the Job Portal's Web Portal and Web Service pass job registrations to the Job Scheduler, which records them in the Job Registry; the Scaling Engine adjusts workers]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (42 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 cores: 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST, each with its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load becomes imbalanced, redistribute it manually
[Map: VM allocation per deployment: 50, 62, 62, 62, 62, 62, 50, 62]
End Result
• Total size of the output is ~230 GB
• 1,764,579,487 total hits
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g. the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
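A log scan of this kind is straightforward to script; a hedged sketch assuming the line format shown above:

```python
import re

EXEC = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")

def unfinished_tasks(log_lines):
    """Return the ids of tasks that were started but never reported done."""
    started, finished = set(), set()
    for line in log_lines:
        m = EXEC.search(line)
        if m:
            started.add(m.group(1))
        m = DONE.search(line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)
```

Tasks that appear in the result either failed, were re-queued after a visibility timeout, or ran on a node that was taken down by an upgrade.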
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
• All 62 compute nodes lost tasks and then came back in a group: this is an update domain
• ~30 mins per group, ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed before the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air, inverse of ra (m s-1)
gs = conductivity of plant stoma air, inverse of rs (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

Penman-Monteith (1964)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.
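As a sanity check of the formula, a small Python transcription of Penman-Monteith using the variable list above (the default values for γ and λv are illustrative, not taken from the deck):

```python
def penman_monteith(delta, rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith ET; arguments follow the variable list above.

    gamma defaults to ~66 Pa/K and lambda_v to ~2450 J/g, both
    illustrative round numbers for near-surface conditions.
    """
    numerator = delta * rn + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator
```

Plugging in plausible magnitudes confirms the expected behavior: ET grows with net radiation Rn and shrinks as stomatal conductivity gs closes down.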
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Download Queue of the Data Collection Stage, which pulls from Source Imagery Download Sites using Source Metadata; the Reprojection Queue feeds the Reprojection Stage; the Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; science results are available for Scientific Results Download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, the recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and places the job on the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches work to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Generic Worker (Worker Role)
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Worker (Worker Role) instances dequeue tasks and read <Input>Data Storage]
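The dequeue-and-retry behavior can be sketched generically (a simplified stand-in, not the MODISAzure code; `handler` is whatever work the task requires):

```python
def process_with_retries(task, handler, max_attempts=3):
    """Try a task up to max_attempts times, then report it as failed.

    Mirrors the Generic Worker policy above: transient failures are
    retried, persistent failures are surfaced in the task status.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return ("done", handler(task), attempt)
        except Exception:
            continue  # transient failure: try again
    return ("failed", None, max_attempts)
```

In the real system the retry count and final status would be persisted to the task status table so a job is never silently lost.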
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request reaches the Service Monitor (Worker Role), which persists ReprojectionJobStatus (each entity specifies a single reprojection job request) and parses and persists ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e. a single tile); tasks are dispatched through the Job Queue and Task Queue to Generic Worker (Worker Role) instances; workers query the SwathGranuleMeta table for geo-metadata (e.g. boundaries) for each swath tile and the ScanTimeList table for the list of satellite scan times that cover a target tile, then read Swath Source Data Storage and write Reprojection Data Storage]
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction stages multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload + $450 storage
Reprojection Stage: 400 GB, 45K files, 3,500 CPU hours, 20-100 workers: $420 CPU + $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1,800 CPU hours, 20-100 workers: $216 CPU + $1 download + $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1,800 CPU hours, 20-100 workers: $216 CPU + $2 download + $9 storage

Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press / Programming Windows Azure, O'Reilly Press / Bing: Channel 9 Windows Azure / Bing: Windows Azure Platform Training Kit (November Update) / http://research.microsoft.com/azure / xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Partitions and Partition Ranges

A single server can hold the entire table:

Server A: Table = Movies [Min - Max]
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006
…                       | …                        | …         | …
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008

Or the partition range can be split across two servers:

Server A: Table = Movies [Min - Comedy)
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Action                  | Fast & Furious           | …         | 2009
Action                  | The Bourne Ultimatum     | …         | 2007
…                       | …                        | …         | …
Animation               | Open Season 2            | …         | 2009
Animation               | The Ant Bully            | …         | 2006

Server B: Table = Movies [Comedy - Max]
PartitionKey (Category) | RowKey (Title)           | Timestamp | ReleaseDate
Comedy                  | Office Space             | …         | 1999
…                       | …                        | …         | …
SciFi                   | X-Men Origins: Wolverine | …         | 2009
…                       | …                        | …         | …
War                     | Defiance                 | …         | 2008
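Range partitioning of this sort can be sketched with a sorted boundary list (illustrative only, not the actual Azure partition master):

```python
import bisect

def partition_server(key, boundaries, servers):
    """Map a PartitionKey to the server owning its range.

    boundaries holds the lowest key of each server after the first:
    ["Comedy"] splits [Min, Max] into [Min, "Comedy") on servers[0]
    and ["Comedy", Max] on servers[1], matching the tables above.
    """
    return servers[bisect.bisect_right(boundaries, key)]
```

Rebalancing a hot range is then just inserting a new boundary and assigning the new sub-range to another server.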
Key Selection: Things to Consider
Scalability
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
Query Efficiency & Speed
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
Entity group transactions
• Transactions across a single partition
• Transaction semantics; reduce round trips
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
Expect Continuation Tokens – Seriously
A query returns a continuation token when it hits:
• A maximum of 1,000 rows in a response
• The end of a partition range boundary
• A maximum of 5 seconds of query execution time
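A drain loop that honors continuation tokens might look like this sketch; `query_page` is a hypothetical stand-in for the real table client:

```python
def query_all(query_page):
    """Drain a table query by following continuation tokens.

    query_page(token) -> (rows, next_token); next_token is None on
    the last page, mirroring the 1,000-row page limit above.
    """
    rows, token = [], None
    while True:
        page, token = query_page(token)
        rows.extend(page)
        if token is None:
            return rows
```

The point of the slide is that this loop is mandatory: even a small result set can return a token at a partition range boundary.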
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load

• Select PartitionKey and RowKey that help scale
• Avoid "append only" patterns: distribute by using a hash etc. as a prefix
• Always handle continuation tokens: expect them for range queries
• "OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries: "server busy" means the load on a single partition has exceeded the limits and partitions are being load balanced to meet traffic needs

WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• You want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling through queues can aid scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
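The work-ticket pattern mentioned above can be sketched with in-memory stand-ins for the blob and queue services (names hypothetical):

```python
import uuid

blob_store = {}   # stand-in for the blob service
queue = []        # stand-in for the queue service

def enqueue_work(payload: bytes):
    """Park the (possibly >8 KB) payload in blob storage and enqueue
    only a small ticket that references it."""
    blob_name = str(uuid.uuid4())
    blob_store[blob_name] = payload
    queue.append({"blob": blob_name})        # the ticket stays tiny

def dequeue_work() -> bytes:
    """Worker side: pop a ticket, fetch and release the payload."""
    ticket = queue.pop(0)
    return blob_store.pop(ticket["blob"])    # garbage-collect the blob
```

This keeps every queue message well under the 8 KB limit regardless of payload size.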
Queue Terminology

Message Lifecycle
[Diagram: a Web Role calls PutMessage to place Msg 1-4 on the Queue; Worker Roles call GetMessage (with a visibility timeout) to receive messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the polling interval by 2x
• A successful poll resets the interval back to 1
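The polling policy fits in one function; a sketch (interval units are arbitrary, and the ceiling of 60 is an illustrative cap):

```python
def next_interval(current, got_message, floor=1, ceiling=60):
    """Truncated exponential back-off: double on an empty poll,
    reset to the floor on a successful one, cap at the ceiling."""
    if got_message:
        return floor
    return min(current * 2, ceiling)
```

Without the cap, an idle consumer would eventually poll so rarely that it reacts far too slowly when work arrives; without the reset, a busy queue would still be polled at the backed-off rate.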
Removing Poison Messages
[Diagrams: producers P1 and P2 feed a queue consumed by C1 and C2; each message carries a dequeue count]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after its dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after its dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
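A consumer-side guard based on the dequeue count, as a sketch (in the real service, DequeueCount is maintained by the queue; the dict here is a stand-in for a message):

```python
def handle(msg, handler, max_dequeue=3):
    """Poison-message guard: if a message has already been dequeued
    max_dequeue times without being deleted, drop it rather than
    letting it crash yet another consumer."""
    msg["dequeue_count"] += 1
    if msg["dequeue_count"] > max_dequeue:
        return "deleted-as-poison"
    handler(msg)
    return "processed"
```

A production variant would also copy the poison message to a dead-letter store for later inspection instead of discarding it outright.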
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages larger than 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• The fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• A common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
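The deck recommends .NET's Task Parallel Library; the same data-parallel shape in Python, as an illustrative sketch (a pool sized to the core count rather than one worker per item):

```python
from concurrent.futures import ThreadPoolExecutor
import os

def parallel_map(fn, items):
    """Data parallelism: apply fn to every item using a pool sized
    to the machine's core count, mirroring the TPL guidance above."""
    workers = os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fn, items))
```

For CPU-bound Python work a process pool would be the better fit; the thread pool here just illustrates the pattern of matching concurrency to cores.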
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• The trade-off is between the risk of failure or poor user experience from not having excess capacity, and the cost of having idling VMs: performance vs. cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
… <PipelineStage>JobStatus
Persist <PipelineStage>Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage>Task Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
…
Dispatch <PipelineStage>Task Queue
…
<Input>Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
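The GenericWorker's dequeue-and-retry behavior described above can be modeled simply (illustrative Python, not the MODISAzure source; the 3-retry limit follows the slide, everything else is a hypothetical stand-in):

```python
MAX_RETRIES = 3

def run_task(task, attempts, worker):
    """Try a task up to MAX_RETRIES times, recording status per attempt."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            result = worker(task)
            attempts.append((task, attempt, "ok"))
            return result
        except Exception:
            attempts.append((task, attempt, "failed"))
    return None  # task remains marked failed after MAX_RETRIES tries

calls = {"n": 0}
def flaky(task):
    """Simulated worker that fails twice, then succeeds."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient failure")
    return f"{task} done"

history = []
result = run_task("reproject tile 42", history, flaky)
```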
Example Pipeline Stage Reprojection Service
Reprojection Request …
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
…
Job Queue
…
Dispatch
Task Queue
Points to
…
ScanTimeList
SwathGranuleMeta / Reprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
Data collection: 400-500 GB, 60K files; 10 MB/sec, 11 hours, <10 workers; $50 upload + $450 storage
Reprojection: 400 GB, 45K files; 3,500 CPU hours, 20-100 workers; $420 CPU + $60 download
Derivation reduction: 5-7 GB, 55K files; 1,800 CPU hours, 20-100 workers; $216 CPU + $1 download + $6 storage
Analysis reduction: <10 GB, ~1K files; 1,800 CPU hours, 20-100 workers; $216 CPU + $2 download + $9 storage
AzureMODIS Service Web Role Portal
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press; Programming Windows Azure, O'Reilly Press; Bing: Channel 9 Windows Azure; Bing: Windows Azure Platform Training Kit – November Update; http://research.microsoft.com/azure; xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Key Selection Things to Consider
• Distribute load as much as possible
• Hot partitions can be load balanced
• PartitionKey is critical for scalability
See http://www.microsoftpdc.com/2009/SVC09 and http://azurescope.cloudapp.net for more information
• Avoid frequent large scans
• Parallelize queries
• Point queries are most efficient
• Transactions across a single partition
• Transaction semantics & reduced round trips
Scalability
Query Efficiency & Speed
Entity group transactions
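The "distribute by hash prefix" advice for PartitionKey selection can be sketched as follows (illustrative Python, not the Azure SDK; the function and bucket count are hypothetical):

```python
import hashlib

def make_partition_key(user_id: str, buckets: int = 16) -> str:
    """Prefix the natural key with a short hash bucket so writes
    spread across partitions instead of hitting one hot partition."""
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % buckets
    return f"{bucket:02d}_{user_id}"

# Sequential IDs land in different partitions rather than one hot one:
prefixes = {make_partition_key(f"user{i}").split("_")[0] for i in range(100)}
```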
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
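The continuation-token handling described above boils down to a drain-the-token loop. A minimal sketch (generic Python against a hypothetical `query_page` helper, not the real Storage Client Library):

```python
def query_all(query_page, query):
    """Keep issuing the query until the service stops returning
    a continuation token; each page holds at most 1000 rows."""
    results, token = [], None
    while True:
        page, token = query_page(query, continuation=token)
        results.extend(page)
        if token is None:  # no token means the result set is complete
            return results

# Fake paged backend for illustration: 2500 rows, 1000 per page.
def fake_page(query, continuation=None):
    start = continuation or 0
    rows = list(range(start, min(start + 1000, 2500)))
    nxt = start + 1000 if start + 1000 < 2500 else None
    return rows, nxt
```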
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Select PartitionKey and RowKey that help scale: distribute by using a hash etc. as a prefix
Avoid "append only" patterns
Always handle continuation tokens: expect continuation tokens for range queries
"OR" predicates are not optimized: execute the queries that form the "OR" predicates as separate queries
Implement a back-off strategy for retries on "server busy": load balance partitions to meet traffic needs when load on a single partition has exceeded the limits
WCF Data Services
• Use a new context for each logical operation
• AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
• Tight coupling leads to brittleness; loose coupling aids scaling and performance
• A queue can hold an unlimited number of messages
• Messages must be serializable as XML and are limited to 8 KB in size
• Commonly use the work ticket pattern
• Why not simply use a table?
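The work ticket pattern mentioned above keeps the queue message small: it carries only a ticket (an ID pointing at blob or table data), never the payload itself. A sketch with plain Python stand-ins (not Azure APIs):

```python
import json
import queue

blob_store = {}             # stand-in for blob storage
work_queue = queue.Queue()  # stand-in for an Azure queue

def submit(job_id: str, payload: bytes) -> None:
    """Store the large payload in blob storage; enqueue a small ticket."""
    blob_store[job_id] = payload
    ticket = json.dumps({"job_id": job_id})  # well under the 8 KB limit
    work_queue.put(ticket)

def process_next():
    """Dequeue a ticket and fetch the real payload by reference."""
    ticket = json.loads(work_queue.get())
    data = blob_store[ticket["job_id"]]
    return ticket["job_id"], len(data)
```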
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout) / RemoveMessage
Msg 2, Msg 1
Worker Role
Msg 2
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x, truncated at a maximum; a successful poll sets the interval back to 1.
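Truncated exponential back-off polling, sketched (illustrative Python; the 60-second cap is an assumption, not from the deck):

```python
def next_interval(current: float, got_message: bool, cap: float = 60.0) -> float:
    """Empty poll: double the wait, truncated at `cap` seconds.
    Successful poll: reset to 1 second."""
    if got_message:
        return 1.0
    return min(current * 2, cap)

# Simulate a run of empty polls followed by a hit:
interval = 1.0
for _ in range(8):  # 8 empty polls: 2, 4, 8, 16, 32, 60, 60, 60
    interval = next_interval(interval, got_message=False)
interval = next_interval(interval, got_message=True)  # back to 1.0
```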
Removing Poison Messages
Producers: P1, P2. Consumers: C1, C2.
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
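The DequeueCount guard that finally removes the poison message can be sketched like this (plain Python model of the queue semantics, not the Azure SDK; the threshold of 2 follows the slide, the dead-letter list is a hypothetical stand-in):

```python
POISON_THRESHOLD = 2

def handle(message, dequeue_count, process, dead_letter):
    """Dead-letter messages that have been dequeued too often;
    otherwise try to process them."""
    if dequeue_count > POISON_THRESHOLD:
        dead_letter.append(message)  # take the poison message out of rotation
        return "dead-lettered"
    process(message)
    return "processed"

dead = []

def crashy(msg):
    raise RuntimeError("simulated consumer crash")

# Third delivery of the same message exceeds the threshold,
# so the crashing handler is never invoked again:
status = handle("msg 1", dequeue_count=3, process=crashy, dead_letter=dead)
```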
Queues Recap
• Make message processing idempotent: no need to deal with failures
• Do not rely on order: invisible messages result in out-of-order delivery
• Use dequeue count to remove poison messages: enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data with a reference in the message; batch messages; garbage collect orphaned blobs
• Use message count to scale: dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
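The data-parallelism point above, sketched with a worker pool sized to the core count (the deck's context is .NET's Task Parallel Library; this Python version is an analogous illustration, not that API):

```python
import os
from concurrent.futures import ThreadPoolExecutor

def process_tile(tile_id: int) -> int:
    """Stand-in for per-item work (e.g., reprojecting one tile)."""
    return tile_id * tile_id

# Size the pool to the available cores so the number of active
# workers does not exceed them.
with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
    results = list(pool.map(process_tile, range(10)))
```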
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs
Performance vs. Cost
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile: e.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Uncompressed content becomes compressed content via: Gzip, minified JavaScript, minified CSS, minified images
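The compute-for-storage trade in point 2 is easy to see with Python's built-in gzip module (illustrative; the sample page is made up):

```python
import gzip

# Repetitive markup (like HTML/CSS/JS) compresses very well.
page = b"<html><body>" + b"<div class='row'>hello</div>" * 500 + b"</body></html>"
packed = gzip.compress(page)

# Serving `packed` with "Content-Encoding: gzip" saves both
# bandwidth and storage at the cost of a little CPU.
ratio = len(packed) / len(page)
```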
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700 to 1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST): needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
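Query segmentation (splitting the input FASTA into independent partitions) is the data-parallel pattern described above. A minimal sketch (illustrative Python; the partition size of 100 echoes the micro-benchmark result later in the deck, everything else is hypothetical):

```python
def split_fasta(text: str, per_partition: int = 100):
    """Split a FASTA string into partitions of `per_partition` sequences;
    each partition can be queried against the database independently."""
    records = [">" + r for r in text.split(">") if r.strip()]
    return [records[i:i + per_partition]
            for i in range(0, len(records), per_partition)]

# 250 toy sequences -> 3 partitions: 100 + 100 + 50
fasta = "".join(f">seq{i}\nACGT\n" for i in range(250))
parts = split_fasta(fasta)
```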
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
• split the input sequences
• query partitions in parallel
• merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations: batch job management; task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multiple cores of one instance:
• the "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partitions: load imbalance
• Small partitions: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• essentially an estimate of the task run time
• too small: repeated computation
• too large: unnecessarily long wait in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
…
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry / NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table:
• Fault tolerance: avoid in-memory state
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
BLASTed ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sampling runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working-instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
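Spotting an "Executing" line with no matching "done" line, as in the second excerpt, is a simple log scan (illustrative Python; the helper name is hypothetical):

```python
import re

def unfinished_tasks(log_lines):
    """Return task IDs that were started but never reported done."""
    started, done = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return started - done

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
missing = unfinished_tasks(log)  # task 251774 never completed
```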
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group: this is an update domain
• ~30 mins
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed before the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the Fault Domain is working
Expect Continuation Tokens – Seriously
• Maximum of 1,000 rows in a response
• At the end of a partition range boundary
• Maximum of 5 seconds to execute the query
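The rule above can be sketched in Python. This is a toy stand-in, not the real storage SDK: `query_segment` imitates a service that returns at most 1,000 rows plus a continuation token, and `query_all` shows the loop client code must always run.

```python
def query_segment(rows, token=0, page_size=1000):
    """Stand-in for a table query: one page of results plus a token
    for the next page, or None when the result set is exhausted."""
    page = rows[token:token + page_size]
    next_token = token + page_size if token + page_size < len(rows) else None
    return page, next_token

def query_all(rows):
    """Drain every page -- the pattern client code must always follow,
    since any query can come back with a continuation token."""
    results, token = [], 0
    while token is not None:
        page, token = query_segment(rows, token)
        results.extend(page)
    return results
```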
Tables Recap
• Efficient for frequently used queries
• Supports batch transactions
• Distributes load
Guidance:
• Select a PartitionKey and RowKey that help you scale; distribute by using a hash etc. as a prefix
• Avoid "append only" patterns
• Always handle continuation tokens; expect them for range queries
• "OR" predicates are not optimized; execute the queries that form the "OR" predicates as separate queries
• Implement a back-off strategy for retries; "server busy" means the load on a single partition has exceeded the limits, and partitions are load balanced to meet traffic needs
WCF Data Services
• Use a new context for each logical operation; AddObject/AttachTo can throw an exception if the entity is already being tracked
• A point query throws an exception if the resource does not exist; use IgnoreResourceNotFoundException
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling through a queue aids scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML and are limited to 8 KB in size
  • Commonly used with the work ticket pattern
• Why not simply use a table?
Queue Terminology
Message Lifecycle
[diagram: a Web Role calls PutMessage to add Msg 1-4 to the Queue; Worker Roles call GetMessage (with a visibility timeout) to receive messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
• Consider a back-off polling approach: each empty poll increases the interval by 2x
• A successful poll sets the interval back to 1
[diagram: consumers C1 and C2 polling the queue at intervals growing from 1 up to a cap of 60]
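The back-off rule above is small enough to sketch directly. `poll_intervals` is a hypothetical helper, not an Azure API: it traces the polling interval after each poll outcome (True means a message was received), doubling on empty polls and truncating at a 60-second cap.

```python
def poll_intervals(outcomes, minimum=1, maximum=60):
    """Trace the interval after each poll.

    outcomes: sequence of booleans, True = the poll returned a message.
    Empty poll -> double the interval (truncated at `maximum`);
    successful poll -> reset to `minimum`.
    """
    interval, trace = minimum, []
    for got_message in outcomes:
        interval = minimum if got_message else min(interval * 2, maximum)
        trace.append(interval)
    return trace
```

Three empty polls followed by a hit give intervals 2, 4, 8, then a reset to 1; a long idle stretch pins the interval at the 60-second cap instead of growing without bound.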
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages (2)
[diagram continues: producers P1, P2; consumers C1, C2]
1. GetMessage(Q, 30 s) → msg 1 (C1)
2. GetMessage(Q, 30 s) → msg 2 (C2)
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1 (C2)
Removing Poison Messages (3)
[diagram continues: producers P1, P2; consumers C1, C2]
1. Dequeue(Q, 30 s) → msg 1 (C1)
2. Dequeue(Q, 30 s) → msg 2 (C2)
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 s) → msg 1 (C2)
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 s) → msg 1 (C1)
12. DequeueCount > 2
13. Delete(Q, msg 1)
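The dequeue-count threshold in step 12 can be sketched as follows. `handle`, `process`, `delete`, and `dead_letter` are hypothetical stand-ins rather than Azure SDK calls, and the service-maintained DequeueCount is modeled as a plain dict field.

```python
MAX_DEQUEUE = 2  # threshold from the scenario above

def handle(msg, process, delete, dead_letter):
    """Process one dequeued message, removing it if it is poison.

    msg: dict with the service-maintained 'dequeue_count' property.
    A message seen more than MAX_DEQUEUE times is assumed to be
    crashing its consumers, so it is parked and deleted instead of
    being processed again.
    """
    if msg["dequeue_count"] > MAX_DEQUEUE:
        dead_letter(msg)   # park it somewhere for offline inspection
        delete(msg)        # so it stops reappearing on the queue
        return
    process(msg)
    delete(msg)            # only after successful processing
```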
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages can result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use DequeueCount to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
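The "messages > 8 KB" tip above can be sketched with a plain list standing in for the queue and a dict standing in for the blob container (neither is the real Azure API): oversized payloads go to blob storage and the queue carries only a work ticket referencing them.

```python
import uuid

LIMIT = 8 * 1024  # the 8 KB queue message limit

def put_work_item(queue, blobs, payload):
    """Enqueue a payload directly, or via a blob reference if too big."""
    if len(payload) <= LIMIT:
        queue.append(payload)
    else:
        name = str(uuid.uuid4())
        blobs[name] = payload              # store the real data in a blob
        queue.append("blob:" + name)       # work ticket referencing it

def get_payload(msg, blobs):
    """Resolve a dequeued message back to its payload."""
    return blobs[msg[5:]] if msg.startswith("blob:") else msg
```

Garbage collection of orphaned blobs (the last sub-tip) would be a separate sweep over `blobs` for names no longer referenced by any queue message.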
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It is pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using much CPU
• Balance using up CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
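The data-parallelism point can be illustrated with Python's standard-library pool in place of the .NET Task Parallel Library; `parallel_map` is a hypothetical helper, not part of any Azure SDK, showing the idea of keeping roughly one worker per core busy inside a single role instance.

```python
from concurrent.futures import ThreadPoolExecutor
import os

def parallel_map(func, items, workers=None):
    """Apply func to each item using a pool sized to the core count.

    Keeping the pool near the number of cores is the same rule as
    'may not be ideal if active processes exceed the number of cores'.
    """
    workers = workers or os.cpu_count() or 4
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, items))
```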
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure or poor user experience from lacking excess capacity against the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Saving bandwidth costs often leads to savings in other places
  • Sending fewer things over the wire often means getting fewer things from storage
  • Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade compute costs for storage size: turn uncompressed content into compressed content by gzipping and minifying JavaScript, and minifying CSS and images
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
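Tip 1 can be demonstrated with Python's standard-library gzip. The sample HTML below is made up, but repetitive markup like this is exactly the kind of output that compresses well.

```python
import gzip

def compress(body: bytes) -> bytes:
    """Gzip a response body before it leaves the VM."""
    return gzip.compress(body)

# Illustrative payload: 1,000 copies of a small markup fragment.
html = b"<div class='row'>cell</div>" * 1000
small = compress(html)
ratio = len(small) / len(html)   # bytes actually sent / bytes generated
```

Every modern browser advertises `Accept-Encoding: gzip`, so the decompression cost lands on the client; the server trades a little CPU for a large cut in bandwidth and storage billed.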
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
• Computationally intensive
  • Large number of pairwise alignment operations
  • A single BLAST run can take 700-1,000 CPU hours
  • Sequence databases are growing exponentially; GenBank has doubled in size in about 15 months
Opportunities for Cloud Computing
• It is easy to parallelize BLAST
  • Segment the input: segment processing (querying) is pleasingly parallel
  • Segment the database (e.g., mpiBLAST): needs special result-reduction processing
• Large volume of data
  • A normal BLAST database can be as large as 10 GB
  • With 100 nodes, peak storage traffic could reach 1 TB
  • The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task produces partitions, BLAST tasks process them in parallel, and a merging task combines the results.
• Leverage the multiple cores of one instance
  • Argument "-a" of NCBI-BLAST: 1/2/4/8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Too large a partition: load imbalance
  • Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
  • Best practice: use test runs to profile, and set the size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: unnecessarily long wait in case of instance failure
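The split/join flow above can be sketched as plain functions. `blast_task` is a stand-in for invoking NCBI BLAST on one partition, and the 100-sequence partition size echoes the micro-benchmark result elsewhere in this deck.

```python
def split(sequences, per_partition=100):
    """Splitting task: fixed-size partitions of the input sequences."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]

def blast_task(partition):
    """Stand-in for running NCBI BLAST on one partition of queries."""
    return [f"hit:{seq}" for seq in partition]

def merge(results):
    """Merging task: join per-partition outputs in input order."""
    return [hit for part in results for hit in part]
```

In AzureBLAST the partitions would be dispatched through the queue to worker instances rather than processed in a local loop.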
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger worker instances
  • Primarily due to the memory capability
• Task size / instance size vs. cost
  • Extra-large instances generated the best and the most economical throughput
  • Fully utilize the resource
AzureBLAST
[diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine against the Job Registry (an Azure Table); a global dispatch queue feeds Worker Roles running the splitting task, the parallel BLAST tasks, and the merging task; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a Database Updating Role refreshes the NCBI databases]
AzureBLAST Job Portal
• ASP.NET program hosted by a web role instance
  • Submit jobs
  • Track job status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state
[diagram: Job Portal → Web Service → job registration → Job Registry; the Job Scheduler and Scaling Engine consume registered jobs]
Demonstration
R. palustris as a Platform for H2 Production
Eric Schadt (SAGE) and Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
• "All against all" query
  • The database is also the input query
  • The protein database is large (4.2 GB)
  • In total, 9,865,668 sequences to be queried
  • Theoretically, 100 billion sequence comparisons
• Performance estimation
  • Based on sampling runs on one extra-large Azure instance
  • Would require 3,216,731 minutes (6.1 years) on one desktop
• Experiments at this scale are usually infeasible for most scientists
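A quick sanity check of the single-desktop estimate (the minute count is from the slide; only the unit conversion is shown):

```python
# 3,216,731 desktop-minutes expressed in years.
minutes_on_one_desktop = 3_216_731
years = minutes_on_one_desktop / (60 * 24 * 365)   # minutes -> years
```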
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[diagram: deployments of 50-62 VMs each across the four datacenters]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
  • Based on our estimates, real working instance time should be 6-8 days
  • Look into the log data to analyze what took place
Understanding Azure by Analyzing Logs
A normal log record pairs an "Executing" entry with a completion entry:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
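The pairing check described above can be sketched with two regular expressions: any task that logged "Executing" but never logged a matching "is done" was lost, for example to a node failure or upgrade. The log lines here follow the slide's format.

```python
import re

def incomplete_tasks(lines):
    """Return task ids that started but never logged completion."""
    started, finished = set(), set()
    for line in lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished
```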
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed.
All 62 compute nodes lost tasks and then came back in groups; this is an update domain at work:
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob writing failures at the same time.
A reasonable guess: the fault domain was at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ Rn + ρa cp (δq) ga) / ((Δ + γ (1 + ga/gs)) λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
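A numeric sketch of the Penman-Monteith form, ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ(1 + ga/gs))·λv); every input value below is an illustrative assumption, not data from the MODISAzure study.

```python
def penman_monteith(delta, Rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2.45e6):
    """Penman-Monteith ET: (delta*Rn + rho_a*c_p*dq*g_a)
    divided by ((delta + gamma*(1 + g_a/g_s)) * lambda_v)."""
    return (delta * Rn + rho_a * c_p * dq * g_a) / \
           ((delta + gamma * (1 + g_a / g_s)) * lambda_v)

# Made-up but physically plausible mid-latitude daytime values.
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, c_p=1005.0,
                     dq=1000.0, g_a=0.02, g_s=0.01)
```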
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
1. Data collection (map) stage
  • Downloads requested input tiles from NASA FTP sites
  • Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
2. Reprojection (map) stage
  • Converts source tile(s) to intermediate-result sinusoidal tiles
  • Simple nearest-neighbor or spline algorithms
3. Derivation reduction stage
  • First stage visible to scientists
  • Computes ET in our initial use
4. Analysis reduction stage
  • Optional second stage visible to scientists
  • Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[diagram: scientists submit requests via the AzureMODIS Service Web Role Portal; a Request Queue and Download Queue feed the Data Collection Stage, which pulls from source imagery download sites using source metadata; the Reprojection Queue feeds the Reprojection Stage; the Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; science results are available for download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure Service is the Web Role front door
  • Receives all user requests
  • Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, the recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[diagram: Generic Worker (Worker Role) instances pull from the <PipelineStage> Task Queue and read/write <Input> Data Storage]
Example Pipeline Stage: Reprojection Service
[diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus and dispatches to the Task Queue consumed by Generic Workers (Worker Roles), which read swath source data and write reprojection data to storage]
• Each job-status entity specifies a single reprojection job request
• Each task-status entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage breakdown:
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3,500 hours, 20-100 workers; $420 CPU, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit (November 2010 Update)
http://research.microsoft.com/azure
xcgngage@microsoft.com
Tables Recapbull Efficient for frequently used queriesbull Supports batch transactionsbull Distributes load
Select PartitionKey and RowKey that help scale
Avoid ldquoAppend onlyrdquo patterns
Always Handlecontinuation tokens
ldquoORrdquo predicates are not optimized
Implement back-offstrategy for retries
bull Distribute by using a hash etc as prefix
bull Expect continuation tokens for range queries
bull Execute the queries that form the ldquoORrdquo predicates as separate queries
bull Server busybull Load balance partitions to meet traffic needsbull Load on single partition has exceeded the limits
WCF Data Services
bull Use a new context for each logical operationbull AddObjectAttachTo can throw exception if entity is already being tracked
bull Point query throws an exception if resource does not exist Use IgnoreResourceNotFoundException
QueuesTheir Unique Role in Building Reliable Scalable Applicationsbull Want roles that work closely together but are not
bound togetherbull Tight coupling leads to brittlenessbull This can aid in scaling and performance
bull A queue can hold an unlimited number of messagesbull Messages must be serializable as XMLbull Limited to 8KB in sizebull Commonly use the work ticket pattern
bull Why not simply use a table
Queue Terminology
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST httpmyaccountqueuecorewindowsnetmyqueuemessages
HTTP11 200 OK Transfer-Encoding chunked Content-Type applicationxml Date Tue 09 Dec 2008 210430 GMT Server Nephos Queue Service Version 10 Microsoft-HTTPAPI20
ltxml version=10 encoding=utf-8gt ltQueueMessagesListgt ltQueueMessagegt ltMessageIdgt5974b586-0df3-4e2d-ad0c-18e3892bfca2ltMessageIdgt ltInsertionTimegtMon 22 Sep 2008 232920 GMTltInsertionTimegt ltExpirationTimegtMon 29 Sep 2008 232920 GMTltExpirationTimegt ltPopReceiptgtYzQ4Yzg1MDIGM0MDFiZDAwYzEwltPopReceiptgt ltTimeNextVisiblegtTue 23 Sep 2008 052920GMTltTimeNextVisiblegt ltMessageTextgtPHRlc3Q+dGdGVzdD4=ltMessageTextgt ltQueueMessagegt ltQueueMessagesListgt
DELETEhttpmyaccountqueuecorewindowsnetmyqueuemessagesmessageidpopreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach Each empty poll
increases interval by 2x
A successful sets the interval back to 1
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input: segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST): needs special result-reduction processing
Large data volumes:
• A typical BLAST database can be as large as 10 GB
• With 100 nodes, peak aggregate storage bandwidth demand can reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• A parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern:
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (ScienceCloud 2010), ACM, 21 June 2010.
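The split/query/merge pattern can be sketched as follows. This is an illustrative skeleton, not AzureBLAST code: `blast_task` is a stand-in for invoking NCBI-BLAST on one partition, and a thread pool stands in for the worker roles:

```python
from concurrent.futures import ThreadPoolExecutor

def split_sequences(sequences, partition_size):
    """Splitting task: cut the input into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_task(partition):
    """Stand-in for running NCBI-BLAST on one partition
    (here we just return one fake hit per sequence)."""
    return [(seq, "hit") for seq in partition]

def merge_results(per_partition_results):
    """Merging task: concatenate per-partition outputs in order."""
    merged = []
    for part in per_partition_results:
        merged.extend(part)
    return merged

sequences = [f"seq{i}" for i in range(1000)]
partitions = split_sequences(sequences, 100)   # 100 sequences/partition, per the micro-benchmarks below
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(blast_task, partitions))
hits = merge_results(results)
assert len(partitions) == 10 and len(hits) == len(sequences)
```

Because each partition is independent, failures can be retried per partition, which is what makes the pattern a good fit for queue-based workers.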
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
[Figure: Split/Join task flow: a Splitting task fans out to parallel BLAST tasks, whose outputs feed a Merging task.]
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to memory capacity
Task size/instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Figure: AzureBLAST architecture: a Web Role (web portal, web service, job registration), a Job Management Role (job scheduler, scaling engine) feeding a global dispatch queue, pools of Worker instances, and a Database-updating Role. Azure Tables hold the job registry and NCBI database metadata; Azure Blob storage holds the BLAST databases, temporary data, etc. Each worker runs the Split/Join flow: a Splitting task, parallel BLAST tasks, and a Merging task.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID.
The accepted job is stored into the job registry table:
• Fault tolerance: avoid in-memory state
[Figure: job portal components: web portal, web service, job registration, job scheduler, scaling engine, job registry.]
Demonstration
R. palustris as a platform for H2 production (Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances appear, redistribute the load manually
[Figure: VM counts per deployment, 50-62 VMs each.]
End Result
• Total size of the output is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record pair looks like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
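The started-but-never-finished check can be automated. A minimal sketch, assuming the record format shown above (the regexes and sample lines are illustrative):

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def incomplete_tasks(log_text):
    """Return tasks that have an 'Executing' record but no matching 'done' record."""
    started, finished = set(), set()
    for line in log_text.splitlines():
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

assert incomplete_tasks(LOG) == {"251774"}
```

Flagged tasks (like 251774 above, which started at 8:22 and never completed) are the ones to correlate with upgrade and fault-domain events.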
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups: this is an update domain (~6 nodes in one group, ~30 mins per group).
Surviving Storage Failures
West Europe Datacenter: 30,976 tasks were completed when the job was killed.
35 nodes experienced blob-writing failures at the same time. A reasonable guess: the fault domain at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)
where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
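The formula transcribes directly into code. The input values below are assumed, illustrative magnitudes only (not validated field data), so treat this as a shape check on the equation rather than a model:

```python
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s, lam_v, gamma=66.0):
    """Penman-Monteith ET, with symbols as in the variable list above."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lam_v
    return numerator / denominator

# Illustrative daytime values (assumed for demonstration only).
et = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                     dq=1000.0, g_a=0.02, g_s=0.01, lam_v=2450.0)
assert et > 0.0  # positive net radiation and vapor deficit imply positive ET
```

The hard part in practice is not this arithmetic but producing ga and gs per pixel, which is what the imagery pipeline below exists to do.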
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
[Figure: pipeline data flow: scientists submit requests through the AzureMODIS Service Web Role portal into a Request Queue; a Download Queue feeds the Data Collection Stage from source imagery download sites; Reprojection, Reduction 1, and Reduction 2 queues feed the Reprojection, Derivation Reduction, and Analysis Reduction stages; scientists download the science results.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door:
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role:
  • Parses all job requests into tasks, i.e. recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Figure: a <PipelineStage> Request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role:
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Figure: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue tasks and read <Input>Data Storage.]
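The GenericWorker's dequeue/retry loop can be sketched as below. The 3-attempt limit comes from the slide; the in-memory queue and the (task_id, payload, dequeue_count) tuple shape are assumptions for illustration:

```python
MAX_RETRIES = 3

def run_worker(queue, task_status, execute):
    """Dequeue tasks, run them, and record status; a task that keeps
    failing is retried up to MAX_RETRIES times, then marked failed."""
    while queue:
        task_id, payload, dequeue_count = queue.pop(0)
        try:
            execute(payload)
            task_status[task_id] = "done"
        except Exception:
            if dequeue_count + 1 >= MAX_RETRIES:
                task_status[task_id] = "failed"          # give up after 3 attempts
            else:
                queue.append((task_id, payload, dequeue_count + 1))  # retry later

status = {}
def always_fails(payload):
    raise RuntimeError("simulated task failure")
run_worker([("t1", "tile-42", 0)], status, always_fails)
assert status["t1"] == "failed"

ok = {}
run_worker([("t2", "tile-7", 0)], ok, lambda payload: None)
assert ok["t2"] == "done"
```

In the real service the retry count rides along with the queue message (its dequeue count) and status lands in Azure Tables rather than a dict.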
Example Pipeline Stage: Reprojection Service
[Figure: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; GenericWorker (Worker Role) instances consume tasks against Reprojection Data Storage and Swath Source Data Storage.]
• Each ReprojectionJobStatus entity specifies a single reprojection job request
• Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Stage-by-stage (from the pipeline diagram):
• Data collection: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 compute, $60 download
• Derivation reduction: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 compute, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 compute, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Queues: Their Unique Role in Building Reliable, Scalable Applications
• Want roles that work closely together, but are not bound together
  • Tight coupling leads to brittleness
  • Decoupling aids scaling and performance
• A queue can hold an unlimited number of messages
  • Messages must be serializable as XML
  • Limited to 8 KB in size
  • Commonly use the work-ticket pattern
• Why not simply use a table?
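The work-ticket pattern mentioned above can be sketched as follows. Dicts and a list stand in for Azure Blob storage and a queue, and the ticket fields are invented for illustration; the point is that only a small pointer travels through the queue:

```python
import json
import uuid

blob_store = {}   # stand-in for Azure Blob storage
queue = []        # stand-in for an Azure queue

def enqueue_work(large_payload: bytes) -> str:
    """Store the large payload in a blob and enqueue only a small
    ticket (well under the 8 KB message limit) that points to it."""
    blob_name = str(uuid.uuid4())
    blob_store[blob_name] = large_payload
    ticket = json.dumps({"blob": blob_name, "kind": "render-job"})
    queue.append(ticket)
    return ticket

def process_next() -> bytes:
    """Worker side: dequeue the ticket and fetch the real data by reference."""
    ticket = json.loads(queue.pop(0))
    return blob_store[ticket["blob"]]

payload = b"x" * 1_000_000            # far larger than any queue message may be
ticket = enqueue_work(payload)
assert len(ticket) < 8 * 1024         # the message itself stays tiny
assert process_next() == payload
```

This is also why a queue beats a table for work distribution: the queue's visibility timeout and dequeue count give you delivery semantics a table does not.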
Queue Terminology
Message Lifecycle
[Figure: a Web Role PutMessage-s messages into a queue; Worker Roles GetMessage with a visibility timeout, process the message, and then RemoveMessage.]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach:
• Each empty poll increases the polling interval by 2x, up to a cap
• A successful poll resets the interval back to 1
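The doubling/reset rule can be sketched in a few lines; the cap of 64 and the initial interval of 1 are illustrative choices:

```python
def next_interval(current, success, initial=1, maximum=64):
    """Truncated exponential back-off: double on an empty poll,
    cap at `maximum`, reset to `initial` on a successful poll."""
    if success:
        return initial
    return min(current * 2, maximum)

interval = 1
observed = []
for got_message in [False, False, False, False, False, False, False, True, False]:
    interval = next_interval(interval, got_message)
    observed.append(interval)
assert observed == [2, 4, 8, 16, 32, 64, 64, 1, 2]
```

The cap keeps a mostly idle worker from polling itself to sleep, while the reset lets it react quickly once messages start flowing again.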
Removing Poison Messages
Producers P1 and P2 enqueue to queue Q; consumers C1 and C2 dequeue from it:
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: GetMessage(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: DeleteMessage(Q, msg 1)
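The DequeueCount guard above can be sketched as follows. The threshold of 2, the list-based queue, and the dead-letter list are illustrative stand-ins for the real queue service:

```python
POISON_THRESHOLD = 2

def handle_one(queue, dead_letters, process):
    """Process one message; a message seen more than POISON_THRESHOLD
    times is removed (here: moved to a dead-letter list) instead of
    being reprocessed forever. Queue entries are (body, dequeue_count)."""
    body, dequeue_count = queue.pop(0)
    dequeue_count += 1
    if dequeue_count > POISON_THRESHOLD:
        dead_letters.append(body)                # give up on the poison message
        return
    try:
        process(body)                            # on success the message is not re-queued
    except Exception:
        queue.append((body, dequeue_count))      # becomes visible again, count preserved

q, dead = [("msg1", 0)], []
def crashing_consumer(body):
    raise RuntimeError("consumer crashed")

while q:
    handle_one(q, dead, crashing_consumer)
assert dead == ["msg1"]                          # removed after the third dequeue
```

Without the guard, a message whose processing always crashes the consumer would circulate forever, starving real work.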
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs: files and large objects
• Drives: NTFS APIs for migrating applications
• Tables: massively scalable structured storage
• Queues: reliable delivery of messages
Easy to use via the Storage Client Library.
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: larger, fewer VMs vs. many smaller instances
  • If you scale better than linearly across cores, larger VMs could save you money
  • It's pretty rare to see linear scaling across 8 cores
  • More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
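The slide names .NET 4's Task Parallel Library; as a language-neutral analogue, a worker pool mapping one function over many items captures the same data-parallel idea (the `transform` work item is invented for illustration):

```python
from concurrent.futures import ThreadPoolExecutor

def transform(tile):
    """Stand-in for one independent unit of work (data parallelism)."""
    return tile * tile

tiles = list(range(100))
# Let the pool schedule work across available threads rather than
# hand-managing one thread per item.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(transform, tiles))
assert results == [t * t for t in tiles]   # order of results is preserved
```

The design choice mirrors the slide's advice: keep the unit of concurrency small and uniform, and let a scheduler (TPL, a pool, or I/O completion ports) decide how many threads actually run.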
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in a poor user experience
• Trade-off between the risk of failure or a poor user experience (too little excess capacity) and the cost of idle VMs
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request reaches the service, which persists ReprojectionJobStatus and enqueues to the Job Queue; the Service Monitor (Worker Role) parses and persists ReprojectionTaskStatus and dispatches to the Task Queue, from which GenericWorker (Worker Role) instances pull tasks; tasks point to the ScanTimeList and SwathGranuleMeta tables and to Reprojection Data Storage, backed by Swath Source Data Storage]
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
[Pipeline diagram with per-stage statistics and costs; stage-to-cost pairing follows the original layout:]
Stage                 Data         Files      Compute              Workers          Cost
Data collection       400-500 GB   60K files  10 MB/sec, 11 hours  <10 workers      $50 upload, $450 storage
Reprojection          400 GB       45K files  3500 hours           20-100 workers   $420 CPU, $60 download
Derivation reduction  5-7 GB       55K files  1800 hours           20-100 workers   $216 CPU, $1 download, $6 storage
Analysis reduction    <10 GB       ~1K files  1800 hours           20-100 workers   $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Queue Terminology
Message Lifecycle
[Diagram: a Web Role calls PutMessage to add Msg 1–4 to a Queue; Worker Roles call GetMessage (with a visibility timeout) to retrieve messages and RemoveMessage to delete them once processed]
POST http://myaccount.queue.core.windows.net/myqueue/messages

HTTP/1.1 200 OK
Transfer-Encoding: chunked
Content-Type: application/xml
Date: Tue, 09 Dec 2008 21:04:30 GMT
Server: Nephos Queue Service Version 1.0 Microsoft-HTTPAPI/2.0

<?xml version="1.0" encoding="utf-8"?>
<QueueMessagesList>
  <QueueMessage>
    <MessageId>5974b586-0df3-4e2d-ad0c-18e3892bfca2</MessageId>
    <InsertionTime>Mon, 22 Sep 2008 23:29:20 GMT</InsertionTime>
    <ExpirationTime>Mon, 29 Sep 2008 23:29:20 GMT</ExpirationTime>
    <PopReceipt>YzQ4Yzg1MDIGM0MDFiZDAwYzEw</PopReceipt>
    <TimeNextVisible>Tue, 23 Sep 2008 05:29:20 GMT</TimeNextVisible>
    <MessageText>PHRlc3Q+dGdGVzdD4=</MessageText>
  </QueueMessage>
</QueueMessagesList>

DELETE http://myaccount.queue.core.windows.net/myqueue/messages/messageid?popreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back-Off Polling
Consider a back-off polling approach: each empty poll increases the polling interval by 2x, and a successful poll sets the interval back to 1.
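The back-off policy above is small enough to sketch directly (Python for illustration; `empty_poll` and `success` are hypothetical names for the two events, not part of any Azure API):

```python
class TruncatedBackoff:
    """Truncated exponential back-off for queue polling.

    Each empty poll doubles the wait interval, up to `cap`;
    a successful poll resets the interval back to `base`.
    """

    def __init__(self, base=1.0, cap=60.0):
        self.base, self.cap = base, cap
        self.interval = base

    def empty_poll(self):
        """Return how long to sleep after an empty GetMessage, then double."""
        wait = self.interval
        self.interval = min(self.interval * 2, self.cap)  # truncate at cap
        return wait

    def success(self):
        """A message arrived: reset the interval to the base."""
        self.interval = self.base
```

The worker's polling loop would sleep for `empty_poll()` seconds whenever GetMessage returns nothing, and call `success()` whenever a message is received, so an idle queue is polled at most once per `cap` seconds.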
Removing Poison Messages
[Diagram: producers P1 and P2 enqueue messages; consumers C1 and C2 dequeue them]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
[Diagram: producers P1 and P2; consumers C1 and C2]
1. C1: GetMessage(Q, 30 s) → msg 1
2. C2: GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
[Diagram: producers P1 and P2; consumers C1 and C2]
1. C1: Dequeue(Q, 30 s) → msg 1
2. C2: Dequeue(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. C2: Delete(Q, msg 2)
5. C1 crashed
6. msg 1 becomes visible again 30 s after dequeue
7. C2: Dequeue(Q, 30 s) → msg 1
8. C2 crashed
9. msg 1 becomes visible again 30 s after dequeue
10. C1 restarted
11. C1: Dequeue(Q, 30 s) → msg 1
12. DequeueCount > 2
13. C1: Delete(Q, msg 1)
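The walkthrough above boils down to one rule: before processing, compare the message's dequeue count against a threshold and delete the message instead of retrying it forever. A minimal sketch, using a tiny in-memory queue to simulate visibility timeouts and dequeue counts rather than the real storage service:

```python
MAX_DEQUEUE = 2  # threshold on a message's dequeue count

class SimQueue:
    """In-memory stand-in for a cloud queue: tracks per-message
    visibility timeouts and dequeue counts (not the real storage API)."""

    def __init__(self):
        self.msgs = []  # dicts: text, visible_at, dequeue_count

    def put(self, text):
        self.msgs.append({"text": text, "visible_at": 0.0, "dequeue_count": 0})

    def get(self, visibility_timeout, now):
        """Return the first visible message, hiding it until the timeout."""
        for m in self.msgs:
            if m["visible_at"] <= now:
                m["visible_at"] = now + visibility_timeout
                m["dequeue_count"] += 1
                return m
        return None

    def delete(self, msg):
        self.msgs.remove(msg)

def process(q, now):
    """Dequeue one message; drop it as poison if it keeps reappearing."""
    msg = q.get(visibility_timeout=30, now=now)
    if msg is None:
        return None
    if msg["dequeue_count"] > MAX_DEQUEUE:
        q.delete(msg)  # poison message: give up instead of retrying forever
        return ("poisoned", msg["text"])
    # real work happens here; crashing before delete means a later retry
    return ("processing", msg["text"])
```

With a consumer that repeatedly crashes before deleting, the message reappears after each 30-second timeout until its dequeue count crosses the threshold and it is removed, exactly as in steps 6–13 above.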
Queues Recap
• Make message processing idempotent – then there is no need to deal with failures
• Do not rely on order – invisible messages result in out-of-order delivery
• Use the dequeue count to remove poison messages – enforce a threshold on a message's dequeue count
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Use the message count to scale – dynamically increase/reduce workers
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
  • Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
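The deck's examples are .NET-specific (I/O Completion Ports, the Task Parallel Library). As a language-neutral sketch of the data-parallelism idea, here is the same split-the-input-over-a-worker-pool shape in Python; `checksum` is a hypothetical stand-in for real per-partition work:

```python
import os
from concurrent.futures import ThreadPoolExecutor

def checksum(chunk):
    """Stand-in for real CPU work applied to one partition of the data."""
    return sum(chunk) % 65521

def parallel_checksums(data, partitions=4):
    """Data parallelism: split the input into partitions and map the
    same work function over them on a pool of workers."""
    size = (len(data) + partitions - 1) // partitions
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=os.cpu_count()) as pool:
        return list(pool.map(checksum, chunks))
```

Task parallelism is the same pool running *different* functions concurrently; either way the goal is keeping every core of the VM you are already paying for busy.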
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure and poor user experience from not having excess capacity against the cost of idling VMs
[Diagram: a balance between performance and cost]
Storage Costs
• Understand your application's storage profile and how storage billing works
• Make service choices based on your app's profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app's profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places
• Sending fewer things also means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
[Diagram: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content]
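A minimal sketch of point 1, trading a little CPU per request for a much smaller payload (Python's `gzip` module standing in for whatever compression layer the web tier actually uses):

```python
import gzip

def compress_response(body: bytes) -> bytes:
    """Gzip an HTTP response body; browsers decompress on the fly
    when the response carries 'Content-Encoding: gzip'."""
    return gzip.compress(body)

# Illustrative page: highly repetitive markup compresses dramatically.
page = b"<html><body>" + b"<p>hello azure</p>" * 500 + b"</body></html>"
small = compress_response(page)
ratio = len(small) / len(page)  # fraction of the original bytes sent
```

The CPU cost can be paid once rather than per request by caching the compressed bytes, which is the "trade compute for storage" point that follows.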
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern
Leverage the multi-core capability of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition → load imbalance
• Small partition → unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
Best practice: profile with test runs and set the partition size to mitigate the overhead
Value of the visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure
[Diagram: a splitting task fans out to BLAST tasks, which feed a merging task]
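The split/join pattern above can be sketched in a few lines; the fixed partition size and the simple concatenating merge are illustrative simplifications, not the actual AzureBLAST code:

```python
def partition_queries(sequences, per_partition=100):
    """Query segmentation: split the input sequences into fixed-size
    partitions, one per BLAST task. Partition size trades load balance
    (large partitions) against per-task overhead (small partitions)."""
    return [sequences[i:i + per_partition]
            for i in range(0, len(sequences), per_partition)]

def merge_results(partial_results):
    """Join step: combine per-partition hit lists once all BLAST
    tasks are done (a hypothetical, simplified merge)."""
    merged = []
    for hits in partial_results:
        merged.extend(hits)
    return merged
```

Each partition would become one queued task with a visibilityTimeout sized to the expected per-partition run time, which is exactly why the partition size and timeout are tuned together.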
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST (2)
[Architecture diagram: a Web Role hosts the Web Portal and Web Service; job registration hands jobs to the Job Scheduler in the Job Management Role, which records them in the Job Registry (an Azure Table) and dispatches work through a global dispatch queue to Worker instances; a Scaling Engine grows and shrinks the worker pool; a Database Updating Role refreshes the NCBI databases; Azure Blob storage holds the BLAST databases, temporary data, etc.; a splitting task fans out to BLAST tasks, which feed a merging task]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
[Diagram: the Job Portal feeds job registration through the Web Portal and Web Service to the Job Scheduler, Job Registry, and Scaling Engine]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute the load manually
[Diagram: per-deployment VM counts – 62, 62, 62, 62, 62, 62, 50, 50]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
[Diagram: per-deployment VM counts – 62, 62, 62, 62, 62, 62, 50, 50]
Understanding Azure by Analyzing Logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
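Spotting the anomaly above mechanically is a simple set difference over the log: any task id with an "Executing" record but no matching "done" record never completed. A sketch (the regexes assume exactly the record format shown, which may vary):

```python
import re

START = re.compile(r"Executing the task (\d+)")
DONE = re.compile(r"Execution of task (\d+) is done")

def unfinished_tasks(log_lines):
    """Return task ids that started but never logged completion --
    the anomaly the slide describes."""
    started, finished = set(), set()
    for line in log_lines:
        m = START.search(line)
        if m:
            started.add(m.group(1))
        m = DONE.search(line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
```

Running this over the full worker logs is how the upgrade- and storage-failure patterns in the next slides were surfaced.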
Surviving System Upgrades
North Europe datacenter: in total, 34,256 tasks processed
[Chart: all 62 compute nodes lost tasks and then came back in groups – this is an update domain at work: ~30 mins per group, ~6 nodes in one group]
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
[Chart: 35 nodes experienced blob-writing failures at the same time]
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·δq·ga) / (λv·(Δ + γ·(1 + ga/gs)))
where:
• ET = water volume evapotranspired (m3 s-1 m-2)
• Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
• λv = latent heat of vaporization (J/g)
• Rn = net radiation (W m-2)
• cp = specific heat capacity of air (J kg-1 K-1)
• ρa = dry air density (kg m-3)
• δq = vapor pressure deficit (Pa)
• ga = conductivity of air (inverse of ra) (m s-1)
• gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
• γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
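As a numeric sanity check of the Penman-Monteith formula, here is a direct transcription (note λv is taken in J/kg here, i.e., 2.45e6, rather than the slide's J/g; the input values below are illustrative assumptions, not field data):

```python
def penman_monteith(delta, Rn, rho_a, cp, dq, ga, gs,
                    gamma=66.0, lambda_v=2.45e6):
    """Penman-Monteith evapotranspiration (mass flux per unit area).

    ET = (delta*Rn + rho_a*cp*dq*ga) / (lambda_v * (delta + gamma*(1 + ga/gs)))

    Units as on the slide: delta, dq, gamma in Pa (per K where noted);
    Rn in W/m^2; rho_a in kg/m^3; cp in J/(kg K); ga, gs in m/s;
    lambda_v in J/kg. Result is in kg of water per m^2 per second.
    """
    numerator = delta * Rn + rho_a * cp * dq * ga
    denominator = lambda_v * (delta + gamma * (1.0 + ga / gs))
    return numerator / denominator

# Illustrative mid-latitude daytime values (assumed, for demonstration only)
et = penman_monteith(delta=145.0, Rn=400.0, rho_a=1.2, cp=1013.0,
                     dq=1000.0, ga=0.02, gs=0.01)
```

The single-point evaluation is trivial; the hard part the slide alludes to is estimating ga and gs across a whole catchment, which is what drives the big data reduction.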
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
Message Lifecycle
Queue
Msg 1
Msg 2
Msg 3
Msg 4
Worker Role
Worker Role
PutMessage
Web Role
GetMessage (Timeout)RemoveMessage
Msg 2Msg 1
Worker Role
Msg 2
POST httpmyaccountqueuecorewindowsnetmyqueuemessages
HTTP11 200 OK Transfer-Encoding chunked Content-Type applicationxml Date Tue 09 Dec 2008 210430 GMT Server Nephos Queue Service Version 10 Microsoft-HTTPAPI20
ltxml version=10 encoding=utf-8gt ltQueueMessagesListgt ltQueueMessagegt ltMessageIdgt5974b586-0df3-4e2d-ad0c-18e3892bfca2ltMessageIdgt ltInsertionTimegtMon 22 Sep 2008 232920 GMTltInsertionTimegt ltExpirationTimegtMon 29 Sep 2008 232920 GMTltExpirationTimegt ltPopReceiptgtYzQ4Yzg1MDIGM0MDFiZDAwYzEwltPopReceiptgt ltTimeNextVisiblegtTue 23 Sep 2008 052920GMTltTimeNextVisiblegt ltMessageTextgtPHRlc3Q+dGdGVzdD4=ltMessageTextgt ltQueueMessagegt ltQueueMessagesListgt
DELETEhttpmyaccountqueuecorewindowsnetmyqueuemessagesmessageidpopreceipt=YzQ4Yzg1MDIGM0MDFiZDAwYzEw
Truncated Exponential Back Off Polling
Consider a backoff polling approach Each empty poll
increases interval by 2x
A successful sets the interval back to 1
60
21
11
C1
C2
Removing Poison Messages
11
21
340
Producers Consumers
P2
P1
30
2 GetMessage(Q 30 s) msg 2
1 GetMessage(Q 30 s) msg 1
11
21
10
20
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance using up the CPU vs. having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
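The slide's advice is .NET-specific (TPL, I/O completion ports); as a language-neutral sketch, the same data-parallel pattern looks like this in Python, with the pool sized to the core count as the process-count caveat suggests.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def transform(item):
    # Stand-in for per-item work (CPU-bound or I/O-bound).
    return item * item


def process_batch(items, workers=None):
    """Data parallelism: apply the same operation to every item,
    keeping roughly one worker per core."""
    workers = workers or os.cpu_count()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(transform, items))
```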
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
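The flat-fee vs. per-transaction point is easy to quantify. The rates below are illustrative 2010-era figures, not authoritative prices; the point is that the app's transaction profile, not the data size, dominates the comparison.

```python
def table_storage_monthly(transactions, gb_stored,
                          per_10k_tx=0.01, per_gb=0.15):
    """Windows Azure Table-style billing: pay per transaction and per GB
    stored (illustrative rates, not authoritative)."""
    return transactions / 10_000 * per_10k_tx + gb_stored * per_gb


def sql_azure_monthly(flat_fee=9.99):
    """SQL Azure-style billing: flat monthly fee regardless of transactions."""
    return flat_fee


# Same 1 GB of data, wildly different transaction volumes:
chatty = table_storage_monthly(500_000_000, 1)  # chatty app: tables get expensive
quiet = table_storage_monthly(1_000_000, 1)     # quiet app: tables are very cheap
```

For the chatty profile the flat fee wins; for the quiet one, per-transaction billing is a fraction of the cost.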
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places.
Sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
  • All modern browsers can decompress on the fly
  • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
  • Use Portable Network Graphics (PNGs)
  • Crush your PNGs
  • Strip needless metadata
  • Make all PNGs palette PNGs
(Diagram: uncompressed content is gzipped, and JavaScript, CSS, and images are minified, to produce compressed content.)
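Point 1 is a one-liner in most stacks; here is a minimal sketch of conditional gzip for an HTTP-style response (the size threshold and header handling are simplified, and a real server would also check the client's Accept-Encoding header).

```python
import gzip


def gzip_response(body: bytes, min_size=1000):
    """Gzip output content when it is large enough to benefit; modern
    browsers decompress on the fly when Content-Encoding: gzip is set."""
    if len(body) < min_size:
        return body, {}  # tiny payloads: compression overhead isn't worth it
    return gzip.compress(body), {"Content-Encoding": "gzip"}


page = b"<html>" + b"<p>hello azure</p>" * 500 + b"</html>"
smaller, headers = gzip_response(page)
```

Repetitive HTML like this compresses dramatically, which saves both bandwidth charges and transfer time.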
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
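The query-segmentation pattern can be sketched independently of Azure. `blast_partition` below is a hypothetical stand-in for invoking NCBI-BLAST on one partition; in the real system each partition would travel through a queue to a worker role, and the merge would run once all partitions report done.

```python
def split_input(sequences, partition_size=100):
    """Query segmentation: cut the input sequences into partitions
    (AzureBLAST found ~100 sequences per partition worked best)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]


def blast_partition(partition):
    # Stand-in for running NCBI-BLAST over one partition (illustrative).
    return [f"hit:{seq}" for seq in partition]


def run_azureblast_style(sequences):
    """Split / query-in-parallel / merge. Partitions are independent, so in
    the real system each becomes a queue message consumed by a worker."""
    partitions = split_input(sequences)
    results = map(blast_partition, partitions)  # conceptually: parallel workers
    merged = []
    for r in results:
        merged.extend(r)  # the merging task
    return merged
```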
AzureBLAST Task-Flow
A simple split/join pattern: a splitting task fans out into parallel BLAST tasks, followed by a merging task.
Leverage the multi-core capability of one instance:
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost:
• The extra-large instance generated the best and the most economical throughput
• Fully utilizes the resources
AzureBLAST
(Architecture diagram: a Web Role hosts the web portal, web service, and job registration; a Job Management Role runs the job scheduler and scaling engine, persisting state in an Azure Table job registry; a database-updating role refreshes the NCBI databases; worker instances pull tasks from a global dispatch queue; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc. Each job follows the split/join task flow: a splitting task, parallel BLAST tasks, and a merging task.)
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• An accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state
(Diagram: the job portal's web service performs job registration into the job registry; the job scheduler and scaling engine consume entries from it.)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time...
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists.
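The slide's time estimate checks out with simple arithmetic; the cloud-side figure below assumes perfect speedup across all allocated cores, which is an idealization (the actual run, described later, took about 10 days).

```python
MIN_PER_YEAR = 60 * 24 * 365

desktop_minutes = 3_216_731                     # sampled estimate from the slide
desktop_years = desktop_minutes / MIN_PER_YEAR  # about 6.1 years on one desktop

cores = 475 * 8                                 # 475 XL VMs x 8 cores = 3800 cores
ideal_minutes = desktop_minutes / cores         # under perfect speedup: < 1 day
```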
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
(Map: instance counts per deployment: 50, 62, 62, 62, 62, 62, 50, 62.)
End Result
• Total size of the output is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place...
Understanding Azure by analyzing logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
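Detecting the "something is wrong" case is a small log-scanning exercise: collect task ids that were started but never reported done. The regexes below assume the record format shown above; the sample log is the abnormal excerpt from the slide.

```python
import re

LOG = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""


def unfinished_tasks(log_text):
    """Return task ids that were started ('Executing') but never reported
    done: the signature of a lost instance in the AzureBLAST logs."""
    started, finished = set(), set()
    for line in log_text.splitlines():
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished
```

Here task 251774 was started but never finished, so it would be flagged for re-execution.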
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
• All 62 compute nodes lost tasks and then came back in groups: this is an update domain at work
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
• 35 nodes experienced blob-write failures at the same time
• A reasonable guess: the fault domain is at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration (evaporation through plant membranes) by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
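The Penman-Monteith relation transcribes directly into code. This sketch just evaluates the formula with the symbols listed on the slide; the default constants are typical values (not from the slide), and unit bookkeeping is left to the caller.

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2453.0):
    """Penman-Monteith evapotranspiration.
    delta Pa/K, r_n W/m^2, rho_a kg/m^3, c_p J/(kg K), dq Pa,
    g_a and g_s m/s, gamma Pa/K (~66), lambda_v J/g (~2453 at 20 C).
    Returns the evaporative flux implied by the inputs' units."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator


# Plausible mid-latitude daytime values (illustrative only):
et = penman_monteith_et(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1013.0,
                        dq=1000.0, g_a=0.02, g_s=0.01)
```

Note how the stomatal conductance g_s sits only in the denominator: as stomata close (g_s → 0), modeled ET collapses, which is why estimating conductivity across a catchment matters so much.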
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
• 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
(Diagram: scientists submit requests through the AzureMODIS Service web role portal; requests flow through the request queue and download queue to the data collection stage, which pulls from source imagery download sites and records source metadata; the reprojection queue feeds the reprojection stage, the Reduction 1 queue feeds the derivation reduction stage, and the Reduction 2 queue feeds the analysis reduction stage, which produces science results for scientific-results download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, which are recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue.)
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
(Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue; GenericWorker (Worker Role) instances dequeue the tasks and read from <Input> data storage.)
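The GenericWorker behavior (dequeue, execute, retry failing tasks up to 3 attempts, record status) can be sketched as a loop. The data structures here are plain Python stand-ins for the Azure task queue and status table, not the actual MODISAzure code.

```python
MAX_ATTEMPTS = 3


def generic_worker(task_queue, handler, status_table):
    """Sketch of the GenericWorker loop: dequeue a task, run it, requeue a
    failing task until it has been attempted MAX_ATTEMPTS times, then mark
    it failed (names and structures are illustrative)."""
    while task_queue:
        task = task_queue.pop(0)                 # dequeue the next task
        try:
            handler(task)
            status_table[task["id"]] = "done"    # persist success
        except Exception:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] < MAX_ATTEMPTS:
                task_queue.append(task)          # requeue for another attempt
            else:
                status_table[task["id"]] = "failed"  # give up, record it
```

Because every outcome lands in the status table, a monitor can later distinguish tasks that succeeded, failed permanently, or are still in flight.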
Example Pipeline Stage: Reprojection Service
(Diagram: a reprojection request enters the job queue, where each entity specifies a single reprojection job request. The Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the task queue, where each entity specifies a single reprojection task (i.e., a single tile). GenericWorker (Worker Role) instances point to the SwathGranuleMeta and ScanTimeList tables: query SwathGranuleMeta to get geo-metadata (e.g., boundaries) for each swath tile, and query ScanTimeList to get the list of satellite scan times that cover a target tile. Reprojection data is read from swath source data storage.)
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage breakdown (from the pipeline diagram):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload + $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU + $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU + $1 download + $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU + $2 download + $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and they have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks persisted in Tables
<PipelineStage> Request
<PipelineStage>JobStatus (Persist)
<PipelineStage>Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
Dispatch <PipelineStage>Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
Dispatch <PipelineStage>Task Queue
<Input>Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Example Pipeline Stage: Reprojection Service
Reprojection Request
Service Monitor (Worker Role)
ReprojectionJobStatus (Persist)
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
Job Queue
Dispatch
Task Queue
Points to
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
Download Queue
Scientists
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers
$50 upload, $450 storage
400 GB, 45K files, 3500 hours, 20-100 workers
5-7 GB, 55K files, 1800 hours, 20-100 workers
<10 GB, ~1K files, 1800 hours, 20-100 workers
$420 CPU, $60 download
$216 CPU, $1 download, $6 storage
$216 CPU, $2 download, $9 storage
AzureMODIS Service Web Role Portal
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Removing Poison Messages
Producers P1, P2 and consumers C1, C2 share queue Q, which holds messages 1 and 2
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
Removing Poison Messages (2)
Producers P1, P2 and consumers C1, C2 share queue Q
1. GetMessage(Q, 30 s) → msg 1
2. GetMessage(Q, 30 s) → msg 2
3. C2 consumed msg 2
4. DeleteMessage(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. GetMessage(Q, 30 s) → msg 1
Removing Poison Messages (3)
Producers P1, P2 and consumers C1, C2 share queue Q
1. Dequeue(Q, 30 sec) → msg 1
2. Dequeue(Q, 30 sec) → msg 2
3. C2 consumed msg 2
4. Delete(Q, msg 2)
5. C1 crashed
6. msg 1 visible 30 s after dequeue
7. Dequeue(Q, 30 sec) → msg 1
8. C2 crashed
9. msg 1 visible 30 s after dequeue
10. C1 restarted
11. Dequeue(Q, 30 sec) → msg 1
12. DequeueCount > 2
13. Delete(Q, msg 1)
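Steps 12-13 above, deleting a message once its dequeue count passes a threshold, can be sketched as follows; the in-memory queue and crashing handler below are illustrative stand-ins, not the Azure Storage SDK.

```python
# Illustrative sketch of poison-message handling via a dequeue-count
# threshold; the in-memory deque below stands in for an Azure queue.
import collections

MAX_DEQUEUE_COUNT = 2

class Message:
    def __init__(self, body):
        self.body = body
        self.dequeue_count = 0

def process(queue, handler):
    """Pop one message; delete poison messages instead of retrying forever."""
    msg = queue.popleft()
    msg.dequeue_count += 1
    if msg.dequeue_count > MAX_DEQUEUE_COUNT:
        return "deleted-as-poison"       # steps 12-13: give up on the message
    try:
        handler(msg.body)
        return "processed"               # normal case: handler succeeded
    except Exception:
        queue.append(msg)                # simulate visibility timeout expiring
        return "requeued"

q = collections.deque([Message("bad payload")])
results = []

def crashing_handler(body):
    raise RuntimeError("consumer crashed on " + body)

while q:
    results.append(process(q, crashing_handler))
print(results)  # ['requeued', 'requeued', 'deleted-as-poison']
```

Without the threshold, the crashing handler would receive the same message forever; with it, the bad message is removed after two failed attempts.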
Queues Recap
• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data, with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
Windows Azure Storage Takeaways
Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
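The fewer-larger vs. many-smaller trade-off can be made concrete. This sketch assumes a made-up per-core hourly price and a scaling exponent `alpha`; neither reflects actual Azure pricing or any measured workload.

```python
# Illustrative sketch of the VM-size trade-off above. Assume throughput
# scales as cores**alpha (alpha < 1 means sub-linear scaling across cores).
# The hourly price is a placeholder proportional to core count.
def cost_per_unit_throughput(cores, price_per_core_hour=0.12, alpha=0.85):
    price = cores * price_per_core_hour        # price grows linearly with cores
    throughput = cores ** alpha                # but throughput usually doesn't
    return price / throughput

small = cost_per_unit_throughput(1)
xlarge = cost_per_unit_throughput(8)
# With sub-linear scaling, one 8-core VM costs more per unit of work than
# eight 1-core VMs; only super-linear scaling (alpha > 1) favors larger VMs.
print(small < xlarge)  # True
```

This is exactly why the slide says to experiment: `alpha` is an empirical property of your workload, not something you can guess.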
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up the CPU vs. having free capacity in times of need
• Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
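The bullets above assume .NET's Task Parallel Library; as a language-neutral sketch of the same data-parallel idea, here is the pattern with a worker pool (Python used purely for illustration, not the deck's stack).

```python
# Language-neutral sketch of the data-parallelism bullet above; the deck
# uses .NET's Task Parallel Library, this shows the same shape in Python.
from concurrent.futures import ThreadPoolExecutor

def score(item):
    # stand-in for per-item work (e.g., one alignment, one tile)
    return item * item

items = list(range(8))
with ThreadPoolExecutor(max_workers=4) as pool:
    # data parallelism: the same operation applied to many items at once
    results = list(pool.map(score, items))
print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

Task parallelism is the complementary shape: different operations submitted as independent futures rather than one operation mapped over a collection.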
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, or storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure/poor user experience due to not having excess capacity and the cost of having idling VMs
Performance vs. Cost
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
• Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
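A quick way to see the payoff of point 1: repetitive markup compresses dramatically. The page content below is a made-up placeholder, but the effect holds for typical HTML/JSON output.

```python
# Quick sketch of the "Gzip all output content" point: compressing
# repetitive text (HTML/JSON-like output) shrinks bytes sent and stored.
import gzip

page = ("<tr><td>row</td></tr>\n" * 500).encode("utf-8")
compressed = gzip.compress(page)
print(len(page), len(compressed))
assert len(compressed) < len(page) // 10   # highly repetitive markup compresses well
```

The same gzip output can usually be stored and served as-is, so the compute cost is paid once while the bandwidth saving is paid back on every download.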
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700~1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large-volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multi-core capability of one instance
• Argument "-a" of NCBI BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI BLAST overhead, data-transfer overhead)
• Best practice: test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
Splitting task → BLAST task, BLAST task, BLAST task, BLAST task, … → Merging task
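The split/join task flow above can be sketched as follows; `blast_one_partition` is a hypothetical stand-in for invoking NCBI BLAST on one partition, not AzureBLAST's actual code.

```python
# Sketch of AzureBLAST's query-segmentation split/join pattern:
# split the input sequences, query partitions in parallel, merge results.
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size):
    """Splitting task: fixed-size partitions of the input queries."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_one_partition(partition):
    """Hypothetical stand-in for one BLAST task over a partition."""
    return [f"hit:{seq}" for seq in partition]

def merge(partial_results):
    """Merging task: concatenate per-partition hit lists."""
    return [hit for part in partial_results for hit in part]

sequences = [f"seq{i}" for i in range(10)]
partitions = split(sequences, partition_size=3)
with ThreadPoolExecutor() as pool:
    hits = merge(pool.map(blast_one_partition, partitions))
print(len(partitions), len(hits))  # 4 10
```

The `partition_size` knob is exactly the task-granularity trade-off above: too large causes load imbalance, too small pays per-task overhead repeatedly.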
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size/instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
Worker, Worker, Worker, …
Global dispatch queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
…
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry / NCBI databases
Splitting task → BLAST task, BLAST task, BLAST task, BLAST task, … → Merging task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
BLASTed ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
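The single-desktop estimate above checks out arithmetically:

```python
# Sanity check of the single-desktop estimate quoted on the slide.
minutes = 3216731
years = minutes / (60 * 24 * 365)   # minutes per year = 525,600
print(round(years, 1))  # 6.1
```

So the "6.1 years" figure follows directly from the sampled per-task run time scaled to the full comparison count.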
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM) across four data centers: US (2), Western Europe, and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances occur, redistribute the load manually
Instances per deployment: 50, 62, 62, 62, 62, 62, 50, 62
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6~8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
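The anomaly detection described here, spotting a task that starts but never logs completion, can be sketched with a few lines of parsing; the excerpt reuses the log lines from the slide.

```python
# Sketch of the log analysis described above: pair "Executing" lines with
# their "is done" lines and flag tasks that never completed (e.g., lost to
# an upgrade or storage failure). Lines follow the slide's log format.
import re

log = """3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins"""

started, finished = set(), set()
for line in log.splitlines():
    m = re.search(r"Executing the task (\d+)", line)
    if m:
        started.add(m.group(1))
    m = re.search(r"Execution of task (\d+) is done", line)
    if m:
        finished.add(m.group(1))

lost = sorted(started - finished)
print(lost)  # ['251774']: started but never completed, so something went wrong
```

Applied across all 62 nodes' logs, this kind of pairing is what revealed the update-domain and fault-domain behavior shown on the preceding slides.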
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
61
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
11
21
2 GetMessage(Q 30 s) msg 23 C2 consumed msg 24 DeleteMessage(Q msg 2)7 GetMessage(Q 30 s) msg 1
1 GetMessage(Q 30 s) msg 15 C1 crashed
11
21
6 msg1 visible 30 s after Dequeue30
12
11
12
62
C1
C2
Removing Poison Messages
340
Producers Consumers
P2
P1
12
2 Dequeue(Q 30 sec) msg 23 C2 consumed msg 24 Delete(Q msg 2)7 Dequeue(Q 30 sec) msg 18 C2 crashed
1 Dequeue(Q 30 sec) msg 15 C1 crashed10 C1 restarted11 Dequeue(Q 30 sec) msg 112 DequeueCount gt 213 Delete (Q msg1)1
2
6 msg1 visible 30s after Dequeue9 msg1 visible 30s after Dequeue
30
13
12
13
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
  • Data parallelism
  • Task parallelism
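The Task Parallel Library is a .NET facility; the same two patterns can be sketched in Python with `concurrent.futures` (an analogue, not the TPL itself). `map` over a collection is data parallelism; `submit`-ing unrelated tasks is task parallelism:

```python
from concurrent.futures import ThreadPoolExecutor

# Data parallelism: apply the same function to many items in parallel.
def word_count(doc):
    return len(doc.split())

docs = ["the quick brown fox", "jumps over", "the lazy dog"]
with ThreadPoolExecutor(max_workers=4) as pool:
    counts = list(pool.map(word_count, docs))          # one task per document

# Task parallelism: run different kinds of work concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    total_future = pool.submit(sum, counts)            # task A: total word count
    longest_future = pool.submit(max, docs, key=len)   # task B: longest document
total, longest = total_future.result(), longest_future.result()
```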
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure and poor user experience from not having excess capacity against the cost of idling VMs (performance vs. cost)
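A scaling policy based on queue length (the "use message count to scale" guidance from the queues recap) can be sketched in a few lines. The parameters here are illustrative assumptions, not recommended values:

```python
import math

def target_instances(queue_length, msgs_per_instance_per_min, drain_minutes,
                     min_instances=2, max_instances=20):
    """How many workers to run so the current backlog drains within drain_minutes.
    The floor keeps availability headroom; the ceiling controls cost."""
    needed = math.ceil(queue_length / (msgs_per_instance_per_min * drain_minutes))
    return max(min_instances, min(max_instances, needed))
```

Because VMs take minutes to boot, a real policy would also dampen oscillation (e.g., scale down only after the queue has stayed short for several sampling intervals).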
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
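The flat-fee vs. per-transaction choice is easy to quantify. The prices below are assumptions in the style of 2010-era rate cards (check current pricing before relying on them); the point is that transaction volume, not data size, can dominate the bill:

```python
# Hypothetical prices (assumptions, not an official rate card):
SQL_AZURE_FLAT = 9.99        # $/month, flat fee for a small database
TABLES_STORAGE = 0.15        # $/GB/month for Windows Azure Tables
TABLES_PER_10K_TXN = 0.01    # $ per 10,000 storage transactions

def monthly_table_cost(gb_stored, transactions):
    """Monthly Windows Azure Tables cost: storage plus per-transaction charges."""
    return gb_stored * TABLES_STORAGE + (transactions / 10_000) * TABLES_PER_10K_TXN

# A chatty app: only 1 GB of data, but 500M transactions per month.
chatty = monthly_table_cost(1, 500_000_000)
# A quiet app: the same 1 GB, but 1M transactions per month.
quiet = monthly_table_cost(1, 1_000_000)
```

Under these assumed prices the chatty profile costs hundreds of dollars on Tables while the flat fee stays constant, and the quiet profile is far cheaper on Tables; the same workload shape flips the right answer.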
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile.
Sending fewer things over the wire often means getting fewer things from storage.
Saving bandwidth costs often leads to savings in other places: sending fewer things means your VM has time to do other tasks.
All of these tips have the side benefit of improving your web app's performance and user experience.
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
[Chart: uncompressed vs. compressed content size – Gzip/minify JavaScript, minify CSS, minify images]
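Gzip's benefit on typical web payloads is easy to demonstrate with the standard library; repetitive JSON or HTML routinely shrinks by a large factor:

```python
import gzip
import json

# A typical JSON API response: repetitive field names compress very well.
payload = json.dumps(
    [{"id": i, "status": "ok", "region": "us-north"} for i in range(500)]
).encode("utf-8")

compressed = gzip.compress(payload)
ratio = len(compressed) / len(payload)   # fraction of original size after gzip
```

Every byte saved here is saved three times over: in storage reads, in bandwidth billed, and in time the VM spends serving the response.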
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large-volume data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud
Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern.
Leverage the multiple cores of one instance:
• Argument "-a" of NCBI-BLAST
• 1/2/4/8 for small, medium, large, and extra-large instance sizes
Task granularity:
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
[Diagram: splitting task → BLAST tasks run in parallel → merging task]
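The split/join pattern in the diagram can be sketched as follows. This is a schematic, not the AzureBLAST code: `blast_partition` is a stand-in for invoking NCBI-BLAST on one partition, and a thread pool stands in for the queue-fed worker instances:

```python
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size):
    """Splitting task: cut the input query sequences into fixed-size partitions."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    # Stand-in for running NCBI-BLAST on one partition of query sequences;
    # the real engine shells out to the BLAST executable against the database.
    return [(seq, f"hit-for-{seq}") for seq in partition]

def merge(per_partition_results):
    """Merging task: concatenate per-partition results in input order."""
    merged = []
    for part in per_partition_results:
        merged.extend(part)
    return merged

seqs = [f"seq{i}" for i in range(10)]
partitions = split(seqs, 3)                      # 4 partitions: 3 + 3 + 3 + 1
with ThreadPoolExecutor() as pool:               # partitions queried in parallel
    results = list(pool.map(blast_partition, partitions))
hits = merge(results)
```

The `partition_size` argument is exactly the granularity knob discussed above: too large and workers idle while one finishes, too small and per-invocation overhead dominates.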
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size/instance size vs. cost:
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role (web portal, web service, job registration, job scheduler, scaling engine) feeds a global dispatch queue consumed by worker instances; a Job Management Role and a Database-updating Role run alongside; Azure Table holds the job registry; Azure Blob holds the NCBI databases, BLAST databases, temporary data, etc.]
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
Authentication/authorization is based on Live ID.
The accepted job is stored into the job registry table:
• Fault tolerance – avoid in-memory state
[Diagram: the job portal in the Web Role, alongside the web service, job registration, job scheduler, scaling engine, and job registry]
Demonstration
R. palustris as a platform for H2 production (Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query:
• The database is also the input query
• The protein database is large (42 GB in size)
• In total, 9,865,668 sequences to be queried
• Theoretically 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances appear, redistribute the load manually
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by Analyzing Logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
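Detecting that anomaly (a task that started but never logged a "done" record) is a simple set difference over the log. A minimal sketch, assuming the record format shown on the slide:

```python
import re

LOG = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

def incomplete_tasks(log_text):
    """Return task IDs that were started but never logged a matching 'done' record."""
    started, finished = set(), set()
    for line in log_text.splitlines():
        if m := re.search(r"Executing the task (\d+)", line):
            started.add(m.group(1))
        elif m := re.search(r"Execution of task (\d+) is done", line):
            finished.add(m.group(1))
    return sorted(started - finished)
```

Run over the full job logs, this immediately surfaces which instances lost work, and when, which is how the upgrade- and storage-failure patterns below were found.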
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost tasks and then came back in groups – this is an update domain:
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed.
35 nodes experienced blob-writing failures at the same time.
A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m^3 s^-1 m^-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K^-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m^-2)
cp = specific heat capacity of air (J kg^-1 K^-1)
ρa = dry air density (kg m^-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s^-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s^-1)
γ = psychrometric constant (γ ≈ 66 Pa K^-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
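The Penman-Monteith equation itself is a one-liner once the inputs are in hand; the hard part, as the slide says, is estimating the conductivities. A direct transcription, with the slide's γ ≈ 66 Pa/K and a standard λv ≈ 2450 J/g as default values (the sample inputs in the usage are illustrative, not field data):

```python
def penman_monteith(delta, rn, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith (1964) evapotranspiration.
    delta: d(sat. specific humidity)/dT (Pa/K); rn: net radiation (W/m^2);
    rho_a: dry air density (kg/m^3); c_p: specific heat of air (J/(kg K));
    dq: vapor pressure deficit (Pa); g_a, g_s: air / stomatal conductivity (m/s);
    gamma: psychrometric constant (Pa/K); lambda_v: latent heat (J/g)."""
    numerator = delta * rn + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative mid-latitude daytime values (assumed, for demonstration only):
et = penman_monteith(delta=145.0, rn=400.0, rho_a=1.2, c_p=1013.0,
                     dq=1000.0, g_a=0.01, g_s=0.005)
```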
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists interact with the AzureMODIS service web role portal; a request queue feeds a download queue into the data collection stage (pulling from the source imagery download sites and source metadata), then a reprojection queue into the reprojection stage, then reduction 1 and reduction 2 queues into the derivation and analysis reduction stages, producing science results for scientific-results download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The MODISAzure service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → persist to <PipelineStage> Job Queue → Service Monitor (Worker Role) → parse & persist <PipelineStage> Task Status → dispatch to <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role (GenericWorker)
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses & persists <PipelineStage> Task Status and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue tasks and read <Input> Data Storage]
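The GenericWorker's retry policy can be sketched as a small state function. This is a schematic of the described behavior, not the MODISAzure code; the status strings and `MAX_ATTEMPTS` constant are illustrative:

```python
MAX_ATTEMPTS = 3  # matches the "retries failed tasks 3 times" policy above

def run_task(task, attempts_so_far, do_work):
    """Attempt one task. Returns (status, attempts): 'done' on success,
    'retry' to requeue a transient failure, 'failed' once the retry
    budget is exhausted (status would be persisted to the Tables)."""
    try:
        do_work(task)
        return ("done", attempts_so_far + 1)
    except Exception:
        attempts = attempts_so_far + 1
        if attempts >= MAX_ATTEMPTS:
            return ("failed", attempts)   # give up; record terminal failure
        return ("retry", attempts)        # task becomes visible on the queue again
```

Combined with per-task status rows in Tables, this makes each task a recoverable unit of work: a lost instance simply means its in-flight task reappears and is retried elsewhere.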
Example Pipeline Stage: Reprojection Service
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
[Diagram: a Reprojection Request reaches the Service Monitor (Worker Role), which persists ReprojectionJobStatus via the Job Queue, parses & persists ReprojectionTaskStatus, and dispatches to the Task Queue; GenericWorker (Worker Role) instances consume the tasks, referencing ScanTimeList, SwathGranuleMeta, Swath Source Data Storage, and Reprojection Data Storage]
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Stage-by-stage (approximate):
• Data collection: 400–500 GB, 60K files, 10 MB/sec; 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection: 400 GB, 45K files; 3,500 hours, 20–100 workers; $420 CPU, $60 download
• Derivation reduction: 5–7 GB, 55K files; 1,800 hours, 20–100 workers; $216 CPU, $1 download, $6 storage
• Analysis reduction: <10 GB, ~1K files; 1,800 hours, 20–100 workers; $216 CPU, $2 download, $9 storage
Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: "Channel 9 Windows Azure"
Bing: "Windows Azure Platform Training Kit – November Update"
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Queues Recap
bullNo need to deal with failuresMake messageprocessing idempotent
bull Invisible messages result in out of orderDo not rely on order
bullEnforce threshold on messagersquos dequeue countUse Dequeue count to remove poison messages
bullMessages gt 8KBbullBatch messagesbullGarbage collect orphaned blobs
bullDynamically increasereduce workers
Use blob to storemessage data with
reference in message
Use message countto scale
bullNo need to deal with failures
bull Invisible messages result in out of order
bullEnforce threshold on messagersquos dequeue count
bullDynamically increasereduce workers
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (1/2)

• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables

(Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

(Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue, which GenericWorker (Worker Role) instances consume, reading from <Input>Data Storage.)
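The dequeue-and-retry behavior can be sketched as a loop; this is a local stand-in (the real GenericWorker would rely on Azure queue visibility timeouts and the message's dequeue count, and the names here are ours):

```python
import queue

MAX_ATTEMPTS = 3  # failed tasks are retried 3 times

def run_worker(task_queue, handler, attempts, status):
    """Drain the task queue, re-enqueueing failures up to MAX_ATTEMPTS."""
    while True:
        try:
            task = task_queue.get_nowait()
        except queue.Empty:
            return
        attempts[task] = attempts.get(task, 0) + 1
        try:
            handler(task)
            status[task] = "done"       # task status persisted (in Tables)
        except Exception:
            if attempts[task] < MAX_ATTEMPTS:
                task_queue.put(task)    # make visible again: retried later
            else:
                status[task] = "failed"
```

Because the task, not the worker, carries the retry budget, any idle worker can pick up a failed task, which is what makes the pipeline resilient to node loss.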
Example Pipeline Stage: Reprojection Service

(Diagram: a Reprojection Request reaches the Service Monitor (Worker Role) via the Job Queue; each entity in ReprojectionJobStatus specifies a single reprojection job request, and each entity in ReprojectionTaskStatus specifies a single reprojection task, i.e. a single tile. The monitor persists both and dispatches to the Task Queue consumed by GenericWorker (Worker Role) instances. Workers use Reprojection Data Storage: query the SwathGranuleMeta table for geo-metadata (e.g. boundaries) of each swath tile, and the ScanTimeList table for the list of satellite scan times that cover a target tile; inputs come from Swath Source Data Storage.)
Costs for 1 US Year ET Computation

• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage figures (from the pipeline diagram):
• Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 CPU, $60 download
• Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 CPU, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience

• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November 2010 Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research, Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure: Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
Queues Recap

• No need to deal with failures: make message processing idempotent
• Invisible messages result in out-of-order delivery: do not rely on order
• Enforce a threshold on a message's dequeue count: use the dequeue count to remove poison messages
• Messages > 8 KB: use a blob to store the message data with a reference in the message; batch messages; garbage-collect orphaned blobs
• Dynamically increase/reduce workers: use the message count to scale
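The first three rules combine into a small handler; a sketch with illustrative names (in practice the idempotency record would be a table or blob marker, and the dequeue count comes from the queue service):

```python
POISON_THRESHOLD = 3

def handle_message(msg_id, dequeue_count, work, processed, dead_letter):
    """Idempotent processing plus poison-message removal by dequeue count."""
    if dequeue_count > POISON_THRESHOLD:
        dead_letter.append(msg_id)      # pull the poison message aside
        return "poisoned"
    if msg_id in processed:
        return "duplicate"              # redelivered message: safe to skip
    work(msg_id)                        # side effects happen at most once
    processed.add(msg_id)
    return "processed"
```

Idempotency is what makes at-least-once delivery safe: a message that reappears after a worker crash is simply skipped.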
Windows Azure Storage Takeaways

Data abstractions to build your applications:
• Blobs – files and large objects
• Drives – NTFS APIs for migrating applications
• Tables – massively scalable structured storage
• Queues – reliable delivery of messages

Easy to use via the Storage Client Library.

More info on Windows Azure Storage at:
http://blogs.msdn.com/windowsazurestorage
http://azurescope.cloudapp.net
Best Practices
Picking the Right VM Size

• Having the correct VM size can make a big difference in costs
• Fundamental choice – larger, fewer VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
  • Pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures needed to take your service down)
• Only real right answer – experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
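Once you have measurements, the "experiment and measure" advice reduces to simple arithmetic; a sketch with made-up prices and throughputs (not Azure's actual rates):

```python
import math

def cheapest_config(target_rate, per_core_rate, options):
    """Pick the cheapest (cores, count) configuration meeting target_rate.

    options: (cores, hourly_price, efficiency) triples, where efficiency < 1
    models sub-linear scaling across cores (rarely linear across 8).
    """
    best = None
    for cores, price, eff in options:
        per_vm = per_core_rate * cores * eff
        count = math.ceil(target_rate / per_vm)
        cost = count * price
        if best is None or cost < best[0]:
            best = (cost, cores, count)
    return best
```

Plug in your own measured per-core throughput and scaling efficiency; the answer flips between sizes as efficiency drops, which is exactly why measurement beats guessing.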
Using Your VM to the Maximum

Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?

• Common mistake: splitting code into multiple roles, each not using up its CPU
• Balance between using up CPU and keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency

• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
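The two TPL flavors map onto any language's pool abstractions; a Python sketch (threads are used so the example stays in one process; CPU-bound work in CPython would use a process pool instead):

```python
from concurrent.futures import ThreadPoolExecutor

def word_count(doc):
    return len(doc.split())

# Data parallelism: the same operation applied across a collection.
docs = ["a b c", "d e", "f"]
with ThreadPoolExecutor() as pool:
    counts = list(pool.map(word_count, docs))

# Task parallelism: independent, dissimilar units of work run concurrently.
with ThreadPoolExecutor(max_workers=2) as pool:
    total = pool.submit(sum, range(100))
    biggest = pool.submit(max, [4, 9, 2])
    results = (total.result(), biggest.result())
```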
Finding Good Code Neighbors

• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately

• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade-off between the risk of failure or poor user experience from not having excess capacity, and the cost of having idling VMs (performance vs. cost)
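The queue backlog ("use message count to scale") is the usual control signal for this trade-off; a sketch of such a controller with illustrative bounds:

```python
import math

def target_workers(queue_length, msgs_per_worker_min, floor=1, ceiling=20):
    """Size the worker pool from the queue backlog.

    floor/ceiling are illustrative. A real controller should also damp
    scale-down: VMs take minutes to start, so shedding them too eagerly
    hurts user experience when load returns.
    """
    wanted = math.ceil(queue_length / msgs_per_worker_min)
    return max(floor, min(ceiling, wanted))
```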
Storage Costs

• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs

• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage: saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content

1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs

(Chart: content size, uncompressed vs. compressed – gzip/minified JavaScript, minified CSS, minified images.)
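Point 1's payoff is easy to demonstrate: markup is repetitive, so gzip trades a little CPU for a large reduction in wire size:

```python
import gzip

html = b"<html><body>" + b"<p>hello, world</p>" * 200 + b"</body></html>"
compressed = gzip.compress(html)

# Repetitive markup compresses extremely well; the browser decompresses
# on the fly, so only the bytes on the wire (and the bill) shrink.
ratio = len(compressed) / len(html)
```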
Best Practices Summary

• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences

Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing

It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing

Large-volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST

• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud", in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
AzureBLAST Task-Flow: a simple Split/Join pattern

Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes

Task granularity
• Large partitions: load imbalance
• Small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead

Value of the visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long waiting period in case of instance failure

(Diagram: a Splitting task fans out to BLAST tasks, which feed a Merging Task.)
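The split/join pattern itself is a few lines; a sketch (function names are ours, and the 100-sequence partition size echoes the micro-benchmark finding reported below):

```python
def split_queries(sequences, partition_size=100):
    """Query segmentation: fixed-size partitions become independent
    BLAST tasks, each enqueued for a worker."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge_results(per_partition_hits):
    """Join step: concatenate per-partition hit lists in partition order."""
    merged = []
    for hits in per_partition_hits:
        merged.extend(hits)
    return merged
```

The partition size is the tuning knob the slide describes: larger partitions risk load imbalance, smaller ones pay the per-task overhead more often.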
Micro-Benchmarks Inform Design

Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capability

Task size/instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST (2)

(Architecture diagram: a Web Portal and Web Service in a Web Role handle job registration; a Job Management Role contains the Job Scheduler and Scaling Engine, with job state in an Azure Table (the Job Registry); a global dispatch queue feeds the Worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a database-updating role refreshes the databases. A Splitting task fans BLAST tasks out to the workers, and a Merging Task joins the results.)
AzureBLAST Job Portal

An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory state

(Diagram: the Job Portal fronts the Web Portal and Web Service; job registration writes to the Job Registry, which the Job Scheduler and Scaling Engine consume.)
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

BLASTed ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
  • 475 extra-large VMs (8 cores per VM), in four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually

(Map: VMs per deployment – 50, 62, 62, 62, 62, 62, 50, 62.)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working-instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs

A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
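Spotting the bad case mechanically is a matter of pairing "Executing" lines with their "done" lines; a sketch over the format above:

```python
import re

def unfinished_tasks(log_lines):
    """Return task ids that logged a start but never a completion."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished
```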
Surviving System Upgrades

North Europe datacenter: a total of 34,256 tasks processed.

All 62 compute nodes lost tasks and then came back in a group – this is an update domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures

West Europe datacenter: 30,976 tasks were completed before the job was killed.

35 nodes experienced blob-writing failures at the same time. A reasonable guess: the fault domain is working.
Windows Azure Storage TakeawaysData abstractions to build your applications
Blobs ndash Files and large objectsDrives ndash NTFS APIs for migrating applicationsTables ndash Massively scalable structured storageQueues ndash Reliable delivery of messages
Easy to use via the Storage Client Library
More info on Windows Azure Storage at
httpblogsmsdncomwindowsazurestoragehttpazurescopecloudappnet
Best Practices
Picking the Right VM Size
bull Having the correct VM size can make a big difference in costs
bull Fundamental choice ndash larger fewer VMs vs many smaller instances
bull If you scale better than linear across cores larger VMs could save you money
bull Pretty rare to see linear scaling across 8 cores
bull More instances may provide better uptime and reliability (more failures needed to take your service down)
bull Only real right answer ndash experiment with multiple sizes and instance counts in order to measure and find what is ideal for you
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700-1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large-volume data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
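Input segmentation, the first parallelization strategy above, can be sketched in a few lines. This is an illustrative stand-in (the function and format handling are assumptions, not AzureBLAST code): split FASTA records into fixed-size partitions that independent workers can query in parallel.

```python
# Split FASTA text into partitions of at most N sequences each; each
# partition becomes one independent BLAST task.

def split_fasta(text, seqs_per_partition=100):
    records, current = [], []
    for line in text.splitlines():
        if line.startswith(">"):          # a header starts a new record
            if current:
                records.append("\n".join(current))
            current = [line]
        elif current:
            current.append(line)
    if current:
        records.append("\n".join(current))
    return ["\n".join(records[i:i + seqs_per_partition])
            for i in range(0, len(records), seqs_per_partition)]

fasta = "".join(f">seq{i}\nACGTACGT\n" for i in range(250))
parts = split_fasta(fasta)
print(len(parts))  # 250 sequences / 100 per partition -> 3 partitions
```

The default of 100 sequences per partition anticipates the micro-benchmark result reported later in this section.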
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model
• Web Role + Queue + Worker
• With three special considerations
• Batch job management
• Task parallelism on an elastic Cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple split/join pattern
Leverage the multi-core of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
• NCBI-BLAST overhead
• Data transfer overhead
• Best practice: use test runs to profile, and set the size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
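The visibilityTimeout trade-off above can be illustrated with the queue semantics alone. A toy sketch (not the Azure Queue API): a dequeued message stays invisible for the timeout, and if the worker has not deleted it by then, it reappears and gets processed again.

```python
# Toy model of queue visibility-timeout semantics.

def total_executions(task_runtime, visibility_timeout, worker_fails=False):
    """How many times a task ends up executing under simple retry rules."""
    if worker_fails:
        return 2  # message reappears after the timeout; another worker retries
    if visibility_timeout < task_runtime:
        # Timeout too small: the message becomes visible again mid-run,
        # so a second worker starts the same task -> repeated computation.
        return 2
    return 1

assert total_executions(task_runtime=20, visibility_timeout=30) == 1
assert total_executions(task_runtime=20, visibility_timeout=10) == 2  # too small
```

A too-large timeout never causes duplicate work, but after a real failure the message stays invisible for the whole period, which is the long-wait cost the slide mentions.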
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger-size worker instances
• Primarily due to the memory capability
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
(BLAST databases, temporary data, etc.)
Job Registry
NCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage); Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
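The desktop estimate above checks out with a one-line unit conversion:

```python
# Sanity check: 3,216,731 minutes of single-desktop compute, in years.
minutes = 3_216_731
years = minutes / (60 * 24 * 365)
print(round(years, 1))  # -> 6.1
```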
Our Approach
• Allocated ~4000 cores in total: 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
(Figure: the 8 deployments across datacenters; instance counts per deployment: 50, 62, 62, 62, 62, 62, 50, 62)
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
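The kind of log analysis described above reduces to pairing each "Executing" record with its "done" record and flagging the tasks that never completed. A minimal sketch, with the log format assumed from the samples on this slide:

```python
# Pair start/finish records per task and report tasks that never finished.
import re

log = """\
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started, finished = set(), set()
for line in log.splitlines():
    if m := re.search(r"Executing the task (\d+)", line):
        started.add(m.group(1))
    elif m := re.search(r"Execution of task (\d+) is done", line):
        finished.add(m.group(1))

print(sorted(started - finished))  # -> ['251774']  (never completed)
```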
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins
• ~6 nodes in one group

Surviving Storage Failures
West Europe Datacenter: 30,976 tasks completed, then the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" (Irish proverb)

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration (evaporation through plant membranes) by plants

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m^3 s^-1 m^-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K^-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m^-2)
cp = specific heat capacity of air (J kg^-1 K^-1)
ρa = dry air density (kg m^-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s^-1)
gs = conductivity of plant stoma (inverse of rs) (m s^-1)
γ = psychrometric constant (γ ≈ 66 Pa K^-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
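The Penman-Monteith relation transcribes directly into a function. The numeric inputs below are placeholder values chosen only to exercise the formula, not a validated example:

```python
# ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

def penman_monteith_et(delta, R_n, rho_a, c_p, dq, g_a, g_s, gamma, lambda_v):
    numerator = delta * R_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Placeholder inputs (assumptions, for illustration only):
et = penman_monteith_et(delta=145.0, R_n=400.0, rho_a=1.2, c_p=1013.0,
                        dq=1000.0, g_a=0.02, g_s=0.01, gamma=66.0,
                        lambda_v=2450.0)
print(et > 0)  # -> True
```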
ET Synthesizes Imagery, Sensors, Models, and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientists
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientists
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction Stage / Derivation Reduction Stage / Reprojection Stage
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues requests to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks (recoverable units of work)
• Execution status of all jobs and tasks is persisted in Tables

<PipelineStage> Request
…
<PipelineStage>JobStatus Persist
<PipelineStage>Job Queue
MODISAzure Service (Web Role)
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
…
Dispatch <PipelineStage>Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
Service Monitor (Worker Role)
Parse & Persist <PipelineStage>TaskStatus
GenericWorker (Worker Role)
…
Dispatch <PipelineStage>Task Queue
…
<Input>Data Storage
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
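The GenericWorker behavior above (dequeue, execute, retry up to 3 times, record status) can be sketched as a plain loop. The queue, executor, and status table here are stand-ins, not the Azure SDK:

```python
# Sketch of a dispatch loop with 3-attempt retry and persisted status.

MAX_ATTEMPTS = 3

def run_worker(task_queue, execute, status_table):
    """Drain the queue, recording task status; retry each task up to 3x."""
    while task_queue:
        task = task_queue.pop(0)
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                execute(task)
                status_table[task] = "done"
                break
            except Exception:
                status_table[task] = f"failed (attempt {attempt})"
        # After 3 failed attempts the last recorded status remains failed.

# Tiny demo: one task that always fails, one that succeeds.
status = {}
def execute(task):
    if task == "bad-tile":
        raise RuntimeError("reprojection error")

run_worker(["tile-42", "bad-tile"], execute, status)
print(status)  # -> {'tile-42': 'done', 'bad-tile': 'failed (attempt 3)'}
```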
Example Pipeline Stage: Reprojection Service
Reprojection Request
…
Service Monitor (Worker Role)
ReprojectionJobStatus Persist
Parse & Persist ReprojectionTaskStatus
GenericWorker (Worker Role)
…
Job Queue
…
Dispatch
Task Queue
Points to
…
ScanTimeList
SwathGranuleMeta
Reprojection Data Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (i.e., a single tile)
Query this table to get geo-metadata (e.g., boundaries) for each swath tile
Query this table to get the list of satellite scan times that cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns," but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope (2)
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Best Practices

Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice: fewer, larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer: experiment with multiple sizes and instance counts to measure and find what is ideal for you
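The "experiment and measure" advice reduces to comparing cost per unit of work across measured configurations. A minimal sketch; the prices and throughput numbers below are placeholders, not actual Azure rates:

```python
# Compare cost efficiency of two measured configurations.

def cost_per_unit(total_hourly_price, total_units_per_hour):
    return total_hourly_price / total_units_per_hour

xl = cost_per_unit(0.96, 900)              # one 8-core XL, measured 900 units/hr
smalls = cost_per_unit(8 * 0.12, 8 * 100)  # eight 1-core Smalls, 100 units/hr each

print(xl < smalls)  # -> True: the XL wins here (it scaled super-linearly)
```

If the measured XL throughput had been under 800 units/hour (linear or worse), the eight Smalls would win on cost, plus the reliability benefit of more instances.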
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance != one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake: splitting up code into multiple roles, each not using up its CPU
• Balance between using up CPU and having free capacity in times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports will let the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
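The data-parallelism idea above can be shown in a language-neutral sketch: fan a CPU-bound function out across worker processes so all cores stay busy (the .NET analogue the slide names is the Task Parallel Library; the workload here is an invented stand-in).

```python
# Fan a CPU-bound function across a process pool (one worker per core).
from concurrent.futures import ProcessPoolExecutor

def score(n):
    """Stand-in for a CPU-bound task (e.g., one alignment or one tile)."""
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    inputs = [10_000] * 8
    with ProcessPoolExecutor() as pool:          # defaults to one worker per core
        results = list(pool.map(score, inputs))  # data parallelism
    print(len(results))  # -> 8
```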
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Picking the Right VM Size
• Having the correct VM size can make a big difference in costs
• Fundamental choice – fewer larger VMs vs. many smaller instances
• If you scale better than linearly across cores, larger VMs could save you money
• It is pretty rare to see linear scaling across 8 cores
• More instances may provide better uptime and reliability (more failures are needed to take your service down)
• The only real right answer – experiment with multiple sizes and instance counts to measure and find what is ideal for you
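The arithmetic behind that advice can be sketched quickly. The snippet below uses made-up hourly rates and a hypothetical 85% per-core scaling efficiency (not Azure's actual prices) to show why sub-linear scaling can make several small instances cheaper per unit of work than one large one:

```python
# Illustrative sketch (not Azure pricing): compare one 8-core VM against
# eight 1-core VMs when per-core scaling efficiency is sub-linear.
def throughput(cores: int, efficiency: float) -> float:
    """Relative throughput of one VM where each added core contributes
    `efficiency` times the work of the previous one."""
    return sum(efficiency ** i for i in range(cores))

def cost_per_unit_work(hourly_rate: float, cores: int, efficiency: float) -> float:
    return hourly_rate / throughput(cores, efficiency)

# Hypothetical rates: an 8-core VM billed at 8x the 1-core rate.
small = cost_per_unit_work(hourly_rate=0.12, cores=1, efficiency=1.0)
large = cost_per_unit_work(hourly_rate=0.96, cores=8, efficiency=0.85)

# With 85% per-core efficiency, the small instances do more work per dollar.
print(small < large)  # True
```

Swap in your own measured efficiency and rates; the crossover point is what the slide's "experiment and measure" advice is about.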
Using Your VM to the Maximum
Remember:
• 1 role instance == 1 VM running Windows
• 1 role instance ≠ one specific task for your code
• You're paying for the entire VM, so why not use it?
• Common mistake – splitting code into multiple roles, each not using up its CPU
• Balance using up CPU vs. keeping free capacity for times of need
• There are multiple ways to use your CPU to the fullest
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
• May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
• In networking code, correct usage of NT I/O Completion Ports lets the kernel schedule the precise number of threads
• In .NET 4, use the Task Parallel Library
• Data parallelism
• Task parallelism
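The two styles named above can be illustrated with a small stand-in; the deck targets .NET 4's Task Parallel Library, but the same split is sketched here with Python's standard thread pool:

```python
# Data vs. task parallelism, sketched with Python's stdlib thread pool
# (a stand-in for the .NET 4 Task Parallel Library the slide refers to).
from concurrent.futures import ThreadPoolExecutor

def data_parallel(items):
    """Data parallelism: the same operation applied to every element."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(lambda x: x * x, items))

def task_parallel():
    """Task parallelism: independent, dissimilar tasks run concurrently."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        f1 = pool.submit(sum, range(10))        # one task: aggregate
        f2 = pool.submit(max, [3, 1, 4, 1, 5])  # another: search
        return f1.result(), f2.result()

print(data_parallel([1, 2, 3]))  # [1, 4, 9]
print(task_parallel())           # (45, 5)
```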
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive in different resources to live together
• Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• Trade off the risk of failure or poor user experience from lacking excess capacity against the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
• E.g. SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
Pipeline: uncompressed content → Gzip, minify JavaScript, minify CSS, minify images → compressed content
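As a concrete illustration of point 1, here is stdlib gzip applied to repetitive HTML-like output; the sample page and the 10x ratio are just this example's content, not a general guarantee:

```python
# Gzip-compressing repetitive text output shrinks both storage and
# bandwidth; exact ratios depend entirely on the content.
import gzip

page = (b"<html><body>" + b"<div class='row'>hello azure</div>" * 200
        + b"</body></html>")
packed = gzip.compress(page)

print(len(page), len(packed))
# This highly repetitive page compresses to well under a tenth of its size.
print(len(packed) < len(page) // 10)  # True
```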
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile inside and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive:
• Large number of pairwise alignment operations
• A BLAST run can take 700–1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST:
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g. mpiBLAST)
• Needs special result-reduction processing
Large-volume data:
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• A parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern:
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow
A simple Split/Join pattern:
• Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST: 1/2/4/8 for small, medium, large, and extra-large instance sizes
• Task granularity:
• Too large a partition: load imbalance
• Too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data-transfer overhead)
• Best practice: use test runs to profile, then set the partition size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task:
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: an unnecessarily long wait in case of instance failure
Diagram: splitting task → BLAST tasks (in parallel) → merging task
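The Split/Join pattern above can be sketched in a few lines; `blast_partition` below is a toy stand-in for invoking NCBI-BLAST on one partition, and all names are illustrative:

```python
# Split/Join: split the input into fixed-size partitions, process each
# independently, then merge the per-partition results.
from concurrent.futures import ThreadPoolExecutor

def split(sequences, partition_size):
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def blast_partition(partition):
    # Stand-in for running NCBI-BLAST on one partition of query sequences.
    return [f"hit:{seq}" for seq in partition]

def split_join(sequences, partition_size=2):
    parts = split(sequences, partition_size)          # splitting task
    with ThreadPoolExecutor() as pool:
        results = pool.map(blast_partition, parts)    # parallel BLAST tasks
    return [hit for part in results for hit in part]  # merging task

print(split_join(["s1", "s2", "s3", "s4", "s5"]))
# ['hit:s1', 'hit:s2', 'hit:s3', 'hit:s4', 'hit:s5']
```

The `partition_size` knob is exactly the task-granularity trade-off the slide describes.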
Micro-Benchmarks Inform Design
Task size vs. performance:
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance:
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost:
• The extra-large instance generated the best and most economical throughput
• Fully utilizes the resources
AzureBLAST (architecture)
• Web Role – web portal, web service, job registration, job scheduler
• Job Management Role – scaling engine, feeding a global dispatch queue
• Database updating Role
• Workers – execute the splitting task, the parallel BLAST tasks, and the merging task
• Azure Table – job registry, NCBI databases
• Azure Blob – BLAST databases, temporary data, etc.
AzureBLAST Job Portal
An ASP.NET program hosted by a web role instance:
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
• Fault tolerance: avoid in-memory states
(Components shown: web portal, web service, job registration, job scheduler, scaling engine, job registry)
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5000 proteins (700K sequences):
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs:
• Discover the interrelationships of known protein sequences
"All against All" query:
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation:
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (~6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists.
Our Approach
• Allocated a total of ~4000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When the load is imbalanced, redistribute it manually
[Map figure: per-deployment worker counts of 50–62 across the four datacenters]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record is a matched pair of "Executing" and "is done" entries; otherwise something is wrong (e.g. the task failed to complete):

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

An abnormal trace – no "is done" record ever appears for task 251774:

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
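The check described above is mechanical to automate. A minimal sketch (the log field layout is assumed from the reconstructed samples shown):

```python
# Pair each "Executing" record with its "is done" record and flag tasks
# that never completed.
import re

def incomplete_tasks(log_lines):
    started, done = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            done.add(m.group(1))
    return sorted(started - done)

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
print(incomplete_tasks(log))  # ['251774']
```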
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
• All 62 compute nodes lost tasks and then came back in a group – this is an Update Domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed before the job was killed
• 35 nodes experienced blob-writing failures at the same time
• A reasonable guess: the Fault Domain is at work
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" – Irish proverb

Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs; big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
• 20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage:
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage:
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage:
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage:
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
Pipeline flow: Scientists → AzureMODIS Service Web Role Portal → Request Queue → Download Queue → Data Collection Stage (pulls from Source Imagery Download Sites, records Source Metadata) → Reprojection Queue → Reprojection Stage → Reduction 1 Queue → Derivation Reduction Stage → Reduction 2 Queue → Analysis Reduction Stage → Science results → Scientific Results Download
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service (Web Role) is the front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks – recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables
Flow: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage> JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a GenericWorker (Worker Role):
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
Flow: Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role) ↔ <Input> Data Storage
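The retry rule can be sketched against queue semantics like Azure's, where an undeleted message reappears after its visibility timeout and carries a dequeue count. All names here are illustrative, not the MODISAzure code:

```python
# One worker iteration: process a dequeued task message; shelve it as a
# poison task once it has already been attempted MAX_RETRIES times.
MAX_RETRIES = 3

def handle(message, run_task, dead_letter):
    """Returns True if the message should be deleted from the queue."""
    if message["dequeue_count"] > MAX_RETRIES:
        dead_letter.append(message)   # poison task: record and stop retrying
        return True                   # delete so it never reappears
    try:
        run_task(message["task_id"])
        return True                   # success: delete the message
    except Exception:
        return False                  # leave it; it reappears after the timeout

dead = []
ok = handle({"task_id": 7, "dequeue_count": 1}, run_task=lambda t: None, dead_letter=dead)
print(ok, dead)  # True []
```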
Example Pipeline Stage: Reprojection Service
Flow: Reprojection Request → Service Monitor (Worker Role) → Persist ReprojectionJobStatus → Job Queue (each entity specifies a single reprojection job request) → Parse & Persist ReprojectionTaskStatus → Dispatch → Task Queue (each entity specifies a single reprojection task, i.e. a single tile) → GenericWorker (Worker Role) → Reprojection Data Storage
• ScanTimeList – query this table to get the list of satellite scan times that cover a target tile
• SwathGranuleMeta – query this table to get geo-metadata (e.g. boundaries) for each swath tile
• Swath Source Data Storage holds the source swath imagery
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reduction multiple times
• Storage costs are driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
Per-stage figures (from the pipeline diagram; scientists submit via the AzureMODIS Service Web Role Portal):
• Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers – $50 upload, $450 storage
• Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers – $420 cpu, $60 download
• Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers – $216 cpu, $1 download, $6 storage
• Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers – $216 cpu, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Using Your VM to the MaximumRememberbull 1 role instance == 1 VM running Windowsbull 1 role instance = one specific task for your codebull Yoursquore paying for the entire VM so why not use it
bull Common mistake ndash split up code into multiple roles each not using up CPU
bull Balance between using up CPU vs having free capacity in times of needbull Multiple ways to use your CPU to the fullest
Exploiting Concurrencybull Spin up additional processes each with a specific task or as a
unit of concurrency
bull May not be ideal if number of active processes exceeds number of cores
bull Use multithreading aggressively
bull In networking code correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
bull In NET 4 use the Task Parallel Library
bull Data parallelism
bull Task parallelism
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb

Computing Evapotranspiration (ET)
Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs; big data reduction
• Some of the inputs are not so simple

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.
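The slide gives only the formula; a direct transcription into code is below. This is a sketch: the default γ and λv values and the demo magnitudes are illustrative, not a validated parameterization.

```python
def evapotranspiration(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """Penman-Monteith, term for term as on the slide:
    ET = (Delta*Rn + rho_a*c_p*dq*g_a) / ((Delta + gamma*(1 + g_a/g_s)) * lambda_v)
    gamma ~ 66 Pa/K; lambda_v ~ 2450 J/g (latent heat of vaporization)."""
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Demo with made-up magnitudes, just to exercise the formula:
et = evapotranspiration(delta=1.0, r_n=100.0, rho_a=1.2, c_p=1005.0,
                        dq=1000.0, g_a=0.01, g_s=0.01)
```

The pipeline's job is producing trustworthy inputs (especially ga and gs) for every cell of the catchment; the arithmetic itself is the easy part.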
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
[Pipeline diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal into the Request Queue; the Download Queue feeds the Data Collection Stage, which pulls from Source Imagery Download Sites and records Source Metadata; the Reprojection Queue feeds the Reprojection Stage; the Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; science results flow back to Scientists via Scientific Results Download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage> JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
• All work actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses & persists <PipelineStage> TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances pull tasks from the queue and read <Input> Data Storage]
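The GenericWorker loop just described (dequeue, execute, retry up to 3 times, persist status) can be sketched as follows; the in-memory list and dict are stand-ins for the real Task Queue and Azure Tables:

```python
MAX_RETRIES = 3   # matches "retries failed tasks 3 times" above

def run_worker(task_queue, execute, status_table):
    """Dequeue each task, execute it, and on failure re-enqueue it up to
    MAX_RETRIES times, persisting status after every attempt (the real
    service persists status to Azure Tables, not a dict)."""
    while task_queue:
        task = task_queue.pop(0)                 # stand-in for a queue dequeue
        try:
            execute(task)
            status_table[task["id"]] = "done"
        except Exception:
            task["retries"] = task.get("retries", 0) + 1
            if task["retries"] <= MAX_RETRIES:
                status_table[task["id"]] = "retrying"
                task_queue.append(task)          # make it visible again
            else:
                status_table[task["id"]] = "failed"

attempts, status = [], {}
run_worker([{"id": 42}], lambda t: attempts.append(t) or 1 / 0, status)
print(status, len(attempts))   # {42: 'failed'} 4  (1 attempt + 3 retries)
```

Persisting status on every transition is what makes a job resumable after the kinds of node losses shown in the upgrade and storage-failure logs earlier.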
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses & persists ReprojectionTaskStatus, and dispatches to the Task Queue; GenericWorker (Worker Role) instances execute tasks against Reprojection Data Storage and Swath Source Data Storage, with the ScanTimeList and SwathGranuleMeta tables pointing to the data]
• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

[Annotated pipeline diagram; figures paired with stages in the order they appear on the slide:
Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 cpu, $60 download
Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 cpu, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 cpu, $2 download, $9 storage
Total: $1420]
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premise compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit – November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research (Roger Barga, Architect)
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Exploiting Concurrency
• Spin up additional processes, each with a specific task or as a unit of concurrency
  • May not be ideal if the number of active processes exceeds the number of cores
• Use multithreading aggressively
  • In networking code, correct usage of NT IO Completion Ports will let the kernel schedule the precise number of threads
  • In .NET 4, use the Task Parallel Library
    • Data parallelism
    • Task parallelism
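For illustration, here is the data-parallelism vs. task-parallelism distinction sketched with Python's concurrent.futures; the deck's actual recommendation is the .NET 4 Task Parallel Library, so treat this only as an analogous example:

```python
from concurrent.futures import ThreadPoolExecutor

def word_lengths(words):
    """Data parallelism: one operation mapped over a collection
    (analogous to Parallel.ForEach / PLINQ in the .NET 4 TPL)."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(len, words))

def independent_tasks():
    """Task parallelism: distinct operations running concurrently
    (analogous to starting independent Tasks in the TPL)."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        total = pool.submit(sum, range(100))     # one task...
        peak = pool.submit(max, range(100))      # ...and an unrelated one
        return total.result(), peak.result()

print(word_lengths(["azure", "worker", "role"]))   # [5, 6, 4]
print(independent_tasks())                         # (4950, 99)
```

Either style keeps all cores of a large VM busy without spinning up extra processes.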
Finding Good Code Neighbors
• Typically code falls into one or more of these categories: memory-intensive, CPU-intensive, network I/O-intensive, storage I/O-intensive
• Find code that is intensive with different resources to live together
  • Example: distributed network caches are typically network- and memory-intensive; they may be a good neighbor for storage I/O-intensive code
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
  • Spinning VMs up and down automatically is good at large scale
  • Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
  • Being too aggressive in spinning down VMs can result in poor user experience
• Trade off the risk of failure or poor user experience from lacking excess capacity against the cost of idling VMs (performance vs. cost)
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile
  • E.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
  • Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage
  • Saving bandwidth costs often leads to savings in other places
• Sending fewer things means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
[Diagram: Uncompressed Content → Gzip, Minify JavaScript, Minify CSS, Minify Images → Compressed Content]
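A quick way to see the storage-for-compute trade-off from point 2: compress a repetitive payload and compare sizes (the payload here is illustrative; real savings depend entirely on your content):

```python
import gzip

def gzip_ratio(payload: bytes) -> float:
    """Return compressed/original size: CPU spent compressing buys
    smaller storage and less bandwidth."""
    return len(gzip.compress(payload)) / len(payload)

# Markup is repetitive, so it compresses extremely well.
html = b"<div class='row'><span>cell</span></div>" * 1000
print(f"{gzip_ratio(html):.3f}")   # a small fraction of the original size
```

Since both storage and bandwidth are billed by the byte, a ratio like this compounds across every read and every response served.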
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700~1000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the general suggested application model
  • Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow
A simple split/join pattern
• Leverage the multi-core capability of one instance
  • Argument "-a" of NCBI-BLAST
  • 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
• Task granularity
  • Large partition: load imbalance
  • Small partition: unnecessary overheads (NCBI-BLAST overhead, data transfer overhead)
  • Best practice: do test runs to profile, and set the partition size to mitigate the overhead
• Value of visibilityTimeout for each BLAST task
  • Essentially an estimate of the task run time
  • Too small: repeated computation
  • Too large: unnecessarily long wait in case of instance failure
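The visibilityTimeout trade-off is easiest to see in a toy model of the queue semantics; this class is a simplified stand-in for Azure Queues, with an explicit tick clock instead of wall time:

```python
class VisibilityQueue:
    """Toy model of Azure Queue visibility-timeout semantics: a dequeued
    message is hidden for `timeout` ticks and reappears unless deleted."""
    def __init__(self):
        self.visible = []      # messages ready to be dequeued
        self.hidden = []       # (reappear_tick, message) pairs
        self.clock = 0

    def put(self, msg):
        self.visible.append(msg)

    def get(self, timeout):
        self.clock += 1        # one tick per poll, for simplicity
        # Messages whose visibility timeout expired become visible again.
        due = [m for t, m in self.hidden if t <= self.clock]
        self.hidden = [(t, m) for t, m in self.hidden if t > self.clock]
        self.visible.extend(due)
        if not self.visible:
            return None
        msg = self.visible.pop(0)
        self.hidden.append((self.clock + timeout, msg))
        return msg

    def delete(self, msg):
        # A worker deletes the message only after finishing the task.
        self.hidden = [(t, m) for t, m in self.hidden if m != msg]

q = VisibilityQueue()
q.put("blast-task-1")
first = q.get(timeout=2)    # a worker picks up the task
unseen = q.get(timeout=2)   # hidden: no other worker can take it
back = q.get(timeout=2)     # worker died without delete(): it reappears
print(first, unseen, back)  # blast-task-1 None blast-task-1
```

A timeout shorter than the real run time makes `back` reappear while the first worker is still computing (repeated work); a much longer one delays recovery after a genuine failure.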
[Diagram: Splitting task → BLAST task × N (in parallel) → Merging task]
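The split/join pattern above, reduced to its skeleton (the uppercase step is a stand-in for actually running BLAST on a partition):

```python
def split(sequences, partition_size):
    """Split stage: carve the input into fixed-size partitions
    (100 sequences per partition was the benchmark sweet spot)."""
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

def merge(partial_results):
    """Join stage: concatenate per-partition results in order."""
    return [hit for part in partial_results for hit in part]

seqs = [f"seq{i}" for i in range(250)]
parts = split(seqs, 100)                            # partitions of 100, 100, 50
partials = [[s.upper() for s in p] for p in parts]  # stand-in for running BLAST
merged = merge(partials)
print(len(parts), len(merged))   # 3 250
```

In AzureBLAST each partition becomes one queued task, so partition size directly sets the task granularity discussed above.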
Micro-Benchmarks Inform Design
• Task size vs. performance
  • Benefit of the warm-cache effect
  • 100 sequences per partition is the best choice
• Instance size vs. performance
  • Super-linear speedup with larger-size worker instances
  • Primarily due to the memory capability
• Task size / instance size vs. cost
  • Extra-large instances generated the best and the most economical throughput
  • Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine, tracking work in Azure Tables (Job Registry) and dispatching through a global dispatch queue to Worker instances; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a Database Updating Role refreshes the NCBI databases. Work follows the split/join pattern: Splitting task → BLAST tasks → Merging task.]
AzureBLAST Job Portal
• ASP.NET program hosted by a web role instance
  • Submit jobs
  • Track a job's status and logs
• Authentication/authorization based on Live ID
• The accepted job is stored into the job registry table
  • Fault tolerance: avoid in-memory states
[Diagram: Job Portal → Web Service (job registration) → Job Registry; the Job Scheduler and Scaling Engine consume registered jobs]
Demonstration
R. palustris as a Platform for H2 Production
Eric Shadt (SAGE), Sam Phattarasukol (Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment
Discovering homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4000 cores
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When the load imbalances, redistribute it manually
[Diagram: instance counts across the 8 deployments – 50, 62, 62, 62, 62, 62, 50, 62]
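The segment/partition layout described above can be planned with simple ceiling arithmetic. A sketch: the sequence count, deployment count, and 100-sequence partition size come from the slides; everything else is illustrative:

```python
def plan_jobs(num_sequences, num_deployments, partition_size):
    """Divide the input into one segment (job) per deployment; each
    segment is made of fixed-size partitions (tasks)."""
    per_deployment = -(-num_sequences // num_deployments)   # ceiling division
    jobs = []
    for d in range(num_deployments):
        start = d * per_deployment
        count = min(per_deployment, num_sequences - start)
        if count <= 0:
            break
        jobs.append({"deployment": d,
                     "sequences": count,
                     "tasks": -(-count // partition_size)})
    return jobs

jobs = plan_jobs(9_865_668, 8, 100)   # figures from the slides above
print(len(jobs))                      # 8
```

Each job then runs against its own deployment's co-located storage, which is what keeps the aggregate storage bandwidth from becoming the bottleneck.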
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Finding Good Code Neighborsbull Typically code falls into one or more of these categories
bull Find code that is intensive with different resources to live togetherbull Example distributed network caches are typically network-
and memory-intensive they may be a good neighbor for storage IO-intensive code
MemoryIntensive
CPUIntensive
Network IO Intensive Storage IO Intensive
Scaling Appropriatelybull Monitor your application and make sure yoursquore scaled appropriately (not
over-scaled)
bull Spinning VMs up and down automatically is good at large scale
bull Remember that VMs take a few minutes to come up and cost ~$3 a day (give or take) to keep running
bull Being too aggressive in spinning down VMs can result in poor user experience
bull Trade-off between risk of failurepoor user experience due to not having excess capacity and the costs of having idling VMs
Performance Cost
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
[Diagram: a splitting task fans out to parallel BLAST tasks, which feed a merging task.]
Micro-Benchmarks Inform Design

Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the larger memory capacity

Task size and instance size vs. cost
• Extra-large instances generated the best and the most economical throughput
• Fully utilize the resource
AzureBLAST

[Architecture diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; an Azure Table holds the Job Registry; a global dispatch queue feeds Worker instances that run the splitting task, the parallel BLAST tasks, and the merging task; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc.; a Database Updating Role keeps the databases current.]
AzureBLAST Job Portal

An ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
• Authentication/authorization based on Live ID
• An accepted job is stored in the job registry table, for fault tolerance (avoid in-memory state)
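The fault-tolerance point (persist the accepted job first, keep no in-memory state) can be sketched as follows; `JobRegistry` and its status values are illustrative stand-ins for the Azure table, not the AzureBLAST schema:

```python
class JobRegistry:
    """Toy stand-in for the Azure table backing the job portal.

    The slide's point: persist the accepted job *before* anything else,
    so a role restart loses no in-memory state.
    """
    def __init__(self):
        self.rows = {}                    # job_id -> status

    def register(self, job_id):
        self.rows[job_id] = "Accepted"    # the durable write comes first

    def mark(self, job_id, status):
        self.rows[job_id] = status

    def pending(self):
        # After a restart, the scheduler rebuilds its work list from here.
        return [j for j, s in self.rows.items() if s == "Accepted"]

registry = JobRegistry()
registry.register("job-1")
registry.register("job-2")
registry.mark("job-1", "Running")
# A freshly restarted scheduler picks up only job-2.
```

Because every state transition lands in the table, any web or worker instance can crash and restart without losing accepted jobs.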
[Diagram: the Web Portal and Web Service (job registration) front the Job Scheduler and Scaling Engine in the Job Portal, backed by the Job Registry table.]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage); Sam Phattarasukol (Harwood Lab, UW)

Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time.
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop

Experiments at this scale are usually infeasible for most scientists.
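The arithmetic behind the single-desktop estimate is worth making explicit (the slide's figure reads as 6.1 years once the stripped decimal point is restored):

```python
minutes = 3_216_731                 # sampled single-desktop estimate from the slide
years = minutes / (60 * 24 * 365)   # minutes in a (non-leap) year
print(f"{years:.1f} years")         # ~6.1 years on one desktop
```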
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST; each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments; each segment was submitted to one deployment as one job for execution; each segment consists of smaller partitions
• When load imbalances appeared, redistributed the load manually

[Chart: extra-large VMs allocated per deployment, 50 or 62 each.]
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• Based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place

[Chart: extra-large VMs allocated per deployment, 50 or 62 each.]
Understanding Azure by analyzing logs

A normal log record looks like this, with each "Executing" entry followed by a matching "done" entry:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., a task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
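Pairing "Executing" records with their "done" records mechanizes this check. A small parser (the regexes assume the log format shown above) flags tasks that never completed:

```python
import re

def unfinished_tasks(log_lines):
    """Return sorted task ids that have an 'Executing' record but no
    matching 'done' record, i.e. tasks that never completed."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
print(unfinished_tasks(log))   # ['251774']: the task that was never completed
```

Running this over the full job logs is how the upgrade and storage incidents on the next slides were spotted.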
Surviving System Upgrades

North Europe datacenter: 34,256 tasks processed in total.
All 62 compute nodes lost their tasks and then came back in groups (~6 nodes per group, ~30 mins): this is an update domain at work.

Surviving Storage Failures

West Europe datacenter: 30,976 tasks were completed before the job was killed.
35 nodes experienced blob-write failures at the same time. A reasonable guess: the fault domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where:
ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests through the AzureMODIS Service web role portal; a request queue feeds the download, reprojection, reduction 1, and reduction 2 queues; source imagery is pulled from download sites and flows through the data collection, reprojection, derivation reduction, and analysis reduction stages; science results are returned to scientists for download.]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The ModisAzure Service is the Web Role front door: it receives all user requests and queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role: it parses all job requests into tasks (recoverable units of work) and persists the execution status of all jobs and tasks in Tables

[Diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role (the GenericWorker)
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue tasks and read/write <Input> Data Storage.]
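The GenericWorker's dequeue-and-retry behavior can be sketched with an in-memory queue; names and structure here are illustrative, not the MODISAzure source:

```python
MAX_RETRIES = 3   # the GenericWorker retries a failed task 3 times

def run_worker(task_queue, execute, task_status):
    """Toy GenericWorker loop: dequeue, execute, retry up to MAX_RETRIES,
    then record the final status."""
    while task_queue:
        task = task_queue.pop(0)
        entry = task_status.setdefault(task, {"attempts": 0, "state": "Queued"})
        try:
            execute(task)
            entry["state"] = "Done"
        except Exception:
            entry["attempts"] += 1
            if entry["attempts"] <= MAX_RETRIES:
                task_queue.append(task)      # requeue for another try
            else:
                entry["state"] = "Failed"    # give up after 3 retries

status = {}
calls = {"t1": 0}
def flaky(task):                             # fails twice, then succeeds
    calls[task] += 1
    if calls[task] < 3:
        raise RuntimeError("transient failure")

run_worker(["t1"], flaky, status)
print(status["t1"]["state"])                 # Done: succeeded on the third attempt
```

Keeping the retry count in the task-status record, rather than in worker memory, mirrors the architecture's rule that all task status lives in durable Tables.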
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request enters the Job Queue, where each entity specifies a single reprojection job request; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue, where each entity specifies a single reprojection task (i.e., a single tile); GenericWorker (Worker Role) instances query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile and the ScanTimeList table for the list of satellite scan times that cover a target tile, then read Swath Source Data Storage and write Reprojection Data Storage.]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage | Data and work | Cost
Data collection | 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers | $50 upload, $450 storage
Reprojection | 400 GB, 45K files, 3500 hours, 20-100 workers | $420 CPU, $60 download
Derivation reduction | 5-7 GB, 55K files, 1800 hours, 20-100 workers | $216 CPU, $1 download, $6 storage
Analysis reduction | <10 GB, ~1K files, 1800 hours, 20-100 workers | $216 CPU, $2 download, $9 storage

Total: $1,420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit (November 2010 Update)
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Scaling Appropriately
• Monitor your application and make sure you're scaled appropriately (not over-scaled)
• Spinning VMs up and down automatically is good at large scale
• Remember that VMs take a few minutes to come up, and cost ~$3 a day (give or take) to keep running
• Being too aggressive in spinning down VMs can result in a poor user experience
• There is a trade-off between the risk of failure or a poor user experience from not having excess capacity, and the cost of keeping VMs idling

Performance vs. cost
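A sketch of the trade-off, using the slide's rough $3/day figure; the 30-minute cooldown threshold is an illustrative choice, not a recommendation:

```python
VM_COST_PER_DAY = 3.0        # the slide's rough figure, give or take

def idle_cost(num_idle_vms, days):
    """Dollars spent keeping spare capacity running."""
    return num_idle_vms * VM_COST_PER_DAY * days

def should_scale_down(idle_minutes, cooldown_minutes=30):
    """Only spin a VM down after a sustained idle period: VMs take a few
    minutes to come back, so aggressive scale-down hurts user experience."""
    return idle_minutes >= cooldown_minutes

print(idle_cost(5, 30))       # 450.0: five spare VMs for a month
print(should_scale_down(10))  # False: too soon, keep the buffer
```

Putting a number on the idle side of the trade-off makes it easier to decide how much excess capacity the risk of a bad user experience is worth.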
Storage Costs
• Understand an application's storage profile and how storage billing works
• Make service choices based on your app profile: e.g., SQL Azure has a flat fee, while Windows Azure Tables charges per transaction
• Service choice can make a big cost difference based on your app profile
• Caching and compressing: they help a lot with storage costs
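A back-of-the-envelope comparison makes the point; the prices below are illustrative 2010-era assumptions, so check the current billing pages before relying on them:

```python
# Illustrative 2010-era prices (assumptions, not current billing):
SQL_AZURE_FLAT_PER_MONTH = 9.99   # flat monthly fee for a small SQL Azure database
TABLE_PRICE_PER_10K_TX = 0.01     # Windows Azure Tables per-transaction charge

def monthly_table_cost(transactions_per_month):
    return transactions_per_month / 10_000 * TABLE_PRICE_PER_10K_TX

def cheaper_service(transactions_per_month):
    """Pick the cheaper storage service for a given transaction volume."""
    if monthly_table_cost(transactions_per_month) < SQL_AZURE_FLAT_PER_MONTH:
        return "tables"
    return "sql"

print(cheaper_service(1_000_000))    # tables: $1/month in transactions
print(cheaper_service(20_000_000))   # sql: $20/month in transactions exceeds the flat fee
```

The crossover point depends entirely on the app's transaction profile, which is why measuring that profile comes first.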
Saving Bandwidth Costs
• Bandwidth costs are a huge part of any popular web app's billing profile
• Sending fewer things over the wire often means getting fewer things from storage, so saving bandwidth costs often leads to savings in other places
• Sending fewer things also means your VM has time to do other tasks
• All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content

1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms

2. Trade off compute costs for storage size

3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs

[Chart: uncompressed vs. compressed content; gzip, minified JavaScript, minified CSS, and minified images each shrink the payload.]
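A minimal sketch of point 1 using Python's standard gzip module; the payload here is synthetic:

```python
import gzip

# A synthetic, repetitive HTML payload, the kind gzip shrinks dramatically.
html = b"<html><body>" + b"<p>hello cloud</p>" * 500 + b"</body></html>"
compressed = gzip.compress(html)

print(len(html), "->", len(compressed))
# Browsers decompress on the fly, so serve the small version and save bandwidth.
```

Highly repetitive markup compresses extremely well; real pages see smaller but still substantial savings, paid for with a little CPU time on the VM.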
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Storage Costs
bullUnderstand an applicationrsquos storage profile and how storage billing works
bullMake service choices based on your app profilebull Eg SQL Azure has a flat fee while Windows Azure Tables charges per
transaction
bull Service choice can make a big cost difference based on your app profile
bull Caching and compressing They help a lot with storage costs
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web app's billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often leads to savings in other places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web app's performance and user experience
Compressing Content
1. Gzip all output content
   • All modern browsers can decompress on the fly
   • Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
   • Use Portable Network Graphics (PNGs)
   • Crush your PNGs
   • Strip needless metadata
   • Make all PNGs palette PNGs
[Diagram: Uncompressed Content → Gzip, Minify JavaScript, Minify CSS, Minify Images → Compressed Content]
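The gzip step is easy to try from any language; a small Python sketch (the sample HTML body is invented for illustration):

```python
import gzip

# Gzip-compress a response body before sending it over the wire.
# Repetitive text (HTML, JSON, JavaScript) typically shrinks dramatically.
body = (b"<html><body>" + b"<div class='row'>hello azure</div>" * 500
        + b"</body></html>")

compressed = gzip.compress(body)
ratio = len(compressed) / len(body)
print(f"{len(body)} bytes -> {len(compressed)} bytes ({ratio:.1%})")

# The receiving browser decompresses transparently when the response
# carries the header: Content-Encoding: gzip
assert gzip.decompress(compressed) == body
```

Fewer bytes stored and sent is the whole point: the same compression pays off on the storage bill and the bandwidth bill at once.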
Best Practices Summary
Doing 'less' is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A single BLAST run can take 700-1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
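Query segmentation amounts to chunking the input into independently queryable pieces. A toy Python sketch (the `partition` helper and the fake sequence list are illustrative, not AzureBLAST code; the partition size of 100 echoes the micro-benchmark result quoted later):

```python
# Query segmentation in miniature: split an input set of sequences into
# fixed-size partitions that can be BLASTed independently and in parallel.

def partition(sequences, size=100):
    """Yield successive partitions of at most `size` sequences."""
    for i in range(0, len(sequences), size):
        yield sequences[i:i + size]

queries = [f">seq{i}" for i in range(250)]   # toy stand-in for FASTA records
parts = list(partition(queries))

print([len(p) for p in parts])   # three partitions: 100, 100, 50
```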
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • split the input sequences
  • query partitions in parallel
  • merge results together when done
• Follows the generally suggested application model
  • Web Role + Queue + Worker
• With special considerations
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
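The Web Role + Queue + Worker shape can be sketched in a few lines. Here Python's in-process `queue.Queue` stands in for an Azure storage queue and a string operation stands in for NCBI-BLAST; all names are hypothetical and this is only the shape of the pattern, not the AzureBLAST implementation:

```python
import queue

task_queue = queue.Queue()   # stand-in for the global dispatch queue
results = []

def web_role_submit(job_input, partitions):
    # web role: split the input and enqueue one task per partition
    for part in partitions(job_input):
        task_queue.put(part)

def worker_role():
    # worker role: dequeue and process tasks until the queue is drained
    while not task_queue.empty():
        part = task_queue.get()
        results.append(f"blast({part})")   # placeholder for real work
        task_queue.task_done()

web_role_submit("query.fasta", lambda j: [f"{j}.{i}" for i in range(4)])
worker_role()
print(results)
```

In the real pattern many worker instances poll the same durable queue concurrently, which is where the elasticity comes from.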
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance
• the "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
• Best practice: use test runs to profile, and set partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• too small: repeated computation
• too large: unnecessarily long wait in case of instance failure
[Diagram: Splitting task → BLAST tasks (many, in parallel) → Merging task]
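The visibilityTimeout trade-off above can be simulated: a dequeued message is hidden, not deleted, and reappears if the worker overruns the timeout. A self-contained sketch with a simulated clock (not the real Azure queue API; class and message names are invented):

```python
# Sketch of queue visibility-timeout semantics. If the worker doesn't
# delete the message within the timeout (timeout too small, or the
# instance died), the message becomes visible again -- the repeated
# computation vs. fault tolerance trade-off described above.

class SimQueue:
    def __init__(self):
        self.messages = {}   # message -> time it becomes visible again
        self.now = 0.0       # simulated clock

    def put(self, msg):
        self.messages[msg] = 0.0   # visible immediately

    def get(self, visibility_timeout):
        for msg, visible_at in list(self.messages.items()):
            if visible_at <= self.now:
                # hide the message, don't remove it
                self.messages[msg] = self.now + visibility_timeout
                return msg
        return None

    def delete(self, msg):
        self.messages.pop(msg, None)

q = SimQueue()
q.put("blast-task-42")

msg = q.get(visibility_timeout=10)   # timeout set below the ~15s run time
q.now += 15                          # worker is still running past the timeout
duplicate = q.get(visibility_timeout=10)
print(msg, duplicate)                # the same task handed out twice
```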
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Diagram: Web Role (Web Portal, Web Service, Job registration) → Job Management Role (Job Scheduler, Scaling Engine) → global dispatch queue → Worker instances. Azure Table holds the Job Registry; Azure Blob holds the NCBI databases, BLAST databases, temporary data, etc.; a Database updating Role keeps the databases current. Task flow: Splitting task → BLAST tasks (in parallel) → Merging task]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory states
[Diagram: Job Portal (Web Portal, Web Service) → Job registration → Job Scheduler and Scaling Engine → Job Registry]
Demonstration
R. palustris as a platform for H2 production (Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW)
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
Our Approach
• Allocated a total of ~4,000 cores: 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divide the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[Diagram: VMs per deployment across the 8 deployments: six of 62 and two of 50]
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise, something is wrong (e.g., a task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
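Flagging abnormal records like these is a small scripting job: pair each "Executing" line with its "is done" line per task id, and anything unpaired never completed. A sketch (the log text is paraphrased from the excerpt above; the parsing approach is illustrative, not the tool the authors used):

```python
import re

log = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started, finished = set(), set()
for line in log.splitlines():
    if m := re.search(r"Executing the task (\d+)", line):
        started.add(m.group(1))
    elif m := re.search(r"Execution of task (\d+) is done", line):
        finished.add(m.group(1))

# tasks that were started but never reported completion
print(sorted(started - finished))
```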
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups (~6 nodes per group, ~30 mins apart): this is an update domain at work
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
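A numerical reading of the Penman-Monteith form above, with illustrative mid-latitude daytime inputs (none of these values come from the MODISAzure datasets; λv is taken in J/kg here so the result lands in kg m⁻² s⁻¹, i.e., mm of water depth per second):

```python
# Plug illustrative values into ET = (Δ·Rn + ρa·cp·δq·ga) /
# ((Δ + γ·(1 + ga/gs)) · λv). All inputs are assumptions for the sketch.

delta = 145.0     # Pa/K, slope of the saturation curve
R_n   = 400.0     # W/m^2, net radiation
rho_a = 1.2       # kg/m^3, dry air density
c_p   = 1013.0    # J/(kg K), specific heat capacity of air
dq    = 1000.0    # Pa, vapor pressure deficit
g_a   = 0.02      # m/s, aerodynamic conductivity
g_s   = 0.01      # m/s, stomatal conductivity
gamma = 66.0      # Pa/K, psychrometric constant
lam_v = 2.45e6    # J/kg, latent heat of vaporization

ET = (delta * R_n + rho_a * c_p * dq * g_a) / \
     ((delta + gamma * (1 + g_a / g_s)) * lam_v)

print(f"ET = {ET:.2e} kg m^-2 s^-1  (~{ET * 86400:.2f} mm/day equivalent)")
```

The point of the sketch is the shape of the computation: per pixel it is cheap; the cost is in assembling the many gridded inputs, which is why the pipeline is dominated by data reduction.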
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal (Request Queue). Download Queue → Data Collection Stage (pulling from Source Imagery Download Sites, guided by Source Metadata) → Reprojection Queue → Reprojection Stage → Reduction 1 Queue → Derivation Reduction Stage → Reduction 2 Queue → Analysis Reduction Stage → science results available for Scientific Results Download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks: recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: <PipelineStage> Request → MODISAzure Service (Web Role) → Persist <PipelineStage> JobStatus → <PipelineStage> Job Queue → Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: Service Monitor (Worker Role) → Parse & Persist <PipelineStage> TaskStatus → Dispatch → <PipelineStage> Task Queue → GenericWorker (Worker Role), which reads from <Input> Data Storage]
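The "retries failed tasks 3 times" behavior can be sketched as a dispatch loop that re-enqueues a failing task with an attempt count (hypothetical names and in-memory structures, not the GenericWorker source; a real queue service tracks the dequeue count for you):

```python
MAX_RETRIES = 3

def dispatch(tasks, run):
    """Run each task, re-enqueueing failures up to MAX_RETRIES attempts."""
    status = {}
    pending = [(t, 0) for t in tasks]   # (task, attempts so far)
    while pending:
        task, attempts = pending.pop(0)
        try:
            run(task)
            status[task] = "done"
        except Exception:
            if attempts + 1 < MAX_RETRIES:
                pending.append((task, attempts + 1))   # re-enqueue
            else:
                status[task] = "failed"   # give up: poison task
    return status

calls = {"t1": 0, "t2": 0}
def flaky(task):
    calls[task] += 1
    if task == "t2":
        raise RuntimeError("always fails")

result = dispatch(["t1", "t2"], flaky)
print(result)   # t1 done; t2 failed after 3 attempts
```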
Example Pipeline Stage: Reprojection Service
[Diagram: Reprojection Request → Service Monitor (Worker Role) → Persist ReprojectionJobStatus (each entity specifies a single reprojection job request) → Job Queue → Parse & Persist ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e., a single tile) → Dispatch → Task Queue → GenericWorker (Worker Role) → Reprojection Data Storage. The worker also points to: SwathGranuleMeta (query this table to get geo-metadata, e.g., boundaries, for each swath tile), ScanTimeList (query this table to get the list of satellite scan times that cover a target tile), and Swath Source Data Storage]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers → $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers → $420 cpu, $60 download
Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers → $216 cpu, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers → $216 cpu, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
Saving Bandwidth Costs
Bandwidth costs are a huge part of any popular web apprsquos billing profile
Sending fewer things over the wire often means getting fewer things from storage
Saving bandwidth costs often lead to savings inother places
Sending fewer things means your VM has time to do other tasks
All of these tips have the side benefit of improving your web apprsquos performance and user experience
Compressing Content
1Gzip all output content
bull All modern browsers can decompress on the flybull Compared to Compress Gzip has much better
compression and freedom from patented algorithms
2Tradeoff compute costs for storage size
3Minimize image sizesbull Use Portable Network Graphics (PNGs)bull Crush your PNGsbull Strip needless metadatabull Make all PNGs palette PNGs
Uncompressed Content
Compressed Content
GzipMinify JavaScript
Minify CCSMinify Images
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research; Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
Compressing Content
1. Gzip all output content
• All modern browsers can decompress on the fly
• Compared to Compress, Gzip has much better compression and freedom from patented algorithms
2. Trade off compute costs for storage size
3. Minimize image sizes
• Use Portable Network Graphics (PNGs)
• Crush your PNGs
• Strip needless metadata
• Make all PNGs palette PNGs
[Diagram: uncompressed content vs. compressed content – Gzip, minify JavaScript, minify CSS, minify images]
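Point 1 (and the compute-for-storage tradeoff in point 2) can be sketched with Python's standard gzip module; this is a generic illustration, not tied to any particular Azure SDK:

```python
import gzip

def gzip_payload(data: bytes) -> bytes:
    """Gzip-compress a response payload before storing or serving it.

    Browsers that send 'Accept-Encoding: gzip' decompress on the fly;
    remember to set 'Content-Encoding: gzip' on the response.
    """
    return gzip.compress(data, compresslevel=6)

html = b"<html>" + b"<p>repetitive markup compresses well</p>" * 200 + b"</html>"
packed = gzip_payload(html)
# Repetitive text shrinks dramatically: a little CPU (compute cost)
# buys a large saving in storage and bandwidth.
assert len(packed) < len(html)
assert gzip.decompress(packed) == html
```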
Best Practices Summary
• Doing 'less' is the key to saving costs
• Measure everything
• Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A BLAST run can take 700–1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
• Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
• Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• With 100 nodes, peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
• Split the input sequences
• Query partitions in parallel
• Merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With special considerations:
• Batch job management
• Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010.
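The split/query/merge pattern can be sketched generically; the FASTA handling and the 100-sequence partition size (chosen per the micro-benchmarks below) are illustrative assumptions, not AzureBLAST's actual code:

```python
def split_fasta(fasta_text: str, seqs_per_partition: int = 100):
    """Split a multi-sequence FASTA input into fixed-size partitions.

    Each partition becomes one BLAST task message on the dispatch
    queue; a merging task concatenates the per-partition results.
    """
    # A FASTA record starts at each '>' header line.
    records = [">" + r for r in fasta_text.split(">") if r.strip()]
    return [records[i:i + seqs_per_partition]
            for i in range(0, len(records), seqs_per_partition)]

# 250 input sequences split into partitions of at most 100 each.
fasta = "".join(f">seq{i}\nACGT\n" for i in range(250))
parts = split_fasta(fasta, seqs_per_partition=100)
assert [len(p) for p in parts] == [100, 100, 50]
```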
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multiple cores of one instance
• Argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads (NCBI-BLAST overhead, data-transfer overhead)
• Best practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of instance failure
[Task-flow diagram: a splitting task fans out to parallel BLAST tasks, which feed a merging task]
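The visibilityTimeout tradeoff above can be made concrete with a toy in-memory queue. This only simulates the semantics (a dequeued message stays invisible for the timeout, then reappears if not deleted); it is not Azure queue API code:

```python
import time

class ToyQueue:
    """Simulates queue visibility-timeout semantics."""
    def __init__(self):
        self._msgs = {}        # id -> [visible_at, body]
        self._next_id = 0

    def put(self, body):
        self._msgs[self._next_id] = [0.0, body]
        self._next_id += 1

    def get(self, visibility_timeout):
        now = time.monotonic()
        for mid, slot in self._msgs.items():
            if slot[0] <= now:
                slot[0] = now + visibility_timeout  # hide the message
                return mid, slot[1]
        return None

    def delete(self, mid):
        self._msgs.pop(mid, None)

q = ToyQueue()
q.put("blast-task-42")
mid, body = q.get(visibility_timeout=0.05)
# The worker "crashes" (never deletes); once the timeout elapses, the
# task becomes visible again and another worker can pick it up.
time.sleep(0.06)
assert q.get(visibility_timeout=0.05) is not None
```

Too small a timeout and a slow-but-healthy worker's task reappears and is computed twice; too large and a dead worker's task sits invisible for the whole period.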
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit from the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resources
AzureBLAST
[Architecture diagram: a Web Role (web portal and web service) registers jobs in a job registry (Azure Table); a Job Management Role (job scheduler and scaling engine) splits each job into BLAST tasks and places them on a global dispatch queue; workers consume the tasks; Azure Blob storage holds the NCBI/BLAST databases and temporary data; a database-updating role refreshes the NCBI databases; within a job, a splitting task fans out to parallel BLAST tasks followed by a merging task]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
[Diagram: the job portal sits in front of the web service, job registration, job scheduler, scaling engine, and job registry]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt (Sage); Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
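As a sanity check on that estimate, converting the minute count to years:

```python
# The slide's single-desktop estimate, converted from minutes to years.
minutes = 3_216_731
years = minutes / (60 * 24 * 365.25)   # minutes per Julian year
assert round(years, 1) == 6.1
```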
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When the load imbalances, redistribute it manually
End Result
• Total size of the output result is ~230 GB
• The number of total hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
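Pairing "Executing" records with their "done" records mechanically surfaces the anomaly; a minimal sketch over records like those above:

```python
import re

def unfinished_tasks(log_lines):
    """Return ids of tasks that started but never logged completion."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return started - finished

log = [
    "3/31/2010 8:22 RD00155D3611B0 Executing the task 251774",
    "3/31/2010 9:50 RD00155D3611B0 Executing the task 251895",
    "3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins",
]
# Task 251774 started but never finished on this node.
assert unfinished_tasks(log) == {"251774"}
```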
Surviving System Upgrades
North Europe datacenter: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in groups; this is an update domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures
West Europe datacenter: 30,976 tasks completed before the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration or evaporation through plant membranes by plants.
Penman-Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)
Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
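A direct transcription of the formula into code (variable names follow the definitions above; the default γ is the slide's 66 Pa K-1, the default λv is a nominal 2260 J/g, and the numbers in the sanity check are purely illustrative, not field data):

```python
def penman_monteith_et(delta, rn, rho_a, cp, dq, ga, gs,
                       gamma=66.0, lambda_v=2260.0):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv)."""
    return (delta * rn + rho_a * cp * dq * ga) / (
        (delta + gamma * (1.0 + ga / gs)) * lambda_v)

# Sanity check with toy values: with every input 1 except Rn = 2, and
# γ = λv = 1, the numerator is 1·2 + 1 = 3 and the denominator is
# (1 + 1·(1 + 1))·1 = 3, so ET = 1.
assert abs(penman_monteith_et(1, 2, 1, 1, 1, 1, 1,
                              gamma=1.0, lambda_v=1.0) - 1.0) < 1e-12
```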
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure Four Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Pipeline diagram: scientists submit requests via the AzureMODIS Service web role portal; a request queue feeds the download, reprojection, reduction 1, and reduction 2 queues driving the data collection, reprojection, derivation reduction, and analysis reduction stages; source imagery comes from NASA download sites, and scientists download the science results]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction job queue
• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks (recoverable units of work)
• Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> request enters the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> job queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> task queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: GenericWorker (Worker Role) instances pull from the <PipelineStage> task queue dispatched by the Service Monitor and read/write <Input> data storage]
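The retry discipline can be sketched as a small dequeue-and-execute loop; the task body and failure handling here are illustrative stand-ins, not MODISAzure's actual worker code:

```python
MAX_RETRIES = 3

def run_task(task, execute):
    """Run one dequeued task; retry up to MAX_RETRIES times, then
    record it as permanently failed (its status row stays in Tables)."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return ("succeeded", execute(task), attempt)
        except Exception:
            if attempt == MAX_RETRIES:
                return ("failed", None, attempt)
            # Otherwise fall through and retry the same task.

# A task that fails twice with a transient error, then succeeds.
calls = {"n": 0}
def flaky(task):
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("transient blob failure")
    return "reprojected tile"

status, result, attempts = run_task("tile-17", flaky)
assert (status, result, attempts) == ("succeeded", "reprojected tile", 3)
```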
Example Pipeline Stage: Reprojection Service
[Diagram: a reprojection request flows through the job queue to the Service Monitor (Worker Role), which persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the task queue consumed by GenericWorker (Worker Role) instances that read swath source data storage and write reprojection data storage]
• Each job-status entity specifies a single reprojection job request
• Each task-status entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Data collection stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
Reprojection stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
Derivation reduction stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
Analysis reduction stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault-tolerance and scalability abstractions
• Clouds can act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Best Practices Summary
Doing lsquolessrsquo is the key to saving costs
Measure everything
Know your application profile in and out
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Cloud Computing for eScience Applications
NCBI BLAST
BLAST (Basic Local Alignment Search Tool)
• The most important software in bioinformatics
• Identifies similarity between bio-sequences
Computationally intensive
• Large number of pairwise alignment operations
• A single BLAST run can take 700–1,000 CPU hours
• Sequence databases are growing exponentially
• GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result-reduction processing
Large volume of data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth could reach 1 TB
• The output of BLAST is usually 10–100x larger than the input
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation, data-parallel pattern
  • Split the input sequences
  • Query partitions in parallel
  • Merge results together when done
• Follows the generally suggested application model: Web Role + Queue + Worker
• With three special considerations:
  • Batch job management
  • Task parallelism on an elastic cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), ACM, 21 June 2010.
AzureBLAST Task-Flow: a simple split/join pattern
Leverage the multiple cores of one instance
• The "-a" argument of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Too-large partitions: load imbalance
• Too-small partitions: unnecessary overheads
  • NCBI-BLAST overhead
  • Data-transfer overhead
• Best practice: profile with test runs, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait in case of an instance failure
[Diagram: Splitting task → BLAST tasks in parallel → Merging task]
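The split/join pattern with a visibility timeout can be sketched in Python. This is an illustrative stand-in, not AzureBLAST's actual code: `VisibilityQueue` is a hypothetical in-memory substitute for an Azure queue, and the `.upper()` call stands in for running NCBI-BLAST on a partition.

```python
import time
from collections import deque

class VisibilityQueue:
    """Toy stand-in for an Azure queue: dequeued messages become invisible
    for `visibility_timeout` seconds, then reappear if never deleted."""
    def __init__(self, visibility_timeout=30.0):
        self.visible = deque()
        self.invisible = {}          # msg id -> (message, reappear time)
        self.timeout = visibility_timeout
        self._next_id = 0

    def put(self, msg):
        self.visible.append((self._next_id, msg))
        self._next_id += 1

    def get(self):
        now = time.monotonic()
        # Messages whose timeout expired become visible again (task presumed failed).
        for mid, (msg, t) in list(self.invisible.items()):
            if now >= t:
                del self.invisible[mid]
                self.visible.append((mid, msg))
        if not self.visible:
            return None
        mid, msg = self.visible.popleft()
        self.invisible[mid] = (msg, now + self.timeout)
        return mid, msg

    def delete(self, mid):
        self.invisible.pop(mid, None)   # acknowledge: message is gone for good

def split(sequences, partition_size):
    return [sequences[i:i + partition_size]
            for i in range(0, len(sequences), partition_size)]

# Splitting task: enqueue one BLAST task per partition.
queue = VisibilityQueue(visibility_timeout=60.0)
sequences = [f"seq{i}" for i in range(10)]
for part in split(sequences, partition_size=3):
    queue.put(part)

# Workers: dequeue, "BLAST" the partition, delete the message only on success,
# so a worker that dies mid-task lets the message reappear after the timeout.
partial_results = []
while (item := queue.get()) is not None:
    mid, part = item
    partial_results.append([s.upper() for s in part])  # stand-in for a BLAST run
    queue.delete(mid)

# Merging task: join the partial results in one place.
merged = [hit for part in partial_results for hit in part]
```

Deleting the message only after the work succeeds is what makes the timeout trade-off real: too short and a slow task is redelivered and recomputed; too long and a crashed worker's task sits invisible.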
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm-cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to the memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Architecture diagram: a Web Role hosts the Web Portal and Web Service (job registration, job scheduler); a Job Management Role runs the Scaling Engine and feeds a global dispatch queue consumed by Worker instances; a Database-updating Role keeps the NCBI databases current; Azure Tables hold the Job Registry; Azure Blob storage holds the NCBI databases, BLAST databases, temporary data, etc. Work flows as: Splitting task → BLAST tasks in parallel → Merging task.]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs
Authentication/authorization based on Live ID
The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
[Diagram: Job Portal → Web Service (job registration, job scheduler) → Scaling Engine, backed by the Job Registry.]
Demonstration
R. palustris as a platform for H2 production
Eric Schadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against all" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total, 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
Experiments at this scale are usually infeasible for most scientists
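The single-desktop estimate converts from minutes to years directly; a quick sanity check:

```python
minutes = 3_216_731
minutes_per_year = 60 * 24 * 365          # 525,600
years = minutes / minutes_per_year
print(round(years, 1))                    # about 6.1 years on one desktop
```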
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM), across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment is submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[Map: instance counts per deployment (50 or 62 instances each) across the four datacenters]
End Result
• Total size of the output is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6–8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should be:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., the task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
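The check described above, finding tasks that logged a start but never a completion, can be sketched as a small log scan (the regexes and sample lines are illustrative, matching the record format shown above):

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
"""

started, finished = set(), set()
for line in LOG.splitlines():
    if m := re.search(r"Executing the task (\d+)", line):
        started.add(m.group(1))
    elif m := re.search(r"Execution of task (\d+) is done", line):
        finished.add(m.group(1))

# Tasks that began but never completed, e.g. lost to a node failure or upgrade.
suspect = started - finished
print(sorted(suspect))  # → ['251774']
```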
Surviving System Upgrades
North Europe datacenter: in total 34,256 tasks processed.
[Chart: all 62 compute nodes lost tasks and then came back in groups of ~6 nodes at a time, each group offline for ~30 mins. Each group is an update domain.]
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed before the job was killed.
[Chart: 35 nodes experienced blob-writing failures at the same time.]
A reasonable guess: the fault domain is at work.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." (Irish proverb)
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies, and by transpiration, or evaporation through plant membranes, by plants.

Penman–Monteith (1964):
ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

ET = water volume evapotranspired (m³ s⁻¹ m⁻²)
Δ = rate of change of saturation specific humidity with air temperature (Pa K⁻¹)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m⁻²)
cp = specific heat capacity of air (J kg⁻¹ K⁻¹)
ρa = dry air density (kg m⁻³)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s⁻¹)
gs = conductivity of plant stoma, air (inverse of rs) (m s⁻¹)
γ = psychrometric constant (γ ≈ 66 Pa K⁻¹)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
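The Penman–Monteith formula translates directly into code. This is a sketch only: the default constants and the example input values below are illustrative assumptions, not values from the MODISAzure pipeline.

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """ET = (delta*Rn + rho_a*c_p*(dq)*g_a) / ((delta + gamma*(1 + g_a/g_s)) * lambda_v)
    Units follow the variable list above; gamma in Pa/K, lambda_v in J/g
    (2450 J/g is a typical value near 20 C, used here as an assumption)."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Made-up example inputs, chosen only to exercise the formula:
et = penman_monteith_et(delta=145.0, r_n=400.0, rho_a=1.2, c_p=1005.0,
                        dq=1000.0, g_a=0.02, g_s=0.01)
```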
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
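The four stages chain together as queue-driven map and reduce steps. A minimal sketch of that control flow, with plain in-memory queues and made-up function names standing in for the real services:

```python
from collections import deque

def run_pipeline(tile_requests):
    download_q = deque(tile_requests)
    reprojection_q, reduction1_q, reduction2_q = deque(), deque(), deque()

    # Data collection (map): fetch each requested tile.
    while download_q:
        tile = download_q.popleft()
        reprojection_q.append({"tile": tile, "raw": f"raw({tile})"})

    # Reprojection (map): convert each source tile to a sinusoidal tile.
    while reprojection_q:
        t = reprojection_q.popleft()
        reduction1_q.append({"tile": t["tile"],
                             "sinusoidal": f"reproj({t['raw']})"})

    # Derivation reduction: compute ET across all reprojected tiles.
    et = [f"ET({t['sinusoidal']})" for t in reduction1_q]
    reduction2_q.append(et)

    # Analysis reduction (optional): produce science artifacts such as maps.
    return {"artifact": f"map({len(reduction2_q[0])} tiles)"}

result = run_pipeline(["h08v05", "h09v05"])
```

The map stages fan out one message per tile, while the reduce stages consume a whole queue at once, which is why they appear as distinct queues in the architecture.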
[Diagram: scientists submit requests through the AzureMODIS Service Web Role Portal; requests flow from a Request Queue to a Download Queue (Data Collection Stage, pulling from source imagery download sites and source metadata), then to a Reprojection Queue (Reprojection Stage), then to Reduction 1 and Reduction 2 Queues (Derivation Reduction and Analysis Reduction Stages); scientific results are available for download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, the recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
[Diagram: a <PipelineStage> Request reaches the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a Worker Role
• The GenericWorker (Worker Role):
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue the tasks and read <Input>Data Storage.]
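The GenericWorker's policy of retrying a failed task up to 3 times before giving up can be sketched as follows. This is an illustrative stand-in, not the actual MODISAzure code; the dead-letter list and handler are hypothetical names.

```python
MAX_RETRIES = 3

def process_with_retries(task_queue, handler, dead_letter):
    """Pop tasks; on failure re-enqueue with an incremented retry count,
    and park tasks that failed MAX_RETRIES times (poison tasks)."""
    while task_queue:
        task = task_queue.pop(0)
        try:
            handler(task)
        except Exception:
            task["retries"] = task.get("retries", 0) + 1
            if task["retries"] >= MAX_RETRIES:
                dead_letter.append(task)    # give up; keep for inspection
            else:
                task_queue.append(task)     # retry later

# Example: a handler that always fails for one tile.
dead = []
queue = [{"tile": "h08v05"}, {"tile": "bad"}]

def handler(task):
    if task["tile"] == "bad":
        raise RuntimeError("reprojection failed")

process_with_retries(queue, handler, dead)
```

Capping retries is what keeps one bad tile from blocking the queue forever, while the recorded status makes the failure visible for later analysis.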
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request enters the Job Queue, where each entity specifies a single reprojection job request; the Service Monitor (Worker Role) persists ReprojectionJobStatus, then parses and persists ReprojectionTaskStatus and dispatches to the Task Queue, where each entity specifies a single reprojection task (i.e., a single tile); GenericWorker (Worker Role) instances read from Swath Source Data Storage and write Reprojection Data Storage. The SwathGranuleMeta table is queried for geo-metadata (e.g., boundaries) for each swath tile; the ScanTimeList table is queried for the list of satellite scan times that cover a target tile.]
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates
[Diagram: the pipeline stages annotated with per-stage resources and costs: 400–500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers ($50 upload, $450 storage); 400 GB, 45K files, 3500 hours, 20–100 workers ($420 CPU, $60 download); 5–7 GB, 55K files, 1800 hours, 20–100 workers ($216 CPU, $1 download, $6 storage); <10 GB, ~1K files, 1800 hours, 20–100 workers ($216 CPU, $2 download, $9 storage).]
Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault-tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
NCBI BLAST
BLAST (Basic Local Alignment Search Tool) bull The most important software in bioinformaticsbull Identify similarity between bio-sequences
Computationally intensivebull Large number of pairwise alignment operationsbull A BLAST running can take 700 ~ 1000 CPU hoursbull Sequence databases growing exponentiallybull GenBank doubled in size in about 15 months
Opportunities for Cloud Computing
It is easy to parallelize BLASTbull Segment the input bull Segment processing (querying) is pleasingly parallel
bull Segment the database (eg mpiBLAST)bull Needs special result reduction processing
Large volume databull A normal Blast database can be as large as 10GBbull 100 nodes means the peak storage bandwidth could reach
to 1TB
bull The output of BLAST is usually 10-100x larger than the input
AzureBLAST
bull Parallel BLAST engine on Azure
bull Query-segmentation data-parallel patternbull split the input sequencesbull query partitions in parallelbull merge results together when done
bull Follows the general suggested application model bull Web Role + Queue + Worker
bull With three special considerationsbull Batch job managementbull Task parallelism on an elastic CloudWei Lu Jared Jackson and Roger Barga AzureBlast A Case Study of Developing Science Applications on the Cloud in Proceedings of the 1st Workshop on Scientific
Cloud Computing (Science Cloud 2010) Association for Computing Machinery Inc 21 June 2010
AzureBLAST Task-FlowA simple SplitJoin pattern
Leverage multi-core of one instance bull argument ldquondashardquo of NCBI-BLASTbull 1248 for small middle large and extra large instance size
Task granularity bull Large partition load imbalance bull Small partition unnecessary overheadsbull NCBI-BLAST overheadbull Data transferring overhead
Best Practice test runs to profiling and set size to mitigate the overhead
Value of visibilityTimeout for each BLAST task bull Essentially an estimate of the task run time bull too small repeated computation bull too large unnecessary long period of waiting time in case of the instance failure
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Opportunities for Cloud Computing
It is easy to parallelize BLAST
• Segment the input
  • Segment processing (querying) is pleasingly parallel
• Segment the database (e.g., mpiBLAST)
  • Needs special result reduction processing
Large volume data
• A normal BLAST database can be as large as 10 GB
• 100 nodes means the peak storage bandwidth demand could reach 1 TB
• The output of BLAST is usually 10-100x larger than the input
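The query-segmentation step above can be sketched directly: split the input FASTA text into fixed-size partitions that independent workers can query in parallel (the 100-sequences-per-partition default echoes the micro-benchmark finding later in the deck). A minimal illustrative sketch, not the AzureBLAST code:

```python
def split_fasta(text, seqs_per_partition=100):
    """Split FASTA text into partitions of at most `seqs_per_partition`
    sequences; each partition can be BLASTed by an independent worker."""
    records, current = [], []
    for line in text.splitlines():
        # A FASTA record starts at a '>' header line.
        if line.startswith(">") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return ["\n".join(records[i:i + seqs_per_partition])
            for i in range(0, len(records), seqs_per_partition)]
```

The join step is then a concatenation of per-partition result files in partition order.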
AzureBLAST
• Parallel BLAST engine on Azure
• Query-segmentation data-parallel pattern
  • split the input sequences
  • query partitions in parallel
  • merge results together when done
• Follows the general suggested application model: Web Role + Queue + Worker
• With three special considerations
  • Batch job management
  • Task parallelism on an elastic Cloud

Wei Lu, Jared Jackson, and Roger Barga, "AzureBlast: A Case Study of Developing Science Applications on the Cloud," in Proceedings of the 1st Workshop on Scientific Cloud Computing (Science Cloud 2010), Association for Computing Machinery, Inc., 21 June 2010
AzureBLAST Task-Flow: a simple Split/Join pattern
Leverage the multiple cores of one instance
• argument "-a" of NCBI-BLAST
• 1, 2, 4, 8 for small, medium, large, and extra-large instance sizes
Task granularity
• Large partition: load imbalance
• Small partition: unnecessary overheads
  • NCBI-BLAST overhead
  • Data transfer overhead
• Best Practice: use test runs to profile, and set the partition size to mitigate the overhead
Value of visibilityTimeout for each BLAST task
• Essentially an estimate of the task run time
• Too small: repeated computation
• Too large: unnecessarily long wait before retry if an instance fails
[Diagram: a Splitting task fans the input out to parallel BLAST tasks, whose outputs feed a Merging task]
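The visibilityTimeout trade-off above is easiest to see against a toy queue with Azure-style semantics: a dequeued message is hidden for the timeout and reappears if it is not deleted. The class below is a hypothetical in-memory stand-in for the Azure queue API, not the real client:

```python
import time

class VisibilityQueue:
    """Minimal in-memory queue with Azure-style visibility-timeout
    semantics: a dequeued message is hidden for the timeout and
    reappears if it is not deleted in time (e.g., the worker died)."""

    def __init__(self):
        self._messages = {}  # message id -> (body, time it becomes visible)
        self._next_id = 0

    def put(self, body):
        self._messages[self._next_id] = (body, 0.0)
        self._next_id += 1

    def get(self, visibility_timeout):
        """Return (id, body) of one visible message and hide it, or None."""
        now = time.monotonic()
        for mid, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                self._messages[mid] = (body, now + visibility_timeout)
                return mid, body
        return None

    def delete(self, mid):
        """Called by the worker after the task completes successfully."""
        self._messages.pop(mid, None)
```

Setting the timeout to an estimate of the BLAST task run time means a crashed instance's task reappears for another worker; a timeout shorter than the real run time makes a still-running task reappear and be computed twice.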
Micro-Benchmarks Inform Design
Task size vs. performance
• Benefit of the warm cache effect
• 100 sequences per partition is the best choice
Instance size vs. performance
• Super-linear speedup with larger worker instances
• Primarily due to memory capacity
Task size / instance size vs. cost
• Extra-large instances generated the best and most economical throughput
• Fully utilize the resource
AzureBLAST
[Diagram: a Web Role hosts the Web Portal and Web Service for job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; tasks flow through a global dispatch queue to Worker instances; an Azure Table holds the Job Registry; Azure Blob storage holds the NCBI databases, BLAST databases, and temporary data; a Database updating Role refreshes the databases. A Splitting task fans out to parallel BLAST tasks whose results feed a Merging task]
AzureBLAST Job Portal
ASP.NET program hosted by a web role instance
• Submit jobs
• Track job status and logs
• Authentication/Authorization based on Live ID
• The accepted job is stored in the job registry table
  • Fault tolerance: avoid in-memory states
[Diagram: the Job Portal's Web Portal and Web Service register jobs in the Job Registry; the Job Scheduler and Scaling Engine pick them up]
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW
Blasted ~5,000 proteins (~700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec
AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences
"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• In total 9,865,668 sequences to be queried
• Theoretically, 100 billion sequence comparisons
Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (6.1 years) on one desktop
This scale of experiment is usually infeasible for most scientists
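The 6.1-year figure is just the unit conversion on the sampled estimate:

```python
minutes_on_one_desktop = 3_216_731          # extrapolated from sample runs
years = minutes_on_one_desktop / (60 * 24 * 365)  # minutes -> years
print(round(years, 1))  # -> 6.1
```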
Our Approach
• Allocated a total of ~4,000 instances
  • 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), Western and North Europe
• 8 deployments of AzureBLAST
  • Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
  • Each segment was submitted to one deployment as one job for execution
  • Each segment consists of smaller partitions
• When load imbalances, redistribute the load manually
[Map: the 8 deployments, sized 50-62 extra-large VMs each, across the four datacenters]
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started March 25th; the last task completed April 8th (10 days of compute)
  • But based on our estimates, real working instance time should be 6-8 days
  • Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins
Otherwise something is wrong (e.g., task failed to complete):
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
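Pairing "Executing" with "is done" records can be mechanized; a sketch (the regexes assume the record layout shown above) that flags tasks which started but never completed:

```python
import re

def unfinished_tasks(log_lines):
    """Return task ids that have an 'Executing' record but no matching
    'is done' record -- a sign the task failed or the node was lost."""
    started, finished = set(), set()
    for line in log_lines:
        m = re.search(r"Executing the task (\d+)", line)
        if m:
            started.add(m.group(1))
        m = re.search(r"Execution of task (\d+) is done", line)
        if m:
            finished.add(m.group(1))
    return sorted(started - finished)
```

Run over the abnormal excerpt above, this flags task 251774, which was dispatched but never reported completion.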
Surviving System Upgrades
North Europe datacenter: in total 34,256 tasks processed
All 62 compute nodes lost tasks and then came back in a group - this is an update domain
[Chart: nodes dropped out ~6 at a time in groups (update domains), each outage lasting ~30 mins]
Surviving Storage Failures
West Europe datacenter: 30,976 tasks were completed, and the job was killed
35 nodes experienced blob writing failures at the same time
A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry" - Irish Proverb
Computing Evapotranspiration (ET)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

ET = Water volume evapotranspired (m3 s-1 m-2)
Δ = Rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = Latent heat of vaporization (J/g)
Rn = Net radiation (W m-2)
cp = Specific heat capacity of air (J kg-1 K-1)
ρa = Dry air density (kg m-3)
δq = Vapor pressure deficit (Pa)
ga = Conductivity of air (inverse of ra) (m s-1)
gs = Conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = Psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
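The formula transcribes directly into code. This is a sketch using the slide's symbols and units; the sample values in the usage below are illustrative, not from the deck:

```python
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2450.0):
    """ET = (Delta*Rn + rho_a*c_p*(dq)*g_a) / ((Delta + gamma*(1 + g_a/g_s)) * lambda_v)
    delta: d(sat. specific humidity)/dT [Pa/K]; r_n: net radiation [W/m2];
    rho_a: dry air density [kg/m3]; c_p: specific heat of air [J/(kg K)];
    dq: vapor pressure deficit [Pa]; g_a, g_s: air/stomatal conductivity [m/s];
    gamma: psychrometric constant [Pa/K]; lambda_v: latent heat [J/g]."""
    return (delta * r_n + rho_a * c_p * dq * g_a) / \
           ((delta + gamma * (1.0 + g_a / g_s)) * lambda_v)
```

Closing stomata (smaller gs) raises the γ·(1 + ga/gs) term in the denominator and so lowers ET, as expected physically.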
ET Synthesizes Imagery, Sensors, Models, and Field Data
• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure Four Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA ftp sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
[Diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Download Queue (Data Collection Stage, pulling from the Source Imagery Download Sites and recording Source Metadata), then the Reprojection Queue (Reprojection Stage), then the Reduction 1 and Reduction 2 Queues (Derivation and Analysis Reduction Stages); scientific results are available for download]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks - recoverable units of work
  • Execution status of all jobs and tasks persisted in Tables
[Diagram: <PipelineStage> Requests enter the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue]
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: GenericWorker (Worker Role) instances take dispatched work from the <PipelineStage> Task Queue, read <Input>Data Storage, and persist <PipelineStage>TaskStatus]
Example Pipeline Stage: Reprojection Service
[Diagram: Reprojection Requests enter the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; GenericWorker (Worker Role) instances consume the tasks against Reprojection Data Storage]
• Each entity in the job table specifies a single reprojection job request
• Each entity in the task table specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
• Tasks point into the Swath Source Data Storage
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers -> $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3,500 hours, 20-100 workers -> $420 cpu, $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1,800 hours, 20-100 workers -> $216 cpu, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1,800 hours, 20-100 workers -> $216 cpu, $2 download, $9 storage

Total: $1420
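As a sanity check, the cpu line items are consistent with the 2010-era Azure compute rate of $0.12 per instance-hour - a rate the slide does not state, so it is an assumption here:

```python
RATE = 0.12  # USD per instance-hour (assumed 2010 Azure pricing, not from the slide)
reprojection_cpu = round(3500 * RATE)  # 3,500 instance-hours
reduction_cpu = round(1800 * RATE)     # 1,800 instance-hours per reduction stage
print(reprojection_cpu, reduction_cpu)  # matches the $420 and $216 line items
```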
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site - http://research.microsoft.com/azure
• Getting started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
AzureBLAST Task-Flow: a simple Split/Join pattern

Leverage the multiple cores of one instance
• the "-a" argument of NCBI-BLAST
• set to 1, 2, 4, or 8 for the small, medium, large, and extra-large instance sizes

Task granularity
• too large a partition: load imbalance
• too small a partition: unnecessary overheads (NCBI-BLAST startup overhead, data transfer overhead)
• Best practice: profile with test runs and set the partition size to mitigate the overhead

Value of visibilityTimeout for each BLAST task
• essentially an estimate of the task run time
• too small: repeated computation
• too large: an unnecessarily long wait before another instance picks the task up after an instance failure
[Diagram: a Splitting task fans out to parallel BLAST tasks, which a Merging task joins.]
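The visibilityTimeout trade-off can be seen in a toy model. This is an illustrative sketch, not the Azure storage client API: `VisibilityQueue` and its methods are invented here to mimic the queue semantics the slide relies on (a dequeued message is hidden rather than deleted, and reappears if not deleted before the timeout expires).

```python
import time

class VisibilityQueue:
    """Toy queue with Azure-style visibility-timeout semantics."""
    def __init__(self):
        self._msgs = {}      # message id -> (payload, visible_at)
        self._next_id = 0

    def put(self, payload):
        self._msgs[self._next_id] = (payload, 0.0)
        self._next_id += 1

    def get(self, visibility_timeout):
        # Return the first visible message and hide it for the timeout.
        now = time.monotonic()
        for mid, (payload, visible_at) in self._msgs.items():
            if visible_at <= now:
                self._msgs[mid] = (payload, now + visibility_timeout)
                return mid, payload
        return None

    def delete(self, mid):
        self._msgs.pop(mid, None)

q = VisibilityQueue()
q.put("partition-0")
mid, task = q.get(visibility_timeout=0.05)  # timeout shorter than the task
time.sleep(0.1)                             # worker is still "running"
# The message became visible again, so a second worker would
# redo the same BLAST task: the "too small" failure mode above.
assert q.get(visibility_timeout=0.05) is not None
```

Deleting the message before the timeout expires (the normal completion path) prevents the duplicate dequeue.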
Micro-Benchmarks Inform Design

Task size vs. performance
• benefit of the warm-cache effect
• 100 sequences per partition is the best choice

Instance size vs. performance
• super-linear speedup with larger worker instances
• primarily due to the larger memory capacity

Task size / instance size vs. cost
• the extra-large instance gave the best and most economical throughput
• fully utilizes the resources
AzureBLAST

[Architecture diagram: a Web Role hosts the Web Portal, Web Service, and Job registration; a Job Management Role runs the Job Scheduler and Scaling Engine; tasks flow through a global dispatch queue to Worker instances; an Azure Table holds the Job Registry; a Database updating Role maintains the NCBI databases, BLAST databases, and temporary data in Azure Blob storage; a Splitting task fans work out to BLAST tasks that a Merging task joins.]
AzureBLAST Job Portal

ASP.NET program hosted by a web role instance
• Submit jobs
• Track a job's status and logs

Authentication/Authorization based on Live ID

The accepted job is stored in the job registry table
• Fault tolerance: avoid in-memory state
Demonstration
R. palustris as a platform for H2 production
Eric Shadt, SAGE; Sam Phattarasukol, Harwood Lab, UW

Blasted ~5,000 proteins (~700K sequences)
• Against all NCBI non-redundant proteins: completed in 30 min
• Against ~5,000 proteins from another strain: completed in less than 30 sec

AzureBLAST significantly saved computing time…
All-Against-All Experiment: Discovering Homologs
• Discover the interrelationships of known protein sequences

"All against All" query
• The database is also the input query
• The protein database is large (4.2 GB)
• 9,865,668 sequences to be queried in total
• Theoretically, 100 billion sequence comparisons

Performance estimation
• Based on sample runs on one extra-large Azure instance
• Would require 3,216,731 minutes (≈6.1 years) on one desktop

Experiments at this scale are usually infeasible for most scientists
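The desktop estimate converts to years with one line of arithmetic:

```python
# Reproduce the serial-runtime estimate quoted on the slide.
minutes = 3_216_731                  # estimated minutes on one desktop
years = minutes / (60 * 24 * 365)    # minutes in a (non-leap) year
assert round(years, 1) == 6.1
```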
Our Approach
• Allocated a total of ~4,000 instances
• 475 extra-large VMs (8 cores per VM) across four datacenters: US (2), West Europe, and North Europe
• 8 deployments of AzureBLAST
• Each deployment has its own co-located storage service
• Divided the 10 million sequences into multiple segments
• Each segment is submitted to one deployment as one job for execution
• Each segment consists of smaller partitions
• When load imbalances appear, redistribute the load manually
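The segment/partition scheme above can be sketched as follows; `partition` and its parameter names are illustrative, not taken from the AzureBLAST code.

```python
def partition(seq_ids, n_segments, partition_size):
    """Split sequence ids into per-deployment segments, each made of
    fixed-size partitions (the unit of work handed to one BLAST task)."""
    # Stride across the id list so each deployment gets a similar share.
    segments = [seq_ids[i::n_segments] for i in range(n_segments)]
    return [
        [seg[j:j + partition_size] for j in range(0, len(seg), partition_size)]
        for seg in segments
    ]

jobs = partition(list(range(1000)), n_segments=8, partition_size=100)
assert len(jobs) == 8                                   # one job per deployment
assert all(len(p) <= 100 for job in jobs for p in job)  # bounded task size
```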
End Result
• Total size of the output is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs

A normal log record looks like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553...
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600...
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise, something is wrong (e.g., the task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895...
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
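Anomalies of this kind can be found mechanically by pairing "Executing" records with their "done" records; the snippet below uses the log format as reconstructed on this slide.

```python
import re

LOG = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523...
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774...
"""

# A task is suspect if it was started but never reported completion.
started = set(re.findall(r"Executing the task (\d+)", LOG))
finished = set(re.findall(r"Execution of task (\d+) is done", LOG))
assert started - finished == {"251774"}  # started but never completed
```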
Surviving System Upgrades

North Europe Data Center: 34,256 tasks processed in total

All 62 compute nodes lost tasks and then came back in groups: this is an update domain
• ~30 mins per group
• ~6 nodes in one group
Surviving Storage Failures

West Europe Datacenter: 30,976 tasks were completed and the job was killed

35 nodes experienced a blob-writing failure at the same time

A reasonable guess: the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry." (Irish proverb)

Computing Evapotranspiration (ET)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
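The Penman-Monteith equation translates directly into code. The function below is a straight transcription of the formula with the slide's symbols; the numeric inputs in the example are illustrative placeholders, not values from the MODISAzure pipeline.

```python
def penman_monteith_et(delta, R_n, rho_a, c_p, dq, g_a, g_s, gamma, lambda_v):
    """ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs)) · λv).
    gamma is the psychrometric constant (≈ 66 Pa/K per the slide)."""
    return (delta * R_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    )

# Placeholder inputs, roughly in the units listed on the slide.
et = penman_monteith_et(delta=145.0, R_n=400.0, rho_a=1.2, c_p=1005.0,
                        dq=1000.0, g_a=0.02, g_s=0.01, gamma=66.0,
                        lambda_v=2.45e6)
assert et > 0.0
```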
ET Synthesizes Imagery, Sensors, Models, and Field Data

• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
[Diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Data Collection Stage, which pulls from Source Imagery Download Sites via a Download Queue and records Source Metadata; the Reprojection Queue feeds the Reprojection Stage, and the Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages; science results are available for Scientific Results Download.]
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• The ModisAzure Service is the Web Role front door
• Receives all user requests
• Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue

• The Service Monitor is a dedicated Worker Role
• Parses all job requests into tasks, which are recoverable units of work
• Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.]
MODISAzure Architectural Big Picture (2/2)

All work is actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue, from which GenericWorker (Worker Role) instances pull tasks and read <Input>Data Storage.]
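A minimal sketch of such a dispatch loop, assuming an in-memory deque and plain callables standing in for the real Azure task queue and status table (`run_worker` and the callback names are illustrative, not from the MODISAzure code):

```python
from collections import deque

MAX_ATTEMPTS = 3   # per the slide: failed tasks are retried 3 times

def run_worker(tasks, execute, status):
    """Dequeue tasks, run them, retry on failure, persist final state.
    `tasks` is a deque of task dicts; `status` maps task id -> state."""
    while tasks:
        task = tasks.popleft()
        try:
            execute(task)
            status[task["id"]] = "done"
        except Exception:
            task["attempts"] = task.get("attempts", 0) + 1
            if task["attempts"] < MAX_ATTEMPTS:
                tasks.append(task)              # back on the queue for retry
            else:
                status[task["id"]] = "failed"   # give up after 3 attempts

calls = []
def flaky(task):                                # fails the first two times
    calls.append(task["id"])
    if len(calls) < 3:
        raise RuntimeError("transient failure")

status = {}
run_worker(deque([{"id": "t1"}]), flaky, status)
assert status == {"t1": "done"} and len(calls) == 3
```

In the real service the retry count and task state live in the Azure status tables, so a recovered worker sees how often a task has already been attempted.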
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request is handled by the Service Monitor (Worker Role), which persists ReprojectionJobStatus (each entity specifies a single reprojection job request) via the Job Queue, and parses and persists ReprojectionTaskStatus (each entity specifies a single reprojection task, i.e., a single tile); tasks are dispatched through the Task Queue to GenericWorker (Worker Role) instances, which query the SwathGranuleMeta table for geo-metadata (e.g., boundaries) for each swath tile and the ScanTimeList table for the list of satellite scan times that cover a target tile, reading Swath Source Data Storage and writing Reprojection Data Storage.]
Costs for 1 US Year ET Computation

• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers: $50 upload, $450 storage
Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers: $420 CPU, $60 download
Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers: $216 CPU, $1 download, $6 storage
Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers: $216 CPU, $2 download, $9 storage

Total: $1420
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• They provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Micro-Benchmarks Inform DesignTask size vs Performancebull Benefit of the warm cache effectbull 100 sequences per partition is the best
choice
Instance size vs Performancebull Super-linear speedup with larger size
worker instancesbull Primarily due to the memory capability
Task SizeInstance Size vs Costbull Extra-large instance generated the best
and the most economical throughputbull Fully utilize the resource
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
AzureBLAST
Web Portal
Web Service
Job registration
Job Scheduler
WorkerWorker
WorkerWorker
WorkerWorker
Global dispatch
queue
Web Role
Azure Table
Job Management Role
Azure Blob
Database updating Role
helliphellip
Scaling Engine
Blast databases temporary data etc)
Job RegistryNCBI databases
BLAST task
Splitting task
BLAST task
BLAST task
BLAST task
hellip
Merging Task
AzureBLAST Job PortalASPNET program hosted by a web role instancebull Submit jobsbull Track jobrsquos status and logs
AuthenticationAuthorization based on Live ID
The accepted job is stored into the job registry tablebull Fault tolerance avoid in-memory
states
Web Portal
Web Service
Job registration
Job Scheduler
Job Portal
Scaling Engine
Job Registry
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud

"You never miss the water till the well has run dry." (Irish proverb)

Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and by transpiration, or evaporation through plant membranes, by plants.

Computing Evapotranspiration (ET)

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)

where:
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky:
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
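The Penman-Monteith relation above translates directly into a function. A sketch with illustrative input values (the numbers below are not from the talk, only the defaults γ ≈ 66 Pa/K and λv ≈ 2260 J/g are standard):

```python
def penman_monteith_et(delta, rn, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2260.0):
    """Penman-Monteith ET as given on the slide.

    delta: Pa/K, rn: W/m^2, rho_a: kg/m^3, c_p: J/(kg K), dq: Pa,
    g_a, g_s: m/s, gamma: Pa/K, lambda_v: J/g.
    """
    return (delta * rn + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v)

# Illustrative mid-latitude daytime values (assumed, for the example only):
et = penman_monteith_et(delta=145.0, rn=400.0, rho_a=1.2, c_p=1005.0,
                        dq=1000.0, g_a=0.02, g_s=0.01)
print(et > 0)  # True: a positive evapotranspiration flux
```

Per-pixel evaluation of this expression over reprojected MODIS tiles is exactly the "derivation reduction" work the pipeline below performs at scale.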
ET Synthesizes Imagery, Sensors, Models, and Field Data

• NASA MODIS imagery source archives: 5 TB (600K files)
• FLUXNET curated sensor dataset: 30 GB (960 files)
• FLUXNET curated field dataset: 2 KB (1 file)
• NCEP/NCAR: ~100 MB (4K files)
• Vegetative clumping: ~5 MB (1 file)
• Climate classification: ~1 MB (1 file)

20 US years = 1 global year
MODISAzure: Four-Stage Image Processing Pipeline

Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile

Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms

Derivation reduction stage
• First stage visible to scientist
• Computes ET in our initial use

Analysis reduction stage
• Optional second stage visible to scientist
• Enables production of science analysis artifacts such as maps, tables, virtual sensors
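The four stages above chain map steps into a reduction through queues. A toy sketch using in-process queues in place of Azure Queues (stage logic and names are stand-ins, not the MODISAzure code):

```python
from queue import Queue

def run_pipeline(tile_requests):
    download_q, reprojection_q, reduction_q = Queue(), Queue(), Queue()

    for req in tile_requests:          # data collection (map): fetch inputs
        download_q.put(req)
    while not download_q.empty():      # reprojection (map): one tile in, one out
        tile = download_q.get()
        reprojection_q.put(f"sinusoidal({tile})")
    while not reprojection_q.empty():
        reduction_q.put(reprojection_q.get())

    batch = []                         # derivation reduction: many tiles -> one product
    while not reduction_q.empty():
        batch.append(reduction_q.get())
    return [("ET", sorted(batch))]

result = run_pipeline(["tileA", "tileB"])
print(result)  # [('ET', ['sinusoidal(tileA)', 'sinusoidal(tileB)'])]
```

The map stages scale out trivially (each queue message is independent), while the reduction stage must see all its inputs, which is why it runs last and is the first stage a scientist observes.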
[Pipeline diagram: Scientists interact with the AzureMODIS Service Web Role Portal to submit requests and download scientific results. A Request Queue feeds the Download Queue for the Data Collection Stage, which pulls from the Source Imagery Download Sites and records Source Metadata; the Reprojection Queue feeds the Reprojection Stage, and the Reduction 1 and Reduction 2 Queues feed the Derivation Reduction and Analysis Reduction Stages, which produce the science results.]

http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)

• ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks, i.e., recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage> JobStatus and enqueues the job on the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses the job, persists <PipelineStage> TaskStatus, and dispatches tasks to the <PipelineStage> Task Queue.]
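The parse-and-persist step above can be illustrated in a few lines, with dict-backed "tables" standing in for Azure Table storage (all names here are hypothetical, not the MODISAzure schema):

```python
# In-memory stand-ins for the <PipelineStage> JobStatus / TaskStatus tables.
job_status_table, task_status_table = {}, {}

def parse_and_persist(job_id, stage, tiles):
    """Parse one job request into per-tile tasks; persist status for both."""
    job_status_table[job_id] = {"stage": stage, "state": "Parsed"}
    dispatched = []
    for i, tile in enumerate(tiles):
        task_id = f"{job_id}-{i}"
        task_status_table[task_id] = {"tile": tile, "state": "Queued"}
        dispatched.append(task_id)       # would go on the <stage> Task Queue
    return dispatched

dispatched = parse_and_persist("job42", "Reprojection", ["h08v05", "h09v05"])
print(dispatched)  # ['job42-0', 'job42-1']
```

Because every task's status is persisted before dispatch, a task is a recoverable unit of work: if a worker dies mid-task, the status row still records what remains to be done.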
MODISAzure Architectural Big Picture (2/2)

• All work is actually done by a Worker Role
  • Dequeues tasks created by the Service Monitor
  • Retries failed tasks 3 times
  • Maintains all task status

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage> TaskStatus and dispatches tasks to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue the tasks and read from <Input> Data Storage.]
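The GenericWorker's dequeue-and-retry loop described above, sketched with in-memory stand-ins for the task queue and status table (hypothetical helper, not the production code):

```python
def run_worker(tasks, execute, max_retries=3):
    """Attempt each task; retry a failure up to max_retries times."""
    status = {}
    for task in tasks:
        attempts = 0
        while True:
            try:
                execute(task)
                status[task] = "Done"
                break
            except Exception:
                attempts += 1
                if attempts > max_retries:
                    status[task] = "Failed"  # retries exhausted
                    break
    return status

calls = {"t1": 0}
def execute(task):
    if task == "t1":               # t1 succeeds on its second attempt
        calls["t1"] += 1
        if calls["t1"] < 2:
            raise RuntimeError("transient blob write failure")
    if task == "t2":               # t2 always fails: a poison task
        raise RuntimeError("poison task")

result = run_worker(["t1", "t2"], execute)
print(result)  # {'t1': 'Done', 't2': 'Failed'}
```

Capping retries at 3 is what keeps a poison task from blocking the queue forever; its Failed status row is then visible for diagnosis, as in the log analysis earlier.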
Example Pipeline Stage: Reprojection Service

[Diagram: Reprojection Requests enter the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches tasks to the Task Queue; GenericWorker (Worker Role) instances execute the tasks against Reprojection Data Storage and Swath Source Data Storage, consulting the ScanTimeList and SwathGranuleMeta tables.]

• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
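The two metadata lookups above can be illustrated with dict-backed stand-ins for the tables (all data and helper names below are invented for the example):

```python
# Stand-ins for the SwathGranuleMeta and ScanTimeList tables.
swath_granule_meta = {
    "swath1": {"bounds": (30.0, 35.0, -120.0, -115.0)},  # (S, N, W, E)
}
scan_time_list = {
    "h08v05": ["2010-03-31T06:14", "2010-03-31T06:25"],
}

def swaths_covering(lat, lon):
    """SwathGranuleMeta query: which swaths' boundaries contain this point?"""
    hits = []
    for swath, meta in swath_granule_meta.items():
        s, n, w, e = meta["bounds"]
        if s <= lat <= n and w <= lon <= e:
            hits.append(swath)
    return hits

def tasks_for_tile(tile):
    """ScanTimeList query: one reprojection input per scan covering the tile."""
    return [(tile, t) for t in scan_time_list.get(tile, [])]

print(swaths_covering(32.0, -118.0))  # ['swath1']
print(tasks_for_tile("h08v05"))
```

In the real service these would be partition-key queries against Azure Tables; the point is only that a reprojection task is fully determined by a tile plus the scan times and swath geometry the tables return.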
Costs for 1 US Year ET Computation

• Computational costs driven by data scale and the need to run the reduction multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Per-stage figures (read off the pipeline diagram):
• Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 CPU, $60 download
• Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 CPU, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 CPU, $2 download, $9 storage

Total: $1420
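The dollar figures above are reproducible from 2010-era Windows Azure list prices (assumed here: $0.12 per compute-hour, $0.15 per GB-month of storage, $0.10/GB ingress, $0.15/GB egress), which is a useful sanity check on any cloud cost estimate:

```python
# Assumed 2010-era Windows Azure list prices (not stated on the slide).
COMPUTE_HR, STORAGE_GB_MONTH, INGRESS_GB, EGRESS_GB = 0.12, 0.15, 0.10, 0.15

reprojection_cpu = 3500 * COMPUTE_HR           # matches the slide's $420
reduction_cpu    = 1800 * COMPUTE_HR           # matches the slide's $216
upload           = 500 * INGRESS_GB            # ~500 GB ingest -> $50
storage          = 500 * STORAGE_GB_MONTH * 6  # 6 project months -> $450
download         = 400 * EGRESS_GB             # 400 GB out -> $60

print(round(reprojection_cpu, 2), round(reduction_cpu, 2),
      round(upload, 2), round(storage, 2), round(download, 2))
# 420.0 216.0 50.0 450.0 60.0
```

That the slide's per-stage numbers fall out of four unit prices underlines the talk's point: at this scale the dominant cost is people, not the cloud bill.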
Observations and Experience

• Clouds are the largest-scale computer centers ever constructed, and they have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers

Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples

Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit, November 2010 Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research, Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure: Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Demonstration
R palustris as a platform for H2 productionEric Shadt SAGE Sam Phattarasukol Harwood Lab UW
Blasted ~5000 proteins (700K sequences)bull Against all NCBI non-redundant proteins completed in 30 minbull Against ~5000 proteins from another strain completed in less
than 30 sec
AzureBLAST significantly saved computing timehellip
All-Against-All ExperimentDiscovering Homologs bull Discover the interrelationships of known protein sequences
ldquoAll against Allrdquo querybull The database is also the input querybull The protein database is large (42 GB size)bull Totally 9865668 sequences to be queried
bull Theoretically 100 billion sequence comparisons
Performance estimationbull Based on the sampling-running on one extra-large Azure
instancebull Would require 3216731 minutes (61 years) on one desktop
This scale of experiments usually are infeasible to most scientists
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research (Roger Barga, Architect)
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault-Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back-Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure: Four-Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
Our Approachbull Allocated a total of ~4000 instances
bull 475 extra-large VMs (8 cores per VM) four datacenters US (2) Western and North Europe
bull 8 deployments of AzureBLASTbull Each deployment has its own co-located storage service
bull Divide 10 million sequences into multiple segmentsbull Each will be submitted to one deployment as one job for executionbull Each segment consists of smaller partitions
bull When load imbalances redistribute the load manually
50
6262 62
6262
5062
End Resultbull Total size of the output result is ~230GB
bull The number of total hits is 1764579487
bull Started at March 25th the last task completed on April 8th (10 days compute)bull But based our estimates real working instance time should be 6~8 daybull Look into log data to analyze what took placehellip
50
6262 62
6262
5062
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (2/2)
All work actually done by a Worker Role
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus; GenericWorker (Worker Role) instances dequeue from the <PipelineStage> Task Queue and read/write <Input>Data Storage.]
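The GenericWorker loop described above (dequeue, execute, retry up to 3 times, record status) can be sketched like this; the queue, status store, and task body are stand-ins, not the real Azure APIs.

```python
# Hypothetical sketch of the GenericWorker loop: dequeue a task, execute
# it, and requeue on failure until the retry budget (3) is exhausted.
MAX_RETRIES = 3

def run_worker(task_queue, task_status, execute):
    while task_queue:
        task_id = task_queue.pop(0)
        try:
            execute(task_id)
            task_status[task_id]["state"] = "Done"
        except Exception:
            task_status[task_id]["retries"] += 1
            if task_status[task_id]["retries"] < MAX_RETRIES:
                task_queue.append(task_id)   # requeue for another attempt
            else:
                task_status[task_id]["state"] = "Failed"

# A task that always fails exhausts its retries and is marked Failed:
status = {"t1": {"state": "Queued", "retries": 0}}
def always_fails(task_id):
    raise RuntimeError("blob write failed")

run_worker(["t1"], status, always_fails)
print(status["t1"])  # {'state': 'Failed', 'retries': 3}
```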
Example Pipeline Stage: Reprojection Service
[Diagram: a Reprojection Request enters the Job Queue, where each entity specifies a single reprojection job request; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue, where each entity specifies a single reprojection task (i.e., a single tile); GenericWorker (Worker Role) instances point to Reprojection Data Storage. Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile; query the ScanTimeList table to get the list of satellite scan times that cover a target tile; source imagery lives in Swath Source Data Storage.]
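The two metadata lookups a reprojection task performs can be sketched as below. The real service queries Azure Tables (SwathGranuleMeta and ScanTimeList); lists of dicts, and all swath names, bounds, and scan times, are illustrative stand-ins here.

```python
# Stand-ins for the two Azure Tables a reprojection task consults.
swath_granule_meta = [
    {"swath": "A2004001.0530", "bounds": (30.0, -120.0, 40.0, -110.0)},
    {"swath": "A2004001.0535", "bounds": (35.0, -115.0, 45.0, -105.0)},
]
scan_time_list = [
    {"tile": "h08v05", "scan_times": ["05:30", "05:35"]},
    {"tile": "h09v05", "scan_times": ["05:35"]},
]

def bounds_of(swath):
    """Geo-metadata (boundaries) for one swath tile."""
    for row in swath_granule_meta:
        if row["swath"] == swath:
            return row["bounds"]
    return None

def scans_covering(tile):
    """Satellite scan times that cover a target sinusoidal tile."""
    for row in scan_time_list:
        if row["tile"] == tile:
            return row["scan_times"]
    return []

print(scans_covering("h08v05"))  # ['05:30', '05:35']
```

Each Task Queue entity then names one target tile, and the worker uses these lookups to find which source swaths to reproject into it.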
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run the reduction stages multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate-student rates

Stage | Data | Compute | Cost
Data collection | 400-500 GB, 60K files, 10 MB/sec | 11 hours, <10 workers | $50 upload, $450 storage
Reprojection | 400 GB, 45K files | 3500 hours, 20-100 workers | $420 cpu, $60 download
Derivation reduction | 5-7 GB, 55K files | 1800 hours, 20-100 workers | $216 cpu, $1 download, $6 storage
Analysis reduction | <10 GB, ~1K files | 1800 hours, 20-100 workers | $216 cpu, $2 download, $9 storage

Total: $1420
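As a sanity check on the cpu line items, a short calculation. The $0.12 per instance-hour rate is an assumption (the 2010 Windows Azure small-instance price); the slide itself lists only the resulting dollar figures.

```python
# Sanity-check the cpu cost line items, assuming a compute rate of
# $0.12 per instance-hour (assumed; not stated on the slide).
RATE = 0.12  # $/instance-hour

reprojection_cpu = 3500 * RATE   # reprojection stage: 3500 instance-hours
reduction_cpu    = 1800 * RATE   # each reduction stage: 1800 instance-hours

print(round(reprojection_cpu), round(reduction_cpu))  # 420 216
```

Both results match the $420 and $216 cpu figures in the cost table, which suggests the stage hour counts are total instance-hours across all workers.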
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed, and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users/communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications, and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com

Resources: AzureScope (2)
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press; Programming Windows Azure, O'Reilly Press; Bing: Channel 9 Windows Azure; Bing: Windows Azure Platform Training Kit – November Update; http://research.microsoft.com/azure; xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
End Result
• Total size of the output result is ~230 GB
• The total number of hits is 1,764,579,487
• Started on March 25th; the last task completed on April 8th (10 days of compute)
• But based on our estimates, the real working instance time should be 6-8 days
• Look into the log data to analyze what took place…
Understanding Azure by analyzing logs
A normal log record should look like:

3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done, it took 10.9 mins
3/31/2010 6:25 RD00155D3611B0 Executing the task 251553
3/31/2010 6:44 RD00155D3611B0 Execution of task 251553 is done, it took 19.3 mins
3/31/2010 6:44 RD00155D3611B0 Executing the task 251600
3/31/2010 7:02 RD00155D3611B0 Execution of task 251600 is done, it took 17.27 mins

Otherwise something is wrong (e.g., a task failed to complete):

3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done, it took 82 mins
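The log analysis amounts to pairing each "Executing the task N" record with its matching "Execution of task N is done" record; tasks left unpaired (like task 251774 above) are the ones that never completed. A minimal sketch, using an excerpt of the records shown:

```python
import re

log = """\
3/31/2010 6:14 RD00155D3611B0 Executing the task 251523
3/31/2010 6:25 RD00155D3611B0 Execution of task 251523 is done
3/31/2010 8:22 RD00155D3611B0 Executing the task 251774
3/31/2010 9:50 RD00155D3611B0 Executing the task 251895
3/31/2010 11:12 RD00155D3611B0 Execution of task 251895 is done
"""

# Tasks that started vs. tasks that finished; the difference is suspect.
started  = set(re.findall(r"Executing the task (\d+)", log))
finished = set(re.findall(r"Execution of task (\d+) is done", log))

print(sorted(started - finished))  # ['251774']
```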
Surviving System Upgrades
North Europe Data Center: 34,256 tasks processed in total
All 62 compute nodes lost tasks and then came back in a group; each group is an update domain
~30 mins between groups; ~6 nodes in one group
Surviving Storage Failures
West Europe Data Center: 30,976 tasks completed before the job was killed
35 nodes experienced blob-writing failures at the same time
A reasonable guess: the fault domain is working
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish proverb
Computing Evapotranspiration (ET)

Penman-Monteith (1964):

ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs))·λv)

where
ET = water volume evapotranspired (m3 s-1 m-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m-2)
cp = specific heat capacity of air (J kg-1 K-1)
ρa = dry air density (kg m-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air (inverse of ra) (m s-1)
gs = conductivity of plant stoma, air (inverse of rs) (m s-1)
γ = psychrometric constant (γ ≈ 66 Pa K-1)

Estimating resistance/conductivity across a catchment can be tricky
• Lots of inputs: big data reduction
• Some of the inputs are not so simple
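The Penman-Monteith equation transcribes directly into code. The input values below are illustrative only (not from the slide); the defaults for γ and λv follow the legend above.

```python
# Direct transcription of the Penman-Monteith equation:
#   ET = (Δ·Rn + ρa·cp·δq·ga) / ((Δ + γ·(1 + ga/gs))·λv)
def penman_monteith(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                    gamma=66.0, lambda_v=2260.0):
    """delta, dq, gamma in Pa(/K); r_n in W m-2; lambda_v in J/g."""
    numerator = delta * r_n + rho_a * c_p * dq * g_a
    denominator = (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    return numerator / denominator

# Illustrative inputs (assumed, for demonstration only):
et = penman_monteith(delta=145.0, r_n=400.0, rho_a=1.2,
                     c_p=1005.0, dq=1000.0, g_a=0.02, g_s=0.01)
print(et > 0)  # True
```

The "tricky" parts the slide points to are ga and gs: the conductivities vary across a catchment and must themselves be estimated from the imagery and field data.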
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Understanding Azure by analyzing logs
A normal log record should be
Otherwise something is wrong (eg task failed to complete)
3312010 614 RD00155D3611B0 Executing the task 251523 3312010 625 RD00155D3611B0 Execution of task 251523 is done it took 109mins3312010 625 RD00155D3611B0 Executing the task 251553 3312010 644 RD00155D3611B0 Execution of task 251553 is done it took 193mins3312010 644 RD00155D3611B0 Executing the task 251600 3312010 702 RD00155D3611B0 Execution of task 251600 is done it took 1727 mins
3312010 822 RD00155D3611B0 Executing the task 251774
3312010 950 RD00155D3611B0 Executing the task 251895
3312010 1112 RD00155D3611B0 Execution of task 251895 is done it took 82 mins
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Surviving System Upgrades
North Europe Data Center totally 34256 tasks processed
All 62 compute nodes lost tasks and then came back in a group This is an
Update domain
~30 mins
~ 6 nodes in one group
35 Nodes experience blob writing failure at same time
Surviving Storage FailuresWest Europe Datacenter 30976 tasks are completed and job was killed
A reasonable guess the Fault Domain is working
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (1/2)
• MODISAzure Service is the Web Role front door
  - Receives all user requests
  - Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• Service Monitor is a dedicated Worker Role
  - Parses all job requests into tasks, recoverable units of work
  - Execution status of all jobs and tasks is persisted in Tables

[Diagram: a <PipelineStage> Request enters the <PipelineStage> Job Queue via the MODISAzure Service (Web Role); the Service Monitor (Worker Role) persists <PipelineStage>JobStatus, parses and persists <PipelineStage>TaskStatus, and dispatches work to the <PipelineStage> Task Queue.]
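The Service Monitor's parse-and-persist step can be sketched as below. The dict and deque are stand-ins for Azure Tables and Queues, and the job/task ids are hypothetical, not the service's real schema:

```python
# Sketch of the Service Monitor pattern: parse a job request into
# recoverable task units, persist a status row per task, then
# dispatch the task ids to the stage's task queue.
from collections import deque

job_status, task_status = {}, {}   # stand-ins for Azure Tables
task_queue = deque()               # stand-in for a <PipelineStage> Task Queue

def parse_and_dispatch(job_id, tiles):
    job_status[job_id] = "Parsing"
    for i, tile in enumerate(tiles):
        task_id = f"{job_id}-{i}"
        task_status[task_id] = {"tile": tile, "state": "Queued"}
        task_queue.append(task_id)
    job_status[job_id] = "Dispatched"

parse_and_dispatch("reproj-42", ["h08v05", "h09v05", "h10v05"])
print(len(task_queue), job_status["reproj-42"])  # 3 Dispatched
```

Because every task's status row is persisted before it is dispatched, a crashed worker loses at most one unit of work, which is what makes tasks "recoverable".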
MODISAzure Architectural Big Picture (2/2)
• All work is actually done by a GenericWorker (Worker Role)
  - Dequeues tasks created by the Service Monitor
  - Retries failed tasks 3 times
  - Maintains all task status

[Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; GenericWorker (Worker Role) instances dequeue tasks and read <Input> Data Storage.]
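The GenericWorker's dequeue-and-retry behavior can be sketched as follows; this is an illustrative loop under assumed names (the real role uses Azure Queue visibility timeouts rather than an in-memory deque):

```python
# Sketch of the GenericWorker loop: dequeue a task, attempt it,
# and retry a failing task up to 3 times before marking it failed.
from collections import deque

task_queue = deque(["t1", "t2"])
task_status = {"t1": {"tries": 0}, "t2": {"tries": 0}}
MAX_RETRIES = 3

def run(task_id):
    if task_id == "t2":                # t2 always fails in this sketch
        raise RuntimeError("transient storage error")

while task_queue:
    tid = task_queue.popleft()
    try:
        run(tid)
        task_status[tid]["state"] = "Done"
    except RuntimeError:
        task_status[tid]["tries"] += 1
        if task_status[tid]["tries"] < MAX_RETRIES:
            task_queue.append(tid)     # requeue for another attempt
        else:
            task_status[tid]["state"] = "Failed"

print(task_status["t1"]["state"], task_status["t2"]["state"])  # Done Failed
```

Capping retries at 3 is what keeps a poison task from circulating in the queue forever.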
Example Pipeline Stage: Reprojection Service

[Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue, from which GenericWorker (Worker Role) instances pull tasks that point into Reprojection Data Storage and Swath Source Data Storage.]

• Each ReprojectionJobStatus entity specifies a single reprojection job request
• Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
Costs for 1 US Year ET Computation
• Computational costs driven by data scale and the need to run reductions multiple times
• Storage costs driven by data scale and the 6-month project duration
• Small with respect to the people costs, even at graduate student rates

Per-stage data volumes and costs:
• Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers; $50 upload, $450 storage
• Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers; $420 cpu, $60 download
• Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers; $216 cpu, $1 download, $6 storage
• Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers; $216 cpu, $2 download, $9 storage

Total: $1420
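The cpu and storage figures are consistent with the 2010-era Azure rates of roughly $0.12 per small-instance compute hour and $0.15 per GB-month of storage; those rates are an assumption on my part, not stated on the slide:

```python
# Back-of-the-envelope check of the per-stage costs, assuming
# 2010 Azure rates (assumed, not from the slide).
COMPUTE_RATE = 0.12    # $/compute-hour (assumed)
STORAGE_RATE = 0.15    # $/GB-month (assumed)

print(round(3500 * COMPUTE_RATE))      # 420  -> Reprojection cpu
print(round(1800 * COMPUTE_RATE))      # 216  -> each reduction stage cpu
print(round(500 * STORAGE_RATE * 6))   # 450  -> ~500 GB held for 6 months
```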
Observations and Experience
• Clouds are the largest-scale computer centers ever constructed and have the potential to be important to both large- and small-scale science problems
• Equally important, they can increase participation in research, providing needed resources to users and communities without ready access
• Clouds are suitable for "loosely coupled" data-parallel applications and can support many interesting "programming patterns", but tightly coupled, low-latency applications do not perform optimally on clouds today
• Clouds provide valuable fault tolerance and scalability abstractions
• Clouds act as an amplifier for familiar client tools and on-premises compute
• Cloud services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit - November Update
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
Surviving Storage Failures
West Europe Datacenter: 35 nodes experienced blob writing failure at the same time; 30,976 tasks were completed and the job was killed.
A reasonable guess: the Fault Domain is working.
MODISAzure: Computing Evapotranspiration (ET) in the Cloud
"You never miss the water till the well has run dry." – Irish Proverb
Computing Evapotranspiration (ET)

Penman-Monteith (1964):

$ET = \dfrac{\Delta R_n + \rho_a c_p\,(\delta q)\,g_a}{\left(\Delta + \gamma\,(1 + g_a/g_s)\right)\lambda_v}$

where
ET = water volume evapotranspired (m^3 s^-1 m^-2)
Δ = rate of change of saturation specific humidity with air temperature (Pa K^-1)
λv = latent heat of vaporization (J/g)
Rn = net radiation (W m^-2)
cp = specific heat capacity of air (J kg^-1 K^-1)
ρa = dry air density (kg m^-3)
δq = vapor pressure deficit (Pa)
ga = conductivity of air, the inverse of ra (m s^-1)
gs = conductivity of plant stoma air, the inverse of rs (m s^-1)
γ = psychrometric constant (γ ≈ 66 Pa K^-1)

• Lots of inputs; big data reduction
• Some of the inputs are not so simple
• Estimating resistance/conductivity across a catchment can be tricky
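Once the inputs are in hand, the equation itself is a single expression. A sketch, with the input values below chosen as illustrative placeholders rather than real FLUXNET or MODIS-derived data:

```python
# Penman-Monteith ET as written above. The default gamma matches the
# slide's psychrometric constant (~66 Pa/K); lambda_v ~2260 J/g is the
# latent heat of vaporization of water. Inputs are placeholders.
def penman_monteith_et(delta, r_n, rho_a, c_p, dq, g_a, g_s,
                       gamma=66.0, lambda_v=2260.0):
    """ET = (Δ·Rn + ρa·cp·(δq)·ga) / ((Δ + γ·(1 + ga/gs)) · λv)"""
    return (delta * r_n + rho_a * c_p * dq * g_a) / (
        (delta + gamma * (1.0 + g_a / g_s)) * lambda_v
    )

et = penman_monteith_et(delta=0.145, r_n=400.0, rho_a=1.2,
                        c_p=1004.0, dq=800.0, g_a=0.02, g_s=0.01)
print(et > 0)  # True
```

The hard part in practice is not this arithmetic but supplying ga and gs across a catchment, which is exactly the "tricky" estimation the slide calls out.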
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
MODISAzure Computing Evapotranspiration (ET) in the Cloud
You never miss the water till the well has run dryIrish Proverb
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Computing Evapotranspiration (ET)
ET = Water volume evapotranspired (m3 s-1 m-2) Δ = Rate of change of saturation specific humidity with air temperature(Pa K-1) λv = Latent heat of vaporization (Jg) Rn = Net radiation (W m-2)cp = Specific heat capacity of air (J kg-1 K-1) ρa = dry air density (kg m-3) δq = vapor pressure deficit (Pa)ga = Conductivity of air (inverse of ra) (m s-1)gs = Conductivity of plant stoma air (inverse of rs) (m s-1) γ = Psychrometric constant (γ asymp 66 Pa K-1)
Estimating resistanceconductivity across a catchment can be tricky
bull Lots of inputs big data reductionbull Some of the inputs are not so simple
119864119879= ∆119877119899 + 120588119886 119888119901ሺ120575119902ሻ119892119886(∆+ 120574ሺ1+ 119892119886 119892119904Τ ሻ)120582120592
Penman-Monteith (1964)
Evapotranspiration (ET) is the release of water to the atmosphere by evaporation from open water bodies and transpiration or evaporation through plant membranes by plants
ET Synthesizes Imagery Sensors Models and Field Data
NASA MODIS imagery source
archives5 TB (600K files)
FLUXNET curated sensor dataset
(30GB 960 files)
FLUXNET curated field dataset2 KB (1 file)
NCEPNCAR ~100MB (4K files)
Vegetative clumping~5MB (1file)
Climate classification~1MB (1file)
20 US year = 1 global year
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources: Cloud Research Community Site
http://research.microsoft.com/azure
• Getting-started steps for developers
• Available research services
• Use cases on Azure for research
• Event announcements
• Detailed tutorials
• Technical papers
Email us with questions at xcgngage@microsoft.com
Resources: AzureScope
http://azurescope.cloudapp.net
• Simple benchmarks illustrating basic performance for compute and storage services
• Benchmarks for reference algorithms
• Best-practice tips
• Code samples
Email us with questions at xcgngage@microsoft.com
Demonstration
Azure in Action, Manning Press
Programming Windows Azure, O'Reilly Press
Bing: Channel 9 Windows Azure
Bing: Windows Azure Platform Training Kit (November 2010 Update)
http://research.microsoft.com/azure
xcgngage@microsoft.com
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute Web Roles
- Key Components – Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components – Compute VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applications
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (1/2)
- MODISAzure Architectural Big Picture (2/2)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
ET Synthesizes Imagery, Sensors, Models, and Field Data
NASA MODIS imagery source archives: 5 TB (600K files)
FLUXNET curated sensor dataset: 30 GB (960 files)
FLUXNET curated field dataset: 2 KB (1 file)
NCEP/NCAR: ~100 MB (4K files)
Vegetative clumping: ~5 MB (1 file)
Climate classification: ~1 MB (1 file)
20 US years = 1 global year
MODISAzure Four-Stage Image Processing Pipeline
Data collection (map) stage
• Downloads requested input tiles from NASA FTP sites
• Includes geospatial lookup for non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stage
• Converts source tile(s) to intermediate-result sinusoidal tiles
• Simple nearest-neighbor or spline algorithms
Derivation reduction stage
• First stage visible to the scientist
• Computes ET in our initial use
Analysis reduction stage
• Optional second stage visible to the scientist
• Enables production of science analysis artifacts such as maps, tables, and virtual sensors
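The four stages above form a map/reduce chain connected by queues. A minimal sketch, using Python's standard-library queues as stand-ins for the Azure queues between stages (tile IDs and record shapes are hypothetical):

```python
from queue import Queue

# Hypothetical stand-ins for the Azure queues that connect the stages.
reprojection_q, reduction1_q, reduction2_q = Queue(), Queue(), Queue()

def data_collection_stage(tile_request):
    """Map stage: fetch a source tile and pass it on for reprojection."""
    source_tile = {"id": tile_request, "data": f"raw:{tile_request}"}
    reprojection_q.put(source_tile)

def reprojection_stage(source_tile):
    """Map stage: convert a source tile to a sinusoidal intermediate tile."""
    sinusoidal = {"id": source_tile["id"],
                  "data": source_tile["data"].replace("raw", "sinusoidal")}
    reduction1_q.put(sinusoidal)

def derivation_reduction_stage(tiles):
    """Reduce stage: combine intermediate tiles into a science result (e.g. ET)."""
    reduction2_q.put({"et": [t["id"] for t in tiles]})

# Drive two tile requests through the chain.
for req in ["h08v05", "h09v05"]:
    data_collection_stage(req)
while not reprojection_q.empty():
    reprojection_stage(reprojection_q.get())
derivation_reduction_stage([reduction1_q.get() for _ in range(reduction1_q.qsize())])
result = reduction2_q.get()
print(result)  # {'et': ['h08v05', 'h09v05']}
```

Because each hand-off goes through a queue, any stage can be scaled out independently by adding workers that drain its input queue.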
(Architecture diagram: Scientists submit requests through the AzureMODIS Service Web Role Portal; a Request Queue feeds the Download, Reprojection, Reduction 1, and Reduction 2 Queues, which drive the Data Collection, Reprojection, Derivation Reduction, and Analysis Reduction stages; source imagery comes from download sites with source metadata, and science results are available for scientific results download.)
http://research.microsoft.com/en-us/projects/azure/azuremodis.aspx
MODISAzure Architectural Big Picture (1/2)
• The ModisAzure Service is the Web Role front door
  • Receives all user requests
  • Queues each request to the appropriate Download, Reprojection, or Reduction Job Queue
• The Service Monitor is a dedicated Worker Role
  • Parses all job requests into tasks – recoverable units of work
  • Execution status of all jobs and tasks is persisted in Tables
(Diagram: a <PipelineStage> Request arrives at the MODISAzure Service (Web Role), which persists <PipelineStage>JobStatus and enqueues to the <PipelineStage> Job Queue; the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue.)
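The Service Monitor's parse-and-persist step can be sketched as follows; plain dicts stand in for the JobStatus/TaskStatus Tables, a stdlib queue for the Task Queue, and all field names are hypothetical:

```python
import uuid
from queue import Queue

job_status = {}   # stands in for the <PipelineStage>JobStatus table
task_status = {}  # stands in for the <PipelineStage>TaskStatus table
task_queue = Queue()

def parse_and_persist(job_request):
    """Parse one job request into recoverable task units (one per tile),
    persist job and task status, and dispatch each task to the queue."""
    job_id = str(uuid.uuid4())
    job_status[job_id] = {"stage": job_request["stage"], "state": "Parsed"}
    for tile in job_request["tiles"]:
        task_id = f"{job_id}:{tile}"
        task_status[task_id] = {"tile": tile, "state": "Queued"}
        task_queue.put(task_id)
    return job_id

job_id = parse_and_persist({"stage": "Reprojection",
                            "tiles": ["h08v05", "h09v05"]})
print(task_queue.qsize())  # 2
```

Persisting every task's status before dispatch is what makes the units recoverable: a crashed worker's task can be found and re-run from the status table.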
MODISAzure Architectural Big Picture (2/2)
All work is actually done by a Worker Role:
• Dequeues tasks created by the Service Monitor
• Retries failed tasks 3 times
• Maintains all task status
(Diagram: the Service Monitor (Worker Role) parses and persists <PipelineStage>TaskStatus and dispatches to the <PipelineStage> Task Queue; Generic Workers (Worker Roles) dequeue tasks and read from <Input> Data Storage.)
Example Pipeline Stage: Reprojection Service
(Diagram: a Reprojection Request enters the Job Queue; the Service Monitor (Worker Role) persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches to the Task Queue; Generic Workers (Worker Roles) consume tasks that point to the ScanTimeList table, the SwathGranuleMeta table, Reprojection Data Storage, and Swath Source Data Storage.)
• Each job entity specifies a single reprojection job request
• Each task entity specifies a single reprojection task (i.e., a single tile)
• Query the SwathGranuleMeta table to get geo-metadata (e.g., boundaries) for each swath tile
• Query the ScanTimeList table to get the list of satellite scan times that cover a target tile
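The table lookups described above can be modeled with plain dicts standing in for the SwathGranuleMeta and ScanTimeList tables (all keys, bounds, and timestamps are illustrative):

```python
# Stand-ins for the Azure Tables a reprojection task consults.
swath_granule_meta = {  # geo-metadata (e.g. boundaries) per swath tile
    "swath-001": {"bounds": (30.0, -120.0, 40.0, -110.0)},
    "swath-002": {"bounds": (30.0, -110.0, 40.0, -100.0)},
}
scan_time_list = {      # satellite scan times covering each target tile
    "h08v05": ["2010-06-01T10:30", "2010-06-01T12:05"],
}

def build_reprojection_task(target_tile):
    """Resolve one reprojection task: look up the scan times covering the
    target tile, then gather geo-metadata for the contributing swath tiles."""
    scan_times = scan_time_list.get(target_tile, [])
    contributing = {s: m["bounds"] for s, m in swath_granule_meta.items()}
    return {"tile": target_tile, "scan_times": scan_times,
            "swaths": contributing}

task = build_reprojection_task("h08v05")
print(len(task["scan_times"]))  # 2
```

Keeping the metadata in tables rather than inside the queue message keeps each task entity small: the message only has to point at the rows to read.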
MODISAzure Four Stage Image Processing PipelineData collection (map) stagebull Downloads requested input
tiles from NASA ftp sitesbull Includes geospatial lookup for
non-sinusoidal tiles that will contribute to a reprojected sinusoidal tile
Reprojection (map) stagebull Converts source tile(s) to
intermediate result sinusoidal tiles
bull Simple nearest neighbor or spline algorithms
Derivation reduction stagebull First stage visible to scientistbull Computes ET in our initial use
Analysis reduction stagebull Optional second stage visible
to scientistbull Enables production of science
analysis artifacts such as maps tables virtual sensors
Reduction 1 Queue
Source Metadata
AzureMODIS Service Web Role Portal
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Science results
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
httpresearchmicrosoftcomen-usprojectsazureazuremodisaspx
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
MODISAzure Architectural Big Picture (12)
bull ModisAzure Service is the Web Role front doorbull Receives all user requestsbull Queues request to appropriate
Download Reprojection or Reduction Job Queue
bull Service Monitor is a dedicated Worker Rolebull Parses all job requests into tasks
ndash recoverable units of work bull Execution status of all jobs and
tasks persisted in Tables
ltPipelineStagegt Request
hellipltPipelineStagegtJobStatus
PersistltPipelineStagegtJob Queue
MODISAzure Service(Web Role)
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
hellip
DispatchltPipelineStagegtTask Queue
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
MODISAzure Architectural Big Picture (22)
All work actually done by a Worker Role
Service Monitor (Worker Role)
Parse amp PersistltPipelineStagegtTaskStatus
GenericWorker (Worker Role)
hellip
hellip
DispatchltPipelineStagegtTask Queue
hellip
ltInputgtData Storage
bull Dequeues tasks created by the Service Monitor
bull Retries failed tasks 3 timesbull Maintains all task status
Example Pipeline Stage Reprojection Service
Reprojection Requesthellip
Service Monitor (Worker Role)
ReprojectionJobStatusPersist
Parse amp PersistReprojectionTaskStatus
GenericWorker (Worker Role)
hellip
Job Queue
hellip
Dispatch
Task Queue
Points to
hellip
ScanTimeList
SwathGranuleMetaReprojection Data
Storage
Each entity specifies a single reprojection job request
Each entity specifies a single reprojection task (ie a single
tile)
Query this table to get geo-metadata (eg boundaries)
for each swath tile
Query this table to get the list of satellite scan times that
cover a target tile
Swath Source Data Storage
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research, Roger Barga, Architect
- The Million Server Datacenter
- HPC and Clouds – Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds – Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components: Fabric Controller
- Key Components: Fabric Controller (2)
- Key Components: Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components – Compute: Web Roles
- Key Components – Compute: Worker Roles
- Suggested Application Model: Using queues for reliable messaging
- Scalable, Fault-Tolerant Applications
- Key Components – Compute: VM Roles
- Slide 24
- 'Grokking' the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce – The Fabric
- Slide 33
- Durable Storage, At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection: Things to Consider
- Slide 54
- Tables Recap
- Queues: Their Unique Role in Building Reliable, Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R. palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure: Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery, Sensors, Models, and Field Data
- MODISAzure: Four Stage Image Processing Pipeline
- MODISAzure: Architectural Big Picture (1/2)
- MODISAzure: Architectural Big Picture (2/2)
- Example Pipeline Stage: Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources: Cloud Research Community Site
- Resources: AzureScope
- Resources: AzureScope (2)
- Demonstration (2)
- Slide 104
Example Pipeline Stage: Reprojection Service

[Architecture diagram] Reprojection requests arrive at a Service Monitor (Worker Role), which persists ReprojectionJobStatus, parses and persists ReprojectionTaskStatus, and dispatches work from a Job Queue onto a Task Queue consumed by Generic Workers (Worker Roles). The workers read from Swath Source Data Storage and write Reprojection Data back to storage.
• Each ReprojectionJobStatus entity specifies a single reprojection job request.
• Each ReprojectionTaskStatus entity specifies a single reprojection task (i.e., a single tile).
• The SwathGranuleMeta table is queried for geo-metadata (e.g., boundaries) for each swath tile.
• The ScanTimeList table is queried for the list of satellite scan times that cover a target tile.
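The Service Monitor's job-to-task fan-out can be sketched with an in-memory queue standing in for the Azure Task Queue; the function, field, and tile names below are illustrative, not taken from the MODISAzure code:

```python
from queue import Queue

def dispatch_reprojection_job(job, task_queue):
    """Fan one reprojection job request out into one queue message per
    tile, as the Service Monitor does when it feeds the Task Queue."""
    for tile in job["tiles"]:
        task_queue.put({"job_id": job["job_id"], "tile": tile})
    return len(job["tiles"])

task_queue = Queue()
n = dispatch_reprojection_job(
    {"job_id": "us-et-2009", "tiles": ["h08v05", "h09v05", "h10v05"]},
    task_queue,
)
print(n)  # 3 tile-sized tasks now wait for Generic Workers to pull
```

Because each message describes a single tile, any worker can process any task, and a failed task simply reappears on the queue for another worker, which is the fault-tolerance abstraction the deck highlights.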
Costs for 1 US Year ET Computation
• Computational costs are driven by data scale and the need to run the reductions multiple times.
• Storage costs are driven by data scale and the six-month project duration.
• Both are small with respect to the people costs, even at graduate-student rates.

[Pipeline diagram] Scientists submit work through the AzureMODIS Service Web Role Portal; the stages communicate via the Request, Download, Reprojection, Reduction 1, and Reduction 2 queues, drawing on Source Metadata and the Source Imagery Download Sites, with scientific results available for download.

Data Collection Stage: 400-500 GB, 60K files, 10 MB/sec, 11 hours, <10 workers ($50 upload, $450 storage)
Reprojection Stage: 400 GB, 45K files, 3500 hours, 20-100 workers ($420 CPU, $60 download)
Derivation Reduction Stage: 5-7 GB, 55K files, 1800 hours, 20-100 workers ($216 CPU, $1 download, $6 storage)
Analysis Reduction Stage: <10 GB, ~1K files, 1800 hours, 20-100 workers ($216 CPU, $2 download, $9 storage)
Total: $1420
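As a quick check, summing the per-stage line items above comes out close to the quoted total (the items sum to $1,430; the slide quotes roughly $1,420):

```python
# Per-stage line items in dollars, as listed on the cost slide.
stage_costs = {
    "data collection":      {"upload": 50, "storage": 450},
    "reprojection":         {"cpu": 420, "download": 60},
    "derivation reduction": {"cpu": 216, "download": 1, "storage": 6},
    "analysis reduction":   {"cpu": 216, "download": 2, "storage": 9},
}
total = sum(v for items in stage_costs.values() for v in items.values())
print(total)  # 1430
```

CPU dominates ($852 of the total), consistent with the observation that compute cost is driven by data scale and repeated reduction runs.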
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Costs for 1 US Year ET Computation
bull Computational costs driven by data scale and need to run reduction multiple times
bull Storage costs driven by data scale and 6 month project duration
bull Small with respect to the people costs even at graduate student rates
Reduction 1 Queue
Source Metadata
Request Queue
Scientific Results Download
Data Collection Stage
Source Imagery Download Sites
Reprojection Queue
Reduction 2 Queue
DownloadQueue
Scientists
Analysis Reduction StageDerivation Reduction Stage Reprojection Stage
400-500 GB60K files10 MBsec11 hourslt10 workers
$50 upload$450 storage
400 GB45K files3500 hours20-100 workers
5-7 GB55K files1800 hours20-100 workers
lt10 GB~1K files1800 hours20-100 workers
$420 cpu$60 download
$216 cpu$1 download$6 storage
$216 cpu$2 download$9 storage
AzureMODIS Service Web Role Portal
Total $1420
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Observations and Experiencebull Clouds are the largest scale computer centers ever constructed and have
the potential to be important to both large and small scale science problems
bull Equally import they can increase participation in research providing needed resources to userscommunities without ready access
bull Clouds suitable for ldquoloosely coupledrdquo data parallel applications and can support many interesting ldquoprogramming patternsrdquo but tightly coupled low-latency applications do not perform optimally on clouds today
bull Provide valuable fault tolerance and scalability abstractions
bull Clouds as amplifier for familiar client tools and on premise compute
bull Clouds services to support research provide considerable leverage for both individual researchers and entire communities of researchers
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Resources Cloud Research Community Sitehttpresearchmicrosoftcomazure bull Getting started steps for
developersbull Available research services bull Use cases on Azure for researchbull Event Announcementsbull Detailed tutorialsbull Technical papers
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Resources AzureScopehttpazurescopecloudappnet bull Simple benchmarks illustrating
basic performance for compute and storage services
bull Benchmarks for reference algorithms
bull Best Practice tipsbull Code Samples
Email us with questions at xcgngagemicrosoftcom
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom
- Windows Azure for Research Roger Barga Architect
- The Million Server Datacenter
- HPC and Clouds ndash Select Comparisons
- HPC Node Architecture
- HPC Interconnects
- Modern Data Center Network
- HPC Storage Systems
- HPC and Clouds ndash Select Comparisons (2)
- Slide 9
- Slide 10
- Application Model Comparison
- Application Model Comparison (2)
- Key Components
- Key Components Fabric Controller
- Key Components Fabric Controller (2)
- Key Components Fabric Controller (3)
- Creating a New Project
- Windows Azure Compute
- Key Components ndash Compute Web Roles
- Key Components ndash Compute Worker Roles
- Suggested Application Model Using queues for reliable messaging
- Scalable Fault Tolerant Applications
- Key Components ndash Compute VM Roles
- Slide 24
- lsquoGrokkingrsquo the service model
- Automated Service Management
- Service Definition
- Service Configuration
- GUI
- Deploying to the cloud
- Service Management API
- The Secret Sauce ndash The Fabric
- Slide 33
- Durable Storage At Massive Scale
- Blob Features and Functions
- Containers
- Two Types of Blobs Under the Hood
- Blocks
- Pages
- BLOB Leases
- Windows Azure Drive
- Windows Azure Drive API
- BLOB Guidance
- Table Structure
- Windows Azure Tables
- Is not relational
- Windows Azure Queues
- Storage Partitioning
- Partition Keys In Each Abstraction
- Replication Guarantee
- Scalability Targets
- Partitions and Partition Ranges
- Key Selection Things to Consider
- Slide 54
- Tables Recap
- Queues Their Unique Role in Building Reliable Scalable Applica
- Queue Terminology
- Message Lifecycle
- Truncated Exponential Back Off Polling
- Removing Poison Messages
- Removing Poison Messages (2)
- Removing Poison Messages (3)
- Queues Recap
- Windows Azure Storage Takeaways
- Slide 65
- Picking the Right VM Size
- Using Your VM to the Maximum
- Exploiting Concurrency
- Finding Good Code Neighbors
- Scaling Appropriately
- Storage Costs
- Saving Bandwidth Costs
- Compressing Content
- Best Practices Summary
- Cloud Computing for eScience Applications
- NCBI BLAST
- Opportunities for Cloud Computing
- AzureBLAST
- AzureBLAST Task-Flow
- Micro-Benchmarks Inform Design
- AzureBLAST (2)
- AzureBLAST Job Portal
- Demonstration
- R palustris as a platform for H2 production
- All-Against-All Experiment
- Our Approach
- End Result
- Understanding Azure by analyzing logs
- Surviving System Upgrades
- Surviving Storage Failures
- MODISAzure Computing Evapotranspiration (ET) in the Cloud
- Computing Evapotranspiration (ET)
- ET Synthesizes Imagery Sensors Models and Field Data
- MODISAzure Four Stage Image Processing Pipeline
- MODISAzure Architectural Big Picture (12)
- MODISAzure Architectural Big Picture (22)
- Example Pipeline Stage Reprojection Service
- Costs for 1 US Year ET Computation
- Observations and Experience
- Resources Cloud Research Community Site
- Resources AzureScope
- Resources AzureScope (2)
- Demonstration (2)
- Slide 104
-
Demonstration
Azure in Action Manning PressProgramming Windows Azure OrsquoReilly PressBing Channel 9 Windows AzureBing Windows Azure Platform Training Kit - November Updatehttpresearchmicrosoftcomazurexcgngagemicrosoftcom