tpponthecloud# - iit bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · amazon#web#services#...
TRANSCRIPT
![Page 1: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/1.jpg)
TPP On The Cloud
Joe Slagel
![Page 2: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/2.jpg)
Lecture topics
• Introduc5on to Cloud Compu5ng and Amazon Web Services
• Overview of TPP Cloud components
• Setup trial AWS and use of the new TPP Web Launcher for Amazon (TWA)
• Future TPP Direc5on with the Cloud
2
![Page 3: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/3.jpg)
So What is Cloud Compu5ng?
3
Cloud compu5ng is Internet-‐based compu3ng, whereby shared resources, so9ware, and informa3on are provided to computer and other devices on demand, like the electricity grid.
-‐-‐Wikipedia
![Page 4: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/4.jpg)
Three Aspects of Cloud Compu5ng
So9ware applica3ons available via the browser
E.g. Gmail, flickr, NCBI
SaaS So9ware as a
Service
Hosted development environment for building and deploying cloud applica3ons
E.g Google Apps, Microso9 Azure, Salesforce.com
PaaS PlaKorm as a
Service
Storage, servers and networking components provided on demand through the internet
E.G Amazon EC2 & S3, Rackspace, IBM, HP,…
IaaS Infrastructure as a Service
![Page 5: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/5.jpg)
Cloud Compu5ng Players
5 Source: Gartner (June 2015)
![Page 6: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/6.jpg)
Amazon Web Services
• Collec5on of web compu5ng services offered by Amazon
• “Elas5c” IT infrastructure – allocate computers, storage, and other services as needed
• Cost effec5ve -‐-‐ pay only for what you use
• Easy to use – simple API accessed over HTTP which supports almost every language
• Large number of tools available built for it
Elas3c Compute Cloud (EC2)
SimpleDB Simple Storage Service (S3)
Simple Queue Service
AWS Import/Export
Elas3c MapReduce
Flexible Payments Services
Mechanical Turk
And many more…
![Page 7: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/7.jpg)
Amazon Web Services Regions
![Page 8: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/8.jpg)
Amazon S3: Simple Storage Service
• S3 lets you store files/data on the “web” in “buckets”
– Virtually unlimited storage, bandwidth, and # users
– No loss of data – 99.999999999 % durability/yr
– Always on -‐ 99.99% availability/yr • Files can range from 1 byte to 5
gigabytes* in size with no limit to # files
• Authen5ca5on mechanisms ensure that data is kept secure and access rights can be granted to specific users.
• Uses standard h[p REST and SOAP interfaces to access the data that work with any language
First Tier S3 Pricing
Storage $0.150 $0.11 $0.0295 per GB – first 50 TB / month of storage used ( 100 GB = $35/yr )
Data Transfer
In: $0.100 Free Out: $0.120 per GB first 10 TB/month (sliding scale)
Requests • $0.01 per 1,000 PUT, COPY, POST, or LIST requests • $0.01 per 10,000 GET and all other requests • Delete requests are free
![Page 9: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/9.jpg)
Amazon S3: Management Console
Features include:
• Create/delete buckets • Create/delete folders • Upload or download files
• Modify proper5es (permissions)
9
Web based, secure management tool for managing S3 storage
![Page 10: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/10.jpg)
Amazon EC2: Elas5c Compute Cloud
• EC2 allows you launch new computer instances in minutes on demand
• Can choose from “Small” to “High-‐CPU Extra Large” to Cluster compute instances
• Choice of large assortment of different OS images (Linux, Windows) or create your own amazon machine image (AMI)
• Billed only for actual usage/hr + data transfers
• Full control of the instance
US East EC2 Linux/Unix Pricing
Small 1.7 GB 32-‐bit 1 Core, 1 ECU1 , 160 GB storage, Moderate I/O
$0.08/hr + I/O2
Large 7.5 GB, 64-‐bit, 2 Core 2 ECU1each 850 GB storage, High I/O
$0.32/hr + I/O2
High -‐CPU Extra Large
7GB, 64-‐bit, 8 Core, 2.5 ECU1 each 1690 GB storage, High I/O
$0.64/hr + I/O2
1 EC2 Compute Units – One unit is equivalent CPU capacity of a 1.0-‐1.2 GHZ Operon or Xeon processor 2 Data transfer in is $0.10/GB free, transfer out is $0.12/GB for first 10 TB/month
![Page 11: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/11.jpg)
Amazon EC2: Management Console
• Start and stop EC2 instances • Find, manage, and create Amazon Machine Images (AMIs)
• Monitor instances with real 5me-‐opera5onal metrics 11
Web based, secure management tool for managing EC2
![Page 12: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/12.jpg)
Gecng Started: Account Crea5on
• Go to h[ps://aws.amazon.com/ and click on the “Create a Free Account” bu[on
![Page 13: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/13.jpg)
Gecng Started: AWS Console
![Page 14: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/14.jpg)
Gecng Started: Amazon Key ID and Secret Key
Your Amazon API key and secret key are used to programma5cally access Amazon services
14
1. Under the account menu select “Security Creden3als”
2. Go to iAM users, select user to see Access Key ID/Keys
![Page 15: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/15.jpg)
Lecture topics
• Introduc5on to Cloud Compu5ng and Amazon Web Services
• Overview of new TPP Cloud components
• Setup and Trial of the new TPP Web Launcher for Amazon (TWA)
• Future TPP Direc5on with the Cloud
15
![Page 16: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/16.jpg)
TPP Amazon Images
• Based on official public releases of Ubuntu
• Contain addi5onal open sogware (omssa, myrimatch, etc)
• Publicly available scripts for building, upda5ng and publishing images
• Instruc5ons on usage and details documented on wiki site
16
Publicly available Amazon Machine Instances (AMI) for the TPP
http://tools.proteomecenter.org/wiki/index.php?title=Amazon_EC2_AMI
![Page 17: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/17.jpg)
Using TPP on the Cloud
• Simple web based launcher to start petunia on a Amazon server
• Starts up an pre-‐configured TPP instance
• Doesn’t require any sogware installa5on and is inexpensive to run
• Great tool for just trying out TPP
• Can be used when memory and be[er CPU is needed for an analysis
• Advanced command line toolset
• Launches parallel searches of files across mul5ple nodes
• Currently supports X!Tandem, OMSSA, MyriMatch, InsPect
• Manage all aspects of cloud compu5ng including data transfer, scheduling, and instances
• Great for quickly and inexpensively processing large amounts of data
TPP Web Application (TWA)
TPP Amazon command line tools (amztpp)
Direct Cloud support in TPP’s User Interface, Petunia
![Page 18: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/18.jpg)
TPP Web Launcher for Amazon (TWA)
1. Navigate to h[p://tools.proteomecenter.org/twa
2. Enter your Amazon Key ID and Secret
3. Click “Start Instance” 4. Welcome to Petunia
5. When you are done just click “Stop Instance”
18
![Page 19: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/19.jpg)
Using the amztpp tool
• Requires a separate installa5on – not included in the standard TPP installa5on (See wiki for instruc5ons and post on spc-‐discuss mailing list)
• Has a simple command line interface amztpp <op3ons> <command> <file(s)>
Examples:
amztpp xtandem *.mzML => submit xtandem searches
amztpp –n 5 launch => start 5 more instances
amztpp status => report current status
amztpp –man => display manual
• Supports X!Tandem, omssa, Myrimatch, and Inspect search engines
• Can run mul5ple different search engines in parallel on mul5ple instances
• Data automa5cally copied to the cloud and results downloaded when available
• Automa5cally launches instances based on amount of data being searched
• NEW! Execute and monitor X!Tandem searches via Petunia, as well as basic cloud account administra5on 19
![Page 20: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/20.jpg)
Amazon Web Services Cost
Hypothe5cal Analysis – 100 mzXML files
– Avg 100MB/file
– Avg 10 min/file
20
$-‐
$5.00
$10.00
$15.00
0.00 5.00
10.00 15.00 20.00
1 10 30 50 70 90
Hou
rs
# Instances
Compute Time vs Cost
Compute Time Cost
# EC2 Time (Hrs) Cost
1 16.98 $8.10
5 3.7 $8.10
100 0.54 $11.34
Data Set
# EC2 / Threads # Files
Scan Count Time Cost
Cost/ file
0021 20 / 8x 132 1,372,984 4:03 $ 50.05 $ 0.38
0049 20 / 8x 132 2,279,874 6:08 $ 46.24 $ 0.35
Actual Results
Amazon EC2 99%
AmazonS3 1%
AmazonSQS 0%
AWS Cost Breakdown
![Page 21: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/21.jpg)
amztpp: What does it cost?
Canis lupus familiaris Data Set • Total 982 raw files organized in 35 folders
– 598 raw files from LTQ Orbitrap – 288 raw files from LTQ – 96 raw files from LCQ Deca
• Searched using X!Tandem, InsPecT, MyriMatch and OMSSA
• Total of 3,928 searches
• Total of 10,759,379 spectrum
• Spot price (-‐-‐ec2-‐spot) of $ 0.22 (market $.2160)
• Max # of EC2 instances (-‐m) = 100
• Max # of parallel upload/download processes (-‐P) = 10
EC2
Opera5on Spot Price Hours Cost
m1.xlarge $ 0.216 95 $ 20.52
m1.xlarge $ 0.22 328 $ 72.16
Subtotal 423 $ 92.68
Opera5on Price Usage Cost
PublicIP-‐In $ 0.12/GB 0.0062 $ 0.00
PublicIP-‐Out $ 0.12/GB 0.0105 $ 0.00
InterZone-‐In $ 0.12/GB 0.0211 $ 0.00
InterZone-‐Out $ 0.12/GB 0.0005 $ 0.00
Subtotal $ 0.00
EC2 Total $ 92.68
S3
Opera5on Price Usage Cost
PUT, COPY, POST, LIST $0.01/1,000 11,909 $ 0.12
GET, all other $0.01/10,000 17,433 $ 0.02
Data Transfer In $ -‐ 118.08 $ -‐
Data Transfer Out $ 0.12/GB 165.56 $ 19.87
S3 Total $ 20.00
SQS
Opera5on Price Usage Cost
Requests $0.01/10,000 58,392 $ 0.06
Data Transfer In $ -‐ 0 $ -‐
Data Transfer Out $ 0.12/GB 0.023 $ 0.00
SQS Total $ 0.06
• Total AWS cost of $112.74 • 82% was EC instances
• Time to comple3on 5.95 hrs • 3,915/3,928 Completed (99%)
![Page 22: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/22.jpg)
Advantages of Cloud vs. Cluster
• Scalable – Unlimited amount of disk space
– As many “servers” as needed
• Dependable – Large distributed system
– Secure – Resilient
• Plauorm agnos5c – CentOS, Debian, Ubuntu, Windows,
etc. – 32-‐bit/64-‐bit
• No support costs
• High ini5al startup cost
• Limited scalability
• Single point of failure
• Requires local IT personal for maintenance
• OS/Hardware lock-‐in
• Requires 3rd party grid sogware (PBS/GridEngine, etc)
• High ini5al costs and variable support costs
• Scheduling issues/complexity
• Users compete for resources
Tradi5onal Cluster
![Page 23: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/23.jpg)
Advantages of Cloud vs. Cluster
Tradi5onal Cluster
• Performance issues – Bandwidth between instances
– Bandwidth between S3 upload/downloads
– Instance performance
– Resource alloca3on (instances)
– File I/O
• Cost model – Could get expensive
– Pay as you go means you also pay for mistakes
• Lack of control • Regulatory compliance
• High Performance – Can finely tune network performance
– Can finely tune hardware performance
– Excep3onal hardware capabili3es
– Excep3onal File I/O
– Dedicated resources
• Instances immediately available
• Host MPI Applica5ons
• Security
![Page 24: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/24.jpg)
TPP Cloud Future Direc5ons
• Include features for persistently storing MS data and TPP results both using TWA and amztpp
• Improved performance in file transfers to S3
• Launch other search engines in the cloud via Petunia
• Support more TPP programs (e.g prophets)
• Ability to share data sets on the cloud
24
![Page 25: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/25.jpg)
Ar5cle in MCP describing AWS & TPP
25
![Page 26: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–](https://reader031.vdocument.in/reader031/viewer/2022021711/5b4ff6787f8b9a5a6f8d723d/html5/thumbnails/26.jpg)
More Informa5on
• TPP Cloud Services hxp://tools.proteomecenter.org/wiki/index.php?3tle=TPP:Cloud
• Cloud compu5ng with Amazon Web Services hxp://www.ibm.com/developerworks/library/ar-‐cloudaws1/
• Amazon Elas5c Compute Cloud hxp://aws.amazon.com/ec2/
• Amazon Simple Storage Solu5on hxp://aws.amazon.com/s3
• TPP Mailing List hxp://groups.google.com/group/spctools-‐discuss
26