tpponthecloud# - iit bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · amazon#web#services#...

26
TPP On The Cloud Joe Slagel

Upload: dangtuyen

Post on 19-Jul-2018

218 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

TPP  On  The  Cloud  

Joe  Slagel  

Page 2: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Lecture  topics  

•  Introduc5on  to  Cloud  Compu5ng  and  Amazon  Web  Services  

•  Overview  of  TPP  Cloud  components  

•  Setup  trial  AWS  and  use  of  the  new  TPP  Web  Launcher  for  Amazon  (TWA)  

•  Future  TPP  Direc5on  with  the  Cloud  

2

Page 3: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

So  What  is  Cloud  Compu5ng?  

3

Cloud  compu5ng  is  Internet-­‐based  compu3ng,  whereby  shared  resources,  so9ware,  and  informa3on  are  provided  to  computer  and  other  devices  on  demand,  like  the  electricity  grid.  

-­‐-­‐Wikipedia  

Page 4: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Three  Aspects  of  Cloud  Compu5ng  

So9ware  applica3ons  available  via  the  browser  

E.g.  Gmail,  flickr,  NCBI  

SaaS  So9ware  as  a  

Service  

Hosted  development  environment  for  building  and  deploying  cloud  applica3ons  

E.g  Google  Apps,  Microso9  Azure,  Salesforce.com  

PaaS  PlaKorm  as  a  

Service  

Storage,  servers  and  networking  components  provided  on  demand  through  the  internet  

E.G  Amazon  EC2  &  S3,  Rackspace,  IBM,  HP,…  

IaaS  Infrastructure  as  a  Service  

Page 5: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Cloud  Compu5ng  Players  

5 Source: Gartner (June 2015)

Page 6: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  Web  Services  

•  Collec5on  of  web  compu5ng  services  offered  by  Amazon  

•  “Elas5c”  IT  infrastructure  –  allocate  computers,  storage,  and  other  services  as  needed  

•  Cost  effec5ve  -­‐-­‐  pay  only  for  what  you  use  

•  Easy  to  use  –  simple  API  accessed  over  HTTP  which  supports  almost  every  language  

•  Large  number  of  tools  available  built  for  it  

Elas3c  Compute  Cloud  (EC2)  

SimpleDB   Simple  Storage  Service  (S3)  

Simple  Queue  Service  

AWS  Import/Export  

Elas3c  MapReduce  

Flexible  Payments  Services  

Mechanical  Turk  

And  many  more…  

Page 7: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  Web  Services  Regions  

Page 8: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  S3:  Simple  Storage  Service  

•  S3  lets  you  store  files/data  on  the  “web”  in  “buckets”  

–  Virtually  unlimited  storage,  bandwidth,  and  #  users  

–  No  loss  of  data  –  99.999999999  %  durability/yr  

–  Always  on  -­‐  99.99%  availability/yr  •  Files  can  range  from  1  byte  to  5  

gigabytes*  in  size  with  no  limit  to  #  files  

•  Authen5ca5on  mechanisms  ensure  that  data  is  kept  secure  and  access  rights  can  be  granted  to  specific  users.  

•  Uses  standard  h[p  REST  and  SOAP  interfaces  to  access  the  data  that  work  with  any  language  

First  Tier  S3  Pricing  

Storage   $0.150  $0.11  $0.0295  per  GB  –  first  50  TB  /  month  of  storage  used  (  100  GB  =  $35/yr  )  

Data  Transfer  

In:        $0.100    Free  Out:  $0.120  per  GB  first  10  TB/month  (sliding  scale)  

Requests   •   $0.01  per  1,000  PUT,  COPY,  POST,  or  LIST  requests  •   $0.01  per  10,000  GET  and  all  other  requests  •   Delete  requests  are  free  

Page 9: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  S3:  Management  Console  

Features  include:  

•  Create/delete  buckets  •  Create/delete  folders  •  Upload  or  download  files  

• Modify  proper5es  (permissions)  

9

Web  based,  secure  management  tool  for  managing  S3  storage  

Page 10: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  EC2:  Elas5c  Compute  Cloud  

•  EC2  allows  you  launch  new  computer  instances  in  minutes  on  demand  

•  Can  choose  from  “Small”  to  “High-­‐CPU  Extra  Large”  to  Cluster  compute  instances  

•  Choice  of  large  assortment  of  different  OS  images  (Linux,  Windows)  or  create  your  own  amazon  machine  image  (AMI)  

•  Billed  only  for  actual  usage/hr  +  data  transfers  

•  Full  control  of  the  instance  

US  East  EC2  Linux/Unix  Pricing  

Small   1.7  GB  32-­‐bit    1  Core,  1  ECU1  ,    160  GB  storage,    Moderate  I/O  

$0.08/hr  +  I/O2  

Large   7.5  GB,  64-­‐bit,  2  Core  2  ECU1each  850  GB  storage,    High  I/O  

$0.32/hr  +  I/O2  

High  -­‐CPU  Extra  Large  

7GB,  64-­‐bit,  8  Core,  2.5  ECU1  each  1690  GB  storage,    High  I/O  

$0.64/hr  +  I/O2  

1  EC2  Compute  Units  –  One  unit  is  equivalent  CPU  capacity  of  a  1.0-­‐1.2  GHZ  Operon  or  Xeon  processor  2  Data  transfer  in  is  $0.10/GB  free,  transfer  out  is  $0.12/GB  for  first  10  TB/month  

Page 11: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  EC2:  Management  Console  

•  Start  and  stop  EC2  instances  •  Find,  manage,  and  create  Amazon  Machine  Images  (AMIs)  

•  Monitor  instances  with  real  5me-­‐opera5onal  metrics   11

Web  based,  secure  management  tool  for  managing  EC2  

Page 12: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Gecng  Started:  Account  Crea5on  

•  Go  to  h[ps://aws.amazon.com/  and  click  on  the  “Create  a  Free  Account”  bu[on    

Page 13: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Gecng  Started:  AWS  Console  

Page 14: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Gecng  Started:  Amazon  Key  ID  and  Secret  Key  

Your  Amazon  API  key  and  secret  key  are  used  to  programma5cally  access  Amazon  services  

14

1.  Under  the  account  menu  select  “Security  Creden3als”  

2.  Go  to  iAM  users,  select  user  to  see  Access  Key  ID/Keys  

Page 15: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Lecture  topics  

•  Introduc5on  to  Cloud  Compu5ng  and  Amazon  Web  Services  

•  Overview  of  new  TPP  Cloud  components  

•  Setup  and  Trial  of  the  new  TPP  Web  Launcher  for  Amazon  (TWA)  

•  Future  TPP  Direc5on  with  the  Cloud  

15

Page 16: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

TPP  Amazon  Images  

•  Based  on  official  public  releases  of  Ubuntu  

•  Contain  addi5onal  open  sogware  (omssa,  myrimatch,  etc)  

•  Publicly  available  scripts  for  building,  upda5ng  and  publishing  images  

•  Instruc5ons  on  usage  and  details  documented  on  wiki  site  

16

Publicly available Amazon Machine Instances (AMI) for the TPP

http://tools.proteomecenter.org/wiki/index.php?title=Amazon_EC2_AMI

Page 17: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Using  TPP  on  the  Cloud  

•  Simple  web  based  launcher  to  start  petunia  on  a  Amazon  server  

•  Starts  up  an  pre-­‐configured  TPP  instance  

•  Doesn’t  require  any  sogware  installa5on  and  is  inexpensive  to  run  

•  Great  tool  for  just  trying  out  TPP  

•  Can  be  used  when  memory  and  be[er  CPU  is  needed  for  an  analysis  

•  Advanced  command  line  toolset  

•  Launches  parallel  searches  of  files  across  mul5ple  nodes  

•  Currently  supports  X!Tandem,  OMSSA,  MyriMatch,  InsPect  

•  Manage  all  aspects  of  cloud  compu5ng  including  data  transfer,  scheduling,  and  instances  

•  Great  for  quickly  and  inexpensively  processing  large  amounts  of  data  

TPP Web Application (TWA)

TPP Amazon command line tools (amztpp)

Direct Cloud support in TPP’s User Interface, Petunia

Page 18: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

TPP  Web  Launcher  for  Amazon  (TWA)  

1.   Navigate  to  h[p://tools.proteomecenter.org/twa  

2.   Enter  your  Amazon  Key  ID  and  Secret  

3.   Click  “Start  Instance”  4.   Welcome  to  Petunia  

5.   When  you  are  done  just  click  “Stop  Instance”  

18

Page 19: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Using  the  amztpp  tool  

•  Requires  a  separate  installa5on  –  not  included  in  the  standard  TPP  installa5on  (See  wiki  for  instruc5ons  and  post  on  spc-­‐discuss  mailing  list)  

•  Has  a  simple  command  line  interface  amztpp  <op3ons>  <command>  <file(s)>  

Examples:  

amztpp  xtandem  *.mzML    =>  submit  xtandem  searches  

amztpp  –n  5  launch                      =>  start  5  more  instances  

amztpp  status                                        =>  report  current  status  

amztpp  –man                                        =>  display  manual  

•  Supports  X!Tandem,  omssa,  Myrimatch,  and  Inspect  search  engines  

•  Can  run  mul5ple  different  search  engines  in  parallel  on  mul5ple  instances  

•  Data  automa5cally  copied  to  the  cloud  and  results  downloaded  when  available  

•  Automa5cally  launches  instances  based  on  amount  of  data  being  searched  

•  NEW!    Execute  and  monitor  X!Tandem  searches  via  Petunia,  as  well  as  basic  cloud  account  administra5on   19

Page 20: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Amazon  Web  Services  Cost  

Hypothe5cal  Analysis  –  100  mzXML  files  

–  Avg  100MB/file  

–  Avg  10  min/file  

20

 $-­‐        

 $5.00    

 $10.00    

 $15.00    

0.00  5.00  

10.00  15.00  20.00  

1   10   30   50   70   90  

Hou

rs  

#  Instances  

Compute  Time  vs  Cost  

Compute  Time   Cost  

#  EC2   Time  (Hrs)   Cost  

1   16.98   $8.10  

5   3.7   $8.10  

100   0.54   $11.34  

Data  Set  

#  EC2  /  Threads   #  Files  

Scan  Count   Time   Cost  

Cost/  file  

0021   20  /  8x   132        1,372,984     4:03    $    50.05      $    0.38    

0049   20  /  8x   132        2,279,874     6:08    $    46.24      $    0.35    

Actual  Results  

Amazon EC2 99%

AmazonS3 1%

AmazonSQS 0%

AWS Cost Breakdown

Page 21: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

amztpp:  What  does  it  cost?  

Canis  lupus  familiaris  Data  Set  •  Total  982  raw  files  organized  in  35  folders  

–   598  raw  files  from  LTQ  Orbitrap  –   288  raw  files  from  LTQ  –   96  raw  files  from  LCQ  Deca  

•   Searched  using  X!Tandem,  InsPecT,  MyriMatch  and  OMSSA  

•  Total  of  3,928  searches  

•  Total  of  10,759,379  spectrum  

•   Spot  price  (-­‐-­‐ec2-­‐spot)  of  $  0.22    (market  $.2160)  

•   Max  #  of  EC2  instances  (-­‐m)  =  100  

•   Max  #  of  parallel  upload/download  processes  (-­‐P)    =  10  

EC2  

Opera5on   Spot  Price   Hours   Cost  

m1.xlarge    $  0.216     95    $        20.52    

m1.xlarge    $  0.22     328    $        72.16    

Subtotal   423    $        92.68    

Opera5on   Price   Usage   Cost  

PublicIP-­‐In    $  0.12/GB     0.0062    $              0.00    

PublicIP-­‐Out    $  0.12/GB     0.0105    $              0.00    

InterZone-­‐In    $  0.12/GB     0.0211    $              0.00    

InterZone-­‐Out    $  0.12/GB     0.0005    $              0.00    

Subtotal    $              0.00    

EC2  Total    $        92.68    

S3  

Opera5on   Price   Usage   Cost  

PUT,  COPY,  POST,  LIST    $0.01/1,000      11,909      $              0.12    

GET,  all  other    $0.01/10,000        17,433      $              0.02    

Data  Transfer  In    $        -­‐         118.08    $                          -­‐        

Data  Transfer  Out    $  0.12/GB              165.56    $        19.87    

S3  Total    $        20.00    

SQS  

Opera5on   Price   Usage   Cost  

Requests   $0.01/10,000      58,392      $              0.06    

Data  Transfer  In    $  -­‐         0    $                          -­‐        

Data  Transfer  Out    $  0.12/GB               0.023    $              0.00    

SQS  Total    $              0.06    

• Total  AWS  cost  of  $112.74  •   82%  was  EC  instances  

•   Time  to  comple3on  5.95  hrs  •   3,915/3,928  Completed  (99%)  

Page 22: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Advantages  of  Cloud  vs.  Cluster  

•  Scalable  –  Unlimited  amount  of  disk  space  

–  As  many  “servers”  as  needed  

•  Dependable  –  Large  distributed  system  

–  Secure  –  Resilient  

•  Plauorm  agnos5c  –  CentOS,  Debian,  Ubuntu,  Windows,  

etc.  –  32-­‐bit/64-­‐bit  

•  No  support  costs  

•  High  ini5al  startup  cost  

•  Limited  scalability  

•  Single  point  of  failure  

•  Requires  local  IT  personal  for  maintenance  

•  OS/Hardware  lock-­‐in  

•  Requires  3rd  party  grid  sogware  (PBS/GridEngine,  etc)  

•  High  ini5al  costs  and  variable  support  costs  

•  Scheduling  issues/complexity  

•  Users  compete  for  resources  

Tradi5onal  Cluster  

Page 23: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Advantages  of  Cloud  vs.  Cluster  

Tradi5onal  Cluster  

•  Performance  issues  –  Bandwidth  between  instances  

–  Bandwidth  between  S3  upload/downloads  

–  Instance  performance  

–  Resource  alloca3on  (instances)  

–  File  I/O  

•  Cost  model  –  Could  get  expensive  

–  Pay  as  you  go  means  you  also  pay  for  mistakes  

•  Lack  of  control  •  Regulatory  compliance  

•  High  Performance  –  Can  finely  tune  network  performance  

–  Can  finely  tune  hardware  performance  

–  Excep3onal  hardware  capabili3es  

–  Excep3onal  File  I/O  

–  Dedicated  resources  

•  Instances  immediately  available  

•  Host  MPI  Applica5ons  

•  Security  

Page 24: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

TPP  Cloud  Future  Direc5ons  

•  Include  features  for  persistently  storing  MS  data  and  TPP  results  both  using  TWA  and  amztpp  

•  Improved  performance  in  file  transfers  to  S3  

•  Launch  other  search  engines  in  the  cloud  via  Petunia  

•  Support  more  TPP  programs  (e.g  prophets)  

•  Ability  to  share  data  sets  on  the  cloud  

24

Page 25: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

Ar5cle  in  MCP  describing  AWS  &  TPP  

25

Page 26: TPPOnTheCloud# - IIT Bombaysanjeeva/itpws/wp-content/uploads/2016/01/... · Amazon#Web#Services# • Colleconofwebcompung services#offered#by#Amazon# • “Elas5c”#IT#infrastructure#–

More  Informa5on  

•  TPP  Cloud  Services  hxp://tools.proteomecenter.org/wiki/index.php?3tle=TPP:Cloud  

•  Cloud  compu5ng  with  Amazon  Web  Services  hxp://www.ibm.com/developerworks/library/ar-­‐cloudaws1/  

•  Amazon  Elas5c  Compute  Cloud    hxp://aws.amazon.com/ec2/  

•  Amazon  Simple  Storage  Solu5on  hxp://aws.amazon.com/s3  

•  TPP  Mailing  List  hxp://groups.google.com/group/spctools-­‐discuss  

26