Genome Assembly: Novel Applications by Harnessing Emerging Sequencing Technologies and Graph Algorithms

Sergey Koren, PhD Thesis Defense, March 16, 2012



Outline  

1. Preliminaries

2. Challenges and Solutions

3. Results


Central Dogma

Figure source: http://www.cs.cmu.edu/~wcohen/GuideToBiology-pictures-color-release1.5.pdf

Sequencing Output

DNA Data Tsunami

Current world-wide sequencing capacity exceeds 13 Pbp/year and is growing at 5x per year!

"Will Computers Crash Genomics?" Elizabeth Pennisi (2011) Science. 331(6018): 666-668.

Biological Goals

•  Disease outbreaks
  –  V. cholerae in Haiti
  –  B. anthracis in heroin users
•  Learn what the cell is doing
  –  DNA is transcribed into RNA to be translated into proteins
•  Studying whole communities (metagenomics)
  –  Human symbiotic bacteria
  –  Ocean bacterial population
•  Studying the dark matter
  –  Studying individual cells (single-cell)

What is Assembly

•  Break target into pieces we can read
•  Convert sequence to a graph
  –  Requires identifying segments with shared origin
  –  Sequences occurring multiple times (repeats) make this ambiguous
  –  Repeat sequences must be spanned with sufficient unique sequence to be unambiguous
•  Find simple paths in the graph
  –  Sequences have a direction so graph is bi-directed
  –  Require no forks, otherwise graph is ambiguous

…AGCCTAGACCTACAGGATGCGCGACACGT GGATGCGCGACACGTCGCATATCCGGT… GGATGCGCGACACGTTAGCATAGCCTA… TTGCTC CCTACA
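The fork in the reads above can be made concrete with a minimal sketch, assuming a naive all-pairs suffix-prefix overlap search. The names `longest_overlap`, `overlap_graph`, and `forks`, the read labels `r1` to `r3`, and the 10 bp minimum overlap are illustrative assumptions, not taken from the talk. The reads are the three long fragments shown above with the leading and trailing elisions dropped.

```python
from itertools import permutations

# The three long fragments from the slide, without the "…" elisions.
READS = {
    "r1": "AGCCTAGACCTACAGGATGCGCGACACGT",
    "r2": "GGATGCGCGACACGTCGCATATCCGGT",
    "r3": "GGATGCGCGACACGTTAGCATAGCCTA",
}

def longest_overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def overlap_graph(reads, min_len=10):
    """Directed graph: graph[a][b] = overlap length when a's suffix matches b's prefix."""
    graph = {name: {} for name in reads}
    for a, b in permutations(reads, 2):
        k = longest_overlap(reads[a], reads[b], min_len)
        if k:
            graph[a][b] = k
    return graph

def forks(graph):
    """Nodes with more than one outgoing overlap, where a simple path cannot continue."""
    return [node for node, out in graph.items() if len(out) > 1]

graph = overlap_graph(READS)
print(graph)         # overlaps found between the reads
print(forks(graph))  # reads after which the path is ambiguous
```

Here `r1` overlaps both `r2` and `r3` by the same 15 bases, so the graph forks after `r1`, and the overlaps alone cannot say which continuation is correct. This sketch ignores reverse-complement overlaps, so it builds a plain directed graph, not the bi-directed graph the slide describes.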

Basic Assumptions

•  Equal representation of all positions of the target
  –  A repeat sequence must be identified for special handling to minimize error
  –  A repeat can be identified using coverage
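The coverage idea can be sketched as follows, assuming the read depth of each assembled segment is already known and that most segments are unique, so the median depth approximates single-copy coverage. The function name `flag_repeats`, the 1.5x threshold, and the depth values are hypothetical choices for illustration, not the statistic used in the talk.

```python
from statistics import median

def flag_repeats(coverage, ratio=1.5):
    """Return {segment: estimated copy number} for segments that look repeated.

    coverage maps a segment name to its mean read depth. A segment is flagged
    when its depth exceeds ratio times the median depth (assumed threshold).
    """
    baseline = median(coverage.values())
    return {
        name: round(depth / baseline)
        for name, depth in coverage.items()
        if depth > ratio * baseline
    }

# Made-up depths: three unique segments near 20x and two collapsed repeats.
coverage = {"c1": 19.0, "c2": 21.0, "c3": 20.0, "c4": 41.0, "c5": 62.0}
print(flag_repeats(coverage))
```

The estimate only holds under the equal-representation assumption stated above. If positions of the target are sampled unevenly, high depth no longer implies a repeat.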

Formulation

•  NP-hard to find correct reconstruction
  –  Repeats introduce exponential number of paths
•  NP-hard to choose orientation for nodes
  –  Any cycle with an odd number of reverse-edges is un-resolvable
  –  Maximal Bipartite Subgraph: between forward and reverse nodes
•  NP-hard to assign a position to the nodes
  –  Optimal Linear Arrangement

[Figure labels: (10,000 bp ± 1,000 bp), (5,000 bp ± 500 bp)]
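The odd-cycle condition in the orientation bullet can be illustrated with a small sketch. Checking whether a consistent orientation exists is easy: a breadth-first traversal assigns each node forward or reverse and reports a conflict when a cycle contains an odd number of reverse-edges. The hard part is choosing which edges to discard once a conflict exists, and this sketch does not attempt that. The function name `orient` and the edge encoding `(u, v, flip)` are assumptions made for illustration.

```python
from collections import deque

def orient(nodes, edges):
    """Assign forward (True) or reverse (False) to every node, or return None.

    edges is a list of (u, v, flip); flip=True is a reverse-edge, meaning u and
    v must have opposite orientations. None is returned when some cycle has an
    odd number of reverse-edges, so no consistent assignment exists.
    """
    adj = {n: [] for n in nodes}
    for u, v, flip in edges:
        adj[u].append((v, flip))
        adj[v].append((u, flip))

    orientation = {}
    for start in nodes:
        if start in orientation:
            continue
        orientation[start] = True  # arbitrary choice for each component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, flip in adj[u]:
                expected = orientation[u] != flip  # flip toggles the orientation
                if v not in orientation:
                    orientation[v] = expected
                    queue.append(v)
                elif orientation[v] != expected:
                    return None
    return orientation

nodes = ["a", "b", "c"]
even_cycle = [("a", "b", True), ("b", "c", True), ("a", "c", False)]
odd_cycle = [("a", "b", True), ("b", "c", False), ("a", "c", False)]
print(orient(nodes, even_cycle))  # a consistent assignment exists
print(orient(nodes, odd_cycle))   # None: one reverse-edge in the cycle
```

With two reverse-edges in the triangle the orientations agree around the cycle. With one reverse-edge they cannot, so at least one edge would have to be dropped.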

Mycoplasma genitalium, 600 Kbp

Assembly Application Example

[Figure: comparison of 11 genome sequences of about 1.9 Mbp each; position-axis tick labels omitted.]

Sequence labels (truncated in the source): >gi|240248234|emb|AJ74, >gi|89143280|emb|AM233, >gi|110319990|emb|AM28, >gi|115128880|gb|CP000, >gi|118422521|gb|CP000, >gi|134048946|gb|CP000, >gi|156251972|gb|CP000, >gi|187711822|gb|CP000, >gi|282158286|gb|CP001, >gi|377828067|gb|CP003, >gi|377826522|gb|CP003

Sequence lengths: 1892 kb, 1895 kb, 1892 kb, 1895 kb, 1910 kb, 1898 kb, 1890 kb, 1893 kb, 1892 kb, 1968 kb, 1892 kb

M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species. Treangen TJ, Messeguer X (2006) BMC Bioinformatics

Challenges
• Newly available long inaccurate sequences
  – Finding shared sequence is an N² algorithm
  – Use exact seeds and extend, but high error rates require very small seeds
  – Consequently, no method to use these sequences
• Assemble metagenomic and single-cell datasets
  – Assumption of uniform coverage is violated, requiring novel methods to identify repeats
  – Assumption of simple paths with no forks is violated by differences between closely related organisms
• Automated, accessible, and reproducible results
  – Current methods require manual parameter tuning and are error-prone
  – Easy-to-use visual representations of assembly graph structure
    • NP-hard formulation
      – Heuristic for layout in linear time works well in practice
      – Heuristic for orientation (known 2-approximate)
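The seed-size problem in the first challenge can be made concrete with a small calculation. The sketch below assumes independent, uniformly distributed errors, and the 15% error rate is an illustrative assumption rather than a figure from the slides; the function names are hypothetical.

```python
def error_free_probability(k, error_rate):
    """Probability that k consecutive bases all escape sequencing error,
    assuming independent errors at a fixed per-base rate (an assumption)."""
    return (1.0 - error_rate) ** k


def shared_seed_probability(k, error_rate):
    """Probability that the same k-mer is error-free in two independent reads,
    which is what an exact seed match between the reads requires."""
    return error_free_probability(k, error_rate) ** 2


if __name__ == "__main__":
    # Compare a low-error read set with a high-error one (rates are illustrative).
    for rate in (0.01, 0.15):
        for k in (10, 14, 20, 50):
            p = shared_seed_probability(k, rate)
            print(f"error={rate:.2f} k={k:2d} p(shared exact seed)={p:.4f}")
```

Under these assumptions an exact 14-mer seed survives in both reads about three times in four at a 1% error rate, but only about once in a hundred at 15%, which is why only very small seeds remain usable.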

Long Sequences Simplify the Graph

Assembly complexity of prokaryotic genomes using short reads. Kingsford C, Schatz MC, Pop M (2010) BMC Bioinformatics.

k  =  50   k  =  1,000   k  =  5,000  
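One way to see why a larger k simplifies the graph is to count how many k-mers occur more than once: once k exceeds the repeat length, every k-mer is unique and the ambiguity disappears. The toy sequence below is hand-built for illustration (a 7 bp repeat between distinct flanks) and is not data from the talk.

```python
from collections import Counter


def repeated_kmers(sequence, k):
    """Count distinct k-mers that occur more than once in the sequence."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return sum(1 for c in counts.values() if c > 1)


# Toy genome: the 7 bp repeat GATTACA appears twice with different flanks.
REPEAT = "GATTACA"
GENOME = "CCGT" + REPEAT + "TTGC" + REPEAT + "AAGG"

if __name__ == "__main__":
    for k in (4, 7, 8):
        print(k, repeated_kmers(GENOME, k))
```

With k = 7 the repeat itself is the only duplicated k-mer, and with k = 8 no k-mer is duplicated, so a graph built at that k has no repeat-induced fork for this sequence.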


Hybrid error correction and de novo assembly of single-molecule sequencing reads. Koren, S, et al. (2012) Nature Biotech In Review.

The  Challenge  of  Long  Reads  


Correction results of 20x PacBio coverage of E. coli K12 corrected using 50x Illumina

[Figure: histograms of % reads before and after correction. Panels: PacBio Pre Correction Read Length (bp), Pre Correction Coverage (%), Pre Correction Identity (%), PacBio Post Correction Read Length (bp), Post Correction Coverage (%), Post Correction Identity (%).]

Correction Results

One Chromosome, One Contig
• USDA collaboration, Bibersteinia trehalosi 192
  – 2.4 Mbp genome, 5.5 Kbp max repeat

Assembler        S. aureus               R. sphaeroides          Previously Possible  Recipe #1         Recipe #2
Data             45x Illumina, 45x 3Kbp  45x Illumina, 45x 3Kbp  50x 454, 50x 8Kbp    20x PBcR (>6Kbp)  50x PBcR, 10x 8Kbp
Total Scaffolds  12                      34                      1                    1                 1
Total Contigs    60                      204                     25                   1                 1
N50 Contig       96,740                  42,455                  190,476              2,419,630         2,415,188
Max Contig       234,488                 106,467                 347,035              2,419,630         2,415,188
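For readers unfamiliar with the N50 Contig row, the statistic can be computed as sketched below. This uses the common definition (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the reference size); the function name and the optional `genome_size` argument are illustrative choices, not taken from the talk.

```python
def n50(lengths, genome_size=None):
    """Return the N50 of a list of contig lengths.

    If genome_size is given, half of that value is the target; otherwise
    half of the summed contig lengths is used. Returns 0 when the contigs
    never reach the target.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return 0


if __name__ == "__main__":
    # A single contig spanning the whole chromosome has N50 equal to its length.
    print(n50([2419630]))
    print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```

When one contig covers the chromosome, N50 and maximum contig length coincide, which is the pattern in the last two columns of the table.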

GAGE: A critical evaluation of genome assemblies and assembly algorithms. Salzberg SL, Phillippy AM, Zimin AV, Puiu D, Magoc T, Koren S, Treangen TJ, Schatz MC, Delcher AL, Roberts M, Marcais G, Pop M, Yorke JA (2011). Genome Research.

Assembling  A  Vertebrate  

[Figure: % Number of CDS versus % CDS Coverage for PBcR, 454, and Illumina. Panels: a) Mapped to contig, b) Mapped to scaffold.]

• Accurate correction for vertebrates
  – >99.5% read correction accuracy
  – Good quality of assembly

On To Metagenomics
• Newly available long inaccurate sequences
  – Finding shared sequence is an N² algorithm
  – Use exact seeds and extend, but high error rates require very small seeds
  – Consequently, no method to use these sequences
• Assemble metagenomic and single-cell datasets
  – Assumption of uniform coverage is violated, requiring novel methods to identify repeats
  – Assumption of simple paths with no forks is violated by differences between closely related organisms
• Automated, accessible, and reproducible results
  – Current methods require manual parameter tuning and are error-prone
  – Easy-to-use visual representations of assembly graph structure
    • NP-hard formulation
      – Heuristic for layout in linear time works well in practice
      – Heuristic for orientation (known 2-approximate)

Previous  Approaches  

Coverage observed across large unitigs

Higher coverage indicates a repeat

• Manually increase threshold for what is a repeat

Assembler                          Number Contigs  Number Scaffolds  Largest Scaffold (Kbp)  Contigs in mixed Scaffolds
Celera Assembler                   18,577          1,600             29                      0.44%
  with metagenomics settings       8,128           422               864                     17.79%
Bambus initial                     8,128           328               972                     33.77%
  with Secondary Signal detection  8,128           291               430                     0.98%

• Leads to assembly errors

The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. Rusch DB, et al. (2007) PLoS Biology.

• Manually increase threshold for what is shared sequence

The  Challenge  of  Uneven  Coverage  

• Repeat affects global structure of the graph
• Detect using number of shortest paths it is on

[Figure: example graph with numbered nodes.]
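Counting the shortest paths that pass through a node is the idea behind betweenness centrality. The sketch below is plain (not k-bounded) betweenness on an unweighted, undirected graph using Brandes' accumulation scheme; it is an illustration of the concept, not the parallel implementation used in the thesis, and the toy graph and node names are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Betweenness centrality for an unweighted, undirected graph.

    adj maps each node to an iterable of its neighbours.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}      # BFS distance from s (-1 = unseen)
        preds = {v: [] for v in adj}     # predecessors on shortest paths
        sigma[s], dist[s] = 1, 0
        order, queue = [], deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back to s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: c / 2.0 for v, c in score.items()}


# Toy graph: node "R" plays the role of a repeat joining four unique regions.
TOY = {"R": ["a1", "a2", "b1", "b2"],
       "a1": ["R"], "a2": ["R"], "b1": ["R"], "b2": ["R"]}

if __name__ == "__main__":
    print(betweenness(TOY))
```

In the toy graph every shortest path between two unique regions runs through R, so R gets the highest score while the unique nodes score zero; flagging such outliers needs no coverage information.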

Use Local Coverage Metrics
• Split graph into strongly connected components
  – Compute A-stat for each component
  – $P(\text{read starting at a base})$ = arrival rate = $\alpha = \dfrac{\text{number of reads}}{\text{length of contigs}}$
  – $P(k \text{ starts in } \rho \text{ bases})$ is Binomial
    • We approximate using Poisson with $\lambda = np = \alpha\rho$, giving us $\dfrac{(\alpha\rho)^k e^{-\alpha\rho}}{k!}$
  – For a 2-copy repeat, the arrival rate is doubled, giving us $\dfrac{(2\alpha\rho)^k e^{-2\alpha\rho}}{k!}$
  – $\text{A-stat} = \log\dfrac{P(\text{unique})}{P(\text{2-copy})} = (\log e)\,\alpha\rho - (\log 2)\,k$
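The A-stat formula can be checked numerically against the two Poisson probabilities it is derived from. The sketch below uses natural logarithms, so (log e) is 1; the function names and the example numbers are illustrative assumptions, not values from the talk.

```python
import math


def a_stat(k, rho, alpha):
    """A-stat for a sequence of rho bases containing k read starts,
    with global arrival rate alpha (reads per base), in natural-log units."""
    return alpha * rho - k * math.log(2)


def poisson_log_pmf(k, lam):
    """Natural log of the Poisson probability of k events at rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)


if __name__ == "__main__":
    alpha, rho = 0.1, 1000          # expect 100 read starts if unique
    for k in (100, 200):            # observed read starts
        direct = poisson_log_pmf(k, alpha * rho) - poisson_log_pmf(k, 2 * alpha * rho)
        print(k, round(a_stat(k, rho, alpha), 3), round(direct, 3))
```

A positive value favors the unique hypothesis and a negative value favors the 2-copy hypothesis; with these numbers, observing the expected 100 starts gives a positive score and observing 200 gives a negative one.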

Repeat Detection is Improved

Bambus 2: Scaffolding Metagenomes. Koren S, Treangen TJ, Pop M (2011) Bioinformatics.

Scaling Using Parallel Implementation

The Challenge of Variations

[Figure: alignment of two sequences, with matching columns marked.]

Bambus 2: Scaffolding Metagenomes. Koren S, Treangen TJ, Pop M (2011) Bioinformatics.

Identify Biological Variants

The  Challenge  of  Reproducibility  

metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Treangen TJ*, Koren S*, Sommer D, Astrovskaya I, Liu B, Darling AE, and Pop M. Genome Biology In Prep.

[Figure: metAMOS pipeline diagram. Legend: Start, Existing, Novel.]

Pipeline stages: Preprocess, Assemble, MapReads, FindORFS, FindRepeats, Annotate, Scaffold, Propagate, Classify, Postprocess/Results

Tools shown in the diagram:
• FastQC, fastx_toolkit
• CA (Miller et al 2008), Meta-IDBA (Peng et al 2011), Minimus (Sommer et al 2008), Newbler, SOAPdenovo (Li et al 2010), Velvet (Zerbino et al 2008), Velvet-SC (Chitsaz et al 2011)
• Bowtie (Langmead et al 2009)
• MetaGeneMark (Zhu et al 2010), FragGeneScan (Rho et al 2010), Glimmer-MG (Kelley et al 2012)
• BLAST, MetaPhyler (Liu et al 2011), PHMMER (Eddy 2011), PhyloSift (In prep), PhymmBL (Brady et al 2011), NB/RITA (McDonald et al 2011)
• Bambus 2 (Koren et al 2011)
• Krona (Ondov et al 2011)

Propagate: propagate classifications via assembly graph and make final classifications of assembled contigs/scaffolds

Traditional Assembly

Table 4.3: Assembly contiguity on two NGS prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2 872 915) and Rhodobacter sphaeroides (genome size 4 603 060). For all assemblies, N50 values are based on the same genome size. The Errors column contains the number of misjoins plus indel errors > 5bp for contigs, and the total number of misjoins for scaffolds. Corrected N50 values were computed after correcting contigs and scaffolds by breaking them at each error. See the GAGE publication [146] for details on how errors were identified.

                             Contigs                                   Scaffolds
Genome          Assembler    Num    N50 (kb)  Errors  N50 corr. (kb)   Num    N50 (kb)  Errors  N50 corr. (kb)
S. aureus       ABySS        301    29.2      14      24.8             246    34        1       28
                Allpaths-LG  60     96.7      16      66.2             12     1 092     0       1 092
                Bambus 2     164    29.5      15      26.0             16     1 089     0       1 089
                CABOG        Could not run: incompatible read lengths in one library.
                MSR-CA       94     59.2      22      48.2             17     2 412     3       1 022
                SGA          252    4.0       6       4.0              456    208       1       208
                SOAPdenovo   107    288.2     48      62.7             99     332       8       284
                Velvet       162    48.4      28      41.5             45     762       17      162
R. sphaeroides  ABySS        1915   5.9       55      4.2              1 701  9         3       5
                Allpaths-LG  204    42.5      43      34.4             34     3 192     0       3 092
                Bambus 2     376    21.0      25      19.5             92     2 478     2       2 478
                CABOG        322    20.2      34      17.9             130    66        5       55
                MSR-CA       395    22.1      42      19.1             43     2 976     5       2 966
                SGA          3 067  4.5       8       2.9              2 096  51        0       51
                SOAPdenovo   204    131.7     414     14.3             166    660       3       658
                Velvet       583    15.7      35      14.5             178    353       6       270

Table 4.4: Assembly correctness on two NGS prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2 872 915) and Rhodobacter sphaeroides (genome size 4 603 060). See the GAGE publication [146] for details on how errors were identified.

                                   Indels          Contigs                Scaffolds
Genome          Assembler    SNPs  ≤ 5bp  > 5bp    Misjoins  Inv  Reloc   Misjoins  Inv  Reloc
S. aureus       ABySS        258   20     9        5         3    2       1         1    0
                Allpaths-LG  79    4      12       4         0    4       0         0    0
                Bambus 2     39    12     11       4         1    3       0         0    0
                MSR-CA       191   23     10       12        6    6       3         3    0
                SGA          32    2      2        4         1    3       0         0    0
                SOAPdenovo   246   25     31       17        1    16      8         1    7
                Velvet       217   6      14       14        5    9       17        5    12
R. sphaeroides  ABySS        692   208    34       21        2    19      3         0    3
                Allpaths-LG  218   150    37       6         0    6       0         0    0
                Bambus 2     193   136    23       2         1    1       2         0    2
                CABOG        536   145    24       10        1    9       5         4    1
                MSR-CA       807   179    32       10        1    9       5         2    3
                SGA          336   116    4        4         0    4       0         0    0
                SOAPdenovo   527   155    406      8         0    8       3         1    2
                Velvet       413   148    27       8         0    8       6         6    7

We ran Bambus 2 to scaffold unitigs from CA-met and Minimus [164]. As seen in Figure 4.6, for all genomes, Bambus 2 outperforms CA. For all but one genome,

GAGE: A critical evaluation of genome assemblies and assembly algorithms. Salzberg SL, Phillippy AM, Zimin AV, Puiu D, Magoc T, Koren S, Treangen TJ, Schatz MC, Delcher AL, Roberts M, Marcais G, Pop M, Yorke JA (2011). Genome Research.

Metagenomic  Assembly  

[Figure: bar chart of the number of genomes >90% covered (y-axis 0 to 25) for SOAP.utg, metAMOS (SOAP.utg), SOAP, Meta-IDBA, velvet, CA, and CAdeg on the Mock Even and Mock Staggered datasets.]

Metagenomic  Error  Reduced  

[Figure: bar chart of error rate in errors per MB (total errors / total contig length; y-axis 0 to 30) for SOAP.utg, metAMOS (SOAP.utg), SOAP, Meta-IDBA, velvet, CA, and CAdeg on the Mock Even and Mock Staggered datasets.]

Metagenomic  Assembly  

[Figure: recruited reference genome coverage (% aligned contigs, binned from 10% to >70%) for soap.utg, metAMOS soap.utg, Meta-IDBA, and metAMOS Meta-IDBA, with a >10X coverage group marked. Reference genomes: N. meningitidis, S. gallolyticus, S. agalactiae, S. suis, A. pleuropneumoniae, A. succinogenes, M. succiniciproducens, S. gordonii, P. multocida, S. pneumoniae, S. sanguinis, C. concisus, A. aphrophilus, S. mitis, R. mucilaginosa, S. thermophilus, V. parvula, A. actinomycetemcomitans, A. parvulum, H. ducreyi. A second panel shows errors per MB (total errors / total length of aligned contigs; axis 0 to 600) for the same four assemblies.]

Table 5.3: Assembly of the HMP tongue dorsum dataset with metAMOS. Stats comparing two assemblies generated within metAMOS (using Meta-IDBA and SOAPdenovo, MetaGeneMark and Bambus 2) of tongue dorsum female sample (HMP, SRS077736). Unitigs indicate initial output of each assembler. Contigs are reported after scaffolding by Bambus 2 (splitting scaffolds at Ns).

                          Unitigs            Contigs            Scaffolds
Assembler    Total BP     #        Max       #        Max       #        Max
Meta-IDBA    119 075 843  678 034  220 488   673 291  220 488   644 997  443 823
SOAPdenovo   101 769 360  451 765  116 181   292 706  238 051   287 108  238 051

5.3.5 HMP tongue dorsum

Our second analysis was performed on real data (HMP tongue dorsum female sample). For this sample, the true and complete composition of the community is unknown; instead we constructed a reference genome set from the genomes identified by the HMP to have high similarity to the sequences within the sample. This dataset was previously assembled with Meta-IDBA and the published results demonstrated that Meta-IDBA was able to generate larger contigs than SOAPdenovo [117]. We used both SOAPdenovo and Meta-IDBA as starting points for the metAMOS pipeline. The results in Table 5.3 show that while Meta-IDBA produces a significantly larger maximum unitig (doubling that obtained by SOAP.utg), the resulting contigs and scaffolds are much closer in length. Focusing on producing larger unitigs in an initial assembly leads to higher error rates (Table 5.1) while metAMOS produces accurate unitigs and contiguous contigs/scaffolds. Figure 5.10 shows the Krona [114] plot for the sample, which is automatically generated by metAMOS. The figure gives an overview of the taxonomic composition of a dataset and allows interactive navigation to explore specific branches of the taxonomy.

To evaluate the correctness of these assemblies, we aligned them against our set of reference genomes. In Figure 5.11 we show the percentage of each reference genome covered by correctly assembled contigs. While both assemblers (SOAPdenovo and Meta-IDBA) vary in their ability to reconstruct individual genomes,

Single-Cell Assembly

Table 4.6: Assembly contiguity on two single-cell prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2 872 915) and Rhodobacter sphaeroides (genome size 4 603 060). For all assemblies, N50 values are based on the same genome size. The Errors column contains the number of misjoins plus indel errors > 5bp for contigs, and the total number of misjoins for scaffolds. Corrected N50 values were computed after correcting contigs and scaffolds by breaking them at each error. See the GAGE publication [146] for details on how errors were identified.

                                             Contigs                                 Scaffolds
Genome            Assembler                  Num  N50 (kb)  Errors  N50 corr. (kb)   Num  N50 (kb)  Errors  N50 corr. (kb)
E. coli Lane 6    Euler+Velvet-SC [18]       501  32.0      -       -                -    -         -       -
                  Velvet-SC                  220  56.5      23      52.1             -    -         -       -
                  Velvet-SC+Bambus 2         204  60.7      24      54.4             193  65.0      0       59.6
S. aureus Lane 7  Euler+Velvet-SC [18]       355  32.3      -       -                -    -         -       -
                  Velvet-SC                  175  37.5      19      34.9             -    -         -       -
                  Velvet-SC+Bambus 2         141  45.8      19      42.7             136  48.4      4       40.9

The table shows that Bambus 2 is able to generate longer contigs than those from Velvet-SC alone. The assemblies show a 4.4% and 22% corrected N50 gain while introducing one and zero errors in E. coli K12 and S. aureus USA300, respectively. While the gain is small, the datasets only have short Illumina paired-ends (260bp) and Bambus 2 is still able to improve on the state of the art. Based on the assembly of S. aureus from Salzberg et al. [146], Bambus 2 contiguity increases 22-fold when long (3Kbp) Illumina mate-pairs are available (from 50Kbp to 1 082Kbp scaffold corrected N50). Therefore, we expect Bambus 2 can significantly improve single-cell assemblies.

4.5 Discussion

The repeat detection procedures used in Bambus 2 are sensitive without sacrificing specificity, and have been applied to the assembly of single genomes. The scaffolds generated by Bambus 2 cover a large percentage of the genomes in the

Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Chitsaz H, Yee-Greenbaum JL, Tesler G, Lombardo MJ, Dupont CL, Badger JH, Novotny M, Rusch DB, Fraser LJ, Gormley NA, Schulz-Trieglaff O, Smith JP, Evers DJ, Pevzner PA, Lasken RL (2011). Nat Biotech.

Enable  Novel  Biology  

• Automatically finish bacterial genomes
  – Enable comparative analysis on a scale not previously possible
• Analyze metagenomic datasets and more accurately identify functional potential
  – Accurately represent closely-related organisms in a sample
• Allow reproducible and consistent analysis
• Independent of technology

Conclusion
• Demonstrated on multiple domains, outperforming domain-specific tools
• Applicable to other areas of computer science
  – Correction of high-error strings using a consensus approach
  – OpenMP implementation of k-betweenness vertex centrality with automated outlier detection
  – Linear-time heuristic linear layout algorithm
• Future Work
  – Applying long-reads to metagenomic and single-cell assembly
  – Automatically scaling # threads for repeat detection
  – Confidence assignments to biological variants
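As a rough illustration of correcting high-error strings by consensus, the sketch below takes a per-column majority vote over strings that are assumed to be already aligned and of equal length. This is a deliberate simplification for illustration; it is not the correction pipeline described in the talk, and the function name and example reads are hypothetical.

```python
from collections import Counter


def majority_consensus(aligned_reads):
    """Return the per-column majority base of equal-length, pre-aligned strings."""
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("all reads must have the same aligned length")
    consensus = []
    for column in zip(*aligned_reads):
        # most_common(1) returns the single most frequent symbol in the column
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


if __name__ == "__main__":
    # Each read carries at most one substitution relative to ACGTAC.
    reads = ["ACGTAC", "ACCTAC", "ACGTAA", "TCGTAC"]
    print(majority_consensus(reads))
```

Because each error sits in a different column, every column still has a clear majority and the vote recovers the underlying string; real reads also need alignment and gap handling, which this sketch omits.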

Software

Publications

Journal Articles

[J1] Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J., Ganapathy, G., Wang, Z., Rasko, D. A., McCombie, W. R., Jarvis, E. D., and Phillippy, A. M. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nature Biotechnology In Review (2012).

[J2] Prüfer, K., Munch, K., Hellmann, I., Akagi, K., Miller, J. R., Walenz, B., Koren, S., Sutton, G., Kodira, C., Winer, R., Knight, J. R., Mullikin, J. C., Meader, S. J., Ponting, C. P., Lunter, G., Higashino, S., Hobolth, A., Dutheil, J., Karakoç, E., Alkan, C., Sajjadian, S., Catacchio, C. R., Ventura, M., Marques-Bonet, T., Eichler, E. E., André, C., Atencia, R., Mugisha, L., Patterson, N., Siebauer, M., Good, J. M., Fischer, A., Ptak, S. E., Lachmann, M., Symer, D. E., Mailund, T., Schierup, M. H., Andrés, A. M., Kelso, J., and Pääbo, S. The bonobo genome compared with the genomes of chimpanzee and human. Nature In Review (2012).

[J3] Treangen*, T. J., Koren, S.*, Sommer, D., Astrovskaya, I., Liu, B., Darling, A. E., and Pop, M. metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Genome Biology In Review (2012).

[J4] Earl, D. A., Bradnam, K., St. John, J., Darling, A., Lin, D., Faas, J., Yu, H. O. K., Vince, B., Zerbino, D. R., Diekhans, M., Nguyen, N., Nuwantha, P., Sung, A. W.-K., Ning, Z., Haimel, M., Simpson, J. T., Fronseca, N. A., Birol, İ., Docking, T. R., Ho, I. Y., Rokhsar, D. S., Chikhi, R., Lavenier, D., Chapuis, G., Naquin, D., Maillet, N., Schatz, M. C., Kelly, D. R., Phillippy, A. M., Koren, S., Yang, S.-P., Wu, W., Chou, W.-C., Srivastava, A., Shaw, T. I., Ruby, J. G., Skewes-Cox, P., Betegon, M., Dimon, M. T., Solovyev, V., Kosarev, P., Vorobyev, D., Ramirez-Gonzalez, R., Leggett, R., MacLean, D., Xia, F., Luo, R., L, Z., Xie, Y., Liu, B., Gnerre, S., MacCallum, I., Przybylski, D., Ribeiro, F. J., Yin, S., Sharpe, T., Hall, G., Kersey, P. J., Durbin, R., Jackman, S. D., Chapman, J. A., Huang, X., DeRisi, J. L., Caccamo, M., Li, Y., Jaffe, D. B., Green, R., Haussler, D., Korf, I., and Paten, B. Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Research (2011).

[J5] Koren, S., Treangen, T. J., and Pop, M. Bambus 2: scaffolding metagenomes. Bioinformatics 27, 21 (2011), 2964–2971.

[J6] Salzberg, S. L., Phillippy, A. M., Zimin, A. V., Puiu, D., Magoc, T., Koren, S., Treangen, T., Schatz, M. C., Delcher, A. L., Roberts, M., Marcais, G., Pop, M., and Yorke, J. A. GAGE: a critical evaluation of genome assemblies and assembly algorithms. Genome Research (2011).

[J7] Treangen, T. J., Sommer, D. D., Angly, F. E., Koren, S., and Pop, M. Next generation sequence assembly with AMOS. Current Protocols in Bioinformatics 33 (2011), 11.8.1–11.8.18.

[J8] Koren, S., Miller, J., Walenz, B., and Sutton, G. An algorithm for automated closure during assembly (highly accessed). BMC Bioinformatics 11, 1 (2010), 457.

[J9] Miller, J., Koren, S., and Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics (2010).

[J10] Rausch, T., Koren, S., Denisov, G., Weese, D., Emde, A., Doring, A., and Reinert, K. A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads. Bioinformatics 25, 9 (2009), 1118.

[J11] Miller, J., Delcher, A., Koren, S., Venter, E., Walenz, B., Brownley, A., Johnson, J., Li, K., Mobarry, C., and Sutton, G. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 24 (2008), 2818.


Celera Assembler http://wgs-assembler.sf.net

Publications

Journal Articles

[J1] Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J., Ganapathy, G., Wang, Z., Rasko, D. A., McCombie, W. R., Jarvis, E. D., and Phillippy, A. M. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nature Biotechnology In Review (2012).

[J2] Prüfer, K., Munch, K., Hellmann, I., Akagi, K., Miller, J. R., Walenz, B., Koren, S., Sutton, G., Kodira, C., Winer, R., Knight, J. R., Mullikin, J. C., Meader, S. J., Ponting, C. P., Lunter, G., Higashino, S., Hobolth, A., Dutheil, J., Karakoç, E., Alkan, C., Sajjadian, S., Catacchio, C. R., Ventura, M., Marques-Bonet, T., Eichler, E. E., André, C., Atencia, R., Mugisha, L., Patterson, N., Siebauer, M., Good, J. M., Fischer, A., Ptak, S. E., Lachmann, M., Symer, D. E., Mailund, T., Schierup, M. H., Andrés, A. M., Kelso, J., and Pääbo, S. The bonobo genome compared with the genomes of chimpanzee and human. Nature In Review (2012).

[J3] Treangen*, T. J., Koren*, S., Sommer, D., Astrovskaya, I., Liu, B., Darling, A. E., and Pop, M. metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Genome Biology In Review (2012).

[J4] Earl, D. A., Bradnam, K., St. John, J., Darling, A., Lin, D., Faas, J., Yu, H. O. K., Vince, B., Zerbino, D. R., Diekhans, M., Nguyen, N., Nuwantha, P., Sung, A. W.-K., Ning, Z., Haimel, M., Simpson, J. T., Fonseca, N. A., Birol, İ., Docking, T. R., Ho, I. Y., Rokhsar, D. S., Chikhi, R., Lavenier, D., Chapuis, G., Naquin, D., Maillet, N., Schatz, M. C., Kelly, D. R., Phillippy, A. M., Koren, S., Yang, S.-P., Wu, W., Chou, W.-C., Srivastava, A., Shaw, T. I., Ruby, J. G., Skewes-Cox, P., Betegon, M., Dimon, M. T., Solovyev, V., Kosarev, P., Vorobyev, D., Ramirez-Gonzalez, R., Leggett, R., MacLean, D., Xia, F., Luo, R., Li, Z., Xie, Y., Liu, B., Gnerre, S., MacCallum, I., Przybylski, D., Ribeiro, F. J., Yin, S., Sharpe, T., Hall, G., Kersey, P. J., Durbin, R., Jackman, S. D., Chapman, J. A., Huang, X., DeRisi, J. L., Caccamo, M., Li, Y., Jaffe, D. B., Green, R., Haussler, D., Korf, I., and Paten, B. Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Research (2011).

[J5] Koren, S., Treangen, T. J., and Pop, M. Bambus 2: scaffolding metagenomes. Bioinformatics 27, 21 (2011), 2964–2971.

[J6] Salzberg, S. L., Phillippy, A. M., Zimin, A. V., Puiu, D., Magoc, T., Koren, S., Treangen, T., Schatz, M. C., Delcher, A. L., Roberts, M., Marcais, G., Pop, M., and Yorke, J. A. GAGE: a critical evaluation of genome assemblies and assembly algorithms. Genome Research (2011).

[J7] Treangen, T. J., Sommer, D. D., Angly, F. E., Koren, S., and Pop, M. Next generation sequence assembly with AMOS. Current Protocols in Bioinformatics 33 (2011), 11.8.1–11.8.18.

[J8] Koren, S., Miller, J., Walenz, B., and Sutton, G. An algorithm for automated closure during assembly (highly accessed). BMC Bioinformatics 11, 1 (2010), 457.

[J9] Miller, J., Koren, S., and Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics (2010).

[J10] Rausch, T., Koren, S., Denisov, G., Weese, D., Emde, A., Doring, A., and Reinert, K. A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads. Bioinformatics 25, 9 (2009), 1118.

[J11] Miller, J., Delcher, A., Koren, S., Venter, E., Walenz, B., Brownley, A., Johnson, J., Li, K., Mobarry, C., and Sutton, G. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 24 (2008), 2818.


Bambus 2 / AMOS      http://www.cbcb.umd.edu/software/bambus/

metAMOS      https://github.com/treangen/metAMOS/wiki

http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA

SMRT Sequencing

[Figure: signal trace; axes labeled Time and Intensity]

http://www.pacificbiosciences.com/assets/files/pacbio_technology_backgrounder.pdf

Imaging of fluorescent, phospholinked labeled nucleotides as they are incorporated by a polymerase anchored to a zero-mode waveguide (ZMW).

Select Assembly Publications



Bonobo Sequencing Project

GAGE: Assembly Competition

Surveys

Acknowledgements

•  UMD Pop Lab: Mihai Pop, Lee Mendelowitz, Todd Treangen, Dan Sommer, Bo Liu, Henry Lin, Mohammadreza Ghodsi, Irina Astrovskaya, Christopher Hill, Chengxi Ye, Joseph Paulson
•  USDA: Tim Smith, Gregory Harhay, Dayna Harhay, Scott McVey
•  JCVI: Brian P. Walenz, Jason R. Miller, Granger Sutton
•  JGI: Jeffrey Martin, Zhong Wang
•  Duke University: Jason Howard, Ganeshkumar Ganapathy, Erich D. Jarvis
•  CSHL: W. Richard McCombie, Michael Schatz
•  UMD SOM: David A. Rasko
•  NBACC: Adam Phillippy, Brian Ondov
•  JHU: Steven L. Salzberg, Daniela Puiu, Tanja Magoc
•  UC Davis: Aaron E. Darling
