Genome Assembly: Novel Applications by Harnessing Emerging Sequencing Technologies and Graph Algorithms
Sergey Koren PhD Thesis Defense March 16, 2012
Sequencing Output
DNA Data Tsunami
"Will Computers Crash Genomics?" Elizabeth Pennisi (2011) Science. 331(6018): 666-668.
Current worldwide sequencing capacity exceeds 13 Pbp/year and is growing at 5x per year!
Biological Goals
• Disease outbreaks
  – V. cholerae in Haiti
  – B. anthracis in heroin users
• Learn what the cell is doing
  – DNA is transcribed into RNA, which is translated into proteins
• Studying whole communities (metagenomics)
  – Human symbiotic bacteria
  – Ocean bacterial population
• Studying the dark matter
  – Studying individual cells (single-cell)
What is Assembly
• Break target into pieces we can read
• Convert sequence to a graph
  – Requires identifying segments with shared origin
  – Sequences occurring multiple times (repeats) make this ambiguous
  – Repeat sequences must be spanned with sufficient unique sequence to be unambiguous
• Find simple paths in the graph
  – Sequences have a direction, so the graph is bi-directed
  – Require no forks, otherwise the graph is ambiguous
…AGCCTAGACCTACAGGATGCGCGACACGT GGATGCGCGACACGTCGCATATCCGGT… GGATGCGCGACACGTTAGCATAGCCTA… TTGCTC CCTACA
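The graph-building step can be illustrated with a minimal sketch that uses the three read fragments shown above. It is not the method from the talk: it assumes error-free reads, ignores reverse-complement orientation (so the graph is directed rather than bi-directed), and the names `longest_overlap`, `overlap_graph`, and the `min_len` threshold of 10 are hypothetical choices made for illustration.

```python
from itertools import permutations


def longest_overlap(a, b, min_len):
    """Length of the longest suffix of `a` equal to a prefix of `b`.

    Returns 0 when no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a[-k:] == b[:k]:
            return k
    return 0


def overlap_graph(reads, min_len=10):
    """Map each read name to {successor name: overlap length}."""
    graph = {name: {} for name in reads}
    for x, y in permutations(reads, 2):
        k = longest_overlap(reads[x], reads[y], min_len)
        if k:
            graph[x][y] = k
    return graph


# Read fragments copied from the slide (elided flanks dropped).
reads = {
    "r1": "AGCCTAGACCTACAGGATGCGCGACACGT",
    "r2": "GGATGCGCGACACGTCGCATATCCGGT",
    "r3": "GGATGCGCGACACGTTAGCATAGCCTA",
}

graph = overlap_graph(reads)
for name, successors in graph.items():
    print(name, "->", successors)

# A node with more than one successor is a fork: the path is ambiguous.
forks = [name for name, successors in graph.items() if len(successors) > 1]
print("forks:", forks)  # forks: ['r1']
```

Read r1 overlaps both r2 and r3 by the same 15 bases, so the graph forks at r1. This is the ambiguity a repeat introduces: the shared segment alone cannot say which continuation is correct.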
Basic Assumptions
• Equal representation of all positions of the target
  – A repeat sequence must be identified for special handling to minimize error
  – A repeat can be identified using coverage
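As a rough illustration of identifying repeats by coverage, the sketch below flags any segment whose read depth is well above the typical depth. It relies on the equal-representation assumption stated above. The function name `flag_repeats`, the use of the median as the "typical" depth, the 1.5x cutoff, and the depth values are all assumptions made for this example, not values from the talk.

```python
from statistics import median


def flag_repeats(coverage, ratio=1.5):
    """Return the names of segments whose depth exceeds ratio * median depth.

    Under uniform sampling, a segment present c times in the target is
    expected to collect roughly c times the typical depth.
    """
    typical = median(coverage.values())
    return {name for name, depth in coverage.items() if depth > ratio * typical}


# Hypothetical per-segment read depths.
coverage = {"c1": 30, "c2": 28, "c3": 61, "c4": 33, "c5": 95}
repeats = flag_repeats(coverage)
print(sorted(repeats))  # ['c3', 'c5']
```

The median here is 33, so the cutoff is 49.5 and only the segments at roughly two and three times the typical depth are flagged. When coverage is not uniform, a fixed ratio like this would misclassify segments.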
Formulation
• NP-hard to find the correct reconstruction
  – Repeats introduce an exponential number of paths
• NP-hard to choose orientation for nodes
  – Any cycle with an odd number of reverse-edges is unresolvable
  – Maximal Bipartite Subgraph between forward and reverse nodes
• NP-hard to assign a position to the nodes
  – Optimal Linear Arrangement
(10,000bp ± 1,000bp)
(5,000bp ± 500bp)
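The odd-cycle condition in the orientation bullet can be checked with a simple parity traversal, sketched below. This only detects whether a consistent orientation exists and reports the edges that contradict a greedy assignment. It does not solve the NP-hard problem of choosing the smallest set of edges to discard. The function name `orient` and the `(u, v, flip)` edge encoding are hypothetical, and every edge endpoint is assumed to appear in `nodes`.

```python
from collections import deque


def orient(nodes, edges):
    """Greedily assign an orientation (0 = forward, 1 = reverse) to each node.

    `edges` holds (u, v, flip) triples; flip=True is a reverse-edge, meaning
    u and v must receive opposite orientations. Returns the assignment and
    the list of edges it violates. The list is empty exactly when no cycle
    contains an odd number of reverse-edges.
    """
    adjacency = {n: [] for n in nodes}
    for u, v, flip in edges:
        adjacency[u].append((v, flip))
        adjacency[v].append((u, flip))

    orientation = {}
    for start in nodes:
        if start in orientation:
            continue
        orientation[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, flip in adjacency[u]:
                if v not in orientation:
                    orientation[v] = orientation[u] ^ int(flip)
                    queue.append(v)

    conflicts = [
        (u, v) for u, v, flip in edges
        if orientation[u] ^ orientation[v] != int(flip)
    ]
    return orientation, conflicts


nodes = ["A", "B", "C"]
# One reverse-edge on a 3-cycle: odd, so no consistent orientation exists.
odd_cycle = [("A", "B", True), ("B", "C", False), ("C", "A", False)]
# Two reverse-edges on the same cycle: even, so orientation succeeds.
even_cycle = [("A", "B", True), ("B", "C", True), ("C", "A", False)]

_, odd_conflicts = orient(nodes, odd_cycle)
even_orientation, even_conflicts = orient(nodes, even_cycle)
print(odd_conflicts)   # [('B', 'C')]
print(even_conflicts)  # []
```

Which edge is reported for the odd cycle depends on traversal order. Any one of the three edges could be dropped to make the cycle consistent, which hints at why choosing the best set to drop is hard.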
Assembly ApplicaHon Example
11
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
3kb11kb18kb25kb33kb40kb48kb55kb62kb70kb77kb84kb92kb99kb107kb114kb121kb129kb136kb144kb151kb158kb166kb173kb180kb188kb195kb203kb210kb217kb225kb232kb240kb247kb254kb262kb269kb276kb284kb291kb299kb306kb313kb321kb328kb336kb343kb350kb358kb365kb372kb380kb387kb395kb402kb409kb417kb424kb432kb439kb446kb454kb461kb469kb476kb483kb491kb498kb505kb513kb520kb528kb535kb542kb550kb557kb565kb572kb579kb587kb594kb601kb609kb616kb624kb631kb638kb646kb653kb661kb668kb675kb683kb690kb697kb705kb712kb720kb727kb734kb742kb749kb757kb764kb771kb779kb786kb793kb801kb808kb816kb823kb830kb838kb845kb853kb860kb867kb875kb882kb890kb897kb904kb912kb919kb926kb934kb941kb949kb956kb963kb971kb978kb986kb993kb1000kb1008kb1015kb1022kb1030kb1037kb1045kb1052kb1059kb1067kb1074kb1082kb1089kb1096kb1104kb1111kb1118kb1126kb1133kb1141kb1148kb1155kb1163kb1170kb1178kb1185kb1192kb1200kb1207kb1214kb1222kb1229kb1237kb1244kb1251kb1259kb1266kb1274kb1281kb1288kb1296kb1303kb1311kb1318kb1325kb1333kb1340kb1347kb1355kb1362kb1370kb1377kb1384kb1392kb1399kb1407kb1414kb1421kb1429kb1436kb1443kb1451kb1458kb1466kb1473kb1480kb1488kb1495kb1503kb1510kb1517kb1525kb1532kb1539kb1547kb1554kb1562kb1569kb1576kb1584kb1591kb1599kb1606kb1613kb1621kb1628kb1635kb1643kb1650kb1658kb1665kb1672kb1680kb1687kb1695kb1702kb1709kb1717kb1724kb1732kb1739kb1746kb1754kb1761kb1768kb1776kb1783kb1791kb1798kb1805kb1813kb1820kb1828kb1835kb1842kb1850kb1857kb1864kb1872kb1879kb1887kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
7kb22kb36kb51kb66kb81kb96kb110kb125kb140kb155kb169kb184kb199kb214kb228kb243kb258kb273kb288kb302kb317kb332kb347kb361kb376kb391kb406kb421kb435kb450kb465kb480kb494kb509kb524kb539kb553kb568kb583kb598kb613kb627kb642kb657kb672kb686kb701kb716kb731kb745kb760kb775kb790kb805kb819kb834kb849kb864kb878kb893kb908kb923kb938kb952kb967kb982kb997kb1011kb1026kb1041kb1056kb1070kb1085kb1100kb1115kb1130kb1144kb1159kb1174kb1189kb1203kb1218kb1233kb1248kb1263kb1277kb1292kb1307kb1322kb1336kb1351kb1366kb1381kb1395kb1410kb1425kb1440kb1455kb1469kb1484kb1499kb1514kb1528kb1543kb1558kb1573kb1587kb1602kb1617kb1632kb1647kb1661kb1676kb1691kb1706kb1720kb1735kb1750kb1765kb1780kb1794kb1809kb1824kb1839kb1853kb1868kb1883kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
14kb44kb73kb103kb132kb162kb192kb221kb251kb280kb310kb339kb369kb398kb428kb457kb487kb517kb546kb576kb605kb635kb664kb694kb723kb753kb782kb812kb842kb871kb901kb930kb960kb989kb1019kb1048kb1078kb1107kb1137kb1166kb1196kb1226kb1255kb1285kb1314kb1344kb1373kb1403kb1432kb1462kb1491kb1521kb1551kb1580kb1610kb1639kb1669kb1698kb1728kb1757kb1787kb1816kb1846kb1876kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
29kb88kb147kb206kb265kb324kb384kb443kb502kb561kb620kb679kb738kb797kb856kb915kb974kb1034kb1093kb1152kb1211kb1270kb1329kb1388kb1447kb1506kb1565kb1624kb1684kb1743kb1802kb1861kb
[Figure: multiple genome comparison of 11 genomes; position-axis tick labels omitted. Sequence labels (truncated in the source) and genome lengths follow.]
>gi|240248234|emb|AJ74
>gi|89143280|emb|AM233
>gi|110319990|emb|AM28
>gi|115128880|gb|CP000
>gi|118422521|gb|CP000
>gi|134048946|gb|CP000
>gi|156251972|gb|CP000
>gi|187711822|gb|CP000
>gi|282158286|gb|CP001
>gi|377828067|gb|CP003
>gi|377826522|gb|CP003
1892 kb
1895 kb
1892 kb
1895 kb
1910 kb
1898 kb
1890 kb
1893 kb
1892 kb
1968 kb
1892 kb
M-GCAT: interactively and efficiently constructing large-scale multiple genome comparison frameworks in closely related species. Treangen TJ, Messeguer X (2006) BMC Bioinformatics
Challenges
• Newly available long inaccurate sequences
  – Finding shared sequence is an N^2 algorithm
  – Use exact seeds and extend, but high error rates require very small seeds
  – Consequently, no method to use these sequences
• Assemble metagenomic and single cell datasets
  – Assumption of uniform coverage is violated, requiring novel methods to identify repeats
  – Assumption of simple paths with no forks is violated by differences between closely related organisms
• Automated, Accessible, and Reproducible Results
  – Current methods require manual parameter tuning and are error-prone
  – Easy-to-use visual representations of assembly graph structure
• NP-hard formulation
  – Heuristic for layout in linear time works well in practice
  – Heuristic for orientation (known 2-approximate)
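The seed-and-extend point in the first bullet can be illustrated with a minimal Python sketch. It is not the method used in the talk: the function names, the toy reads, and the seed lengths are all assumptions chosen for illustration. It indexes exact k-mers so that only reads sharing a seed are compared, and shows how a single error breaks a long seed while a short seed survives.

```python
from collections import defaultdict
from itertools import combinations

def kmer_index(reads, k):
    """Map each k-mer to the set of read ids that contain it."""
    index = defaultdict(set)
    for rid, seq in enumerate(reads):
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(rid)
    return index

def candidate_pairs(reads, k):
    """Return read-id pairs that share at least one exact k-mer seed."""
    pairs = set()
    for rids in kmer_index(reads, k).values():
        pairs.update(combinations(sorted(rids), 2))
    return pairs

# Hypothetical toy reads (not from the talk).
reads = [
    "ACGTACGTGGA",  # read 0
    "ACGTACCTGGA",  # read 1: read 0 with one substitution
    "TTTTTTTTTTT",  # read 2: unrelated
]
short_seed_pairs = candidate_pairs(reads, 4)  # short seed tolerates the error
long_seed_pairs = candidate_pairs(reads, 8)   # long seed is broken by the error
print(short_seed_pairs, long_seed_pairs)
```

With noisy reads the seed must shrink to stay error-free, and a short seed matches many unrelated reads in real data, which is the trade-off the slide describes.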
Long Sequences Simplify the Graph
Assembly complexity of prokaryotic genomes using short reads. Kingsford C, Schatz MC, Pop M (2010) BMC Bioinformatics.
k = 50 k = 1,000 k = 5,000
Hybrid error correction and de novo assembly of single-molecule sequencing reads. Koren, S, et al. (2012) Nature Biotech In Review.
The Challenge of Long Reads
Correction results of 20x PacBio coverage of E. coli K12 corrected using 50x Illumina
[Figure: histograms of % reads against read length (bp), % coverage, and % identity for PacBio reads, shown pre correction and post correction]
Correction Results
One Chromosome, One Contig
• USDA collaboration, Bibersteinia trehalosi 192
  – 2.4 Mbp genome, 5.5 Kbp max repeat

Assembler        S. aureus                R. sphaeroides           Previously Possible  Recipe #1         Recipe #2
Data             45x Illumina, 45x 3Kbp   45x Illumina, 45x 3Kbp   50x 454, 50x 8Kbp    20x PBcR (>6Kbp)  50x PBcR, 10x 8Kbp
Total Scaffolds  12                       34                       1                    1                 1
Total Contigs    60                       204                      25                   1                 1
N50 Contig       96,740                   42,455                   190,476              2,419,630         2,415,188
Max Contig       234,488                  106,467                  347,035              2,419,630         2,415,188
GAGE: A critical evaluation of genome assemblies and assembly algorithms. Salzberg SL, Phillippy AM, Zimin AV, Puiu D, Magoc T, Koren S, Treangen TJ, Schatz MC, Delcher AL, Roberts M, Marcais G, Pop M, Yorke JA (2011). Genome Research.
Assembling A Vertebrate
[Figure: % number of CDS against % CDS coverage for PBcR, 454, and Illumina assemblies; panels: a) mapped to contig, b) mapped to scaffold]
• Accurate correction for vertebrates
  – >99.5% read correction accuracy
  – Good quality of assembly
On To Metagenomics
• Newly available long inaccurate sequences
  – Finding shared sequence is an N^2 algorithm
  – Use exact seeds and extend, but high error rates require very small seeds
  – Consequently, no method to use these sequences
• Assemble metagenomic and single cell datasets
  – Assumption of uniform coverage is violated, requiring novel methods to identify repeats
  – Assumption of simple paths with no forks is violated by differences between closely related organisms
• Automated, Accessible, and Reproducible Results
  – Current methods require manual parameter tuning and are error-prone
  – Easy-to-use visual representations of assembly graph structure
• NP-hard formulation
  – Heuristic for layout in linear time works well in practice
  – Heuristic for orientation (known 2-approximate)
Previous Approaches
Coverage observed across large unitigs
Higher coverage indicates a Repeat
• Manually increase threshold for what is a repeat

                                 Number   Number     Largest         Contigs in mixed
                                 Contigs  Scaffolds  Scaffold (Kbp)  Scaffolds
Celera Assembler                 18,577   1,600      29              0.44%
with metagenomics settings       8,128    422        864             17.79%
Bambus initial                   8,128    328        972             33.77%
with Secondary Signal detection  8,128    291        430             0.98%

• Leads to assembly errors
The Sorcerer II Global Ocean Sampling expedition: northwest Atlantic through eastern tropical Pacific. Rusch DB, et al. (2007) PLoS Biology.
• Manually increase threshold for what is shared sequence
The Challenge of Uneven Coverage
• Repeat affects global structure of the graph
• Detect using number of shortest paths it is on
[Figure: example graph with numbered nodes; node labels omitted]
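Counting the shortest paths that pass through a node is betweenness centrality. The sketch below is a plain, single-threaded Python version of Brandes' algorithm for an unweighted, undirected graph. It is an illustration only, not the Bambus 2 implementation; the adjacency-dictionary format and the toy graph, in which a hypothetical repeat node `R` joins four unique nodes, are assumptions.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm on an unweighted, undirected graph given as
    {node: list of neighbours}. Returns, per node, the number of shortest
    paths between other node pairs that pass through it (fractional
    when several shortest paths tie)."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies in reverse BFS order.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Every undirected pair was visited from both of its endpoints.
    return {v: c / 2.0 for v, c in score.items()}

# Hypothetical toy graph: repeat node R is adjacent to four unique nodes.
graph = {
    "u1": ["R"], "u2": ["R"], "u3": ["R"], "u4": ["R"],
    "R": ["u1", "u2", "u3", "u4"],
}
centrality = betweenness(graph)
print(centrality)
```

In this toy graph every path between two unique nodes runs through `R`, so `R` stands out as an outlier, which is the signal the slide proposes to use in place of coverage.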
Use Local Coverage Metrics
• Split graph into strongly connected components
  – Compute A-stat for each component
  – P(read starting at a base) = arrival rate = $\alpha = \frac{\text{number of reads}}{\text{length of contigs}}$
  – P($k$ starts in $\rho$ bases) is Binomial
• We approximate using Poisson with $\lambda = \rho\alpha = np$, giving us $\frac{\lambda^{k} e^{-\lambda}}{k!}$
  – For a 2-copy repeat, the arrival rate is doubled, giving us $\frac{(2\lambda)^{k} e^{-2\lambda}}{k!}$
  – $\text{A-stat} = \log\frac{P(\text{unique})}{P(\text{2-copy})} = (\log e)\,\lambda - (\log 2)\,k$
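As a worked example of the A-stat on this slide, the Python sketch below evaluates the log-odds for one unitig. The function name, the example numbers, and the choice of the natural logarithm (so that log e = 1) are assumptions; the slide does not state the base.

```python
import math

def a_stat(unitig_length, read_count, total_reads, total_length):
    """Log-odds (natural log, an assumed base) that a unitig is unique
    rather than a 2-copy repeat: lambda - k * ln 2."""
    alpha = total_reads / total_length   # arrival rate per base
    lam = unitig_length * alpha          # expected read starts if unique
    return lam - math.log(2) * read_count

def log_poisson(k, lam):
    """Natural log of the Poisson probability of k events at rate lam."""
    return k * math.log(lam) - lam - math.lgamma(k + 1)

# Hypothetical component: 1,000 reads over 100,000 bp, so alpha = 0.01.
unique_like = a_stat(5000, 50, 1000, 100000)    # 50 reads where 50 are expected
repeat_like = a_stat(5000, 100, 1000, 100000)   # 100 reads where 50 are expected
print(unique_like, repeat_like)
```

A positive value favours the unique hypothesis and a negative value favours the 2-copy repeat. `log_poisson` is included only so the closed form can be checked against the ratio of the two Poisson probabilities.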
Repeat Detection is Improved
Bambus 2: Scaffolding Metagenomes. Koren S, Treangen TJ, Pop M (2011) Bioinformatics.
The Challenge of Variations
[Figure: pairwise sequence alignment with match and mismatch markers; sequence text not recoverable from the source]
Bambus 2: Scaffolding Metagenomes. Koren S, Treangen TJ, Pop M (2011) Bioinformatics.
The Challenge of Reproducibility
metAMOS: A modular and open source metagenomic assembly and analysis pipeline Treangen TJ*, Koren S*, Sommer D, Astrovskaya I, Liu B, Darling AE, and Pop M. Genome Biology In Prep.
[Figure: metAMOS pipeline diagram. Legend: Start, Existing, Novel.]
Pipeline steps: Preprocess, Assemble, MapReads, FindORFS, FindRepeats, Annotate, Scaffold, Propagate, Classify, Postprocess/Results
Tools shown in the diagram:
– FastQC, fastx_toolkit
– CA (Miller et al 2008), Meta-IDBA (Peng et al 2011), Minimus (Sommer et al 2008), Newbler, SOAPdenovo (Li et al 2010), Velvet (Zerbino et al 2008), Velvet-SC (Chitsaz et al 2011)
– Bowtie (Langmead et al 2009)
– MetaGeneMark (Zhu et al 2010), FragGeneScan (Rho et al 2010), Glimmer-MG (Kelley et al 2012)
– BLAST, MetaPhyler (Liu et al 2011), PHMMER (Eddy 2011), PhyloSift (In prep), PhymmBL (Brady et al 2011), NB/RITA (McDonald et al 2011)
– Bambus 2 (Koren et al 2011)
– Krona (Ondov et al 2011)
Propagate: propagate classifications via assembly graph and make final classifications of assembled contigs/scaffolds
Traditional Assembly
Table 4.3: Assembly contiguity on two NGS prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2,872,915) and Rhodobacter sphaeroides (genome size 4,603,060). For all assemblies, N50 values are based on the same genome size. The Errors column contains the number of misjoins plus indel errors > 5bp for contigs, and the total number of misjoins for scaffolds. Corrected N50 values were computed after correcting contigs and scaffolds by breaking them at each error. See the GAGE publication [146] for details on how errors were identified.

                             Contigs                              Scaffolds
Genome          Assembler    Num    N50    Errors  N50 corr (kb)  Num    N50    Errors  N50 corr. (kb)
S. aureus       ABySS        301    29.2   14      24.8           246    34     1       28
                Allpaths-LG  60     96.7   16      66.2           12     1,092  0       1,092
                Bambus 2     164    29.5   15      26.0           16     1,089  0       1,089
                CABOG        Could not run: incompatible read lengths in one library.
                MSR-CA       94     59.2   22      48.2           17     2,412  3       1,022
                SGA          252    4.0    6       4.0            456    208    1       208
                SOAPdenovo   107    288.2  48      62.7           99     332    8       284
                Velvet       162    48.4   28      41.5           45     762    17      162
R. sphaeroides  ABySS        1,915  5.9    55      4.2            1,701  9      3       5
                Allpaths-LG  204    42.5   43      34.4           34     3,192  0       3,092
                Bambus 2     376    21.0   25      19.5           92     2,478  2       2,478
                CABOG        322    20.2   34      17.9           130    66     5       55
                MSR-CA       395    22.1   42      19.1           43     2,976  5       2,966
                SGA          3,067  4.5    8       2.9            2,096  51     0       51
                SOAPdenovo   204    131.7  414     14.3           166    660    3       658
                Velvet       583    15.7   35      14.5           178    353    6       270
Table 4.4: Assembly correctness on two NGS prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2,872,915) and Rhodobacter sphaeroides (genome size 4,603,060). See the GAGE publication [146] for details on how errors were identified.

                                   Indels          Contigs               Scaffolds
Genome          Assembler    SNPs  ≤ 5bp   > 5bp   Misjoins  Inv  Reloc  Misjoins  Inv  Reloc
S. aureus       ABySS        258   20      9       5         3    2      1         1    0
                Allpaths-LG  79    4       12      4         0    4      0         0    0
                Bambus 2     39    12      11      4         1    3      0         0    0
                MSR-CA       191   23      10      12        6    6      3         3    0
                SGA          32    2       2       4         1    3      0         0    0
                SOAPdenovo   246   25      31      17        1    16     8         1    7
                Velvet       217   6       14      14        5    9      17        5    12
R. sphaeroides  ABySS        692   208     34      21        2    19     3         0    3
                Allpaths-LG  218   150     37      6         0    6      0         0    0
                Bambus 2     193   136     23      2         1    1      2         0    2
                CABOG        536   145     24      10        1    9      5         4    1
                MSR-CA       807   179     32      10        1    9      5         2    3
                SGA          336   116     4       4         0    4      0         0    0
                SOAPdenovo   527   155     406     8         0    8      3         1    2
                Velvet       413   148     27      8         0    8      6         6    7
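The caption of Table 4.3 describes N50 computed against a fixed genome size, and a corrected N50 obtained by breaking contigs at each error. The Python sketch below shows one way to compute both. It is an illustration under stated assumptions, not the GAGE evaluation code: the function names, the toy contig lengths, and the error position are hypothetical.

```python
def n50(lengths, genome_size):
    """Largest length L such that pieces of length >= L together cover
    at least half of genome_size; 0 if the pieces never reach that total."""
    covered = 0
    for length in sorted(lengths, reverse=True):
        covered += length
        if covered * 2 >= genome_size:
            return length
    return 0

def break_at_errors(length, error_positions):
    """Split one contig of the given length at each error position."""
    cuts = [0] + sorted(error_positions) + [length]
    return [b - a for a, b in zip(cuts, cuts[1:]) if b > a]

# Hypothetical toy assembly of a 1,000 bp genome.
contigs = {"c1": 500, "c2": 300, "c3": 200}
errors = {"c1": [250]}  # one assumed misjoin in the middle of c1
genome_size = 1000

raw_n50 = n50(contigs.values(), genome_size)
corrected_n50 = n50(
    [piece
     for name, length in contigs.items()
     for piece in break_at_errors(length, errors.get(name, []))],
    genome_size,
)
print(raw_n50, corrected_n50)
```

Breaking the largest contig at its single error halves the N50 in this toy case, which mirrors how a high raw N50 in the table can drop sharply once errors are counted.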
We ran Bambus 2 to scaffold unitigs from CA-met and Minimus [164]. As seen in Figure 4.6, for all genomes, Bambus 2 outperforms CA. For all but one genome,
GAGE: A critical evaluation of genome assemblies and assembly algorithms. Salzberg SL, Phillippy AM, Zimin AV, Puiu D, Magoc T, Koren S, Treangen TJ, Schatz MC, Delcher AL, Roberts M, Marcais G, Pop M, Yorke JA (2011). Genome Research.
Metagenomic Assembly
[Figure: bar chart of # of genomes >90% covered (axis 0 to 25) for SOAP.utg, metAMOS (SOAP.utg), SOAP, Meta-IDBA, velvet, CA, and CAdeg; series: Mock Even, Mock Staggered]
Metagenomic Error Reduced
[Figure: bar chart of error rate, in errors per MB (total errors / total contig length; axis 0 to 30), for SOAP.utg, metAMOS (SOAP.utg), SOAP, Meta-IDBA, velvet, CA, and CAdeg; series: Mock Even, Mock Staggered]
Metagenomic Assembly
[Figure: two panels comparing soap.utg, metAMOS soap.utg, Meta-IDBA, and metAMOS Meta-IDBA on genomes with >10X coverage. Axes: recruited reference genome coverage (% aligned contigs; bins 10% to >70%) and errors per MB (total errors / total length of aligned contigs; 0 to 600).]
Genomes listed: N. meningitidis, S. gallolyticus, S. agalactiae, S. suis, A. pleuropneumoniae, A. succinogenes, M. succiniciproducens, S. gordonii, P. multocida, S. pneumoniae, S. sanguinis, C. concisus, A. aphrophilus, S. mitis, R. mucilaginosa, S. thermophilus, V. parvula, A. actinomycetemcomitans, A. parvulum, H. ducreyi
Table 5.3: Assembly of the HMP tongue dorsum dataset with metAMOS. Stats comparing two assemblies generated within metAMOS (using Meta-IDBA and SOAPdenovo, MetaGeneMark and Bambus 2) of tongue dorsum female sample (HMP, SRS077736). Unitigs indicate initial output of each assembler. Contigs are reported after scaffolding by Bambus 2 (splitting scaffolds at Ns).

            Unitigs                        Contigs           Scaffolds
Assembler   Total BP     #        Max      #        Max      #        Max
Meta-IDBA   119,075,843  678,034  220,488  673,291  220,488  644,997  443,823
SOAPdenovo  101,769,360  451,765  116,181  292,706  238,051  287,108  238,051
5.3.5 HMP tongue dorsum

Our second analysis was performed on real data (HMP tongue dorsum female sample). For this sample, the true and complete composition of the community is unknown; instead we constructed a reference genome set from the genomes identified by the HMP to have high similarity to the sequences within the sample. This dataset was previously assembled with Meta-IDBA and the published results demonstrated that Meta-IDBA was able to generate larger contigs than SOAPdenovo [117]. We used both SOAPdenovo and Meta-IDBA as starting points for the metAMOS pipeline. The results shown in Table 5.3 show that while Meta-IDBA produces a significantly larger maximum unitig (doubling that obtained by SOAP.utg), the resulting contigs and scaffolds are much closer in length. Focusing on producing larger unitigs in an initial assembly leads to higher error rates (Table 5.1) while metAMOS produces accurate unitigs and contiguous contigs/scaffolds. Figure 5.10 shows the Krona [114] plot for the sample which is automatically generated by metAMOS. The figure allows both for an overview of the taxonomic composition in a dataset as well as allowing interactive navigation to explore specific branches of the taxonomy.

To evaluate the correctness of these assemblies, we aligned them against our set of reference genomes. In Figure 5.11 we show the percentage of each reference genome covered by correctly assembled contigs. While both assemblers (SOAPdenovo and Meta-IDBA) vary in their ability to reconstruct individual genomes,
Single-Cell Assembly
Table 4.6: Assembly contiguity on two single-cell prokaryotic datasets. Assemblies of Staphylococcus aureus (genome size 2,872,915) and Rhodobacter sphaeroides (genome size 4,603,060). For all assemblies, N50 values are based on the same genome size. The Errors column contains the number of misjoins plus indel errors > 5bp for contigs, and the total number of misjoins for scaffolds. Corrected N50 values were computed after correcting contigs and scaffolds by breaking them at each error. See the GAGE publication [146] for details on how errors were identified.

                                        Contigs                           Scaffolds
Genome            Assembler             Num  N50   Errors  N50 corr (kb)  Num  N50   Errors  N50 corr. (kb)
E. coli Lane 6    Euler+Velvet-SC [18]  501  32.0  -       -              -    -     -       -
                  Velvet-SC             220  56.5  23      52.1           -    -     -       -
                  Velvet-SC+Bambus 2    204  60.7  24      54.4           193  65.0  0       59.6
S. aureus Lane 7  Euler+Velvet-SC [18]  355  32.3  -       -              -    -     -       -
                  Velvet-SC             175  37.5  19      34.9           -    -     -       -
                  Velvet-SC+Bambus 2    141  45.8  19      42.7           136  48.4  4       40.9
The table shows that Bambus 2 is able to generate longer contigs than those from Velvet-SC alone. The assemblies show a 4.4% and 22% corrected N50 gain while introducing one and zero errors in E. coli K12 and S. aureus USA300, respectively. While the gain is small, the datasets only have short Illumina paired-ends (260bp) and Bambus 2 is still able to improve on the state of the art. Based on the assembly of S. aureus from Salzberg et al. [146], Bambus 2 contiguity increases 22-fold when long (3Kbp) Illumina mate-pairs are available (from 50Kbp to 1,082Kbp scaffold corrected N50). Therefore, we expect Bambus 2 can significantly improve single-cell assemblies.
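The gains quoted in this paragraph can be reproduced from the corrected contig N50 values in Table 4.6 and the scaffold values given in the text. The short Python check below is only a worked calculation; the helper name `percent_gain` is an assumption.

```python
def percent_gain(before, after):
    """Relative improvement of `after` over `before`, in percent."""
    return 100.0 * (after - before) / before

# Corrected contig N50 (kb): Velvet-SC versus Velvet-SC+Bambus 2 (Table 4.6).
ecoli_gain = percent_gain(52.1, 54.4)
saureus_gain = percent_gain(34.9, 42.7)

# Scaffold corrected N50 (Kbp) quoted in the text: 50 versus 1,082.
fold_increase = 1082 / 50

print(round(ecoli_gain, 1), round(saureus_gain), round(fold_increase))
```

The rounded values agree with the 4.4%, 22%, and 22-fold figures in the text.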
4.5 Discussion

The repeat detection procedures used in Bambus 2 are sensitive without sacrificing specificity, and have been applied to the assembly of single genomes. The scaffolds generated by Bambus 2 cover a large percentage of the genomes in the
Efficient de novo assembly of single-cell bacterial genomes from short-read data sets. Chitsaz H, Yee-Greenbaum JL, Tesler G, Lombardo MJ, Dupont CL, Badger JH, Novotny M, Rusch DB, Fraser LJ, Gormley NA, Schulz-Trieglaff O, Smith JP, Evers DJ, Pevzner PA, Lasken RL (2011). Nat Biotech.
Enable Novel Biology
• Automatically finish bacterial genomes
  – Enable comparative analysis on a scale not previously possible
• Analyze metagenomic datasets and more accurately identify functional potential
  – Accurately represent closely-related organisms in a sample
• Allow reproducible and consistent analysis
• Independent of technology
Conclusion
• Demonstrated on multiple domains, outperforming domain-specific tools
• Applicable to other areas of computer science
  – Correction of high-error strings using a consensus approach
  – OpenMP implementation of k-betweenness vertex centrality with automated outlier detection
  – Linear-time heuristic linear layout algorithm
• Future Work
  – Applying long-reads to metagenomic and single-cell assembly
  – Automatically scaling # threads for repeat detection
  – Confidence assignments to biological variants
Software
Publications
Journal Articles
[J1] Koren, S., Schatz, M. C., Walenz, B. P., Martin, J., Howard, J., Ganapathy, G., Wang, Z., Rasko, D. A., McCombie, W. R., Jarvis, E. D., and Phillippy, A. M. Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nature Biotechnology In Review (2012).
[J2] Prüfer, K., Munch, K., Hellmann, I., Akagi, K., Miller, J. R., Walenz, B., Koren, S., Sutton, G., Kodira, C., Winer, R., Knight, J. R., Mullikin, J. C., Meader, S. J., Ponting, C. P., Lunter, G., Higashino, S., Hobolth, A., Dutheil, J., Karakoç, E., Alkan, C., Sajjadian, S., Catacchio, C. R., Ventura, M., Marques-Bonet, T., Eichler, E. E., André, C., Atencia, R., Mugisha, L., Patterson, N., Siebauer, M., Good, J. M., Fischer, A., Ptak, S. E., Lachmann, M., Symer, D. E., Mailund, T., Schierup, M. H., Andrés, A. M., Kelso, J., and Pääbo, S. The bonobo genome compared with the genomes of chimpanzee and human. Nature In Review (2012).
[J3] Treangen*, T. J., Koren, S.*, Sommer, D., Astrovskaya, I., Liu, B., Darling, A. E., and Pop, M. metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Genome Biology In Review (2012).
[J4] Earl, D. A., Bradnam, K., St. John, J., Darling, A., Lin, D., Faas, J., Yu, H. O. K., Vince, B., Zerbino, D. R., Diekhans, M., Nguyen, N., Nuwantha, P., Sung, A. W.-K., Ning, Z., Haimel, M., Simpson, J. T., Fronseca, N. A., Birol, İ., Docking, T. R., Ho, I. Y., Rokhsar, D. S., Chikhi, R., Lavenier, D., Chapuis, G., Naquin, D., Maillet, N., Schatz, M. C., Kelly, D. R., Phillippy, A. M., Koren, S., Yang, S.-P., Wu, W., Chou, W.-C., Srivastava, A., Shaw, T. I., Ruby, J. G., Skewes-Cox, P., Betegon, M., Dimon, M. T., Solovyev, V., Kosarev, P., Vorobyev, D., Ramirez-Gonzalez, R., Leggett, R., MacLean, D., Xia, F., Luo, R., L, Z., Xie, Y., Liu, B., Gnerre, S., MacCallum, I., Przybylski, D., Ribeiro, F. J., Yin, S., Sharpe, T., Hall, G., Kersey, P. J., Durbin, R., Jackman, S. D., Chapman, J. A., Huang, X., DeRisi, J. L., Caccamo, M., Li, Y., Jaffe, D. B., Green, R., Haussler, D., Korf, I., and Paten, B. Assemblathon 1: A competitive assessment of de novo short read assembly methods. Genome Research (2011).
[J5] Koren, S., Treangen, T. J., and Pop, M. Bambus 2: scaffolding metagenomes. Bioinformatics 27, 21 (2011), 2964–2971.
[J6] Salzberg, S. L., Phillippy, A. M., Zimin, A. V., Puiu, D., Magoc, T., Koren, S., Treangen, T., Schatz, M. C., Delcher, A. L., Roberts, M., Marcais, G., Pop, M., and Yorke, J. A. GAGE: a critical evaluation of genome assemblies and assembly algorithms. Genome Research (2011).
[J7] Treangen, T. J., Sommer, D. D., Angly, F. E., Koren, S., and Pop, M. Next generation sequence assembly with amos. Current Protocols in Bioinformatics 33 (2011), 11.8.1–11.8.18.
[J8] Koren, S., Miller, J., Walenz, B., and Sutton, G. An algorithm for automated closure during assembly (highly accessed). BMC Bioinformatics 11, 1 (2010), 457.
[J9] Miller, J., Koren, S., and Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics (2010).
[J10] Rausch, T., Koren, S., Denisov, G., Weese, D., Emde, A., Doring, A., and Reinert, K. A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads. Bioinformatics 25, 9 (2009), 1118.
[J11] Miller, J., Delcher, A., Koren, S., Venter, E., Walenz, B., Brownley, A., Johnson, J., Li, K., Mobarry, C., and Sutton, G. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 24 (2008), 2818.
Celera Assembler http://wgs-assembler.sf.net
Publications
Journal Articles
[J1] Koren S, Schatz, M. C., Walenz, B. P., Martin, J., Howard, J., Ganapathy, G., Wang, Z., Rasko,D. A., McCombie, W. R., Jarvis, E. D., and Phillippy, A. M. Hybrid error correction and de novoassembly of single-molecule sequencing reads. Nature Biotechnology In Review (2012).
[J2] Prüfer, K., Munch, K., Hellmann, I., Akagi, K., Miller, J. R., Walenz, B., Koren S, Sutton, G.,Kodira, C., Winer, R., Knight, J. R., Mullikin, J. C., Meader, S. J., Ponting, C. P., Lunter, G.,Higashino, S., Hobolth, A., Dutheil, J., Karakoç, E., Alkan, C., Sajjadian, S., Catacchio, C. R.,Ventura, M., Marques-Bonet, T., Eichler, E. E., AndrO, C., Atencia, R., Mugisha, L., Patterson,N., Siebauer, M., Good, J. M., Fischer, A., Ptak, S. E., Lachmann, M., Symer, D. E., Mailund, T.,Schierup, M. H., Andrés, A. M., Kelso, J., and Pääbo, S. The bonobo genome compared with thegenomes of chimpanzee and human. Nature In Review (2012).
[J3] Treangen*, T. J., Koren S*, Sommer, D., Astrovskaya, I., Liu, B., Darling, A. E., and Pop, M.metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Genome BiologyIn Review (2012).
[J4] Earl, D. A., Bradnam, K., St. John, J., Darling, A., Lin, D., Faas, J., Yu, H. O. K., Vince, B., Zerbino,D. R., Diekhans, M., Nguyen, N., Nuwantha, P., Sung, A. W.-K., Ning, Z., Haimel, M., Simpson,J. T., Fronseca, N. A., Birol, Ä., Docking, T. R., Ho, I. Y., Rokhsar, D. S., Chikhi, R., Lavenier,D., Chapuis, G., Naquin, D., Maillet, N., Schatz, M. C., Kelly, D. R., Phillippy, A. M., Koren, S,Yang, S.-P., Wu, W., Chou, W.-C., Srivastava, A., Shaw, T. I., Ruby, J. G., Skewes-Cox, P., Betegon,M., Dimon, M. T., Solovyev, V., Kosarev, P., Vorobyev, D., Ramirez-Gonzalez, R., Leggett, R.,MacLean, D., Xia, F., Luo, R., L, Z., Xie, Y., Liu, B., Gnerre, S., MacCallum, I., Przybylski, D.,Ribeiro, F. J., Yin, S., Sharpe, T., Hall, G., Kersey, P. J., Durbin, R., Jackman, S. D., Chapman,J. A., Huang, X., DeRisi, J. L., Caccamo, M., Li, Y., Jaffe, D. B., Green, R., Haussler, D., Korf, I.,and Paten, B. Assemblathon 1: A competitive assessment of de novo short read assembly methods.Genome Research (2011).
[J5] Koren, S, Treangen, T. J., and Pop, M. Bambus 2: scaffolding metagenomes. Bioinformatics 27, 21(2011), 2964–2971.
[J6] Salzberg, S. L., Phillippy, A. M., Zimin, A. V., Puiu, D., Magoc, T., Koren, S, Treangen, T., Schatz,M. C., Delcher, A. L., Roberts, M., Marcais, G., Pop, M., and Yorke, J. A. GAGE: a criticalevaluation of genome assemblies and assembly algorithms. Genome Research (2011).
[J7] Treangen, T. J., Sommer, D. D., Angly, F. E., Koren, S., and Pop, M. Next generation sequenceassembly with amos. Current Protocols in Bioinformatics 33 (2011), 11.8.1–11.8.18.
[J8] Koren, S., Miller, J., Walenz, B., and Sutton, G. An algorithm for automated closure duringassembly (highly accessed). BMC Bioinformatics 11, 1 (2010), 457.
[J9] Miller, J., Koren, S., and Sutton, G. Assembly algorithms for next-generation sequencing data.Genomics (2010).
[J10] Rausch, T., Koren, S., Denisov, G., Weese, D., Emde, A., Doring, A., and Reinert, K. A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads. Bioin-formatics 25, 9 (2009), 1118.
[J11] Miller, J., Delcher, A., Koren, S., Venter, E., Walenz, B., Brownley, A., Johnson, J., Li, K., Mo-barry, C., and Sutton, G. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics24, 24 (2008), 2818.
2
Publications
Journal Articles
[J1] Koren S, Schatz, M. C., Walenz, B. P., Martin, J., Howard, J., Ganapathy, G., Wang, Z., Rasko,D. A., McCombie, W. R., Jarvis, E. D., and Phillippy, A. M. Hybrid error correction and de novoassembly of single-molecule sequencing reads. Nature Biotechnology In Review (2012).
[J2] Prüfer, K., Munch, K., Hellmann, I., Akagi, K., Miller, J. R., Walenz, B., Koren S, Sutton, G.,Kodira, C., Winer, R., Knight, J. R., Mullikin, J. C., Meader, S. J., Ponting, C. P., Lunter, G.,Higashino, S., Hobolth, A., Dutheil, J., Karakoç, E., Alkan, C., Sajjadian, S., Catacchio, C. R.,Ventura, M., Marques-Bonet, T., Eichler, E. E., AndrO, C., Atencia, R., Mugisha, L., Patterson,N., Siebauer, M., Good, J. M., Fischer, A., Ptak, S. E., Lachmann, M., Symer, D. E., Mailund, T.,Schierup, M. H., Andrés, A. M., Kelso, J., and Pääbo, S. The bonobo genome compared with thegenomes of chimpanzee and human. Nature In Review (2012).
[J3] Treangen*, T. J., Koren S*, Sommer, D., Astrovskaya, I., Liu, B., Darling, A. E., and Pop, M.metAMOS: A modular and open source metagenomic assembly and analysis pipeline. Genome BiologyIn Review (2012).
[J4] Earl, D. A., Bradnam, K., St. John, J., Darling, A., Lin, D., Faas, J., Yu, H. O. K., Vince, B., Zerbino,D. R., Diekhans, M., Nguyen, N., Nuwantha, P., Sung, A. W.-K., Ning, Z., Haimel, M., Simpson,J. T., Fronseca, N. A., Birol, Ä., Docking, T. R., Ho, I. Y., Rokhsar, D. S., Chikhi, R., Lavenier,D., Chapuis, G., Naquin, D., Maillet, N., Schatz, M. C., Kelly, D. R., Phillippy, A. M., Koren, S,Yang, S.-P., Wu, W., Chou, W.-C., Srivastava, A., Shaw, T. I., Ruby, J. G., Skewes-Cox, P., Betegon,M., Dimon, M. T., Solovyev, V., Kosarev, P., Vorobyev, D., Ramirez-Gonzalez, R., Leggett, R.,MacLean, D., Xia, F., Luo, R., L, Z., Xie, Y., Liu, B., Gnerre, S., MacCallum, I., Przybylski, D.,Ribeiro, F. J., Yin, S., Sharpe, T., Hall, G., Kersey, P. J., Durbin, R., Jackman, S. D., Chapman,J. A., Huang, X., DeRisi, J. L., Caccamo, M., Li, Y., Jaffe, D. B., Green, R., Haussler, D., Korf, I.,and Paten, B. Assemblathon 1: A competitive assessment of de novo short read assembly methods.Genome Research (2011).
[J5] Koren, S., Treangen, T. J., and Pop, M. Bambus 2: scaffolding metagenomes. Bioinformatics 27, 21 (2011), 2964–2971.
[J6] Salzberg, S. L., Phillippy, A. M., Zimin, A. V., Puiu, D., Magoc, T., Koren, S., Treangen, T., Schatz, M. C., Delcher, A. L., Roberts, M., Marcais, G., Pop, M., and Yorke, J. A. GAGE: a critical evaluation of genome assemblies and assembly algorithms. Genome Research (2011).
[J7] Treangen, T. J., Sommer, D. D., Angly, F. E., Koren, S., and Pop, M. Next generation sequence assembly with AMOS. Current Protocols in Bioinformatics 33 (2011), 11.8.1–11.8.18.
[J8] Koren, S., Miller, J., Walenz, B., and Sutton, G. An algorithm for automated closure during assembly (highly accessed). BMC Bioinformatics 11, 1 (2010), 457.
[J9] Miller, J., Koren, S., and Sutton, G. Assembly algorithms for next-generation sequencing data. Genomics (2010).
[J10] Rausch, T., Koren, S., Denisov, G., Weese, D., Emde, A., Doring, A., and Reinert, K. A consistency-based consensus algorithm for de novo and reference-guided sequence assembly of short reads. Bioinformatics 25, 9 (2009), 1118.
[J11] Miller, J., Delcher, A., Koren, S., Venter, E., Walenz, B., Brownley, A., Johnson, J., Li, K., Mobarry, C., and Sutton, G. Aggressive assembly of pyrosequencing reads with mates. Bioinformatics 24, 24 (2008), 2818.
Bambus 2 / AMOS http://www.cbcb.umd.edu/software/bambus/
metAMOS https://github.com/treangen/metAMOS/wiki
PacBioToCA http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=PacBioToCA
SMRT Sequencing
[Figure: plot of Intensity versus Time]
http://www.pacificbiosciences.com/assets/files/pacbio_technology_backgrounder.pdf
Imaging of fluorescent phospholinked labeled nucleotides as they are incorporated by a polymerase anchored to a Zero-Mode Waveguide (ZMW).
Select Assembly Publications
Bonobo Sequencing Project
GAGE: Assembly Competition
Surveys
Acknowledgements
• UMD Pop Lab: Mihai Pop, Lee Mendelowitz, Todd Treangen, Dan Sommer, Bo Liu, Henry Lin, Mohammadreza Ghodsi, Irina Astrovskaya, Christopher Hill, Chengxi Ye, Joseph Paulson
• USDA: Tim Smith, Gregory Harhay, Dayna Harhay, Scott McVey
• JCVI: Brian P. Walenz, Jason R. Miller, Granger Sutton
• JGI: Jeffrey Martin, Zhong Wang
• Duke University: Jason Howard, Ganeshkumar Ganapathy, Erich D. Jarvis
• CSHL: W. Richard McCombie, Michael Schatz
• UMD SOM: David A. Rasko
• NBACC: Adam Phillippy, Brian Ondov
• JHU: Steven L. Salzberg, Daniela Puiu, Tanja Magoc
• UC Davis: Aaron E. Darling