Multiple Sequence Alignment (MSA) and Phylogeny
![Page 1: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/1.jpg)
Multiple Sequence Alignment (MSA) and Phylogeny
![Page 2: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/2.jpg)
Clustal X
![Page 3: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/3.jpg)
Input: multiple sequence Fasta file

>gi|21536452|ref|NP_002762.2| mesotrypsin preproprotein [Homo sapiens]
MNPFLILAFVGAAVAVPFDDDDKIVGGYTCEENSLPYQVSLNSGSHFCGGSLISEQWVVSAAHCYKTRIQ
VRLGEHNIKVLEGNEQFINAAKIIRHPKYNRDTLDNDIMLIKLSSPAVINARVSTISLPTAPPAAGTECL
ISGWGNTLSFGADYPDELKCLDAPVLTQAECKASYPGKITNSMFCVGFLEGGKDSCQRDSGGPVVCNGQL
QGVVSWGHGCAWKNRPGVYTKVYNYVDWIKDTIAANS
>gi|114051746|ref|NP_001040585.1| protease, serine, 2 [Macaca mulatta]
MNPLLILAFVGVAVAAPFDDDDKIVGGYTCEENSVPYQVSLNSGYHFCGGSLINEQWVVSAAHCYKTRIQ
VRLGEHNIEVLEGTEQFINAAKIIRHPDYDRKTLNNDILLIKLSSPAVINARVSTISLPTAPPAAGAEAL
ISGWGNTLSSGADYPDELQCLEAPVLSQAECEASYPGKITSNMFCVGFLEGGKDSCQGDSGGPVVSNGQL
QGIVSWGYGCAQKNRPGVYTKVYNYVDWIRDTIAANS
>gi|6755891|ref|NP_035775.1| mesotrypsin [Mus musculus]
MNALLILALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKTRIQ
VRLGEHNINVLEGNEQFVNAAKIIKHPNFNRKTLNNDIMLLKLSSPVTLNARVATVALPSSCAPAGTQCL
ISGWGNTLSFGVSEPDLLQCLDAPLLPQADCEASYPGKITGNMVCAGFLEGGKDSCQGDSGGPVVCNREL
QGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN
>gi|6981422|ref|NP_036861.1| protease, serine, 2 [Rattus norvegicus]
MRALLFLALVGAAVAFPVDDDDKIVGGYTCQENSVPYQVSLNSGYHFCGGSLINDQWVVSAAHCYKSRIQ
VRLGEHNINVLEGNEQFVNAAKIIKHPNFDRKTLNNDIMLIKLSSPVKLNARVATVALPSSCAPAGTQCL
ISGWGNTLSSGVNEPDLLQCLDAPLLPQADCEASYPGKITDNMVCVGFLEGGKDSCQGDSGGPVVCNGEL
QGIVSWGYGCALPDNPGVYTKVCNYVDWIQDTIAAN
>gi|27819626|ref|NP_777115.1| pancreatic anionic trypsinogen [Bos taurus]
MHPLLILAFVGAAVAFPSDDDDKIVGGYTCAENSVPYQVSLNAGYHFCGGSLINDQWVVSAAHCYQYHIQ
VRLGEYNIDVLEGGEQFIDASKIIRHPKYSSWTLDNDILLIKLSTPAVINARVSTLALPSACASGSTECL
. . .
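To make the input format concrete, here is a minimal sketch of reading a multiple sequence Fasta file: each record starts with a `>` header line, and the sequence may continue over several lines until the next header. The function name `parse_fasta` and the two toy records are illustrative assumptions, not part of Clustal X.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines are concatenated until the next header.
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# Hypothetical toy input (shortened, made-up records).
example = """>seq1 first toy record
MNPFLILAFV
GAAVAVPF
>seq2 second toy record
MNPLLILAFV
"""

records = parse_fasta(example)
for header, seq in records:
    print(header.split()[0], len(seq))
```

The same function could be applied to the five-sequence file shown on this slide; it keeps the full header text, so the species name in square brackets stays available for labelling.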
![Page 4: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/4.jpg)
One of the options to get a multiple sequence Fasta file
![Page 5: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/5.jpg)
One of the options to get a multiple sequence Fasta file
![Page 6: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/6.jpg)
Input: multiple sequence Fasta file
![Page 7: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/7.jpg)
Input: multiple sequence Fasta file
![Page 8: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/8.jpg)
Step 1: Load the sequences
![Page 9: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/9.jpg)
Sequences and conservation view
![Page 10: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/10.jpg)
Step 2: Perform Alignment
![Page 11: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/11.jpg)
Sequences and conservation view
![Page 12: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/12.jpg)
Sequences and conservation view
![Page 13: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/13.jpg)
Step 3: Create tree
![Page 14: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/14.jpg)
Step 4: NJPlot
![Page 15: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/15.jpg)
Step 4: NJPlot
![Page 16: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/16.jpg)
The Newick tree format is used to represent trees as strings.
[Figure: a tree with leaves A, C, B and D]
In Newick format: ((A,C),(B,D));
Each pair of parentheses () encloses a clade in the tree, and the commas separate the members of the corresponding clade. “;” is always the last character.
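As an illustration of how the parentheses and commas map onto clades, the following sketch parses a topology-only Newick string into nested Python tuples. It is a simplified reading of the format under stated assumptions: it handles only leaf names, parentheses, commas and the final `;`, and does not handle branch lengths, internal node labels or quoted names. The function names are hypothetical.

```python
def parse_newick(s):
    """Parse a topology-only Newick string into nested tuples of leaf names."""
    s = s.strip()
    if not s.endswith(";"):
        raise ValueError("a Newick string must end with ';'")
    pos = 0

    def parse_node():
        nonlocal pos
        if s[pos] == "(":
            pos += 1  # consume "(" that opens a clade
            children = [parse_node()]
            while s[pos] == ",":
                pos += 1  # consume "," between clade members
                children.append(parse_node())
            if s[pos] != ")":
                raise ValueError(f"expected ')' at position {pos}")
            pos += 1  # consume ")" that closes the clade
            return tuple(children)
        # Otherwise read a leaf name up to the next structural character.
        start = pos
        while s[pos] not in ",();":
            pos += 1
        if pos == start:
            raise ValueError(f"expected a leaf name at position {pos}")
        return s[start:pos]

    tree = parse_node()
    if s[pos] != ";":
        raise ValueError("unexpected text before ';'")
    return tree


def leaves(node):
    """List the leaf names under a node, left to right."""
    if isinstance(node, str):
        return [node]
    return [name for child in node for name in leaves(child)]


tree = parse_newick("((A,C),(B,D));")
print(tree)          # (('A', 'C'), ('B', 'D'))
print(leaves(tree))  # ['A', 'C', 'B', 'D']
```

Each tuple corresponds to one pair of parentheses, so the two inner tuples are the clades (A,C) and (B,D) from the slide.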
![Page 17: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/17.jpg)
How robust is our tree?
![Page 18: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/18.jpg)
How robust is our tree?
We need some statistical way to estimate the confidence in the tree topology.
But we don’t know anything about the tree topology distribution or parameters.
The only data source we have is our data (MSA).
So, we must rely on our own resources: “pull up by your own bootstraps”.
![Page 19: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/19.jpg)
Bootstrap (and jackknife)
![Page 20: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/20.jpg)
Jackknife
1. We create n (typically 100-1000) new MSAs (pseudo-data sets) by randomly sampling half of the characters (random sampling without replacement).
We do not change the number of sequences, just the number of positions!

POS: 52316
1  : TATTT
2  : CATTT
3  : CACTT
N  : AACTT

POS: 18745
1  : TTTAT
2  : TAACC
3  : TAACC
N  : TGGGA

POS: 18394
1  : TTGTA
2  : TAGAC
3  : TAAAC
N  : TGAGG
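The sampling step above can be sketched in a few lines. This is a minimal illustration, assuming the MSA is held as a list of equal-length strings; the toy alignment and the function names are made up for the example and are not taken from the slides or from Clustal X.

```python
import random


def jackknife_columns(msa, rng):
    """Build one pseudo-MSA from half of the columns, sampled without replacement.

    Returns the pseudo-MSA and the list of sampled column indices (0-based).
    """
    n_cols = len(msa[0])
    # random.Random.sample draws distinct indices, i.e. without replacement.
    picked = rng.sample(range(n_cols), n_cols // 2)
    pseudo = ["".join(row[i] for i in picked) for row in msa]
    return pseudo, picked


def jackknife_replicates(msa, n, seed=0):
    """Create n pseudo-MSAs; the number of sequences never changes."""
    rng = random.Random(seed)
    return [jackknife_columns(msa, rng)[0] for _ in range(n)]


# Hypothetical toy alignment: 4 sequences, 10 positions.
toy_msa = ["TATTTACGGA", "CATTTACGCA", "CACTTGCGCA", "AACTTGCGCT"]

replicates = jackknife_replicates(toy_msa, n=3)
for rep in replicates:
    print(rep)
```

Every sequence receives the same sampled columns, so each pseudo-data set is still a valid alignment, only shorter.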
![Page 21: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/21.jpg)
Jackknife
2. We reconstruct a tree from each data set, using the same method used for reconstructing the original tree.
[Figure: the three pseudo-data sets (POS: 52316, POS: 18745, POS: 18394), each with its reconstructed tree over Sp1, Sp2, Sp3, Sp4]
![Page 22: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/22.jpg)
Back to Jackknife
3. For each node in our original tree, we count the number of times it appeared in the jackknife analysis.
[Figure: the three jackknife trees over Sp1, Sp2, Sp3, Sp4, and the original tree annotated with support values of 67% and 100%]
In 67% of the data sets, the node Sp1+Sp2 was found.
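The counting step can be sketched as follows. This is an illustration under assumptions: trees are written as nested tuples of leaf names and treated as rooted, each node is identified by the set of leaves below it, and the three replicate topologies are hypothetical, chosen only so that the output reproduces the 67% and 100% values on the slide.

```python
def clades(tree):
    """Return the clades of a rooted tree as frozensets of leaf names.

    The root clade (all leaves) is excluded because it is present in every tree.
    """
    found = set()

    def walk(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(walk(child) for child in node))
        found.add(members)
        return members

    root = walk(tree)
    found.discard(root)
    return found


def support(original, replicates):
    """Fraction of replicate trees that contain each clade of the original tree."""
    replicate_clades = [clades(t) for t in replicates]
    return {
        clade: sum(clade in rc for rc in replicate_clades) / len(replicates)
        for clade in clades(original)
    }


original = ((("Sp1", "Sp2"), "Sp3"), "Sp4")
# Hypothetical replicate trees: two agree with the original, one swaps Sp2 and Sp3.
replicates = [
    ((("Sp1", "Sp2"), "Sp3"), "Sp4"),
    ((("Sp1", "Sp2"), "Sp3"), "Sp4"),
    ((("Sp1", "Sp3"), "Sp2"), "Sp4"),
]

values = support(original, replicates)
for clade, fraction in sorted(values.items(), key=lambda kv: len(kv[0])):
    print("+".join(sorted(clade)), f"{fraction:.0%}")
# Sp1+Sp2 67%
# Sp1+Sp2+Sp3 100%
```

For unrooted trees the same idea is usually applied to bipartitions (splits) of the leaf set rather than to rooted clades; the rooted version is used here only to keep the sketch short.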
![Page 23: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/23.jpg)
Bootstrap
The same as jackknife, but instead of sampling K/2 positions without replacement, we sample K positions with replacement, where K is the number of positions in the MSA.
![Page 24: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/24.jpg)
Bootstrap
1. Resample K positions n times

    12345 K
1 : ATCTG…A
2 : ATCTG…C
3 : ACTTA…C
N : ACCTA…T

    11244 K
1 : AATTT…T
2 : AATTT…G
3 : AACTT…T
N : AACTT…T

    47789…K
1 : TTTAT…T
2 : TAACC…G
3 : TAACC…T
N : TGGGA…T

    15578… K
1 : AGGTA…T
2 : AGGAC…G
3 : AAAAC…A
N : AAAGG…C
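A minimal sketch of the resampling step, assuming the MSA is a list of equal-length strings. The only difference from the jackknife is that K column indices are drawn with replacement, so some positions appear more than once and others not at all. The toy alignment and function names are hypothetical.

```python
import random


def bootstrap_columns(msa, rng):
    """Build one pseudo-MSA by drawing K columns with replacement (K = alignment length).

    Returns the pseudo-MSA and the sampled column indices (0-based, sorted
    so that repeats sit next to each other, as in the slide's examples).
    """
    n_cols = len(msa[0])
    picked = sorted(rng.choices(range(n_cols), k=n_cols))
    pseudo = ["".join(row[i] for i in picked) for row in msa]
    return pseudo, picked


def bootstrap_replicates(msa, n, seed=0):
    """Create n bootstrap pseudo-MSAs with the same dimensions as the input."""
    rng = random.Random(seed)
    return [bootstrap_columns(msa, rng)[0] for _ in range(n)]


# Hypothetical toy alignment: 4 sequences, 10 positions.
toy_msa = ["ATCTGACGGA", "ATCTGACGCC", "ACTTAGCGCC", "ACCTAGCGCT"]

pseudo, picked = bootstrap_columns(toy_msa, random.Random(0))
print(picked)
for row in pseudo:
    print(row)
```

Sorting the sampled indices is a presentation choice only; the order of columns does not affect distance-based tree reconstruction.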
![Page 25: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/25.jpg)
Bootstrap
2. Reconstruct a tree from each data set, using the same method used for reconstructing the original tree.
[Figure: the three resampled data sets (11244…, 47789…, 15578…), each with its reconstructed tree over Sp1, Sp2, Sp3, Sp4]
![Page 26: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/26.jpg)
Bootstrap
3. For each node in our original tree, we count the number of times it appeared in the bootstrap analysis.
[Figure: the three bootstrap trees over Sp1, Sp2, Sp3, Sp4, and the original tree annotated with support values of 67% and 100%]
• The jackknife method is less general than bootstrap
• Jackknife explores the data differently
• Jackknife is easier to apply to complex sampling schemes
![Page 27: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/27.jpg)
Step 3.5 - Bootstrap
![Page 28: Multiple Sequence Alignment (MSA) and Phylogeny](https://reader035.vdocument.in/reader035/viewer/2022062314/5681301c550346895d95982b/html5/thumbnails/28.jpg)
Bootstrap values on NJPlot
Note: ClustalX saves trees as .ph files; trees with bootstrap values are saved as .phb files.
You might have to reopen the tree…