what is a phylogenetic tree? - iammdelhi.org · biological sequence databases •electronic...

What is a Phylogenetic Tree? TARU SINGH

Post on 24-Jul-2020

What is a Phylogenetic Tree?

TARU SINGH

What is a phylogenetic tree used for?

• A phylogenetic tree is used to help represent evolutionary relationships between organisms that are believed to have some common ancestry.

• “Dendrogram” is the broad term for tree diagrams.

Where did the idea for a tree come from?

• Charles Darwin is credited with the earliest representation of a phylogenetic tree, published in his book On the Origin of Species (1859).

What does this tree look like?

• There are many different ways to represent the information found in a phylogenetic tree.

• The basic format of a tree is generally in one of the two forms shown, although there are other ways to represent the data.

What do the lines represent?

• Each line on the tree represents one particular organism of interest.

• The length of the lines is used to determine how closely two organisms are related to one another, or how long ago they may have had a common ancestor.

• The line that connects all the other lines represents the common ancestor against which the other organisms are compared.

The “Rooted” vs. “Unrooted” tree

• A rooted tree is used to make inferences about the most recent common ancestor of the leaves or branches of the tree. Most commonly the root is placed using an “outgroup”, a taxon known to lie outside the group being studied.

• An unrooted tree illustrates the relatedness of the leaves or branches without making assumptions about a common ancestor.

The bifurcating tree

• A tree that bifurcates has a maximum of 2 descendants arising from each of the interior nodes.

The multi-furcating tree

• A tree that multi-furcates has more than two descendants arising from at least one of its interior nodes.
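
The difference between the two tree shapes can be checked mechanically. The sketch below assumes a tree is written as nested Python tuples (a leaf is a string, an interior node is a tuple of its descendants); this representation and the function name are illustrative assumptions, not part of the slides.

```python
def is_bifurcating(node):
    """Return True if no interior node has more than two descendants."""
    if isinstance(node, str):          # a leaf has no descendants to check
        return True
    if len(node) > 2:                  # more than two descendants: multifurcation
        return False
    return all(is_bifurcating(child) for child in node)


bifurcating = (("A", "B"), ("C", ("D", "E")))
multifurcating = (("A", "B", "C"), ("D", "E"))

print(is_bifurcating(bifurcating))     # True
print(is_bifurcating(multifurcating))  # False
```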

Where do I go to make a tree?

• Many computational biology packages include dendrogram programs.

• An example is a free program available via EMBL-EBI (European Bioinformatics Institute) called ClustalW or ClustalX. – You pick the program based on the interface you want to use, i.e. command line (ClustalW) versus graphical interface (ClustalX).

What criteria are important when building a tree?

• There are many different things that you should consider as you get set to build your tree.

• Some examples are:

• Efficiency

• Power

• Consistency/Reliability

• Robustness

Limitations to the use of trees

• It is important to remember that trees do have limitations. For example, trees are meant to provide insight into a research question and not intended to represent an entire species history.

• Several factors, like gene transfers, may affect the output placed into a tree.

• Known limitations related to DNA degradation over time must be considered, especially in the case of evolutionary trees aimed at ancient or extinct organisms.

PHYLOGENETIC TREE FORMATION

Accession no.

Nucleotide Blast

Sequences of the related organisms obtained.

Note pad file.

Get phylogenetic tree at http://align.genome.jp/

Biological sequence databases

• Electronic reservoir of information.

• Nucleotide sequence databases - GenBank (NCBI) in collaboration with DDBJ & EMBL.

• Protein sequence databases - SWISS-PROT & PIR.

• Molecular structure database - PDB.

• Information database - OMIM (Online Mendelian Inheritance in Man).

• Literature databases - Medline, AIDSLINE.

What is a nucleotide sequence?

• The order of nucleotides in a DNA or RNA molecule.

CGTAACCAAGGTTAACCTTGGTTACG

• A succession of any number of nucleotides greater than four is liable to be called a sequence.
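
In software a nucleotide sequence is usually handled as a plain string of letters. The sketch below checks that a string uses only DNA (A, C, G, T) or RNA (A, C, G, U) letters and counts the bases in the example sequence from this slide; the function name is a hypothetical helper chosen for illustration.

```python
def is_nucleotide_sequence(seq):
    """True if seq is non-empty and uses only DNA (ACGT) or RNA (ACGU) letters."""
    letters = set(seq.upper())
    return len(seq) > 0 and (letters <= set("ACGT") or letters <= set("ACGU"))


seq = "CGTAACCAAGGTTAACCTTGGTTACG"
print(is_nucleotide_sequence(seq), len(seq))         # True 26
print({base: seq.count(base) for base in "ACGT"})    # {'A': 7, 'C': 6, 'G': 6, 'T': 7}
```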

What is an alignment?

• Sequence alignment is an arrangement of two or more sequences, highlighting their similarity.

tcctctgcctctgccatcat---caaccccaaagt
|||| ||| ||||| |||||   ||||||||||||
tcctgtgcatctgcaatcatgggcaaccccaaagt

Why align sequences?

• To find evolutionary relationships between two or more genes or proteins.

• To find structurally or functionally similar regions within proteins.

SEQUENCE ALIGNMENT

MATCH: Corresponding nucleic acid sequences are vertically aligned.

ATGGCAT
ATGGCAT

GAP: When a residue seems to have been deleted or inserted, represented by dashes (INDELS).

ATGAGCAT
ATG-GCAT
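
Once two sequences are written one above the other, every column is a match, a mismatch or a gap. A minimal sketch that counts the three column types for the two examples on this slide (the function name and the returned dictionary layout are assumptions made for illustration):

```python
def summarize_alignment(top, bottom):
    """Count match, mismatch and gap columns of two aligned sequences."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    counts = {"match": 0, "mismatch": 0, "gap": 0}
    for x, y in zip(top, bottom):
        if x == "-" or y == "-":       # a dash marks an insertion/deletion
            counts["gap"] += 1
        elif x == y:
            counts["match"] += 1
        else:
            counts["mismatch"] += 1
    return counts


print(summarize_alignment("ATGGCAT", "ATGGCAT"))    # {'match': 7, 'mismatch': 0, 'gap': 0}
print(summarize_alignment("ATGAGCAT", "ATG-GCAT"))  # {'match': 7, 'mismatch': 0, 'gap': 1}
```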

Example: sequence alignment

Task: align “abcdef” with “abdgf”

Write second sequence below the first

abcdef
abdgf

Move sequences to give maximum match between them.

Show characters that match using vertical bar

Example sequence alignment

abcdef
||
abdgf

Insert gap between b and d on lower sequence to allow d and f to align

Example sequence alignment

abcdef
|| | |
ab-dgf

Note that e and g don't match, but this is the best alignment that can be produced.
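
The manual procedure above can be automated with dynamic programming. The sketch below uses the Needleman-Wunsch global alignment algorithm, which the slides do not name; the scoring values (match +1, mismatch -1, gap -1) are assumptions chosen for illustration. With these scores it reproduces the alignment shown on the slide.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,   # align a[i-1] with b[j-1]
                              score[i - 1][j] + gap,       # gap in b
                              score[i][j - 1] + gap)       # gap in a
    # Trace back from the bottom-right corner to recover one best alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), score[n][m]


top, bottom, total = needleman_wunsch("abcdef", "abdgf")
bars = "".join("|" if x == y else " " for x, y in zip(top, bottom))
print(top)      # abcdef
print(bars)     # || | |
print(bottom)   # ab-dgf
print(total)    # 2  (four matches, one mismatch, one gap)
```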

Sequence Alignment

Procedure for comparing two or more sequences by searching for patterns that are in the same order in the sequences.

Pair-wise alignment: Compare two sequences

Multiple sequence alignment: Compare more than two sequences

SEQUENCE ALIGNMENT

GLOBAL ALIGNMENT

Alignment from start to end.

LOCAL ALIGNMENT

Alignment covers only the regions of high similarity rather than the full length of the sequences.

Priority given to find conserved nucleotide patterns.
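
To illustrate local alignment, the sketch below uses the Smith-Waterman algorithm, which the slides do not name; the scoring values (match +2, mismatch -1, gap -1) and the two test sequences are made-up assumptions. Unlike a global alignment, scores are never allowed to drop below zero, so the result is the best-matching region only.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Return (best_score, aligned_part_of_a, aligned_part_of_b)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(0,                            # start a new local alignment
                              score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)
    # Trace back from the highest-scoring cell until the score reaches zero.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return best, "".join(reversed(top)), "".join(reversed(bottom))


# Only the shared core "ACGT" is conserved between the two sequences.
print(smith_waterman("TTTACGTCCC", "GGGACGTAAA"))   # (8, 'ACGT', 'ACGT')
```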

PHYLOGENETIC ANALYSIS

Taxon = a phylogenetically distinct unit on a tree

Taxonomy – naming & classifying organisms

Systematics – naming & classifying organisms according to their evolutionary relationships

Phylogenetics – reconstructing the evolutionary relationships among organisms

Phylogenetic tree – hypothesized genealogy traced back to the last common ancestor through hierarchical, dichotomous branching

Cladistics – the principles that guide the production of phylogenetic trees, a.k.a., cladograms

PHYLOGENETIC ANALYSIS

Node – branch point, speciation event

Lineage or clade – an entire branch

A clade is a monophyletic group, i.e., an ancestral species and all of its descendants

A polyphyletic group lacks the common ancestor of species in the group

E.g., if the Class Reptilia is to be monophyletic, birds must be included!
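
The Reptilia example can be checked in code: a group is monophyletic when it is exactly the set of leaves below some node. The sketch below assumes a nested-tuple tree and a deliberately simplified, illustrative topology (mammals as outgroup, lizards plus snakes as sister to crocodilians plus birds); both are assumptions, not data from the slides.

```python
def leaves(node):
    """Return the set of leaf names below a node (a leaf is a string)."""
    if isinstance(node, str):
        return {node}
    return set().union(*(leaves(child) for child in node))


def is_monophyletic(tree, group):
    """True if some node's complete set of descendants equals the group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Simplified, illustrative topology (assumption).
tree = ("mammals", (("lizards", "snakes"), ("crocodilians", "birds")))

print(is_monophyletic(tree, {"lizards", "snakes", "crocodilians"}))           # False
print(is_monophyletic(tree, {"lizards", "snakes", "crocodilians", "birds"}))  # True
```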

CLADOGRAM

Branch lengths are not drawn in proportion to evolutionary distance.

PHYLOGRAM

The branch lengths are drawn in a scale proportional to evolutionary distances.

[Figure: example tree with nodes labelled A to K]
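
In a file, the cladogram/phylogram distinction comes down to whether branch lengths are stored. The sketch below writes the same small tree in Newick notation (the parenthesised text format commonly used to exchange trees; its use here is an assumption, as are the tree and its branch lengths) once without and once with branch lengths.

```python
def to_newick(node, with_lengths=True):
    """Serialize a (label, branch_length, children) node to Newick text."""
    label, length, children = node
    text = label
    if children:
        inner = ",".join(to_newick(child, with_lengths) for child in children)
        text = "(" + inner + ")" + label
    if with_lengths and length is not None:
        text += ":" + str(length)       # branch length leading to this node
    return text


# Hypothetical three-leaf tree; the root has no branch length.
tree = ("", None, [
    ("A", 0.1, []),
    ("", 0.05, [("B", 0.2, []), ("C", 0.3, [])]),
])

print(to_newick(tree, with_lengths=False) + ";")   # (A,(B,C));  topology only
print(to_newick(tree) + ";")                       # (A:0.1,(B:0.2,C:0.3):0.05);
```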

Phylogenetic tree

• A phylogenetic tree represents the evolutionary relationships among different life-forms

– Each internal node represents the most recent common ancestor of its descendants.

– Edge lengths correspond to time estimates.

– Each node in a phylogenetic tree is called a taxonomic unit.
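
Because each internal node stands for the most recent common ancestor of everything below it, that ancestor can be found by walking from two leaves towards the root until the paths meet. A minimal sketch, assuming the tree is stored as a child-to-parent dictionary with made-up node names:

```python
def path_to_root(parent, node):
    """List the node and all of its ancestors, ending at the root."""
    path = [node]
    while node in parent:
        node = parent[node]
        path.append(node)
    return path


def mrca(parent, a, b):
    """Most recent common ancestor of nodes a and b (None if unrelated)."""
    ancestors_of_a = set(path_to_root(parent, a))
    for node in path_to_root(parent, b):
        if node in ancestors_of_a:
            return node
    return None


# Hypothetical rooted tree: ((A,B)n1, C)n2 and D hang off the root.
parent = {"A": "n1", "B": "n1", "n1": "n2", "C": "n2", "n2": "root", "D": "root"}

print(mrca(parent, "A", "B"))   # n1
print(mrca(parent, "A", "C"))   # n2
print(mrca(parent, "A", "D"))   # root
```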

Types of Trees

Unrooted trees illustrate relatedness of leaf nodes without making assumptions about ancestry.

• Rooted tree

– Directed tree

– Unique node corresponding to the most recent common ancestor

– Leaves represent the entities

[Figure: the example tree (nodes A to K) drawn as a ROOTED TREE and as an UNROOTED TREE]

[Figure: rooted example tree (nodes A to K) with these parts labelled: ROOT, Branch, INTERNAL NODES (HTU), EXTERNAL NODES (OTU), Outgroup, Ingroup]

OTU = Operational taxonomic unit (can represent many types of comparable data)

HTU = Hypothetical taxonomic unit (hypothetical progenitors of OTUs)

Phylogenetic Analysis-4 Steps

• 1. Alignment

• 2. Determining substitution models

• 3. Tree building

• 4. Tree evaluation

1. Alignment

• Multiple sequence alignment (Clustal X)

• Manual editing of alignment.

• Submission to tree building program (Treecon, Mega, Phylip).

(Guide tree from Clustal W is formatted as a PHYLIP tree and can be imported into various tree drawing programs.)

2. Determining substitution models.

A substitution model is selected keeping in mind how it handles:

Variations in length

Insertions

Deletions

Introduction of gaps

3. Tree building

TREE BUILDING METHODS

Distance-Based Methods

• Count the number of differences between sequences.

• This number is called the evolutionary distance.

• Neighbor-joining

• Fitch-Margoliash

• UPGMA

Character-Based Methods

• Derive trees that optimize the distribution of the actual data pattern for each character.

• Max. Parsimony

• Max. Likelihood
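
As an illustration of a distance-based method, the sketch below implements UPGMA, the simplest of the three listed: it repeatedly joins the two closest clusters and averages their distances to the rest. The distance values are invented for the example, the tree is returned as nested tuples, and branch heights are left out to keep the code short.

```python
from itertools import combinations


def upgma(names, dist):
    """Build a tree (nested tuples) from pairwise distances.

    dist maps frozenset({a, b}) to the distance between a and b.
    """
    clusters = {name: 1 for name in names}      # cluster -> number of leaves
    d = dict(dist)
    while len(clusters) > 1:
        # Find the closest pair of current clusters.
        x, y = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        size_x, size_y = clusters.pop(x), clusters.pop(y)
        merged = (x, y)
        # Distance to the merged cluster is the size-weighted average.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_x * d[frozenset((x, other))] + size_y * d[frozenset((y, other))]
            ) / (size_x + size_y)
        clusters[merged] = size_x + size_y
    return next(iter(clusters))


# Made-up distances: A and B are close, C and D are close, the rest are far.
distances = {
    frozenset(("A", "B")): 2, frozenset(("C", "D")): 4,
    frozenset(("A", "C")): 8, frozenset(("A", "D")): 8,
    frozenset(("B", "C")): 8, frozenset(("B", "D")): 8,
}
result = upgma(["A", "B", "C", "D"], distances)
print(result)   # (('A', 'B'), ('C', 'D'))
```

UPGMA assumes a constant rate of change along all branches; neighbor-joining drops that assumption at the cost of a slightly longer algorithm.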

Submitting sequence to DataBank

Open the link http://www.ncbi.nlm.nih.gov/index.html

Click on GenBank

Submission to GenBank

BankIt

Can submit one or few sequences

Used when sequence is not complicated

Sequin

Use for long and complicated situations

Can submit mutation, phylogenetic, population, environmental, or segmented sets

• Click on BankIt.

(Any number of sequences can be entered together.)

• Fill in the form with the details of the isolate.

Construction of phylogenetic tree

Software required: CLUSTAL X and TREECON

STEPS FOR TREE CONSTRUCTION:

1. Go to the NCBI homepage (http://www.ncbi.nlm.nih.gov/).

2. Select “Nucleotide” in the search box, enter the accession number “EU873539” and click “GO”.

3. The sequence record will be displayed; view it in FASTA format by clicking on “FASTA”.

4. Open the “Ribosomal Database” homepage.

5. Click on “seq match”.

6. Paste the sequence from the NCBI site into the given space on this page, click on “Type Isolates” and then on “submit”.

7. Click on “printer friendly results” to show them.

8. Take all the accession numbers one by one, go to the NCBI site and retrieve the sequence in FASTA format for each.

9. Prepare a Notepad file containing all these sequences.
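
The Notepad file built in these steps is a multi-record FASTA file: each record is a header line starting with ">" followed by the sequence. The sketch below reads and writes that layout; the record names and sequences are placeholders, not the real entries retrieved in the steps above.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict {header: sequence}."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]            # drop the leading ">"
            records[header] = ""
        elif header is not None:
            records[header] += line      # sequences may span several lines
    return records


def to_fasta(records, width=60):
    """Write a dict {header: sequence} as FASTA text, wrapping long lines."""
    lines = []
    for header, seq in records.items():
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Placeholder records for illustration only.
text = ">seq1 example\nATGGCAT\nATGAGCAT\n>seq2 example\nATGGCAT\n"
records = parse_fasta(text)
print(records)   # {'seq1 example': 'ATGGCATATGAGCAT', 'seq2 example': 'ATGGCAT'}
print(to_fasta(records), end="")
```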

• Open “CLUSTAL X” (multiple alignment mode).

• Click on “file” and then select “remove gap only columns”.

• Click on “alignment” and select “output format option”.

• From “output format options” choose “PHYLIP” format and then “CLOSE”.

• Click on “alignment” and select “do complete alignment”.

• After alignment, three files will be produced:

.dnd file (guide tree)

.phy file (alignment in PHYLIP format)

.aln file (alignment in Clustal format)
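
For orientation, the sketch below writes an alignment in the classic sequential PHYLIP layout: a first line with the number of sequences and the alignment length, then one line per sequence with the name padded to 10 characters. This layout is an assumption about the general format; the file Clustal X actually writes may differ in spacing and may be interleaved. The sequence names are placeholders.

```python
def to_phylip(alignment):
    """Format {name: aligned_sequence} as sequential PHYLIP text."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    lines = [f" {len(alignment)} {lengths.pop()}"]
    for name, seq in alignment.items():
        # Names are truncated or padded to exactly 10 characters.
        lines.append(f"{name[:10]:<10}{seq}")
    return "\n".join(lines) + "\n"


phylip_text = to_phylip({"seq1": "ATGAGCAT", "seq2": "ATG-GCAT"})
print(phylip_text, end="")
```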

Open “TREECON”

Click on “distance estimation”

Click on “start distance estimation”

Load “PHY FILE”

Click on “open”

The “select sequences” window will open.

Choose “select all” and then OK.

• Select “Jukes and Cantor”, “all”, and “YES” for bootstrap analysis, then “OK”.

• Select “100” for bootstrap analysis and then “ok”

• To finish select “ok”.
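
The “Jukes and Cantor” option corrects the raw proportion of differing sites p for multiple substitutions at the same site, using d = -3/4 ln(1 - 4p/3). The sketch below applies it to the two sequences from the alignment example earlier in the slides; skipping gap columns is an assumption made here, and TREECON's own gap handling may differ.

```python
import math


def p_distance(s1, s2):
    """Proportion of differing sites, ignoring columns that contain a gap."""
    pairs = [(x, y) for x, y in zip(s1, s2) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def jukes_cantor(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3); defined for p < 0.75."""
    if p >= 0.75:
        raise ValueError("distance is undefined for p >= 0.75")
    return -0.75 * math.log(1 - 4 * p / 3)


s1 = "tcctctgcctctgccatcat---caaccccaaagt"
s2 = "tcctgtgcatctgcaatcatgggcaaccccaaagt"
p = p_distance(s1, s2)
print(p)                            # 0.09375  (3 differences over 32 gap-free sites)
print(round(jukes_cantor(p), 4))    # 0.1001
```

The corrected distance is slightly larger than p because some sites are assumed to have changed more than once.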

Select “infer tree topology” from the TREECON window.

Select “inferring tree topology”, choose “neighbour joining”, “yes” for bootstrap analysis, and “ok”.

Select “root unrooted trees” from the TREECON window.

Select “start rooting unrooted trees”.

Select “single sequence (forced)”, “yes”, and then “ok”. Then select “ok” to finish.

Select “draw phylogenetic tree”

• Select “load new tree”(just below file option). Tree will display.

• Select “add bootstrap values”{70/ABC}

• Select “1” and then “ok”

• Select “distance scale” {0.1}, select “0.1-0.15-0.02” and then “ok”

• Go to “file”, select “save tree”, then “as treecon file”, and save the tree.

• This tree is still incomplete because it lacks the names of the organisms.
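
The bootstrap values added to the tree come from resampling the alignment: each replicate draws alignment columns at random with replacement, a tree is built from every replicate, and the value on a branch is the percentage of replicate trees that contain it. The sketch below shows only the resampling step, with a placeholder alignment and a fixed random seed as assumptions; it is not TREECON's implementation.

```python
import random


def bootstrap_replicates(alignment, n_replicates=100, seed=0):
    """Yield resampled alignments whose columns are drawn with replacement."""
    rng = random.Random(seed)
    names = list(alignment)
    length = len(alignment[names[0]])
    for _ in range(n_replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        # Every sequence uses the same sampled columns, so columns stay intact.
        yield {name: "".join(alignment[name][c] for c in columns) for name in names}


alignment = {"seq1": "ATGAGCAT", "seq2": "ATG-GCAT"}   # placeholder data
replicates = list(bootstrap_replicates(alignment, n_replicates=100))
print(len(replicates))    # 100
print(replicates[0])      # one resampled alignment of the same length
```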

• To add sequence names to the phylogenetic tree:

• Click on “customize”, then select “sequence names” and then “change”.

• Click on the “black boxes” one by one, enter the name in the “new name” box in the title window, and then click “ok”.

• After entering all the names, save the tree. This is the required tree.

THANK YOU