Function preserves sequences
Christophe Roos - MediCel ltd - christophe.roos@medicel.fi
Similarity is the result of conservation or convergent evolution: it has a reason for being
Mutations change sequences
Spring 2002 - Christophe Roos - 3/6 Sequence databases & comparison
The public biological databases
• EMBL or GenBank or DDBJ for DNA
  – emblnew for daily updates, merged into the main DB 4x/year
• SwissProt or PIR for proteins
  – Trembl, tremblnew, remtrembl
• PDB for structures
• In flat file format, yet quite informative and convertible
  – Fasta format is a 'universal' sequence format: the first line starts with '>' followed by free text. The second line has the start of the sequence (50 or 60 characters per line). Use the first line for the name or the Accession Number (AC)
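The Fasta layout described above can be read with a few lines of code. The following is a minimal sketch, not a reference parser: the function name `parse_fasta` and the two example records are made up for illustration, and it assumes plain text input with one or more '>' header lines.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from Fasta-formatted text."""
    header = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:].strip()
            chunks = []
        elif header is not None:
            # Sequence lines (typically 50 or 60 characters) are concatenated.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Hypothetical records, invented for this example.
example = """\
>seq1 made-up example record
GATCGGAATAG
GACGGATTAG
>seq2
ACGT
"""

records = list(parse_fasta(example))
for name, sequence in records:
    print(name, len(sequence))
```

The free text after '>' is returned unchanged, so a name or an Accession Number placed there stays available to the caller.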
Database homes
• The European database home is in Hinxton, Cambridge, UK: European Bioinformatics Institute - EBI
  – http://www.ebi.ac.uk
  – Access through the Sequence Retrieval System, SRS
• The American database home is in Bethesda, Maryland, near Washington DC: National Center for Biotechnology Information - NCBI
  – http://www.ncbi.nlm.nih.gov
  – Access through Entrez
• Both centers exchange their data on a daily basis; however, there are differences in annotations, consistency, speed and quality.
• There is also a Japanese database provider, DDBJ.
A look at one entry from EMBL
part 1/3
A look at one entry from EMBL
part 2/3
A look at one entry from EMBL
part 3/3
The feature table of the entry contains several linked items, such as exon-assembly (mRNA) and coding sequence (CDS). There are also cross-references to other databases.
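As a rough illustration of how the feature table and cross-references could be pulled out of a flat file entry, here is a minimal sketch. It assumes the usual EMBL flat-file convention of a two-letter line code followed by content from column 6 (FT for feature table lines, DR for database cross-references). The function name `summarize_embl_entry` and the sample entry are invented for this example and do not correspond to any real record.

```python
def summarize_embl_entry(text):
    """Return (features, xrefs) from EMBL-style flat-file text.

    features: list of (feature key, location) tuples taken from FT lines
    xrefs:    list of database names taken from DR lines
    """
    features = []
    xrefs = []
    for line in text.splitlines():
        code, content = line[:2], line[5:]
        if code == "FT":
            # Feature keys start at the first content column;
            # qualifier lines (/gene=...) are indented and skipped here.
            if content and not content[0].isspace():
                parts = content.split(None, 1)
                location = parts[1].strip() if len(parts) > 1 else ""
                features.append((parts[0], location))
        elif code == "DR":
            # The database name is the first ';'-separated field.
            xrefs.append(content.split(";")[0].strip())
    return features, xrefs


# Made-up entry fragment, for illustration only.
sample = """\
ID   HYPOTHETICAL1; made-up entry for illustration
FT   mRNA            join(1..120,300..450)
FT                   /gene="example"
FT   CDS             join(30..120,300..400)
DR   EXAMPLEDB; X00000; made-up cross-reference.
//
"""

features, xrefs = summarize_embl_entry(sample)
print(features)
print(xrefs)
```

In this sketch the mRNA and CDS features share part of their locations, which is how the linked items of an entry relate an exon assembly to its coding sequence.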
A look at one entry from SwissProt
The eyeless gene: a master regulatory gene in eye formation
The effect of the eyeless gene
The eyeless gene is a master regulatory gene in eye formation
• When it is absent, no eyes are formed
• When it is present where it should not be, it induces eye formation
Figure panels: Normal; Overexpressed in antennae and wings; Absent
A look at one entry from SwissProt
Part 2: the annotations about the function and location
A look at one entry from SwissProt
Part 3: The feature table and the amino acid sequence
A look at one entry from SwissProt
The eyeless gene is also called PAX6 and can be found in several species: birds, mammals, reptiles, fish, invertebrates
Sequence comparison – why?
• Function by analogy: If sequences are conserved, their function is probably also conserved.
• Functional domains: If some parts of the sequences are more conserved than other parts, there must be an underlying biological reason for it.
• Establishing relationships/differences in function: By quantifying sequence relationships, it is possible to estimate the function of novel genes.
• Establishing relationships between species
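One simple way to quantify a sequence relationship is the percent identity of two already aligned sequences. The sketch below is only one possible convention: it ignores columns that contain a gap ('-'), and the function name `percent_identity` is made up for this example. The pair used is the one shown on the metrics slide of this deck.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over the gap-free columns of an alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # assumption: gap columns are not counted
        compared += 1
        if a == b:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


print(percent_identity("GA-CGGATTAG", "GATCGGAATAG"))  # 9 identical of 10 compared columns
```

Other conventions divide by the full alignment length or by the shorter sequence, so the number is only comparable between runs that use the same rule.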
Sequence comparison – how?
• Compare two sequences of similar length
• Compare two sequences of very different length
• Compare several sequences
• Allow gaps or not?
• Scoring: yes-no or good-intermediate-bad
• The best or all above a threshold?
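Several of these choices can be made concrete with a small sketch: a short sequence is slid along a much longer one without gaps, each position gets a yes-no score (1 for identical, 0 otherwise), and the caller asks either for all placements above a threshold or for the best one. The function names `ungapped_hits` and `best_hit`, the query and the threshold are assumptions made for this example.

```python
def ungapped_hits(query, target, threshold):
    """Return (offset, score) for every ungapped placement scoring >= threshold."""
    hits = []
    for offset in range(len(target) - len(query) + 1):
        window = target[offset:offset + len(query)]
        # Yes-no scoring: one point per identical position.
        score = sum(1 for q, t in zip(query, window) if q == t)
        if score >= threshold:
            hits.append((offset, score))
    return hits


def best_hit(query, target):
    """Return the first highest-scoring placement, or None if query is longer."""
    hits = ungapped_hits(query, target, threshold=0)
    return max(hits, key=lambda hit: hit[1]) if hits else None


print(ungapped_hits("GGAT", "GATCGGAATAG", threshold=3))
print(best_hit("GGAT", "GATCGGAATAG"))
```

Allowing gaps or using graded scores (good-intermediate-bad) requires a scoring matrix and a dynamic programming method, which this sketch deliberately leaves out.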
Sequence comparison – metrics
• The scoring matrix
• The score for a match
• The penalty for a mismatch
• The penalty for the insertion of a gap (gap-open)
• The penalty for extending a gap (gap-length)
• Local or global similarities?
GA-CGGATTAG
GATCGGAATAG
Labels in the figure: match, mismatch, gap
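The metrics can be combined into a single score for the alignment shown above. The following is a minimal sketch with assumed values (match +1, mismatch -1, gap-open -2, gap-length -1); the slide gives no numbers, so these parameters and the function name `score_alignment` are illustrative only. A gap of n columns is charged one gap-open plus (n - 1) gap-length penalties, and a flat match/mismatch score stands in for a full scoring matrix.

```python
def score_alignment(top, bottom, match=1, mismatch=-1, gap_open=-2, gap_extend=-1):
    """Score a given pairwise alignment; '-' marks a gap column."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    previous_gap = None  # which sequence carried a gap in the previous column
    for a, b in zip(top, bottom):
        if a == "-" or b == "-":
            current_gap = "top" if a == "-" else "bottom"
            # Continuing a gap in the same sequence costs less than opening one.
            score += gap_extend if current_gap == previous_gap else gap_open
            previous_gap = current_gap
        else:
            score += match if a == b else mismatch
            previous_gap = None
    return score


# 9 matches, 1 mismatch, 1 gap of length 1 under the assumed parameters.
print(score_alignment("GA-CGGATTAG", "GATCGGAATAG"))
```

This scores the whole alignment end to end, which corresponds to the global view; a local comparison would instead look for the best-scoring sub-region.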