
Post on 24-Jan-2016

Page 1: Survey of Misannotations and  Pseudogenes in the Arabidopsis Genome

Survey of Misannotations and Pseudogenes in the Arabidopsis Genome

Tanmay Prakash

Page 2: Objectives

Why
• Misannotation can hinder research
• Pseudogenes can be used to study natural selection

Objectives
• Find possible misannotations
• Find possible pseudogenes

Page 3: Misannotations

Many misannotations arise when a gene prediction program mislabels a region as an intron because it contains a stop codon.

[Diagram: gene model with labeled regions UTR, CDS, Intron, CDS, UTR]

Page 4: Pseudogenes

Pseudogenes are DNA sequences that no longer function but resemble the functional genes they once were. There are two types:
• Processed
• Non-processed

Common Properties of Pseudogenes
• Stop codons
• Frameshift mutations
• Lack of selective pressure

Example: deleting a single base shifts the reading frame and introduces stop codons (shown as "."):

agtacatgcataggactcgatcgactc   translates to   STCIGLDRL
agtacatgataggactcgatcgactc    translates to   ST..DSID
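The two example sequences on this slide can be checked with a short translation script. This is a minimal sketch, assuming the standard genetic code and translation in the first reading frame; the function name `translate` and the use of "." for stop codons (chosen to match the slide) are illustrative choices, not part of the original pipeline.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
# Stop codons are written as "." to match the notation on the slide.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY..CC.W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1, ignoring any trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


functional = "agtacatgcataggactcgatcgactc"
pseudogene = "agtacatgataggactcgatcgactc"  # the "c" after "atg" is deleted

print(translate(functional))  # STCIGLDRL
print(translate(pseudogene))  # ST..DSID
```

The deletion moves every downstream codon out of frame, so two stop codons (tga, tag) appear immediately after the second residue.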

Page 5: Pipeline

[Flowchart, reconstructed from the slide labels]
• Search 1: query = protein domains, subject = Arabidopsis introns, giving genes matching in introns
• Search 2: query = protein domains, subject = Arabidopsis CDS, giving genes matching in CDS
• Search tools shown: BLAST search, HMMER search
• Genes matching in both: possibly misannotated genes
• Check for stop codons and frameshifts, check Ka/Ks: possible pseudogenes

Page 6

[Flowchart, search stage]
• Search 1: query = protein domains, subject = Arabidopsis introns, giving genes matching in introns
• Search 2: query = protein domains, subject = Arabidopsis CDS, giving genes matching in exons
• Search tools shown: BLAST search, HMMER search

Page 7

[Flowchart, final stage]
• Genes matching in both: possibly misannotated genes
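The "matching in both" step can be sketched as a set intersection. This is only an illustration under stated assumptions: search hits are assumed to have already been reduced to (gene ID, domain ID) pairs, the gene and domain identifiers below are hypothetical, and the requirement that the same domain match in both places follows the wording of the Results slide rather than anything shown in the flowchart.

```python
from collections import defaultdict


def candidate_misannotations(intron_hits, cds_hits):
    """Map each gene to the domains that hit both its introns and its CDS.

    Both arguments are iterables of (gene_id, domain_id) pairs.
    """
    shared_pairs = set(intron_hits) & set(cds_hits)
    candidates = defaultdict(set)
    for gene, domain in shared_pairs:
        candidates[gene].add(domain)
    return dict(candidates)


# Hypothetical hits, not data from the study.
intron_hits = [("geneA", "PF00001"), ("geneB", "PF00002"), ("geneC", "PF00003")]
cds_hits = [("geneA", "PF00001"), ("geneB", "PF00009"), ("geneD", "PF00003")]

print(candidate_misannotations(intron_hits, cds_hits))
```

Here only geneA is kept: geneB matches different domains in its introns and its CDS, and PF00003 hits two different genes.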

Page 8: Results

There were 346 genes (alternative gene models not counted separately) that had matches to the same domain in both introns and exons.

There were 299 genes (alternative gene models not counted separately) that had matches to the same domain in an intron and its flanking exons. These are most likely misannotations.
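The stricter "intron and flanking exons" criterion can be sketched as a positional check. This is a hedged illustration, not the method used in the study: it assumes each gene model is given as per-exon and per-intron sets of matching domain IDs, with intron i lying between exon i and exon i + 1. Whether one or both flanking exons must match is not stated on the slide, so the hypothetical `require_both` flag exposes both readings.

```python
def flanking_matches(exon_domains, intron_domains, require_both=False):
    """Return (intron_index, domain) pairs where a domain matches an intron
    and also matches an exon directly adjacent to that intron.

    exon_domains:   list of sets, one per exon, in gene order
    intron_domains: list of sets, one per intron, in gene order
    """
    if len(intron_domains) != len(exon_domains) - 1:
        raise ValueError("expected one intron between each pair of adjacent exons")
    found = []
    for i, in_intron in enumerate(intron_domains):
        left, right = exon_domains[i], exon_domains[i + 1]
        flank = (left & right) if require_both else (left | right)
        for domain in sorted(in_intron & flank):
            found.append((i, domain))
    return found


# Hypothetical gene model with three exons and two introns.
exons = [{"PF00001"}, {"PF00001", "PF00002"}, set()]
introns = [{"PF00001"}, {"PF00003"}]

print(flanking_matches(exons, introns))  # [(0, 'PF00001')]
```

The first intron is flagged because PF00001 also matches the exons on either side of it; the second intron matches a domain that appears in neither neighbouring exon.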

Page 9

4 domains with the most possible misannotations

Domain       Possible Misannotations   #Domains
PF01657.7    16                        76
PF02902.8    15                        32
PF06721.1    13                        3
PF07734.2    15                        113

Page 10: Domain Family Size vs Misannotations

[Plot: x-axis, Number of Domains in Family (0 to 3000); y-axis, Number of Misannotations (0 to 16)]

Page 11: Misannotation Frequency

[Plot: x-axis, Number of Genes Matching Domain (0 to 10000); y-axis, Percentage Misannotation (0 to 0.6)]

Page 12: Domain Gene Frequency

[Plot: x-axis, Number of Genes Matching Domain (0 to 10000); y-axis, Number of Misannotations (0 to 20)]

Page 13: Future Research

• Identify pseudogenes by looking for stop codons and frameshift mutations in the introns and by checking the Ka/Ks value
• Use a more recent database of domains
• Follow the same process for the rice genome
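For reference, the Ka/Ks value mentioned in the first item is conventionally defined as below. This is the standard textbook definition, not a formula taken from the slides. The symbols $N_d$, $S_d$, $N$ and $S$ are introduced here for illustration, and the uncorrected proportions are shown for simplicity; estimators used in practice usually add a correction for multiple substitutions at the same site.

```latex
\[
  K_a \approx \frac{N_d}{N}, \qquad
  K_s \approx \frac{S_d}{S}, \qquad
  \omega = \frac{K_a}{K_s}
\]
% N_d, S_d : observed nonsynonymous and synonymous differences
% N,   S   : numbers of nonsynonymous and synonymous sites
```

A ratio well below 1 indicates purifying selection acting on the protein sequence, while a ratio near 1 is what would be expected for a sequence under no selective pressure, which is one of the pseudogene properties listed earlier.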

Page 14: Acknowledgement

Dr. Shin-Han Shiu
Dr. Kosuke Hanada
Dr. Melissa Lehti-Shiu
Dr. Gail Richmond
HSHSP