Analysis of GO annotation at cluster level by H. Bjørn Nielsen Slides from Agnieszka S. Juncker

Post on 22-Dec-2015


Analysis of GO annotation at cluster level

by H. Bjørn Nielsen
Slides from Agnieszka S. Juncker


The DNA Array Analysis Pipeline

• Question / Experimental Design

• Array design / Probe design

• Buy Chip/Array

• Sample Preparation / Hybridization

• Image analysis

• Normalization

• Expression Index Calculation

• Comparable Gene Expression Data

• Statistical Analysis / Fit to Model (time series)

• Advanced Data Analysis: Clustering, PCA, Classification, Promoter Analysis, Meta analysis, Survival analysis, Regulatory Network

• GO annotations


Gene Ontology

Gene Ontology (GO) is a collection of controlled vocabularies describing the biology of a gene product in any organism

There are 3 independent sets of vocabularies, or ontologies:

• Molecular Function (MF)
  – e.g. "DNA binding" and "catalytic activity"

• Cellular Component (CC)
  – e.g. "organelle membrane" and "cytoskeleton"

• Biological Process (BP)
  – e.g. "DNA replication" and "response to stimulus"


Gene Ontology structure


GO structure, example 2


KEGG pathways

• KEGG PATHWAYS:
  – collection of manually drawn pathway maps representing our knowledge on the molecular interaction and reaction networks, for a large selection of organisms

• 1. Metabolism
  – Carbohydrate, Energy, Lipid, Nucleotide, Amino acid, Other amino acid, Glycan, PK/NRP, Cofactor/vitamin, Secondary metabolite, Xenobiotics

• 2. Genetic Information Processing

• 3. Environmental Information Processing

• 4. Cellular Processes

• 5. Human Diseases

• 6. Drug Development


KEGG pathway example 1


KEGG pathway example 2


Cluster analysis and GO

Analysis example:

• Partitioning clustering of genes into e.g. 10 clusters based on expression profiles

• Assignment of GO terms to genes in clusters

• Looking for GO terms overrepresented in clusters
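The three steps above can be sketched roughly as follows. This is a minimal illustration and not the method used in the course: the cluster assignments are assumed to come from an earlier clustering step, the gene names and annotations are hypothetical toy data, the function names `upper_tail` and `go_enrichment` are made up for this example, and the background population is assumed to be all clustered genes.

```python
from collections import defaultdict
from math import comb


def upper_tail(population, annotated, cluster_size, hits):
    """P(X >= hits): chance of seeing at least `hits` annotated genes when
    `cluster_size` genes are drawn without replacement from `population`
    genes, of which `annotated` carry the GO term."""
    total = comb(population, cluster_size)
    top = min(cluster_size, annotated)
    favourable = sum(
        comb(annotated, k) * comb(population - annotated, cluster_size - k)
        for k in range(hits, top + 1)
    )
    return favourable / total


def go_enrichment(gene_cluster, gene_terms):
    """Return {(cluster, term): p_value} for every term seen in a cluster.

    Assumption: the background population is every gene in `gene_cluster`,
    including genes without any GO annotation.
    """
    population = len(gene_cluster)

    # Step 1 (assumed done already): genes grouped by cluster label.
    clusters = defaultdict(set)
    for gene, cluster in gene_cluster.items():
        clusters[cluster].add(gene)

    # Step 2: assign GO terms to genes, indexed by term.
    term_genes = defaultdict(set)
    for gene, terms in gene_terms.items():
        if gene in gene_cluster:
            for term in terms:
                term_genes[term].add(gene)

    # Step 3: test each term for overrepresentation in each cluster.
    results = {}
    for cluster, members in clusters.items():
        for term, annotated in term_genes.items():
            hits = len(members & annotated)
            if hits:
                results[(cluster, term)] = upper_tail(
                    population, len(annotated), len(members), hits
                )
    return results


# Hypothetical toy data: 8 genes in 2 clusters, with made-up annotations.
gene_cluster = {"g1": 1, "g2": 1, "g3": 1,
                "g4": 2, "g5": 2, "g6": 2, "g7": 2, "g8": 2}
gene_terms = {
    "g1": {"DNA replication"},
    "g2": {"DNA replication"},
    "g3": {"DNA replication", "response to stimulus"},
    "g4": {"response to stimulus"},
    "g5": {"response to stimulus"},
}

enrichment = go_enrichment(gene_cluster, gene_terms)
for (cluster, term), p in sorted(enrichment.items(), key=lambda kv: kv[1]):
    print(f"cluster {cluster}  {term:22s}  p = {p:.3g}")
```

With many clusters and many GO terms, a large number of tests is performed, so a correction for multiple testing would normally be considered before interpreting the smallest p-values.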


Hypergeometric test

• The hypergeometric distribution arises from sampling from a fixed population.

• We want to calculate the probability of drawing 7 or more white balls in a sample of 10 balls, given an urn that contains 20 white balls out of 100 balls
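The urn example can be worked through directly. The sketch below sums the hypergeometric probabilities for 7, 8, 9 and 10 white balls using only the Python standard library; the function name `hypergeom_upper_tail` and its parameter names are chosen for this illustration and do not come from the slides.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, observed):
    """P(X >= observed) when `draws` balls are taken without replacement
    from `population` balls, of which `successes` are white.

    Each term is C(successes, k) * C(population - successes, draws - k)
    divided by C(population, draws).
    """
    total = comb(population, draws)
    upper = min(draws, successes)
    favourable = sum(
        comb(successes, k) * comb(population - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return favourable / total


# Urn from the slide: 100 balls, 20 of them white, 10 drawn, 7 or more white.
p = hypergeom_upper_tail(population=100, successes=20, draws=10, observed=7)
print(f"P(at least 7 white balls) = {p:.3g}")
```

In the GO setting the urn corresponds to all genes, the white balls to the genes annotated with a given GO term, and the drawn balls to the genes in one cluster.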


Yeast cell cycle

Time series experiment: [Figure: sampling at successive time points along a time axis]

Gene expression profiles: [Figure: expression of Gene1 and Gene2 plotted against time]


Exercise

Find it on the course page