Microarray experiments. Database and Analysis Tools

Kate Milova, cDNA Microarray Facility. MolGen retreat, March 24, 2005.


Page 1: Microarray experiments. Database and Analysis Tools

Kate Milova MolGen retreat March 24, 2005 1

Microarray experiments. Database and Analysis Tools.

Kate Milova

cDNA Microarray Facility

March 24, 2005

Page 2: Microarray experiments. Database and Analysis Tools

Outline.

Microarray platforms and services at AECOM: cDNA, Long Oligo, Affymetrix

Database (cDNA & Long Oligo) structure and content: printing information, chip layout, annotation

Annotation algorithms and data mining

On-line analysis tools: normalization, signal filtering, data set comparison

Statistical packages and analysis software

Summary

Page 3: Microarray experiments. Database and Analysis Tools

Microarray Platforms at AECOM.

Page 4: Microarray experiments. Database and Analysis Tools

How to choose a microarray platform.

Page 5: Microarray experiments. Database and Analysis Tools

Before starting your microarray experiment.

Page 6: Microarray experiments. Database and Analysis Tools

cDNA Microarray Facility. Home page.

Standard & Custom Arrays. Description & Prices
Hybridization, labeling, bioinformatics, workshops
Database for cDNA & Long Oligo Arrays. Analysis Pipeline
AECOM cDNA microarray facility. Supported publications
Useful links to analysis tools

Page 7: Microarray experiments. Database and Analysis Tools

Database for Analysis of Microarrays at AECOM. Contents.

Printing Information: Chip name, Species, Number of spots, Number of controls, Number of pen domains, Number of slides, Printing pattern, Distance between spots, Number of rows, Number of columns, Printing date, Master chip

Chip layout: Chip name, Spot information (accession, clone ID, or bacterial control), Spot location, Library name, Clone location on 384 plate, Clone location on 96 plate

Gene Annotation: Accession, Clone ID, Clone end, Vector name, Clone name, UniGene cluster ID, Best blast hit, Main blast parameters (score, E-value, % identity, blast date, etc.), Gene ID, Gene symbol, Gene synonyms, Chromosome, Map location, GO IDs, GO Annotation

Page 8: Microarray experiments. Database and Analysis Tools

Annotation sources: NCBI.

[Diagram: NCBI annotation sources]
Entrez Gene: UniGene ID, Gene ID, GO ID
UniGene: UniGene ID from accession; UniGene ID from Blast against UniGene clusters
RefSeq & NT databases: Annotation, Blast Search, Blast Software

Page 9: Microarray experiments. Database and Analysis Tools

Annotation sources: NCBI.

[Diagram: NCBI UniGene: UniGene ID from accession; UniGene ID from Blast against UniGene clusters]

NCBI UniGene. UniGene ID: For cDNA arrays, the UniGene ID is taken from the UniGene source file, using the accession number of each clone.

NCBI UniGene. Blast: For Long Oligo arrays, the UniGene ID is obtained from Blast results. The set of oligo sequences was searched against UniGene clusters with cutoffs of 99% for sequence identity and 90% for overlap. An oligo that hits multiple UniGene clusters is marked with an "Ambiguous cluster ID".
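The cutoff rule above can be sketched in a few lines of Python. This is only an illustration, not the facility's code: the hit tuple layout, the function name `assign_unigene_ids`, the definition of overlap as alignment length divided by oligo length, and the use of inclusive (`>=`) comparisons are all assumptions. The slide gives only the 99% and 90% cutoffs and the "Ambiguous cluster ID" label.

```python
from collections import defaultdict

IDENTITY_CUTOFF = 99.0  # percent sequence identity (from the slide)
OVERLAP_CUTOFF = 90.0   # percent overlap (from the slide)
AMBIGUOUS = "Ambiguous cluster ID"


def assign_unigene_ids(hits, oligo_lengths):
    """Assign one UniGene cluster ID per oligo from Blast hits.

    hits: iterable of (oligo_id, cluster_id, percent_identity, alignment_length)
    oligo_lengths: dict mapping oligo_id -> oligo length in bases
    Overlap is ASSUMED to mean alignment length as a percentage of oligo length.
    """
    passing = defaultdict(set)
    for oligo_id, cluster_id, identity, aln_len in hits:
        overlap = 100.0 * aln_len / oligo_lengths[oligo_id]
        if identity >= IDENTITY_CUTOFF and overlap >= OVERLAP_CUTOFF:
            passing[oligo_id].add(cluster_id)

    result = {}
    for oligo_id in oligo_lengths:
        clusters = passing.get(oligo_id, set())
        if not clusters:
            result[oligo_id] = None        # no hit passed the cutoffs
        elif len(clusters) == 1:
            result[oligo_id] = next(iter(clusters))
        else:
            result[oligo_id] = AMBIGUOUS   # oligo hits several clusters
    return result
```

An oligo with no passing hit is returned as `None` here; how such oligos are recorded in the real database is not stated in the slides.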

Page 10: Microarray experiments. Database and Analysis Tools

Annotation sources: NCBI.

[Diagram: NCBI Entrez Gene: UniGene ID, Gene ID, GO ID]

UniGene ID to Gene ID: All information retrieved from the "Entrez Gene" project is based on the UniGene cluster ID and the corresponding Gene ID. The "Gene ID" to "UniGene cluster ID" connection can be ambiguous, so a parsing filter was used to eliminate ambiguous Gene IDs.

Gene ID to GO ID: For each Gene ID, the corresponding Gene Ontology IDs were retrieved from the Entrez Gene source file. A Gene ID may have only a few GO IDs or more than 10 different ones. All of them are collected.
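The two mapping steps above might look like the following sketch. The slides do not say how the parsing filter decides that a Gene ID is ambiguous, so the rule used here (drop any UniGene cluster linked to more than one Gene ID) is an assumption, as are the function names and the input layout of simple ID pairs.

```python
from collections import defaultdict


def unambiguous_gene_ids(gene_unigene_pairs):
    """Map each UniGene cluster ID to its Gene ID, dropping ambiguous clusters.

    gene_unigene_pairs: iterable of (gene_id, unigene_id) pairs.
    ASSUMPTION: a cluster is ambiguous when it is linked to more than one Gene ID.
    """
    genes_per_cluster = defaultdict(set)
    for gene_id, unigene_id in gene_unigene_pairs:
        genes_per_cluster[unigene_id].add(gene_id)
    return {
        cluster: next(iter(genes))
        for cluster, genes in genes_per_cluster.items()
        if len(genes) == 1
    }


def collect_go_ids(gene_go_pairs):
    """Collect every GO ID listed for each Gene ID, in first-seen order."""
    go_ids = defaultdict(list)
    for gene_id, go_id in gene_go_pairs:
        if go_id not in go_ids[gene_id]:
            go_ids[gene_id].append(go_id)
    return dict(go_ids)
```

Keeping all GO IDs per gene, rather than picking one, matches the slide's statement that all of them are collected.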

Page 11: Microarray experiments. Database and Analysis Tools

Annotation sources: NCBI.

[Diagram: NCBI RefSeq & NT databases: Annotation, Blast Search, Blast Software]

Blast Software. The Blast software package is installed on the microarray server.

It lets us format databases and run batch homology searches for any combination of custom databases and query sequences.

RefSeq & NT databases. Annotation: The databases are loaded, formatted, and periodically updated on the microarray server. When the databases are updated, we run a Blast search of the cDNA and Long Oligo sequences. Blast results are parsed using our annotation extraction algorithm.

Page 12: Microarray experiments. Database and Analysis Tools

Annotation Extraction Algorithm

[Diagram labels: Sequences (raw data); Database of cDNA & Long Oligo sequences (formatted data); Homology search against RefSeq & NT; Alignment quality check (80%, 90%)]

Page 13: Microarray experiments. Database and Analysis Tools

Annotation Extraction Algorithm

[Diagram labels: Identity: < 90%, > 90%; Overlapping: < 80%, > 80%; 1st RefSeq hit; 1st NT hit; OUT; Linguistic Filter. The Linguistic Filter is illustrated by a letter string with embedded words such as PROTEIN, TYROSINE, KINASE, RIKEN, and CLONE.]
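One possible reading of the alignment quality check is sketched below. Only the thresholds (identity above 90%, overlap above 80%) and the labels "1st RefSeq hit", "1st NT hit", and "OUT" come from the slide. The order of precedence (RefSeq before NT), the dictionary layout of a hit, the overlap definition, and the function names are assumptions made for illustration, and the Linguistic Filter step is not modeled.

```python
MIN_IDENTITY = 90.0  # percent identity threshold (from the slide)
MIN_OVERLAP = 80.0   # percent overlap threshold (from the slide)


def passes_quality_check(hit, query_length):
    """Return True if a Blast hit exceeds both thresholds.

    hit: dict with 'identity' (percent) and 'alignment_length' (bases).
    Overlap is ASSUMED to be alignment length as a percentage of query length.
    """
    overlap = 100.0 * hit["alignment_length"] / query_length
    return hit["identity"] > MIN_IDENTITY and overlap > MIN_OVERLAP


def pick_annotation(refseq_hits, nt_hits, query_length):
    """Pick the annotation source for one query sequence.

    ASSUMPTION: the first RefSeq hit is preferred, the first NT hit is the
    fallback, and a query with no passing first hit is left out (None).
    """
    for source, hits in (("RefSeq", refseq_hits), ("NT", nt_hits)):
        if hits and passes_quality_check(hits[0], query_length):
            return source, hits[0]["description"]
    return None
```

Only the top hit from each database is examined, following the "1st hit" wording on the slide.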

Page 14: Microarray experiments. Database and Analysis Tools

Annotation sources: Gene Ontology.

[Diagram: Gene Ontology: Biological process, Cellular compartment, Molecular function]

Gene Ontology. Multiple GO IDs for each Gene ID are retrieved in the previous step from Entrez Gene (if available).

Gene Ontology annotation for all GO IDs is kept in three separate information fields: biological process, molecular function, and cellular compartment. For each field, all available annotation was filtered for redundancy and concatenated.
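As an illustration of the redundancy check and concatenation, here is a minimal Python sketch. The field keys, the `go_terms` lookup table, the separator, and the function name are hypothetical. The slide states only that annotation is split into three fields, checked for redundancy, and concatenated.

```python
# Field names follow the wording of the slide.
FIELDS = ("biological_process", "molecular_function", "cellular_compartment")


def build_go_fields(go_ids, go_terms, separator="; "):
    """Build the three GO annotation strings for one gene.

    go_ids: GO IDs collected for the gene.
    go_terms: dict mapping GO ID -> (field, term name); a hypothetical lookup.
    Duplicate term names within a field are kept only once.
    """
    collected = {field: [] for field in FIELDS}
    for go_id in go_ids:
        if go_id not in go_terms:
            continue  # no annotation available for this GO ID
        field, name = go_terms[go_id]
        if name not in collected[field]:
            collected[field].append(name)
    return {field: separator.join(names) for field, names in collected.items()}
```

A field with no annotation comes back as an empty string, which keeps the three-field layout the same for every gene.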

Page 15: Microarray experiments. Database and Analysis Tools

cDNA Microarray Facility. Database.

Page 16: Microarray experiments. Database and Analysis Tools

Database Search.

Page 17: Microarray experiments. Database and Analysis Tools

Microarray Data Analysis Pipeline.

Page 18: Microarray experiments. Database and Analysis Tools

Pipeline. Filtering.

Page 19: Microarray experiments. Database and Analysis Tools

Pipeline. LOWESS Normalization.

Page 20: Microarray experiments. Database and Analysis Tools

Pipeline. Data set Comparison.

Page 21: Microarray experiments. Database and Analysis Tools

Statistical packages and Analysis software.

Microarray Analysis Software:
GeneTraffic: client-server system for microarray data analysis (Iobion)
GeneSpring: cutting-edge tools for expression analysis (Agilent Technologies)
GeneSifter (GeneSifter)
BASE (Lund University)

Data Mining:
PathwayAssist: Interaction Explorer Software (Stratagene)
Pathways Analysis (Ingenuity)

Tools for Statistical Analysis:
SAM: Significance Analysis of Microarrays (Stanford)
R statistical package
S-PLUS (Insightful)

Page 22: Microarray experiments. Database and Analysis Tools

Summary

Multiple microarray platforms are available at AECOM: Affymetrix, cDNA arrays, Long Oligo, Custom arrays

Data analysis and annotation:
The Database for Analysis of Microarrays contains all information about our arrays, cDNA and oligo sets.
Sequence annotation is updated and integrated into the database.
The web interface of the database makes it easy to search for a particular gene, synonyms, map location, function, etc.
Easy-to-use web-based analysis pipeline: get your results in just 5 minutes.
Lists of "Up" and "Down" regulated genes with full gene annotation.

We are here for help and consultation!

Page 23: Microarray experiments. Database and Analysis Tools

BACKUPS

Page 24: Microarray experiments. Database and Analysis Tools

cDNA Microarray Facility. Services.

Page 25: Microarray experiments. Database and Analysis Tools

cDNA Microarray Facility. Arrays.

Page 26: Microarray experiments. Database and Analysis Tools

cDNA Microarray Facility. Publications.