biostatistics bioinformatics core

Page 1: Biostatistics Bioinformatics Core

Biostatistics Bioinformatics Core

Personnel
• Elizabeth Garrett, PhD, Biostatistician
• Giovanni Parmigiani, PhD, Biostatistician
• Data analysis and system support staff

Hardware
• Dell server; Linux OS
• Linux and Windows workstations

Software
• GeneX database; R-based analysis tools
• Labs: Affy Suite, others TBA

Page 2: Biostatistics Bioinformatics Core

Contact Information

Elizabeth S. Garrett
[email protected]
1103, 550 Building
410-614-2588

Giovanni Parmigiani
[email protected]
1103, 550 Building
410-614-3426

Page 3: Biostatistics Bioinformatics Core

Aims of the Biostatistics Core

Specific Aim 1: To provide biostatistical consultation and support to projects in the program. Special emphasis will be placed on assisting with visualization, analysis, quantitative modeling, and interpretation of results.

Page 4: Biostatistics Bioinformatics Core

Aims of the Biostatistics Core

Specific Aim 2: To help identify the appropriate data structures; ensure data quality and data confidentiality; and develop efficient data transfer and interfaces for data analysis and data visualization across different platforms.

Page 5: Biostatistics Bioinformatics Core

Two important stages where we get involved

• Planning Stage:
  – Experimental Design
    • How many samples?
    • How many replicates?
    • Housekeeping genes?
    • Dye swapping?
  – What’s the big deal? You could spend a lot of time and money and still not be able to answer your questions because of experimental errors, etc.

Before the study: How can I best address my hypothesis using minimal resources to get maximal information?

• Analysis Stage:
  – Visualization
  – Data Exploration
  – Analytic Tools and Models

After the study: Now that I have this enormous amount of data, how do I summarize it and answer my questions?
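The "How many samples?" question can be made concrete with a rough power calculation. The sketch below is only an illustration and not the core's own method: it assumes a two-group comparison of mean log-expression for a single gene, a normal approximation, and a known standard deviation. The function name `samples_per_group` and all numeric inputs are hypothetical.

```python
import math
from statistics import NormalDist


def samples_per_group(effect, sd, alpha=0.05, power=0.8):
    """Approximate samples per group for a two-sided, two-sample
    comparison of means, using the normal approximation
    n = 2 * ((z_{1-alpha/2} + z_{power}) * sd / effect)^2.
    """
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = z.inv_cdf(power)          # quantile for the desired power
    n = 2 * ((z_alpha + z_power) * sd / effect) ** 2
    return math.ceil(n)


# Hypothetical inputs: detect a difference of 1.0 on the log scale
# when the per-sample standard deviation is 0.7.
n_large_effect = samples_per_group(effect=1.0, sd=0.7)
# Halving the detectable difference roughly quadruples the requirement.
n_small_effect = samples_per_group(effect=0.5, sd=0.7)
print(n_large_effect, n_small_effect)
```

In a microarray study thousands of genes are tested at once, so a stricter `alpha` than 0.05 would normally be passed in; the formula itself is unchanged.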

Page 6: Biostatistics Bioinformatics Core

What we do

• One-on-one consultations with investigators for planning experiments.
• One-on-one consultations with investigators for visualization, data exploration, and analysis.
• Tutorials to help investigators use some of the software for exploration and visualization independently.
• Tutorials on basic statistical concepts, including experimental design in gene expression studies and basic analytic tools.

Page 7: Biostatistics Bioinformatics Core

GeneX

• Web-based database, data mining, and data analysis tool
• Supports
  – multiple users
  – multiple species
  – multiple microarray platforms

Common Denominator for data analysis

Page 8: Biostatistics Bioinformatics Core

GeneX Components

• Curation Tool (imports data)
• Database (open-source SQL)
• XML Data Exchange Protocol
• Query and analytic routines
  – mining
  – biostatistics in R

Page 9: Biostatistics Bioinformatics Core

Analytical Tools and Applications Included or Co-developed with GeneX

• Clustering
• Visualization
• Principal Component Analysis and Multi-Dimensional Scaling
• Significance testing with R
• Integration with other databases

Page 10: Biostatistics Bioinformatics Core

Regulation of extracellular matrix changes and fibrosis in inflammatory bowel disease.

Shukti Chakravarti
Feng Wu

Department of Medicine
Johns Hopkins University

Page 11: Biostatistics Bioinformatics Core

[Figure: TNBS-colon; panels: Control, TNBS]

Page 12: Biostatistics Bioinformatics Core

TNBS-induced colitis model

[Figure: timeline with axis "TNBS dose time points (weeks)" marked 0, 2, 4, 6, 8, 12; labels: Disease initiation, inflammation, fibrosis]

Harvest:
• RNA
• Protein
• Histology
• Intestinal fibroblasts

Page 13: Biostatistics Bioinformatics Core

[Figure: activity versus time; curves: inflammation, ECM/fibrosis]

Page 14: Biostatistics Bioinformatics Core

Analysis Plan

• Expression estimates using dChip
• Additional normalization for scanner effect
• Two-level regression model
• Identification of reliably estimable time trends in gene expression
• Grouping genes by patterns
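The last two steps of the plan, estimating time trends and grouping genes by pattern, can be sketched in a much simplified form. The code below fits an ordinary least-squares slope to each gene separately and bins genes by the sign of that slope. It is not the two-level regression model named in the plan; the function names, the `min_slope` cutoff, the gene names, and all data values are hypothetical.

```python
def ols_slope(times, values):
    """Least-squares slope of values regressed on times."""
    n = len(times)
    mean_t = sum(times) / n
    mean_v = sum(values) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (v - mean_v) for t, v in zip(times, values))
    return sxy / sxx


def group_by_trend(expression, times, min_slope=0.05):
    """Bin genes into 'up', 'down', or 'flat' by their fitted slope.

    min_slope is an arbitrary illustrative cutoff, not a significance test.
    """
    groups = {"up": [], "down": [], "flat": []}
    for gene, values in expression.items():
        slope = ols_slope(times, values)
        if slope >= min_slope:
            groups["up"].append(gene)
        elif slope <= -min_slope:
            groups["down"].append(gene)
        else:
            groups["flat"].append(gene)
    return groups


# Hypothetical harvest times (weeks) and made-up log-expression values.
weeks = [0, 2, 4, 6, 8, 12]
expression = {
    "gene_a": [1.0, 1.4, 1.9, 2.3, 2.8, 3.6],  # rising
    "gene_b": [3.0, 2.7, 2.2, 1.9, 1.6, 0.9],  # falling
    "gene_c": [2.0, 2.1, 1.9, 2.0, 2.1, 2.0],  # roughly constant
}
groups = group_by_trend(expression, weeks)
print(groups)
```

A per-gene fit like this ignores information shared across genes, which is one reason a hierarchical (two-level) model may be preferred when there are few time points per gene.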

Page 15: Biostatistics Bioinformatics Core

Normalization

Page 16: Biostatistics Bioinformatics Core

Empirical Bayes Ranking versus Statistical Significance

[Figure labels: FDR < 1/2; P-value < .05]
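To show why a false discovery rate cutoff and a raw p-value cutoff can select different gene lists, here is a minimal sketch of the Benjamini-Hochberg adjustment. This is a swapped-in, widely used FDR procedure chosen for illustration; it is not the empirical Bayes ranking shown on the slide. The p-values and the 0.05 thresholds are made up.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Made-up p-values for eight hypothetical genes.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
q_values = benjamini_hochberg(p_values)

selected_by_p = [p < 0.05 for p in p_values]    # raw significance
selected_by_fdr = [q < 0.05 for q in q_values]  # FDR-controlled
print(sum(selected_by_p), sum(selected_by_fdr))
```

With these numbers the raw cutoff flags more genes than the FDR-adjusted cutoff, which is the kind of disagreement between the two criteria that the slide contrasts.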

Page 17: Biostatistics Bioinformatics Core

Patterns of gene expression over time

• Red: positive slope, low FDR
• Green: negative slope, low FDR
• Orange and Brown: low p-value