Page 1: Multiple Testing in the Survival Analysis of Microarray Data

Multiple Testing in the Survival Analysis of Microarray Data

José A. Correa, Florida Atlantic University

Sandrine Dudoit, Univ. California Berkeley

Darlene R. Goldstein, École Polytechnique Fédérale de Lausanne

Contact: [email protected]

Software: http://www.math.fau.edu/correa/

Page 2: Multiple Testing in the Survival Analysis of Microarray Data

cDNA gene expression data

Data on m genes for n samples (rows: genes; columns: mRNA samples)

Gene expression level of gene i in mRNA sample j = (normalized) log(Red intensity / Green intensity)

Gene   sample1   sample2   sample3   sample4   sample5   …
1         0.46      0.30      0.80      1.51      0.90   …
2        -0.10      0.49      0.24      0.06      0.46   …
3         0.15      0.74      0.04      0.10      0.20   …
4        -0.45     -1.03     -0.79     -0.56     -0.32   …
5        -0.06      1.06      1.35      1.09     -1.09   …

Page 3: Multiple Testing in the Survival Analysis of Microarray Data

Multiple Testing Problem

• Simultaneously test m null hypotheses, one for each gene j. Hj: no association between the expression level of gene j and the covariate or response

• Because microarray experiments simultaneously monitor expression levels of thousands of genes, there is a large multiplicity issue

• Would like some sense of how ‘surprising’ the observed results are

Page 4: Multiple Testing in the Survival Analysis of Microarray Data

Hypothesis Truth vs. Decision

Rows give the truth, columns give the decision.

                # not rejected   # rejected            totals
# true H        U                V (false positives)   m0
# non-true H    T                S                     m1
totals          m - R            R                     m
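The counts in the table can be tallied mechanically once the truth is known, which is only the case in simulations. The sketch below is illustrative only; the function name `decision_table` and the toy inputs are assumptions, not part of the original slides.

```python
def decision_table(is_true_null, rejected):
    """Cross-tabulate truth (null true or not) against decision (rejected or not)."""
    U = V = T = S = 0
    for null_true, rej in zip(is_true_null, rejected):
        if null_true and not rej:
            U += 1          # true null, not rejected
        elif null_true and rej:
            V += 1          # true null, rejected: a false positive
        elif not rej:
            T += 1          # false null, not rejected
        else:
            S += 1          # false null, rejected
    return {"U": U, "V": V, "T": T, "S": S,
            "m0": U + V, "m1": T + S, "R": V + S, "m": U + V + T + S}

# Toy example (hypothetical): 5 hypotheses, 3 true nulls, 2 rejections.
truth = [True, True, True, False, False]
reject = [False, True, False, True, False]
table = decision_table(truth, reject)
print(table)
```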

Page 5: Multiple Testing in the Survival Analysis of Microarray Data

Type I (False Positive) Error Rates

• Per-family Error Rate: PFER = E(V)

• Per-comparison Error Rate: PCER = E(V)/m

• Family-wise Error Rate: FWER = Pr(V ≥ 1)

• False Discovery Rate: FDR = E(Q), where Q = V/R if R > 0; Q = 0 if R = 0
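To make the four definitions concrete, the sketch below estimates each rate as an average over replicated experiments, each summarized by its pair (V, R). The function name `error_rates` and the four toy replicates are assumptions made for illustration.

```python
def error_rates(outcomes, m):
    """Empirical Type I error rates from replicated (V, R) pairs.

    outcomes: list of (V, R), the number of false rejections and the total
              number of rejections in each replicate.
    m:        number of hypotheses tested in every replicate.
    """
    n = len(outcomes)
    pfer = sum(v for v, _ in outcomes) / n                 # E(V)
    pcer = pfer / m                                        # E(V)/m
    fwer = sum(1 for v, _ in outcomes if v >= 1) / n       # Pr(V >= 1)
    fdr = sum((v / r if r > 0 else 0.0) for v, r in outcomes) / n  # E(Q)
    return {"PFER": pfer, "PCER": pcer, "FWER": fwer, "FDR": fdr}

# Four hypothetical replicates with m = 10 hypotheses each.
outcomes = [(0, 0), (1, 2), (0, 3), (2, 4)]
rates = error_rates(outcomes, m=10)
print(rates)
```

Note that Q is averaged replicate by replicate, with Q = 0 whenever R = 0, rather than dividing the average V by the average R.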

Page 6: Multiple Testing in the Survival Analysis of Microarray Data

Strong vs. Weak Control

• All probabilities are conditional on which hypotheses are true

• Strong control refers to control of the Type I error rate under any combination of true and false nulls

• Weak control refers to control of the Type I error rate only under the complete null hypothesis (i.e. all nulls true)

• In general, weak control without other safeguards is unsatisfactory

Page 7: Multiple Testing in the Survival Analysis of Microarray Data

Comparison of Type I Error Rates

• In general, for a given multiple testing procedure,

PCER ≤ FWER ≤ PFER, and

FDR ≤ FWER,

with FDR = FWER under the complete null
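A short justification of this ordering, not given on the slide, follows from pointwise bounds on V and Q. It uses only the definitions of V, R, Q and m from the earlier slides; the indicator notation is an added convenience.

```latex
\begin{align*}
\frac{V}{m} \le \mathbf{1}\{V \ge 1\} \le V
  &\;\Longrightarrow\;
  \mathrm{PCER} = \frac{E(V)}{m} \le \Pr(V \ge 1) = \mathrm{FWER} \le E(V) = \mathrm{PFER},\\
Q \le \mathbf{1}\{V \ge 1\}
  &\;\Longrightarrow\;
  \mathrm{FDR} = E(Q) \le \Pr(V \ge 1) = \mathrm{FWER}.
\end{align*}
```

Under the complete null every rejection is a false one, so V = R and Q equals the indicator of V ≥ 1, which gives FDR = FWER.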

Page 8: Multiple Testing in the Survival Analysis of Microarray Data

Adjusted p-values (p*)

• If interest is in controlling, e.g., the FWER, the adjusted p-value for hypothesis Hj is:

$p_j^* = \inf\{\alpha : H_j \text{ is rejected at FWER} = \alpha\}$

• Hypothesis Hj is rejected at FWER α if $p_j^* \le \alpha$

• Adjusted p-values for other Type I error rates are similarly defined

Page 9: Multiple Testing in the Survival Analysis of Microarray Data

Some Advantages of p-value Adjustment

• Test level (size) does not need to be determined in advance

• Some procedures most easily described in terms of their adjusted p-values

• Usually easily estimated using resampling

• Procedures can be readily compared based on the corresponding adjusted p-values

Page 10: Multiple Testing in the Survival Analysis of Microarray Data

A Little Notation

• For hypothesis Hj, j = 1, …, m

observed test statistic: tj

observed unadjusted p-value: pj

• Ordering of observed (absolute) tj: {rj} such that $|t_{r_1}| \ge |t_{r_2}| \ge \dots \ge |t_{r_m}|$

• Ordering of observed pj: {rj} such that $p_{r_1} \le p_{r_2} \le \dots \le p_{r_m}$

• Denote corresponding RVs by upper case letters (T, P)

Page 11: Multiple Testing in the Survival Analysis of Microarray Data

Control of the FWER

• Bonferroni single-step adjusted p-values

$p_j^* = \min(m\, p_j, 1)$

• Holm (1979) step-down adjusted p-values

$p_{r_j}^* = \max_{k=1,\dots,j} \{\min((m-k+1)\, p_{r_k}, 1)\}$

• Hochberg (1988) step-up adjusted p-values (Simes inequality)

$p_{r_j}^* = \min_{k=j,\dots,m} \{\min((m-k+1)\, p_{r_k}, 1)\}$
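The three FWER adjustments on this slide can be computed directly from the unadjusted p-values. The following is a minimal sketch using only the Python standard library; the function names and the example p-values are assumptions for illustration.

```python
def bonferroni(p):
    """Single-step adjustment: p_j* = min(m * p_j, 1)."""
    m = len(p)
    return [min(m * pj, 1.0) for pj in p]

def holm(p):
    """Step-down adjustment: running maximum over k = 1..j of min((m-k+1) p_rk, 1)."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])   # r_1, ..., r_m (increasing p)
    adj = [0.0] * m
    running_max = 0.0
    for k, j in enumerate(order, start=1):
        running_max = max(running_max, min((m - k + 1) * p[j], 1.0))
        adj[j] = running_max
    return adj

def hochberg(p):
    """Step-up adjustment: running minimum over k = j..m of min((m-k+1) p_rk, 1)."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])
    adj = [0.0] * m
    running_min = 1.0
    for k in range(m, 0, -1):                      # start from the largest p-value
        j = order[k - 1]
        running_min = min(running_min, min((m - k + 1) * p[j], 1.0))
        adj[j] = running_min
    return adj

# Hypothetical unadjusted p-values for four hypotheses.
p_raw = [0.01, 0.04, 0.03, 0.005]
p_bonf, p_holm, p_hoch = bonferroni(p_raw), holm(p_raw), hochberg(p_raw)
print(p_bonf, p_holm, p_hoch)
```

The results are returned in the original order of the hypotheses, so they can be compared entry by entry; for each hypothesis the Hochberg value is no larger than the Holm value, which is no larger than the Bonferroni value.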

Page 12: Multiple Testing in the Survival Analysis of Microarray Data

Control of the FWER

• Westfall & Young (1993) step-down minP adjusted p-values

$p_{r_j}^* = \max_{k=1,\dots,j} \{\Pr(\min_{l \in \{r_k,\dots,r_m\}} P_l \le p_{r_k} \mid H_0^C)\}$

• Westfall & Young (1993) step-down maxT adjusted p-values

$p_{r_j}^* = \max_{k=1,\dots,j} \{\Pr(\max_{l \in \{r_k,\dots,r_m\}} |T_l| \ge |t_{r_k}| \mid H_0^C)\}$
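In practice the probabilities under the complete null are estimated by resampling. The sketch below estimates the step-down maxT adjusted p-values from a matrix of test statistics computed on permuted data; it is an illustration under stated assumptions, not the authors' implementation. The name `maxT_adjusted`, the input layout (one row per permutation, one column per hypothesis) and the toy numbers are all hypothetical.

```python
def maxT_adjusted(t_obs, t_perm):
    """Step-down maxT adjusted p-values estimated from permutation statistics.

    t_obs:  observed test statistics, one per hypothesis.
    t_perm: list of rows; row b holds the statistics for permutation b.
    """
    m, B = len(t_obs), len(t_perm)
    # r_1, ..., r_m: hypotheses ordered by decreasing |t|.
    order = sorted(range(m), key=lambda j: -abs(t_obs[j]))
    counts = [0] * m
    for row in t_perm:
        running = 0.0
        # Successive maxima over {r_k, ..., r_m}, built from the least significant up.
        for k in range(m - 1, -1, -1):
            running = max(running, abs(row[order[k]]))
            if running >= abs(t_obs[order[k]]):
                counts[k] += 1
    adj = [0.0] * m
    running_max = 0.0
    for k in range(m):                  # enforce monotonicity: max over k = 1..j
        running_max = max(running_max, counts[k] / B)
        adj[order[k]] = running_max
    return adj

# Toy example: 3 hypotheses, 4 permutations (far too few for real use).
t_obs = [3.0, 1.0, 2.0]
t_perm = [[0.5, 1.5, 0.2],
          [2.5, 0.1, 0.3],
          [0.4, 0.6, 3.5],
          [1.2, 1.1, 0.8]]
p_maxT = maxT_adjusted(t_obs, t_perm)
print(p_maxT)
```

The minP version has the same structure but replaces each |T| by its own permutation p-value, which requires a second layer of resampling or ranking; that extra layer is what makes minP more expensive.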

Page 13: Multiple Testing in the Survival Analysis of Microarray Data

Westfall & Young (1993) Adjusted p-values

• Step-down procedures: successively smaller adjustments at each step

• Take into account the joint distribution of the test statistics

• Less conservative than Bonferroni, Holm, or Hochberg adjusted p-values

• Can be estimated by resampling but computer-intensive (especially for minP)

Page 14: Multiple Testing in the Survival Analysis of Microarray Data

maxT vs. minP

• The maxT and minP adjusted p-values are the same when the test statistics are identically distributed (id)

• When the test statistics are not id, maxT adjustments may be unbalanced (not all tests contribute equally to the adjustment)

• maxT more computationally tractable than minP

• maxT can be more powerful in ‘small n, large m’ situations

Page 15: Multiple Testing in the Survival Analysis of Microarray Data

Control of the FDR

• Benjamini & Hochberg (1995): step-up procedure which controls the FDR under some dependency structures

$p_{r_j}^* = \min_{k=j,\dots,m} \{\min((m/k)\, p_{r_k}, 1)\}$

• Benjamini & Yekutieli (2001): conservative step-up procedure which controls the FDR under general dependency structures

$p_{r_j}^* = \min_{k=j,\dots,m} \{\min((m \sum_{l=1}^{m} 1/l)/k \cdot p_{r_k}, 1)\}$

• Yekutieli & Benjamini (1999): resampling based adjusted p-values for controlling the FDR under certain types of dependency structures
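Both step-up FDR adjustments differ only in a constant factor, so they can share one routine. This is a minimal standard-library sketch; the function names, the helper `_step_up` and the example p-values are assumptions for illustration.

```python
def _step_up(p, factor):
    """Running minimum over k = j..m of min(factor * (m / k) * p_rk, 1)."""
    m = len(p)
    order = sorted(range(m), key=lambda j: p[j])   # r_1, ..., r_m (increasing p)
    adj = [0.0] * m
    running_min = 1.0
    for k in range(m, 0, -1):
        j = order[k - 1]
        running_min = min(running_min, min(factor * m / k * p[j], 1.0))
        adj[j] = running_min
    return adj

def benjamini_hochberg(p):
    """Benjamini & Hochberg (1995) adjusted p-values."""
    return _step_up(p, 1.0)

def benjamini_yekutieli(p):
    """Benjamini & Yekutieli (2001): extra factor sum_{l=1}^{m} 1/l."""
    m = len(p)
    harmonic = sum(1.0 / l for l in range(1, m + 1))
    return _step_up(p, harmonic)

# Hypothetical unadjusted p-values for four hypotheses.
p_raw = [0.01, 0.04, 0.03, 0.005]
p_bh = benjamini_hochberg(p_raw)
p_by = benjamini_yekutieli(p_raw)
print(p_bh, p_by)
```

The harmonic factor grows roughly like log m, which is the price the Benjamini & Yekutieli procedure pays for validity under general dependence.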

Page 16: Multiple Testing in the Survival Analysis of Microarray Data

Identification of Genes Associated with Survival

• Data: survival yi and gene expression xij for individuals i = 1, …, n and genes j = 1, …, m

• Fit Cox model for each gene singly:

$h(t) = h_0(t) \exp(\beta_j x_{ij})$

• For any gene j = 1, …, m, can test $H_j: \beta_j = 0$

• Complete null $H_0^C$: $\beta_j = 0$ for all j = 1, …, m

• The Hj are tested on the basis of the Wald statistics tj and their associated p-values pj
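For a single gene, the Wald statistic is the maximum partial likelihood estimate of the coefficient divided by its standard error. The sketch below fits the one-covariate Cox model by Newton-Raphson in plain Python. It is illustrative only: the function names, the Breslow-style handling of tied times, the normal approximation for the p-value, and the toy data are all assumptions, and the original analysis used R rather than this code.

```python
import math

def cox_score_info(beta, times, events, x):
    """Score U(beta) and information I(beta) of the Cox partial likelihood
    for one covariate (risk set = subjects with time >= the event time)."""
    U = I = 0.0
    for i in range(len(times)):
        if not events[i]:
            continue                      # censored subjects contribute only via risk sets
        s0 = s1 = s2 = 0.0
        for k in range(len(times)):
            if times[k] >= times[i]:
                w = math.exp(beta * x[k])
                s0 += w
                s1 += w * x[k]
                s2 += w * x[k] ** 2
        mean = s1 / s0
        U += x[i] - mean
        I += s2 / s0 - mean ** 2
    return U, I

def cox_wald(times, events, x, n_iter=25, tol=1e-10):
    """Newton-Raphson fit; returns (beta_hat, se, Wald z, two-sided p-value)."""
    beta = 0.0
    for _ in range(n_iter):
        U, I = cox_score_info(beta, times, events, x)
        step = U / I
        beta += step
        if abs(step) < tol:
            break
    _, I = cox_score_info(beta, times, events, x)
    se = 1.0 / math.sqrt(I)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return beta, se, z, p

# Hypothetical data: 8 individuals, event indicator 1 = death observed.
times = [2, 3, 5, 6, 8, 9, 11, 12]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4, 0.2, -0.2]
beta_hat, se, z, p_value = cox_wald(times, events, x)
print(beta_hat, se, z, p_value)
```

Repeating this fit for each of the m genes gives the statistics tj and unadjusted p-values pj that feed the adjustment procedures.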

Page 17: Multiple Testing in the Survival Analysis of Microarray Data

Datasets

• Lymphoma (Alizadeh et al.): 40 individuals, 4026 genes

• Melanoma (Bittner et al.): 15 individuals, 3613 genes

• Both available at http://lpgprot101.nci.nih.gov:8080/GEAW

Page 18: Multiple Testing in the Survival Analysis of Microarray Data

Results: Lymphoma

Page 19: Multiple Testing in the Survival Analysis of Microarray Data

Results: Melanoma

Page 20: Multiple Testing in the Survival Analysis of Microarray Data

Other Proposals from the Microarray Literature

• ‘Neighborhood Analysis’, Golub et al.

– In general, gives only weak control of FWER

• ‘Significance Analysis of Microarrays (SAM)’ (2 versions)

– Efron et al. (2000): weak control of PFER

– Tusher et al. (2001): strong control of PFER

• SAM also estimates ‘FDR’, but this ‘FDR’ is defined as $E(V \mid H_0^C)/R$, not $E(V/R)$

Page 21: Multiple Testing in the Survival Analysis of Microarray Data

Controversies

• Whether multiple testing methods (adjustments) should be applied at all

• Which tests should be included in the family (e.g. all tests performed within a single experiment; define ‘experiment’)

• Alternatives

– Bayesian approach

– Meta-analysis

Page 22: Multiple Testing in the Survival Analysis of Microarray Data

Situations where inflated error rates are a concern

• It is plausible that all nulls may be true

• A serious claim will be made whenever any p < .05 is found

• Much data manipulation may be performed to find a ‘significant’ result

• The analysis is planned to be exploratory, but there is a wish to claim that ‘significant’ results are real

• Experiment unlikely to be followed up before serious actions are taken

Page 23: Multiple Testing in the Survival Analysis of Microarray Data

Discussion (I)

• Lack of significant findings

– Small sample sizes

– FWER-controlling procedures may be too stringent in microarray applications

– FDR could perhaps be made even more powerful by taking into account the joint distribution of gene expression levels

Page 24: Multiple Testing in the Survival Analysis of Microarray Data

Discussion (II)

• Computational considerations

– All computing done in the R statistical environment (Ihaka and Gentleman)

– For maxT, the Cox model analysis was repeated for each of 100,800 random permutations of the survival times

– Exact maximum likelihood calculation took about 60 hours per machine in a cluster of 24 PCs, each with a 1 GHz Pentium III and 256 MB memory

– Time can be reduced substantially by using a score approximation to obtain parameter estimates, and by calling C language code from within R
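One plausible reading of the score approximation, offered here as an assumption rather than as the authors' method, is to evaluate the score and information of the Cox partial likelihood at a coefficient of zero. That yields a one-step estimate and a score statistic with no iteration, which matters when the fit is repeated for every gene and every permutation. The function name `cox_score_test` and the toy data are hypothetical.

```python
import math

def cox_score_test(times, events, x):
    """Score test of beta = 0 in a one-covariate Cox model.

    Returns (one-step estimate U/I, score z = U/sqrt(I), two-sided p-value),
    where U and I are the score and information evaluated at beta = 0.
    """
    U = I = 0.0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        # At beta = 0 every subject in the risk set has weight 1.
        risk = [x[k] for k in range(n) if times[k] >= times[i]]
        mean = sum(risk) / len(risk)
        U += x[i] - mean
        I += sum(v * v for v in risk) / len(risk) - mean ** 2
    z = U / math.sqrt(I)
    p = math.erfc(abs(z) / math.sqrt(2.0))   # two-sided normal tail probability
    return U / I, z, p

# Hypothetical data: 8 individuals, event indicator 1 = death observed.
times = [2, 3, 5, 6, 8, 9, 11, 12]
events = [1, 1, 0, 1, 1, 0, 1, 1]
x = [0.5, -0.3, 0.8, 0.1, -0.6, 0.4, 0.2, -0.2]
beta_one_step, z_score, p_score = cox_score_test(times, events, x)
print(beta_one_step, z_score, p_score)
```

Because the risk sets depend only on the survival times, a permutation scheme can reshuffle the times and recompute U and I with simple sums, avoiding a full Newton-Raphson fit per gene per permutation.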

Page 25: Multiple Testing in the Survival Analysis of Microarray Data

References

• Alizadeh et al. (2000) Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling. Nature 403: 503-511

• Benjamini and Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing. JRSSB 57: 289-300

• Benjamini and Yekutieli (2001) The control of the false discovery rate in multiple testing under dependency. Annals of Statistics 29: 1165-1188

• Bittner et al. (2000) Molecular classification of cutaneous malignant melanoma by gene expression profiling. Nature 406: 536-540

• Efron et al. (2000) Microarrays and their use in a comparative experiment. Tech report, Stats, Stanford

• Golub et al. (1999) Molecular classification of cancer. Science 286: 531-537

Page 26: Multiple Testing in the Survival Analysis of Microarray Data

References

• Hochberg (1988) A sharper Bonferroni procedure for multiple tests of significance. Biometrika 75: 800-802

• Holm (1979) A simple sequentially rejective multiple testing procedure. Scand. J Statistics 6: 65-70

• Ihaka and Gentleman (1996) R: A language for data analysis and graphics. J Comp Graph Stats 5: 299-314

• Tusher et al. (2001) Significance analysis of microarrays applied to transcriptional responses to ionizing radiation. PNAS 98: 5116-5121

• Westfall and Young (1993) Resampling-based multiple testing: Examples and methods for p-value adjustment. New York: Wiley

• Yekutieli and Benjamini (1999) Resampling based false discovery rate controlling multiple test procedures for correlated test statistics. J Stat Plan Inf 82: 171-196

Page 27: Multiple Testing in the Survival Analysis of Microarray Data

Acknowledgements

• Debashis Ghosh

• Erin Conlon