Genomic meta-analysis in combining expression profiles

Page 1: Genomic meta-analysis in combining expression profiles

Genomic meta-analysis in combining expression profiles

Page 2: Genomic meta-analysis in combining expression profiles

Outline

1. Introduction
2. Two review papers
3. Quality control (MetaQC)
4. Meta-analysis for detecting differentially expressed genes (MetaDE)
5. Meta-analysis for detecting pathways (MetaPath)

Page 3: Genomic meta-analysis in combining expression profiles

1. Introduction

Page 4: Genomic meta-analysis in combining expression profiles

Statistical Issues in Microarray Analysis

• Experimental design
• Image analysis
• Preprocessing (normalization, filtering, MV imputation)
• Data visualization
• Identify differentially expressed genes
• Clustering
• Classification
• Regulatory network
• Gene enrichment analysis
• Integrative analysis & meta-analysis

Page 5: Genomic meta-analysis in combining expression profiles

Meta-analysis and integrative analysis

Page 6: Genomic meta-analysis in combining expression profiles

Meta-analysis and integrative analysis

• Horizontal genomic meta-analysis: Combine multiple relevant studies (e.g. microarray or GWAS) to increase statistical power.

• Vertical genomic integrative analysis: Integrate multiple studies that measure multiple dimensions of genetic information in the same cohort (e.g. transcription, genotyping, copy number variation, methylation, miRNA, etc.).

Page 7: Genomic meta-analysis in combining expression profiles

Genomic meta-analysis

• In this lecture, we focus mainly on microarray meta-analysis, but the principles apply to GWAS as well.
• In statistics, a “meta-analysis” combines the results of several studies that address a set of related research hypotheses.
• Many microarray meta-analyses have been reported in the literature. Advantages include:
– Increased statistical power
– Robust and accurate validation across studies
– Results that can guide future experiments
• Many methods in microarray meta-analysis can be extended to genomic integrative analysis.

Page 8: Genomic meta-analysis in combining expression profiles

Motivation

• Microarray has become a common tool in biological investigation. Related high-throughput technologies (SNP array, ChIP-chip, next-generation sequencing) are also getting popular.

• As many data sets are publicly available, information integration of multiple studies becomes important.

Page 9: Genomic meta-analysis in combining expression profiles

Microarray databases

Primary database
• Gene Expression Omnibus (GEO) in NCBI
• ArrayExpress in EBI
• Stanford Microarray database
• caArray at NCI

Secondary database
• GEO Profiles (extension from GEO)
• Gene Expression Atlas (extension from ArrayExpress)
• Genevestigator database
• Oncomine

Page 10: Genomic meta-analysis in combining expression profiles

Motivation and background

Data considered:

$x_{gsk}$: expression intensity of gene $g$, sample $s$, study $k$, where $1 \le g \le G$, $1 \le k \le K$, $1 \le s \le m_k$ (control group) and $m_k + 1 \le s \le m_k + n_k$ (diseased group).

Each study k contains the same G genes measured on normal (N) and tumor (T) samples, and yields one statistic per gene:

genes   study 1   study 2   …   study K
1       t11       t12       …   t1K
2       t21       t22       …   t2K
3       t31       t32       …   t3K
…       …         …         …   …
G       tG1       tG2       …   tGK

Page 11: Genomic meta-analysis in combining expression profiles

1. Motivation and background

Assume:
• K homogeneous studies are considered for information integration (inclusion/exclusion criteria).
• Genes are matched across all studies with no missing values (gene matching across studies).
• For each study, samples from two groups (e.g. normal vs. tumor) are available.

Q: How can meta-analysis help to enhance biomarker detection?

Page 12: Genomic meta-analysis in combining expression profiles

Steps for genomic meta-analysis

• Identify biological objectives
– Data sets available; inclusion/exclusion criteria
– Biological questions to be answered (biomarker detection in two groups of samples)
• Set up of statistical framework
• Choice of methods

Page 13: Genomic meta-analysis in combining expression profiles

2. Two review papers

Page 14: Genomic meta-analysis in combining expression profiles

• George C. Tseng*, Debashis Ghosh and Eleanor Feingold. (2012) Comprehensive literature review and statistical considerations for microarray meta-analysis. Nucleic Acids Research. accepted.

• Ferdouse Begum, Debashis Ghosh, George C. Tseng*, Eleanor Feingold. (2012) Comprehensive literature review and statistical considerations for GWAS meta-analysis. Nucleic Acids Research. accepted.

Two review papers

Page 15: Genomic meta-analysis in combining expression profiles

Summary of microarray meta-analysis

Page 16: Genomic meta-analysis in combining expression profiles

Summary of GWAS meta-analysis

Page 17: Genomic meta-analysis in combining expression profiles

Summary of GWAS meta-analysis

Page 18: Genomic meta-analysis in combining expression profiles

3. Quality control (MetaQC)

Page 19: Genomic meta-analysis in combining expression profiles

MetaQC

Quality control analysis to determine inclusion/exclusion criteria for microarray meta-analysis

• Dongwan D. Kang, Etienne Sibille, Naftali Kaminski, and George C. Tseng*. (2012) MetaQC: Objective Quality Control and Inclusion/Exclusion Criteria for Genomic Meta-Analysis. Nucleic Acids Research. 40(2):e15.

Page 20: Genomic meta-analysis in combining expression profiles

Inclusion/exclusion criteria

Examples of inclusion/exclusion criteria in the literature:
• Collect whatever microarray data sets available to combine
• Go to GEO to retrieve all relevant studies in Affymetrix U133
• At least four samples in each class label
…

Problem: ad hoc criteria and “expert” opinion
Aim: Is it possible to develop a quantitative quality assessment to perform inclusion/exclusion of the microarray studies?

Page 21: Genomic meta-analysis in combining expression profiles

Six quality control (QC) measures

Each QC measure is defined as the negative log-transformed p-value from a formal hypothesis test, so larger values indicate better quality.

Page 22: Genomic meta-analysis in combining expression profiles

Four examples tested

Page 23: Genomic meta-analysis in combining expression profiles

Brain cancer example

Paugh and Yamanaka have lower quality and will be excluded from meta-analysis.

These two studies have small sample sizes.

Page 24: Genomic meta-analysis in combining expression profiles

4. Meta-analysis for detecting differentially expressed genes (MetaDE)

Page 25: Genomic meta-analysis in combining expression profiles

Prostate cancer data example

• Each study contains only a small number of samples, so it makes sense to perform meta-analysis.

Page 26: Genomic meta-analysis in combining expression profiles

Key issues in microarray meta-analysis (Ramasamy et al., PLoS Medicine 2008)

(1) Identify suitable microarray studies;
(2) Extract the data from studies;
(3) Prepare the individual datasets;
(4) Annotate the individual datasets;
(5) Resolve the many-to-many relationship between probes and genes;
(6) Combine the study-specific estimates;
(7) Analyze, present, and interpret results.

Page 27: Genomic meta-analysis in combining expression profiles

Goal of meta-analysis

• What kind of biomarkers are of interest:

– Biomarkers statistically significant and consistent in all (or majority) of the studies.

– Biomarkers statistically significant in one or more studies.

Page 28: Genomic meta-analysis in combining expression profiles

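Two hypothesis settings

         Study 1   Study 2   Study 3   Study 4   Study 5
gene A   0.1       0.1       0.1       0.1       0.1
gene B   1E-5      1         1         1         1

Let $\theta_k$ denote the effect size of a gene in study $k$.

$HS_A$: $H_0$: $\theta_k = 0$ for some $1 \le k \le K$ versus $H_A$: $\theta_k \ne 0$ for all $1 \le k \le K$
⇒ Detect genes consistently DE in all studies (similar to an intersection-union test; IUT)

$HS_B$: $H_0$: $\theta_1 = \dots = \theta_K = 0$ versus $H_A$: $\theta_k \ne 0$ for some $1 \le k \le K$
⇒ Detect genes DE in at least one of the K studies (union-intersection test; UIT)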

Page 29: Genomic meta-analysis in combining expression profiles

Two popular methods

Example: p-values of four studies = (0.5, 0.06, 0.07, 0.1)

• Fisher’s method
$T = -2\sum_{i=1}^{K}\log(p_i) \sim \chi^2_{2K}$
e.g. $-2(\log(0.5)+\log(0.06)+\log(0.07)+\log(0.1)) = 2(0.7+2.8+2.7+2.3) = 17$, $p = 0.03$

• maxP
$T = \max_{1 \le i \le K} p_i \sim \mathrm{Beta}(K, 1)$
e.g. $\max(0.5, 0.06, 0.07, 0.1) = 0.5$, $p = 0.06$
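The two combined p-values for the example (0.5, 0.06, 0.07, 0.1) can be checked numerically. The following is a minimal sketch, assuming the parametric null distributions named on the slide (chi-square with 2K degrees of freedom for Fisher, Beta(K, 1) for maxP); the function names are illustrative and are not taken from any package.

```python
import math

def fisher_pvalue(pvals):
    """Fisher's method: T = -2 * sum(log p_i) ~ chi-square with 2K df under the null."""
    k = len(pvals)
    half_t = -sum(math.log(p) for p in pvals)  # T / 2
    # Closed-form chi-square survival function, valid because df = 2K is even.
    return math.exp(-half_t) * sum(half_t**j / math.factorial(j) for j in range(k))

def maxp_pvalue(pvals):
    """maxP: T = max(p_i) ~ Beta(K, 1) under the null, so P(T <= t) = t**K."""
    return max(pvals) ** len(pvals)

pvals = [0.5, 0.06, 0.07, 0.1]
print(round(fisher_pvalue(pvals), 3))  # 0.031
print(round(maxp_pvalue(pvals), 4))    # 0.0625
```

Both values agree with the rounded figures on the slide (0.03 and 0.06).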

Page 30: Genomic meta-analysis in combining expression profiles

Two hypothesis setting

1. HSA type of DE genes are usually more desirable and can quickly narrow down gene targets.

2. But the genomic studies being combined are usually not as consistent as hoped, so an HSA type of analysis can detect only a small number of genes.

3. Heterogeneity between studies can exist by nature (e.g. Five different tissues are studied in each study).

Study 1 Study 2 Study 3 Study 4 Study 5

gene A 0.1 0.1 0.1 0.1 0.1

gene B 1E-5 1 1 1 1

Page 31: Genomic meta-analysis in combining expression profiles

Meta-analysis

Four categories of microarray meta-analysis methods:
• Combine p-values
– Fisher, Stouffer, minP, maxP, adaptively weighted (AW) Fisher, rth ordered p-value (rOP), vote counting
• Combine effect sizes
– Random effects model, fixed effects model, Bayesian methods
• Combine ranks
– Rank sum, rank product, rank aggregation
• Direct merging
– Directly merge studies for analysis, various normalization methods (DWD, XPN …)

Page 32: Genomic meta-analysis in combining expression profiles

Illustrative examples

         Study 1  Study 2  Study 3  Study 4  Study 5  Fisher (old)  AW (new)            maxP (old)  rOP (new)
gene A   0.1      0.1      0.1      0.1      0.1      0.01          0.075 (1,1,1,1,1)   1E-5        5E-4        DE in all studies
gene B   1E-5     1        1        1        1        0.01          1E-4 (1,0,0,0,0)    1           1           DE only in one study
gene C   0.01     0.01     0.01     0.01     1        6E-5          1E-4 (1,1,1,1,0)    1           5E-8        DE in most studies
gene D   0.2      0.2      0.2      0.2      0.2      0.1           0.4 (1,1,1,1,1)     3E-4        7E-3        DE in all studies

Fisher: Detects gene A-C. Cannot distinguish between gene A & B.
AW (adaptively weighted): Detects gene A-C. Gives indicator of which studies are DE.
maxP: Detects gene A and D, but misses gene C.
rOP (rth ordered p-value): Detects gene A, C and D. Provides more robustness.

META-DE
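The Fisher, maxP and rOP entries for genes A to D can be recomputed from the five study p-values. The following is a sketch under two assumptions that the slide does not state: the parametric null distributions are used (rather than permutation), and rOP uses r = 4 of K = 5, for which the r-th smallest p-value follows Beta(r, K − r + 1) under the null. maxP is then the special case r = K. Function names are illustrative.

```python
import math

def fisher_pvalue(pvals):
    """Fisher's method with the closed-form chi-square tail for even df = 2K."""
    k = len(pvals)
    half_t = -sum(math.log(p) for p in pvals)
    return math.exp(-half_t) * sum(half_t**j / math.factorial(j) for j in range(k))

def rop_pvalue(pvals, r):
    """r-th smallest p-value ~ Beta(r, K-r+1) under the null; return its CDF value."""
    k = len(pvals)
    x = sorted(pvals)[r - 1]
    # Beta(r, K-r+1) CDF written as a binomial tail probability.
    return sum(math.comb(k, j) * x**j * (1 - x)**(k - j) for j in range(r, k + 1))

genes = {
    "A": [0.1] * 5,
    "B": [1e-5, 1, 1, 1, 1],
    "C": [0.01] * 4 + [1],
    "D": [0.2] * 5,
}
for name, p in genes.items():
    # Columns: Fisher, maxP (r = K = 5), rOP (assumed r = 4)
    print(name, f"{fisher_pvalue(p):.1e}", f"{rop_pvalue(p, 5):.1e}", f"{rop_pvalue(p, 4):.1e}")
```

The AW column is not reproduced here because it depends on the weight search described on the following slides.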

Page 33: Genomic meta-analysis in combining expression profiles

Other methods

• minP: $T = \min_{1 \le i \le K} p_i \sim \mathrm{Beta}(1, K)$
• Adaptively weighted Fisher (Li & Tseng 2011)

Page 34: Genomic meta-analysis in combining expression profiles

Basic idea of adaptively-weighted method

p-values   Study I   Study II   Study III
Gene 1     0.10      0.10       0.10
Gene 2     1         1          0.001

Fisher: $S = -\sum_{k=1}^{K}\log(p_k)$
AW: $S = -\sum_{k=1}^{K} w_k\log(p_k)$

Adaptively-weighted (AW) statistic for Gene 2 (the tabulated statistic is $-2\sum_k w_k\log(p_k)$, compared with a $\chi^2$ distribution with $2\sum_k w_k$ degrees of freedom):

weights   Weighted statistic   Null distribution   Null p-value
(1,1,1)   13.82                $\chi^2_6$          0.032
(1,1,0)   0                    $\chi^2_4$          1
(1,0,1)   13.82                $\chi^2_4$          0.008
(1,0,0)   0                    $\chi^2_2$          1
(0,1,1)   13.82                $\chi^2_4$          0.008
(0,1,0)   0                    $\chi^2_2$          1
(0,0,1)   13.82                $\chi^2_2$          0.001

⇒ T = 0.001

What weight combination gives the best statistical significance?
Given the best weight, what is the null distribution and how do we estimate FDR?
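The weight table for Gene 2 (p-values 1, 1, 0.001) can be reproduced by enumerating every nonzero 0/1 weight vector. This is a minimal sketch assuming the chi-square reference distribution shown in the table; the helper names are hypothetical.

```python
import math
from itertools import product

def chi2_sf_even(x, df):
    """Survival function of a chi-square variable with even df (closed form)."""
    h = x / 2
    return math.exp(-h) * sum(h**j / math.factorial(j) for j in range(df // 2))

def aw_table(pvals):
    """Map each nonzero 0/1 weight vector to (weighted statistic, null p-value)."""
    table = {}
    for w in product([1, 0], repeat=len(pvals)):
        if not any(w):
            continue  # the all-zero weight carries no information
        stat = -2 * sum(wk * math.log(pk) for wk, pk in zip(w, pvals))
        table[w] = (stat, chi2_sf_even(stat, 2 * sum(w)))
    return table

table = aw_table([1, 1, 0.001])                # Gene 2
best = min(table, key=lambda w: table[w][1])   # weight with the smallest p-value
for w, (stat, p) in table.items():
    print(w, round(stat, 2), round(p, 3))
print("best weight:", best)
```

For Gene 2 the smallest p-value is reached by the weight that keeps only Study III, which is the behaviour the slide illustrates.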

Page 35: Genomic meta-analysis in combining expression profiles

$S = -\sum_{k=1}^{K} w_k\log(p_k)$

Method: What weight combination gives the best statistical significance?

Rationale: In a traditional epidemiological study or a medical study, best-weight is severely biased to the signal we try to prove (a bad approach).

But in detecting DE genes in microarray study, it makes great biological sense. Some pathways may be altered only in some of the studies due to heterogeneous sample collection and experimental operations in different studies. It becomes very useful when combining many data sets.

adaptively-weighted statistic

Page 36: Genomic meta-analysis in combining expression profiles

From now on, we refer to Fisher’s method as the equal-weight method (EW):

$S = -\sum_{k=1}^{K}\log(p_k)$

Our proposed adaptively-weighted method is:

$S = -\sum_{k=1}^{K} w_k\log(p_k)$

In both cases, we avoid parametric assumptions. Instead, we use a permutation test to control FDR.

adaptively-weighted statistic

Page 37: Genomic meta-analysis in combining expression profiles

adaptively-weighted statistic
I. Evaluate study-specific p-values by permutation:

Step 1: evaluate p-values by permutation tests (non-parametric, without Gaussian assumption):

$p(T_{gk}) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(|T_{g'k}^{(b)}| \ge |T_{gk}|\right)}{B \cdot G}$

Compute p-values for permuted statistics as well:

$p(T_{gk}^{(b)}) = \frac{\sum_{b'=1}^{B}\sum_{g'=1}^{G} I\left(|T_{g'k}^{(b')}| \ge |T_{gk}^{(b)}|\right)}{B \cdot G}$
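The study-specific permutation p-value pools the permuted statistics of all G genes and all B permutations into one null sample. The following is a minimal sketch for a single study, assuming the permuted statistics have already been computed by shuffling group labels; the function name and the toy numbers are illustrative only.

```python
def permutation_pvalues(observed, permuted):
    """Pooled permutation p-values for one study.

    observed: list of G observed statistics T_g.
    permuted: list of B lists, each with the G statistics from one label permutation.
    Returns p(T_g) = #{(b, g'): |T_g'^(b)| >= |T_g|} / (B * G).
    """
    null = [abs(t) for perm in permuted for t in perm]  # pooled B*G null statistics
    return [sum(n >= abs(t) for n in null) / len(null) for t in observed]

observed = [3.0, -0.5, 1.2]                          # toy statistics for G = 3 genes
permuted = [[0.1, -1.0, 2.0], [0.7, 0.3, -1.5]]      # B = 2 permutations
pvals = permutation_pvalues(observed, permuted)
print(pvals)
```

With this estimator a p-value of exactly 0 is possible when no permuted statistic reaches the observed one, so in practice B should be large.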

Page 38: Genomic meta-analysis in combining expression profiles

adaptively-weighted statistic
II. Calculate AW statistic:

Step 2: Fix gene $g$ and given weights $w = (w_1, \dots, w_K)$, define a statistic of the weighted sum of log p-values:

$U_g(w) = -\sum_{k=1}^{K} w_k \log\left(p(T_{gk})\right)$

Similarly define the permuted version as the null distribution:

$U_g^{(b)}(w) = -\sum_{k=1}^{K} w_k \log\left(p(T_{gk}^{(b)})\right)$

Page 39: Genomic meta-analysis in combining expression profiles

Step 3: Evaluate p-values of the statistics $U_g(w)$:

$p(U_g(w)) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(U_{g'}^{(b)}(w) \ge U_g(w)\right)}{B \cdot G}$

Step 4: Obtain the best weight that produces the most significant (smallest) p-value:

$w_g^* = \arg\min_w\, p(U_g(w))$

Define the AW statistic $V_g = p(U_g(w_g^*))$.

Note: The search space of w needs to be specified. In the following we only search $w_k \in \{0, 1\}$.

For example, if the pvalues of four studies are (0.03, 0.05, 0.51, 0.45), the above algorithm will select w=(1,1,0,0).

adaptively-weighted statistic
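The weight search for the example p-values (0.03, 0.05, 0.51, 0.45) can be sketched as follows. This sketch swaps the permutation null of the slides for the chi-square reference distribution, which is an assumption made only to keep the example short. Under that reference null, the best 0/1 weight of a given size always keeps the smallest p-values, so only K candidate weights need to be checked. The function names are hypothetical.

```python
import math

def chi2_sf_even(x, df):
    """Survival function of a chi-square variable with even df (closed form)."""
    h = x / 2
    return math.exp(-h) * sum(h**j / math.factorial(j) for j in range(df // 2))

def aw_best_weight(pvals):
    """Return (0/1 weights, p-value) minimizing the chi-square p-value of
    -2 * sum(w_k * log p_k). Assumes a parametric null, not permutation."""
    order = sorted(range(len(pvals)), key=lambda k: pvals[k])  # most significant first
    best_p, best_m, half_stat = 1.0, 0, 0.0
    for m, k in enumerate(order, start=1):
        half_stat -= math.log(pvals[k])          # add the m-th smallest p-value
        p = chi2_sf_even(2 * half_stat, 2 * m)
        if p < best_p:
            best_p, best_m = p, m
    chosen = set(order[:best_m])
    weights = tuple(1 if k in chosen else 0 for k in range(len(pvals)))
    return weights, best_p

weights, p_best = aw_best_weight([0.03, 0.05, 0.51, 0.45])
print(weights, round(p_best, 4))  # (1, 1, 0, 0) 0.0113
```

The selected weight matches the w = (1,1,0,0) quoted on the slide; with a permutation null the p-values would differ, but the search logic is the same.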

Page 40: Genomic meta-analysis in combining expression profiles

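adaptively-weighted statistic
III. Assess q-values of AW statistic:

Step 5: Repeat Step 4 to similarly calculate the permuted version of $V_g$ as the null distribution:

$w_g^{(b)*} = \arg\min_w\, p(U_g^{(b)}(w))$
$V_g^{(b)} = p(U_g^{(b)}(w_g^{(b)*}))$

Step 6: The p-value and FDR of our best-weight method can be estimated similarly:

$p(V_g) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(V_{g'}^{(b)} \le V_g\right)}{B \cdot G}$

$q(V_g) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(V_{g'}^{(b)} \le V_g\right)}{B \cdot \sum_{g'=1}^{G} I\left(p(V_{g'}) \le p(V_g)\right)} = \frac{G \cdot p(V_g)}{\sum_{g'=1}^{G} I\left(p(V_{g'}) \le p(V_g)\right)}$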

Page 41: Genomic meta-analysis in combining expression profiles

Advantages of our proposed method:
• Inference is done through permutation analysis. No parametric assumption.
• Instead of equal weight in Fisher’s method, our method pursues the best weight of study contributions based on data.
• The AW weights provide a natural categorization of biomarkers for further biological investigation.

adaptively-weighted statistic

Page 42: Genomic meta-analysis in combining expression profiles

• Biomarkers detected by Fisher’s method (EW) and ordered by hierarchical clustering.

• Genes are DE in one or more studies but no indication of which ones.

Fisher’s method

Fisher vs AW

Page 43: Genomic meta-analysis in combining expression profiles

• Biomarkers detected by AW method and ordered by hierarchical clustering.

• The optimal weights provide natural categorization and interpretation of biomarkers.

Adaptively weighted (AW)

Fisher vs AW

Page 44: Genomic meta-analysis in combining expression profiles

Vote counting method

• Compute statistical significance of differential expression (p-value) in each study.

• For each gene, count the number of studies that have p-value smaller than a pre-defined threshold (e.g. 0.05).

• Genes with vote count more than a pre-defined threshold (e.g. 5 out of 10 total studies) are considered as significant biomarkers.

Page 45: Genomic meta-analysis in combining expression profiles

Drawback of vote counting method

• It is possible to assess statistical significance of vote counting method by permutation test.
• This method is widely used in biological literature due to its simplicity.
• But vote counting has been criticized in conventional meta-analysis and is an unfavorable approach.
• A gene with weak signal in all studies is interesting but won’t be detected by vote counting, e.g. p-values = (0.1, 0.15, 0.07, 0.12).
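The weak-signal example (0.1, 0.15, 0.07, 0.12) shows the drawback concretely: the gene receives no votes at the 0.05 threshold, yet Fisher's method combines the four p-values into a significant result. The following is a small sketch, assuming the chi-square null for Fisher's method; the function names are illustrative.

```python
import math

def vote_count(pvals, alpha=0.05):
    """Number of studies whose p-value is below the per-study threshold."""
    return sum(p < alpha for p in pvals)

def fisher_pvalue(pvals):
    """Fisher's method with the closed-form chi-square tail for even df = 2K."""
    k = len(pvals)
    half_t = -sum(math.log(p) for p in pvals)
    return math.exp(-half_t) * sum(half_t**j / math.factorial(j) for j in range(k))

pvals = [0.1, 0.15, 0.07, 0.12]
votes = vote_count(pvals)
combined = fisher_pvalue(pvals)
print(votes, round(combined, 3))  # 0 0.022
```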

Page 46: Genomic meta-analysis in combining expression profiles
Page 47: Genomic meta-analysis in combining expression profiles

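REM vs FEM

Random Effects Model (REM):
$y_i = \theta_i + \epsilon_i, \quad \epsilon_i \sim N(0, s_i^2)$
$\theta_i = \mu + \delta_i, \quad \delta_i \sim N(0, \tau^2)$

Fixed Effects Model (FEM), when $\tau^2 = 0$:
$y_i = \mu + \epsilon_i, \quad \epsilon_i \sim N(0, s_i^2)$

How to determine REM or FEM? (Cochran's Q statistic)
$Q = \sum_{i=1}^{K} w_i (y_i - \hat{\mu})^2, \quad w_i = s_i^{-2}, \quad \hat{\mu} = \frac{\sum_i w_i y_i}{\sum_i w_i}$
$Q \sim \chi^2_{K-1}$ under the null hypothesis of homogeneity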
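Cochran's Q test can be sketched directly from its definition: weight each study's effect size by its inverse variance, compute the weighted mean, and compare Q with a chi-square distribution on K − 1 degrees of freedom. The code below is an illustrative sketch; the input values are made up and the helper names are hypothetical. The chi-square tail is computed with the standard incomplete-gamma recurrence so that only the standard library is needed.

```python
import math

def chi2_sf(x, df):
    """Chi-square survival function for integer df >= 1.

    Uses Q(a+1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a+1), starting from
    a = 1 (even df) or a = 1/2 (odd df), where h = x / 2.
    """
    h = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-h)
    else:
        a, q = 0.5, math.erfc(math.sqrt(h))
    while a < df / 2:
        q += h**a * math.exp(-h) / math.gamma(a + 1)
        a += 1
    return q

def cochran_q(effects, std_errors):
    """Return (Q, fixed-effects estimate mu_hat, p-value of the homogeneity test)."""
    w = [1 / s**2 for s in std_errors]                       # w_i = s_i^-2
    mu_hat = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - mu_hat) ** 2 for wi, yi in zip(w, effects))
    return q, mu_hat, chi2_sf(q, len(effects) - 1)

# Hypothetical effect sizes and standard errors for K = 2 studies.
q, mu_hat, p_homog = cochran_q([0.0, 1.0], [1.0, 1.0])
print(q, mu_hat, round(p_homog, 4))
```

A small p-value suggests heterogeneity across studies, which favours REM; otherwise FEM may be adequate.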

Page 48: Genomic meta-analysis in combining expression profiles

Forest plot

Page 49: Genomic meta-analysis in combining expression profiles

Summary

• Theoretically, meta-analysis provides improved statistical power by combining multiple studies with small sample size.

• In practice, different studies are performed on different platforms with different protocols in different labs, and use different patient cohorts.

• Be aware of assumptions behind different methods and the final biological goal.

Page 50: Genomic meta-analysis in combining expression profiles

5. Meta-analysis for detecting pathways (MetaPath)

Kui Shen and George C Tseng*. (2010) Meta-analysis for pathway enrichment analysis when combining multiple microarray studies. Bioinformatics. 26:1316-1323.

Page 51: Genomic meta-analysis in combining expression profiles

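Diagram for enrichment analysis

[Diagram: (I) the array data matrix $x_{gs}$ (genes × samples, 1≤g≤G, 1≤s≤S) and the phenotype $y_s$ yield the gene statistic $t_g$, 1≤g≤G; (II) the gene statistics and the pathway database $z_{gp}$ (genes × pathways, 1≤g≤G, 1≤p≤P) yield the pathway statistic $v_p$, 1≤p≤P.]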

Page 52: Genomic meta-analysis in combining expression profiles

A: MAPE_G

[Diagram: (I) for each study k = 1, …, K, the array data matrix $x_{kgs}$ (genes × samples, 1≤g≤G, 1≤s≤S) and the phenotype $y_{ks}$ give an association score with phenotype $t_{gk}$, 1≤g≤G; (II) meta-analysis at the gene level combines $t_{g1}, \dots, t_{gK}$ into $u_g$, 1≤g≤G; (III) $u_g$ and the pathway database $z_{gp}$ (1≤g≤G, 1≤p≤P) give the pathway enrichment evidence score $v_p$, 1≤p≤P, using MAPE_G.]

Page 53: Genomic meta-analysis in combining expression profiles

Procedures of MAPE_G

I. For a given study k, compute p-values of differential expression:

1. Compute the t-statistic, $t_{gk}$, of gene g in study k, where 1 ≤ g ≤ G, 1 ≤ k ≤ K.
2. Permute group labels in each study B times, and calculate the permuted statistics, $t_{gk}^{(b)}$, where 1 ≤ b ≤ B.
3. Estimate the p-value of $t_{gk}$ as
$p(t_{gk}) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(t_{g'k}^{(b)} \ge t_{gk}\right)}{B \cdot G}$
and the p-value of $t_{gk}^{(b)}$ as
$p(t_{gk}^{(b)}) = \frac{\sum_{b'=1}^{B}\sum_{g'=1}^{G} I\left(t_{g'k}^{(b')} \ge t_{gk}^{(b)}\right)}{B \cdot G}$

II. Meta-analysis:

1. The maximum p-value statistic (maxP), $u_g = \max_{1 \le k \le K} p(t_{gk})$, is applied for the meta-analysis. Similarly, $u_g^{(b)} = \max_{1 \le k \le K} p(t_{gk}^{(b)})$.
2. Estimate the p-value of the maxP statistic as
$p(u_g) = \frac{\sum_{b=1}^{B}\sum_{g'=1}^{G} I\left(u_{g'}^{(b)} \le u_g\right)}{B \cdot G}$

III. Enrichment analysis:

1. Given a pathway p, compute $v_p$, the KS statistic for gene set enrichment based on $p(u_g)$.
2. Permute gene labels B times, and calculate the permuted statistics, $v_p^{(b)}$, 1 ≤ b ≤ B.
3. Estimate the p-value of pathway p as
$p(v_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(v_{p'}^{(b)} \ge v_p\right)}{B \cdot P}$
and similarly calculate
$p(v_p^{(b)}) = \frac{\sum_{b'=1}^{B}\sum_{p'=1}^{P} I\left(v_{p'}^{(b')} \ge v_p^{(b)}\right)}{B \cdot P}$
4. Estimate the q-value of pathway p as
$q(v_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(v_{p'}^{(b)} \ge v_p\right)}{B \cdot \sum_{p'=1}^{P} I\left(v_{p'} \ge v_p\right)}$
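The pathway-level p-value and q-value estimates of the enrichment step (III.3 and III.4 of the MAPE_G procedure) can be sketched from a vector of observed pathway statistics and a B × P table of permuted statistics. The code below is an illustration with made-up numbers; the function name is hypothetical, and capping the q-value at 1 is an added convention that the slides do not mention.

```python
def permutation_p_and_q(observed, permuted):
    """observed: list of P pathway statistics v_p (larger = more enriched).
    permuted: list of B lists, each with P permuted statistics v_p^(b).
    Returns (p-values, q-values) from the pooled permutation null."""
    n_perm, n_path = len(permuted), len(observed)
    null = [v for perm in permuted for v in perm]
    pvals, qvals = [], []
    for v in observed:
        exceed = sum(x >= v for x in null)       # sum_b sum_p' I(v_p'^(b) >= v_p)
        called = sum(x >= v for x in observed)   # sum_p' I(v_p' >= v_p), at least 1
        pvals.append(exceed / (n_perm * n_path))
        qvals.append(min(1.0, exceed / (n_perm * called)))  # cap at 1 (assumption)
    return pvals, qvals

observed = [0.9, 0.4, 0.2]                          # toy KS statistics, P = 3
permuted = [[0.3, 0.5, 0.1], [0.2, 0.95, 0.35]]     # B = 2 gene-label permutations
pvals, qvals = permutation_p_and_q(observed, permuted)
print(pvals, qvals)
```

The numerator of the q-value estimates the expected number of false calls at the observed threshold, and the denominator counts the pathways actually called at that threshold.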

Page 54: Genomic meta-analysis in combining expression profiles

B: MAPE_P

[Diagram: (I) for each study k = 1, …, K, the array data matrix $x_{kgs}$ (genes × samples, 1≤g≤G, 1≤s≤S) and the phenotype $y_{ks}$ give an association score with phenotype $t_{gk}$, 1≤g≤G; (II) within each study, $t_{gk}$ and the pathway database $z_{gp}$ (1≤g≤G, 1≤p≤P) give the pathway enrichment evidence score $v_{pk}$, 1≤p≤P; (III) meta-analysis across studies combines $v_{p1}, \dots, v_{pK}$ into the pathway enrichment evidence score $w_p$, 1≤p≤P, using MAPE_P.]

Page 55: Genomic meta-analysis in combining expression profiles

Procedures of MAPE_P

I. Pathway enrichment analysis:

1. For each study k, calculate $p(t_{gk})$, the p-value of gene g, by Student t-test, 1 ≤ g ≤ G.
2. Given a pathway p, compute the KS statistic $v_{pk}$ that compares the p-values ($p(t_{gk})$) inside and outside the pathway.
3. Permute gene labels B times, and calculate the permuted statistics, $v_{pk}^{(b)}$, 1 ≤ b ≤ B.
4. Estimate the p-value of the KS statistic in pathway p and study k as
$p(v_{pk}) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(v_{p'k}^{(b)} \ge v_{pk}\right)}{B \cdot P}$
and similarly calculate
$p(v_{pk}^{(b)}) = \frac{\sum_{b'=1}^{B}\sum_{p'=1}^{P} I\left(v_{p'k}^{(b')} \ge v_{pk}^{(b)}\right)}{B \cdot P}$

II. Meta-analysis:

1. The maximum p-value statistic (maxP) is applied for meta-analysis: $w_p = \max_{1 \le k \le K} p(v_{pk})$ and $w_p^{(b)} = \max_{1 \le k \le K} p(v_{pk}^{(b)})$.
2. Estimate the p-value of $w_p$ as
$p(w_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(w_{p'}^{(b)} \le w_p\right)}{B \cdot P}$
Similarly,
$p(w_p^{(b)}) = \frac{\sum_{b'=1}^{B}\sum_{p'=1}^{P} I\left(w_{p'}^{(b')} \le w_p^{(b)}\right)}{B \cdot P}$
3. Estimate the q-value as
$q(w_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(w_{p'}^{(b)} \le w_p\right)}{B \cdot \sum_{p'=1}^{P} I\left(w_{p'} \le w_p\right)}$
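The KS statistic that compares gene p-values inside and outside a pathway can be sketched as the largest gap between the two empirical distribution functions. The slides do not say which KS variant is used, so the two-sided two-sample distance below is an assumption, and the gene names, p-values and function names are made up for illustration.

```python
def ks_statistic(inside, outside):
    """Two-sample KS distance: max |F_in(x) - F_out(x)| over all observed values."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(inside, x) - ecdf(outside, x)) for x in inside + outside)

def pathway_ks(gene_pvalues, pathway_genes):
    """Split gene-level p-values by pathway membership and compare the two groups."""
    inside = [p for g, p in gene_pvalues.items() if g in pathway_genes]
    outside = [p for g, p in gene_pvalues.items() if g not in pathway_genes]
    return ks_statistic(inside, outside)

# Hypothetical gene-level p-values for one study.
gene_pvalues = {"g1": 0.001, "g2": 0.01, "g3": 0.4, "g4": 0.6, "g5": 0.9}
enriched = pathway_ks(gene_pvalues, {"g1", "g2"})   # both small p-values in the pathway
mixed = pathway_ks(gene_pvalues, {"g1", "g5"})      # one small and one large p-value
print(enriched, mixed)
```

A pathway whose genes carry the smallest p-values gives the maximum distance of 1, while a pathway with mixed evidence scores lower.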

Page 56: Genomic meta-analysis in combining expression profiles

Why two procedures?

Complementary advantages of MAPE_G vs MAPE_P

An example: AANU and HCTU gene sets

Page 57: Genomic meta-analysis in combining expression profiles

Why two procedures?

Page 58: Genomic meta-analysis in combining expression profiles

Why two procedures?

Page 59: Genomic meta-analysis in combining expression profiles

MAPE results

Page 60: Genomic meta-analysis in combining expression profiles

Combine MAPE_G and MAPE_P

MAPE_G and MAPE_P have complementary strengths. We are interested in pathways identified by both methods.

Page 61: Genomic meta-analysis in combining expression profiles

The procedure of MAPE_I

[Diagram: panel A (MAPE_G) and panel B (MAPE_P) start from the same K studies (array data matrices $x_{kgs}$ with phenotypes $y_{ks}$) and the same pathway database $z_{gp}$, and produce the pathway enrichment evidence scores $v_p$ and $w_p$ respectively through their steps I to III; panel C (MAPE_I) combines $v_p$ and $w_p$ into the pathway enrichment evidence score $s_p$, 1≤p≤P.]

Page 62: Genomic meta-analysis in combining expression profiles

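Procedures of MAPE_I

1. Let $s_p = \min\left(p(v_p), p(w_p)\right)$ and $s_p^{(b)} = \min\left(p(v_p^{(b)}), p(w_p^{(b)})\right)$ from the procedures in MAPE_G and MAPE_P.

2. Estimate the p-value as
$p(s_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(s_{p'}^{(b)} \le s_p\right)}{B \cdot P}$

3. Estimate the q-value as
$q(s_p) = \frac{\sum_{b=1}^{B}\sum_{p'=1}^{P} I\left(s_{p'}^{(b)} \le s_p\right)}{B \cdot \sum_{p'=1}^{P} I\left(s_{p'} \le s_p\right)}$

$\{p : q(s_p) \le 0.05\}$ are the enriched pathways identified by MAPE_I.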
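The MAPE_I combination can be sketched in a few lines: take the smaller of the two pathway p-values from MAPE_G and MAPE_P, do the same for every permutation, and call a pathway enriched when its estimated q-value is at most 0.05. The numbers below are invented, the function names are hypothetical, and capping q at 1 is an added convention.

```python
def mape_i_statistic(p_v, p_w):
    """s_p = min(p(v_p), p(w_p)) for each pathway p."""
    return [min(a, b) for a, b in zip(p_v, p_w)]

def permutation_q(observed, permuted):
    """q(s_p) = sum_b sum_p' I(s_p'^(b) <= s_p) / (B * sum_p' I(s_p' <= s_p)).
    Smaller s_p means stronger evidence; the result is capped at 1 (assumption)."""
    n_perm = len(permuted)
    null = [s for perm in permuted for s in perm]
    return [
        min(1.0, sum(x <= s for x in null) / (n_perm * sum(x <= s for x in observed)))
        for s in observed
    ]

# Hypothetical pathway p-values from MAPE_G (p_v) and MAPE_P (p_w), P = 3 pathways.
s = mape_i_statistic([0.001, 0.30, 0.80], [0.20, 0.002, 0.60])
# Hypothetical permuted counterparts, B = 2.
s_perm = [
    mape_i_statistic([0.5, 0.3, 0.9], [0.6, 0.8, 0.95]),
    mape_i_statistic([0.9, 0.7, 0.25], [0.4, 0.75, 0.5]),
]
q = permutation_q(s, s_perm)
enriched = [p for p, qp in enumerate(q) if qp <= 0.05]
print(s, q, enriched)
```

In this toy input the first pathway is supported only by MAPE_G and the second only by MAPE_P, and both are retained, which is the complementary behaviour MAPE_I is designed to capture.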

Page 63: Genomic meta-analysis in combining expression profiles

Scenario 1 (different degrees of effect sizes)

MAPE_G has better power in large or small α.

Page 64: Genomic meta-analysis in combining expression profiles

Scenario 1 (different degrees of effect sizes)

• Red line: power of MAPE_I
• Blue line: power of MAPE_P
• Green line: power of MAPE_G

Page 65: Genomic meta-analysis in combining expression profiles

Scenario 1 (different degrees of effect sizes)

1. When the effect size is low (0.5 ≤ θ1 = θ2 ≤ 1), MAPE_G is more powerful than MAPE_P, particularly when the pathway enrichment strength is not high.

2. When the effect size is large (1.5 ≤ θ1 = θ2 ≤ 4), MAPE_G is more powerful than MAPE_P when the array coverage rate is high (between 0.7 and 1) and the pathway enrichment strength is low (between 0.15 and 0.2).

3. This shows the complementary advantages of MAPE_G vs. MAPE_P.

[Figure panels for effect sizes 0.5, 0.75, 1, 1.5, 2 and 4]

Page 66: Genomic meta-analysis in combining expression profiles

Scenario 1 (different degrees of effect sizes)

MAPE_I (red) consistently achieves statistical power close to the better of MAPE_G and MAPE_P, which makes it a good hybrid method for integrating the complementary advantages of the two.

[Figure panels for effect sizes 0.5, 0.75, 1, 1.5, 2 and 4]

Page 67: Genomic meta-analysis in combining expression profiles

Summary

• MAPE_P and MAPE_G have complementary advantages depending on data structure.

• The hybrid form MAPE_I, which integrates the advantages of both approaches, is usually recommended.

• Our “MetaPath” package in R provides convenient routines to use in applications.