charlotte soneson - bioconductor · 2019. 7. 24. · control treat.gtcontrol.a treat.gtcontrol.b...
TRANSCRIPT
![Page 1: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/1.jpg)
Experimental design
Charlotte Soneson
Friedrich Miescher Institute for Biomedical Research & SIB Swiss Institute of Bioinformatics
@CSoneson
![Page 2: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/2.jpg)
“To call in the statistician after the experiment is done may be no more than asking him to perform a postmortem examination:
he may be able to say what the experiment died of.”
Sir Ronald Fisher, Indian Statistical Congress, Sankhya, around 1938
![Page 3: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/3.jpg)
[Lazic, 2016]
Different types of experiments
![Page 4: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/4.jpg)
http://www.stats.gla.ac.uk/steps/glossary/anova.html#expdes
What is experimental design?
The organization of an experiment, to ensure that the right type of data, and enough of it, is available to answer the questions of interest as clearly and efficiently as possible.
![Page 5: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/5.jpg)
Bad experimental design - examples
Analysis batch I / Study center I / Processing protocol I ...
Tr Tr Tr Tr Tr Tr Tr Tr
Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl
Analysis batch II / Study center II / Processing protocol II ...
What is bad experimental design?
![Page 6: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/6.jpg)
Bad experimental design - examples
Analysis batch I / Study center I / Processing protocol I ...
Tr Tr Tr Tr Tr Tr Tr Tr
Ctl Ctl Ctl Ctl Ctl Ctl Ctl Ctl
Analysis batch II / Study center II / Processing protocol II ...
What is bad experimental design?
Confounding!
![Page 7: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/7.jpg)
Akey et al., Nature Genetics 2007; Spielman et al., Nature Genetics 2007
What can happen with bad experimental design?
• Example: gene expression study comparing 60 CEU and 82 ASN HapMap individuals
• 26% of the genes were found to be significantly differentially expressed (78% with less restrictive multiple testing correction)
• But: all CEU samples were processed (sometimes years) before all the ASN samples!
![Page 8: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/8.jpg)
Akey et al., Nature Genetics 2007; Spielman et al., Nature Genetics 2007
What can happen with bad experimental design?
• Example: gene expression study comparing 60 CEU and 82 ASN HapMap individuals
• 26% of the genes were found to be significantly differentially expressed (78% with less restrictive multiple testing correction)
• But: all CEU samples were processed (sometimes years) before all the ASN samples!
Confounding!
![Page 9: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/9.jpg)
Akey et al., Nature Genetics 2007; Spielman et al., Nature Genetics 2007
What can happen with bad experimental design?
78% differentially expressed 96% differentially expressed
Comparing CEU and ASN Comparing processing times
![Page 10: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/10.jpg)
What would be a better experimental design?
• Process all samples at the same time/in one batch (not always feasible)
• Minimize confounding as much as possible through
• blocking
• randomization
• Batch effects may still be present, but with an appropriate design we can account for them
![Page 11: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/11.jpg)
Treated Untreated
Gen
e ex
pres
sion
Nonzero batch effect Zero treatment effect
Gen
e ex
pres
sion
Non-confounded design
Batch A Batch BBatch A Batch B
Confounded design
![Page 12: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/12.jpg)
Batch A Batch B
Treated Untreated
Gen
e ex
pres
sion
Non-confounded design
Nonzero batch effect Nonzero treatment effect
Batch A Batch B
Gen
e ex
pres
sion
Confounded design
![Page 13: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/13.jpg)
Dealing with batch effects
• In statistical modeling, batch effects can be included as covariates (additional predictors) in the model.
• For exploratory analysis, we often attempt to “eliminate” or “adjust for” such unwanted variation in advance, by subtracting the estimated effect from each variable (e.g. the expression of a gene).
• Even partial confounding between batch and signal of interest can lead to problems.
Nygaard et al., Biostatistics 2016
![Page 14: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/14.jpg)
Akey et al., Nature Genetics 2007; Spielman et al., Nature Genetics 2007
What can happen with bad experimental design?
78% differentially expressed 96% differentially expressed
Comparing CEU and ASN Comparing processing times
![Page 15: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/15.jpg)
p-values from test comparing CEU and ASN, after controlling for the processing year
0% differentially expressedAkey et al., Nature Genetics 2007
“Batch effect correction” won’t work here
![Page 16: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/16.jpg)
Danielsson et al., Briefings in Bioinformatics 2015
color = tissue; symbol = study (batch)
Accounting for batch effects in practice
Public, processed RNA-seq data from 3 tissues, 4 studies show strong “study” (=batch) signal
![Page 17: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/17.jpg)
Danielsson et al., Briefings in Bioinformatics 2015
color = tissue; symbol = study (batch)
Accounting for batch effects in practice
Accounting for the batch effect brings out the signal of interest
![Page 18: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/18.jpg)
Leek et al., Nature Reviews Genetics 2010
Batch effect adjustment vs normalization
Batch effect adjustment goes beyond the “global” between-sample normalization methods
![Page 19: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/19.jpg)
Leek et al., Nature Reviews Genetics 2010
Batch effect adjustment vs normalization
Batch effect adjustment goes beyond the “global” between-sample normalization methods
![Page 20: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/20.jpg)
• Replicates are necessary to estimate within-condition variability.
• Variability estimates are, in turn, vital for statistical testing.
Gen
e ex
pres
sion
Group 1 Group 2
Other design issues: replication
![Page 21: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/21.jpg)
• Replicates are necessary to estimate within-condition variability.
• Variability estimates are, in turn, vital for statistical testing.
Other design issues: replication
Gen
e ex
pres
sion
Group 1 Group 2
![Page 22: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/22.jpg)
• Replicates are necessary to estimate within-condition variability.
• Variability estimates are, in turn, vital for statistical testing.
Gen
e ex
pres
sion
Group 1 Group 2
Other design issues: replication
![Page 23: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/23.jpg)
[Lazic, 2016]
Different types of units
• Biological units (BU) - entities we want to make inferences about (e.g., animal, person)
• Experimental units (EU) - smallest entities that can be independently assigned to a treatment (e.g., animal, litter, cage, well)
• Observational units (OU) - entities at which measurements are made
![Page 24: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/24.jpg)
[Lazic, 2016]
Biological vs experimental units
![Page 25: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/25.jpg)
[Lazic, 2016]
Pseudoreplication
• “Artificial inflation of the sample size, that usually occurs when the biological unit of interest differs from the experimental unit or observational unit.”
• Only replication of experimental units is true replication
• To make a general statement about the effect of an intervention on a biological unit, we need to replicate the number of such units
![Page 26: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/26.jpg)
• Testing is done separately for each gene
• We must tell the packages which model to fit (e.g. which predictors to use)
• The design does not follow “automatically” from having the sample annotation table - many different designs are often possible
• Model formulas in R:
• Fit a separate model for each gene - response variable changes. Specify only predictors
response variable ⇠ predictors
Model formulas and design matrices
![Page 27: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/27.jpg)
Examples
![Page 28: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/28.jpg)
• After fitting the model(s), we must decide which coefficient (or combination thereof) we want to apply a hypothesis test for.
• Combinations of coefficients are called contrasts.
• Design matrices can often be defined in many equivalent ways - important that the contrast is defined accordingly!
Testing and contrasts
![Page 29: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/29.jpg)
Examples
![Page 30: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/30.jpg)
• A design matrix contains the values of the predictor variables for each sample
e.g.: (log) expression values for a given gene
coefficients0
BBBBBB@
y1y2y3y4y5y6
1
CCCCCCA=
0
BBBBBB@
1 01 01 01 11 11 1
1
CCCCCCA
✓�0
�1
◆+
0
BBBBBB@
"1"2"3"4"5"6
1
CCCCCCA= X� + "
yi = �0 + �1xi + "i
Model formulas and design matrices
![Page 31: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/31.jpg)
Many ways of modeling the same expected values
• 1 predictor, 2 groups
b0 b0 + b1
group 1 group 2
b0 b1
group 1 group 2
the coefficients mean different things in the different cases!
• 2 predictors, 2*2 groups
b0 b0+b1
b0+b2 b0+b1+ b2+b3
b0 b1
b2 b3
b0 b0+b1
b0+b2 b0+b2+ b3
X
X
X
X
X
~X
~0 + X
~X*Y ~X + Y + X:Y
~0 + XY
Y
Y
Y
~X + X:Y
New variable, combining X
and Y
Explore model matrices with https://github.com/csoneson/ExploreModelMatrix
![Page 32: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/32.jpg)
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
treatmentcontrol treatmenttreated
⇠ 0 + treatment
Model formulas and design matrices - example 1 One predictor, two levels (without intercept)
![Page 33: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/33.jpg)
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
● ●
●
01
23
4
treatment
y
control treated
treatmentcontroltreatmenttreated
![Page 34: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/34.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 35: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/35.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 36: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/36.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 37: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/37.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 38: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/38.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 39: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/39.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 40: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/40.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
1 * Intercept + 0 * treatmenttreated
1 * Intercept + 1 * treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 41: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/41.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treated
InterceptIntercept +
treatmenttreated
Model formulas and design matrices - example 1 One predictor, two levels (with intercept)
![Page 42: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/42.jpg)
●
●
●
●
●
●
●
●
●●
●
●
●
●●
●
●
● ●
●
01
23
4
treatment
y
control treated
(Intercept)treatmenttreated
![Page 43: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/43.jpg)
Sample table:
Formula:
Design matrix:
Modeled values:
s1 s2 s3 s4 s5 s6
Intercept + 21 * age
Intercept + 12 * age
Intercept + 64 * age
Intercept + 44 * age
Intercept + 19 * age
Intercept + 26 * age
⇠ age
Model formulas and design matrices - example 2 One continuous predictor
![Page 44: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/44.jpg)
⇠ treatment
Sample table:
Formula:
Design matrix:
Modeled values:
control treatA treatB
InterceptIntercept +
treatmenttreatAIntercept +
treatmenttreatB
Model formulas and design matrices - example 3 One predictor, three levels
![Page 45: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/45.jpg)
⇠ sample + treatment
Sample table:
Formula:
Design matrix:
Modeled values:s1 s2 s3
control Intercept Intercept + samples2
Intercept + samples3
treated Intercept + treatmenttreated
Intercept + samples2 +
treatmenttreated
Intercept + samples3 +
treatmenttreated
Model formulas and design matrices - example 4 One predictor, paired data (or two predictors)
![Page 46: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/46.jpg)
Sample table:
Formula:
Design matrix:
Modeled values:genotype A genotype B
control Intercept Intercept + genotypeB
treated Intercept + treatmenttreated
Intercept + genotypeB + treatmenttreated
⇠ genotype + treatment
Model formulas and design matrices - example 4 One predictor, paired data (or two predictors)
![Page 47: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/47.jpg)
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
01
23
45
treatment/genotype
y
control.A treated.A control.B treated.B
(Intercept)treatmenttreatedgenotypeB
![Page 48: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/48.jpg)
⇠ genotype * treatment
⇠ genotype + treatment + genotype:treatment
Sample table:
Formula:
Design matrix:
Modeled values:genotype A genotype B
control Intercept Intercept + genotypeB
treated Intercept + treatmenttreatedIntercept + genotypeB +
treatmenttreated + genotypeB:treatmenttreated
Model formulas and design matrices - example 5 Two predictors, with interaction
![Page 49: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/49.jpg)
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
01
23
45
treatment/genotype
y
control.A treated.A control.B treated.B
(Intercept)treatmenttreatedgenotypeBtreatmenttreated:genotypeB
![Page 50: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/50.jpg)
Sample table:
Formula:
Design matrix:
Modeled values:genotype A genotype B
control treat.gtcontrol.A treat.gtcontrol.B
treated treat.gttreated.A treat.gttreated.B
⇠ 0 + treat.gt
Model formulas and design matrices - example 6 Two predictors, with interaction
![Page 51: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/51.jpg)
●
●
●
●
●
●
●
●
●●
●
●
●
●●●
●
●●●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●●
01
23
45
treatment/genotype
y
control.A treated.A control.B treated.B
treat.gtcontrol.Atreat.gttreated.Atreat.gtcontrol.Btreat.gttreated.B
![Page 52: Charlotte Soneson - Bioconductor · 2019. 7. 24. · control treat.gtcontrol.A treat.gtcontrol.B treated treat.gttreated.A treat.gttreated.B ⇠ 0+treat.gt Model formulas and design](https://reader036.vdocument.in/reader036/viewer/2022071405/60f997957aa7f973b50b7125/html5/thumbnails/52.jpg)
• Akay et al.: On the design and analysis of gene expression studies in human populations. Nature Genetics 39(7):807-808 (2007) • Nygaard et al.: Methods that remove batch effects while retaining group differences may lead to exaggerated confidence in downstream analyses. Biostatistics 17(1):29-39
(2016) • Danielsson et al.: Assessing the consistency of public human tissue RNA-seq data sets. Briefings in Bioinformatics 16(6):941-949 (2015) • Leek et al.: Tackling the widespread and critical impact of batch effects in high-throughput data. Nature Reviews Genetics 11(10:733-739 (2010) • Schurch et al.: How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use? RNA (2016) • Lazic: Experimental Design for Laboratory Biologists: Maximising Information and Improving Reproducibility. Cambridge University Press (2016).
References