small sample aic lecture, slides only
TRANSCRIPT
8/14/2019 Small sample AIC lecture, slides only
http://slidepdf.com/reader/full/small-sample-aic-lecture-slides-only 1/69
Information-theoretic analysis of -omics data
An introduction
David R. Bickel
University of Ottawa
17 November 2008
David Bickel (uOttawa) Information theory 17 November 2008 1 / 11
Today’s class
Differential gene/protein/metabolite expression
Which genes express differently between treatment and control?
Examples of "treatments"
Medical: drug or chemotherapy applied to some patients
Basic: hormone or other chemical added to some cell cultures
Other examples?
How much information or evidence is in the measurements
for differential expression?
for equivalent expression?
Pick the differentially expressed genes
What is differential gene/protein/metabolite expression?
An average expression ratio of 1 indicates equivalent expression
Two types of differential expression
An average expression ratio less than 1 indicates under-expression
An average expression ratio greater than 1 indicates over-expression
"Average expression" is over the population, not just the observed data
The histogram of a large expression data set resembles the true distribution
Gene expression ratios measured by microarrays
A sample from the treatment group and a sample from the control group are hybridized to the same microarray slide
Each gene’s expression ratio is a measurement of its expression in the treatment group relative to its expression in the control group
Based on the expression data, which genes are differentially expressed?
                   data set #1   data set #2   data set #4   data set #6
data (n = 3)
data (n = 6)
model (n = 3)
model (n = 6)
evidence (n = 3)
evidence (n = 6)

For each data set, indicate whether the gene is equivalently expressed (E) or differentially expressed (D) according to the plot of the data, according to the model, and according to the evidence, for each number of observations (3 or 6). Equivalent expression means the average expression ratio is 1.
Statistical models
p stands for the number of unknown parameters in a model
Equivalent expression model
Unknown variability of expression
Expression ratio known to be 1
One unknown parameter (p = 1)
Differential expression model
Unknown variability of expression
Unknown expression ratio
Two unknown parameters (p = 2)
How do the model plots change your initial assessments?
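The two models can be written down concretely. The following is a minimal sketch, not the lecture's own code: it assumes each measurement is converted to a natural-log expression ratio (so a ratio of 1 becomes 0), that errors are normal, and that unknown parameters are estimated by maximum likelihood. The function names and the example ratios are hypothetical.

```python
import math

def fit_equivalent(log_ratios):
    """Equivalent expression model: mean log ratio fixed at 0, variance unknown (p = 1)."""
    n = len(log_ratios)
    variance = sum(x ** 2 for x in log_ratios) / n
    return {"p": 1, "mean": 0.0, "variance": variance}

def fit_differential(log_ratios):
    """Differential expression model: mean and variance both unknown (p = 2)."""
    n = len(log_ratios)
    mean = sum(log_ratios) / n
    variance = sum((x - mean) ** 2 for x in log_ratios) / n
    return {"p": 2, "mean": mean, "variance": variance}

# Hypothetical measured expression ratios for one gene (n = 3)
ratios = [1.2, 0.9, 1.5]
log_ratios = [math.log(r) for r in ratios]

eq = fit_equivalent(log_ratios)
de = fit_differential(log_ratios)
print("equivalent model:  ", eq)
print("differential model:", de)
# Back-transforming the estimated mean log ratio gives a geometric-mean ratio
print("estimated ratio under the differential model:", math.exp(de["mean"]))
```

The only difference between the two fits is whether the mean log ratio is pinned at 0 or estimated, which is exactly the one extra unknown parameter counted by p.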
Balancing complexity and fit
The differential expression model (p = 2) is more complex than the equivalent expression model (p = 1)
More complex models tend to fit data better than simple models, even if the simple models are better
Overly complex models make poor generalizations
A sample of patients may not represent the population
A single experiment may not reflect typical biological processes
$$\frac{\text{Fit}}{\text{Complexity}} = \text{Evidence}$$
How does balancing fit with complexity change your assessments?
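The point that a more complex model fits at least as well, even when the simple model is the true one, can be checked by simulation. This is a minimal sketch under stated assumptions: log ratios are drawn from a normal distribution with true mean 0 (equivalent expression) and standard deviation 1, with n = 6 per sample; the seed, the number of trials, and the helper name `mse` are arbitrary choices for illustration.

```python
import random

random.seed(0)  # arbitrary seed so the run is repeatable

def mse(values, center):
    """Mean of squared errors of a model that predicts `center` for every value."""
    return sum((v - center) ** 2 for v in values) / len(values)

trials = 1000
never_worse = 0
for _ in range(trials):
    # Simulated log ratios for a gene that is truly equivalently expressed
    sample = [random.gauss(0.0, 1.0) for _ in range(6)]
    mse_equivalent = mse(sample, 0.0)                           # mean fixed at 0
    mse_differential = mse(sample, sum(sample) / len(sample))   # mean estimated
    # Small tolerance guards against floating-point rounding
    if mse_differential <= mse_equivalent + 1e-12:
        never_worse += 1

print(never_worse, "of", trials, "samples: the complex model fit at least as well")
```

Because the estimated mean minimizes the squared error, the differential model can only match or beat the equivalent model on fit alone, which is why fit has to be divided by a complexity term before it counts as evidence.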
Quality of model fit to the data
n = sample size
number of measured expression ratios
MSE = mean of squared errors of the model
degree to which the model disagrees with the observed data (log scale)
$$\text{Fit} = \left(\frac{1}{\sqrt{\text{MSE}}}\right)^{n}$$
degree to which the model fits the observed data (assuming a normal distribution)
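The Fit quantity can be computed directly from the definition, Fit = (1/sqrt(MSE))^n. The sketch below also illustrates the "assuming a normal distribution" remark: with the variance set to the MSE, the normal log-likelihood differs from ln(Fit) only by a constant, (n/2) ln(2πe), that is the same for every model fitted to the same n measurements. The data values and function names are hypothetical, and natural logarithms are assumed.

```python
import math

def fit(log_ratios, predicted):
    """Fit = (1 / sqrt(MSE)) ** n, with errors measured on the log scale."""
    n = len(log_ratios)
    mse = sum((x - predicted) ** 2 for x in log_ratios) / n
    return (1.0 / math.sqrt(mse)) ** n

def max_normal_log_likelihood(log_ratios, predicted):
    """Normal log-likelihood with the variance set to its maximum-likelihood value (the MSE)."""
    n = len(log_ratios)
    mse = sum((x - predicted) ** 2 for x in log_ratios) / n
    return sum(-0.5 * math.log(2 * math.pi * mse) - (x - predicted) ** 2 / (2 * mse)
               for x in log_ratios)

# Hypothetical log expression ratios (n = 6)
log_ratios = [0.3, -0.1, 0.5, 0.2, 0.4, 0.1]
n = len(log_ratios)
sample_mean = sum(log_ratios) / n

offsets = {}
for label, predicted in [("equivalent", 0.0), ("differential", sample_mean)]:
    value = fit(log_ratios, predicted)
    # ln(Fit) minus the maximized log-likelihood: a constant that depends only on n
    offsets[label] = math.log(value) - max_normal_log_likelihood(log_ratios, predicted)
    print(label, "Fit =", value, "offset =", offsets[label])
```

Since the offset is identical for both models, ratios of Fit values equal ratios of maximized normal likelihoods.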
Model complexity
n = sample size
number of measured expression ratios
p = model dimension
number of unknown parameters in the model
p = 1 for the equivalent expression model
p = 2 for the differential expression model
$$p_c = p + \frac{p\,(p + 1)}{2\,(n - p + 1)}$$
effective number of parameters in the model (corrected for small n)
$$\text{Complexity} = 2.718^{\,p_c}$$
$$\frac{\text{Fit}}{\text{Complexity}} = \text{Evidence}$$
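The complexity penalty and the resulting evidence can be sketched in a few lines. One assumption needs flagging: the slide text has lost its minus signs, so the denominator of the correction is read here as 2(n − p + 1); that reading stays finite for the n = 3, p = 2 case used in the exercise, but it should be checked against the original slides. The function names and the MSE values in the example are hypothetical.

```python
import math

def effective_parameters(p, n):
    """p_c = p + p(p + 1) / (2(n - p + 1)); the minus sign is an assumed reading of the slide."""
    return p + p * (p + 1) / (2 * (n - p + 1))

def complexity(p, n):
    """Complexity = e ** p_c (the slide writes e as 2.718)."""
    return math.exp(effective_parameters(p, n))

def evidence(mse, p, n):
    """Evidence = Fit / Complexity, with Fit = (1 / sqrt(MSE)) ** n."""
    fit = (1.0 / math.sqrt(mse)) ** n
    return fit / complexity(p, n)

# How the extra penalty for the second parameter changes with sample size
for n in (3, 6, 100):
    penalty_gap = effective_parameters(2, n) - effective_parameters(1, n)
    print(n, effective_parameters(1, n), effective_parameters(2, n), penalty_gap)

# Hypothetical MSE values for one gene with n = 6 measurements
weight = evidence(mse=0.04, p=2, n=6) / evidence(mse=0.09, p=1, n=6)
print("weight of evidence for differential expression:", weight)
```

Under this reading the penalty gap between the two models approaches 1 as n grows, so for large samples the differential model is charged roughly a factor of e for its extra parameter, and more than that for small samples.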
Answers
How do our analyses compare to the truth?
If a statistical method says an equivalently expressed gene is differentially expressed, is the method useless?
If a statistical method says a differentially expressed gene is equivalently expressed, is the method useless?
The advantage of obtaining more data
The best possible assessment given the available data
How confident should you be in your assessments?
Should you obtain more data before making an assessment?
The expression data sets

             data set #1   data set #2       data set #4    data set #6
ratio        1             2                 1              1.4
expression   equivalent    differential      equivalent     differential
n = 10       0.44/1.38     0.14/0.09         0.14/0.17      0.19/0.37 **
n = 25       0.29/0.71     0.03/0.002        4.77/1.00 *    0.05/0.04
n = 100      36/691        10^4 / 2×10^7 †   16/32          0.03/0.01

Key
n is the number of observed expression ratios.
Each ratio is (Evidence differentially expressed) / (Evidence equivalently expressed), the weight of evidence favoring differential expression over equivalent expression.
* misleading evidence for differential expression
** misleading evidence for equivalent expression
† signs of the exponents are illegible in the source
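The key can be applied mechanically to the table above. The sketch below copies the legible cells as (evidence for differential, evidence for equivalent) pairs, forms each weight of evidence, and flags a cell as misleading when the favored model disagrees with the true expression status listed in the table. The n = 100 entry for data set #2 is left out because its exponents cannot be recovered from the slide text; the names `table`, `truth`, and `assess` are illustrative.

```python
# True expression status of each data set, from the "expression" row of the table
truth = {"#1": "equivalent", "#2": "differential", "#4": "equivalent", "#6": "differential"}

# (evidence differentially expressed, evidence equivalently expressed) per cell
table = {
    10: {"#1": (0.44, 1.38), "#2": (0.14, 0.09), "#4": (0.14, 0.17), "#6": (0.19, 0.37)},
    25: {"#1": (0.29, 0.71), "#2": (0.03, 0.002), "#4": (4.77, 1.00), "#6": (0.05, 0.04)},
    100: {"#1": (36, 691), "#4": (16, 32), "#6": (0.03, 0.01)},
}

def assess(n, data_set):
    """Return (weight of evidence, favored model, whether the evidence is misleading)."""
    differential, equivalent = table[n][data_set]
    weight = differential / equivalent
    favored = "differential" if weight > 1 else "equivalent"
    return weight, favored, favored != truth[data_set]

misleading = []
for n, row in table.items():
    for data_set in row:
        weight, favored, is_misleading = assess(n, data_set)
        print(n, data_set, round(weight, 3), favored, "MISLEADING" if is_misleading else "")
        if is_misleading:
            misleading.append((n, data_set))
```

Among the legible cells, none of the n = 100 results is misleading, which matches the slide's point about the advantage of obtaining more data.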
Further study
The method presented is based on the Akaike information criterion (AIC) after correcting it for small numbers of measurements
$$\text{AIC}_c = -2 \ln(\text{Evidence})$$
Software packages with the AIC but without the correction may be unreliable for small numbers of observations (n < 40)
Kenneth Burnham and David Anderson, Model Selection and Multi-Model Inference
These slides and figures will be on the lab website
www.statomics.com
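The relation between AICc and Evidence can be unpacked using the definitions from the earlier slides, Fit = (1/√MSE)^n and Complexity = 2.718^{p_c}. A short derivation, treating 2.718 as e and using natural logarithms:

$$
\begin{aligned}
-2\ln(\text{Evidence}) &= -2\ln(\text{Fit}) + 2\ln(\text{Complexity}) \\
&= -2\ln\!\left[\left(\frac{1}{\sqrt{\text{MSE}}}\right)^{n}\right] + 2\ln\!\left(2.718^{\,p_c}\right) \\
&\approx n\ln(\text{MSE}) + 2\,p_c
\end{aligned}
$$

Read this way, n ln(MSE) is the lack-of-fit term and 2 p_c is the complexity penalty, so a lower AICc corresponds to higher Evidence.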