![Page 1: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/1.jpg)
How to get the most out of null results using Bayes
Zoltán Dienes
![Page 2: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/2.jpg)
The problem:
Does a non-significant result count as evidence for the null hypothesis or as no evidence either way?
![Page 3: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/3.jpg)
[Figure: successive experiments plotted against difference in verbal ability (axis from -20 to 40), with H0 and mdiff marked.]
Geoff Cumming: http://www.latrobe.edu.au/psy/esci/index.html
![Page 4: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/4.jpg)
The solutions:
1. Power
2. Interval estimates
3. Bayes Factors
![Page 5: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/5.jpg)
Problems with Power
I) Power depends on specifying the minimal effect of interest (which may be poorly specified by the theory)
II) Power cannot make use of your actual data to determine the sensitivity of those data
Confidence intervals solve the second problem
Bayes Factors can solve both problems
By making use of the full range of predictions of the theory, a Bayes Factor makes maximal use of the data in assessing how sensitively the data distinguish your theory from the null
A Bayes Factor can show strong evidence for the null hypothesis over your theory, when it is impossible to say anything using power or confidence intervals
![Page 6: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/6.jpg)
The four principles of inference by intervals:
[Figure: axis showing the difference between means, with a null region around 0 bounded by the minimal interesting value.]
If the 95% confidence/credibility/likelihood interval is completely contained in the null region, conclude there is good evidence that the population value lies in the null region: accept the null region hypothesis
If the interval is completely outside the null region, conclude there is good evidence that the population value lies outside the null region: reject the null region hypothesis
If the upper limit of the interval is below the minimal interesting value, conclude there is evidence against a theory postulating a positive difference
If the interval includes both null and theoretically interesting values, the data are insensitive
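The four principles can be sketched as a small decision function. This is only an illustration: the function name, argument names, and returned labels are assumptions rather than anything from the slides, and the null region is taken to run from `null_lower` to `null_upper`, with `null_upper` playing the role of the minimal interesting value for a theory predicting a positive difference.

```python
def interval_inference(lower, upper, null_lower, null_upper):
    """Classify an interval estimate against a null region (hypothetical helper)."""
    if lower > upper or null_lower > null_upper:
        raise ValueError("limits must be given in (lower, upper) order")
    # Principle 1: interval completely contained in the null region.
    if null_lower <= lower and upper <= null_upper:
        return "accept null region hypothesis"
    # Principle 2: interval completely outside the null region.
    if upper < null_lower or lower > null_upper:
        return "reject null region hypothesis"
    # Principle 3: upper limit below the minimal interesting value.
    if upper < null_upper:
        return "evidence against a positive difference"
    # Principle 4: interval covers both null and interesting values.
    return "data insensitive"


# Hypothetical intervals checked against a null region of -2 to 2.
for interval in [(-1, 1), (3, 6), (-5, 1), (-1, 5)]:
    print(interval, interval_inference(*interval, null_lower=-2, null_upper=2))
```

The checks are ordered so that an interval lying wholly inside the null region is reported as support for the null region, even though it would also satisfy the third principle.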
![Page 7: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/7.jpg)
The Bayes Factor
![Page 8: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/8.jpg)
![Page 9: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/9.jpg)
Cat hypothesis
Devil hypothesis
![Page 10: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/10.jpg)
![Page 11: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/11.jpg)
If a cat, you lose finger only 1/10 of time
If a devil, you will lose finger 9/10 of time
![Page 12: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/12.jpg)
Evidence supports the theory that most strongly predicted it
John puts his hand in the box and loses a finger.
Which hypothesis is most strongly supported, the cat hypothesis or the devil hypothesis?
Cat hypothesis predicts this result with probability = 1/10
Devil hypothesis predicts this result with probability = 9/10
Strength of evidence for devil over cat hypothesis = 9/10 divided by 1/10 = 9
![Page 16: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/16.jpg)
The evidence is nine times as strong for the devil over the cat hypothesis
OR
Bayes Factor (B) = 9
![Page 17: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/17.jpg)
Consider:
John does not lose a finger
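Now evidence strongly supports cat over devil hypothesis: the cat hypothesis predicts this result with probability 9/10 and the devil hypothesis with probability 1/10 (BF = 9 for cat over devil hypothesis or 1/9 for devil over cat hypothesis)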
![Page 19: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/19.jpg)
Probability of losing finger given cat = 4/10
Probability of losing finger given devil = 6/10
Now if John loses a finger, strength of evidence for devil over cat = 6/4 = 1.5
Not very strong
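The arithmetic of the box example can be checked with a few lines of Python. This is a minimal sketch; the function and variable names are illustrative assumptions, and exact fractions are used so the ratios come out without rounding.

```python
from fractions import Fraction


def bayes_factor(p_data_given_a, p_data_given_b):
    """Strength of evidence for hypothesis A over hypothesis B."""
    return Fraction(p_data_given_a) / Fraction(p_data_given_b)


# Probability of losing a finger under each hypothesis (from the slides).
p_loss_cat, p_loss_devil = Fraction(1, 10), Fraction(9, 10)

# John loses a finger: evidence for devil over cat.
b_loss = bayes_factor(p_loss_devil, p_loss_cat)
# John keeps his finger: evidence for cat over devil.
b_no_loss = bayes_factor(1 - p_loss_cat, 1 - p_loss_devil)
# Weaker predictions (4/10 for cat, 6/10 for devil): evidence for devil over cat.
b_weak = bayes_factor(Fraction(6, 10), Fraction(4, 10))

print(b_loss, b_no_loss, b_weak)  # prints: 9 9 3/2
```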
![Page 20: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/20.jpg)
We can distinguish:
Evidence for cat hypothesis over devil
Evidence for devil hypothesis over cat
Not much evidence either way.
![Page 21: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/21.jpg)
Bayes factor tells you how strongly the data are predicted by the different theories (e.g. your pet theory versus null hypothesis):
$$B = \frac{P(\text{data} \mid \text{your pet theory})}{P(\text{data} \mid \text{null hypothesis})}$$
![Page 22: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/22.jpg)
If B is greater than 1 then the data supported your theory over the null
If B is less than 1, then the data supported the null over your theory
If B = about 1, experiment was not sensitive.
(Automatically get a notion of sensitivity; contrast: just relying on p values in significance testing.)
Jeffreys, 1961: Bayes factors more than 3 or less than 1/3 are substantial
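The three-way reading of B can be written as a small helper. This is a sketch only: the function name and the wording of the labels are assumptions, and the cut-offs of 3 and 1/3 are the conventional values quoted above, not hard boundaries.

```python
def interpret_bayes_factor(b, threshold=3.0):
    """Label a Bayes factor for the theory over the null (hypothetical helper)."""
    if b <= 0:
        raise ValueError("a Bayes factor must be positive")
    if b > threshold:
        return "substantial evidence for the theory over the null"
    if b < 1.0 / threshold:
        return "substantial evidence for the null over the theory"
    return "data insensitive: no substantial evidence either way"


# The Bayes factors from the box example, read with the same cut-offs.
for b in (9, 1 / 9, 1.5):
    print(round(b, 2), "->", interpret_bayes_factor(b))
```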
![Page 23: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/23.jpg)
To know which theory the data support, you need to know what the theories predict
The null is normally the prediction of e.g. no difference
[Figure: plausibility plotted against the population difference between conditions (axis ticks at -2, 0, 2, 4). On the null hypothesis only one value, no difference, is plausible.]
Need to decide what difference or range of differences are consistent with one’s theory
Difficult - but forces one to think clearly about one’s theory.
![Page 25: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/25.jpg)
To calculate a Bayes factor you must decide what range of differences is predicted by the theory
1) Uniform distribution
2) Half normal
3) Normal
![Page 26: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/26.jpg)
Example: The theory predicts a difference will be in one direction.
Subjects give 0-8 ratings in two conditions
[Figure: plausibility plotted against the population difference in means between conditions (axis ticks at -2, 0, 2, 4, 8), with the maximum difference allowed marked.]
![Page 27: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/27.jpg)
Seems more plausible to think the larger effects are less likely than the smaller ones:
[Figure: plausibility plotted against the population difference in means between conditions, starting from 0.]
But how to scale the rate of drop?
![Page 28: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/28.jpg)
[Figure: plausibility plotted against the population difference in means between conditions, axis ticks at 0 and 4.]
Implies: Smaller effects more likely than bigger ones; effects bigger than 8 very unlikely
![Page 29: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/29.jpg)
Similar sorts of effects as those predicted in the past have been on the order of a 5% difference between conditions
[Figure: plausibility plotted against the population difference in means between conditions, axis ticks at 0 and 5.]
Implies: Smaller effects more likely than bigger ones; effects bigger than 10% very unlikely
![Page 30: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/30.jpg)
[Figure: plausibility plotted against the difference between conditions, axis ticks at 0, 5, 10.]
![Page 31: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/31.jpg)
To calculate a Bayes factor in a t-test situation, you need the same information from the data as for a t-test:
Mean difference, Mdiff
SE of difference, SEdiff
Note: t = Mdiff / SEdiff
=> SEdiff = Mdiff/t
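To show how Mdiff and SEdiff combine with a predicted range of effects, here is a minimal sketch of one way to compute a Bayes factor numerically. It assumes the obtained mean difference is normally distributed around the population difference with standard deviation SEdiff, and it averages that likelihood over the chosen plausibility distribution by simple midpoint-rule integration. All function names, the integration limits, and the example numbers are illustrative assumptions; this is not presented as the code behind any particular calculator.

```python
import math


def normal_pdf(x, mean, sd):
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def bayes_factor(mean_diff, se_diff, prior_pdf, lower, upper, steps=20000):
    """B = P(data | theory) / P(data | null), by midpoint-rule integration."""
    width = (upper - lower) / steps
    p_data_given_theory = 0.0
    for i in range(steps):
        theta = lower + (i + 0.5) * width  # candidate population difference
        p_data_given_theory += prior_pdf(theta) * normal_pdf(mean_diff, theta, se_diff) * width
    p_data_given_null = normal_pdf(mean_diff, 0.0, se_diff)
    return p_data_given_theory / p_data_given_null


def uniform_prior(low, high):
    return lambda theta: 1.0 / (high - low) if low <= theta <= high else 0.0


def half_normal_prior(sd):
    return lambda theta: 2.0 * normal_pdf(theta, 0.0, sd) if theta >= 0 else 0.0


def normal_prior(mean, sd):
    return lambda theta: normal_pdf(theta, mean, sd)


# Hypothetical data: a mean difference of 3 reported with t = 1.5,
# so SEdiff is recovered as Mdiff / t.
mean_diff, t = 3.0, 1.5
se_diff = mean_diff / t

b_uniform = bayes_factor(mean_diff, se_diff, uniform_prior(0, 8), 0, 8)
# Integrate the half-normal out to 10 SDs, where its mass is negligible.
b_half_normal = bayes_factor(mean_diff, se_diff, half_normal_prior(4), 0, 40)
print(round(b_uniform, 2), round(b_half_normal, 2))
```

The prior is passed in as a function so that the uniform, half-normal, and normal options share a single integration routine.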
![Page 33: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/33.jpg)
To calculate a Bayes factor:
1) Google “Zoltan Dienes”
2) First site to come up is the right one: http://www.lifesci.sussex.ac.uk/home/Zoltan_Dienes/
3) Click on link to book
4) Click on link to Chapter Four
5) Scroll down and click on “Click here to calculate your Bayes factor!”
![Page 34: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/34.jpg)
![Page 35: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/35.jpg)
![Page 36: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/36.jpg)
![Page 37: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/37.jpg)
[Figure: successive experiments plotted against difference in verbal ability (axis from -20 to 40), with H0 and mdiff marked, and the Bayes factor and p value listed for each experiment. Source: http://www.latrobe.edu.au/psy/esci/index.html]

Bayes     p
2.96      .081
4.88      .034
0.52      .74
4.88      .034
2.70      .09
0.46      .817
4.40      .028
1024.6    .001
3.33      .056
4.88      .031
1.73      .279
4.28      .024
2.96      .083
49.86     .002
2.16      .167
2.12      .172
1.01      .387
0.65      .614
0.75      .476
28.00     .006
4.28      .028
49.86     .002
5.60      .024
2.36      .144
1.73      .23

The tai chi of the Bayes factors
The dance of the p values
![Page 38: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/38.jpg)
A Bayes Factor requires establishing predicted effect sizes. How?
Do digit-colour synesthetes show a Stroop effect on digits?
You display: 3 … 4 … 5 … 6
What they see: 3 … 4 … 5 … 6
You get a null effect (incongruent minus congruent RTs) . . . What size effect would be predicted if there were one?
Run normals on a condition in which digits are coloured in the way synesthetes say they are. The Stroop effect is presumably the maximum one could expect synesthetes to show.
Use a uniform:
[Figure: plausibility plotted against possible population Stroop effects, with the axis running from 0 to the effect for normals with real colours.]
![Page 40: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/40.jpg)
Another condition in your experiment might help settle expectations:
Jiang et al 2012
Obtained significant amount of unconscious knowledge (5%)
Conscious knowledge was 6% with a SE of 7% (non-significant)
To assess meaning of non-significant result, used a half-normal with SD = 5%
BF = 1.25
Nothing follows about whether subjects had conscious knowledge or not
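For a half-normal, the Bayes factor has a closed form if one assumes a normal likelihood for the obtained mean. The sketch below applies it to the rounded figures quoted above (mean 6%, SE 7%, half-normal SD 5%). The function name is an illustrative assumption. With these rounded inputs it gives roughly 1.27, close to but not identical with the 1.25 reported, which presumably used unrounded inputs or a slightly different likelihood.

```python
import math


def half_normal_bayes_factor(mean_diff, se_diff, prior_sd):
    """Closed-form B for a half-normal prior, assuming a normal likelihood."""
    total_sd = math.hypot(prior_sd, se_diff)  # sqrt(prior_sd**2 + se_diff**2)
    # Ratio of the marginal density N(mean; 0, total_sd) to the null density N(mean; 0, se).
    density_ratio = (se_diff / total_sd) * math.exp(
        0.5 * mean_diff ** 2 * (1 / se_diff ** 2 - 1 / total_sd ** 2)
    )
    # Share of the combined normal that falls on positive population effects.
    z = mean_diff * prior_sd / (se_diff * total_sd)
    positive_mass = 0.5 * (1 + math.erf(z / math.sqrt(2)))
    # The factor 2 comes from folding the normal prior onto positive values only.
    return 2 * density_ratio * positive_mass


b = half_normal_bayes_factor(mean_diff=6.0, se_diff=7.0, prior_sd=5.0)
print(round(b, 2))
```

Either figure falls well inside the 1/3 to 3 band, which is why nothing follows about conscious knowledge.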
![Page 43: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/43.jpg)
If you have a manipulation meant to reduce an effect, the effect of the manipulation is unlikely to be larger than the basic effect
e.g. Dienes, Baddeley & Jansari (2012)
Predicted sad mood would reduce learning compared to neutral mood
So e.g. if on a 2-alternative forced choice test people in the neutral condition get 70% correct
Sad condition expected to be somewhere between 50 and 70%
So effect of mood must be?
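The arithmetic behind the question can be spelled out in a few lines. The variable names are illustrative assumptions; the 50% figure is chance performance on a 2-alternative forced choice test and 70% is the neutral-condition value from the example.

```python
chance = 50.0   # % correct expected from guessing on a 2-alternative test
neutral = 70.0  # % correct in the neutral mood condition (from the example)

# Sad mood can at most push performance down to chance, so the effect of
# mood (neutral minus sad) lies between 0 and this upper limit.
max_effect = neutral - chance
effect_range = (0.0, max_effect)
print(effect_range)  # prints: (0.0, 20.0)
```

An upper limit of this kind is what a uniform from 0 to the limit would need.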
![Page 45: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/45.jpg)
My typical practice:
If you can think of a way of determining an approximate expected size of effect
=> Use half normal with SD equal to that typical size
If you can think of a way of determining an approximate upper limit of effect
=> Use uniform from 0 to that limit
![Page 46: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/46.jpg)
Moral and inferential paradoxes of orthodoxy:
1. On the orthodox approach, standardly you should plan in advance how many subjects you will run.
If you just miss out on a significant result you are not allowed to just run 10 more subjects and test again.
You are not allowed to run until you get a significant result.
Bayes: It does not matter when you decide to stop running subjects. You can always run more subjects if you think it will help.
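The orthodox rule against running more subjects and testing again can be illustrated with a small simulation. This is a sketch under stated assumptions: the null hypothesis is true, scores are standard normal with a known SD of 1 so a z test applies, and the sample sizes, number of simulations, and function names are all hypothetical choices. It illustrates only the orthodox side of the contrast, not the Bayesian claim.

```python
import math
import random


def z_test_significant(sample, critical_z=1.96):
    """Two-sided z test of a zero mean, assuming a known SD of 1."""
    z = sum(sample) / math.sqrt(len(sample))
    return abs(z) > critical_z


def false_positive_rate(looks, n_sims=4000, seed=1):
    """Proportion of null experiments declared significant at any look."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        sample = []
        for n in looks:
            while len(sample) < n:
                sample.append(rng.gauss(0.0, 1.0))  # the null is true
            if z_test_significant(sample):
                hits += 1
                break  # stop as soon as the result is significant
    return hits / n_sims


fixed = false_positive_rate([20])                # test once, at the planned n
peeking = false_positive_rate([20, 30, 40, 50])  # add 10 subjects and retest
print(fixed, peeking)
```

The second rate should come out noticeably above the nominal 5%, which is why the orthodox approach requires the stopping rule to be fixed in advance.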
![Page 47: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/47.jpg)
Moral paradox:
If p = .07 after running planned number of subjects
i) If you run more and report significant at 5% you have cheated
ii) If you don’t run more and bin the results you have wasted taxpayers’ money and your time, and wasted relevant data
You are morally damned either way
Inferential paradox:
Two people with the same data and theories could draw opposite conclusions
![Page 48: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/48.jpg)
Moral and inferential paradoxes of orthodoxy:
2. On the orthodox approach, it matters whether you formulated your hypothesis before or after looking at the data.
Post hoc vs planned comparisons
Predictions made in advance of rather than after looking at the data are treated differently
Bayesian inference: It does not matter what day of the week you thought of your theory
The evidence for your theory is just as strong regardless of its timing
![Page 49: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/49.jpg)
Moral and inferential paradoxes of orthodoxy:
3. On the orthodox approach, you must correct for how many tests you conduct in total.
For example, if you ran 100 correlations and 4 were just significant, researchers would not try to interpret those significant results.
On Bayes, it does not matter how many other statistical hypotheses you investigated (or your RA without telling you). All that matters is the data relevant to each hypothesis under investigation.
![Page 50: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/50.jpg)
For orthodoxy but not Bayes:
Different people with the same data and theories can come to different conclusions
You can thus be tempted to make false (albeit inferentially irrelevant) claims, such as about when you thought of your theory
![Page 51: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/51.jpg)
What is the aim of statistics?
1) Control the proportion of errors you make in the long run in accepting and rejecting hypotheses
(conventional statistics)
2) Indicate how strong the evidence is for one hypothesis rather than another / how much you should change your confidence in one hypothesis rather than another
(Bayesian statistics)
![Page 52: How to get the most out of null results using Bayes Zoltán Dienes](https://reader035.vdocument.in/reader035/viewer/2022062422/56649f175503460f94c2e6af/html5/thumbnails/52.jpg)
Dienes 2011 Perspectives on Psychological Science