experimental design. hw3 recap: scientific method

Experimental Design

RECAP: SCIENTIFIC METHOD

The Water Cycle

Earthquakes & Tsunamis

Earth’s Seasons

Science

• We’ve learned a lot of bad ways (fallacies) for figuring out whether claims are true.

• There is a good way of finding things out: science!

• Science tries to discover the causal structure of the world, so it can predict, explain, and control nature for the benefit of humankind.

Science

• We cannot directly observe the causal structure of the world; we can only observe correlations and theorize about them.

• The goal of science is to test our theories about the causal structure of the world. We try to show that they are false.

• If we try very, very hard to show they are false, and we keep failing, then we can accept them as true.

CORRELATION VS. CAUSATION

Chocolate and Nobel Prizes

In the article “Eat more chocolate, win more Nobels?”, Dr. Franz Messerli claims to have found “a surprisingly powerful correlation” between a country’s chocolate consumption and its rate of Nobel prizes.

Chocolate and Flavanols

The theory outlined in the article is that chocolate contains flavanols; flavanols slow down age-related mental decline (though this is doubtful); and…? Well, it’s not really explained how lessened mental decline makes you more likely to win Nobels. Wouldn’t chocolate have to make you smarter and not just prevent you from being dumber?

B causes A

Dr. Messerli, according to the article, admits that “it’s possible… that chocolate isn't making people smart, but that smart people who are more likely to win Nobels are aware of chocolate's benefits and therefore more likely to consume it.”

C causes A and B

The article also quotes Sven Lidin, the chairman of the Nobel chemistry prize committee: “I don't think there is any direct cause and effect. The first thing I'd want to know is how chocolate consumption correlates to gross domestic product.” He seems to be suggesting that GDP causes higher chocolate consumption and more Nobel prizes.

The GDP Theory

Here’s what I think Lidin is suggesting: Chocolate is a luxury. Wealthy individuals are more likely to be able to afford it. Education is also a luxury. Poor people can’t afford to go to college for 10 years to get a PhD in chemistry. But you can’t win the Nobel prize in chemistry unless you’re a chemist.

The GDP Theory

So he expects that the GDP or “wealth” of a country will be correlated both with chocolate eating and with Nobel prizes.

Wealth causes chocolate eating & Nobels.
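A small simulation can make the common-cause story concrete. The sketch below is only an illustration under made-up assumptions: each simulated country gets a random "wealth" value, and both chocolate consumption and Nobel output are set to wealth plus independent noise. Chocolate never influences Nobels in this toy model, yet the two come out correlated. The function names and the noise model are hypothetical, not anything from the article.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def simulate_countries(n_countries=500, seed=0):
    """Wealth drives both variables; chocolate never affects Nobels."""
    rng = random.Random(seed)
    wealth = [rng.gauss(0, 1) for _ in range(n_countries)]
    # Assumed toy model: each outcome = wealth + independent noise.
    chocolate = [w + rng.gauss(0, 1) for w in wealth]
    nobels = [w + rng.gauss(0, 1) for w in wealth]
    return wealth, chocolate, nobels

wealth, chocolate, nobels = simulate_countries()
r_raw = pearson(chocolate, nobels)

# Subtract the part of each variable that wealth explains (the coefficient is
# 1 by construction in this toy model), then correlate what is left over.
choc_resid = [c - w for c, w in zip(chocolate, wealth)]
nobel_resid = [p - w for p, w in zip(nobels, wealth)]
r_adjusted = pearson(choc_resid, nobel_resid)

print(f"raw correlation: {r_raw:.2f}, after removing wealth: {r_adjusted:.2f}")
```

In this toy world the raw correlation is clearly positive, and it mostly disappears once wealth is taken out, which is the pattern Lidin's question about GDP is probing for.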

Spurious Correlation

It’s also possible that Dr. Messerli has committed the ecological fallacy, assuming that a correlation between a country’s chocolate consumption and that country’s number of Nobel prizes means that there is a correlation between individual chocolate consumption and individual Nobel-prize winning.

Ecological Fallacy Explanation

Maybe smart people tend to avoid chocolate, because they know it can cause obesity. When they live in a country that consumes lots of chocolate they have to exercise their will power frequently. And maybe smart people + strong willpower = more Nobels. So it’s not eating chocolate but avoiding chocolate that causes Nobel prizes.
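To see how a country-level correlation can point the opposite way from the individual-level one, here is a tiny hand-made data set. All numbers are invented for illustration, and "achievement score" is a hypothetical stand-in for Nobel-prize winning. Countries that eat more chocolate have higher average scores, but inside every country the people who eat less chocolate score higher.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def mean(values):
    return sum(values) / len(values)

# Made-up data: (chocolate eaten, achievement score) for three people per country.
countries = {
    "A": [(1, 3), (2, 2), (3, 1)],
    "B": [(4, 6), (5, 5), (6, 4)],
    "C": [(7, 9), (8, 8), (9, 7)],
}

# Country level: correlate the national averages.
avg_choc = [mean([c for c, _ in people]) for people in countries.values()]
avg_score = [mean([s for _, s in people]) for people in countries.values()]
r_between = pearson(avg_choc, avg_score)

# Individual level: correlate within each country separately.
r_within = {
    name: pearson([c for c, _ in people], [s for _, s in people])
    for name, people in countries.items()
}

print(f"between countries: {r_between:+.2f}")
print({name: round(r, 2) for name, r in r_within.items()})
```

The between-country correlation is perfectly positive while every within-country correlation is perfectly negative, so the national pattern says nothing reliable about individuals.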

EXPERIMENTAL DESIGN

Types of Scientific Studies

There are two basic types of scientific studies (the stuff that gets published in scientific journals and reported in the “science” section of the newspaper):

• Observational studies

• Controlled experiments

Observational Studies

An observational study looks at data in order to determine whether two variables are correlated.

Case Study

In science, we want to know about the effects of something (exposure to radiation, living through a certain political crisis…) or the causes of something (a disease, having certain beliefs…).

A case study finds people who have the condition we want to know about (they were exposed to radiation, or they have the disease) and looks back at their histories.

Example

Suppose I want to know why people gamble. I might find a group of gamblers and give them all a survey:

• When did you first have sex?

• Do you smoke?

• Did your parents divorce?

• When you win money, how do you spend it?

• Do you eat meat?

Problems with Case Studies

Suppose I find that 27% of the gamblers I survey have divorced parents. Does that mean divorce is a significant cause of gambling?

No. We need to know if this is more or less than the divorce rate among non-gamblers. (In fact, it’s about the same: HK divorce rate is 20%-30%.)

Case Control Study

In a case control study we find not just a group of people we’re interested in (gamblers) but also a group of people we’re not interested in (the control group, non-gamblers).

The goal is to compare the histories of one group to the histories of the other group.
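The comparison itself is simple arithmetic. The sketch below uses hypothetical survey counts (27 of 100 gamblers and 25 of 100 non-gamblers with divorced parents; the control figure is invented) and computes the two rates plus an odds ratio, a common summary for case control data that these slides do not themselves introduce. The function names are illustrative only.

```python
def exposure_rate(exposed, total):
    """Fraction of a group whose history includes the factor of interest."""
    return exposed / total

def odds_ratio(exposed_cases, total_cases, exposed_controls, total_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    case_odds = exposed_cases / (total_cases - exposed_cases)
    control_odds = exposed_controls / (total_controls - exposed_controls)
    return case_odds / control_odds

# Hypothetical counts: 27 of 100 gamblers and 25 of 100 non-gamblers
# report divorced parents.
gambler_rate = exposure_rate(27, 100)
control_rate = exposure_rate(25, 100)
ratio = odds_ratio(27, 100, 25, 100)

print(f"gamblers: {gambler_rate:.0%}, non-gamblers: {control_rate:.0%}, "
      f"odds ratio: {ratio:.2f}")
```

An odds ratio close to 1 means the factor is about equally common in both groups, so it gives little reason to suspect a link.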

Problems with Case Control Studies

Correlation is not causation!

What if I discover that more gamblers smoke than non-gamblers? I still don’t know which of these is true:

• Maybe smoking causes gambling.

• Maybe gambling causes smoking.

• Maybe poverty causes gambling & smoking.

• Maybe it’s just a coincidence.

Unreliable Histories

Why Do We Do Them?

Case control studies can be done very easily, very fast, and with very little expense.

Scientists will use them to suggest things to study more seriously, or to rule out certain hypotheses. After all, if gamblers smoke less than non-gamblers, smoking probably does not cause gambling!

Cohort Studies

Cohort studies are more reliable than case control studies.

In a cohort study, you follow two groups over time. One group, the cohort, has a certain condition (for example, smokes) and the other group doesn’t. Then you see what happens and compare the results.

Cohort Studies

For example, a cohort study might ask women to record how much wine they drink, and also to report if they develop breast cancer. After many years, a correlation may be found between wine consumption and cancer.

Advantages over Case Control

• Avoids recall bias.

• Lets us study changes over time.

• Useful for studying rare conditions.

• Lets us investigate many effects.

• Allows us to calculate the relative risk (the amount by which a condition increases or decreases your risk of something).
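Relative risk is just a ratio of two rates. As a worked example with invented numbers: if 30 of 1,000 wine drinkers and 20 of 1,000 non-drinkers develop breast cancer during the study, the risks are 3% and 2%, and the relative risk is 1.5. The helper below is a hypothetical sketch of that calculation.

```python
def relative_risk(cases_with, n_with, cases_without, n_without):
    """Risk in the group with the condition divided by risk in the group without it."""
    risk_with = cases_with / n_with
    risk_without = cases_without / n_without
    return risk_with / risk_without

# Hypothetical cohort: 30 of 1,000 wine drinkers and 20 of 1,000 non-drinkers
# develop breast cancer during the follow-up period.
rr = relative_risk(30, 1000, 20, 1000)
print(f"relative risk: {rr:.2f}")
```

A relative risk above 1 means the condition goes with higher risk, below 1 with lower risk. By itself it still says nothing about why.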

Problems: Confounding Variables

Suppose my cohort is a group of smokers. Smokers tend to have more in common with one another than just smoking:

• The poor smoke more than the rich.

• The uneducated smoke more than the educated.

• People who drink alcohol smoke more than people who do not.

Problems: Confounding Variables

Anything that we discover in the cohort that is correlated with smoking will also be correlated with all the confounding variables!

So if smokers get more cancer, is it because they smoke, or because they don’t have money to go to a hospital for checkups?

Observational Studies

Importantly, observational studies can only show whether two variables A and B are correlated. They cannot show whether A causes B, or B causes A, or some third cause causes both, or if the correlation is accidental.

EXPERIMENTAL STUDIES

Controlled Experiments

The first recorded controlled experiment occurs in the Book of Daniel, part of the Hebrew Bible (the Tanakh) and the Christian Bible.

Daniel 1:1-16

In the book, Daniel is forced into the service of King Nebuchadnezzar of Babylon. He is fed the king’s meat and wine, but he refuses – the Jews have special laws about how things like meat and wine are prepared.

Daniel’s Experiment

“Please test your servants for ten days: Give us nothing but vegetables to eat and water to drink. Then compare our appearance with that of the young men who eat the royal food, and treat your servants in accordance with what you see.” Daniel 1:12-13

Daniel’s Experiment

“At the end of the ten days they looked healthier and better nourished than any of the young men who ate the royal food. So the guard took away their choice food and the wine they were to drink and gave them vegetables instead.” Daniel 1:15-16

In a controlled experiment there are two groups who get separate treatments.

One group, the “control group”, gets the standard treatment. For example, all of the king’s servants ate meat and drank wine before Daniel suggested that a different diet might be better.

The other group, the “experimental group”, gets the treatment we plan to test.

If the experimental group has better results than the control group, we have good evidence that our new treatment should be adopted.

CONTROLLING AND RANDOMIZATION

Suppose I believe that eating chocolate makes you smarter.

Maybe I have some evidence, in the form of observational studies that show a correlation between chocolate consumption in a country and the number of Nobel prizes won by that country.

But there are alternative theories:

• Smartness causes chocolate eating

• Wealth causes smartness and chocolate eating

• Chocolate avoiding causes smartness

• Etc.

Experimental Design

I can rule out these possibilities with a well-designed experiment.

What I want is two groups: one group (the experimental group) that eats chocolate because I tell them to, and another group (the control group) that does not eat chocolate, because I tell them not to.

Not: B causes A

If the experimental group improves in intelligence over the course of the experiment, I know that this is not because higher intelligence leads to more chocolate consumption (even if that is true).

In my experiment, intelligence does not cause chocolate consumption; I do. I am the experimenter and I say who eats chocolate.

Controlling for

Additionally, if I make sure to put equal numbers of rich people in both groups, and equal numbers of middle-class people, and equal numbers of poor people, then I can make sure that improvements in the experimental group are not due to wealth: both groups have the same distribution of wealthy and non-wealthy people. This is called controlling for wealth.
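One way to do this in practice is to split each wealth class evenly between the two groups. The sketch below is a minimal illustration, assuming a hypothetical participant list of (name, wealth class) pairs. The function name and the data are made up.

```python
import random
from collections import Counter, defaultdict

def stratified_assign(participants, seed=0):
    """Split each wealth class evenly between the two groups.

    participants: list of (name, wealth_class) tuples.
    Returns a dict mapping name -> "experimental" or "control".
    If a class has an odd number of people, the extra one goes to control.
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for name, wealth_class in participants:
        strata[wealth_class].append(name)
    assignment = {}
    for names in strata.values():
        rng.shuffle(names)  # random order within the class
        half = len(names) // 2
        for name in names[:half]:
            assignment[name] = "experimental"
        for name in names[half:]:
            assignment[name] = "control"
    return assignment

# Hypothetical participant list: 4 rich, 6 middle-class, 10 poor.
participants = (
    [(f"rich_{i}", "rich") for i in range(4)]
    + [(f"middle_{i}", "middle") for i in range(6)]
    + [(f"poor_{i}", "poor") for i in range(10)]
)
assignment = stratified_assign(participants)
wealth_of = dict(participants)
counts = Counter((wealth_of[name], group) for name, group in assignment.items())
print(counts)
```

Both groups end up with the same number of rich, middle-class, and poor participants, so a difference in results between the groups cannot be put down to wealth.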

Randomization

Ideally, an experiment controls for as many variables as possible.

To a large extent, this is done by randomly assigning individuals in the study to either the control group or the experimental group. This way, the two groups are unlikely to differ systematically in anything other than chocolate eating.
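A quick simulation shows why random assignment helps even for variables nobody measured. In this sketch (all numbers invented) each participant has a hidden "wealth" value. When people choose for themselves whether to eat chocolate, the wealthier ones are assumed to choose it more often. When a coin decides, wealth plays no part.

```python
import random

def mean(values):
    return sum(values) / len(values)

rng = random.Random(1)
# Hypothetical hidden trait (e.g. wealth) for 2,000 participants.
wealth = [rng.gauss(0, 1) for _ in range(2000)]

# Self-selection (assumed): wealthier people are more likely to choose chocolate.
self_selected = [w + rng.gauss(0, 1) > 0 for w in wealth]
# Random assignment: a fair coin decides, ignoring wealth entirely.
randomized = [rng.random() < 0.5 for _ in wealth]

def wealth_gap(in_experimental):
    """Mean wealth of the experimental group minus that of the control group."""
    exp = [w for w, flag in zip(wealth, in_experimental) if flag]
    ctl = [w for w, flag in zip(wealth, in_experimental) if not flag]
    return mean(exp) - mean(ctl)

gap_self = wealth_gap(self_selected)
gap_random = wealth_gap(randomized)
print(f"wealth gap with self-selection: {gap_self:.2f}")
print(f"wealth gap with random assignment: {gap_random:.2f}")
```

With self-selection the chocolate group is noticeably wealthier. With random assignment the gap shrinks toward zero, and it keeps shrinking as the study gets larger.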

Randomization

Randomization is not the only tool for controlling for confounding variables, and for certain variables, it can’t help.

For example, suppose I want to test whether seeing pictures of babies makes people happier.

Babies and Happiness

I randomly assign participants in the study to the control group and the experimental group. The control group takes a happiness questionnaire. The experimental group looks at baby pictures and then takes a happiness questionnaire.

A Crucial Difference

Suppose that the experimental group rates higher on the happiness questionnaire. Does this mean that baby pictures cause happiness?

No. The control group didn’t get to look at any pictures. Maybe they got bored, and boredom makes you less happy. We should give the controls pictures other than babies.

Maximal Similarity

In general, the control condition and the experimental condition should be as similar as possible, and differ only in the variable being tested. For example, if the experimental group is given the happiness test by a beautiful woman and the control group is tested by a grumpy professor, that might be the real reason for a difference in scores, not the baby pictures.

It can be difficult or impossible to make the control and experimental conditions similar.

If you want to study the effects of exercise, how do you make exercise and non-exercise similar?

BLINDING AND DOUBLE-BLINDING

“Blinds”

In experimental studies, we say that the participants are blind if they do not know which group they are in: the control group or the experimental group.

Again, it’s not always possible to have blind participants, but this is considered best practice.

The Placebo Effect

Why is blinding important? For several reasons:

First, the mind has a pretty powerful effect on the body. People who think they’re receiving an effective treatment (even if they aren’t) are more likely to get better. People who think they’re not receiving an effective treatment (even if they are) are less likely to get better.

Placebo Controls

So we don’t test a drug by giving it to the experimental group and giving nothing to the control group. Then the control group knows it’s the control group, because it gets no pills!

Instead, we give the control group fake pills with only sugar in them, and we don’t tell them that the pills are fake.

Subject Bias

A second reason that blinding participants is good is that people who believe they are getting an effective treatment are more likely to say they’ve gotten better (even if they haven’t) and people who believe they are not getting an effective treatment are more likely to say that they haven’t gotten better (even if they have).

Subject Bias

So, for example, if you give a group of people a fake pain medication and ask them whether it helps their pain, they might reason: “Well, I’m supposed to feel better, so I probably did get a little bit better.”

Deception and Blinding

One common way of making sure subjects don’t know which condition they’re in is by lying to them about what you’re studying.

You might tell people that you’re studying math ability, when what you’re really doing is studying the effects of cold rooms on math ability.

Improper Blinding

One study of medical experiments found that studies with “improper” blinding procedures (where the subjects could find out which group they were in) exaggerated effects by 17%.

Blind Experimenters

Ideally, in experiments the researchers are blind to which group subjects are in.

This prevents the experimenter from accidentally indicating to the subjects which group they are in.

It also prevents experimenter bias.

Clever Hans

Clever Hans was a horse that was supposed to have amazing mathematical abilities. It was claimed that he could add, subtract, multiply, divide, read, spell, and understand German.

Clever Hans

Hans’ trainer would ask him questions like:

“If the eighth day of the month comes on a Tuesday, what is the date of the following Friday?”

Then Hans would tap out the answer with his hoof.

The Clever Hans Effect

The psychologist Oskar Pfungst investigated Hans and discovered that Hans could only solve problems when:

(a) The questioner knew what the answer was, and

(b) The horse could see the questioner.

The Clever Hans Effect

Hans got math questions right 68% of the time when the questioner knew the answer but only 6% of the time when the questioner didn’t.

Hans was able to tell what people wanted him to do. He would keep tapping his hoof as long as it looked like people wanted him to go on, and he would stop when it looked like they were happy and about to reward him.

Conforming to Expectations

That’s why double-blinding is important:

If subjects know what the experimenter expects of them, their actions will conform to those expectations.

Experimenter Bias

If experimenters want a certain outcome, they can record or interpret the experimental results in a biased way.

For example, if I’m studying a blood pressure medication, and I know a subject is in the experimental group when I take his blood pressure, this might bias the results.

Experimenter Bias

If the reading is high, I might take his blood pressure a second time “just to make sure.” If the second reading is low, I have biased the results. Would I have taken a second reading for someone in the control group?
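The size of this kind of bias can be estimated with a simulation. The sketch below assumes a drug that does nothing at all: every reading, in either group, is drawn from the same made-up distribution (mean 130 mmHg, standard deviation 10). The only difference is that high readings in the experimental group get a second measurement, and the second value is the one written down.

```python
import random

def recorded_reading(rng, retake_if_high):
    """One recorded blood pressure value from a noisy measurement."""
    reading = rng.gauss(130, 10)  # hypothetical mean and spread, in mmHg
    if retake_if_high and reading > 130:
        # The unblinded experimenter "double-checks" only the high readings
        # and records the second value instead of the first.
        reading = rng.gauss(130, 10)
    return reading

rng = random.Random(2)
n = 5000
# The simulated drug has no effect: both groups share one distribution.
control = [recorded_reading(rng, retake_if_high=False) for _ in range(n)]
experimental = [recorded_reading(rng, retake_if_high=True) for _ in range(n)]

apparent_effect = sum(control) / n - sum(experimental) / n
print(f"apparent drop in blood pressure: {apparent_effect:.1f} mmHg")
```

Under these assumptions the useless drug appears to lower blood pressure by a few mmHg, purely because of who got a second reading. A blind experimenter could not apply the re-check selectively.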

Randomization

An experimenter can also bias an experiment when it is not appropriately randomized.

If the experimenter decides who goes in the control group and who goes in the experimental group, she can (consciously or unconsciously) put people who are more likely to get better anyway in the experimental group.

Blind Randomization

But it is not enough simply to randomize: the experimenter must also be blind to the randomization process itself.

For example, one method of randomization is to assign the first person who signs up to the control group, the second to the experimental group, the third to the control…

Blind Randomization

Now suppose someone really sick shows up at an “even” position in the sign-up order and should be assigned to the experimental group.

The experimenter might convince the person that the experiment wasn’t appropriate for them, and thus make it more likely that only healthy people were in the experimental group.
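The weakness of the alternating scheme is that anyone can predict the next assignment. One common alternative, sketched here under assumptions of my own, is to have someone outside the study shuffle a balanced list of assignments in advance and reveal each one only after a person has been enrolled. The function names and the seed are hypothetical.

```python
import random

def alternating_allocation(n):
    """Predictable: first sign-up to control, second to experimental, and so on."""
    return ["control" if i % 2 == 0 else "experimental" for i in range(n)]

def concealed_allocation(n, seed):
    """A balanced list shuffled in advance; the seed stays with a third party."""
    rng = random.Random(seed)
    slots = ["control"] * (n // 2) + ["experimental"] * (n - n // 2)
    rng.shuffle(slots)
    return slots

predictable = alternating_allocation(6)
concealed = concealed_allocation(6, seed=42)

# With alternation, the position in the queue gives the group away:
# every second sign-up (index 1, 3, 5, ...) lands in the experimental group.
print(predictable)
# With a pre-shuffled list, the position tells the experimenter nothing.
print(concealed)
```

Both lists put three people in each group, but only the first lets the person at the sign-up desk know, before enrolling someone, which group they would join.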

This Matters

Studies have shown that flawed randomization methods overstate effects by 30%!

That’s a huge effect, and in the case of drug trials, exaggerated effects can cost lives.

SUMMARY

Benefits of Controlled Experiments

Today we learned that a well-designed controlled experiment can:

• Control for confounding variables

• Rule out common cause and reverse cause explanations

Elements of Good Design

Well-designed experiments should:

• Assign subjects to control/experimental groups randomly

• Not let subjects know which group they’re in

• Not let the experimenter know which group subjects are in
