introduction to experimental science dr. bob pergolizzi [email protected] in room 244 (temporarily)...

Introduction to Experimental Science

Dr. Bob Pergolizzi

[email protected] Room 244 (temporarily)

Always available to chat and assist

Scientists

• More than anything else, scientists are skeptical.

• Scientific skepticism is a gullible public’s defense against charlatans and others who would sell them ineffective medicines and cures, impossible schemes to get rich, and supernatural explanations for natural phenomena.

• Most (not all) scientists do research to gather data that support or refute hypotheses.

Research Methods

Researchers are . . .

- Like detectives – gather evidence, develop a theory.

- Like judges – decide if evidence meets scientific standards.

- Like juries – decide if evidence is “beyond a reasonable doubt.”

Science . . .

• Is cumulative. Current research builds on previous research.

• Scientific Method:

– Empirical (acquires new knowledge via direct observation and experimentation)

– Systematic, controlled observations.

– Unbiased, objective.

– Operational definitions.

– Valid, reliable, testable, critical, skeptical.

CONTROL

• Using appropriate controls distinguishes a scientific experiment from nonscientific procedures.

• The scientist manipulates the Independent Variable (IV), “the treatment,” at a minimum of two levels (“experimental and control conditions”).

– All other variables must be controlled to obtain valid results.

More control

• In manipulating the IV, the experimenter is independent: he/she decides what to do.

• After manipulating the IV, he/she measures the effect on the Dependent Variable (DV). The DV is what is measured, and it depends on the IV.

Distinction between Variables

• IV vs. Individual Differences variable

• The scientist MANIPULATES an IV, but SELECTS an Individual Differences variable (or “subject” variable).

• Can’t manipulate a subject variable.

– “Select a sample. Have half of ‘em get a divorce.”

Operational Definitions

• Explains a concept solely in terms of the operations used to produce and measure it.

– Bad: “Smart people.”

– Good: “People with an IQ over 120.”

– Bad: “People with long index fingers.”

– Good: “People with index fingers at least 7.2 cm.”

– Bad: “Ugly guys.”

– Good: “Guys rated as ‘ugly’ by at least 50% of the respondents.”

Validity and Reliability

• Validity: the “truthfulness” of a measure. Are you really measuring what you claim to measure?

– “The validity of a measure is supported to the extent that people do as well on it as they do on independent measures that are presumed to measure the same concept.”

• Reliability: a measure’s consistency.

• A measure can be reliable without being valid, but not vice versa.
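A small simulation may make the "reliable but not valid" case concrete. It assumes a hypothetical bathroom scale that always reads about 5 kg too heavy: repeated readings agree closely with each other (consistent), yet they do not reflect the true weight. The function name, the bias, and all numbers are illustrative assumptions, not part of the lecture.

```python
import random
import statistics


def simulate_readings(true_value, bias, noise_sd, n, seed=0):
    """Return n simulated readings: true value + constant bias + random noise."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [true_value + bias + rng.gauss(0, noise_sd) for _ in range(n)]


true_weight = 70.0  # hypothetical true weight in kg
readings = simulate_readings(true_weight, bias=5.0, noise_sd=0.1, n=50)

# Small spread across repeated readings: the scale is consistent (reliable).
spread = statistics.stdev(readings)

# Large systematic offset from the truth: the scale is not measuring
# what it claims to measure (not valid).
offset = statistics.mean(readings) - true_weight

print(f"spread = {spread:.2f} kg, systematic error = {offset:.2f} kg")
```

The reverse case cannot be built the same way: readings that scatter widely cannot consistently track the true value, which is one way to read "but not vice versa."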

Difference between a fact, a theory and a hypothesis?

• In popular usage, a theory is just a vague sort of fact, and “hypothesis” is often used as a fancy synonym for “guess.”

• However, to a scientist a theory is a conceptual framework that explains existing observations and predicts new ones.

– For instance, suppose you see the Sun rise. This is an existing observation which is explained by the theory of gravity proposed by Newton. This theory, in addition to explaining why we see the Sun move across the sky, also explains many other phenomena: the path followed by the Sun as it moves (as seen from Earth) across the sky, the phases of the Moon, the phases of Venus, the tides, etc.

– Based on this “theory” you can calculate a prediction of the position of the Sun, the phases of the Moon and Venus, etc. 200 years from now.

• A hypothesis is a working assumption.

– Typically, a scientist devises a hypothesis and then sees if it “holds water” by testing it against available data (obtained from previous experiments and observations).

– If the hypothesis stands up to testing, the scientist declares it to be a theory.

Theory and Hypothesis

• Theory: a logically organized set of propositions (claims, statements, assertions) that serves to define events (concepts), describe relationships among these events, and explain their occurrence.

– Theories organize our knowledge and guide our research.

• Hypothesis: a tentative explanation.

– A scientific hypothesis is TESTABLE.

A paradigm of how new theories encompass old ones.

Saturn devouring his sons (Goya)

Goals of Scientific Method

• Description

– Nomothetic approach – establish broad generalizations and general laws that apply to a diverse population

– Versus idiographic approach – interested in the individual, their uniqueness (e.g., case studies)

• Prediction

– Correlational study – when scores on one variable can be used to predict scores on a second variable.

– Doesn’t necessarily tell you “why.”

• Understanding – continued on the next slide

• Creating change

– Applied research
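The "Prediction" goal above can be sketched in a few lines: compute a correlation between two variables, then use a fitted line to predict one from the other. The data (hours studied and exam scores) are invented for illustration, and the helper names `pearson_r` and `fit_line` are assumptions of this sketch.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def fit_line(xs, ys):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical data: hours studied and exam score for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 55, 61, 64, 70]

r = pearson_r(hours, scores)
slope, intercept = fit_line(hours, scores)
predicted = intercept + slope * 6  # predicted score for 6 hours of study

print(f"r = {r:.3f}, predicted score for 6 hours = {predicted:.1f}")
```

A strong correlation supports prediction, but nothing in this calculation says why the two variables move together.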

Understanding

• Three important conditions for making a causal inference:

– Covariation of events. (IV changes, and the DV changes.)

– A time-order relationship. (First the scientist changes the IV – then there’s a change in the DV.)

– The elimination of plausible alternative causes.

Confounding Variables

• When two potentially effective IVs are allowed to covary simultaneously.

• Poorly controlled!

Intervening Variables

• Link the IV and the DV, and are used to explain why they are connected.

– Intervening variables are important in theories.

More about theories

• Good theories provide “precision of prediction”

• The “rule of parsimony” is followed

– The simplest alternative explanations are favored

– Ockham’s Razor

• William of Ockham (fourteenth century): “Pluralitas non est ponenda sine necessitate” … “entities should not be multiplied unnecessarily.”

• A good scientific theory passes the most rigorous tests of the day.

• Testing will be more informative when you try to DISPROVE (falsify) a theory

Populations and Samples

• Population: the set of all cases of interest

• Sample: Subset of all the population that we choose to study.

Population Parameters

Sample Statistics

Ethics

• Informed consent – a person’s expressed willingness to participate in a research project, based on a clear understanding of the nature of the research, the consequences of declining, and other factors that might influence the decision.

• Debriefing should be informal and indirect.

Independent Groups Design

• Description and Prediction are crucial to the scientific study of behavior, but they’re not sufficient for understanding the causes. We need to know WHY.

• Best way to answer this question is with the experimental method.

• “The special strength of the experimental method is that it is especially effective for establishing cause-and-effect relationships.”

• Experimental methods and descriptive methods are different but related – often used together

Why we conduct experiments

• If results of a well-conducted experiment are consistent with theory, we say we’ve supported the theory; NOT that it is “right.”

• Otherwise, we modify the theory.

• Testing hypotheses and revising theories based on the outcomes of experiments – the long process of science.

Logic of Experimental Research

• Researchers manipulate an independent variable in an experiment to observe the effect on behavior, as assessed by the dependent variable.

Independent Groups Design

• Each group represents a different condition as defined by the independent variable.

Random . . .

• Random Selection vs. Random Assignment

– Random Selection = every member of the population has an equal chance of being selected for the sample.

– Random Assignment = every member of the sample has an equal chance of being placed in the experimental group or the control group.

• Random assignment allows for individual differences among test participants to be averaged out.
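The two kinds of randomness are separate steps, which a short sketch can show. It assumes a made-up population of 1000 labelled people, a sample of 20, and two equal groups; the function names are illustrative only.

```python
import random


def random_selection(population, sample_size, rng):
    """Draw a sample so every member of the population is equally likely to be chosen."""
    return rng.sample(population, sample_size)


def random_assignment(sample, rng):
    """Shuffle the sample and split it in half: experimental group, control group."""
    shuffled = list(sample)  # copy, so the caller's sample is left untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


rng = random.Random(42)  # seeded for reproducibility
population = [f"person_{i}" for i in range(1000)]  # hypothetical population

sample = random_selection(population, 20, rng)           # step 1: who is studied
experimental, control = random_assignment(sample, rng)   # step 2: who gets the treatment

print(len(sample), len(experimental), len(control))
```

Selection determines who is in the study at all; assignment determines which condition each selected person experiences.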

Let’s step back a minute

• An experiment is personkind’s way of asking nature a question.

• I want to know if one variable (factor, event, thing) has an effect on another variable – does the IV influence the DV?

• I manipulate some variables (IVs), control other variables, and count on random assignment to wash out the effects of all the rest of the variables.

Block Randomization

• Another way to wash-out error variance.

• Build blocks that each contain every condition once, in random order, and assign subjects to conditions block by block; this keeps group sizes balanced.

• (Very squirrelly description in the book.)
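One common formulation of block randomization, assumed here rather than taken from the slide, treats each block as a fresh random ordering of all conditions; subjects are then assigned to conditions in the order they arrive. The function name and the three conditions are illustrative.

```python
import random


def block_randomize(conditions, n_blocks, rng):
    """Return an assignment schedule made of n_blocks shuffled copies of the conditions."""
    schedule = []
    for _ in range(n_blocks):
        block = list(conditions)
        rng.shuffle(block)      # new random order inside every block
        schedule.extend(block)
    return schedule


rng = random.Random(7)  # seeded for reproducibility
conditions = ["A", "B", "C"]  # hypothetical conditions
schedule = block_randomize(conditions, n_blocks=4, rng=rng)

# The i-th subject to arrive receives schedule[i].
for subject_number, condition in enumerate(schedule, start=1):
    print(subject_number, condition)
```

Under this scheme every condition ends up with the same number of subjects, and each condition is spread evenly across the duration of the experiment.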

Challenges to Internal Validity

• Testing intact groups. (Why is the group a group? Might be some systematic differences.)

• Extraneous variables. (Balance ‘em.) (E.g., experimenter).

• Subject loss

– Mechanical loss, OK.

– Selective loss, not OK.

• Demand characteristics (cues and other info participants pick up on) – use a placebo, and double-blind procedure

• Experimenter effects – use double-blind procedure

Role of Data Analysis in Exps.

• Primary goal of data analysis is to determine if our observations support a claim about behavior. Is that difference really different?

• We want to draw conclusions about populations, not just the sample.

• Two ways – statistics and replication.

Two methods of making inferences

• Null hypothesis testing

– Assume IV has no effect on DV; differences we obtain are just by chance (error variance)

– If the difference is unlikely enough to happen by chance (“enough” is set by convention), it is STATISTICALLY SIGNIFICANT (much more about this another day), and we say there’s a true difference.

• Confidence intervals

– We compute a confidence interval for the “true” population mean, from sample data. (95% level, usually; more about this another day also.)

– If two groups’ confidence intervals don’t overlap, we say (we INFER) there’s a true difference.
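Both approaches can be sketched with the standard library. The sketch below is one possible illustration, not the procedure the course will use: it swaps in a permutation test for the null hypothesis step (shuffle the group labels many times and count how often chance alone produces a difference at least as large as the observed one), and a normal-approximation 95% confidence interval for each group mean. With samples this small a t-based interval would normally be wider. The data and function names are invented.

```python
import random
import statistics
from statistics import NormalDist


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Estimate how often a difference this large arises if the IV has no effect."""
    rng = random.Random(seed)
    observed = abs(statistics.mean(group_a) - statistics.mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # under the null, group labels are arbitrary
        diff = abs(statistics.mean(pooled[:n_a]) - statistics.mean(pooled[n_a:]))
        if diff >= observed:
            at_least_as_large += 1
    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_large + 1) / (n_permutations + 1)


def mean_confidence_interval(values, level=0.95):
    """Normal-approximation confidence interval for the population mean."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    mean = statistics.mean(values)
    standard_error = statistics.stdev(values) / len(values) ** 0.5
    return mean - z * standard_error, mean + z * standard_error


# Hypothetical DV scores for two conditions.
control = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 11.7]
treated = [13.0, 13.4, 12.9, 13.2, 13.5, 12.8, 13.1, 13.3]

p_value = permutation_p_value(control, treated)
control_ci = mean_confidence_interval(control)
treated_ci = mean_confidence_interval(treated)

print(f"p = {p_value:.4f}")
print(f"control CI = ({control_ci[0]:.2f}, {control_ci[1]:.2f})")
print(f"treated CI = ({treated_ci[0]:.2f}, {treated_ci[1]:.2f})")
```

In this made-up data set the two routes agree: the p-value is small and the intervals do not overlap, so either one would lead us to infer a true difference.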

What data can’t tell us

• Proper use of inferential statistics is NOT the whole answer.

– Scientist could have done a trivial experiment.

– Also, study could have been confounded.

– Also, could by chance find this difference.

This is HUGE.

• When we get a NON-significant difference, or when the confidence intervals DO overlap, we do NOT say that we ACCEPT the null hypothesis.

• We just cannot reject it at this time.

• We have insufficient evidence to infer an effect of the IV on the DV.

Notice

• Many things influence how easy or hard it is to discover a difference.

– How big the real difference is.

– How much variability there is in the population distribution(s).

– How much error variance there is.

• We need to discuss variance.

Sources of variance

• Systematic vs. Error

– Real differences

– Error variance

• What would happen to the standard deviation if our measurement apparatus was a little inconsistent?

• There are OTHER sources of error variance, and the whole point of experimental design is to try to minimize them.

• IMPORTANT! The more error variance exists in the experimental conditions, the harder it is for real differences to emerge.
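The question about an inconsistent measurement apparatus can be explored with a quick simulation. It assumes hypothetical true scores with a standard deviation of about 10 and an apparatus that adds independent random noise to every measurement; all numbers and names are illustrative.

```python
import random
import statistics


def measure(true_scores, noise_sd, rng):
    """Simulate measuring each true score with an apparatus that adds random noise."""
    return [score + rng.gauss(0, noise_sd) for score in true_scores]


rng = random.Random(1)  # seeded for reproducibility
true_scores = [rng.gauss(100, 10) for _ in range(2000)]  # hypothetical population

precise = measure(true_scores, noise_sd=0.0, rng=rng)   # perfectly consistent apparatus
noisy = measure(true_scores, noise_sd=10.0, rng=rng)    # inconsistent apparatus

sd_precise = statistics.stdev(precise)
sd_noisy = statistics.stdev(noisy)

# The same real difference looks smaller relative to a larger spread.
real_difference = 5.0
print(f"SD precise = {sd_precise:.1f}, SD noisy = {sd_noisy:.1f}")
print(f"difference / SD: {real_difference / sd_precise:.2f} vs {real_difference / sd_noisy:.2f}")
```

The noise inflates the standard deviation, so a real difference of fixed size becomes a smaller fraction of the overall spread and is harder to detect.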

One way to reduce the error variance

• Matched groups design

– Use it if there’s some variable that you think MIGHT cause some variance.

– Pre-test subjects on some matching test that equates the groups on a dimension that is relevant to the outcome of the experiment. (Must have a good matching test.)

– Then assign matched groups. This way the groups will be similar on this one important variable.

– STILL use random assignment WITHIN the groups.

– Good when there are a small number of possible test subjects.
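The matching steps above can be sketched as follows, assuming two conditions and a single numeric pre-test score per subject. Subjects are ranked on the pre-test, adjacent subjects form a matched pair, and a coin flip within each pair decides who goes to which group. The subject labels, scores, and function name are hypothetical.

```python
import random
import statistics


def matched_assignment(pretest_scores, rng):
    """Pair subjects with adjacent pre-test ranks, then randomize within each pair."""
    ranked = sorted(pretest_scores, key=pretest_scores.get)
    experimental, control = [], []
    # Step through the ranking two at a time; an odd leftover subject is dropped.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # random assignment WITHIN the matched pair
        experimental.append(pair[0])
        control.append(pair[1])
    return experimental, control


rng = random.Random(3)  # seeded for reproducibility
# Hypothetical pre-test scores on the matching test.
pretest = {"s1": 90, "s2": 85, "s3": 110, "s4": 100,
           "s5": 95, "s6": 120, "s7": 105, "s8": 115}

experimental, control = matched_assignment(pretest, rng)
mean_exp = statistics.mean(pretest[s] for s in experimental)
mean_ctl = statistics.mean(pretest[s] for s in control)

print(experimental, control)
print(f"pre-test means: {mean_exp:.1f} vs {mean_ctl:.1f}")
```

Because each pair is close on the pre-test, the two groups start out nearly equal on the matching variable even with only eight subjects.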

Another design

• Natural Groups design

– Based on subject (or individual differences) variables.

– Selected, not manipulated.

– Remember: This will give us description, and prediction, but not understanding (cause and effect).

We’ve been talking about . . .

• Making two groups comparable, so that the ONLY systematic difference is the IV.

– CONTROL some variables.

– Match on some.

– Use random assignment to wash out the effects of the others.

– What would be the best possible match for one subject, or one group of subjects?

Themselves!

• When each test subject is his/her own control, then that’s called a

– Repeated measures design, or a

– Within-subjects design.

(And the random groups design is called a “between subjects” design.)

Repeated Measures

• If each subject serves as his/her own control, then we don’t have to worry about individual differences, across experimental and control conditions.

• EXCEPT for newly introduced sources of variance – order effects:

– Practice effects

– Fatigue effects

Counterbalancing

• ABBA

• Used to overcome order effects.

• Assumes practice/fatigue effects are linear.
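A little arithmetic shows why the linearity assumption matters. The sketch below assumes a hypothetical practice effect that depends only on trial position (1 through 4) and adds up how much of it each condition receives under different orders; the function name and effect sizes are illustrative.

```python
def order_effect_totals(sequence, practice_effect):
    """Sum the position-dependent practice effect received by each condition."""
    totals = {}
    for position, condition in enumerate(sequence, start=1):
        totals[condition] = totals.get(condition, 0.0) + practice_effect(position)
    return totals


# Linear practice effect: each later trial gains 2 more points than the one before.
linear = order_effect_totals("ABBA", lambda position: 2.0 * position)
# A: positions 1 and 4 -> 2 + 8 = 10; B: positions 2 and 3 -> 4 + 6 = 10

# Nonlinear practice effect: gains grow with the square of the position.
nonlinear = order_effect_totals("ABBA", lambda position: float(position ** 2))
# A: 1 + 16 = 17; B: 4 + 9 = 13

# No counterbalancing: B always comes later and soaks up more practice.
unbalanced = order_effect_totals("AABB", lambda position: 2.0 * position)
# A: 2 + 4 = 6; B: 6 + 8 = 14

print(linear, nonlinear, unbalanced)
```

With a linear effect, ABBA gives both conditions the same total, so the order effect cancels. With a nonlinear effect the totals differ, and ABBA no longer fully removes the bias.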

Which method when?

• Some questions DO lend themselves to repeated measures (within-subjects) design

– Can people read faster in condition A or condition B?

– Is memorability improved if words are grouped in this way or that?

• Some questions do NOT lend themselves to repeated measures design

– Do these instructions help people solve a particular puzzle?

– Does this drug reduce cholesterol?

Our Experiment

• We will discuss the “observation” everyone made and try to formulate research questions based on those observations.

– This process will eventually be used to define research projects (based on other questions).

• We will make in-class data measurements and analyze the data.
