Research Methods for the Social Sciences:
An Introductory Course

February 23rd, 2010 – Statistics
Sarah Nelson and Julia Braverman

Division on Addictions, Cambridge Health Alliance

Harvard Medical School

[Figure: The Dissertation Effect – First Year Project. Control and Intervention groups plotted across M1, M2, M3, M4; y-axis from 1 to 19.]

[Figure: The Dissertation Effect – Dissertation. Control and Intervention groups plotted across M1, M2, M3, M4; y-axis from 1.85 to 2.3.]

Overview

Descriptive statistics
– Central tendency
– Variability/Dispersion

Inferential statistics
– Hypothesis testing
– Test review
– Effect size and meta-analysis

Adapted from Dr. Braverman’s slides

Descriptive Statistics

Statistics used to summarize or describe data
– Frequencies
  60% of public high school students say they have plagiarized papers.
– Mean
  Men who are lost wait an average of 20 minutes before giving up and asking for directions. Women – 10 minutes.
– Range
  Romantic love involves chemical changes in the brain that last 12 – 18 months. After that, you and your partner are on your own.

Adapted from Dr. Braverman’s slides

Data

Class A: 60, 80, 90, 80, 75, 90, 95, 30, 70, 80, 70

N = 11

Mean
– ΣX/N
– 820/11
– Mean = 74.5

Mode
– Most frequent value
– Mode = 80

Median - Center value when data is arranged in order
– Median = 80

30 60 70 70 75 80 80 80 90 90 95

Dr. Braverman’s slides

Central tendency
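The three measures of central tendency can be checked with a short Python sketch. It uses only the Class A scores listed on the slide and the standard library; the variable names are illustrative, not part of the original slides.

```python
from statistics import mean, median, mode

# Class A scores as listed on the slide
scores = [60, 80, 90, 80, 75, 90, 95, 30, 70, 80, 70]

n = len(scores)                # N
total = sum(scores)            # ΣX
class_mean = mean(scores)      # ΣX / N
class_median = median(scores)  # center value of the sorted scores
class_mode = mode(scores)      # most frequent value

print(f"N = {n}, ΣX = {total}")
print(f"mean = {class_mean:.1f}, median = {class_median}, mode = {class_mode}")
```

The single low score (30) pulls the mean below the median, which is the contrast the next slide draws between the two measures.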


Mean and Median

Mean
– Reflects ALL values

Median
– Not affected by extreme values
– Exactly 50% of scores above and below.


Variability

Range
Standard deviation/dispersion


Dispersion/standard deviation

Country A
– 15,000
– 20,000
– 50,000
– 60,000
– 80,000
– 90,000
– 100,000

Country B
– 40,000
– 40,000
– 50,000
– 60,000
– 70,000
– 70,000
– 80,000

Standard deviation – Average distance of all scores from the average.
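A minimal Python sketch of the Country A and Country B comparison, using the values from the slide. It treats each list as a complete population (so it divides by N); that choice is an assumption, since the slide does not say. Note that the standard deviation is computed as the square root of the average squared distance from the mean, a slightly more precise version of the slide's informal definition.

```python
from statistics import mean, pstdev

country_a = [15_000, 20_000, 50_000, 60_000, 80_000, 90_000, 100_000]
country_b = [40_000, 40_000, 50_000, 60_000, 70_000, 70_000, 80_000]

# pstdev divides by N (population formula); an assumption for this sketch
sd_a = pstdev(country_a)
sd_b = pstdev(country_b)

for name, values, sd in (("Country A", country_a, sd_a),
                         ("Country B", country_b, sd_b)):
    value_range = max(values) - min(values)
    print(f"{name}: mean = {mean(values):,.0f}, range = {value_range:,}, SD = {sd:,.0f}")
```

The two countries have nearly the same mean, but Country A's values are spread much more widely, which is what the larger standard deviation captures.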


Normal Distribution

[Figure: bell curve; x-axis: values, y-axis: frequency]

Symmetrical
Mean = Mode = Median
Bell-shaped
Most scores cluster around the mean.


The normal distribution

The area under the normal curve represents 100% of the scores
A property of the normal curve is that approx. 99.7% of scores fall within 3 standard deviations of the mean
– Specifically
  ~34.13% within one SD in one direction
  ~68.26% within one SD
  ~95.44% within two SD
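These areas can be reproduced from the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library (Python 3.8+); the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist()  # mean 0, SD 1

def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k SDs of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Area between the mean and one SD above it (one direction only)
one_side = standard_normal.cdf(1) - standard_normal.cdf(0)

print(f"one SD, one direction: {one_side:.2%}")
for k in (1, 2, 3):
    print(f"within {k} SD: {share_within(k):.2%}")
```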


Normal and z-scores


Normal distribution of IQ

μ = 100
σ = 15
Mode = 100
Median = 100

[Figure: Frequency of IQ. Bell curve; x-axis: IQ from 30 to 170, y-axis: number of people]
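A z-score expresses a raw score as a number of standard deviations from the mean. The sketch below assumes the IQ scale on the slide has mean 100 and standard deviation 15; the function names are illustrative only.

```python
from statistics import NormalDist

IQ_MEAN = 100  # assumed mean of the IQ scale
IQ_SD = 15     # assumed standard deviation of the IQ scale

def z_score(x: float) -> float:
    """Distance of x from the mean, in SD units."""
    return (x - IQ_MEAN) / IQ_SD

def percentile(x: float) -> float:
    """Percentage of a normal IQ distribution falling at or below x."""
    return NormalDist(IQ_MEAN, IQ_SD).cdf(x) * 100

for iq in (85, 100, 130):
    print(f"IQ {iq}: z = {z_score(iq):+.1f}, percentile = {percentile(iq):.1f}")
```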


Central Tendency


Inferential statistics

Drawing a conclusion about a general population based on a sample.


A Quick Aside

Population computations:
– Often provided in textbooks for pedagogical reasons.
– Require that you have every subject, i.e., you are not making any inferences to a more general population
– Use Greek letters (e.g., μ, σ)

Sample computations:
– Often presented as computational formulas
– Allow one to make inferences to a more general population
– Use Roman letters: M, S, X
– Use degrees of freedom
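The practical difference between the two sets of formulas shows up in the standard deviation: the population version divides by N, while the sample version divides by N - 1 (the degrees of freedom). A minimal sketch, reusing the Class A scores from earlier purely as example data:

```python
import math
from statistics import pstdev, stdev

scores = [60, 80, 90, 80, 75, 90, 95, 30, 70, 80, 70]  # Class A, reused as example data

n = len(scores)
m = sum(scores) / n
sum_of_squares = sum((x - m) ** 2 for x in scores)  # squared deviations from the mean

population_sd = math.sqrt(sum_of_squares / n)        # σ: divide by N
sample_sd = math.sqrt(sum_of_squares / (n - 1))      # S: divide by N - 1 degrees of freedom

print(f"population SD = {population_sd:.2f}, sample SD = {sample_sd:.2f}")
```

The hand-computed values match `statistics.pstdev` and `statistics.stdev`, and the sample estimate is always the larger of the two.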


Hypothesis Testing

9 Step procedure
– State the hypotheses
– Determine the nature of variables
– Choose the appropriate test statistic
– Set Type I error rates (alpha)
– Determine your sample size
– Collect data
– Run appropriate statistical test
– Calculate observed effect size
– Make a decision and conclusions


Hypotheses

Null hypothesis: H0
– States that nothing special is happening in respect to some characteristic of the underlying population

Alternative hypothesis: H1
– The opposite of the null hypothesis, also called the research hypothesis


Example

Students participate in an SAT prep class in English. The experimenter thinks that this class will improve the students’ scores compared to the national average of 500.
– Null: The participants in the prep group will perform no better than the national average. X̄ - μ ≤ 0
– Alternative: Students in the prep group perform better than the national average. X̄ - μ > 0
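As a hedged illustration of how this example could be tested, the sketch below runs a one-sample, one-tailed t-test by hand. The ten prep-class scores are invented for the example, and the critical value 1.833 is the tabled one-tailed value for alpha = .05 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

NATIONAL_MEAN = 500
T_CRITICAL = 1.833  # one-tailed, alpha = .05, df = 9 (standard t table)

# Hypothetical scores for 10 prep-class students (not from the slides)
prep_scores = [520, 540, 480, 560, 510, 530, 500, 550, 490, 570]

n = len(prep_scores)
sample_mean = mean(prep_scores)
standard_error = stdev(prep_scores) / math.sqrt(n)  # stdev uses N - 1
t_stat = (sample_mean - NATIONAL_MEAN) / standard_error
reject_null = t_stat > T_CRITICAL

print(f"M = {sample_mean}, t({n - 1}) = {t_stat:.2f}, reject H0: {reject_null}")
```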


Rejecting Null hypothesis

If the null hypothesis were true, how likely (p) would it be to obtain data at least as extreme as the data observed?
α = .05 / .01 / .001 (Arbitrary setting)
If p < α, reject H0


Error

Type I error: Rejecting a null hypothesis when there is no effect present. Its probability is alpha, generally set to .05
Type II error: Retaining a false null, i.e., missing an effect. Its probability is beta
– Power is the complement of beta (power = 1 - beta) and is the probability of detecting a present effect. Power is useful for determining a sample size


Significance of significance

What significance does NOT mean:
1. The effect is IMPORTANT
2. The effect is LARGE


Effect size

Effect size = (mean of experimental group - mean of control group) / standard deviation

Cohen’s d (can be bigger than 1)
– Less than .1 trivial
– .1-.3 small
– .3-.5 moderate
– Greater than .5 large

r - correlation coefficient
– <.2 - small
– .5 - large
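The effect size formula can be sketched in a few lines of Python. The slide does not say which standard deviation to use, so this example assumes the pooled standard deviation of the two groups, a common choice for Cohen's d. The data are invented.

```python
import math
from statistics import mean, variance

def cohens_d(experimental, control):
    """Mean difference divided by the pooled SD (pooling is an assumption here)."""
    n1, n2 = len(experimental), len(control)
    pooled_variance = (((n1 - 1) * variance(experimental)
                        + (n2 - 1) * variance(control)) / (n1 + n2 - 2))
    return (mean(experimental) - mean(control)) / math.sqrt(pooled_variance)

# Hypothetical scores
experimental = [14, 15, 16, 17, 18]
control = [12, 13, 14, 15, 16]

d = cohens_d(experimental, control)
print(f"Cohen's d = {d:.2f}")
```

With these made-up scores the group means differ by more than one pooled standard deviation, so d exceeds 1.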


Power

The probability of rejecting the null hypothesis when it is, in fact, false.
Power is bigger if
– Effect size is bigger
– Sample size is bigger
– Alpha is bigger
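The three relationships can be illustrated with a simplified power calculation. This sketch assumes a one-sided, one-sample z-test with a standardized effect size d, for which power = Φ(d·√n − z₁₋α); real studies usually need the t-based version, so treat the numbers as approximate. The function name is hypothetical.

```python
import math
from statistics import NormalDist

def power_one_sided_z(effect_size: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a one-sided, one-sample z-test (simplifying assumption)."""
    z = NormalDist()
    z_critical = z.inv_cdf(1 - alpha)  # cutoff for rejecting H0
    return z.cdf(effect_size * math.sqrt(n) - z_critical)

baseline = power_one_sided_z(effect_size=0.3, n=30)
bigger_effect = power_one_sided_z(effect_size=0.5, n=30)
bigger_sample = power_one_sided_z(effect_size=0.3, n=60)
bigger_alpha = power_one_sided_z(effect_size=0.3, n=30, alpha=0.10)

print(f"baseline      : {baseline:.2f}")
print(f"bigger effect : {bigger_effect:.2f}")
print(f"bigger sample : {bigger_sample:.2f}")
print(f"bigger alpha  : {bigger_alpha:.2f}")
```

Each change raises power relative to the baseline. When the effect size is zero, the function returns alpha, which is the Type I error rate.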


What Kind of Test to Use?

1. Define your variables.
– Independent Variables: what you manipulate
– Dependent Variables: what you measure


Numerical or Categorical

Numerical
– Values defined by numbers.
– You can calculate mean and standard deviation.

Categorical
– Values defined by labels.
– You can calculate frequency (how many of each category).


Tests to Use if All Variables are Numerical

Examples:
– How SAT is related to the average GPA during the senior year of college.
– How attitudes toward George Bush are related to attitudes toward abortion.

Test to use:
– Regression/Correlation analysis
  Statistic: r


Linear Correlation

Examining the relationship between height and self esteem
Examining the relationship between emotional intelligence and social skills.


Linear Correlation

Positive (r > 0)
[Figure: scatterplot; x-axis: hours of sleep before the exam, y-axis: performance]

Negative (r < 0)
[Figure: scatterplot; x-axis: number of bystanders, y-axis: help]


r = ?
[Figure: scatterplot; x-axis: anxiety, y-axis: performance]

r = 0
1 > r > 0


r - correlation

A social scientist wishes to determine whether there is a relationship between the attractiveness scores (on a 100-point scale) assigned to college students by a panel of peers and their scores on a paper-and-pencil test of anxiety.
– Variable 1 (numerical): attractiveness
– Variable 2 (numerical): anxiety
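A minimal sketch of how r is computed for two numerical variables like these. The attractiveness and anxiety scores below are invented for illustration, and `pearson_r` is a hand-written helper rather than a library function.

```python
import math

def pearson_r(x, y):
    """Pearson correlation: co-variation of x and y scaled by their spreads."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sum_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sum_xx = sum((a - mean_x) ** 2 for a in x)
    sum_yy = sum((b - mean_y) ** 2 for b in y)
    return sum_xy / math.sqrt(sum_xx * sum_yy)

# Hypothetical data for six students
attractiveness = [40, 55, 60, 70, 85, 90]
anxiety = [30, 28, 25, 22, 18, 15]

r = pearson_r(attractiveness, anxiety)
print(f"r = {r:.2f}")
```

In this invented data set anxiety falls as attractiveness rises, so r is negative; a perfectly linear increasing relationship would give r = 1.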


Tests to Use if Variables are Mixed: Categorical + Numerical

Examples:
1. Do females experience more empathy toward a stranger than males do?
2. Which religious group gives more support to the president?

Test to use:
– t-test or different kinds of ANOVA.
  Depends on the number of groups and variables.


t-test for independent samples.

A school psychologist compares the reading comprehension scores of migrant children who, as a result of random assignment, are enrolled in either a special bilingual reading program or a traditional reading program.
– IV (categorical, 2 levels): Reading program
– DV (numerical): Reading score.
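A sketch of the independent-samples t statistic for a design like this one. It assumes equal variances in the two groups (the pooled-variance form), and the reading scores are made up for illustration.

```python
import math
from statistics import mean, variance

def independent_t(group1, group2):
    """Pooled-variance t statistic and its degrees of freedom (equal variances assumed)."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (((n1 - 1) * variance(group1)
                        + (n2 - 1) * variance(group2)) / (n1 + n2 - 2))
    standard_error = math.sqrt(pooled_variance * (1 / n1 + 1 / n2))
    return (mean(group1) - mean(group2)) / standard_error, n1 + n2 - 2

# Hypothetical reading comprehension scores
bilingual = [78, 82, 85, 88, 92]
traditional = [70, 75, 80, 82, 83]

t_stat, df = independent_t(bilingual, traditional)
print(f"t({df}) = {t_stat:.2f}")  # compare with the tabled two-tailed critical value 2.306 for df = 8
```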


t-test for matched samples.

To determine whether speed reading influences reading comprehension, a researcher obtains two reading comprehension scores for each student in a group of high school students, once before and once after training in speed reading.
– IV (categorical, 2 levels): training
– DV (numerical): reading comprehension score
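Because each student contributes two scores, the matched-samples test works on the per-student differences rather than on the two groups separately. A minimal sketch with invented before and after scores:

```python
import math
from statistics import mean, stdev

# Hypothetical comprehension scores for six students
before = [60, 65, 70, 72, 80, 85]
after = [62, 70, 69, 78, 84, 90]

# One difference per student (after minus before)
differences = [post - pre for pre, post in zip(before, after)]

n = len(differences)
standard_error = stdev(differences) / math.sqrt(n)
t_stat = mean(differences) / standard_error
df = n - 1

print(f"mean difference = {mean(differences)}, t({df}) = {t_stat:.2f}")
# compare with the tabled two-tailed critical value 2.571 for df = 5
```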


One-way ANOVA

In a study of problem solving, a researcher randomly assigns college students to one of three groups (high-status leader, equal-status leader, or no leader) and measures the amount of time required to solve a complex puzzle.
– IV (categorical, 3 levels): type of leader
– DV (numerical): time
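The F statistic for a one-way ANOVA compares variability between group means with variability within groups. The sketch below computes it by hand for three groups; the solution times are hypothetical.

```python
from statistics import mean

def one_way_anova(*groups):
    """Return (F, df_between, df_within) for independent groups."""
    all_values = [x for group in groups for x in group]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical minutes needed to solve the puzzle
high_status = [8, 9, 10, 11, 12]
equal_status = [10, 11, 12, 13, 14]
no_leader = [13, 14, 15, 16, 17]

f_stat, df_between, df_within = one_way_anova(high_status, equal_status, no_leader)
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```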


2-way ANOVA

To determine whether cramming can increase GRE scores, a researcher randomly assigns college students to either a specialized GRE test-taking workshop, a general test-taking workshop, or a control (non-test-taking) workshop. Furthermore, to check the effect of scheduling, students are randomly assigned, in equal numbers, to experience their workshop either during one long marathon weekend or during weekly sessions.
– DV (numerical): GRE score
– IV (categorical, 3 levels): kind of workshop
– IV (categorical, 2 levels): session


Tests to Use if There are No Numerical Variables

Example:
– Is there a different frequency of rainy days in the four seasons?

Test to use:
– One variable
  One-way Chi-square
– Two variables
  Two-way Chi-square


One-Way Chi-Square.

A random sample of 90 college students indicates whether they most desire love, wealth, power, health, fame, or family happiness.
– Variable 1 (categorical): desire.
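A one-way chi-square compares observed category counts with the counts expected if every category were equally popular. In the sketch below the observed counts are invented (they only need to sum to 90), and the equal-preference null hypothesis is an assumption about how the example would be tested.

```python
# Hypothetical counts for the 90 students
observed = {
    "love": 25,
    "wealth": 10,
    "power": 5,
    "health": 20,
    "fame": 10,
    "family happiness": 20,
}

n = sum(observed.values())
expected = n / len(observed)  # equal preference under H0: 15 per category

chi2 = sum((count - expected) ** 2 / expected for count in observed.values())
df = len(observed) - 1

print(f"chi-square({df}) = {chi2:.2f}")  # tabled critical value at alpha = .05, df = 5: 11.07
```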


2-way Chi-Square

A social scientist cross-classifies the responses of 100 randomly selected people on the basis of gender and whether or not they favor strong gun control laws.
– Variable 1 (categorical, 2 levels): gender
– Variable 2 (categorical, 2 levels): opinion toward gun control.
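For two categorical variables, the expected count in each cell is (row total × column total) / N, and the chi-square statistic sums the squared gaps between observed and expected counts. A sketch with an invented 2 × 2 table of 100 people:

```python
# Hypothetical cross-classification: rows = gender, columns = (favor, oppose)
table = [
    [35, 15],  # women
    [25, 25],  # men
]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
n = sum(row_totals)

chi2 = 0.0
for i, row in enumerate(table):
    for j, observed in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n  # count expected if variables are independent
        chi2 += (observed - expected) ** 2 / expected

df = (len(table) - 1) * (len(table[0]) - 1)

print(f"chi-square({df}) = {chi2:.2f}")  # tabled critical value at alpha = .05, df = 1: 3.841
```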


Article 1: Meghany et al., 2009. Predictors of resolution of aberrant drug behavior in chronic pain patients treated in a structured opioid risk management program

Research Question:
– For chronic pain patients prescribed opioids who are at risk for developing addiction to opioids, what predicts success in a program designed to manage those risks?

This is a question about moderators of a supposed treatment effect



Sample:
– Consecutive referrals to the Opioid Renewal Clinic for aberrant drug related behaviors (ADRBs) over the course of two and a half years (N = 195 [49% of the 401 total referred])
– All participants had chronic non-cancer-related pain

Measures: Predictors
– Demographics: Age, Race, Gender, Marital Status, Employment, Veteran Status/Impairment
– Pain: Primary Diagnosis, # of Pain Diagnoses
– Comorbidity: Medical and Psychiatric

Measures: Outcome
– Staying in the program vs. Discharge/Drop-out
  Assumes resolution of ADRBs and negative urine screens for illicit drugs



Predictors are:
– Categorical
– Continuous

Outcome is:
– Categorical
  Specifically, dichotomous


Analysis Plan

1. Descriptives
– 51% (106) did not complete the program
  – 58% (61) discharged
  – 19% (20) moved into addiction treatment
  – 23% (25) self-discharged/dropped out
– General tendencies on the predictors for the sample

2. Comparison of groups (89 vs. 106) on predictor variables
– No demographic differences
– Successful completers had higher medical comorbidity and more pain diagnoses, were more likely to have a history of depression, and were less likely to have abused cocaine


Analysis Plan

3. Multivariate Test: Logistic Regression
– Point of multivariate test: Takes the correlation between variables into account so you learn which variables have unique effects
– Certain relationships disappear because of those correlations and others emerge
– # of pain diagnoses and history of cocaine abuse remained significant independent predictors; marital status emerged as a protective factor

Resources

• Gonick & Smith (1993). The Cartoon Guide to Statistics.
• Grimm & Yarnold (1995). Reading and Understanding Multivariate Statistics.
• Grimm & Yarnold (2000). Reading and Understanding More Multivariate Statistics.
• Abelson (1995). Statistics as Principled Argument.
