
Post on 26-Dec-2015


Page 1: Dr. G. Johnson,  Data Analysis for Description Research Methods for Public Administrators Dr. Gail Johnson

Dr. G. Johnson, www.ResearchDemystified.org


Data Analysis for Description

Research Methods for Public Administrators

Dr. Gail Johnson


Simple But Concrete

The Children’s Defense Fund reports on each day in America:
  Four children are killed by abuse or neglect
  Five children or teens commit suicide
  Eight children or teens are killed by firearms
  Seventy-five babies die before their 1st birthday

Source: http://www.childrensdefense.org/child-research-data-publications/each-day-in-america.html


Simple But Concrete

A million seconds = 11 ½ days
A billion seconds = 32 years
A trillion seconds = 32,000 years
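
The conversions on this slide can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes a year of 365.25 days.

```python
SECONDS_PER_DAY = 60 * 60 * 24
DAYS_PER_YEAR = 365.25  # assumption: average year length, including leap years


def seconds_to_days(seconds):
    """Convert a number of seconds to days."""
    return seconds / SECONDS_PER_DAY


def seconds_to_years(seconds):
    """Convert a number of seconds to years."""
    return seconds / (SECONDS_PER_DAY * DAYS_PER_YEAR)


million_in_days = seconds_to_days(1_000_000)
billion_in_years = seconds_to_years(1_000_000_000)
trillion_in_years = seconds_to_years(1_000_000_000_000)

print(f"A million seconds is about {million_in_days:.1f} days")
print(f"A billion seconds is about {billion_in_years:.0f} years")
print(f"A trillion seconds is about {trillion_in_years:,.0f} years")
```

The slide rounds these results to the nearest convenient figure, which is the point: a concrete unit makes a large number easier to grasp.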


Simple But Concrete

A $700 billion bailout translates into a $2,333 IOU from every person in the U.S.

Or, using a different metric, it comes to $45 per week for a year for each person in the U.S.

Going one step further, it comes out to $6 a day.

Framing: are you willing to pay $6 a day to have a functioning financial system?

Read more: http://www.time.com/time/business/article/0,8599,1870699,00.html#ixzz0aqek0mRZ


Going Too Far?

Six dollars a day is also 25 cents an hour, or less than half a penny a minute.

Framing: Would you be willing to pay less than half a penny a minute?

Key Point: Does the comparison point make a difference in what you would be willing to pay?

Read more: http://www.time.com/time/business/article/0,8599,1870699,00.html#ixzz0aqf9HSQ9
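
The reframing above is a chain of divisions. The sketch below reproduces it under one assumption that the slides do not state: a U.S. population of about 300 million, which is the figure the $2,333 per person implies. The variable names are illustrative only.

```python
BAILOUT_DOLLARS = 700e9
US_POPULATION = 300e6  # assumption: roughly the population implied by $2,333 per person

per_person = BAILOUT_DOLLARS / US_POPULATION  # one-time IOU per person
per_week = per_person / 52                    # spread over one year of weeks
per_day = per_person / 365                    # spread over one year of days
per_hour = per_day / 24
cents_per_minute = per_hour / 60 * 100

print(f"Per person: ${per_person:,.0f}")
print(f"Per week:   ${per_week:.0f}")
print(f"Per day:    ${per_day:.2f}")
print(f"Per hour:   ${per_hour:.2f}")
print(f"Per minute: {cents_per_minute:.2f} cents")
```

Every line describes the same $700 billion; only the comparison point changes.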


Common Descriptive Analysis

Counts: how many
  Decennial census

Percents
  Women earned 77% of what men earned in 2006, up from 59% in 1970

Parts of a whole
  Percents (75%) and proportions (.75 or three-quarters)


Common Descriptive Analysis

But be mindful of “bigger pie” distortions when working with percents and proportions
  If the pie grows much faster than the slice, the slice will appear relatively smaller as a percent even though it still grew

A good example is the budget deficit as a percent of GDP: if GDP grows much faster than the budget deficit, the deficit will appear smaller even though it has also grown.
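
A small numeric sketch of the "bigger pie" effect follows. The deficit and GDP figures are hypothetical, chosen only so the arithmetic is easy to follow; they are not actual budget data.

```python
def share_percent(part, whole):
    """Return part as a percent of whole."""
    return part * 100 / whole


# Hypothetical figures in billions of dollars (not real data).
deficit_then, gdp_then = 100, 1_000
deficit_now, gdp_now = 150, 3_000

share_then = share_percent(deficit_then, gdp_then)
share_now = share_percent(deficit_now, gdp_now)
deficit_growth = (deficit_now / deficit_then - 1) * 100

print(f"Deficit grew by {deficit_growth:.0f}%")
print(f"Deficit as a share of GDP went from {share_then:.0f}% to {share_now:.0f}%")
```

In this made-up case the slice grew by half, yet its share of the pie was cut in half because the pie tripled.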


Common Descriptive Analysis

Rates: number of occurrences that are standardized
  Deaths of infants per 100,000 births
  Crop yields per acre
  Crime rates

Rates provide an apples-to-apples comparison between places of different size or populations
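
To illustrate the apples-to-apples idea, here is a minimal sketch that standardizes counts to a rate per 100,000 people. The two cities and their numbers are hypothetical.

```python
def rate_per(occurrences, population, base=100_000):
    """Standardize a raw count to a rate per `base` people."""
    return occurrences * base / population


# Hypothetical cities: B reports four times as many crimes as A,
# but it also has eight times as many residents.
crimes_a, population_a = 500, 50_000
crimes_b, population_b = 2_000, 400_000

rate_a = rate_per(crimes_a, population_a)
rate_b = rate_per(crimes_b, population_b)

print(f"City A: {rate_a:,.0f} crimes per 100,000 residents")
print(f"City B: {rate_b:,.0f} crimes per 100,000 residents")
```

The raw counts suggest City B has the bigger problem; the rates suggest the opposite.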


Common Descriptive Analysis

Ratio: numbers presented in relationship to each other
  Student to teacher ratio: 15:1
  Divide the number of students by the number of teachers
  1,500 students and 45 teachers equals roughly a 33 to 1 student to teacher ratio (1,500 divided by 45)
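
The ratio calculation from the slide, sketched in code. The function name is illustrative; the student and teacher counts are the ones given above.

```python
def ratio_to_one(numerator, denominator):
    """Return how many of the numerator there are for each one of the denominator."""
    return numerator / denominator


students_per_teacher = ratio_to_one(1_500, 45)
label = f"{round(students_per_teacher)} to 1"

print(f"Exact value: {students_per_teacher:.2f}")
print(f"Reported as: {label}")
```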


Common Descriptive Analysis

Rates of change
  Percentage change from one time period to the other
  For example: The budget increased 23% from FY 2006 to FY 2007.

Three steps:
1. Divide the newest data by the oldest data
2. Subtract 1
3. Multiply by 100 to get the percentage change



Common Descriptive Analysis

Rates of change: applied
  What was the rate of change in the 1992 budget deficit as compared to 1980?

1. Divide the 1992 budget deficit ($290 billion) by the 1980 budget deficit ($73.8 billion) = 3.93

2. 3.93 - 1 = 2.93

3. 2.93 x 100 = 293 percent

The budget deficit in current dollars (meaning not adjusted for inflation) increased 293 percent.
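
The three steps can be written as a short function. This is a sketch: the function name is made up here, the deficit figures are the ones on the slide, and the second pair of numbers is a hypothetical budget chosen to give a 23% increase.

```python
def percent_change(oldest, newest):
    """Percentage change from the oldest value to the newest value."""
    ratio = newest / oldest  # step 1: divide newest by oldest
    change = ratio - 1       # step 2: subtract 1
    return change * 100      # step 3: multiply by 100


# Budget deficit in billions of current dollars, 1980 and 1992 (from the slide).
deficit_change = percent_change(73.8, 290)

# Hypothetical budget of 100 growing to 123.
budget_change = percent_change(100, 123)

print(f"Deficit change, 1980 to 1992: {deficit_change:.0f} percent")
print(f"Budget change: {budget_change:.0f} percent")
```

A negative result would indicate a decrease rather than an increase.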


Common Descriptive Analysis

Frequency Distributions
  Number and percents of a single variable


In The News: Women Now Are Majority of College Graduates


Interpretation?

How would you interpret these percentages in the comparative trend analysis?

Are you surprised by the changes over time?

Why or why not?


Frequency and Percent Distributions
  Survey data: analyzed by distributions
  How many men and women are in the program?

Distribution of Respondents by Gender:

        Male                Female          Total
  Number  Percent     Number  Percent      Number
  100     33%         200     67%          300


Frequency and Percent Distributions
  How many men and women are in the program?

Write-up:

Of the 300 people in this program, 67% are women and 33% are men.
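
A frequency and percent distribution like this one can be tallied directly from the raw responses. The sketch below rebuilds the 100 men and 200 women from the slide as a list of answers; the function name is illustrative.

```python
from collections import Counter


def distribution(values):
    """Return {category: (count, rounded percent)} for a single variable."""
    counts = Counter(values)
    total = sum(counts.values())
    return {
        category: (count, round(count / total * 100))
        for category, count in counts.items()
    }


# One entry per respondent, matching the totals on the slide.
respondents = ["Male"] * 100 + ["Female"] * 200
table = distribution(respondents)

for category, (count, percent) in table.items():
    print(f"{category}: {count} ({percent}%)")
```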


Different Analysis Tools For Different Situations
  Frequency/percent distributions make sense when working with nominal and ordinal data
  But frequency/percent distributions for interval/ratio data can result in a ridiculously long table that is impossible to interpret
    If I ask 500 people how many years they have lived in an area, I can get a wide range of answers.
    For this type of data, I would then look at means, medians, and modes to describe that variable.


Describing Distributions

Central tendency
  Means, Medians, Modes
  How similar are the characteristics?
  Example: Use when we want to describe the similarity of the ages of a group of people.

Dispersion
  Range, standard deviation
  How dissimilar are the characteristics?
  Example: how much variation in the ages?


Measures of Central Tendency

The 3-Ms: Mode, Median, Mean.

Mode: most frequent response.
Median: mid-point of the distribution.
Mean: arithmetic average.


Basic Concepts Revisited

Levels of Measurement
  Nominal level data: names, categories
    E.g. gender, religion, state, country
  Ordinal level data: data with an order, going from low to high
    E.g. highest educational degree, income categories, agree-disagree scales
  Interval level data: numbers with equal intervals but no true zero point
    E.g. IQ scores, GRE scores, temperature in degrees Fahrenheit or Celsius
  Ratio level data: real numbers with a true zero point
    E.g. age, weight, income


Which Measure of Central Tendency to Use?

Depends on the type of data you have:
  Nominal data: mode
  Ordinal data: mode and median
  Interval/ratio: mode, median and mean


For Interval Or Ratio Data: Which One To Use?

Concept of the normal distribution, also called the bell-shaped curve
  In a normal distribution, the mean, median and mode should be very similar

Use the mean if the distribution is normal
Use the median if the distribution is not normal


Normal Distribution: Bell-Shaped Curve

[Figure: bell-shaped curve with the mean marked. Source: http://en.wikipedia.org/wiki/Normal_distribution]


Office contributions

$10, $1, $.50, $.25, $.25
  The mean is $2.40 (add up and divide by 5)
  The median is $.50 (the mid-point of this distribution)
  The mode is $.25 (the most frequently reported contribution)
  The best description of contributions is the median.
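
The three measures for the office contributions can be reproduced with Python's standard `statistics` module. This is just a check of the numbers above; the variable names are illustrative.

```python
import statistics

# The five office contributions from the slide, in dollars.
contributions = [10.00, 1.00, 0.50, 0.25, 0.25]

mean_gift = statistics.mean(contributions)      # add up and divide by 5
median_gift = statistics.median(contributions)  # middle value once sorted
mode_gift = statistics.mode(contributions)      # most frequent value

print(f"Mean:   ${mean_gift:.2f}")
print(f"Median: ${median_gift:.2f}")
print(f"Mode:   ${mode_gift:.2f}")
```

The single $10 contribution pulls the mean well above what most people gave, which is why the median describes this group better.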


Salaries

Assume that you had 11 teachers. 10 teachers earned $21,000 per year and one earned $1,000,000.

What would be the best measure to describe this data?


Salaries

The average salary would be $110,000.
The median and mode are both $21,000.
The curve would be positively skewed, i.e. the mean is higher than the mode and median
The median would do the best job at describing the center of the salaries
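
The salary example can be used to sketch a rough check for skew: compare the mean with the median. This comparison is a rule of thumb rather than a formal skewness statistic, and the function name is made up for this illustration.

```python
import statistics


def skew_direction(values):
    """Rough heuristic: a mean above the median suggests positive skew."""
    mean = statistics.mean(values)
    median = statistics.median(values)
    if mean > median:
        return "positive"
    if mean < median:
        return "negative"
    return "none apparent"


# Ten teachers at $21,000 and one at $1,000,000, as on the slide.
salaries = [21_000] * 10 + [1_000_000]

print(f"Mean:   ${statistics.mean(salaries):,.0f}")
print(f"Median: ${statistics.median(salaries):,.0f}")
print(f"Skew:   {skew_direction(salaries)}")
```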


Skewed Data

1. negative skew: The mass of the distribution is concentrated on the right of the figure. It has relatively few low values. The distribution is said to be left-skewed.

2. positive skew: The mass of the distribution is concentrated on the left of the figure. It has relatively few high values. The distribution is said to be right-skewed. The $1 million salary pulls the average up.

Wikipedia: http://en.wikipedia.org/wiki/Skewness


Skewed Distributions: Negative and Positive

http://en.wikipedia.org/wiki/File:Skewness_Statistics.svg


Using Means With Survey Data?

Survey data is typically coded using numbers:
  Gender: Male is coded 1, Female is coded 2
  It is faster and less error-prone to code variables using numbers

But the computer could treat these as numbers and will compute a mean if asked
  How would you interpret a mean for gender of 1.6? Or a mean for religion of 2.8?


Do Not Use Means With Nominal Data

Gender (and religion) are nominal variables and should only be reported in terms of distributions:
  Frequency distribution: 10 men and 12 women
  Percentage distribution: 45% men and 55% women
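
The sketch below shows why this matters in practice. It uses the coding from the earlier slide (1 for male, 2 for female) and the 10 men and 12 women above. The software will happily return a mean of the codes, but only the distribution is interpretable. The dictionary and variable names are illustrative.

```python
import statistics
from collections import Counter

GENDER_LABELS = {1: "men", 2: "women"}  # numeric codes are labels, not quantities

# 10 men (coded 1) and 12 women (coded 2), as on the slide.
coded = [1] * 10 + [2] * 12

# Computable, but it has no meaningful interpretation.
meaningless_mean = statistics.mean(coded)

# What should be reported instead: counts and percents per category.
counts = Counter(GENDER_LABELS[code] for code in coded)
percents = {label: round(n / len(coded) * 100) for label, n in counts.items()}

print(f"Mean of the codes (not meaningful): {meaningless_mean:.2f}")
print(f"Frequency distribution: {dict(counts)}")
print(f"Percentage distribution: {percents}")
```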


Using Means With Survey Data?

Scales (very satisfied <-> very dissatisfied) are ordinal scales
  But they are coded into the computer using numbers
    5 for very satisfied <-> 1 for very dissatisfied

The computer will compute a mean if asked:
  The mean was 3.8 for job satisfaction.
  The mean satisfaction with faculty performance was 4.2 on a scale from 1-5

Grade-point averages are an example of means based on an ordinal scale (letter grades A to F, on a scale of 0-4)


Using Means With Ordinal Data?

There is disagreement in the field, partly based on academic discipline, about whether to use means with ordinal data.

Things like GPA or faculty ratings are often shown as means.

It is often helpful for researchers to look at the means initially when working with a lot of data: they are looking for unusually high or low means.

It is also true that sometimes it is easier to show the means than the percentage distribution for every variable.


Washington Employee Survey

Question                                                          2006   2007   2009   Percent reporting 4 or 5 (positive)
I know what is expected of me at work.                            4.28   4.25   4.31   87%
I receive recognition for a job well done.                        3.34   3.43   3.47   54%
I have the tools and resources I need to do my job effectively.   3.76   3.75   3.80   70%


Using Means With Ordinal Data?

But most people are more familiar with polling results, which report percent distributions.
  We tend to see something like 55% report supporting cap and trade legislation rather than a mean of 3.4 on a scale of 5 (for) to 1 (against).

The decision about whether means or percent distributions are used to report ordinal data should reflect audience preference and ease of audience understanding, not an ideological stance.
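
Both ways of reporting the same ordinal item can be computed side by side. The ten responses below are hypothetical, on a 1-5 scale where 4 and 5 count as positive; they are not drawn from any survey mentioned in these slides.

```python
import statistics

# Hypothetical responses on a 5 (very satisfied) to 1 (very dissatisfied) scale.
responses = [5, 5, 4, 4, 4, 3, 3, 2, 2, 1]

# Option 1: report the mean of the coded responses.
mean_score = statistics.mean(responses)

# Option 2: report the percent giving a positive answer (4 or 5).
positive = sum(1 for r in responses if r >= 4)
percent_positive = positive / len(responses) * 100

print(f"Mean score: {mean_score:.1f} on a scale from 1-5")
print(f"Percent reporting 4 or 5: {percent_positive:.0f}%")
```

Which of the two lines to put in a report is the audience question raised above.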


Measures of Dispersion

Used with Interval and Ratio Data

Simple Description: The Range
  Reported salaries ranged from $21,000 to $1,000,000
  Ages in the group ranged from 18 to 32

Standard Deviation
  Measures the dispersion in terms of the distance from the mean
  Small standard deviation: not much dispersion
  Large standard deviation: lots of dispersion


Standard Deviation

Normal Distribution: Bell-shaped curve
  About 68% of the values fall within 1 standard deviation of the mean
  About 95% of the values fall within 2 standard deviations of the mean


Normal Distribution

[Figure: normal curve with the mean at the center and standard deviations marked on either side; the labeled region covers 95% of the distribution.]


Applying the Standard Deviation

Average test score = 60.
The standard deviation is 10.
Therefore, if the scores are normally distributed, about 95% of the scores are between 40 and 80.
  Calculation: 60 + 20 = 80; 60 - 20 = 40.
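
The 68% and 95% figures can be checked against a normal curve with the test-score values above (mean 60, standard deviation 10). The sketch assumes the scores really are normally distributed and uses `NormalDist` from Python's standard `statistics` module (Python 3.8 or later).

```python
from statistics import NormalDist

MEAN_SCORE = 60
STD_DEV = 10

scores = NormalDist(mu=MEAN_SCORE, sigma=STD_DEV)  # assumes a normal distribution

# Range covered by 2 standard deviations on each side of the mean.
low = MEAN_SCORE - 2 * STD_DEV
high = MEAN_SCORE + 2 * STD_DEV

# Share of the distribution within 1 and within 2 standard deviations.
within_1sd = scores.cdf(MEAN_SCORE + STD_DEV) - scores.cdf(MEAN_SCORE - STD_DEV)
within_2sd = scores.cdf(high) - scores.cdf(low)

print(f"Within 1 standard deviation: {within_1sd:.1%}")
print(f"Within 2 standard deviations ({low} to {high}): {within_2sd:.1%}")
```

If the scores are skewed rather than bell-shaped, these percentages no longer apply.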


Standard Deviation with Means

The Standard Deviation is used with interval/ratio level data

Typically, standard deviations are presented with means so the reader can tell whether there is a lot or a little variation in the distribution.

Note: the standard deviation is also used in other statistical calculations, such as z-scores and confidence intervals.


Describing Two Variables Simultaneously

Cross-tabulations (cross tabs, contingency tables)
  Used when working with nominal and ordinal data
  It provides great detail


Describing Two Variables Simultaneously

Detail about the race and gender of the 233 people in the workplace:

Race     Men    Women
White    21%    31%
Black    15%    11%
Other    14%    6%


Describing Race and Gender

Write-up:

Of the 233 employees, the greatest proportion are white women (31%), followed by white men (21%). Fifteen percent of the employees are black men and 11% are black women; 14% are men of another racial identity and 6% are women of another racial identity.
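
A cross-tabulation counts each combination of two variables. The sketch below shows the mechanics on five hypothetical employee records; it does not reproduce the 233-person workplace above, whose raw data is not given. The function name is illustrative.

```python
from collections import Counter


def crosstab_percent(records):
    """Return each (race, gender) cell as a rounded percent of all records."""
    counts = Counter(records)
    total = len(records)
    return {cell: round(n / total * 100) for cell, n in counts.items()}


# Hypothetical records, one (race, gender) pair per employee.
employees = [
    ("White", "Women"),
    ("White", "Women"),
    ("White", "Men"),
    ("Black", "Men"),
    ("Other", "Men"),
]

cells = crosstab_percent(employees)

for (race, gender), percent in cells.items():
    print(f"{race} {gender}: {percent}%")
```

Here every cell is a percent of the whole group, which is how the table above is laid out.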


Describing Two Variables Simultaneously: Comparison of Means

Used when one variable is nominal or ordinal, and the second variable is at the interval/ratio level of measurement.

Examples:
  Men in the MPA program have a GPA of 3.2 as compared to 3.0 for women.
  The mean overall citizen satisfaction score is 4.2 this year as compared to 3.5 last year.
  Mean salary for women was $35,000 as compared to $38,000 for men last year.
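
A comparison of means groups the interval/ratio variable by the categories of the nominal one and averages within each group. The four salary records below are hypothetical, picked so that the group means match the $35,000 and $38,000 in the last example; the function name is illustrative.

```python
import statistics
from collections import defaultdict


def mean_by_group(rows):
    """Average the numeric value within each category."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    return {group: statistics.mean(values) for group, values in groups.items()}


# Hypothetical (gender, salary) records.
records = [
    ("women", 34_000),
    ("women", 36_000),
    ("men", 37_000),
    ("men", 39_000),
]

means = mean_by_group(records)

for group, mean_salary in means.items():
    print(f"Mean salary for {group}: ${mean_salary:,.0f}")
```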


Key Points

These simple descriptive analysis techniques can be effective:
  They illuminate, provide feedback, inform and might persuade.
  The math is generally straightforward.
  Descriptive data is generally easy for many people to understand as compared to more complex statistics (stay tuned).

Complex statistics are not inherently better!


The Tough Question

If descriptive data is distorted, it tends to be in the way things are being counted and measured. The math is usually correct.
  Example: The federal debt is often presented just in terms of the debt held by the public, but the total debt includes money borrowed from other government funds.

As a result, the debt looks smaller than it actually is.


The Tough Question

If descriptive data is distorted, it tends to be in the way things are being counted and measured. The math is usually correct.
  Example: Health insurance profits look different when calculated as a percent of corporate revenue than when calculated as a percent of all spending on health care.
  They will look smaller when presented as a percent of all health care spending, which is larger than just corporate insurance revenue.
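
The effect of the denominator can be sketched with made-up numbers. The profit, revenue and spending figures below are purely hypothetical and are not estimates of any real amounts.

```python
def as_percent_of(amount, base):
    """Return amount as a percent of base."""
    return amount * 100 / base


# Hypothetical figures in billions of dollars (not real data).
profit = 10
insurer_revenue = 250
all_health_spending = 2_500

share_of_revenue = as_percent_of(profit, insurer_revenue)
share_of_all_spending = as_percent_of(profit, all_health_spending)

print(f"Profit as a percent of insurer revenue:     {share_of_revenue:.1f}%")
print(f"Profit as a percent of all health spending: {share_of_all_spending:.1f}%")
```

The profit is the same in both lines; only the base it is compared against has changed.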


The Tough Question

Always ask: what exactly is being measured and counted?

Consider whether there are other ways of counting and other ways of doing the analysis that might yield different results (or create different perceptions).

Do the choices reflect a political agenda?


Creative Commons

This PowerPoint is meant to be used and shared with attribution

Please provide feedback

If you make changes, please share freely and send me a copy of changes: [email protected]

Visit www.creativecommons.org for more information