Section I. Statistics What do they mean and why are they important?

Upload: faunus

Post on 24-Feb-2016


TRANSCRIPT

Page 1: Section I. Statistics

Section I. Statistics

What do they mean and why are they important?

Page 2: Section I. Statistics

• To be an intelligent consumer of statistics, your first reflex must be to question the statistics that you encounter. The British Prime Minister Benjamin Disraeli is often credited with saying, "There are three kinds of lies -- lies, damned lies, and statistics."

• It is important to think about the numbers, their sources, and most importantly, the procedures used to generate them.

What do stats mean?

Page 3: Section I. Statistics

• Weather forecasts
• Emergency preparedness
• Predicting disease
• Medical studies
• Genetics
• Political campaigns
• Insurance
• Consumer goods
• Quality testing
• Stock market

Top 10 ways you use statistics every day

Page 4: Section I. Statistics

• Six good reasons to study statistics
  – to be able to effectively conduct research,
  – to be able to read and evaluate journal articles,
  – to further develop critical thinking and analytic skills,
  – to act as an informed consumer,
  – and to know when you need to hire outside statistical help.
  – Even Florence Nightingale did it!

But I’m never going to do research!

Page 5: Section I. Statistics

• Increasing emphasis on evidence-based practice
  – Informs nurses’ decisions and actions
  – Empowers nurses to make clinical decisions which benefit their patients, whether individual or community
  – Friendly nursing research environment required for Magnet status
  – Increases recognition for nursing contribution in health care and policy

Why nursing research

Page 6: Section I. Statistics

• The characteristics we are measuring
  – Vary according to the population, patient, event, intervention
• Data levels of measurement help us measure the variables
  – Nominal
  – Ordinal
  – Interval
  – Ratio

Variables

Page 7: Section I. Statistics

• Sometimes called categorical or qualitative
  – Permissible statistics: mode, chi-squared
  – Lowest form of data, least sophisticated
• Names
• Characteristics/descriptive (e.g. pain: throbbing, stabbing, dull)
• Letters (e.g. M/F, Y/N)
• Numbers may be assigned to designate categories but have no numerical meaning (e.g. M=1, F=2)

Data levels of measurement: Nominal

Page 8: Section I. Statistics

• Permissible statistics: median, percentile
• Can’t be added
• Rank order
  – 1st, 2nd, 3rd
• Rating
  – Pain rating 0-10
• Likert scale

Data Levels of measurement: Ordinal

Page 9: Section I. Statistics

• Dissatisfied, somewhat dissatisfied, neither satisfied nor dissatisfied, somewhat satisfied, very satisfied
  – No numerical data to quantify
  – Answers run on a continuum

Likert scales

Page 10: Section I. Statistics

• Permissible statistics: mean, SD, correlation, regression, ANOVA
  – Rank ordering of objects
  – Equivalent distance between each measurement
  – The Fahrenheit scale is a clear example of the interval scale of measurement
  – Arbitrary zero does not represent the lowest value

Data Levels of measurement: Interval

Page 11: Section I. Statistics

• Highest level of measurement
• Permissible statistics: same as interval plus more
• The ratio scale of measurement is similar to the interval scale in that it also represents quantity and has equality of units.
• Has an absolute zero (no numbers exist below zero). Very often, physical measures will represent ratio data (for example, height and weight). Example: measuring the length of a piece of wood in centimeters: you have quantity, equal units, and the measure can’t go below zero centimeters.

Data Levels of measurement : Ratio

Page 12: Section I. Statistics

Subject   Ratio level   Interval level   Ordinal level   Nominal level
Cookie    180           70               6               2
Bunny     110           0                1               1
Frosty    165           55               4               2
Tootsie   130           20               3               1
Candy     175           65               5               2
Fluffy    115           5                2               1

Examples of data levels of measurement
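
A minimal Python sketch of what the table implies, using only the values listed above: counts for the nominal column, a median for the ordinal column, and means for the interval and ratio columns. It also illustrates, as an assumption about how to read the table, that differences survive the move from ratio to interval level while ratios do not.

```python
from collections import Counter
from statistics import mean, median

# Values copied from the table above, in subject order:
# Cookie, Bunny, Frosty, Tootsie, Candy, Fluffy.
ratio_level = [180, 110, 165, 130, 175, 115]
interval_level = [70, 0, 55, 20, 65, 5]
ordinal_level = [6, 1, 4, 3, 5, 2]
nominal_level = [2, 1, 2, 1, 2, 1]

# Nominal: only counting categories makes sense.
nominal_counts = Counter(nominal_level)

# Ordinal: the median (middle rank) is permissible.
ordinal_median = median(ordinal_level)

# Interval and ratio: means are permissible.
interval_mean = mean(interval_level)
ratio_mean = mean(ratio_level)

# Differences agree across the two scales (Cookie vs. Tootsie) ...
diff_ratio = ratio_level[0] - ratio_level[3]
diff_interval = interval_level[0] - interval_level[3]

# ... but "how many times larger" is only meaningful at the ratio level.
times_ratio = ratio_level[0] / ratio_level[3]
times_interval = interval_level[0] / interval_level[3]

print(nominal_counts, ordinal_median, interval_mean, ratio_mean)
print(diff_ratio, diff_interval, times_ratio, times_interval)
```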

Page 13: Section I. Statistics

• The colors of M&M candies would be which type of measurement?

A. Interval
B. Nominal
C. Ordinal
D. Ratio

Question 1

Page 14: Section I. Statistics

• Height, weight, lab test results, and age are examples of which type of data measurement?

A. Ratio
B. Nominal
C. Interval
D. Ordinal

Question 2

Page 15: Section I. Statistics

• The Rankin scale is used to assess functional status after stroke. Measurements are:

• 0 = no symptoms at all
• 1 = symptoms with no significant disability
• 2 = slight disability; unable to carry out previous activities
• 3 = moderate disability; needs some assistance, can walk alone
• 4 = moderately severe disability; unable to walk or attend to bodily functions without assistance
• 5 = severe disability; bedridden, incontinent, needs constant nursing care
• 6 = dead

Rankin Scale

Page 16: Section I. Statistics

• The Rankin scale is which type of measurement?

A. Ratio
B. Nominal
C. Interval
D. Ordinal

Question 3

Page 17: Section I. Statistics

Section II. Descriptive Statistics and Intro to the Normal Distribution

Page 18: Section I. Statistics

Descriptive Statistics= Describing the Data

• For any study, consider what parts would be useful to describe in numbers
  – Sample
  – Variables of interest

• In any study where the data are numerical, data analysis should begin with descriptive statistics.

• The appropriate choice of descriptive statistics depends on the level of data that was collected!

Page 19: Section I. Statistics

Types of Summary Statistics

• Frequency distributions
  – Ungrouped
  – Grouped
  – Percentages
• Measures of central tendency
• Measures of dispersion

Page 20: Section I. Statistics

Ungrouped Frequency Distributions

• The number of times something happened.
• Used with categorical data (ordinal, nominal)
• As simple as a tally or count

http://www.gigawiz.com/histograms.html

Page 21: Section I. Statistics

Example

• Using ungrouped frequency distributions to describe research variables

• How often newborns fit each demographic criterion or birth attendants reported a particular behavior (e.g. using CHG vs. not)

From Rhee et al. (2008). Maternal and birth attendant hand washing and neonatal mortality in Southern Nepal. Archives of Pediatrics and Adolescent Medicine, 162(7), 603-608

Page 22: Section I. Statistics

Grouped Frequency Distributions

• The number of times something happened.
• Used to break continuous data (often things like age, weight, income) into groups.
  – You will always lose some information by doing this
  – There are conventions for groupings
    • Groups ideally have equal ranges, but you may see open-ended groups at the ends of the data spectrum
    • All data points must fit into a group
    • Not too many, not too few (you don’t want to lose patterns in the data)
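
The grouping conventions above can be sketched in a few lines of Python. The ages, the 10-year width, and the open-ended "60+" group are hypothetical choices made for illustration only.

```python
from collections import Counter

# Hypothetical participant ages (not from any study cited in these slides).
ages = [23, 35, 41, 29, 52, 67, 38, 44, 71, 30, 58, 49]

def age_group(age, start=20, width=10, open_ended_from=60):
    """Return the label of the equal-width group an age falls into."""
    if age < start:
        return f"<{start}"              # open-ended group at the low end
    if age >= open_ended_from:
        return f"{open_ended_from}+"    # open-ended group at the high end
    low = start + ((age - start) // width) * width
    return f"{low}-{low + width - 1}"

grouped = Counter(age_group(a) for a in ages)

for label in sorted(grouped):
    print(label, grouped[label])
```

Every age lands in exactly one group, so the group counts add up to the number of participants; the individual ages themselves are no longer recoverable from the grouped table.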

Page 23: Section I. Statistics

Percentage Distributions

• What percentage of the time something happened.
  – Useful when comparing to studies with different numbers of participants
  – Often presented with other frequency distributions in the following format: No.(%)
  – Often graphically represented using pie charts, bar charts
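
As a small illustration of the No.(%) format, the following sketch tallies a hypothetical set of questionnaire answers and formats each count with its percentage. The answers are made up for the example.

```python
from collections import Counter

# Hypothetical questionnaire answers.
responses = ["yes", "no", "yes", "unsure", "yes", "no", "yes", "yes"]

counts = Counter(responses)
total = sum(counts.values())

# Build "No. (%)" strings, one per answer category.
table = {answer: f"{n} ({100 * n / total:.1f})" for answer, n in counts.items()}

for answer, cell in table.items():
    print(f"{answer:8s} {cell}")
```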

Page 24: Section I. Statistics

Example

• Questionnaires given to parents of under-immunized children.

• The tables indicate the number and percentage of participants selecting each response.

Luthy, K., Beckstrand, R., & Peterson, N. (2009). Parental hesitation as a factor in delayed Childhood Immunization

Page 25: Section I. Statistics

Question

• Which measure of central tendency is being used here to summarize participants’ age:
  – A- Mode
  – B- Median
  – C- Mean
  – D- Standard deviation

Page 26: Section I. Statistics

Measure of Central Tendency

• Used to describe a “typical” result or the middle of the dataset

• Most common measures:
  – Median
  – Mode
  – Mean

Page 27: Section I. Statistics

Median

• Literally the number in the middle of the ordered dataset (with an odd number of scores; with an even number, the average of the two middle scores)
  – 50% of scores above and 50% of scores below this point (known as the 50th percentile)
• Most appropriately used for ordinal data
• Because focus is on middle score, the median is less affected by outliers

Page 28: Section I. Statistics

Mode

• The most common score(s)
  – May or may not be in the “middle” but is always a number in the dataset
  – Most appropriate for nominal data (e.g. most answers are “yes”).

Page 29: Section I. Statistics

Mean

• = Sum of Scores / Total # of Scores
  – Also known as an average
• Data must be continuous to generate a mean (interval and ratio level data only!)
• Most affected by outliers
• May be denoted in a number of ways (M, X̄, mean)
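
To make the comparison between median, mode, and mean concrete, here is a short sketch with hypothetical scores. Adding a single outlier moves the mean a lot, the median a little, and the mode not at all.

```python
from statistics import mean, median, multimode

# Hypothetical scores; the second list adds one extreme value.
scores = [2, 3, 3, 4, 5, 6, 7]
with_outlier = scores + [40]

summary = {
    "without outlier": (mean(scores), median(scores), multimode(scores)),
    "with outlier": (mean(with_outlier), median(with_outlier), multimode(with_outlier)),
}

for label, (m, med, modes) in summary.items():
    print(f"{label}: mean={m:.2f} median={med} mode(s)={modes}")
```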

Page 30: Section I. Statistics

Measures of Variance

• How spread out is the data? Or how different are the scores from one another?
  – Range
    • Subtract the lowest number from the highest number in the set. Tells the total distance between ends of the data set.
  – Variance (interval or ratio levels only!)
    • Computed mathematically (the average squared deviation from the mean) and provides data on dispersion or spread
  – Standard deviation (interval or ratio levels only!)
    • Relates dispersion of values to the mean
    • Is the square root of the variance
    • Usually reported as SD
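
A brief sketch of the three measures on hypothetical scores. It uses the sample formulas (dividing by n - 1), which is an assumption; the slides do not say whether sample or population formulas are intended.

```python
import math
from statistics import stdev, variance

# Hypothetical scores.
scores = [4, 8, 6, 5, 3, 7, 9]

score_range = max(scores) - min(scores)   # highest minus lowest
sample_variance = variance(scores)        # sum of squared deviations / (n - 1)
sample_sd = stdev(scores)                 # square root of the sample variance

print(score_range, round(sample_variance, 3), round(sample_sd, 3))
```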

Page 31: Section I. Statistics

Normal Distribution

• In a true normal distribution, the mean, median, and mode are equal
• No real distribution exactly fits
• However, in many sets of data, the distribution is similar to the normal curve

Page 32: Section I. Statistics

Normal Distribution
• Unique properties
  – All possible values fall under the curve
  – Probability of any score occurring is related to its location under the curve
• Important SDs:
  – 68.3% of all values within 1 SD from mean (+/- 1 SD)
  – 95.4% within 2 SD from mean (+/- 2 SD)
  – 99.7% within 3 SD from mean
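
The percentages for 1, 2, and 3 SD can be checked directly from the normal curve. This sketch uses the standard identity that the share of a normal distribution within k SD of the mean equals erf(k / √2).

```python
import math

def share_within(k_sd):
    """Proportion of a normal distribution lying within k_sd SDs of the mean."""
    return math.erf(k_sd / math.sqrt(2))

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} SD: {100 * share:.2f}%")
```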

Page 33: Section I. Statistics

Section III. Stat theory
Hypotheses
Type 1 and 2 Errors
Level of Significance
Power

Page 34: Section I. Statistics

Probability Theory (p values)
• Deductive
• Used to explain:
  – Extent of a relationship
  – Probability of an event occurring
  – Probability that an event can be accurately predicted
• Expressed as lowercase p, with values written as decimals (p = 0.23) or percents (23%)

Page 35: Section I. Statistics

Probability

• If probability is 0.23, then p = 0.23.
• There is a 23% probability that a particular event will occur.
• For a result to be judged statistically significant, the probability is usually expected to be p < 0.05.
• Example?
• Patients who have a cardiac arrest in the operating room have a 5% chance of death.

Page 36: Section I. Statistics

Decision Theory

• Inductive reasoning
• Assumes all groups in a study are the same
• Up to the researcher to provide evidence (NEVER use the word PROVE!) that there really is a difference
• To test the assumption of no difference, a cutoff point is selected before analysis.

Page 37: Section I. Statistics

Hypothesis

• Statement of the expected outcome

• Example?
• Nursing students who study in the library have higher GPAs than nursing students who study in their dorm rooms/apartments.

Page 38: Section I. Statistics

Characteristics of a Hypothesis
• Testable
• Logical
• Directly related to the research problem
• Theoretically or factually based
• States relationship between variables
• Stated so that it can be accepted or rejected

Page 39: Section I. Statistics

Research Hypothesis
• Directional
  – explains and predicts the direction and existence of a specific relationship
  – relationship will be either positive or negative
  – more specific than the non-directional hypothesis
  – cause-and-effect hypothesis
• Non-Directional

Page 40: Section I. Statistics

Null hypothesis

• Statistical statement that there is no difference between the groups under study

Page 41: Section I. Statistics

Cutoff Point

• level of significance or alpha (α)

• Point at which the results of statistical analysis are judged to indicate a statistically significant difference between groups

• For most nursing studies, level of significance is 0.05.

Page 42: Section I. Statistics

Cutoff Point (cont’d)

Absolute

NO “CLOSE ENOUGH” - If value is only a fraction above the cutoff point, groups are from the same population.

Results that reveal a significant difference of 0.001 are not considered more significant than the cutoff point.

Page 43: Section I. Statistics

Inference

A conclusion/judgment based on evidence

Judgments are made based on statistical results

Statistical inferences must be made cautiously and with great care

Page 44: Section I. Statistics

Generalization

• A generalization is the application of information that has been acquired from a specific instance to a general situation.

• Example?

Page 45: Section I. Statistics

Normal Curve
A theoretical frequency distribution of all possible values in a population.
Levels of significance and probability are based on the logic of the normal curve.

Page 46: Section I. Statistics

Normal Curve

Page 47: Section I. Statistics

One-Tailed Test (cont’d)

Page 48: Section I. Statistics

Two-Tailed Test

Page 49: Section I. Statistics

Type I and Type II Errors

Type I error occurs when the researcher rejects the null hypothesis when it is true.
The results indicate that there is a significant difference, when in reality there is not.

Type II error occurs when the researcher regards the null hypothesis as true but it is false.
The results indicate there is no significant difference, when in reality there is a difference.

Page 50: Section I. Statistics

Reasons for Errors

• Type I
  – Greater risk at the .05 level than at the .01 level
• Type II
  – Greater risk at the .01 level than at the .05 level
  – Flaws in research methods
    • Multiple variables interact
    • Precision of instruments
    • Small samples
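
The link between the level of significance and Type I error can be illustrated with a small simulation. This is a sketch under stated assumptions: both groups are drawn from the same normal population (so the null hypothesis is true by construction), the SD is treated as known, and a z-test is used for simplicity. The trial count, group size, and seed are arbitrary.

```python
import math
import random

def two_sided_p_from_z(z):
    """Two-sided p value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))

def simulate_type1_rates(trials=4000, n=20, seed=1):
    """Share of true-null experiments called 'significant' at .05 and at .01."""
    rng = random.Random(seed)
    p_values = []
    for _ in range(trials):
        # Same population for both groups: any "difference" is chance alone.
        group_a = [rng.gauss(0, 1) for _ in range(n)]
        group_b = [rng.gauss(0, 1) for _ in range(n)]
        diff = sum(group_a) / n - sum(group_b) / n
        z = diff / math.sqrt(2 / n)  # standard error with known SD of 1
        p_values.append(two_sided_p_from_z(z))
    rate_05 = sum(p < 0.05 for p in p_values) / trials
    rate_01 = sum(p < 0.01 for p in p_values) / trials
    return rate_05, rate_01

rate_05, rate_01 = simulate_type1_rates()
print(f"false positives at .05: {rate_05:.3f}, at .01: {rate_01:.3f}")
```

Every p value below .01 is also below .05, so the stricter cutoff can never produce more Type I errors than the looser one.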

Page 51: Section I. Statistics

Statistical Power (AKA Power Analysis)

• DEF: the probability of rejecting the null hypothesis when it should have been rejected OR

• Probability that a statistical test will detect a significant difference that exists

Page 52: Section I. Statistics

Power

• Maneuver to increase control over:

– Types of errors

– CORRECT DECISIONS

Page 53: Section I. Statistics

Power and Risk for Type II Error

Power analysis = 0.80 minimum

Influenced by sample size
  As sample increases so does power

Influenced by effect size – degree to which a phenomenon is present in a population
  The larger the true difference between the two groups the greater the power
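
How sample size and effect size drive power can be sketched with a normal approximation for comparing two equal-sized groups. The function name `approx_power` and the use of a standardized effect size (difference in means divided by the SD) are assumptions of this example, not something the slides specify; a real power analysis for a t-test would use the noncentral t distribution.

```python
from statistics import NormalDist

def approx_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-group comparison (normal approximation)."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha / 2)
    shift = effect_size * (n_per_group / 2) ** 0.5
    # Probability of landing beyond either critical value when the effect is real.
    return z.cdf(shift - z_crit) + z.cdf(-shift - z_crit)

for n in (20, 40, 64, 100):
    print(n, round(approx_power(0.5, n), 3))
```

With no true effect (effect size 0), the function returns alpha itself, which is the Type I error rate rather than power in any useful sense.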

Page 54: Section I. Statistics

Question #1
The level of significance usually set in nursing studies is at either:

a. .5 or .1
b. .05 or .01
c. .005 or .001

Page 55: Section I. Statistics

Question #2

Which of the following is TRUE about the level of significance?

a. ensures that findings will be correct 95% of the time if an alpha value of less than .05 was used

b. refers to a statistic calculated during computer analysis

c. represents the risk the researcher is willing to take in making a type I error and is established before data is analyzed

Page 56: Section I. Statistics

Question #3

There is a greater risk of a Type I error with a 0.05 level of significance than with a 0.01 level of significance.
A. True
B. False

Page 57: Section I. Statistics

•Statistical Significance

•Clinical Significance

•Reliability

•Validity

•Generalizability & Inference

Section IV.

Page 58: Section I. Statistics

Statistical Significance
• Known as alpha (α)

• The threshold at which statistical significance is reached.

Page 59: Section I. Statistics

Cut Off Point

• Referred to as level of significance or alpha (α)
• Point at which the results of statistical analysis are judged to indicate a statistically significant difference between groups

• For many nursing studies, level of significance is 0.05.

• Typically written as α = 0.05

Page 60: Section I. Statistics

Cutoff Point (cont’d)

• The cutoff point is absolute.

• If the value obtained is only a fraction above the cutoff point no meaning can be attributed to the differences between the groups.

Page 61: Section I. Statistics

Levels of Acceptable Significance

• 0.05
• 0.01
• 0.005
• 0.001

Page 62: Section I. Statistics

Clinical Significance

• Findings can have statistical significance but not clinical significance.

• Related to practical importance of the findings
• No common agreement in nursing about how to judge clinical significance
  – Difference sufficiently important to warrant changing the patient’s care?

Page 63: Section I. Statistics

Clinical Significance (cont’d)

• Who should judge clinical significance?
  – Patients and their families?
  – Clinician/researcher?
  – Society at large?

• Clinical significance is ultimately a value judgment.

Page 64: Section I. Statistics

Simpson & James (2005) Effects of Immediate Vs. Delayed Pushing During Second-Stage Labor….

Significant differences between groups:
• Fetal oxygen desaturation during second stage labor (immediate: M=12.5; delayed: M=4.6), p = .001
• Variable decelerations in fetal heart rate (immediate: M=22.4; delayed: M=15.6), p = .02
There were no differences in length of labor, method of birth, Apgar scores, or umbilical cord gases.

Page 65: Section I. Statistics

Question: A statistically significant finding means that:

a. Findings are clinically important and valuable.
b. Interventions should be used in clinical practice.
c. Obtained results are not likely to have been due to chance.
d. Results will be the same if the study is repeated with another sample.

Page 66: Section I. Statistics

Question: A researcher reports that the results of a study were not statistically significant. How is this to be interpreted?

a. Intervention was not strong enough to make a difference.

b. Researcher does not have enough evidence to reject Ho.

c. Researcher’s logic or conceptualization in setting up the study was faulty.

d. Topic is of no further interest to nurse researchers or clinicians.

Page 67: Section I. Statistics

Testing Reliability of Measurement

• Examine reliability of study scales before using them.

• The degree of consistency with which an instrument measures a construct.

Page 68: Section I. Statistics

Reliability Coefficient

• A quantitative index
• Usually ranges from .00 to 1.00
• Provides an estimate of how reliable an instrument is
• Should be at least 0.70
• Most common one is Cronbach’s alpha
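
For readers who want to see how Cronbach’s alpha is obtained, here is a minimal sketch of the standard formula, α = k/(k−1) × (1 − Σ item variances / variance of total scores). The three-item scale and the five respondents’ scores are hypothetical.

```python
from statistics import variance

def cronbach_alpha(items):
    """items: one list of scores per scale item, respondents in the same order."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]   # each respondent's total score
    sum_item_variances = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - sum_item_variances / variance(totals))

# Hypothetical 3-item scale answered by 5 respondents.
item_scores = [
    [3, 4, 2, 5, 4],
    [3, 5, 2, 4, 4],
    [2, 4, 3, 5, 5],
]

alpha = cronbach_alpha(item_scores)
print(f"Cronbach's alpha = {alpha:.3f}", "(meets 0.70)" if alpha >= 0.70 else "(below 0.70)")
```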

Page 69: Section I. Statistics

Hollen et al. (1994) Measurement of QOL in patients with.…Psychometric assessment of the LCSS.

LCSS has good reliability
• Internal consistency of α = 0.82
• High reproducibility/stability: test-retest reliability (n=52, r>0.75)
• High repeated inter-rater agreement/equivalence among experts (95-100% agreement)

Page 70: Section I. Statistics

Validity

1. The degree to which inferences made in a study are accurate = Internal Validity

2. The degree to which results can be generalized = External Validity

3. The degree to which an instrument measures what it is intended to measure = Validity

Page 71: Section I. Statistics

Hollen et al. (1994) Measurement of QOL in patients with.…Psychometric assessment of the LCSS.

Validity has been established for the LCSS

• Content validity ~ expert panel
• Convergent validity ~ similar QOL tool
• Construct validity ~ unrelated tools
• Criterion-related validity ~ correlation with a “gold” standard (e.g. Sickness Impact Profile)

Page 72: Section I. Statistics

Inference

•A conclusion or judgment based on evidence

•Judgments are made based on statistical results

•Statistical inferences must be made cautiously and with great care

Page 73: Section I. Statistics

Generalization

• A generalization is the application of information that has been acquired from a specific instance to a general situation.

• Generalizing requires making an inference.

• Both inference and generalization require the use of inductive reasoning.

Page 74: Section I. Statistics

Generalization (cont’d)

• An inference is made from a specific case and extended to a general truth, from a part to a whole, from the known to the unknown.

• In research, an inference is made from the study findings to a more general population.

Page 75: Section I. Statistics

Simpson & James (2005) Effects of Immediate Vs. Delayed Pushing During Second-Stage Labor….

“Results from this study suggest that delayed second-stage pushing until the urge to push and pushing with the open-glottis technique in nulliparous women with epidural anesthesia is more favorable for physiologic fetal well-being as measured by FSpO2 (p. 155).”

“The benefits of less fetal oxygen desaturation ….appear to outweigh any disadvantages of a longer second stage (p. 155).”

Page 76: Section I. Statistics

Question: Which of the following questions relates to generalization?

a. Are the findings generally significant to people in the study?

b. Can these findings be applied to other groups or settings?

c. Does the degree of control in the study allow for statistical significance?

d. How many alternative explanations can be proposed?

Page 77: Section I. Statistics

Section V. Common Statistical Tests

• Independent T-Test
• One-Way ANOVA
• Chi-Square
• Correlation
• Regression

Page 78: Section I. Statistics

Independent T-Test
• To compare means between two groups
• The continuous variable is measured once.

For example:
Research Question
Is there a difference in self-efficacy for pain management in week 10 between participants with Fibromyalgia (FM) in guided imagery group and those in standard care group?

Hypotheses
Ho: µGI - µSC = 0    α = 0.05
Ha: µGI - µSC ≠ 0

Page 79: Section I. Statistics

Independent T-Test (Cont’d)
Tests of assumptions with the sample
• Independent groups (no overlap).
• Dependent variable is continuous (interval or ratio level).
• Normal distribution.
• Homogeneity of Variance is met.

Group Statistics: Self efficacy for pain management in week 10

Group                 N    Mean      Std. Deviation   Std. Error Mean
Guided Imagery (GI)   24   64.5833   22.69249         4.63209
Standard Care (SC)    24   49.8333   20.30992         4.14574

Page 80: Section I. Statistics

Independent T-Test (Cont’d)
Ho: µGI - µSC = 0    α = 0.05    t = 2.373
Ha: µGI - µSC ≠ 0    p = 0.011 = 1.1%

Conclusion: There is a difference in self-efficacy in week 10 between participants with Fibromyalgia (FM) in guided imagery group and those in standard care group. In our sample, in week 10, participants in guided imagery group had greater self-efficacy than those in standard care group.
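
The t statistic can be recomputed from the group statistics reported on the previous slide. The sketch below assumes the pooled-variance (equal variances) form of the test, in line with the homogeneity-of-variance assumption, and obtains a two-sided p value by numerically integrating Student’s t density with Simpson’s rule; the helper names are hypothetical. If the two-sided p it prints is roughly double the 0.011 on the slide, the slide’s figure may be a one-sided value; that is an inference from the arithmetic, not something the slides state.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent-samples t statistic and df, assuming equal variances."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df

def t_two_sided_p(t, df, steps=2000):
    """Two-sided p value via Simpson's rule on the t density from 0 to |t|."""
    c = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(df * math.pi)

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    area = pdf(0) + pdf(upper)
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return max(0.0, 1 - 2 * area)

# Group statistics from the slide: Guided Imagery vs. Standard Care.
t_stat, df = pooled_t(64.5833, 22.69249, 24, 49.8333, 20.30992, 24)
p_two_sided = t_two_sided_p(t_stat, df)
print(f"t = {t_stat:.3f}, df = {df}, two-sided p = {p_two_sided:.4f}")
```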

Page 81: Section I. Statistics

One-Way Analysis of Variance (ANOVA)
• Tests for differences between means.
• More flexible than other analyses in that it can examine data from two or more groups.

For example:
Research Question
Is there a difference in depression scores depending on types of elderly housing and care (independent living, assisted living, and nursing care)?

Hypotheses
Ho: µIL = µAL = µNC    α = 0.05
Ha: At least 2 groups differ

Page 82: Section I. Statistics

ANOVA (cont’d)

Variables                      Independent Living   Assisted Living   Nursing Care    p
                               (n=16)               (n=19)            (n=17)
Depression scores, Mean (SD)   12.25 (7.594)        12.84 (7.274)     16.44 (8.043)   0.234 (> 0.05)

Tests of assumptions
– Independent groups
– Continuous dependent variable
– Normal distribution
– Homogeneity of Variance is met

Conclusion: There is no statistically significant difference in depression scores depending on types of elderly housing and care (independent living, assisted living, and nursing care).

If significant, Post Hoc tests are used to determine the location of differences.
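
The F test behind the ANOVA table can be reproduced from the group sizes, means, and SDs alone. This is a sketch with hypothetical helper names; the p value uses a closed form that holds only when there are exactly three groups (between-groups df = 2), which happens to match this example.

```python
def one_way_anova_from_summary(groups):
    """groups: list of (n, mean, sd) tuples. Returns (F, df_between, df_within)."""
    total_n = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / total_n
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

def p_value_three_groups(f_stat, df_within):
    """Upper-tail p for F with df_between == 2 (closed form valid only in that case)."""
    return (1 + 2 * f_stat / df_within) ** (-df_within / 2)

# (n, mean, SD) from the table: independent living, assisted living, nursing care.
housing = [(16, 12.25, 7.594), (19, 12.84, 7.274), (17, 16.44, 8.043)]

f_stat, df_between, df_within = one_way_anova_from_summary(housing)
p_value = p_value_three_groups(f_stat, df_within)
print(f"F({df_between}, {df_within}) = {f_stat:.3f}, p = {p_value:.3f}")
```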

Page 83: Section I. Statistics

Chi-Square Test of Independence

• Used with nominal or ordinal data
• Hypothesis:
  – Ho: There is no difference in Y depending on X
  – Ha: There is a difference in Y depending on X
• Assumptions:
  – Frequency data
  – Adequate n: expected count of at least 5 per cell; this can be violated in up to 20% of cells.

Page 84: Section I. Statistics

Research Question
Is there a difference in depression at week 12 depending on the helplessness category - low or high?

Hypotheses
• Ho: There is no difference in depression at week 12 depending on the helplessness category - low or high.
• Ha: There is a difference in depression at week 12 depending on the helplessness category - low or high.

Example of Chi-Square Test

Page 85: Section I. Statistics

Crosstabulation

                                                 AHI
Depression (cat.) at week 12                     Low      High     Total
Not Depressed   Count                            26       14       40
                Expected Count                   22.3     17.7     40.0
                % within AHI                     89.7%    60.9%    76.9%
Depressed       Count                            3        9        12
                Expected Count                   6.7      5.3      12.0
                % within AHI                     10.3%    39.1%    23.1%
Total           Count                            29       23       52
                Expected Count                   29.0     23.0     52.0
                % within AHI                     100.0%   100.0%   100.0%

χ² = 5.99, df = 1, p = 0.014 or 1.4%
AHI = Arthritis Helplessness Index

Conclusion: There is a difference in depression at week 12 depending on the helplessness category - low or high. Those people in the high helplessness group had higher level of depression compared to those in the low helplessness group.
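
As a cross-check on the crosstabulation, the expected counts and the chi-square statistic can be recomputed from the four observed counts. The sketch computes the Pearson chi-square without a continuity correction, which is an assumption about how the slide’s value was obtained; the p value line relies on an identity that is valid only for df = 1.

```python
import math

# Observed counts from the crosstabulation: rows = not depressed / depressed,
# columns = low / high helplessness (AHI).
observed = [[26, 14], [3, 9]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count for each cell = row total * column total / grand total.
expected = [[r * c / n for c in col_totals] for r in row_totals]

chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
df = (len(observed) - 1) * (len(observed[0]) - 1)

# For df = 1 only: upper-tail probability of chi-square equals erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print([[round(e, 1) for e in row] for row in expected])
print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.3f}")
```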

Page 86: Section I. Statistics

Pearson Product-Moment Correlation

• Tests for the presence of a relationship between two variables
  – Called bivariate correlation
• Types of correlation are available for all levels of data. Best results are obtained using interval data.
• Results
  – Nature of the relationship (positive or negative)
  – Magnitude of the relationship (–1 to +1)
  – Strength of r (absolute value): High = > 0.70; Moderate = 0.30-0.69; Low = < 0.30
  – Testing the significance of a correlation coefficient
  – r² is the proportion of variance shared by the two variables, expressed as a percentage.
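
A minimal sketch of how r and r² are computed, using hypothetical paired scores and a hand-written `pearson_r` helper so that no particular library version is assumed.

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired measurements.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

r = pearson_r(x, y)
print(f"r = {r:.3f} (positive), r squared = {100 * r ** 2:.0f}% shared variance")
```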

Page 87: Section I. Statistics

Scatterplots and Correlation Coefficients

[Figure: scatterplot panels labeled "Maximum positive correlation (r = 1.0)", "Maximum negative correlation (r = -1.0)", and "Strong correlation & outlier (r = 0.71)"]

Page 88: Section I. Statistics

Correlation ResultsQUESTION

Which one is significant if level of significance used in this test is 0.01?

A. r = 0.56 (p = 0.03)
B. r = –0.13 (p = 0.2)
C. r = 0.65 (p = 0.002)
D. r = 0.33 (p = 0.04)

Page 89: Section I. Statistics

Regression Analysis
• Used when one wishes to predict the value of one variable based on the value of one or more other variables
• For example:
  – one might wish to predict the possibility of passing the credentialing exam based on grade point average (GPA) from a graduate program.
  – Or to predict the length of stay in a neonatal unit based on the combined effect of multiple variables such as gestational age, birth weight, number of complications, and sucking strength.

Page 90: Section I. Statistics

Regression Analysis (cont’d)
• Assumptions:
  – Must have Independent Variable & Dependent Variable
  – Both variables must be continuous
  – Normally distributed data
  – Linear relationship (scatter plot)
• The outcome of analysis is the multiple correlation coefficient R.
• When R is squared, it indicates the proportion of variance in the dependent variable that is explained by the equation.
• The R2 is also called the coefficient of multiple determination.

Page 91: Section I. Statistics

Regression Results

• R2 = 0.63
• This result indicates that 63% of the variance in length of stay can be predicted by the combined effect of age, weight, complications, and sucking strength.
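
To show where an R2 value comes from, here is a least-squares sketch with a single predictor. The slide’s result uses four predictors; this simplified example uses only one, and the gestational ages and lengths of stay are hypothetical numbers, not data from any study.

```python
def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def r_squared(x, y, slope, intercept):
    """Proportion of variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_residual = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_total = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_residual / ss_total

# Hypothetical data: gestational age (weeks) and length of stay (days).
gestational_age = [28, 30, 32, 34, 36]
length_of_stay = [60, 45, 30, 20, 10]

slope, intercept = fit_line(gestational_age, length_of_stay)
r2 = r_squared(gestational_age, length_of_stay, slope, intercept)
print(f"stay = {intercept:.1f} + ({slope:.2f} x weeks), R2 = {r2:.3f}")
```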

Page 92: Section I. Statistics

Overlay of Scatterplot and Best-Fit Line

Page 93: Section I. Statistics

Conclusion

• Statistical tests selection depends on the research question.

• Some research questions can be answered by using basic statistical tests, while others require advanced statistical tests.