Chapter 11 Lecture Research Techniques: For the Health Sciences Fifth Edition Analyzing and Interpreting Data: Descriptive Analysis R. Eric Heidel, PhD University of Tennessee Graduate School of Medicine © 2014 Pearson Education, Inc.


Chapter 11 Lecture

Research Techniques: For the Health Sciences

Fifth Edition

Analyzing and Interpreting Data: Descriptive Analysis

R. Eric Heidel, PhD

University of Tennessee Graduate School of Medicine


The Meaning of Statistics

• Statistics is a language that can be employed to express concepts and relationships that cannot be communicated in any other way.

• A seasoned health scientist views it as a language to organize, analyze, and interpret numerical data.


The Meaning of Statistics (cont'd)

• Descriptive Statistical Analysis

– It can function to describe data: to explain how the data look, what the center point of the data is, how spread out the data may be, and how one aspect of the data may be related to one or more other aspects.

– No conclusions can be extended beyond this immediate group, and any similarity to those outside the group cannot be assumed.


The Meaning of Statistics (cont'd)

• A different function of statistics is to infer.

• Inferential Statistical Analysis

– It involves observation of a sample taken from a given population.

– Conclusions about the population are inferred from the information obtained about the sample.

– Unlike descriptive data analysis, generalizations can be made from the sample to the respective population.

– It can be used for estimation and prediction.

• Extrapolation is a component of inferential statistics but not of descriptive statistics.


Statistical Analysis and Data

• Levels of Measurement

– There are several ways to classify variables; one of the principal techniques is the preciseness of measurement of the variable.

– Four levels of measurement:

• Nominal measurement: variables are simply placed into different categories (e.g., gender).

• Ordinal measurement: both groups and ranks the data through ordering of categories (e.g., dosage levels, degree of education, severity of illness, and social class). Rankings may be relative rather than absolute.


Statistical Analysis and Data (cont'd)

• Interval measurement: categorizes, orders, and provides a meaningful measure of the differences in ordering. Variables can be separated by how much they differ from one another (e.g., Celsius temperatures). Height, weight, and blood pressure also satisfy these properties but, because they have a true zero point, are more precisely ratio measurements.

• Ratio measurement: is used when interval data have a true zero point (e.g., height, measure of temperature in Kelvin).


Statistical Analysis and Data (cont'd)

• Parametric and Nonparametric Data

– The two types of data that are recognized in the application of statistics are parametric data and nonparametric data.

• Parametric Data

– They are either interval or ratio data.

– Parametric statistical tests assume that the data are normally or near normally distributed.

– They are frequently considered the more powerful of the two.


Statistical Analysis and Data (cont'd)

• Nonparametric Data

– They are data that are either counted (nominal scale) or ranked (ordinal scale).

– Nonparametric statistical tests, often referred to as distribution-free tests, do not require the more restrictive assumption of a normally distributed population.

– Generally, they have wider applications and are less difficult to compute.


Statistical Analysis and Data (cont'd)

• Population and Sample

– Population

• It can be defined as the set of elements we are planning to study.

• In the health sciences it usually refers to a group of people.

• It can be something other than a group of people, such as all daily maximum temperatures or all automobiles produced in a given time frame.

– Sample

• It is a subset of the population.


Statistical Analysis and Data (cont'd)

• Parameters and Statistics

– Parameter

• It is a characteristic of a population.

– Statistic

• It is a characteristic of a sample.


Descriptive Data Analysis Techniques

• There are several statistical techniques available:

• Measures of central tendency

– Mean

– Median

– Mode

– Geometric mean

• Measures of spread or variation

– Range

– Standard deviation

– Variance

– Coefficient of variation


Descriptive Data Analysis Techniques (cont'd)

• Standard measures

• Measures of relationship

– Spearman rank order correlation

– Pearson product-moment correlation


Measures of Central Tendency

• Mean

– The mean of a set of data is commonly referred to as the arithmetic mean or average.

– It is computed by summing all the observations in the group and dividing by the number of observations.

– It may be considered the fulcrum or balance point of a distribution.

– It is one of the most useful statistical measures because it

• provides much information

• is affected by all the scores

• serves as a basis for the computation of other important measures, such as variability
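
The computation described above can be sketched in a few lines of Python. The blood pressure readings are invented for illustration and the function name is hypothetical; the second step checks the "balance point" idea, since deviations from the mean sum to zero.

```python
def arithmetic_mean(observations):
    """Sum all observations and divide by the number of observations."""
    if not observations:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(observations) / len(observations)


# Hypothetical systolic blood pressure readings (mmHg), invented for illustration.
readings = [118, 122, 130, 126, 124]
mean_bp = arithmetic_mean(readings)

# Balance point: deviations above and below the mean cancel out.
deviations = [x - mean_bp for x in readings]
total_deviation = sum(deviations)

print(mean_bp, total_deviation)
```

Because every score enters the sum, changing any single reading changes the mean.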


Measures of Central Tendency (cont'd)

• Median

– It is a measure of position: the point above and below which one-half of the scores fall. It is the middle-most position.

– If there is an even number of scores, the median is the midpoint between the two middle scores.

– Unlike the mean, the median is not influenced by extreme scores.

– In some instances, it may be a more realistic measure of central tendency than the mean.

– The median is usually reserved for when a quick measure of central tendency is needed or when distributions are markedly skewed.
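
A minimal sketch of the median, including the even-count case and its resistance to an extreme score. The length-of-stay values are invented for illustration and the function name is hypothetical.

```python
def median(scores):
    """Middle score of the sorted data; midpoint of the two middle scores if the count is even."""
    if not scores:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(scores)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical hospital lengths of stay in days, invented for illustration.
stays = [2, 3, 3, 4, 5, 6]
stays_with_outlier = [2, 3, 3, 4, 5, 60]  # one extreme stay replaces the longest one

median_plain = median(stays)
median_outlier = median(stays_with_outlier)
mean_plain = sum(stays) / len(stays)
mean_outlier = sum(stays_with_outlier) / len(stays_with_outlier)

print(median_plain, median_outlier)  # the median does not move
print(mean_plain, mean_outlier)      # the mean is pulled toward the extreme score
```

In a markedly skewed set like the second one, the median stays with the bulk of the scores while the mean does not.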


Measures of Central Tendency (cont'd)

• Mode

– It is the most frequently occurring score.

– It is the quickest estimate of central value and shows the most typical case.

• Geometric Mean

– It is often used in laboratory data. This is especially true with data in the form of concentration of substances.

– It is calculated as the antilogarithm of the mean of the log x values.
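
A short sketch of both measures. The geometric mean is computed here as the antilogarithm of the mean of the logged values, which is the usual definition and requires every value to be positive. The function names and the concentration values are hypothetical.

```python
import math
from collections import Counter


def modes(scores):
    """Return every score that occurs most often (a data set can have more than one mode)."""
    if not scores:
        raise ValueError("the mode of an empty data set is undefined")
    counts = Counter(scores)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


def geometric_mean(values):
    """Antilogarithm of the mean of the natural logs; all values must be positive."""
    if not values or any(v <= 0 for v in values):
        raise ValueError("the geometric mean needs a non-empty set of positive values")
    mean_log = sum(math.log(v) for v in values) / len(values)
    return math.exp(mean_log)


# Hypothetical concentrations spanning several orders of magnitude.
concentrations = [10, 100, 1000]
gm = geometric_mean(concentrations)
am = sum(concentrations) / len(concentrations)

print(modes([1, 2, 2, 3, 3]), gm, am)
```

For data spread across orders of magnitude, the geometric mean (about 100 here) sits closer to the typical value than the arithmetic mean (370).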


Measures of Spread or Variation

• Measures of central tendency are useful, but oftentimes more information is needed for an accurate description of the sample or population.

• This is particularly true in a comparison of two groups whose means are identical.

– In such situations, it is important to know whether the scores or observations for each group tend to be quite similar (homogeneous) or spread apart (heterogeneous).

• Measures of variation may be employed to show the degree of spread or variation among scores.


Measures of Spread or Variation (cont'd)

• Range

– It is the simplest measure of variation.

– It is the difference between the highest and lowest scores.

– It may be used justifiably as a hasty measure of variability; but, since it takes into account only the extremes and not the bulk of observations, it is not generally employed.
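
As an illustration, the sketch below compares two invented groups with identical means and very different spread, then shows how a single extreme score dominates the range. All values and names are hypothetical.

```python
def value_range(scores):
    """Difference between the highest and lowest scores."""
    return max(scores) - min(scores)


# Hypothetical test scores for two groups, invented for illustration.
group_a = [48, 49, 50, 51, 52]  # homogeneous
group_b = [10, 30, 50, 70, 90]  # heterogeneous

mean_a = sum(group_a) / len(group_a)
mean_b = sum(group_b) / len(group_b)
range_a = value_range(group_a)
range_b = value_range(group_b)

# Only the extremes matter: one added high score changes the range of group A
# even though the bulk of its observations is unchanged.
range_a_with_outlier = value_range(group_a + [95])

print(mean_a, mean_b, range_a, range_b, range_a_with_outlier)
```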


Measures of Spread or Variation (cont'd)

• Standard Deviation and Variance

– These are the most useful measures of variation.

– Deviation: The distance of the measurements away from the mean.

– Variance is obtained from squared deviations from the mean, thereby making the variance a different unit of measurement than the mean.

– The standard deviation is computed by obtaining the square root of the variance. This takes away the squaring of deviations, thereby making the standard deviation the same unit of measurement as the mean.
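
A minimal sketch of these two computations. The slides do not say whether the sum of squared deviations is divided by n or by n − 1; the code assumes the sample form (n − 1) by default and offers the population form (n) as an option. Function names and data are hypothetical.

```python
import math


def variance(scores, sample=True):
    """Average squared deviation from the mean (divisor n - 1 for a sample, n for a population)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    mean = sum(scores) / n
    squared_deviations = [(x - mean) ** 2 for x in scores]
    divisor = n - 1 if sample else n  # assumption: sample form unless stated otherwise
    return sum(squared_deviations) / divisor


def standard_deviation(scores, sample=True):
    """Square root of the variance, which returns to the original unit of measurement."""
    return math.sqrt(variance(scores, sample))


# Hypothetical scores, invented for illustration (mean = 5).
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_variance = variance(data, sample=False)
population_sd = standard_deviation(data, sample=False)
sample_variance = variance(data)

print(population_variance, population_sd, sample_variance)
```

If the scores were measured in days, the variance would be in days squared and the standard deviation in days.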


Measures of Spread or Variation (cont'd)

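• Coefficient of Variation

– It is used to compare the variability of different samples, each having different arithmetic means and perhaps differing units of measurement.

– It is computed by dividing the standard deviation by the mean, and is usually expressed as a percentage.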
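
A short sketch, assuming the usual definition of the coefficient of variation as the standard deviation divided by the mean, times 100. The weights and heights are invented, and the sample standard deviation from Python's `statistics` module is used.

```python
import statistics


def coefficient_of_variation(values):
    """Standard deviation as a percentage of the mean (unit-free)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined when the mean is zero")
    return 100 * statistics.stdev(values) / mean


# Hypothetical measurements in different units, invented for illustration.
weights_kg = [60, 70, 80]
heights_cm = [160, 170, 180]

cv_weight = coefficient_of_variation(weights_kg)
cv_height = coefficient_of_variation(heights_cm)

print(round(cv_weight, 1), round(cv_height, 1))
```

Both samples have the same standard deviation (10), yet relative to their means the weights vary more than the heights, which is the comparison the coefficient is meant to support.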


The Normal Distribution Curve

• It is also known as the bell curve or Gaussian distribution.

• Three distinct properties:

– The curve is bell shaped, extending infinitely in both directions.

– It follows the Empirical Rule or the 68-95-99.7 rule.

– Total area under the curve is equal to 1.0.
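
The 68-95-99.7 figures can be checked numerically. For a normal distribution, the proportion of the area within k standard deviations of the mean equals erf(k / √2); the sketch below evaluates it with Python's standard `math.erf`. The function name is hypothetical.

```python
import math


def proportion_within(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


within_1 = proportion_within(1)
within_2 = proportion_within(2)
within_3 = proportion_within(3)

for k, p in ((1, within_1), (2, within_2), (3, within_3)):
    print(f"within {k} SD: {p:.4f}")
```

The three values round to 68%, 95%, and 99.7%, and the proportion approaches 1.0 as k grows, consistent with a total area of 1.0.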


Characteristics of the Normal Curve


Standard Measures

• The standard score, also called the z-score, is calculated by subtracting the mean from the observed score and dividing the result by the standard deviation.

• Since z-scores can be expressed in negatives and decimals, oftentimes they are converted so as not to be cumbersome.

• To transform the score, multiply the z-score by the standard deviation of the desired scale and then add the mean of that scale (for example, a T-score is 10 times the z-score plus 50).
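
A minimal sketch of both steps. The slides do not name a target scale for the conversion; the code assumes the common T-score convention (mean 50, standard deviation 10) as its default. Function names and the example scores are hypothetical.

```python
def z_score(observed, mean, sd):
    """Subtract the mean from the observed score and divide by the standard deviation."""
    if sd <= 0:
        raise ValueError("the standard deviation must be positive")
    return (observed - mean) / sd


def rescale(z, new_mean=50, new_sd=10):
    """Convert a z-score to another scale (default: T-score, an assumed convention)."""
    return new_mean + new_sd * z


# Hypothetical test with mean 100 and standard deviation 15.
z_high = z_score(130, mean=100, sd=15)
z_low = z_score(85, mean=100, sd=15)

t_high = rescale(z_high)
t_low = rescale(z_low)

print(z_high, t_high)
print(z_low, t_low)
```

The rescaled scores avoid negatives and decimals while keeping each score's position relative to the mean.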


Measures of Relationship

• Linear correlation: This method is used most frequently to describe the relationship between two or more variables or between two or more sets of data.

• The degree of relationship is expressed by the coefficient of correlation, r.

• Unique characteristics of the correlation coefficient:

– It is a pure number.

– It is nondimensional.

– It may take on values between −1.00 and +1.00.

• A correlation coefficient of zero indicates no linear relationship between the variables in question.


Measures of Relationship (cont'd)

• The closer the r is to 1.00 (positive or negative), the stronger the relationship.

• A perfect positive correlation of +1.00 specifies that for every unit increase in one variable, there is a proportional unit increase in the other variable.

• A perfect negative correlation of −1.00 means that for every unit increase in one variable, there is a proportional unit decrease in the other variable.

• The scatterplot is a means of displaying the relationship between variables and is developed by graphically plotting each pair of values against the x and y axes.

– The line drawn through or near the coordinate points is referred to as the line of best fit or regression line.
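
The slides do not say how the line of best fit is obtained. One common choice, assumed here, is ordinary least squares, which picks the slope and intercept that minimize the squared vertical distances from the points to the line. The paired values and the function name are hypothetical.

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the ordinary least-squares regression of y on x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sum of squares of deviations from the means.
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    if s_xx == 0:
        raise ValueError("x values must not all be identical")
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical paired observations, invented for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

slope, intercept = least_squares_line(x_values, y_values)
print(slope, intercept)
```

The fitted line always passes through the point formed by the two means.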


Measures of Relationship (cont'd)

• Spearman Rank Order Correlation

– It is used to determine the relationship between two ranked variables (rather than interval or ratio variables).

– It is designed for nonparametric data.

– It is a less frequently used measure, but it is valuable for comparing the judgments of two judges on a group of objects or items.

– It is an acceptable method for parametric data when there are fewer than 30 but greater than 9 paired variables.
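
A small sketch of the two-judges comparison, using the standard rank-difference formula ρ = 1 − 6Σd² / (n(n² − 1)). This form assumes the data are already ranks and that there are no tied ranks. The rankings and the function name are invented for illustration, with only five items to keep the arithmetic easy to follow.

```python
def spearman_rho(ranks_a, ranks_b):
    """Spearman rank order correlation from two sets of ranks (assumes no tied ranks)."""
    if len(ranks_a) != len(ranks_b) or len(ranks_a) < 2:
        raise ValueError("need at least two paired ranks")
    n = len(ranks_a)
    # d is the difference between the two ranks given to the same item.
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical rankings of five items by two judges, invented for illustration.
judge_a = [1, 2, 3, 4, 5]
judge_b = [2, 1, 4, 3, 5]

rho = spearman_rho(judge_a, judge_b)
rho_identical = spearman_rho(judge_a, judge_a)
rho_reversed = spearman_rho(judge_a, judge_a[::-1])

print(rho, rho_identical, rho_reversed)
```

Identical rankings give +1, fully reversed rankings give −1, and the two judges here agree fairly closely.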


Measures of Relationship (cont'd)

• Pearson Product-Moment Correlation

– It is the most often used and most precise coefficient of correlation.

– It is used with parametric data.

– Its raw score equation is convenient for both calculator and computer use.
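
The slide mentions the raw score equation without printing it. The sketch below assumes the standard raw-score form, r = (nΣxy − ΣxΣy) / √[(nΣx² − (Σx)²)(nΣy² − (Σy)²)], which needs only running sums of the scores and their products. The paired values and the function name are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation computed from raw-score sums."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable has no variation")
    return numerator / denominator


# Hypothetical paired observations, invented for illustration.
x_values = [1, 2, 3, 4, 5]
y_values = [2, 4, 5, 4, 5]

r = pearson_r(x_values, y_values)
r_perfect_positive = pearson_r([1, 2, 3], [2, 4, 6])
r_perfect_negative = pearson_r([1, 2, 3], [6, 4, 2])

print(round(r, 3), r_perfect_positive, r_perfect_negative)
```

Because r is a pure number, rescaling either variable (for example, from kilograms to pounds) leaves it unchanged.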


Personal Computers and Information Delivery Systems

• Examples of computer programs that can be employed by the health science researcher

– Statistical Analysis System (SAS)

– Statistical Package for the Social Sciences (SPSS for Windows)

– Minitab

– Statistica

– S-Plus

– Stata

– OpenEpi

– R (open source)


Personal Computers and Information Delivery Systems (cont'd)

• Selection of a statistical analysis package can be an arduous process.

• There are eight considerations for selection and possible purchasing.