Chapter 2 Graphical Displays of Univariate Data

Upload: tate-kaufman

Post on 30-Dec-2015


DESCRIPTIVE STATISTICS

Objectives
Introduction of some basic statistical terms.
Introduction of some graphical displays.

Introduction
What is statistics? Statistics is the science of collecting, organizing, summarizing, analyzing, and making inferences from data.
The subject of statistics is divided into two broad areas: descriptive statistics and inferential statistics.

Breakdown of the subject of statistics
Statistics
  Descriptive Statistics includes: collecting, organizing, summarizing, and presenting data.
  Inferential Statistics includes: making inferences, hypothesis testing, determining relationships, and making predictions.

Introduction (continued)
Explanation of the term data: Data are the values or measurements that variables describing an event can assume.
Variables whose values are determined by chance are called random variables.
Types of variables: there are two types, qualitative and quantitative.

Introduction (continued)
What are qualitative variables? These are variables that are nonnumeric in nature.
What are quantitative variables? These are variables that can assume numeric values.
Quantitative variables can be classified into two groups: discrete variables and continuous variables.

Breakdown of the types of variables
Variables
  Qualitative
  Quantitative includes: discrete and continuous variables.

Introduction (continued)
What are quantitative data? These are data values that are numeric.
Examples: (1) The heights of female basketball players, (2) The 100-meter times of male sprinters.
What are qualitative data? These are data values that can be placed into distinct categories according to some characteristic or attribute.
Examples: (1) The eye color of female basketball players, (2) The preferred soft drinks of Americans.

Introduction (continued)
What are discrete variables? These are variables that can assume values that can be counted.
Examples: (1) The number of days it rained in your neighborhood for the month of March, (2) The number of girls in a family of four children.
What are continuous variables? These are variables that can assume all values between any two values.
Examples: (1) The time it takes to complete a quiz, (2) The top speeds of various sports cars.

Introduction (continued)
In order for statisticians to do any analysis, data must be collected or sampled. We can sample the entire population or just a portion of the population.
What is a population? A population consists of all elements that are being studied.
What is a sample? A sample is a subset of the population.

Introduction (continued)
Example: If we are interested in studying the distribution of ACT math scores of freshmen at a college, then the population of ACT math scores will be the ACT math scores of all freshmen at that particular college.
Example: If we selected every tenth ACT math score of freshmen at a college, then this selected set will represent a sample of ACT math scores for the freshmen at that particular college.

Introduction (continued)
Population: all freshmen ACT math scores
Sample: every 10th ACT math score

Introduction (continued)
What is a census?
A census is a sample that consists of the entire population.
Example: Every 10 years the U.S. government gathers information from the entire population. Since the entire population is sampled, this is referred to as a census.

Introduction (continued)
Both populations and samples have characteristics that are associated with them. These are called parameters and statistics, respectively.
A parameter is a characteristic of or a fact about a population.
Examples: (1) The average age for the entire student population on a campus is an example of a parameter, (2) The average number of hours worked per week for the entire faculty population on a campus is an example of a parameter.

Introduction (continued)
A statistic is a characteristic of or a fact about a sample.
Examples: (1) The average ACT math score for a sample of students on a campus is an example of a statistic, (2) The average commute for a sample of faculty members on a campus is an example of a statistic.

Introduction (continued)
Population: described by parameters
Sample: described by statistics

1-2 Frequency Distributions
What is a frequency distribution? A frequency distribution is an organization of raw data in tabular form, using classes (or intervals) and frequencies.
What is a frequency count? The frequency or the frequency count for a data value is the number of times the value occurs in the data set.

Categorical or Qualitative Frequency Distributions
NOTE: We will consider categorical, ungrouped, and grouped frequency distributions.
What is a categorical frequency distribution? A categorical frequency distribution represents data that can be placed in specific categories, such as gender, hair color, political affiliation, etc.

Categorical or Qualitative Frequency Distributions: Example
The blood types of 25 blood donors are given below. Summarize the data using a frequency distribution.

ABBAOBOBOAOBOBBBAOABABOABABOA

Categorical Frequency Distribution for the Blood Types: Example Continued
Note: The classes for the distribution are the blood types.

Class (Blood Type)    Frequency, f
A                     5
B                     8
O                     8
AB                    4
Total                 n = 25

Categorical or Qualitative Frequency Distributions: Example
The favorite areas of study of 30 liberal arts students are given below. Summarize the data using a frequency distribution.

Art, History, Premed, Journalism, Math, Music, English, History, Government, Stats, History, Premed, Prevet, English, Stats, Math, Math, Stats, Stats, English, Stats, Stats, Math, Journalism, English, English, English, History, Government, English

Categorical Frequency Distribution for the Favorite Areas of Study: Example Continued
Note: The classes for the distribution are the areas of study.

Class (Area of Study)    Frequency, f
English                  6
History                  4
Math                     4
Premed                   2
Stats                    6
Other                    8
Total                    30

Quantitative Frequency Distributions: Ungrouped
What is an ungrouped frequency distribution? An ungrouped frequency distribution simply lists the data values together with the frequency counts with which each value occurs.

Quantitative Frequency Distributions: Ungrouped - Example
The at-rest pulse rates of 16 athletes at a meet were 57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, and 58. Summarize the information with an ungrouped frequency distribution.
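The tally asked for in this example can also be done programmatically. The following is a minimal sketch, assuming Python and its standard `collections.Counter`; the variable names are illustrative and are not part of the original slides.

```python
from collections import Counter

# At-rest pulse rates of the 16 athletes from the example.
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]

# Counter tallies how many times each distinct value occurs.
frequency = Counter(pulse_rates)

# The (ungrouped) classes are the observed values, listed in ascending order.
for value in sorted(frequency):
    print(f"{value:>5} {frequency[value]:>3}")
print(f"Total n = {sum(frequency.values())}")
```

Each printed row pairs an observed pulse rate with its frequency count, which is the layout of an ungrouped frequency distribution.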

Quantitative Frequency Distributions: Ungrouped - Example Continued
Note: The (ungrouped) classes are the observed values themselves.

Class (Pulse Rate)    Frequency, f
53                    1
54                    3
55                    2
56                    2
57                    4
58                    2
60                    1
64                    1
Total                 n = 16

Quantitative Frequency Distributions: Ungrouped - Example
The oldest documented human beings are currently 113, 112, 112, 113, 113, 116, 115, 113, 115, and 113 years old. Summarize the information with an ungrouped frequency distribution.

Quantitative Frequency Distributions: Ungrouped - Example Continued
Note: The (ungrouped) classes are the observed values themselves.

Class (Current Age)    Frequency, f
116                    1
115                    2
113                    5
112                    2
Total                  10

Relative Frequency
NOTE: Sometimes frequency distributions are displayed with relative frequencies as well.

What is a relative frequency for a class? The relative frequency of any class is obtained by dividing the frequency (f) for the class by the total number of observations (n).

Relative Frequency
Example: In the pulse rate data, the relative frequency for the ungrouped class of 57 will be 4/16 = 0.25.
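The f/n computation can be sketched in a few lines of code. This is an illustrative Python example using the pulse rate data from the earlier ungrouped example; the names `frequency` and `relative_frequency` are assumptions made for this sketch.

```python
from collections import Counter

# Pulse rate data from the ungrouped example (n = 16).
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]
n = len(pulse_rates)

frequency = Counter(pulse_rates)

# Relative frequency of a class = f / n.
relative_frequency = {
    value: count / n for value, count in sorted(frequency.items())
}

for value, rel in relative_frequency.items():
    print(f"{value:>5} {frequency[value]:>3} {rel:.4f}")
```

For the class 57 this gives 4/16 = 0.25, in agreement with the example.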

Relative Frequency Distribution
Note: The relative frequency for a class is obtained by computing f/n.

Class (Pulse Rate)    Frequency, f    Relative Frequency
53                    1               0.0625
54                    3               0.1875
55                    2               0.1250
56                    2               0.1250
57                    4               0.2500
58                    2               0.1250
60                    1               0.0625
64                    1               0.0625
Total                 n = 16          1.0000

Cumulative Frequency and Cumulative Relative Frequency
NOTE: Sometimes frequency distributions are displayed with cumulative frequencies and cumulative relative frequencies as well.

Cumulative Frequency and Cumulative Relative Frequency
What is a cumulative frequency for a class? The cumulative frequency for a specific class in a frequency table is the sum of the frequencies for all values at or below the given class.

Cumulative Frequency and Cumulative Relative Frequency
What is a cumulative relative frequency for a class? The cumulative relative frequency for a specific class in a frequency table is the sum of the relative frequencies for all values at or below the given class.

Cumulative Frequency and Cumulative Relative Frequency
Note: Table with relative and cumulative relative frequencies.

Class (Pulse Rate)    Frequency, f    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
53                    1               0.0625                1                       0.0625
54                    3               0.1875                4                       0.2500
55                    2               0.1250                6                       0.3750
56                    2               0.1250                8                       0.5000
57                    4               0.2500                12                      0.7500
58                    2               0.1250                14                      0.8750
60                    1               0.0625                15                      0.9375
64                    1               0.0625                16                      1.0000
Total                 n = 16          1.0000

Quantitative Frequency Distributions: Grouped
What is a grouped frequency distribution? A grouped frequency distribution is obtained by constructing classes (or intervals) for the data, and then listing the corresponding number of values (frequency counts) in each interval.

Quantitative Frequency Distributions: Grouped
There are several procedures that one can use to construct a grouped frequency distribution.
However, because of the many statistical software packages (MINITAB, SPSS, etc.) and graphing calculators (TI-83, etc.) available today, it is not necessary to construct such distributions using pencil and paper.

Quantitative Frequency Distributions: Grouped
Later, we will encounter a graphical display called the histogram. We will see that one can directly construct grouped frequency distributions from these displays.
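Returning to the cumulative columns of the pulse rate table, the running sums can be sketched in code. This is a minimal Python illustration, assuming the standard `itertools.accumulate`; the variable names are hypothetical and chosen only for this sketch.

```python
from collections import Counter
from itertools import accumulate

# Pulse rate data from the ungrouped example (n = 16).
pulse_rates = [57, 57, 56, 57, 58, 56, 54, 64, 53, 54, 54, 55, 57, 55, 60, 58]
n = len(pulse_rates)

frequency = Counter(pulse_rates)
classes = sorted(frequency)                 # classes in ascending order
counts = [frequency[c] for c in classes]    # frequency of each class

# Cumulative frequency: running sum of frequencies at or below each class.
cumulative = list(accumulate(counts))

# Cumulative relative frequency: cumulative frequency divided by n.
cumulative_relative = [c / n for c in cumulative]

for cls, f, cf, crf in zip(classes, counts, cumulative, cumulative_relative):
    print(f"{cls:>5} {f:>3} {f / n:.4f} {cf:>3} {crf:.4f}")
```

The last cumulative frequency equals n and the last cumulative relative frequency equals 1, which is a quick check that no class was skipped.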
Quantitative Frequency Distributions: Grouped - Quick Tip
A frequency distribution should have a minimum of 5 classes and a maximum of 20 classes.
For small data sets, one can use between 5 and 10 classes.
For large data sets, one can use up to 20 classes.

Quantitative Frequency Distributions: Grouped - Example
The weights of 30 female students majoring in Physical Education on a college campus are as follows: 143, 113, 107, 151, 90, 139, 136, 126, 122, 127, 123, 137, 132, 121, 112, 132, 133, 121, 126, 104, 140, 138, 99, 134, 119, 112, 133, 104, 129, and 123. Summarize the data with a frequency distribution using seven classes.

Quantitative Frequency Distributions: Grouped - Example Continued
NOTE: We will introduce the histogram here to help us construct a grouped frequency distribution.

Quantitative Frequency Distributions: Grouped - Example Continued
What is a histogram? A histogram is a graphical display of a frequency or a relative frequency distribution that uses classes and vertical (or horizontal) bars (rectangles) of various heights to represent the frequencies.

Quantitative Frequency Distributions: Grouped - Example Continued
The MINITAB statistical software was used to generate the histogram.
The histogram has seven classes. Classes for the weights are along the x-axis and frequencies are along the y-axis.
The number at the top of each rectangular box represents the frequency for the class.

Quantitative Frequency Distributions: Grouped - Example Continued
Figure: Histogram with 7 classes for the weights.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations
From the histogram, the classes (intervals) are 85-95, 95-105, 105-115, etc., with corresponding frequencies of 1, 3, 4, etc.
We will use this information to construct the grouped frequency distribution.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
Observe that the upper class limit of 95 for the class 85-95 is listed as the lower class limit for the class 95-105. Since the value of 95 cannot be included in both classes, we will use the convention that the upper class limit is not included in the class.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
That is, the class 85-95 should be interpreted as having the values from 85 up to, but not including, 95.
Using these observations, the grouped frequency distribution is constructed from the histogram and is given on the next slide.
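The class convention above (lower limit included, upper limit excluded) can be sketched in code. This is a minimal Python illustration, not the MINITAB procedure used in the slides; the starting limit of 85 and the class width of 10 are read off the histogram classes described above, and the variable names are assumptions.

```python
# Weights of the 30 students from the grouped example.
weights = [143, 113, 107, 151, 90, 139, 136, 126, 122, 127,
           123, 137, 132, 121, 112, 132, 133, 121, 126, 104,
           140, 138, 99, 134, 119, 112, 133, 104, 129, 123]

lower = 85        # lower limit of the first class (from the histogram)
width = 10        # class width (from the histogram)
num_classes = 7

counts = [0] * num_classes
for w in weights:
    # Floor division places a value equal to a shared limit in the higher
    # class, so 95 would fall in 95-105 and not in 85-95.
    index = (w - lower) // width
    counts[index] += 1

for i, f in enumerate(counts):
    lo = lower + i * width
    print(f"{lo} - {lo + width}: {f}")
```

Because each class is treated as the half-open interval from its lower limit up to, but not including, its upper limit, every weight is counted in exactly one class.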

Quantitative Frequency Distributions: Grouped - Example Continued

Class (Weights)    Frequency, f    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
85 - 95            1               0.0333                1                       0.0333
95 - 105           3               0.1000                4                       0.1333
105 - 115          4               0.1333                8                       0.2666
115 - 125          6               0.2000                14                      0.4666
125 - 135          9               0.3000                23                      0.7666
135 - 145          6               0.2000                29                      0.9666
145 - 155          1               0.0333                30                      0.9999
Total              n = 30          1.0000

Note: The final cumulative relative frequency should be equal to 1. It is 0.9999 because of rounding.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
In the previous slide with the grouped frequency distribution, the sum of the relative frequencies did not add up to 1. This is due to rounding to four decimal places.
The same observation should be noted for the cumulative relative frequency column.

Quantitative Frequency Distributions: Grouped - Example
The heights of the world's tallest 28 buildings are given below (in feet). Summarize the data with a frequency distribution using 11 classes.

1483, 1483, 1451, 1362, 1283, 1260, 1250, 1227, 1205, 1165, 1140, 1136, 1135, 1127, 1093, 1087, 1083, 1058, 1053, 1046, 1046, 1039, 1018, 1017, 1014, 1007, 1002, 997

Quantitative Frequency Distributions: Grouped - Example Continued
The MINITAB statistical software was used to generate the histogram.
The histogram has 11 classes. Classes for the heights are along the x-axis and frequencies are along the y-axis.
The number at the top of each rectangular box represents the frequency for the class.

Quantitative Frequency Distributions: Grouped - Example Continued
Figure: Histogram with 11 classes for the building heights.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations
From the histogram, the classes (intervals) are 950-1000, 1000-1050, 1050-1100, etc., with corresponding frequencies of 1, 8, 5, etc.
We will use this information to construct the grouped frequency distribution.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
Observe that the upper class limit of 1000 for the class 950-1000 is listed as the lower class limit for the class 1000-1050. Since the value of 1000 cannot be included in both classes, we will use the convention that the upper class limit is not included in the class.

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
That is, the class 950-1000 should be interpreted as having the values from 950 up to, but not including, 1000.
Using these observations, the grouped frequency distribution is constructed from the histogram and is given on the next slide.

Quantitative Frequency Distributions: Grouped - Example Continued

Class (Bldg Height (ft))    Frequency, f    Relative Frequency    Cumulative Frequency    Cumulative Relative Frequency
950 - 1000                  1               0.0345                1                       0.0345
1000 - 1050                 8               0.2759                9                       0.3103
1050 - 1100                 5               0.1724                14                      0.4828
1100 - 1150                 4               0.1379                18                      0.6207
1150 - 1200                 1               0.0345                19                      0.6552
1200 - 1250                 2               0.0690                21                      0.7241
1250 - 1300                 3               0.1034                24                      0.8276
1300 - 1350                 0               0.0000                24                      0.8276
1350 - 1400                 2               0.0690                26                      0.8966
1400 - 1450                 0               0.0000                26                      0.8966
1450 - 1500                 3               0.1034                29                      1.0000
Total                       29              1.0000

Quantitative Frequency Distributions: Grouped - Example Continued
Observations (continued)
In the previous slide with the grouped frequency distribution, the sum of the relative frequencies did add up to 1. This is a coincidence, as rounding to four decimal places may or may not yield a sum of exactly 1.

Dot Plots
What is a dot plot? A dot plot is a plot that displays a dot for each value in a data set along a number line. If there are multiple occurrences of a specific value, then the dots will be stacked vertically.
Example: The following frequency distribution shows the number of defectives observed by a quality control officer over a 30-day period. Construct a dot plot for the data.

Dot Plots - Example Continued

Number of Defects    Frequency, f
1                    1
2                    3
3                    4
4                    6
5                    9
6                    6
7                    1
Total                n = 30

The next slide shows the dot plot for the number of defectives.

Dot Plots - Example Continued
Figure: Dot plot for the number of defectives.
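Since a dot plot simply stacks one dot per observation above each value, the defectives table can be turned into a rough text version of the plot. The following Python sketch is only an illustration of the idea; it uses `*` as the dot symbol and hypothetical variable names, and it is not the plot shown on the original slide.

```python
# Frequency distribution of the number of defectives (value -> frequency).
defects = {1: 1, 2: 3, 3: 4, 4: 6, 5: 9, 6: 6, 7: 1}

values = sorted(defects)
tallest = max(defects.values())

rows = []
# Build the plot from the top row down: a value gets a dot on a given
# level if its frequency reaches that level.
for level in range(tallest, 0, -1):
    row = " ".join("*" if defects[v] >= level else " " for v in values)
    rows.append(row.rstrip())

# The number line goes underneath the stacked dots.
rows.append(" ".join(str(v) for v in values))

print("\n".join(rows))
```

The height of each stack equals the frequency of that value, so the 30 dots in total correspond to the 30 days of observation.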

Dot Plots
Example: The following frequency distribution shows the number of times an amateur golfer achieved par over the course of her last 20 18-hole rounds. Construct a dot plot for the data.

Dot Plots - Example Continued

Number of pars achieved during last 20 rounds of golf    Frequency, f
5                                                        1
8                                                        3
9                                                        7
11                                                       6
13                                                       2
14                                                       1
Total                                                    20

The next slide shows the dot plot for the number of pars achieved.

Dot Plots - Example Continued
Figure: Dot plot for the number of pars achieved.

Bar Charts or Bar Graphs
What is a bar chart (graph)? A bar chart or a bar graph is a graph that uses vertical or horizontal bars to represent the frequencies of the categories in a data set.
Example: A sample of 300 college students was asked to indicate their favorite soft drink. The results of the survey are shown on the next slide. Display the information using a bar chart.

Bar Charts - Example Continued

Soft Drink    Number of Students, f
Pepsi         92
Coke          78
Dr. Pepper    48
7-Up          42
Others        40
Total         n = 300

The next slide shows the bar chart for the soft drink preferences of the students.

Bar Charts - Example Continued
Figure: Bar chart for the soft drink preferences of the students.
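As a rough stand-in for the bar chart, the soft drink frequencies can be drawn as horizontal text bars. This Python sketch is an illustration only; the scale of one `#` per two students is an arbitrary assumption made to keep the bars short, and the variable names are hypothetical.

```python
# Soft drink preferences of the 300 students (category -> frequency).
soft_drinks = {"Pepsi": 92, "Coke": 78, "Dr. Pepper": 48, "7-Up": 42, "Others": 40}

scale = 2  # assumption: one '#' represents two students
label_width = max(len(name) for name in soft_drinks)

lines = []
for name, f in soft_drinks.items():
    # Bar length is proportional to the category's frequency.
    bar = "#" * (f // scale)
    lines.append(f"{name:<{label_width}} | {bar} {f}")

print("\n".join(lines))
```

Each category gets its own separated bar, and the bar lengths make the differences in magnitude between the categories easy to compare.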

Bar Charts - Quick Tip
Bar charts are effective at reinforcing differences in magnitude.
Bar charts are useful when the data set has categories (for example, hair color, gender, etc.).
Bar charts are useful when the data are qualitative in nature.
Note: The bars are equally separated.

Histograms Revisited
Figure: Histogram with 7 classes for the weights.

Histograms - Quick Tip
Histograms are useful when the data values are quantitative.
A histogram gives an estimate of the shape of the distribution of the population from which the sample was taken.
If the relative frequencies were plotted along the vertical axis to produce the histogram, the shape will be the same as when the frequencies are used.

Frequency Polygons
What is a frequency polygon? A frequency polygon is a graph that displays the data using lines to connect points plotted for the frequencies.
Note: The frequencies represent the heights of the vertical bars in the histogram.
Example: Display a frequency polygon for the weights of the 30 female students (presented previously).

Frequency Polygons - Example Continued

Figure: Frequency polygon.

Frequency Polygons - Observations
The frequency polygon is superimposed on the histogram.
The polygon is mound-shaped.
This suggests that the distribution of the population from which the sample was taken is mound-shaped.
The line segments pass through the midpoints at the top of the rectangles.
The polygon is tied down at both ends.
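The points that such a polygon connects can be sketched in code. This Python illustration uses the classes and frequencies of the weights example; placing the tie-down points one class width beyond the first and last midpoints, at frequency 0, is a common convention assumed here and is not stated in the slides.

```python
# Lower class limits and frequencies from the grouped weights example.
lower_limits = [85, 95, 105, 115, 125, 135, 145]
frequencies = [1, 3, 4, 6, 9, 6, 1]
width = 10  # class width

# Each vertex sits above the midpoint of its class, at the class frequency.
midpoints = [lo + width / 2 for lo in lower_limits]

# Assumption: tie the polygon down at frequency 0, one class width beyond
# the first and last midpoints.
points = (
    [(midpoints[0] - width, 0)]
    + list(zip(midpoints, frequencies))
    + [(midpoints[-1] + width, 0)]
)

for x, y in points:
    print(f"({x:.0f}, {y})")
```

Connecting these points in order with line segments gives the frequency polygon, with its peak above the 125-135 class.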

Stem-and-Leaf Displays or Plots
What is a stem-and-leaf plot? A stem-and-leaf plot is a data plot that uses part of a data value as the stem to form groups or classes and part of the data value as the leaf.
Note: A stem-and-leaf plot has an advantage over a grouped frequency distribution, since a stem-and-leaf plot retains the actual data by showing them in graphic form.

Stem-and-Leaf Displays or Plots - Example
Example: Consider the following values: 108, 96, 98, 107, 110, 104, 105, and 112. Construct a stem-and-leaf plot by using the units digits as the leaves.

Stem-and-Leaf Plot - Example Continued
Stems and leaves for the data values.

Data    Stem    Leaf
96      09      6
98      09      8
104     10      4
105     10      5
107     10      7
108     10      8
110     11      0
112     11      2

Stem-and-leaf plot for the data values.

Stem | Leaf
09   | 6 8
10   | 4 5 7 8
11   | 0 2

Stem-and-Leaf Displays or Plots - Example
Example: A sample of the number of admissions to a psychiatric ward at a local hospital during the full phases of the moon is as follows: 22, 30, 21, 27, 31, 36, 20, 28, 25, 33, 21, 38, 32, 35, 26, 19, 43, 30, 30, 34, 27, and 41. Display the data in a stem-and-leaf plot with the leaves represented by the units digits.
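Splitting each value into a stem and a units-digit leaf is mechanical, so it can be sketched in code. The following Python example is a minimal illustration for the admissions data, assuming the tens digit as the stem; the variable names are hypothetical.

```python
from collections import defaultdict

# Number of admissions during the full phases of the moon.
admissions = [22, 30, 21, 27, 31, 36, 20, 28, 25, 33, 21,
              38, 32, 35, 26, 19, 43, 30, 30, 34, 27, 41]

stems = defaultdict(list)
# Sorting first means the leaves on each stem come out in ascending order.
for value in sorted(admissions):
    stem, leaf = divmod(value, 10)   # tens digit is the stem, units digit the leaf
    stems[stem].append(leaf)

lines = [
    f"{stem} | {' '.join(map(str, leaves))}"
    for stem, leaves in sorted(stems.items())
]

print("\n".join(lines))
```

Every original value can be read back from the plot by joining a stem with one of its leaves, which is the advantage noted earlier over a grouped frequency distribution.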

Stem-and-Leaf Plot - Example Continued

Stem | Leaf
1    | 9
2    | 0 1 1 2 5 6 7 7 8
3    | 0 0 0 1 2 3 4 5 6 8
4    | 1 3

Time Series Graphs
What is a time series graph? A time series graph is a plot which displays data that are observed over a given period of time.
Note: From a time series graph, one can observe and analyze the behavior of the data over time.

Time Series Graphs - Example
Example: The following table gives the number of hurricanes for the years 1981 to 2005.

Display the data with a time series graph.

Year          81 82 83 84 85 86 87 88 89 90 91 92 93
Hurricanes    7235743578141

Year          94 95 96 97 98 99 00 01 02 03 04 05 06
Hurricanes    1331108894791

Time Series Graphs - Example Continued

Time Series Graph for the Dow Jones Industrial Average from October 21, 2009 to October 23, 2009 - Example

Source: http://www.google.com/finance?client=ob&q=INDEXDJX:DJI

Pie Graphs or Pie Charts
What is a pie graph (chart)? A pie graph is a circular display that is divided into sectors (classes) according to the percentage of data values in each class.
Note: A pie chart allows us to observe the proportions of the classes relative to the entire data set.
Pie charts are readily used to display qualitative data.

Pie Graphs or Pie Charts - Example
Example: Present a pie chart for the soft drink data given earlier.

Soft Drink    Number of Students, f
Pepsi         92
Coke          78
Dr. Pepper    48
7-Up          42
Others        40
Total         n = 300

Pie Graphs or Pie Charts - Example
The pie chart is presented on the next slide.
Note: Each sector (slice) is proportional to the frequency count or percentage relative to the total sample size.

Pie Graphs or Pie Charts - Example Continued
Figure: Pie chart for the soft drink data.
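The size of each sector follows directly from f/n. The following Python sketch is an illustration for the soft drink data; expressing each sector as a percentage and as a central angle out of 360 degrees is a standard way to lay out a pie chart, and the variable names are assumptions.

```python
# Soft drink preferences of the 300 students (category -> frequency).
soft_drinks = {"Pepsi": 92, "Coke": 78, "Dr. Pepper": 48, "7-Up": 42, "Others": 40}
n = sum(soft_drinks.values())

# For each class: (percentage of the sample, central angle in degrees).
sectors = {
    name: (f / n * 100, f / n * 360) for name, f in soft_drinks.items()
}

for name, (percent, degrees) in sectors.items():
    print(f"{name:<10} {percent:5.1f}% {degrees:6.1f} degrees")
```

The percentages add up to 100 and the angles to 360, so the sectors together fill the whole circle.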