general statistics


Upload: ynel-quipanes

Post on 13-Sep-2015


DESCRIPTION

From a student- not mine

TRANSCRIPT

STEM AND LEAF PLOT

Stem-and-leaf plots are a method for showing the frequency with which certain classes of values occur. You could make a frequency distribution table or a histogram for the values, or you can use a stem-and-leaf plot and let the numbers themselves show pretty much the same information.

For instance, suppose you have the following list of values: 12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41. You could make a frequency distribution table showing how many tens, twenties, thirties, and forties you have:

Class      Frequency
10 - 19    2
20 - 29    2
30 - 39    4
40 - 49    3

You could make a histogram, which is a bar-graph showing the number of occurrences, with the classes being numbers in the tens, twenties, thirties, and forties:

(The shading of the bars in a histogram isn't necessary, but it can be helpful by making the bars easier to see, especially if you can't use color to differentiate the bars.)

The downside of frequency distribution tables and histograms is that, while the frequency of each class is easy to see, the original data points have been lost. You can tell, for instance, that there must have been three listed values that were in the forties, but there is no way to tell from the table or from the histogram what those values might have been.
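To make the point about lost data concrete, here is a minimal Python sketch that builds the frequency table and a text version of the histogram for the list above. The function names `frequency_table` and `text_histogram` are made up for this illustration, and the `#` bars are only a stand-in for a drawn histogram.

```python
from collections import Counter

values = [12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41]

# Map each value to the lower bound of its class of ten (12 -> 10, 27 -> 20).
class_counts = Counter((v // 10) * 10 for v in values)

def frequency_table(counts):
    """Return one text row per class, such as '10 - 19  2'."""
    return [f"{low} - {low + 9}  {counts[low]}" for low in sorted(counts)]

def text_histogram(counts):
    """Return one bar of '#' marks per class."""
    return [f"{low} - {low + 9}  {'#' * counts[low]}" for low in sorted(counts)]

for row in frequency_table(class_counts):
    print(row)
for row in text_histogram(class_counts):
    print(row)
```

Once the values have been reduced to `class_counts`, only the counts remain: nothing in either output says whether the forties were 40, 40, 41 or some other three values.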

On the other hand, you could make a stem-and-leaf plot for the same data:

The "stem" is the left-hand column which contains the tens digits. The "leaves" are the lists in the right-hand column, showing all the ones digits for each of the tens, twenties, thirties, and forties. As you can see, the original values can still be determined; you can tell, from that bottom leaf, that the three values in the forties were 40, 40, and 41.

Note that the horizontal leaves in the stem-and-leaf plot correspond to the vertical bars in the histogram, and the leaves have lengths that equal the numbers in the frequency table.
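The same construction can be sketched in a few lines of Python. This is only an illustration under the assumption that every value is a whole number; the names `stem_and_leaf` and `render` are hypothetical helpers, not part of any library.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group the ones digits (leaves) under their tens digits (stems)."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)      # 37 -> stem 3, leaf 7
        rows[stem].append(leaf)
    return dict(rows)

def render(rows):
    """Return one 'stem | leaves' text line per stem."""
    return [
        f"{stem} | {' '.join(str(leaf) for leaf in leaves)}"
        for stem, leaves in sorted(rows.items())
    ]

values = [12, 13, 21, 27, 33, 34, 35, 37, 40, 40, 41]
rows = stem_and_leaf(values)
for line in render(rows):
    print(line)
# 1 | 2 3
# 2 | 1 7
# 3 | 3 4 5 7
# 4 | 0 0 1

# The original values can be rebuilt from the plot.
recovered = [stem * 10 + leaf for stem, leaves in sorted(rows.items()) for leaf in leaves]
```

The number of leaves in each row (2, 2, 4, 3) matches the frequency table, and `recovered` reproduces the original list.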

That's pretty much all there is to a stem-and-leaf plot. You're just listing out how many entries you have in certain classes of numbers, and what those entries are. Here are some more examples of stem-and-leaf plots, containing a few additional details.

Complete a stem-and-leaf plot for the following list of grades on a recent test: 73, 42, 67, 78, 99, 84, 91, 82, 86, 94

I'll use the tens digits as the stem values and the ones digits as the leaves. For convenience's sake, I'll order the list, but this is not required: 42, 67, 73, 78, 82, 84, 86, 91, 94, 99
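This list has no grades in the fifties. One common convention, which I am assuming here rather than reading off the original figure, is to keep a stem even when it has no leaves, so that the gap stays visible. A small Python sketch of that choice (the name `stem_and_leaf_rows` is made up for the example):

```python
def stem_and_leaf_rows(values):
    """Return (stem, leaves) pairs for every stem from the smallest to the largest."""
    ordered = sorted(values)
    low, high = ordered[0] // 10, ordered[-1] // 10
    rows = {stem: [] for stem in range(low, high + 1)}   # empty stems are kept
    for v in ordered:
        rows[v // 10].append(v % 10)
    return list(rows.items())

grades = [73, 42, 67, 78, 99, 84, 91, 82, 86, 94]
for stem, leaves in stem_and_leaf_rows(grades):
    print(f"{stem} | {' '.join(map(str, leaves))}".rstrip())
# 4 | 2
# 5 |
# 6 | 7
# 7 | 3 8
# 8 | 2 4 6
# 9 | 1 4 9
```

Sorting happens inside the function, so the grades can be passed in their original order.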

Subjects in a psychological study were timed while completing a certain task. Complete a stem-and-leaf plot for the following list of times: 7.6, 8.1, 9.2, 6.8, 5.9, 6.2, 6.1, 5.8, 7.3, 8.1, 8.8, 7.4, 7.7, 8.2

First, I'll reorder this list: 5.8, 5.9, 6.1, 6.2, 6.8, 7.3, 7.4, 7.6, 7.7, 8.1, 8.1, 8.2, 8.8, 9.2

These values have one decimal place, but the stem-and-leaf plot makes no accommodation for this. The stem-and-leaf plot only looks at the last digit (for the leaves) and all the digits before (for the stem). So I'll have to put a "key" or legend on this plot to show what I mean by the numbers in this plot. The ones digits will be the stem values, and the tenths will be the leaves.
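Here is a hedged Python sketch of the same plot, key included. The helper name `decimal_stem_and_leaf` and the exact wording of the key line are my own choices for the illustration; the code assumes every value has exactly one decimal place.

```python
def decimal_stem_and_leaf(values):
    """Use the ones digit as the stem and the tenths digit as the leaf."""
    rows = {}
    for v in sorted(values):
        tenths = round(v * 10)           # 7.6 -> 76, guarding against float error
        stem, leaf = divmod(tenths, 10)
        rows.setdefault(stem, []).append(leaf)
    return rows

times = [7.6, 8.1, 9.2, 6.8, 5.9, 6.2, 6.1, 5.8, 7.3, 8.1, 8.8, 7.4, 7.7, 8.2]
rows = decimal_stem_and_leaf(times)

lines = [f"{stem} | {' '.join(map(str, leaves))}" for stem, leaves in rows.items()]
lines.append("Key: 5 | 8 means 5.8")     # without the key, 5 | 8 could be read as 58
print("\n".join(lines))
# 5 | 8 9
# 6 | 1 2 8
# 7 | 3 4 6 7
# 8 | 1 1 2 8
# 9 | 2
# Key: 5 | 8 means 5.8
```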

Properly, every stem-and-leaf plot should have a key.

Complete a stem-and-leaf plot for the following two lists of class sizes:
Economics 101: 9, 13, 14, 15, 16, 16, 17, 19, 20, 21, 21, 22, 25, 25, 26
Libertarianism: 14, 16, 17, 18, 18, 20, 20, 24, 29

This example has two lists of values. Since the values are similar, I can plot them all on one stem-and-leaf plot by drawing leaves on either side of the stem. I will use the tens digits as the stem values, and the ones digits as the leaves. Since "9" (in the Econ 101 list) has no tens digit, the stem value will be "0".

Copyright Elizabeth Stapel 2004-2011 All Rights Reserved
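A two-sided plot like the one for the class sizes can be sketched in Python as follows. Which list goes on which side, and writing the left-hand leaves so that they grow outward from the stem, are assumptions on my part; `leaves_by_stem` and `back_to_back` are hypothetical helper names.

```python
def leaves_by_stem(values):
    """Map each tens digit to the sorted list of its ones digits."""
    rows = {}
    for v in sorted(values):
        rows.setdefault(v // 10, []).append(v % 10)
    return rows

def back_to_back(left_values, right_values):
    """Return text lines with one shared stem column and leaves on both sides."""
    left, right = leaves_by_stem(left_values), leaves_by_stem(right_values)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)
    width = max(len(leaves) for leaves in left.values()) * 2 - 1
    lines = []
    for stem in stems:
        # Left-hand leaves are reversed so the smallest sits next to the stem.
        left_text = " ".join(str(d) for d in reversed(left.get(stem, [])))
        right_text = " ".join(str(d) for d in right.get(stem, []))
        lines.append(f"{left_text:>{width}} | {stem} | {right_text}".rstrip())
    return lines

econ = [9, 13, 14, 15, 16, 16, 17, 19, 20, 21, 21, 22, 25, 25, 26]
libertarianism = [14, 16, 17, 18, 18, 20, 20, 24, 29]
lines = back_to_back(econ, libertarianism)
print("\n".join(lines))
#             9 | 0 |
# 9 7 6 6 5 4 3 | 1 | 4 6 7 8 8
# 6 5 5 2 1 1 0 | 2 | 0 0 4 9
```

The "0" stem appears only because of the single-digit class size 9 in the Econ 101 list.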

Complete a stem-and-leaf plot for the following list of values: 100, 110, 120, 130, 130, 150, 160, 170, 170, 190, 210, 230, 240, 260, 270, 270, 280, 290, 290

Since all the ones digits are zeroes, I'll do this plot with the hundreds digits being the stem values and the tens digits being the leaves. I can do the plot like this:

...but the leaves are fairly long this way, because the values are so close together. To spread the values out a bit, I can break each leaf into two. For instance, the leaf for the two-hundreds class can be split into two classes, being the numbers between 200 and 240 and the numbers between 250 and 290. I can also reverse the order, so the smaller values are at the bottom of the "stem". The new plot looks like this:

For very compact data points, you can even split the leaves into five classes, like this:
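Splitting each stem into two or five rows can be sketched in Python like this. The function name `split_stem_rows` and its parameters are invented for the illustration; `leaf_unit=10` reflects the choice above of tens digits as leaves, and the rows come out with the smallest values first.

```python
def split_stem_rows(values, leaf_unit, lines_per_stem):
    """Return (stem, leaves) rows, with each stem split into equal leaf ranges.

    lines_per_stem should divide 10 evenly: 1, 2, or 5.
    """
    span = 10 // lines_per_stem                  # leaf digits covered by each row
    scaled = sorted(v // leaf_unit for v in values)
    first, last = scaled[0] // span, scaled[-1] // span
    rows = {index: [] for index in range(first, last + 1)}
    for s in scaled:
        rows[s // span].append(s % 10)
    return [(index * span // 10, leaves) for index, leaves in rows.items()]

values = [100, 110, 120, 130, 130, 150, 160, 170, 170, 190,
          210, 230, 240, 260, 270, 270, 280, 290, 290]

two_rows = split_stem_rows(values, leaf_unit=10, lines_per_stem=2)
five_rows = split_stem_rows(values, leaf_unit=10, lines_per_stem=5)
for stem, leaves in two_rows:
    print(f"{stem} | {' '.join(map(str, leaves))}")
# 1 | 0 1 2 3 3
# 1 | 5 6 7 7 9
# 2 | 1 3 4
# 2 | 6 7 7 8 9 9
```

Iterating over `reversed(two_rows)` instead would put the smaller values at the bottom of the stem.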

Complete a stem-and-leaf plot for the following list of values: 23.25, 24.13, 24.76, 24.81, 24.98, 25.31, 25.57, 25.89, 26.28, 26.34, 27.09

If I try to use the last digit, the hundredths digit, for these numbers, the stem-and-leaf plot will be enormously long, because these values are so spread out. (With the numbers' first three digits ranging from 232 to 270, I'd have thirty-nine leaves, most of which would be empty.) So instead of working with the given numbers, I'll round each of the numbers to the nearest tenth, and then use those new values for my plot. Rounding gives me the following list: 23.3, 24.1, 24.8, 24.8, 25.0, 25.3, 25.6, 25.9, 26.3, 26.3, 27.1

Then my plot looks like this:
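The rounding step deserves some care if you do it in code. The sketch below uses Python's `decimal` module with `ROUND_HALF_UP`, because the list above rounds 23.25 up to 23.3, whereas Python's built-in `round` resolves exact ties toward the even digit. The helper name `round_half_up` is made up, and the values are entered as strings so that no binary floating-point error creeps in.

```python
from decimal import Decimal, ROUND_HALF_UP

def round_half_up(text, places="0.1"):
    """Round a decimal string to the given place, with ties going up."""
    return Decimal(text).quantize(Decimal(places), rounding=ROUND_HALF_UP)

raw = ["23.25", "24.13", "24.76", "24.81", "24.98", "25.31",
       "25.57", "25.89", "26.28", "26.34", "27.09"]
rounded = [round_half_up(x) for x in raw]

# Stem: everything before the tenths digit. Leaf: the tenths digit.
rows = {}
for value in rounded:
    stem, leaf = divmod(int(value * 10), 10)     # 23.3 -> 233 -> stem 23, leaf 3
    rows.setdefault(stem, []).append(leaf)

for stem, leaves in rows.items():
    print(f"{stem} | {' '.join(map(str, leaves))}")
print("Key: 23 | 3 means 23.3")
# 23 | 3
# 24 | 1 8 8
# 25 | 0 3 6 9
# 26 | 3 3
# 27 | 1
# Key: 23 | 3 means 23.3
```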

Naturally, when you're drawing a stem-and-leaf plot, you should use a ruler to construct a neat table, and you should label everything clearly.

Example

A random sample of 64 people was selected to take the Stanford-Binet Intelligence Test. After each person completed the test, they were assigned an intelligence quotient (IQ) based on their performance on the test. The resulting 64 IQs are as follows:

Once the data are obtained, it might be nice to summarize the data. We could, of course, summarize the data using a histogram. One primary disadvantage of using a histogram to summarize data is that the original data aren't preserved in the graph. A stem-and-leaf plot, on the other hand, summarizes the data and preserves the data at the same time.

The basic idea behind a stem-and-leaf plot is to divide each data point into a stem and a leaf. We could divide our first data point, 111, for example, into a stem of 11 and a leaf of 1. We could divide 85 into a stem of 8 and a leaf of 5. We could divide 83 into a stem of 8 and a leaf of 3. And so on. To create the plot then, we first create a column of numbers containing the ordered stems. Our IQ data set produces stems 6, 7, 8, 9, 10, 11, 12, 13, and 14. Once the column of stems is written down, we work our way through each number in the data set, and write its leaf in the row headed by its stem.

Here's what our stem-and-leaf plot would look like after adding the first five numbers 111, 85, 83, 98, and 107:
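The procedure just described (write down the stems, then add each leaf as you meet it) might be sketched in Python as below. Only the five IQs quoted above are used, and `add_leaves` is a name invented for this illustration.

```python
def add_leaves(stems, numbers):
    """Write each number's leaf into its stem's row, in the order given."""
    plot = {stem: [] for stem in stems}
    for number in numbers:
        stem, leaf = divmod(number, 10)      # 111 -> stem 11, leaf 1
        plot[stem].append(leaf)
    return plot

stems = range(6, 15)                         # stems 6 through 14
plot = add_leaves(stems, [111, 85, 83, 98, 107])
for stem, leaves in plot.items():
    print(f"{stem:>2} | {' '.join(map(str, leaves))}".rstrip())
#  6 |
#  7 |
#  8 | 5 3
#  9 | 8
# 10 | 7
# 11 | 1
# 12 |
# 13 |
# 14 |
```

The leaves in the 8 row appear as 5 3 because they are written in the order the data arrive. Sorting each row afterwards would give an ordered plot.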

and here's what the completed stem-and-leaf plot would look like after adding all 64 leaves to the nine stems:

Now, rather than looking at a list of 64 unordered IQs, we have a nice picture of the data that quite readily tells us that the distribution of IQs is bell-shaped; that most of the IQs are in the 90s and 100s; and that the smallest IQ in the data set is 68, while the largest is 141.

That's all well and good, but we could do better. First and foremost, no one in their right mind is going to want to create too many of these stem-and-leaf plots by hand. Instead, you'd probably want to let some statistical software, such as Minitab or SAS, do the work for you. Here's what Minitab's stem-and-leaf plot of the 64 IQs looks like:

Hmmm.... how does the plot differ from ours? First, Minitab tells us that there are n = 64 numbers and that the leaf unit is 1.0. Then, ignoring the first column of numbers for now, the second column contains the stems from 6 to 14. Note, though, that Minitab uses two rows for each of the stems 7, 8, 9, 10, 11, 12, and 13. Minitab takes an alternative here that we could have taken as well. When you opt to use two rows for each stem, the first row is reserved for the leaves 0, 1, 2, 3, and 4, while the second row is reserved for the leaves 5, 6, 7, 8, and 9. For example, note that the first 9 row contains the 0 to 4 leaves, while the second 9 row contains the 5 to 9 leaves. The decision to use one or two rows for the stems depends on the data. Sometimes the one row per stem option produces the better plot, and sometimes the two rows per stem option produces the better plot.

Do you notice any other differences between Minitab's plot and our plot? Note that the leaves in Minitab's plot are ordered. That's right... Minitab orders the data before producing the plot, thereby creating what is called an ordered stem-and-leaf plot.

Now, back to that first column of numbers appearing in Minitab's plot. That column contains what are called depths. The depths are the frequencies accumulated from the top of the plot and the bottom of the plot until they converge in the middle. For example, the first number in the depths column is a 1. It comes from the fact that there is just one number in the first (6) stem. The second number in the depths column is also a 1. It comes from the fact that there is 1 leaf in the first (6) stem and 0 leaves in the second (the first 7) stem, and so 1 + 0 = 1. The third number in the depths column is a 3. It comes from the fact that there is 1 leaf in the first (6) stem, 0 leaves in the second (the first 7) stem, and 2 leaves in the third (the second 7) stem, and so 1 + 0 + 2 = 3. Minitab continues accumulating numbers down the column until it reaches 32 in the last 9 stem. Then, Minitab starts accumulating from the bottom of the plot. The 5 in the depths column comes, for example, from the fact that there is 1 leaf in the last (14) stem, 1 leaf in the second 13 stem, 0 leaves in the first 13 stem, 1 leaf in the second 12 stem, and 2 leaves in the first 12 stem, and so 1 + 1 + 0 + 1 + 2 = 5.

Let's take a look at another example.

Example

Let's consider a random sample of 20 concentrations of calcium carbonate (CaCO3) in milligrams per liter.

Create a stem-and-leaf plot of the data.

Solution. Let's take the efficient route, as most anyone would likely take in practice, by letting Minitab generate the plot for us:

Minitab tells us that the leaf unit is 0.1, so that the stem of 127 and leaf of 8 represents the number 127.8. The depths column contains something a little different here, namely the 7 with parentheses around it. It seems that Minitab's algorithm for calculating the depths differs a bit here. It still accumulates the values from the top and the bottom, but it stops in each direction when it reaches the row containing the middle value (median) of the sample. The frequency of that row containing the median is simply placed in parentheses. That is, the median of the 20 numbers is 131.45. Therefore, because the 131 stem contains 7 leaves, the depths column for that row contains a 7 in parentheses.

In our previous example, the median of the 64 IQs is 99.5. Because 99.5 falls between two rows of the display, namely between the last 9 row and the first 10 row, Minitab calculates the depths instead as described in that example, and omits the whole "parentheses around the frequency of the median row" thing.
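The depths rule described in these two examples can be sketched in Python. This is my own reading of the rule as stated above, not Minitab's actual implementation, and `depths` is a hypothetical function name. It takes the number of leaves in each row, from the top of the plot to the bottom.

```python
def depths(row_counts):
    """Return a depth label for each row, given the leaf count of each row."""
    n = sum(row_counts)
    labels, before = [], 0
    for count in row_counts:
        top = before + count             # leaves from the first row through this one
        bottom = n - before              # leaves from this row through the last one
        if before < n / 2 < top:
            labels.append(f"({count})")  # the median lies inside this row
        else:
            labels.append(str(min(top, bottom)))
        before = top
    return labels

# Leaf counts for the 14 task times plotted earlier (stems 5 through 9).
# The median, 7.5, lies in the 7 row, so that row gets parentheses.
print(depths([2, 3, 4, 4, 1]))
# ['2', '5', '(4)', '5', '1']

# Hypothetical counts, invented for illustration, whose first three rows
# (1, 0, 2) match the IQ example. The median falls between two rows here,
# so no row is parenthesized.
print(depths([1, 0, 2, 2, 2, 1, 0, 1, 1]))
# ['1', '1', '3', '5', '5', '3', '2', '2', '1']
```

In the second call the two middle rows both show 5, which is half of the 10 values: the count accumulated from the top meets the count accumulated from the bottom.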