McGraw-Hill Ryerson Data Management 12 Section 5.1 7.1 Continuous Random Variables

Upload: silas-alexander

Post on 18-Jan-2016


McGraw-Hill Ryerson

Data Management 12

Section 5.1 7.1 Continuous Random Variables


Success Criteria

I will know I am successful when I can
• identify and distinguish between continuous and discrete variables
• represent a uniform probability distribution using a rectangular area model or a table
• create a frequency histogram and a frequency polygon to represent a sample of values of a continuous random variable (using technology or with pencil and paper)

What are some other success criteria?

I am learning to
• distinguish between discrete variables and continuous variables
• work with sample values for situations that can take on continuous values
• represent a probability distribution using a mathematical model
• represent a sample of values of a continuous random variable using a frequency table, a frequency histogram, and a frequency polygon

7.1 Continuous Random Variables

A beekeeper collects data such as the number of bees in a hive or the amount of honey produced by the bees in a hive.

Suggest some possible values for the number of bees in a hive and for the amount of honey produced in a hive. You may wish to use the Internet to help you provide reasonable estimates.

Example: There could be 50 000 bees in a hive. One hive might make about 50 kg of honey in a year.

What is different about the types of numbers used for each?

Example: The number of bees in a hive is discrete data. The mass of honey produced is continuous data.


Investigate Comparing Discrete and Continuous Random Variables

1. Consider attributes of students in your class, such as number of siblings or height.
a) List several attributes that are counted using discrete values.
b) List several attributes that are measured using continuous values.

2. Some students were asked for the number of siblings in their families. The table shows the results.
a) Classify the number of siblings as a discrete or a continuous variable. Explain your reasoning.
b) Represent the data using a histogram.


3. Students recorded the time, to the nearest minute, spent on math homework one evening. The table shows the results.
a) Classify time as a discrete or a continuous variable. Explain your reasoning.

b) Why is the time shown in intervals?

c) Draw a scatter plot of these data. For the time value, use the midpoint of each interval. Sketch a smooth curve through the points on the scatter plot.


d) Reflect Does the shape of the curve make sense? Explain.

e) Extend Your Understanding Consider the choice of intervals in the table. Why must you be careful not to have too few or too many intervals?

Example 1 Determine a Probability Using a Uniform Distribution

At a local supermarket, a new checkout lane is opened whenever the wait time is more than 6 min. As a result, the time required for a customer to wait at the checkout lanes varies from 0 to 6 min, with all times in between being equally likely.
a) What kind of a distribution is this? How do you know?
b) Sketch a graph that illustrates this distribution.
c) What is the probability that a customer will wait between 3 min and 6 min to check out?
d) How many values are possible for the time required to be served at the checkout? Explain your answer.
e) Is it possible to determine the probability that a customer will need to wait exactly 3 min at the checkout lane, using the area under the graph? Explain your answer.

Example 2 Frequency Table, Frequency Histogram, Frequency Polygon

The heights of all students in a mathematics of data management class are measured to the nearest centimetre, and recorded in the table.

a) Can you use the data in the table to determine whether the data seem to follow a uniform distribution? Can you make a reasonable estimate of the mean height in this class?

b) Use a table like the one below to determine the frequency for each interval.

Hint: If a data value falls on the boundary between two intervals, it is usually placed in the lower interval. For example, you would record a data value of 160 cm in the 150 cm–160 cm interval.

c) Using the completed frequency table, can you now answer part a) more easily?
d) In what ways can a frequency table help you to analyse the raw data from a sample like this one?
e) Use the frequency table to draw a frequency histogram. Then add a frequency polygon to the histogram.
f) How is the shape of the frequency polygon related to the shape of the probability density distribution for height? Can you use the area under the frequency polygon to calculate probabilities for any range of values?

Reflect

R1. Fred notices that all of his height measurements are whole centimetres, and because of this, he says that height is a discrete variable. Do you agree or disagree? Explain.

Disagree. Even if all the values used are whole numbers, in theory, one could measure height to any degree of precision that a measuring device allows. The variable is continuous even if Fred’s data use only discrete values.

R2. Why is it useful to construct a tally column before you complete the frequency column in a frequency table? Why would you not just create the frequency column directly from the raw data?

The tally column is helpful while you are still counting, as the final tally keeps changing. If you record directly in a frequency column, you might have to keep erasing and changing your numbers as you count the raw data.

1. True or false?

Any variable that could have decimal or fractional numbers must be a continuous variable.

Answer: False

2. True or false?

Even though the mean time to check out at a local supermarket is 3 min, the probability of taking exactly 3 min to check out is zero.

Answer: True

3. Select the best answer.

Which variable is discrete?

A barometric pressure

B number of raindrops that fall on your hat

C temperature in degrees Celsius

D temperature in degrees Fahrenheit

Answer: B

4. Select the best answer.

Paul needs to add a data value of exactly $20 to the frequency table. Which interval should he add it to?

A $10–$20

B $20–$30

C Either A or B

D Neither A nor B

Answer: A

Section 5.1

The following pages contain solutions for the previous questions.

Solutions

Investigate Comparing Discrete and Continuous Random Variables

1. Consider attributes of students in your class, such as number of siblings or height.
a) List several attributes that are counted using discrete values.

Examples: number of siblings, number of coins in pocket, number of books in bag

b) List several attributes that are measured using continuous values.

Examples: weight, height, longest jumping distance, reaction time

2. Some students were asked for the number of siblings in their families. The table shows the results.
a) Classify the number of siblings as a discrete or a continuous variable. Explain your reasoning.

Number of siblings is a discrete variable; it can take on only whole number values.

b) Represent the data using a histogram.

3. Students recorded the time, to the nearest minute, spent on math homework one evening. The table shows the results.
a) Classify time as a discrete or a continuous variable. Explain your reasoning.

Time is a continuous variable. It can take on any real value.

b) Why is the time shown in intervals?

Since time is a continuous variable, there are an infinite number of possible values. It must therefore be grouped into intervals.

c) Draw a scatter plot of these data. For the time value, use the midpoint of each interval. Sketch a smooth curve through the points on the scatter plot.

d) Reflect Does the shape of the curve make sense? Explain.

Example: Yes. The shape of the curve makes sense because most people in the class will spend an average amount of time (35 min) on homework, and the number of students who spend more or less time will gradually decrease as you go farther from that average time.

e) Extend Your Understanding Consider the choice of intervals in the table. Why must you be careful not to have too few or too many intervals?

Example: If there are too few intervals, the shape of the distribution will not be apparent because most of the entries will be in only a small number of intervals. If there are too many intervals, there will be few or no entries in each interval, and this will also obscure the shape of the distribution.
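The effect of the interval width can be tried out on a small data set. The sketch below is only an illustration: the homework times are made-up values (the table from the slides is not reproduced here), the helper name `bin_counts` is hypothetical, and the times are assumed to be whole, non-negative minutes. A value on a boundary is placed in the lower interval.

```python
from collections import Counter

def bin_counts(values, width):
    """Count whole-minute values per interval, keyed by each interval's lower bound.

    A value on a boundary goes in the lower interval, so 30 falls in 20 to 30.
    """
    counts = Counter(max(0, (v - 1) // width) * width for v in values)
    return dict(sorted(counts.items()))

# Hypothetical homework times in minutes (not the data from the slides).
times = [12, 18, 25, 28, 31, 33, 35, 36, 38, 41, 44, 52, 58]

for width in (30, 10, 1):
    print(f"interval width {width} min:", bin_counts(times, width))
```

With a width of 30 min almost everything lands in two intervals, and with a width of 1 min every interval that appears holds a single value. Only the middle choice shows the frequencies rising toward the centre and falling away again.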

Example 1 Determine a Probability Using a Uniform Distribution

a) What kind of a distribution is this? How do you know?

Since all outcomes are equally likely, this is a uniform distribution.

b) Sketch a graph that illustrates this distribution.

Since all values from 0 min to 6 min are equally probable, the graph is a horizontal line from 0 min to 6 min. The area under the graph represents the total of all of the probabilities, so the area must equal 1. The base of the rectangle has a length of 6 min, so its height must be 1/6.

c) What is the probability that a customer will wait between 3 min and 6 min to check out?

The probability that a customer will wait between 3 min and 6 min is equal to the shaded area under the graph from 3 min to 6 min. That rectangle is 3 min wide and 1/6 high, so the probability is 3 × 1/6 = 0.5.

d) How many values are possible for the time required to be served at the checkout? Explain your answer.

Since this is a continuous distribution, any real number between 0 and 6 min is a possible value. An infinite number of possible values exist for the time required to be served at the checkout.

e) Is it possible to determine the probability that a customer will need to wait exactly 3 min at the checkout lane, using the area under the graph? Explain your answer.

If you pick a single value such as 3 min, the rectangle under the graph will have a width of 0 min. The probability for a single value of a continuous distribution is 0. The area cannot be used for single values of a continuous variable, only for a range of values.
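The rectangular area model for the checkout example can be written as a short calculation. This is a minimal sketch, assuming wait times are uniform from 0 min to 6 min as in Example 1; the function name `uniform_probability` and its parameters are hypothetical, chosen only for illustration.

```python
def uniform_probability(a, b, low=0.0, high=6.0):
    """P(a <= X <= b) for X uniform on [low, high], found as a rectangle area."""
    if high <= low:
        raise ValueError("high must be greater than low")
    # Only the part of [a, b] that overlaps [low, high] has any area above it.
    left = max(a, low)
    right = min(b, high)
    width = max(0.0, right - left)
    # Area = width x height, where the height is 1 / (high - low)
    # so that the whole rectangle has an area of 1.
    return width / (high - low)

print(uniform_probability(3, 6))  # 0.5, a wait between 3 min and 6 min
print(uniform_probability(3, 3))  # 0.0, a wait of exactly 3 min
```

The second call shows why a single value has probability 0: the rectangle above it has no width.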

Example 2 Frequency Table, Frequency Histogram, Frequency Polygon

a) Can you use the data in the table to determine whether the data seem to follow a uniform distribution? Can you make a reasonable estimate of the mean height in this class?

No. The data are difficult to analyse in this form. It is not obvious whether the distribution is uniform or not. Similarly, it is difficult to estimate the value of the mean with any accuracy.

b) Use a table like the one shown to determine the frequency for each interval.

Hint: If a data value falls on the boundary between two intervals, it is usually placed in the lower interval. For example, you would record a data value of 160 cm in the 150 cm–160 cm interval.

c) Using the completed frequency table, can you now answer part a) more easily?

Yes. From the frequency table, it appears that the frequencies vary from 0 to 8. The distribution is not uniform. The mean height appears to be around 165 cm.
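Building the frequency column with technology follows the same boundary rule as the hint. The sketch below is only an illustration: the heights are made-up values (the class data from the slides are not reproduced here), and the function name `frequency_table` and its parameters are hypothetical.

```python
def frequency_table(values, start, width, count):
    """Return the frequency for each of `count` intervals of equal `width`.

    A value on the boundary between two intervals is counted in the lower one.
    """
    freqs = [0] * count
    for v in values:
        if v < start or v > start + width * count:
            raise ValueError(f"{v} is outside the table")
        # Move up one interval at a time until v no longer exceeds the upper bound.
        index = 0
        while v > start + width * (index + 1):
            index += 1
        freqs[index] += 1
    return freqs

# Hypothetical heights in centimetres (not the data from the slides).
heights = [152, 158, 160, 161, 165, 165, 168, 170, 171, 174, 176, 183]
freqs = frequency_table(heights, start=150, width=10, count=4)

for i, f in enumerate(freqs):
    low = 150 + 10 * i
    print(f"{low} cm to {low + 10} cm: {f}")
```

In this made-up sample, 160 cm is counted in the 150 cm to 160 cm interval and 170 cm in the 160 cm to 170 cm interval.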

d) In what ways can a frequency table help you to analyse the raw data from a sample like this one?

The frequency table groups the raw data into intervals. The frequency in each interval makes the shape of the distribution more obvious (if you turn your head sideways, the tally column looks like a rudimentary histogram) and gives an indication of the location of the mean.

e) Use the frequency table to draw a frequency histogram. Then add a frequency polygon to the histogram.

f) How is the shape of the frequency polygon related to the shape of the probability density distribution for height? Can you use the area under the frequency polygon to calculate probabilities for any range of values?

The shape of the frequency polygon gives an indication of the shape of the probability distribution for height, but it is based on a sample that is small relative to the overall population. No. You cannot calculate probabilities using areas under the frequency polygon, because the total area under the frequency polygon is not equal to 1. You need to use a probability distribution to determine probabilities.
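The point about the area can be checked numerically. This is a rough sketch with made-up frequencies (not the class data), assuming equal interval widths and a frequency polygon that is drawn down to the axis one interval beyond each end; the name `polygon_area` is hypothetical. The last step, dividing each frequency by the sample size times the interval width, is one common way to rescale a histogram so that its total area is 1. It is shown as an assumption, not as the method used in the slides.

```python
def polygon_area(freqs, width):
    """Area under a frequency polygon that meets the axis beyond each end."""
    heights = [0] + list(freqs) + [0]
    # Add up the trapezoids between neighbouring interval midpoints.
    return sum((a + b) / 2 * width for a, b in zip(heights, heights[1:]))

freqs = [3, 5, 3, 1]   # hypothetical frequencies for four intervals
width = 10             # interval width in centimetres

area = polygon_area(freqs, width)
print(area)            # equals the sample size times the width, not 1

# Rescaled heights: each frequency divided by (sample size x width).
densities = [f / (sum(freqs) * width) for f in freqs]
print(sum(d * width for d in densities))  # total area is now 1, up to rounding
```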