l6 normal distribution

Upload: donald-yum

Post on 06-Mar-2016

DESCRIPTION

lecture notes

TRANSCRIPT

  • 1

    Chapter 6

    The Normal Distribution

    David Chow

    Oct 2014

  • 2

    Learning Objectives

    In this chapter, you will learn:

    To compute probabilities from the normal

    distribution

    To determine whether a set of data is

    approximately normal

  • 3

    Importance of the Normal Distribution

    Many continuous variables seem to be normally

    distributed

    Many discrete variables can be approximated by a

    normal distribution

    Eg: The binomial distribution is approximately symmetric when n is

    large (more precisely, when nπ ≥ 5 and n(1 − π) ≥ 5)

    By the central limit theorem, sampling distributions

    are approximately normal (to be discussed in ch7)
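
As a rough illustration of the binomial example on this slide, the sketch below compares an exact binomial probability with its normal approximation. The values n = 50 and π = 0.3, the helper name `binomial_cdf`, and the 0.5 continuity correction are assumptions chosen for illustration; none of them come from the slides.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


# Hypothetical values, not taken from the slides.
n, p = 50, 0.3

# Rule of thumb from the slide: n*pi >= 5 and n*(1 - pi) >= 5.
rule_ok = n * p >= 5 and n * (1 - p) >= 5

# Normal distribution with the same mean and standard deviation as the binomial.
approx = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

k = 15
exact = binomial_cdf(k, n, p)
approximate = approx.cdf(k + 0.5)  # 0.5 is a continuity correction (an assumption here)

print(rule_ok, round(exact, 4), round(approximate, 4))
```

With these assumed values the two probabilities should agree to about two decimal places; the agreement generally worsens when the rule of thumb fails.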

  • 4

    Properties

    Bell-shaped & symmetric

    By symmetry, mean (μ) = median = mode

    Location is characterized by the mean, μ.

    Spread is characterized by the standard deviation, σ.

    The variable X has infinite range

    I.e., −∞ < X < +∞

    In symbols, X ~ N (μ, σ²).

    Remark: f(X) = (1 / (σ√(2π))) · e^(−(X − μ)² / (2σ²))

    [Figure: bell-shaped density curve, f(X) against X. X can take any value.]

  • 5

    Probability and Area

    The bell-shaped curve is called a density

    function.

    The probability that X falls in an interval is the

    corresponding area under the density curve.

    Hence, total area under the curve = 1

    Such area (probability) can be found from

    statistical tables in any statistics textbook.

  • 6

    Probability and Area

    Unlike discrete probability distributions, the probability of a particular value from a continuous distribution is zero.

    Eg: P (download time = 4s) = 0

    Reason 1: Probability is the area under the density curve.

    Reason 2: A continuous variable has infinitely many possible values.
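
The two reasons above can be checked numerically. The sketch below shrinks an interval around 4 s and watches the area under the density go to zero. The distribution N(μ = 8.0, σ² = 5.0²) is borrowed from the later download-time example and is only an assumption here; `interval_probability` is a hypothetical helper name.

```python
from statistics import NormalDist

# Assumed distribution for download time in seconds (illustrative only).
download = NormalDist(mu=8.0, sigma=5.0)


def interval_probability(dist, a, b):
    """P(a < X < b): the area under the density curve between a and b."""
    return dist.cdf(b) - dist.cdf(a)


# Shrink an interval centred on 4 s: the area, and so the probability, shrinks too.
widths = [1.0, 0.1, 0.001]
probs = [interval_probability(download, 4 - w / 2, 4 + w / 2) for w in widths]

# A single value is an interval of zero width, so its area is zero.
point = interval_probability(download, 4.0, 4.0)

for w, pr in zip(widths, probs):
    print(f"width {w}: {pr:.6f}")
print("point:", point)
```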

  • 7

    Probability and Area

    If X is continuous, a probability is meaningful if it

    corresponds to a range (or an interval) of X.

    Eg: P (download time < 4.0s)

    Eg: P (a ≤ X ≤ b), X = height

    Whether it is < or ≤ does not matter, because P (X = a) = 0.

  • 8

    Shape

    By varying μ and σ, we obtain different normal distributions.

    Eg: Waiting time (X) at two university train stations:

    1. At Shatin Univ, μ = 6 min, σ = 1.8 min,

    2. At Pokfulam Univ, μ = 5 min, σ = 1.5 min.

    Identify (1) & (2) in the graph. What do they have in common?

    [Figure: normal curves of waiting time; another label in the graph reads "= 1.35 min"]

  • 9

    The Standardized Normal

    Any normal distribution (with any μ and σ) can be transformed into the standardized normal distribution (Z).

    Transformation formula: Z = (X − μ) / σ

    The random variable Z is also normally distributed, with μ = 0 and σ = 1

    I.e., Z ~ N (0, 1).

    Z = number of standard deviations X lies away from the mean

  • 10

    Eg: Waiting Time

    Z = (X − μ) / σ = (200 − 100) / 50 = 2.0

    X = waiting time for customers at a bank

    X is normally distributed with a mean of 100s and

    standard deviation of 50s

    X ~ N (μ = 100, σ² = 50²)

    1. Find the Z-value for X = 200s

    2. What is X if Z = -1.5?

    Z = +2.0 means X = 200s is two standard deviations ____ the mean

    Z = -1.5 means X = ____

    ANSWER
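
A minimal sketch of the transformation Z = (X − μ) / σ and its inverse X = μ + Zσ, applied to the waiting-time numbers on this slide. The function names `to_z` and `to_x` are hypothetical, chosen only for this illustration.

```python
def to_z(x, mu, sigma):
    """Standardize: how many standard deviations x lies from the mean."""
    return (x - mu) / sigma


def to_x(z, mu, sigma):
    """Invert the transformation: X = mu + Z * sigma."""
    return mu + z * sigma


mu, sigma = 100.0, 50.0  # waiting time in seconds, from the example

z_200 = to_z(200.0, mu, sigma)          # question 1
x_at_minus_1_5 = to_x(-1.5, mu, sigma)  # question 2

print(z_200, x_at_minus_1_5)  # prints: 2.0 25.0
```

A negative Z simply places X below the mean, here 1.5 standard deviations below 100s.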

  • 11

    Eg: Height https://www.youtube.com/watch?v=4R8xm19DmPM

    Fig1: Suppose the height

    of adult US females is

    normally distributed

    Mean = 162.2cm

    Standard deviation = 6.8cm

    Fig2: What is the

    probability a randomly

    selected female is

    taller than 170.5cm?

    Fig3: X to Z
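
The height question can be worked through in the same X-to-Z way. The sketch below uses Python's standard-library `statistics.NormalDist` in place of a printed table; that choice of tool is an assumption of this illustration, not something the slides use.

```python
from statistics import NormalDist

mu, sigma = 162.2, 6.8  # height in cm, from the example
x = 170.5

# Step 1: convert X to Z.
z = (x - mu) / sigma

# Step 2: "taller than" is the upper tail, so subtract the cumulative area from 1.
p_taller = 1 - NormalDist().cdf(z)

# Cross-check without standardizing first.
p_direct = 1 - NormalDist(mu, sigma).cdf(x)

print(round(z, 2), round(p_taller, 3))
```

Under these figures roughly one woman in nine would be taller than 170.5cm.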

  • 12

    Example

    [Figure: the same bell curve on two axes: X (μ = 100, σ = 50) with 100 and 200 marked, and Z (μ = 0, σ = 1) with 0 and 2.0 marked]

    The transformation does not change the

    shape, only the ______ has changed

    The same distribution can be expressed

    in original units (X), or

    in standardized units (Z)

  • 13

    Finding Normal Probability

    Standardized normal distribution

    Row: value of Z to the 1st decimal place

    Column: 2nd decimal place of Z

    Cumulative standardized normal distribution
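
To make the row and column layout concrete, the sketch below rebuilds a small excerpt of a cumulative Z-table. It assumes the printed table is the standard normal cumulative probability rounded to four places; `table_value` and `table_row` are hypothetical helper names.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # Z ~ N(0, 1)


def table_value(z):
    """P(Z < z), rounded to 4 places as in a printed table."""
    return round(STD_NORMAL.cdf(z), 4)


def table_row(first_decimal, n_columns=10):
    """One row: Z to the 1st decimal place; each column adds the 2nd decimal."""
    return [table_value(round(first_decimal + c / 100, 2)) for c in range(n_columns)]


# Excerpt: rows 0.0 to 0.3, columns .00 to .02.
print("Z     " + "     ".join(f".{c:02d}" for c in range(3)))
for tenths in range(4):
    row = table_row(tenths / 10, n_columns=3)
    print(f"{tenths / 10:.1f}  " + "  ".join(f"{v:.4f}" for v in row))
```

For example, Z = 0.12 is read from row 0.1 and column .02.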

  • 14

    Finding Normal Probability

    The Cumulative Z-Table (attached) gives the probability of ____

    To find P (a < X < b) where X ~ N (μ, σ²),

    Translate X-values to Z-values,

    Check the required probability from the table

    A visual check is often useful

    Summarizing this chapter with a chart:

    [Chart linking X, Z and Area]

  • 15

    Eg1: P (Z < 2)

    The value within the table gives the probability from Z = −∞ up to the desired Z value.

    P (Z < 2.00) = .9772

    Remember the empirical rule?

    [Table excerpt: row Z = 2.0, column 0.00 gives .9772]

    [Figure: standardized normal curve with area 0.9772 to the left of Z = 2.00]

  • 16

    Eg2: Standardized Normal Distribution

    a. Find the standard deviation of the normally distributed variable x.

    b. What are the required probabilities?

    ANSWER

  • 17

    Eg3: Verify the Empirical Rule

    range ≈ 6σ
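
One way to verify the empirical rule is to compute the area within 1, 2 and 3 standard deviations of the mean of Z. The sketch below does this with the standard library; the helper name `within` is hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()  # standardized normal, N(0, 1)


def within(k):
    """P(-k < Z < k): share of values within k standard deviations of the mean."""
    return Z.cdf(k) - Z.cdf(-k)


shares = {k: within(k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within {k} sd: {share:.4f}")
```

Because almost all of the area lies within 3 standard deviations on either side of the mean, the range of a normal data set is roughly 6 standard deviations wide.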

  • 18

    Eg4: Downloading Time

    X = the time it takes (in s) to download an image file, X ~ N (μ = 8.0, σ² = 5.0²).

    Find P(X < 8.6)

    Z = (X − μ) / σ = (8.6 − 8.0) / 5.0 = 0.12

    P(X < 8.6) = P(Z < 0.12) = 0.5478

    [Figure: X-axis (μ = 8, σ = 5) with 8.0 and 8.6 marked; Z-axis (μ = 0, σ = 1) with 0 and 0.12 marked]

    Z     .00    .01    .02
    0.0  .5000  .5040  .5080
    0.1  .5398  .5438  .5478
    0.2  .5793  .5832  .5871
    0.3  .6179  .6217  .6255

    Next, find P(8.0 < X < 8.6)
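
Both download-time probabilities can be checked as differences of cumulative areas. This is a sketch using the standard library rather than the printed table, so the last digit may differ slightly from a table lookup.

```python
from statistics import NormalDist

download = NormalDist(mu=8.0, sigma=5.0)  # download time in seconds, from the example

# Area to the left of 8.6.
p_below = download.cdf(8.6)

# Area between 8.0 and 8.6: left-of-8.6 minus left-of-8.0.
p_between = download.cdf(8.6) - download.cdf(8.0)

print(round(p_below, 4), round(p_between, 4))
```

Since 8.0 is the mean, the area to its left is 0.5, so the second probability is the first minus 0.5.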

  • 19

    Eg4: Downloading Time (Find X Given the Probability)

    Find X such that 20% of download

    times are less than X.

    First, use the table to find the Z-value of the

    given probability of 0.20. Z = ____

    Second, convert the Z-value to X units using

    the transformation formula.

    So 20% of the download times are ____.

    [Figure: area .2000 to the left of the unknown X (mean 8.0) and of the unknown Z (mean 0)]

    Z     …   .03    .04    .05
    −0.9  …  .1762  .1736  .1711
    −0.8  …  .2033  .2005  .1977
    −0.7  …  .2327  .2296  .2266

    X = μ + Zσ = 8.0 + (−0.84)(5.0) = 3.80
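
The two steps, probability to Z and then Z to X, can be sketched with the inverse cumulative function in the standard library. The exact Z differs slightly from the table value of −0.84 because the table is read to only two decimal places.

```python
from statistics import NormalDist

mu, sigma = 8.0, 5.0  # download time in seconds, from the example

# Step 1: the Z-value with 20% of the area to its left.
z = NormalDist().inv_cdf(0.20)

# Step 2: convert Z back to X units with X = mu + Z * sigma.
x = mu + z * sigma

# Cross-checks: the one-step inverse, and the table-based answer using Z = -0.84.
x_direct = NormalDist(mu, sigma).inv_cdf(0.20)
x_table = mu + (-0.84) * sigma

print(round(z, 4), round(x, 2), round(x_table, 2))
```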

  • 20

    Assessing Normality

    There are different ways to assess normality. For example,

    1. Graphically, construct a histogram or a box-and-whisker plot.

    2. Check the descriptive measures:

    Do the mean, median and mode have similar values?

    Is the range approximately 6σ?

    3. Use the empirical rule:

    About 67% of the observations lie within μ ± σ.

    About 95% of the observations lie within μ ± 2σ.

    4. A more precise alternative is to construct the normal probability plot

    (not included)
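
The descriptive checks in items 2 and 3 can be sketched as a small function. Everything specific here is an assumption for illustration: the name `normality_summary`, the simulated sample of 2000 values from N(8.0, 5.0²), and the seed, which only makes the run repeatable. The mode is left out because continuous data rarely repeat exactly.

```python
from statistics import NormalDist, mean, median, stdev


def normality_summary(data):
    """Compare a sample with what a normal distribution would predict."""
    m, s = mean(data), stdev(data)
    n = len(data)
    return {
        "mean": m,
        "median": median(data),
        "sd": s,
        # Roughly 6 for normal data, though it drifts upward in large samples.
        "range_in_sd": (max(data) - min(data)) / s,
        # Empirical rule: about two-thirds and about 95%.
        "within_1_sd": sum(abs(x - m) <= s for x in data) / n,
        "within_2_sd": sum(abs(x - m) <= 2 * s for x in data) / n,
    }


# Hypothetical sample, simulated rather than observed.
sample = NormalDist(mu=8.0, sigma=5.0).samples(2000, seed=1)
summary = normality_summary(sample)

for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Large departures from these benchmarks suggest the data are not approximately normal; small ones do not prove normality, which is why the normal probability plot is the more precise tool.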

  • 21

    Review Questions: T or F

    1. In a standard normal distribution, the probability that Z is greater than 0.5 is 0.5

    2. In a standard normal distribution, the probability that Z is greater than 1.96 is 2

    3. For a continuous random variable x, the probability density function f(x) represents the probability at a given value of x

    4. Larger values of the standard deviation result in a normal curve that is shifted to the left

  • 22

    Review Question

    Source: Educator.com

    https://www.youtube.com/watch?v=bYnIIZbeFes

  • 23

    Appendix

    Cumulative-Z

    & Excel Commands

  • 24

  • 25

    Excel Command

    FIND AREA

    Find cumulative probability (i.e., area from the left) given X- or Z- values

    =NORMDIST(X, μ, σ, true) returns the cumulative probability of a normal distribution. Eg: =NORMDIST(5, 4, 1, true) gives P(X < 5) given μ = 4 and σ = 1, i.e., 0.8413.

    =NORMSDIST(Z) gives the cumulative probability of a standardized normal. Eg: =NORMSDIST(0.12) gives P(Z < 0.12).

    FIND X-VALUES

    Find X- or Z- values given the cumulative probability (i.e., area from the left)

    =NORMSINV(cumulative probability) returns the Z-value. Eg: =NORMSINV(0.5)

    =NORMINV(cumulative probability, μ, σ) returns the X-value.
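
For readers without Excel, the four commands can be mirrored with Python's standard library. This is a sketch: the function names below are hypothetical and only echo the Excel names; they are not part of any library.

```python
from statistics import NormalDist


def normdist_cumulative(x, mu, sigma):
    """Counterpart of =NORMDIST(x, mu, sigma, true): P(X < x)."""
    return NormalDist(mu, sigma).cdf(x)


def normsdist(z):
    """Counterpart of =NORMSDIST(z): P(Z < z)."""
    return NormalDist().cdf(z)


def normsinv(p):
    """Counterpart of =NORMSINV(p): the Z-value with area p to its left."""
    return NormalDist().inv_cdf(p)


def norminv(p, mu, sigma):
    """Counterpart of =NORMINV(p, mu, sigma): the X-value with area p to its left."""
    return NormalDist(mu, sigma).inv_cdf(p)


print(round(normdist_cumulative(5, 4, 1), 4))  # the slide's example
print(round(normsdist(0.12), 4))
print(round(normsinv(0.5), 4))
print(round(norminv(0.5, 4, 1), 4))
```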