Probability Normal Distribution

Upload: imani-fisher

Post on 30-Dec-2015



TRANSCRIPT

Page 1: Probability

Probability

Normal Distribution

Page 2: Probability

What is Normal Distribution?

Any event can have at least one possible outcome.

A trial is a single event. An experiment consists of the same trial being performed repeatedly under the same conditions.

If an experiment is performed with enough trials, the populations of each possible outcome can be distributed according to different patterns.

This is a typical Poisson Distribution. Note the lack of symmetry.

This is the symmetrical Gaussian, or Normal Distribution. Learn this!

Notice the numbers. We’ll deal with them later.

Page 3: Probability

How it works...

The Normal Distribution is characterised by grouped continuous data.

The typical graph is a histogram of the populations of each grouped range of possible outcome values.

For example, if we returned to the era of norm-standardised testing for NCEA, then the distribution of test scores, as percentages, would look something like this:

To pass, you would need a score of 50% or greater. Notice about 50% of all candidates achieved this score.

Page 4: Probability

That 50% pass-rate is quite important.

For any Normally-Distributed data, the central peak is the mean, μ.

AND

50% of all data is <μ;

which means 50% of all data is >μ

Page 5: Probability

a quick bit of revision...

the standard deviation now comes into its own.

Recall:

For a set of continuous data, the mean, μ, is a measure of central tendency - it is a single value that represents the peak of the data population.

The majority of the data does not equal μ.

The standard deviation, σ, is analogous to the mean difference of every data value from μ.
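As a quick check on the revision above, here is a minimal sketch that computes μ and σ for a small data set. The data values are hypothetical, chosen only so the arithmetic is easy to follow; `fmean` and `pstdev` come from Python's standard `statistics` module.

```python
from statistics import fmean, pstdev

# Hypothetical data set, chosen only for illustration.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mu = fmean(data)          # the mean, mu
sigma = pstdev(data, mu)  # population standard deviation, sigma

print(f"mean = {mu}, standard deviation = {sigma}")
```

For this data set μ = 5 and σ = 2, and only two of the eight values actually equal μ.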

and now, back to the graph and its numbers...

Page 6: Probability

features of Normal Distribution...

the peak is the mean, μ. 50% of the data lie either side of μ.

68% of all data lie within 1σ of μ

95% of all data lie within 2σ of μ

99.7% of all data lie within 3σ of μ

the distribution is symmetrical about μ

the curve is asymptotic to the x-axis

Page 7: Probability

summary:

the curve is asymptotic to the x-axis

the peak is the mean, μ.

the distribution is symmetrical about μ; 50% of the data lie either side of μ.

68% of all data lie within 1σ of μ

95% of all data lie within 2σ of μ

99.7% of all data lie within 3σ of μ

(these percentages are rounded)

this distribution lets us calculate the probability that any outcome will be within a specified range of values
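The rounded percentages in the summary can be reproduced numerically. This is a small sketch using `NormalDist` from Python's standard `statistics` module; the helper name `fraction_within` is made up for illustration.

```python
from statistics import NormalDist

def fraction_within(k, mu=0.0, sigma=1.0):
    """Fraction of a Normal population lying within k standard deviations of the mean."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)

for k in (1, 2, 3):
    print(f"within {k} sigma: {fraction_within(k):.4f}")
```

The result does not depend on the particular μ and σ chosen, which is why the same three percentages apply to any Normally-distributed data.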

Page 8: Probability

Ah, the wonder that is u

The Normal Distribution of outcome frequencies is defined in terms of how many standard deviations either side of the mean contain a specified range of outcome values.

In order to calculate the probability that the outcome of a random event X will lie within a specified multiple of σ either side of μ, we use an intermediate Random Variable, Z.

\[ Z = \frac{X - \mu}{\sigma} \]

Z is the standardised form of X, so for Z, μ = 0 and σ = 1; hence, for a Normally distributed population, almost all values of Z lie in the range from -3 to +3.

Page 9: Probability

Z, PQR, and You

Probability calculations using Z give the likelihood that an outcome will be within a specified multiple of σ from the mean. Since μ = 0 for Z, the mean appears as 0 in the expressions below. There are three models used:

P(t) = the probability that an outcome is any value of X up to a defined multiple of σ beyond μ

P(t) = P(Z<t) ≡ P(0<Z<t) + 0.5 (for t > 0)

Q(t) = the probability that an outcome is any value of X between μ and a defined multiple of σ

Q(t) = P(0<Z<t)

R(t) = the probability that an outcome is any value of X greater than a defined multiple of σ from μ

R(t) = P(Z>t)
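The three models can be sketched as small Python functions. This assumes the usual reading of P, Q and R as areas under the standard Normal curve; the function names simply mirror the slide, and `NormalDist` is from the standard `statistics` module.

```python
from statistics import NormalDist

STANDARD = NormalDist(mu=0.0, sigma=1.0)  # the distribution of Z

def P(t):
    """P(Z < t): area to the left of t."""
    return STANDARD.cdf(t)

def Q(t):
    """Area between the mean (0) and t, returned as a positive number."""
    return abs(STANDARD.cdf(t) - 0.5)

def R(t):
    """P(Z > t): area to the right of t."""
    return 1.0 - STANDARD.cdf(t)

print(P(1.0), Q(1.0), R(1.0))
```

For any t, P(t) + R(t) = 1, and for positive t, P(t) = Q(t) + 0.5, which is the identity quoted above.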

Page 10: Probability

Solving PQR Problems

Read the problem carefully.

Draw a diagram - sketch the Bell Curve, and use this to identify the problem as P, Q or R

You could now use the Z probability tables to calculate P, Q or R, or use a Graphic Calculator such as the Casio fx-9750G Plus

1. Enter RUN mode.
2. OPTN
3. F6
4. F3
5. F6

This gives the F-menu for PQR.

1. Choose the function (P, Q or R) appropriate to your problem;
2. enter the value of t, and EXE.

Page 11: Probability

Calculating Z from Real Data

The PQR function assumes a perfectly-symmetrical distribution about μ. Real survey distributions are rarely perfect.

For any set of real data, we can calculate μ and σ, and therefore Z.

For example, if μ=33 and σ=8, then to find P(X<20):

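\[
\begin{aligned}
P(X < 20) &= P\left(Z < \frac{20 - \mu}{\sigma}\right) \\
&= P\left(Z < \frac{20 - 33}{8}\right) \\
&= P(Z < -1.625)
\end{aligned}
\]

Now, use the R function, and subtract the result from 1.

SO... use

\[ Z = \frac{X - \mu}{\sigma} \]

to calculate Z, and then use PQR.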
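The worked example with μ = 33 and σ = 8 can be cross-checked in a few lines of Python. This is only a sketch: it uses `NormalDist` from the standard `statistics` module in place of the calculator's PQR menu.

```python
from statistics import NormalDist

mu, sigma = 33.0, 8.0   # values from the worked example
x = 20.0

z = (x - mu) / sigma            # Z = (X - mu) / sigma
r = 1.0 - NormalDist().cdf(z)   # R(z) = P(Z > z)
p_below = 1.0 - r               # P(X < 20) = 1 - R(z)

print(f"Z = {z}, P(X < 20) = {p_below:.4f}")
```

The probability comes out at roughly 0.052, so only about 5% of outcomes fall below 20.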

Page 12: Probability

continuity

•Normally-distributed data is often continuous.

•If asked to calculate probability for continuous data above a value q, apply the principle of measurement error, and take the boundary as 0.5 of the basic unit above the stated value, i.e. q+0.5.

•This is because any value in the range q-0.5 to q+0.5 will be recorded as q.
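A short sketch of the correction in practice. The numbers here are hypothetical (whole-number scores with μ = 33 and σ = 8, and a stated value of q = 40); they are assumptions for illustration, not data from the slides.

```python
from statistics import NormalDist

# Hypothetical values for illustration: whole-number scores with mu = 33, sigma = 8.
dist = NormalDist(mu=33.0, sigma=8.0)
q = 40       # stated value
unit = 1.0   # basic unit of measurement

uncorrected = 1.0 - dist.cdf(q)              # P(X > 40), ignoring rounding
corrected = 1.0 - dist.cdf(q + 0.5 * unit)   # P(X > 40.5), boundary moved up by half a unit

print(f"uncorrected: {uncorrected:.4f}, corrected: {corrected:.4f}")
```

Moving the boundary from q to q+0.5 excludes the values that would be recorded as q itself, so the corrected probability is slightly smaller.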

Page 13: Probability

Inverse Normal

This is the reverse process to finding the probability.

Given the probability that an event’s outcome will lie within a defined range, we can rearrange the Z equation to give

X = Zσ + μ

But... we cannot define X, as it represents the entire range of values of all possible outcomes.

What the equation will give us is the value k.

k is the upper or lower limit of the range of X that is included in the P calculations.

Page 14: Probability

an example... if X is a normally-distributed variable with σ=4, μ=25, and p(X<k) = 0.982, what is k?

The long way...

Use the PQR model, and sketch a bell curve to identify the regions being included in the p range.

Use the ND table to find the value of Z: the table gives the area between the mean and Z, so find 0.982 - 0.5 = 0.482

Gives Z = 2.097

So, k = 4 x 2.097 + 25 = 33.388

or, the short way...

using a graphic calculator, for example the trusty Casio fx9750G;

•MODE: STATS

•F5 ➜ F1 ➜ F3

•Area = probability, as a decimal

•σ =

•μ =

•EXECUTE
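The inverse-normal example (σ = 4, μ = 25, p(X<k) = 0.982) can also be checked in Python. This sketch uses `NormalDist.inv_cdf` from the standard `statistics` module, which plays the role of the table lookup or the calculator's inverse-normal function.

```python
from statistics import NormalDist

mu, sigma, p = 25.0, 4.0, 0.982

z = NormalDist().inv_cdf(p)   # Z such that P(Z < z) = p
k = z * sigma + mu            # X = Z*sigma + mu, evaluated at the limit k

# The same answer in one step, without going through Z explicitly.
k_direct = NormalDist(mu, sigma).inv_cdf(p)

print(f"Z = {z:.3f}, k = {k:.3f}")
```

Both routes agree with the table-based answer of k ≈ 33.39 to two decimal places.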

Page 15: Probability

Combinations of Variance

•The real world is rarely a simple place. However, apparently complex relationships can be rationalised to form straightforward equations.

•In addition to functional relationships that involve a single Random Variable, there can be interactions between two or more independent random variables, X and Y.

Page 16: Probability

Sum of Random Variables

•For each Random Variable there is a calculable variance - this is true irrespective of the number of possible outcomes.

•Sums of Random Variables occur when we want the likelihood of a specific pair of outcomes (T) from two independent events;

X + Y = T

Simply,

VAR(T) = VAR(X + Y) = VAR(X) + VAR(Y)

and

VAR(X - Y) = VAR(X) + VAR(Y)

Got it? Whether adding or subtracting, you always add the independent Variances!

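To see that variances add for both the sum and the difference, here is a small exact check. The two fair six-sided dice are an illustrative assumption (they are not part of the slides); fractions are used so there is no rounding.

```python
from fractions import Fraction
from itertools import product

def variance(values):
    """Population variance of equally likely values, as an exact fraction."""
    n = len(values)
    mean = Fraction(sum(values), n)
    return sum((Fraction(v) - mean) ** 2 for v in values) / n

faces = range(1, 7)                   # one fair die: X or Y
pairs = list(product(faces, faces))   # all 36 equally likely (X, Y) outcomes

var_die = variance(list(faces))
var_sum = variance([x + y for x, y in pairs])
var_diff = variance([x - y for x, y in pairs])

print(var_die, var_sum, var_diff)
```

Each die has variance 35/12, and both X + Y and X - Y have variance 35/6, which is exactly VAR(X) + VAR(Y).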

Page 17: Probability

Linear Combinations of Random Variables

We know that for a single Random Variable X, the linear function is

\( E(aX + b) = aE(X) + b \), and \( \mathrm{VAR}(aX + b) = a^2\,\mathrm{VAR}(X) \)

If we introduce a second Random Variable Y,

\( E(aX + bY) = aE(X) + bE(Y) \)

and

\( \mathrm{VAR}(aX + bY) = a^2\,\mathrm{VAR}(X) + b^2\,\mathrm{VAR}(Y) \)

NOTE: the variance rule only holds true if X and Y are independent

Page 18: Probability

an example...

A hydroponic lettuce grower has her weekly costs expressed by two random variables - the number of plants X, and the number of litres of liquid fertiliser concentrate Y. The two variables are independent. The standard deviation of X is 150, and the standard deviation of Y is 4 litres. Each lettuce costs $0.50 to irrigate and each litre of concentrate costs $10. Find the standard deviation of her costs.

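X: σ = 150, so \( \mathrm{VAR}(X) = 150^2 = 22500 \)

Y: σ = 4, so \( \mathrm{VAR}(Y) = 4^2 = 16 \)

\[
\begin{aligned}
\mathrm{VAR}(0.5X + 10Y) &= 0.5^2\,\mathrm{VAR}(X) + 10^2\,\mathrm{VAR}(Y) \\
&= (0.25 \times 22500) + (100 \times 16) \\
&= 5625 + 1600 \\
&= 7225
\end{aligned}
\]

so, \( \sigma = \sqrt{7225} = 85 \), i.e. $85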
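The lettuce calculation can be repeated in code as a check. The helper name `sd_linear_combination` is made up for this sketch, and it assumes X and Y are independent, as the problem states.

```python
from math import sqrt

def sd_linear_combination(a, sd_x, b, sd_y):
    """Standard deviation of aX + bY, assuming X and Y are independent."""
    variance = a**2 * sd_x**2 + b**2 * sd_y**2
    return sqrt(variance)

# $0.50 per lettuce (sd 150 plants), $10 per litre of concentrate (sd 4 litres).
cost_sd = sd_linear_combination(0.5, 150, 10, 4)

print(f"standard deviation of weekly costs: ${cost_sd:.2f}")
```

Note that the coefficients are squared before multiplying the variances; adding 0.5 × 150 and 10 × 4 directly would give the wrong answer.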