learn this statistics
-
8/2/2019 Learn This Statistics
1/49
Statistics for Engineers
Antony Lewis
http://cosmologist.info/teaching/STAT/
-
Starter question
Have you previously done any statistics?
1. Yes
2. No
[Poll results chart: 42%, 58%]
-
Books
Chatfield C, 1989. Statistics for Technology, Chapman & Hall, 3rd ed.
Mendenhall W and Sincich T, 1995. Statistics for Engineering and the Sciences.
-
Books
Devore J L, 2004. Probability and Statistics for Engineering and the Sciences, Thomson, 6th ed.
Richard A. Johnson. Miller and Freund's Probability and Statistics for Engineers.
Wikipedia also has good articles on many topics covered in the course.
-
Workshops
- Doing questions for yourself is very important for learning the material.
- Hand in questions at the workshop, or by 12 noon on Monday the week after receiving them (at the maths school office in Pevensey II).
- Marks do not count, but this is a good way to get feedback.
-
Probability
Event: a possible outcome or set of possible outcomes of an experiment or observation. Typically denoted by a capital letter: A, B, etc.
E.g. the result of a coin toss.
Probability of an event A: denoted by P(A).
E.g. P(result of a coin toss is heads).
Measured on a scale between 0 and 1 inclusive. If A is impossible then P(A) = 0; if A is certain then P(A) = 1.
-
Intuitive idea: P(A) is the typical fraction of times A would occur if an experiment were repeated very many times.
If there is a fixed number of equally likely outcomes, P(A) is the fraction of the outcomes that are in A.
E.g. for a coin toss there are two possible outcomes, Heads (H) or Tails (T), so
P(result of a coin toss is heads) = 1/2.
[Diagram: the set of all possible outcomes {H, T}, with the event A = {H} marked; scale labels "Event has not occurred" and "Event has occurred"]
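The "fraction of times in many repeats" idea can be illustrated with a small simulation. This is only a sketch: the function name heads_fraction and the fixed seed are assumptions made for the illustration, not part of the course material.

```python
import random

def heads_fraction(n_tosses, seed=0):
    """Fraction of heads in n_tosses simulated tosses of a fair coin."""
    rng = random.Random(seed)  # fixed seed so that a run is repeatable
    heads = sum(rng.random() < 0.5 for _ in range(n_tosses))
    return heads / n_tosses

# The observed fraction should settle near P(heads) = 1/2 as n grows.
for n in (10, 1_000, 100_000):
    print(n, heads_fraction(n))
```

With only a few tosses the fraction can be far from 1/2; it is the long-run behaviour that matches the probability.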
-
Probability of a statement S: P(S) denotes the degree of belief that S is true.
E.g. P(tomorrow it will rain).
Conditional probability: P(A|B) means the probability of A given that B has happened or is true.
E.g. P(result of coin toss is heads | the coin is fair) = 1/2
P(Tomorrow is Tuesday | it is Monday) = 1
P(card is a heart | it is a red suit) = 1/2
-
Conditional Probability
In terms of P(B) and P(A and B) we have
P(A|B) = P(A and B) / P(B)
P(B) gives the probability of an event in the B set. Given that the event is in B, P(A|B) is the probability of also being in A. It is the fraction of the B outcomes that are also in A.
Probabilities are always conditional on something, for example prior knowledge, but often this is left implicit when it is irrelevant or assumed to be obvious from the context.
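As a check on the definition, the card example P(card is a heart | it is a red suit) can be computed by listing a standard 52-card pack. A minimal sketch; the helper P and the predicate names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction

suits = ["hearts", "diamonds", "clubs", "spades"]
deck = [(rank, suit) for suit in suits for rank in range(1, 14)]  # 52 cards

def P(event, given=lambda card: True):
    """Fraction of the cards satisfying `given` that also satisfy `event`."""
    restricted = [card for card in deck if given(card)]
    return Fraction(sum(1 for card in restricted if event(card)), len(restricted))

def is_heart(card):
    return card[1] == "hearts"

def is_red(card):
    return card[1] in ("hearts", "diamonds")

# Definition: P(A|B) = P(A and B) / P(B)
p_definition = P(lambda c: is_heart(c) and is_red(c)) / P(is_red)
# Direct count: fraction of the red cards that are hearts
p_direct = P(is_heart, given=is_red)
print(p_definition, p_direct)
```

Both routes give the same value, which is the point of the "fraction of the B outcomes that are also in A" reading.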
-
Rules of probability
1. Complement Rule
Denote all events that are not A as Ac.
Since either A or not A must happen, P(A) + P(Ac) = 1.
Hence
P(Event happens) = 1 - P(Event doesn't happen)
so P(A) = 1 - P(Ac) and P(Ac) = 1 - P(A).
E.g. when throwing a fair dice, P(not 6) = 1 - P(6) = 1 - 1/6 = 5/6.
-
2. Multiplication Rule
We can re-arrange the definition of the conditional probability, P(A|B) = P(A and B) / P(B), to get
P(A and B) = P(A|B) P(B)
or
P(A and B) = P(B|A) P(A)
You can often think of P(A and B) as being the probability of first getting A with probability P(A), and then getting B with probability P(B|A).
This is the same as first getting B with probability P(B) and then getting A with probability P(A|B).
-
Example:
A batch of 5 computers has 2 faulty computers. If the computers are chosen at random (without replacement), what is the probability that the first two inspected are both faulty?
Answer:
Use P(A and B) = P(A) P(B|A):
P(first computer faulty AND second computer faulty)
= P(first computer faulty) x P(second computer faulty | first computer faulty)
= 2/5 x 1/4 = 2/20 = 1/10
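The answer can be cross-checked by brute force: list every ordered pair of distinct computers and count the pairs in which both are faulty. A sketch under the assumption that the computers are labelled 0 to 4 with the first two faulty (the labelling is arbitrary and only for illustration).

```python
from fractions import Fraction
from itertools import permutations

# Hypothetical labelling: positions 0 and 1 are the two faulty computers.
computers = ["faulty", "faulty", "ok", "ok", "ok"]

# Multiplication rule: P(first faulty) x P(second faulty | first faulty)
p_rule = Fraction(2, 5) * Fraction(1, 4)

# Enumeration: all ordered ways of inspecting two different computers
pairs = list(permutations(range(len(computers)), 2))
both_faulty = [
    (i, j) for i, j in pairs
    if computers[i] == "faulty" and computers[j] == "faulty"
]
p_enumerated = Fraction(len(both_faulty), len(pairs))

print(p_rule, p_enumerated)
```

There are 20 equally likely ordered pairs and 2 of them are faulty-faulty, which matches the 2/20 in the worked answer.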
-
Drawing cards
Drawing two random cards from a pack without replacement, what is the probability of getting two hearts?
[13 of the 52 cards in a pack are hearts]
1. 1/16
2. 3/51
3. 3/52
4. 1/4
[Poll results chart: 18%, 10%, 10%, 62%]
-
Drawing cards
Drawing two random cards from a pack without replacement, what is the probability of getting two hearts?
To start with 13/52 of the cards are hearts.
After one is drawn, only 12/51 of the remaining cards are hearts.
So the probability of two hearts is
P(first is a heart AND second is a heart)
= P(first is a heart) x P(second is a heart | first is a heart)
= 13/52 x 12/51 = 1/4 x 12/51 = 3/51
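The same chain of conditional probabilities extends to any number of draws without replacement. A minimal sketch; the function name p_all_hits is an assumption for this illustration.

```python
from fractions import Fraction

def p_all_hits(hits, total, draws):
    """P(every one of `draws` items drawn without replacement is a 'hit').

    Repeated multiplication rule: after k hits have been drawn,
    hits - k of the remaining total - k items are hits.
    """
    p = Fraction(1)
    for k in range(draws):
        p *= Fraction(hits - k, total - k)
    return p

p_two_hearts = p_all_hits(hits=13, total=52, draws=2)
print(p_two_hearts)  # the same number as 3/51, in lowest terms
```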
-
Special Multiplication Rule
If two events A and B are independent then P(A|B) = P(A) and P(B|A) = P(B): knowing that A has occurred does not affect the probability that B has occurred and vice versa.
In that case
P(A and B) = P(A) P(B)
Probabilities for any number of independent events can be multiplied to get the joint probability.
E.g. A fair coin is tossed twice, what is the chance of getting a head and then a tail?
P(H1 and T2) = P(H1) P(T2) = 1/2 x 1/2 = 1/4.
E.g. Items on a production line have 1/6 probability of being faulty. If you select items one after another, what is the probability that you have to pick three items to find the first faulty one?
P(1st OK) x P(2nd OK) x P(3rd faulty) = 5/6 x 5/6 x 1/6 = 25/216 = 0.116..
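The production-line calculation generalises to "first faulty item found on pick n". A sketch assuming independent items with the same fault probability; the function name is hypothetical.

```python
from fractions import Fraction

def p_first_faulty_on(n, p_faulty):
    """P(first n-1 items are OK and item n is faulty), items independent."""
    return (1 - p_faulty) ** (n - 1) * p_faulty

p_third = p_first_faulty_on(3, Fraction(1, 6))
print(p_third, float(p_third))
```

Using Fraction keeps the arithmetic exact, so the result can be compared directly with 25/216.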
-
3. Addition Rule
For any two events A and B,
P(A or B) = P(A) + P(B) - P(A and B)
Note: "A or B" includes the possibility that both A and B occur.
[Venn diagrams: A + B - (A and B) = (A or B)]
-
Throw of a die
Throwing a fair dice, let events be
A = get an odd number
B = get a 5 or 6
What is P(A or B)?
1. 1/6
2. 1/3
3. 1/2
4. 2/3
5. 5/6
[Poll results chart: 7%, 4%, 0%, 75%, 14%]
-
Throw of a die
Throwing a fair dice, let events be
A = get an odd number
B = get a 5 or 6
What is P(A or B)?
P(A or B) = P(A) + P(B) - P(A and B)
= P(odd) + P(5 or 6) - P(5)
= 3/6 + 2/6 - 1/6 = 4/6 = 2/3
This is consistent since P(A or B) = P({1,3,5,6}) = 4/6 = 2/3.
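The addition rule for this die example can be checked with Python sets, where | is "or" (union) and & is "and" (intersection). A sketch only; the helper P is a hypothetical name.

```python
from fractions import Fraction

outcomes = {1, 2, 3, 4, 5, 6}   # fair die: all outcomes equally likely
A = {1, 3, 5}                   # odd number
B = {5, 6}                      # 5 or 6

def P(event):
    """Probability of an event as the fraction of outcomes it contains."""
    return Fraction(len(event & outcomes), len(outcomes))

p_direct = P(A | B)                      # count {1, 3, 5, 6} directly
p_rule = P(A) + P(B) - P(A & B)          # addition rule
print(p_direct, p_rule)
```

Subtracting P(A & B) removes the outcome 5, which would otherwise be counted twice.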
-
Alternative: Complements Rule
Probability of not getting either A or B = probability of not getting A and not getting B,
i.e. P(A or B) = 1 - P(not A and not B)
P(A or B) = 1 - P(Ac and Bc)
[Venn diagrams: the complement of (A or B) equals (Ac and Bc)]
-
Throw of a die
Throwing a fair dice, let events be
A = get an odd number
B = get a 5 or 6
What is P(A or B)?
Alternative answer
Ac = {2,4,6}, Bc = {1,2,3,4} so Ac and Bc = {2,4}.
Hence
P(A or B) = 1 - P(Ac and Bc) = 1 - P({2,4}) = 1 - 1/3 = 2/3
-
Lots of possibilities
This alternative form has the advantage of generalizing easily to lots of possible events:
P(A1 or A2 or ... or Ak) = 1 - P(A1c and A2c and ... and Akc)
Remember: for independent events, P(A and B) = P(A) P(B).
Example: There are three alternative routes A, B, or C to work, each with some probability of being blocked. What is the probability I can get to work?
The probability of me not being able to get to work is the probability of all three being blocked. So the probability of me being able to get to work is
P(A clear or B clear or C clear) = 1 - P(A blocked and B blocked and C blocked).
E.g. if the routes are blocked independently and P(A blocked) x P(B blocked) x P(C blocked) = 1/30,
then
P(can get to work) = P(A clear or B clear or C clear)
= 1 - P(A blocked and B blocked and C blocked)
= 1 - 1/30 = 29/30
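The "one minus the probability that none happen" form is easy to sketch in code. The blocking probabilities below (1/2, 1/3, 1/5) are assumed values, chosen only so that their product is 1/30 as in the example; they are not taken from the slide. The function name is also hypothetical.

```python
from fractions import Fraction
from math import prod

def p_at_least_one(p_not_happening):
    """P(at least one event happens), given the probability that each one
    does NOT happen, assuming the events are independent."""
    return 1 - prod(p_not_happening)

# Assumed values for P(A blocked), P(B blocked), P(C blocked)
p_blocked = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 5)]

# "A route is clear" fails to happen exactly when that route is blocked.
p_can_get_to_work = p_at_least_one(p_blocked)
print(p_can_get_to_work)
```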
-
Problems with a device
There are three common ways for a system to experience problems, with independent probabilities over a year
A = overheats, P(A) = 1/3
B = subcomponent malfunctions, P(B) = 1/3
C = damaged by operator, P(C) = 1/10
What is the probability that the system has one or more of these problems during the year?
1. 1/3
2. 2/5
3. 3/5
4. 3/4
5. 5/6
[Poll results chart: 8%, 23%, 23%, 8%, 38%]
-
Problems with a device
There are three common ways for a system to experience problems, with independent probabilities over a year
A = overheats, P(A) = 1/3
B = subcomponent malfunctions, P(B) = 1/3
C = damaged by operator, P(C) = 1/10
What is the probability that the system has one or more of these problems during the year?
P(has a problem) = 1 - P(no problems) = 1 - P(Ac and Bc and Cc)
= 1 - 2/3 x 2/3 x 9/10
= 1 - 4/10 = 3/5
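The complement shortcut can be compared with the long way round: adding up the probabilities of all seven outcomes in which at least one problem occurs. A sketch assuming independence, as the question states; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

p = {"A": Fraction(1, 3), "B": Fraction(1, 3), "C": Fraction(1, 10)}

# Complement rule: 1 - P(no A and no B and no C)
p_none = Fraction(1)
for q in p.values():
    p_none *= 1 - q
p_problem = 1 - p_none

# Brute force: sum over every occurred/did-not-occur combination
# that contains at least one problem.
p_brute = Fraction(0)
for flags in product([True, False], repeat=len(p)):
    if any(flags):
        weight = Fraction(1)
        for occurred, q in zip(flags, p.values()):
            weight *= q if occurred else 1 - q
        p_brute += weight

print(p_problem, p_brute)
```

The brute-force sum needs seven terms here and grows quickly with more events, while the complement form stays a single product.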
-
Special Addition Rule
If P(A and B) = 0, the events are mutually exclusive, so
P(A or B) = P(A) + P(B)
[Venn diagram: non-overlapping sets A, B, C]
In general if several events A1, A2, ..., Ak are mutually exclusive (i.e. at most one of them can happen in a single experiment) then
P(A1 or A2 or ... or Ak) = P(A1) + P(A2) + ... + P(Ak)
E.g. Throwing a fair dice,
P(getting 4, 5 or 6) = P(4) + P(5) + P(6) = 1/6 + 1/6 + 1/6 = 1/2
-
Rules of probability recap
Complements Rule: P(Ac) = 1 - P(A)
Q. What is the probability that a random card is not the ace of spades?
A. 1 - P(ace of spades) = 1 - 1/52 = 51/52
Multiplication Rule: P(A and B) = P(A) P(B|A)
Q. What is the probability that two cards taken (without replacement) are both Aces?
A. P(first ace) x P(second ace | first ace) = 4/52 x 3/51 = 1/221
Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
Q. What is the probability of a random card being a diamond or an ace?
A. P(diamond) + P(ace) - P(diamond and ace) = 13/52 + 4/52 - 1/52 = 16/52 = 4/13
-
Failing a drugs test
A drugs test for athletes is 99% reliable: applied to a drug taker it gives a positive result 99% of the time, given to a non-taker it gives a negative result 99% of the time. It is estimated that 1% of athletes take drugs.
A random athlete has failed the test. What is the probability the athlete takes drugs?
1. 0.01
2. 0.3
3. 0.5
4. 0.7
5. 0.98
6. 0.99
[Poll results chart: 16%, 10%, 29%, 39%, 3%, 3%]
-
Similar example: TV screens produced by a manufacturer have defects 10% of the time.
An automated mid-production test is found to be 80% reliable at detecting faults (if the TV has a fault, the test indicates this 80% of the time; if the TV is fault-free there is a false positive only 20% of the time).
If a TV fails the test, what is the probability that it has a defect?
Split question into two parts
1. What is the probability that a random TV fails the test?
2. Given that a random TV has failed the test, what is the probability it is because it has a defect?
-
Example: TV screens produced by a manufacturer have defects 10% of the time.
An automated mid-production test is found to be 80% reliable at detecting faults (if the TV has a fault, the test indicates this 80% of the time; if the TV is fault-free there is a false positive only 20% of the time).
What is the probability of a random TV failing the mid-production test?
Answer:
Let D = TV has a defect
Let F = TV fails test
The question tells us: P(D) = 0.1, P(F|D) = 0.8, P(F|Dc) = 0.2
Two mutually exclusive ways to fail the test:
TV has a defect and test shows this, -OR- TV is OK but get a false positive
P(F) = P(F and D) + P(F and Dc)
= P(F|D) P(D) + P(F|Dc) P(Dc)
= 0.8 x 0.1 + 0.2 x (1 - 0.1) = 0.26
-
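Total Probability Rule
If A1, A2, ..., Ak form a partition (a mutually exclusive list of all possible outcomes) and B is any event then
P(B) = P(B and A1) + P(B and A2) + ... + P(B and Ak)
= P(B|A1) P(A1) + P(B|A2) P(A2) + ... + P(B|Ak) P(Ak)
[Diagram: a partition A1, A2, A3, A4, A5 of all outcomes, with the event B overlapping several of them]
P(F) = P(F|D) P(D) + P(F|Dc) P(Dc)
is an example of the Total Probability Rule.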
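The total probability rule is a weighted sum, which can be sketched as a short function. The name total_probability and the argument layout are assumptions for this illustration; the numbers are those of the TV example.

```python
from fractions import Fraction

def total_probability(priors, likelihoods):
    """P(B) = sum over k of P(B|A_k) P(A_k), for a partition A_1..A_k.

    priors[k] = P(A_k) and likelihoods[k] = P(B|A_k).
    """
    if sum(priors) != 1:
        raise ValueError("the priors of a partition must sum to 1")
    return sum(l * p for l, p in zip(likelihoods, priors))

# TV example: partition (defect, no defect), B = "fails the test"
p_fail = total_probability(
    priors=[Fraction(1, 10), Fraction(9, 10)],
    likelihoods=[Fraction(8, 10), Fraction(2, 10)],
)
print(p_fail, float(p_fail))
```

The check on the priors is exact here because Fractions are used; with floats a small tolerance would be needed instead.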
-
Example: TV screens produced by a manufacturer have defects 10% of the time.
An automated mid-production test is found to be 80% reliable at detecting faults (if the TV has a fault, the test indicates this 80% of the time; if the TV is fault-free there is a false positive only 20% of the time).
If a TV fails the test, what is the probability that it has a defect?
Answer:
Let D = TV has a defect
Let F = TV fails test
We previously showed using the total probability rule that
P(F) = P(F|D) P(D) + P(F|Dc) P(Dc) = 0.8 x 0.1 + 0.2 x (1 - 0.1) = 0.26
When we get a test fail, what fraction of the time is it because the TV has a defect?
-
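[Diagram: all TVs, split into the 10% with defects (D) and the TVs without defect (Dc). F: TVs that fail the test, made up of the 80% of TVs with defects that fail the test and the 20% of OK TVs that give a false positive, so P(F) = P(F|D) P(D) + P(F|Dc) P(Dc)]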
-
Example: TV screens produced by a manufacturer have defects 10% of the time.
An automated mid-production test is found to be 80% reliable at detecting faults (if the TV has a fault, the test indicates this 80% of the time; if the TV is fault-free there is a false positive only 20% of the time).
If a TV fails the test, what is the probability that it has a defect?
Answer:
Let D = TV has a defect
Let F = TV fails test
We previously showed using the total probability rule that
P(F) = P(F|D) P(D) + P(F|Dc) P(Dc) = 0.8 x 0.1 + 0.2 x (1 - 0.1) = 0.26
When we get a test fail, what fraction of the time is it because the TV has a defect?
Know P(F|D) = 0.8, P(D) = 0.1:
P(D|F) = P(F and D) / P(F) = P(F|D) P(D) / P(F) = (0.8 x 0.1) / 0.26 = 0.3077
-
Bayes' Theorem
The multiplication rule gives P(A and B) = P(A|B) P(B) = P(B|A) P(A), so
P(A|B) = P(B|A) P(A) / P(B)
Note: as in the example, the Total Probability rule is often used to evaluate P(B):
P(A|B) = P(B|A) P(A) / [P(B and A1) + P(B and A2) + ...]
= P(B|A) P(A) / [P(B|A1) P(A1) + P(B|A2) P(A2) + ...]
where A1, A2, ... form a partition.
If you have a model that tells you how likely B is given A, Bayes' theorem allows you to calculate the probability of A if you observe B. This is the key to learning about your model from statistical data.
[Portrait: The Rev Thomas Bayes (1702-1761)]
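Bayes' theorem with the total probability rule in the denominator can be sketched as one small function that returns P(A_k|B) for every member of a partition. The function name bayes_posterior is hypothetical; the numbers are those of the TV example.

```python
from fractions import Fraction

def bayes_posterior(priors, likelihoods):
    """Return [P(A_k|B)] for a partition A_1..A_k.

    priors[k] = P(A_k), likelihoods[k] = P(B|A_k).
    """
    joint = [l * p for l, p in zip(likelihoods, priors)]  # P(B and A_k)
    p_b = sum(joint)                                      # total probability rule
    return [j / p_b for j in joint]

# TV example: A_1 = defect, A_2 = no defect, B = "fails the test"
posterior = bayes_posterior(
    priors=[Fraction(1, 10), Fraction(9, 10)],
    likelihoods=[Fraction(8, 10), Fraction(2, 10)],
)
print(posterior[0], round(float(posterior[0]), 4))
```

The posteriors over the whole partition always sum to 1, which is a useful sanity check on any Bayes calculation.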
-
Example: Evidence in court
The cars in a city are 90% black and 10% grey.
A witness to a bank robbery briefly sees the escape car, and says it is grey. Testing the witness under similar conditions shows the witness correctly identifies the colour 80% of the time (in either direction).
What is the probability that the escape car was actually grey?
Answer: Let G = car is grey, B = car is black, W = witness says car is grey.
Bayes' Theorem:
P(G|W) = P(W|G) P(G) / P(W)
Use total probability rule to write
P(W) = P(W|G) P(G) + P(W|B) P(B) = 0.8 x 0.1 + 0.2 x 0.9 = 0.26
Hence:
P(G|W) = (0.8 x 0.1) / 0.26 = 0.31
-
Failing a drugs test
A drugs test for athletes is 99% reliable: applied to a drug taker it gives a positive result 99% of the time, given to a non-taker it gives a negative result 99% of the time. It is estimated that 1% of athletes take drugs.
Part 1. What fraction of randomly tested athletes fail the test?
1. 1%
2. 1.98%
3. 0.99%
4. 2%
5. 0.01%
[Poll results chart: 13%, 23%, 17%, 19%, 29%]
-
Failing a drugs test
A drugs test for athletes is 99% reliable: applied to a drug taker it gives a positive result 99% of the time, given to a non-taker it gives a negative result 99% of the time. It is estimated that 1% of athletes take drugs.
What fraction of randomly tested athletes fail the test?
Let F = fails test
Let D = takes drugs
Question tells us P(D) = 0.01, P(F|D) = 0.99, P(F|Dc) = 0.01
From total probability rule:
P(F) = P(F|D) P(D) + P(F|Dc) P(Dc) = 0.99 x 0.01 + 0.01 x 0.99 = 0.0198
i.e. 1.98% of randomly tested athletes fail
-
Failing a drugs test
A drugs test for athletes is 99% reliable: applied to a drug taker it gives a positive result 99% of the time, given to a non-taker it gives a negative result 99% of the time. It is estimated that 1% of athletes take drugs.
A random athlete has failed the test. What is the probability the athlete takes drugs?
1. 0.01
2. 0.3
3. 0.5
4. 0.7
5. 0.99
[Poll results chart: 26%, 17%, 13%, 0%, 43%]
-
Failing a drugs test
A drugs test for athletes is 99% reliable: applied to a drug taker it gives a positive result 99% of the time, given to a non-taker it gives a negative result 99% of the time. It is estimated that 1% of athletes take drugs.
A random athlete is tested and gives a positive result. What is the probability the athlete takes drugs?
Let F = fails test
Let D = takes drugs
Question tells us P(D) = 0.01, P(F|D) = 0.99, P(F|Dc) = 0.01
Bayes' Theorem gives
P(D|F) = P(F|D) P(D) / P(F)
We need P(F) = P(F|D) P(D) + P(F|Dc) P(Dc) = 0.99 x 0.01 + 0.01 x 0.99 = 0.0198
Hence:
P(D|F) = (0.99 x 0.01) / 0.0198 = 0.0099 / 0.0198 = 1/2
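One way to see why the answer is only 1/2 is to count people instead of multiplying probabilities. The population of 10,000 athletes below is an assumed round number, chosen so that every count is a whole number.

```python
from fractions import Fraction

athletes = 10_000                        # assumed population size
takers = athletes // 100                 # 1% take drugs
non_takers = athletes - takers

true_positives = takers * 99 // 100      # 99% of takers fail the test
false_positives = non_takers // 100      # 1% of non-takers fail the test

p_fail = Fraction(true_positives + false_positives, athletes)
p_taker_given_fail = Fraction(true_positives, true_positives + false_positives)
print(true_positives, false_positives, p_fail, p_taker_given_fail)
```

Because non-takers are so much more common, their 1% false positives are as numerous as the takers' 99% true positives, so a failed test is a 50:50 indication.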
-
Reliability of a system
General approach: bottom-up analysis. Need to break down the system into subsystems just containing elements in series or just containing elements in parallel.
Find the reliability of each of these subsystems and then repeat the process at the next level up.
-
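Series subsystem: in the diagram pi = probability that element i fails, so 1 - pi = probability that it does not fail.
[Diagram: elements with failure probabilities p1, p2, p3, ..., pn connected in series]
The system only works if all n elements work. Failures of different elements are assumed to be independent (so the probability of Element 1 failing does not alter after connection to the system).
P(system does not fail) = P(1 does not fail and 2 does not fail and ... and n does not fail)
= (1 - p1)(1 - p2)...(1 - pn)
= product over i = 1..n of (1 - pi)
Hence P(system does fail) = 1 - P(system does not fail)
= 1 - (1 - p1)(1 - p2)...(1 - pn)
= 1 - product over i = 1..n of (1 - pi)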
-
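Parallel subsystem: the subsystem only fails if all the elements fail.
[Diagram: elements with failure probabilities p1, p2, ..., pn connected in parallel]
P(system fails) = P(1 fails and 2 fails and ... and n fails)
= p1 p2 ... pn [Special multiplication rule assuming failures independent]
= product over i = 1..n of pi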
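The series and parallel formulas translate directly into two small functions. This is a sketch assuming independent failures; the function names and the failure probabilities 0.1 and 0.2 are hypothetical, chosen only to give easy arithmetic.

```python
from math import prod

def series_fail(p):
    """P(series subsystem fails): it works only if every element works."""
    return 1 - prod(1 - pi for pi in p)

def parallel_fail(p):
    """P(parallel subsystem fails): it fails only if every element fails."""
    return prod(p)

# Hypothetical element failure probabilities
p = [0.1, 0.2]
print(series_fail(p))    # 1 - 0.9 x 0.8
print(parallel_fail(p))  # 0.1 x 0.2
```

Putting elements in series makes failure more likely than for any single element, while putting them in parallel makes it less likely.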
-
Example:
Subsystem 1:
P(Subsystem 1 doesn't fail) = (1 - 0.05)(1 - 0.03) = 0.9215
Hence P(Subsystem 1 fails) = 0.0785
Subsystem 2: (two units of subsystem 1)
[Diagram: two elements in parallel, each with failure probability 0.0785]
P(Subsystem 2 fails) = 0.0785 x 0.0785 = 0.006162
Subsystem 3:
P(Subsystem 3 fails) = 0.1 x 0.1 = 0.01
[Diagram: three elements in series with failure probabilities 0.02, 0.006162, 0.01]
Answer:
P(System doesn't fail) = (1 - 0.02)(1 - 0.006162)(1 - 0.01) = 0.964
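The bottom-up calculation above can be reproduced step by step. The system layout is an assumption read off the worked numbers, since the diagram itself is not in the text: subsystem 1 is two elements (0.05, 0.03) in series, subsystem 2 is two copies of subsystem 1 in parallel, subsystem 3 is two elements (0.1, 0.1) in parallel, and the whole system is a 0.02 element, subsystem 2 and subsystem 3 in series. The helper names are hypothetical.

```python
from math import prod

def series_fail(p):
    """Fails unless every element works (independent failures assumed)."""
    return 1 - prod(1 - pi for pi in p)

def parallel_fail(p):
    """Fails only if every element fails (independent failures assumed)."""
    return prod(p)

sub1 = series_fail([0.05, 0.03])        # two elements in series
sub2 = parallel_fail([sub1, sub1])      # two units of subsystem 1 in parallel
sub3 = parallel_fail([0.1, 0.1])        # two elements in parallel

# Top level: three blocks in series
p_system_works = 1 - series_fail([0.02, sub2, sub3])
print(round(sub1, 4), round(sub2, 6), round(p_system_works, 3))
```

Each level only needs the failure probability of the level below, which is what makes the bottom-up approach practical for larger systems.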
-
Answer to (b)
Let B = event that the system does not fail
Let C = event that component * does fail
We need to find P(B and C).
Use P(B and C) = P(B|C) P(C). We know P(C) = 0.1.
-
P(B|C) = P(system does not fail given component * has failed)
If * has failed, subsystem 3 then fails with probability 0.1.
[Diagram: final diagram is then three elements in series with failure probabilities 0.02, 0.006162, 0.1]
P(B|C) = (1 - 0.02)(1 - 0.006162)(1 - 0.1) = 0.8766
Hence, since P(C) = 0.1,
P(B and C) = P(B|C) P(C) = 0.8766 x 0.1 = 0.08766
-
Triple redundancy
What is the probability that this system does not fail, given the failure probabilities of the components?
[Diagram: three components in parallel with failure probabilities 1/3, 1/3, 1/2]
1. 17/18
2. 2/9
3. 1/9
4. 1/3
5. 1/18
[Poll results chart: 53%, 20%, 24%, 0%, 2%]
-
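Triple redundancy
What is the probability that this system does not fail, given the failure probabilities of the components?
[Diagram: three components in parallel with failure probabilities 1/3, 1/3, 1/2]
P(failing) = P(1 fails) P(2 fails) P(3 fails) = 1/3 x 1/3 x 1/2 = 1/18
Hence: P(not failing) = 1 - P(failing) = 1 - 1/18 = 17/18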