cs626 data analysis and simulation - cs.wm.edu

Post on 29-Oct-2021


1

CS626 Data Analysis and Simulation

Today: Probability Primer

Quick Reference: Sheldon Ross, Introduction to Probability Models, 9th Edition, Academic Press, Ch. 1; Berthold and Hand (eds.), Intelligent Data Analysis, Springer, 1999, Chapter 2, "Statistical Concepts", by Feelders.

Instructor: Peter Kemper, R 104A, phone 221-3462, email: kemper@cs.wm.edu

2

Overview

Random Variable
  Definition
  PMF p() and CDF F()
  Indicator variable
  Bernoulli variable
  Binomial variable and Binomial distribution

Combinatorics refresher

3

Random variable

S can contain any type of elements.
Often we are interested in numerical values that are a function of the outcome s ∈ S, instead of in events E ⊆ S.
Such functions are called random variables.
They are meant to develop methods for experiments that yield numbers.
They give a compact description of an experiment (S may be too fine grained).

Concerns
Probabilities of elements of S imply probabilities on the elements of the range of a random variable.
The range can be a discrete set of values or continuous.
For the continuous setting, P(X ≤ x) needs to be well-defined: { s | X(s) ≤ x } must be an event, i.e. measurable.

4

Random variable

Definition(s)

(special) A random variable X on a discrete sample space S is a function X : S -> R that assigns a real number X(s) to each sample point s ∈ S.

(in general) A random variable X on a probability space (S, F, P) is a function X : S -> R that assigns a real number X(s) to each sample point s ∈ S, such that for every real number x, the set of sample points { s | X(s) ≤ x } is an event, that is, a member of F.

5

Quick Example

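Let X denote the random variable that is defined as the sum of two fair dice:

P{X = 2} = P{(1,1)} = 1/36
P{X = 3} = P{(1,2), (2,1)} = 2/36
P{X = 4} = 3/36, P{X = 5} = 4/36, P{X = 6} = 5/36, P{X = 7} = 6/36
P{X = 8} = 5/36, P{X = 9} = 4/36, P{X = 10} = 3/36, P{X = 11} = 2/36
P{X = 12} = P{(6,6)} = 1/36

Since X must be one of the values 2-12, then:

1 = P{X = 2} + P{X = 3} + … + P{X = 12}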
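The dice example can be checked by brute force. The sketch below enumerates the 36 equally likely outcomes, applies the random variable as an ordinary function on the sample space, and tallies the resulting probabilities. The names `sample_space`, `X`, and `pmf` are chosen here for illustration only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Sample space S: all 36 equally likely ordered outcomes of two fair dice.
sample_space = list(product(range(1, 7), repeat=2))

# The random variable X maps each sample point (d1, d2) to the sum d1 + d2.
def X(s):
    return s[0] + s[1]

# P{X = x}: count the sample points mapped to x and divide by |S|.
counts = Counter(X(s) for s in sample_space)
pmf = {x: Fraction(c, len(sample_space)) for x, c in sorted(counts.items())}

for x, p in pmf.items():
    print(f"P{{X = {x}}} = {p}")

# The events {X = 2}, ..., {X = 12} partition S, so the probabilities sum to 1.
total = sum(pmf.values())
print("sum =", total)
```

Exact fractions are used so that the sum comes out as exactly 1 rather than a floating-point approximation.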

6

Another Example

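A coin has P(H) = p.
Let’s toss it till we see the first heads.
Let N denote the number of flips required.
Assumption: outcomes of successive flips are independent.
N is a random variable with range {1, 2, 3, … } and

P{N = n} = P{(T, …, T, H)} = (1 - p)^(n-1) p, n ≥ 1

and

Σ_{n ≥ 1} P{N = n} = p Σ_{n ≥ 1} (1 - p)^(n-1) = p / (1 - (1 - p)) = 1 (for p > 0)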
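A small numerical sketch of the coin example: the first heads appears on flip n exactly when n - 1 independent tails are followed by one heads. The value p = 0.3 and the function name `pmf_first_heads` are assumptions made for illustration.

```python
def pmf_first_heads(n, p):
    """P{N = n}: n - 1 tails followed by one heads, flips independent."""
    return (1 - p) ** (n - 1) * p

p = 0.3  # assumed value of P(H), chosen only for illustration

print(pmf_first_heads(1, p), pmf_first_heads(2, p), pmf_first_heads(3, p))

# The probabilities over the range {1, 2, 3, ...} form a geometric series;
# the partial sums approach 1 as more terms are included.
partial_sum = sum(pmf_first_heads(n, p) for n in range(1, 201))
print(partial_sum)
```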

7

An Experiment

Lifetime of a battery, S = [0, ∞)
Will it last at least two years?

E: event that the battery lasts two or more years
Random variable I:

I = 1 if E occurs
I = 0 if E does not occur

I is called the indicator random variable for event E.
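As a rough sketch, the indicator variable can be written as a function on the sample space [0, ∞). The lifetime distribution used below (exponential with a mean of 3 years) is purely an assumption for illustration and does not come from the slides.

```python
import random

def indicator(lifetime_years, threshold=2.0):
    """I = 1 if event E (battery lasts at least `threshold` years) occurs, else 0."""
    return 1 if lifetime_years >= threshold else 0

# Assumption for illustration only: lifetimes are exponential with mean 3 years.
rng = random.Random(0)
lifetimes = [rng.expovariate(1 / 3) for _ in range(100_000)]

# The average of the indicator values is the relative frequency of E
# in the simulated sample, which estimates P(E).
estimate = sum(indicator(t) for t in lifetimes) / len(lifetimes)
print(estimate)
```

The indicator compresses the fine-grained sample space into the two values that matter for the question being asked.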

8

What can we say about a Random Variable ?

Discrete Random Variable: Set of possible values is countable (or even finite)

Continuous Random Variable: Set of possible values is non-denumerable

We need terminology, ways to describe them more to the point …

What are characteristics of a random variable ?

9

Cumulative Distribution Function

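Definition: The cumulative distribution function (cdf, also distribution function or probability distribution function) FX(b) of a random variable X is defined for any real number b by

FX(b) = P{X ≤ b}

Example: X is the life of a car, and we ask for the probability that X lies within some interval (a, b).

Properties of the cdf
F(b) is a nondecreasing function of b
F(b) -> 1 as b -> ∞
F(b) -> 0 as b -> -∞

Omit the subscript X if it is clear from context.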

10

Cumulative Distribution Function

[Figure: example cdfs for three cases: discrete, continuous, mixed]

Note: cdfs are always of similar shape, although the variables differ a lot …

11

Properties of CDF

All probability questions about X can be answered in terms of the cdf F(.).
Example:

P{a < X ≤ b} = F(b) - F(a) for all a < b

The probability that X is strictly smaller than b is

P{X < b} = lim_{h -> 0+} F(b - h)

Note that P{X < b} does not necessarily equal F(b), since F(b) also includes the probability that X equals b.
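The difference between P{X < b} and F(b) is easy to see for a discrete variable. The sketch below reuses the sum of two fair dice as X. The helper names `F` and `prob_strictly_less` are illustrative.

```python
from fractions import Fraction
from itertools import product

# pmf of the sum of two fair dice, built by enumerating the 36 outcomes
pmf = {}
for d1, d2 in product(range(1, 7), repeat=2):
    pmf[d1 + d2] = pmf.get(d1 + d2, 0) + Fraction(1, 36)

def F(b):
    """cdf: F(b) = P{X <= b}."""
    return sum(p for x, p in pmf.items() if x <= b)

def prob_strictly_less(b):
    """P{X < b}: excludes the probability mass located at b itself."""
    return sum(p for x, p in pmf.items() if x < b)

b = 7
print(F(b))                                     # P{X <= 7}
print(prob_strictly_less(b))                    # P{X < 7}
print(F(b) - prob_strictly_less(b) == pmf[b])   # the gap is exactly P{X = 7}
print(F(9) - F(4))                              # P{4 < X <= 9} = F(9) - F(4)
```

For a continuous variable the two quantities coincide, because no single point carries positive probability.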

12

Probability Mass Function of a Discrete Random Variable

For a discrete random variable X, we define the probability mass function (pmf) pX(a) of X by

pX(a) = P{X = a}

If X must assume one of the values x1, x2, … (at most countably many), then

p(xi) ≥ 0 for i = 1, 2, …, and p(x) = 0 for all other values of x

Since X must take on one of the values xi,

Σ_i p(xi) = 1

13

Discrete Random Variables

The cumulative distribution function F can be expressed in terms of p(a) by

F(a) = Σ_{all xi ≤ a} p(xi)

Example: suppose X has a probability mass function given by:

Then the cumulative distribution function F of X is
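To illustrate how a cdf is built from a pmf, here is a minimal sketch that uses a hypothetical pmf with p(1) = 1/2, p(2) = 1/3, p(3) = 1/6. These values are assumed for illustration only.

```python
from fractions import Fraction

# Hypothetical pmf, assumed for illustration only.
pmf = {1: Fraction(1, 2), 2: Fraction(1, 3), 3: Fraction(1, 6)}

def cdf(a):
    """F(a) = sum of p(x) over all possible values x <= a."""
    return sum((p for x, p in pmf.items() if x <= a), Fraction(0))

# F is a step function: it jumps by p(x) at each possible value x
# and stays constant in between.
for a in (0.5, 1, 1.5, 2, 2.5, 3, 10):
    print(f"F({a}) = {cdf(a)}")
```

Under these assumed values, F is 0 below 1, 1/2 on [1, 2), 5/6 on [2, 3), and 1 from 3 onward.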

14

Bernoulli Random Variable

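Suppose the outcome of an experiment is either a success or a failure.
Let X = 1 if the outcome is a success and X = 0 if it is a failure. Then the probability mass function of X is

p(0) = P{X = 0} = 1 - p
p(1) = P{X = 1} = p

where p is the probability that the experiment is a success.

X is said to be a Bernoulli random variable if its probability mass function is given by the above equations for some p between 0 and 1.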

15

Binomial Random Variable

n independent trials of a Bernoulli experiment, each with a probability of success p and a failure probability of 1 - p.
Let X be the number of successes that occur in the n trials.
X is a binomial random variable with parameters (n, p).
The probability mass function of X is

p(i) = C(n, i) p^i (1 - p)^(n-i), i = 0, 1, …, n

where C(n, i) = n! / (i! (n - i)!) is the number of different groups of i objects that can be chosen out of n objects.
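The role of the binomial coefficient can be made concrete by comparing the closed-form pmf with a brute-force sum over all success/failure sequences. This is a sketch only. The parameters n = 4, p = 0.25 and both function names are illustrative assumptions.

```python
from itertools import product
from math import comb

def binomial_pmf(i, n, p):
    """Closed form: C(n, i) * p^i * (1 - p)^(n - i)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

def binomial_pmf_by_enumeration(i, n, p):
    """Add up the probabilities of all length-n sequences with exactly i successes."""
    total = 0.0
    for trials in product((0, 1), repeat=n):   # 1 = success, 0 = failure
        if sum(trials) == i:
            # independence: each such sequence has probability p^i (1 - p)^(n - i)
            total += p**i * (1 - p) ** (n - i)
    return total

n, p = 4, 0.25   # illustrative parameters
for i in range(n + 1):
    print(i, binomial_pmf(i, n, p), binomial_pmf_by_enumeration(i, n, p))
```

Every sequence with i successes has the same probability, so the enumeration simply counts how many such sequences exist, which is what the binomial coefficient does.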

16

Combinatorics

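Let’s do a consistency check: the binomial probabilities must sum to 1.

Σ_{i=0..n} C(n, i) p^i (1 - p)^(n-i) = (p + (1 - p))^n = 1

where C(n, i) is the binomial coefficient.

Uses the binomial theorem: (a + b)^n = Σ_{i=0..n} C(n, i) a^i b^(n-i)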
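The consistency check can also be carried out in exact arithmetic for a few parameter choices. The (n, p) pairs below are arbitrary illustrative values and `pmf_sum` is a name introduced here.

```python
from fractions import Fraction
from math import comb

def pmf_sum(n, p):
    """Sum of C(n, i) p^i (1 - p)^(n - i) over i = 0..n."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1))

# With Fraction inputs the arithmetic is exact, so each sum is exactly 1.
for n, p in [(1, Fraction(1, 2)), (5, Fraction(1, 10)), (12, Fraction(2, 3))]:
    print(n, p, pmf_sum(n, p))
```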

17

A combinatorics refresher

Consider n objects to choose from, where r is the number to be chosen.

Permutation (order matters)
Number of permutations with repetition: n^r
Number of permutations without repetition: n! / (n - r)!

Combination (order does not matter)
Number of combinations without repetition: n! / (r! (n - r)!)
(choose r = 5 out of n = 10 different objects: 10! / (5! 5!) = 252)
Number of combinations with repetition: (n + r - 1)! / (r! (n - 1)!)
(choose r = 3 objects of n = 10 different types of objects: 12! / (3! 9!) = 220)
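The four counting rules can be cross-checked against explicit enumeration for small n and r. The sketch below uses Python's standard `math` and `itertools` modules. The function names and the dictionary keys are chosen here for illustration.

```python
from itertools import (combinations, combinations_with_replacement,
                       permutations, product)
from math import comb, perm

def count_formulas(n, r):
    """Closed-form counts for choosing r out of n objects."""
    return {
        "permutations with repetition": n**r,
        "permutations without repetition": perm(n, r),       # n! / (n - r)!
        "combinations without repetition": comb(n, r),       # n! / (r! (n - r)!)
        "combinations with repetition": comb(n + r - 1, r),
    }

def count_by_enumeration(n, r):
    """The same counts, obtained by listing every arrangement (small n, r only)."""
    items = range(n)
    return {
        "permutations with repetition": sum(1 for _ in product(items, repeat=r)),
        "permutations without repetition": sum(1 for _ in permutations(items, r)),
        "combinations without repetition": sum(1 for _ in combinations(items, r)),
        "combinations with repetition": sum(1 for _ in combinations_with_replacement(items, r)),
    }

print(count_formulas(4, 2) == count_by_enumeration(4, 2))
print(count_formulas(10, 5)["combinations without repetition"])   # 252
print(count_formulas(10, 3)["combinations with repetition"])      # 220
```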

18

Example: Binomial random variable

A machine produces items, occasionally with defects.
Probability of a defective item: 0.1. Defects are independent of each other.
What is the probability that from a sample of three items at most one will be defective?

Let X be the number of defective items in the sample; then X is a binomial random variable with parameters (3, 0.1), and

P{X ≤ 1} = P{X = 0} + P{X = 1} = (0.9)^3 + 3 (0.1) (0.9)^2 = 0.729 + 0.243 = 0.972
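The defect example amounts to adding two binomial probabilities. A short sketch, with `binomial_pmf` as an illustrative helper name:

```python
from math import comb

def binomial_pmf(i, n, p):
    """P{X = i} for a binomial random variable with parameters (n, p)."""
    return comb(n, i) * p**i * (1 - p) ** (n - i)

n, p = 3, 0.1   # sample of three items, defect probability 0.1

# "At most one defective" means X = 0 or X = 1.
at_most_one = binomial_pmf(0, n, p) + binomial_pmf(1, n, p)
print(round(at_most_one, 3))   # 0.972
```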

19

Another Example

Failures of airplane engines
Let 1 - p be the probability that the i-th engine fails (in mid-flight), so each engine keeps working with probability p.
Assumption: engine failures are independent.
The airplane is operational if at least half of its engines work.
Failures are independent => the number of engines remaining operative is a binomial random variable.

Question: For what values of p is a four-engine plane preferable to a two-engine plane?

Probability that a four-engine plane makes a successful flight (2, 3, or 4 engines working):

6 p^2 (1 - p)^2 + 4 p^3 (1 - p) + p^4

Probability that a two-engine plane makes a successful flight (1 or 2 engines working):

2 p (1 - p) + p^2

20

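Another Example Continued

So the four-engine plane is safer if

6 p^2 (1 - p)^2 + 4 p^3 (1 - p) + p^4 ≥ 2 p (1 - p) + p^2

Dividing by p > 0 and collecting terms gives 3 p^3 - 8 p^2 + 7 p - 2 ≥ 0, i.e. (p - 1)^2 (3 p - 2) ≥ 0, which holds when 3 p - 2 ≥ 0.

The four-engine plane is thus safer when the engine success probability p is at least as large as 2/3.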
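The 2/3 threshold can be confirmed numerically. The sketch below computes the probability that at least half of the engines work for both plane types, using exact fractions. The function name `success_probability` and the sample values of p are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def success_probability(engines, p):
    """P(at least half of the engines work); each works independently with probability p."""
    needed = (engines + 1) // 2   # smallest number of working engines that is at least half
    return sum(comb(engines, k) * p**k * (1 - p) ** (engines - k)
               for k in range(needed, engines + 1))

for p in (Fraction(1, 2), Fraction(2, 3), Fraction(9, 10)):
    four = success_probability(4, p)
    two = success_probability(2, p)
    print(p, four, two, four >= two)
```

At p = 2/3 both planes succeed with probability 8/9; below that value the two-engine plane is the safer one.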

21

Summary

Random Variable
  Definition
  PMF p() and CDF F()
  Indicator variable
  Bernoulli variable
  Binomial variable and Binomial distribution

Combinatorics refresher

Is that all?

No, there are a few more distributions of random variables we should look at in the near future …
