
Probability and Distributions

Deterministic vs. Random Processes

• In deterministic processes, the outcome can be predicted exactly in advance.
– E.g., Force = mass x acceleration. If we are given values for mass and acceleration, we know the value of force exactly.

• In random processes, the outcome is not known exactly, but we can still describe the probability distribution of possible outcomes.
– E.g., 10 coin tosses: we don’t know exactly how many heads we will get, but we can calculate the probability of getting a certain number of heads.
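The coin-toss example can be made concrete with a short sketch. It assumes a fair coin (p = 0.5) and independent tosses, so the number of heads follows a binomial distribution; the function name `prob_heads` is illustrative only.

```python
from math import comb

def prob_heads(k: int, n: int = 10, p: float = 0.5) -> float:
    """Probability of exactly k heads in n independent tosses (binomial)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# The outcome of any single run is unknown, but the distribution is fully known.
for k in range(11):
    print(f"P({k:2d} heads) = {prob_heads(k):.4f}")
```

The eleven probabilities sum to 1, and 5 heads is the single most likely count.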

Events

• An event is an outcome or a set of outcomes of a random process

Example: Tossing a coin three times

Event A = getting exactly two heads = {HTH, HHT, THH}

Example: Picking real number X between 1 and 20

Event A = chosen number is at most 8.23 = {X ≤ 8.23}

Example: Tossing a fair die

Event A = result is an even number = {2, 4, 6}

• Notation: P(A) = probability of event A

• Probability Rule 1: 0 ≤ P(A) ≤ 1 for any event A

Sample Space

• The sample space S of a random process is the set of all possible outcomes

Example: one coin toss

S = {H, T}

Example: three coin tosses

S = {HHH, HTH, HHT, TTT, HTT, THT, TTH, THH}

Example: roll a six-sided die

S = {1, 2, 3, 4, 5, 6}

Example: pick a real number X between 1 and 20

S = all real numbers between 1 and 20

• Probability Rule 2: The probability of the whole sample space is 1

P(S) = 1

Equally Likely Outcomes Rule

• If all possible outcomes from a random process have the same probability, then

P(A) = (# of outcomes in A) / (# of outcomes in S)

• Example: one die tossed

P(even number) = |{2, 4, 6}| / |{1, 2, 3, 4, 5, 6}| = 3/6 = 1/2

• Note: the equally likely outcomes rule only works if the number of outcomes is finite, so that the outcomes can be counted.
– E.g., picking any number between 0 and 1 has infinitely many possible outcomes. It is impossible to count them all!
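As a small illustration of the equally likely outcomes rule, the sketch below enumerates two finite sample spaces and counts outcomes. The helper name `prob` is an assumption for this example; exact fractions are used so the results match the hand calculation.

```python
from fractions import Fraction
from itertools import product

def prob(event, sample_space):
    """Equally likely outcomes rule: |A| / |S|, as an exact fraction."""
    return Fraction(len(event), len(sample_space))

# Three coin tosses: build S by enumeration, then pick out the event.
three_tosses = {"".join(t) for t in product("HT", repeat=3)}
two_heads = {s for s in three_tosses if s.count("H") == 2}

# One die: the event "even number".
die = set(range(1, 7))
even = {x for x in die if x % 2 == 0}

print(prob(two_heads, three_tosses))  # 3/8
print(prob(even, die))                # 1/2
```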

Combinations of Events

• The complement A^c of an event A is the event that A does not occur

• Probability Rule 3: P(A^c) = 1 - P(A)

• The union of two events A and B is the event that either A or B or both occur

• The intersection of two events A and B is the event that both A and B occur

[Figure: Venn diagrams showing event A, the complement of A, the union of A and B, and the intersection of A and B]

Disjoint Events

• Two events are called disjoint if they cannot happen at the same time
– Events A and B are disjoint means that the intersection of A and B is empty, so P(A and B) = 0

• Example: coin is tossed twice
– S = {HH, TH, HT, TT}
– Events A = {HH} and B = {TT} are disjoint
– Events A = {HH, HT} and B = {HH} are not disjoint

• Probability Rule 4: If A and B are disjoint events, then

P(A or B) = P(A) + P(B)
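Rules 3 and 4 can be checked directly on the two-toss example using Python sets, where `|` is union, `&` is intersection and `S - A` is the complement. This is only a sketch that assumes all four outcomes are equally likely; the helper `P` is an illustrative name.

```python
from fractions import Fraction

S = {"HH", "TH", "HT", "TT"}  # two coin tosses, assumed equally likely

def P(event):
    """Probability of an event (a subset of S) by counting outcomes."""
    return Fraction(len(event & S), len(S))

A, B = {"HH"}, {"TT"}
complement_A = S - A

assert P(complement_A) == 1 - P(A)  # Rule 3
assert A & B == set()               # A and B are disjoint
assert P(A | B) == P(A) + P(B)      # Rule 4 holds

# Not disjoint: the plain sum counts the shared outcome HH twice.
C, D = {"HH", "HT"}, {"HH"}
print(P(C | D), P(C) + P(D))  # 1/2 3/4
```

The last line shows why Rule 4 is restricted to disjoint events: for overlapping events the sum overstates P(C or D).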

Independent Events

• Events A and B are independent if knowing that A occurs does not affect the probability that B occurs

• Example: tossing two coins

Event A = first coin is a head

Event B = second coin is a head

• Disjoint events cannot be independent (when both have nonzero probability)!
– If A and B cannot occur together (disjoint), then knowing that A occurs does change the probability that B occurs: it becomes zero

• Probability Rule 5 (multiplication rule for independent events): If A and B are independent, then

P(A and B) = P(A) x P(B)

Example: P(2 heads in two tosses) = 0.5 x 0.5 = 0.25
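The multiplication rule also gives a practical test for independence: compare P(A and B) with P(A) x P(B). The sketch below does this for the two-coin example, assuming four equally likely outcomes; `independent` is a hypothetical helper name.

```python
from fractions import Fraction
from itertools import product

S = set(product("HT", repeat=2))  # outcomes as (first coin, second coin)

def P(event):
    return Fraction(len(event), len(S))

def independent(a, b):
    """True when P(a and b) equals P(a) * P(b)."""
    return P(a & b) == P(a) * P(b)

A = {s for s in S if s[0] == "H"}  # first coin is a head
B = {s for s in S if s[1] == "H"}  # second coin is a head
print(P(A & B))            # 1/4
print(independent(A, B))   # True

# Disjoint events with nonzero probability fail the test.
HH, TT = {("H", "H")}, {("T", "T")}
print(independent(HH, TT))  # False
```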

Distributions

• The magnitude of an event will vary over a range of values with time. This variation can be described by some type of distribution function.
– Frequency
– Cumulative

Frequency Distribution

• A frequency distribution is an arrangement of the values that one or more variables take in a sample. Each entry in the table contains the frequency or count of the occurrences of values within a particular group or interval.

Cumulative Distribution Function (CDF)

• The CDF gives the probability that a variable X takes a value less than or equal to a number x: F(x) = P(X ≤ x). This may also be known as the "area so far" function.

• The median flow is the flow at which the CDF equals 0.5

[Figure: Normal Distribution]
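A frequency distribution and an empirical CDF can both be built from a sample in a few lines. The sketch below uses made-up flow values and an arbitrary interval width of 10; both are assumptions for illustration, not data from these slides.

```python
from bisect import bisect_right
from collections import Counter

flows = [12, 15, 15, 18, 21, 21, 21, 25, 30, 42]  # made-up sample values

# Frequency distribution: count of values falling in each interval.
bin_width = 10
freq = Counter((v // bin_width) * bin_width for v in flows)
for low in sorted(freq):
    print(f"{low}-{low + bin_width - 1}: {freq[low]}")

ordered = sorted(flows)

def ecdf(x):
    """Empirical CDF: fraction of observations less than or equal to x."""
    return bisect_right(ordered, x) / len(ordered)

# Median read off the CDF: smallest observed value where the CDF reaches 0.5.
median = next(v for v in ordered if ecdf(v) >= 0.5)
print(ecdf(21), median)
```

The frequency table answers "how many values fall here", while the CDF answers "what share of values is at or below this point".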

Probability Distribution

• A probability is a numerical value that measures how likely it is that a particular event will occur. The probability of an event ordinarily represents the proportion of times under identical circumstances that the outcome can be expected to occur.

• A probability distribution of a discrete random variable X provides a probability for each possible value. Those probabilities must sum to 1, and they are denoted by P[X = x], where x represents any one of the possible values that the random variable may assume.

Types of Distributions

• Discrete (binary, nominal, ordinal):
– Bernoulli
– Binomial
– Poisson
– Geometric

• Continuous distributions (interval, ratio):
– Uniform
– Normal (Gaussian)
– Gamma
– Chi Square
– Student t

Statistics of a Distribution

• Central Value
– Mean
– Median
– Mode

• Variability
– Min, Max and Range
– Variance
– Standard Deviation
– Coefficient of Variation (CV): a measure of dispersion of a probability distribution (Standard Deviation / Mean)

• Shape
– Skewness: a measure of asymmetry (lack of symmetry)
– Kurtosis: a measure of whether the data are peaked or flat relative to a normal distribution

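Basic Statistics

• Mean: $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} x_i$, where n = number of observations and x_i = observation i. Excel function: AVERAGE

• Variance: $S^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{X})^2$. Excel function: VAR

• Standard Deviation: $S = \sqrt{S^2}$. Excel function: STDEV

• Coefficient of Variation: $CV = S / \bar{X}$

• Skew Coefficient: $C_s = \frac{n \sum_{i=1}^{n} (x_i - \bar{X})^3}{(n-1)(n-2)\,S^3}$. Excel function: SKEW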
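For readers working outside Excel, the same statistics can be sketched in Python. This assumes the sample (n - 1) forms of variance and standard deviation, which is what Python's `statistics.variance` and `statistics.stdev` compute, and the bias-adjusted skew formula with (n - 1)(n - 2) in the denominator. The data values and the helper name `skew_coefficient` are made up for illustration.

```python
import statistics as st

def skew_coefficient(xs):
    """Sample skew: n * sum((x - mean)^3) / ((n - 1)(n - 2) * s^3)."""
    n = len(xs)
    mean = st.mean(xs)
    s = st.stdev(xs)
    return n * sum((x - mean) ** 3 for x in xs) / ((n - 1) * (n - 2) * s**3)

data = [2, 4, 4, 4, 5, 5, 7, 9]  # made-up observations

mean = st.mean(data)          # counterpart of AVERAGE
variance = st.variance(data)  # counterpart of VAR (n - 1 denominator)
std_dev = st.stdev(data)      # counterpart of STDEV
cv = std_dev / mean           # coefficient of variation
skew = skew_coefficient(data)

print(mean, round(variance, 3), round(std_dev, 3), round(cv, 3), round(skew, 3))
```

A positive skew here reflects the longer tail toward the larger values 7 and 9.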

Other Metrics

• Central Tendency
– Mean
– Median: the point in the distribution where half of the values lie below the point and half lie above it
– Mode: the value of x at which the distribution is at its maximum

Continuous Uniform Distribution

• All events within a range have an equal chance of occurrence.

• Used in stochastic modeling

[Figure: two charts titled "Uniform": frequency (probability density function) and cumulative]

Normal Distribution

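• Symmetrical: equal number of events on either side of the mean value.

• Mean, median and mode values are equal.

• $f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$, where μ is the mean and σ is the standard deviation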

Gamma Distribution

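• A skewed distribution, not symmetric.

• Mean, median and mode are not equal.

• $f(x, k, \Theta) = \frac{x^{k-1} e^{-x/\Theta}}{\Gamma(k)\,\Theta^{k}}$ for x > 0, where k is the shape parameter and Θ is the scale parameter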
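To compare the symmetric normal density with the skewed gamma density numerically, the following sketch evaluates both. It assumes the standard normal density and the shape-scale form of the gamma density with shape k and scale Θ (written `theta` in the code); the parameter values are arbitrary illustrations.

```python
import math

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def gamma_pdf(x, k, theta):
    """Gamma density, shape k and scale theta (assumed parametrization)."""
    if x <= 0:
        return 0.0
    return x ** (k - 1) * math.exp(-x / theta) / (math.gamma(k) * theta**k)

# Normal: symmetric about the mean, so the density matches at mu - 1 and mu + 1.
print(normal_pdf(-1.0), normal_pdf(1.0))

# Gamma: for k >= 1 the mode is (k - 1) * theta and the mean is k * theta.
k, theta = 2.0, 3.0
gamma_mode = (k - 1) * theta
gamma_mean = k * theta
print(gamma_mode, gamma_mean)
print(gamma_pdf(gamma_mode, k, theta), gamma_pdf(gamma_mean, k, theta))
```

With these values the gamma peak sits to the left of its mean, which is the right-skew described in the bullets above.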

Inference

• Most spatial analysis is based on comparing sample events to theoretical distributions.

• With a normal distribution
– +/- 1 standard deviation = 0.68 of the events
– +/- 2 standard deviations = 0.955 of the events
– +/- 3 standard deviations = 0.997 of the events

• P(x > +3SD) = 0.0015

• Z statistic: normal deviate transformation
– Z = (X – Expected Mean of X) / Expected SD of X
– Z = (10 – 5) / 1.5 = +3.33
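The coverage figures and the Z transformation can be reproduced with the error function from Python's standard library. This is a sketch assuming an exactly normal distribution; the function names are illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, P(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def within(k):
    """Share of events within +/- k standard deviations of the mean."""
    return normal_cdf(k) - normal_cdf(-k)

def z_score(x, expected_mean, expected_sd):
    """Normal deviate transformation."""
    return (x - expected_mean) / expected_sd

for k in (1, 2, 3):
    print(k, round(within(k), 4))

z = z_score(10, 5, 1.5)
print(round(z, 2), 1.0 - normal_cdf(z))  # Z value and its upper-tail probability
```

The computed shares are about 0.6827, 0.9545 and 0.9973. The exact upper tail beyond +3 SD is about 0.00135, which is consistent with the rounded 0.0015 obtained from (1 - 0.997) / 2.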

Nearest Neighbor Analysis

Nearest neighbor analysis examines the distances between each point and the closest point to it, and then compares these to expected values for a random sample of points from a CSR (complete spatial randomness) pattern. CSR is generated by means of two assumptions: 1) that all places are equally likely to be the recipient of a case (event) and 2) all cases are located independently of one another.

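The mean nearest neighbor distance:

$\bar{d} = \frac{1}{N}\sum_{i=1}^{N} d_i$

where N is the number of points and d_i is the nearest neighbor distance for point i.

The expected value of the nearest neighbor distance in a random pattern:

$E(d_i) = 0.5\sqrt{\frac{A}{N}} + \left(0.0514 + \frac{0.041}{\sqrt{N}}\right)\frac{B}{N}$

where A is the area and B is the length of the perimeter of the study area.

The variance:

$Var(\bar{d}) = 0.070\,\frac{A}{N^2} + 0.037\,B\sqrt{\frac{A}{N^5}}$

The nearest neighbor distance ratio:

$R = \frac{\text{Observed Mean NND}}{\text{Expected Mean NND for Random Points}}$

And the Z statistic:

$Z = \frac{\bar{d} - E(d_i)}{\sqrt{Var(\bar{d})}}$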

This approach assumes:

• The study area is a regular rectangle or square. The equations for the expected mean and variance cannot be used for irregularly shaped study areas.

• Area (A) is calculated by (Xmax – Xmin) * (Ymax – Ymin), where these represent the study area boundaries.

R statistic = Observed Mean d / Expected d

R = 1: random; R → 0: clustered; R → 2+: uniform
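The nearest neighbor calculation can be sketched as below. The edge-corrected formulas for the expected mean and variance are an assumption here: they are a commonly used form that reproduces the worked values given for this section (E(di) = 0.05222 and Var(d) = 8.48 x 10^-6 for A = 1, B = 4) when N = 100, and N = 100 is itself inferred rather than stated. All function names are illustrative.

```python
import math

def mean_nnd(points):
    """Observed mean nearest neighbor distance (brute force, O(N^2))."""
    total = 0.0
    for i, (xi, yi) in enumerate(points):
        total += min(
            math.hypot(xi - xj, yi - yj)
            for j, (xj, yj) in enumerate(points)
            if j != i
        )
    return total / len(points)

def expected_nnd(n, area, perimeter):
    """Expected mean NND under CSR with an edge correction (assumed form)."""
    return 0.5 * math.sqrt(area / n) + (0.0514 + 0.041 / math.sqrt(n)) * perimeter / n

def variance_nnd(n, area, perimeter):
    """Variance of the mean NND under CSR (assumed form)."""
    return 0.070 * area / n**2 + 0.037 * perimeter * math.sqrt(area / n**5)

def nn_statistics(points, area, perimeter):
    """Return (R, Z) for a point pattern in a rectangular study area."""
    n = len(points)
    observed = mean_nnd(points)
    expected = expected_nnd(n, area, perimeter)
    z = (observed - expected) / math.sqrt(variance_nnd(n, area, perimeter))
    return observed / expected, z

# A perfectly regular 10 x 10 grid in the unit square (A = 1, B = 4).
grid = [((i + 0.5) / 10, (j + 0.5) / 10) for i in range(10) for j in range(10)]
r, z = nn_statistics(grid, area=1.0, perimeter=4.0)
print(round(r, 3), round(z, 1))
```

For the regular grid, R comes out close to 2 with a large positive Z, matching the "uniform" end of the scale.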

2 x 0.5 study area: A = 1, B = 5, E(di) = 0.05277, Var(d) = 8.85 x 10^-6

A = 1, B = 4 (a 1 x 1 square): E(di) = 0.05222, Var(d) = 8.48 x 10^-6

2 x 2: E(di) = 0.10444

Real world study areas are complex and violate the assumptions of most equations for expected values.

Wilderness Campsites

Solution

* Simulate randomization using Monte Carlo methods. Compare the simulated distribution to the observed one.

* If possible, use the “true” area and perimeter to compute the expected value.

* Software that does not ask for area/perimeter or a shapefile of the study area will assume a rectangle based on the minimum and maximum coordinates.
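A Monte Carlo comparison might look like the sketch below. It is a minimal illustration, not a specific package's method: random patterns are generated by a user-supplied `sample_point` function, so an irregular study area could be handled by swapping in a sampler that only returns points inside the true boundary. The unit-square sampler, the synthetic clustered pattern, the number of runs and all names are assumptions.

```python
import math
import random

def mean_nnd(points):
    """Observed mean nearest neighbor distance (brute force)."""
    return sum(
        min(math.hypot(x - u, y - v) for j, (u, v) in enumerate(points) if j != i)
        for i, (x, y) in enumerate(points)
    ) / len(points)

def simulate_csr_mean_nnd(n, sample_point, runs=199, seed=0):
    """Mean NND for `runs` random patterns of n points each, sorted ascending."""
    rng = random.Random(seed)
    return sorted(
        mean_nnd([sample_point(rng) for _ in range(n)]) for _ in range(runs)
    )

def unit_square(rng):
    """Sampler for a square study area; replace for an irregular area."""
    return (rng.random(), rng.random())

def p_value_clustered(observed, simulated):
    """Share of patterns (simulated plus observed) with mean NND <= observed."""
    return (1 + sum(s <= observed for s in simulated)) / (len(simulated) + 1)

# Synthetic "observed" pattern: 30 points squeezed into a 0.1 x 0.1 corner.
rng = random.Random(2)
clustered = [(0.1 * rng.random(), 0.1 * rng.random()) for _ in range(30)]

simulated = simulate_csr_mean_nnd(len(clustered), unit_square)
observed = mean_nnd(clustered)
p_value = p_value_clustered(observed, simulated)
print(round(observed, 4), round(simulated[0], 4), p_value)
```

Because the comparison is made against patterns simulated inside the same study area, no closed-form expected value or perimeter correction is needed.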
