ECE 6540, Lecture 01: Introduction and Review of Probability
Estimation Theory: Some Definitions
Definitions Question: What is a statistic? How do we define it?
Answer: A statistic is any function of sampled data
The function must not depend on unknown parameters of the data’s underlying probability distribution; it must be computable from the data alone
Definitions Examples:
$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,2)$
Are these statistics?
- $x$: Yes!
- $x + y$: Yes!
- $x^2$: Yes!
- $x^x - y\ln(x + 2) + 3y$: Yes!
- $E[x]$: No! (computing it requires the distribution, not just the sampled data)
Definitions Question: What is an estimator? How do we define it?
Answer: An estimator is a statistic used to estimate a specific unknown value (for example, a parameter of the underlying distribution)
Definitions Examples:
$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,1)$
A familiar statistic, $\frac{1}{2}(x + y)$, is an estimator of what?
Is it a good estimator? Why or why not?
Definitions Examples:
$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,1)$
A less familiar statistic, $\frac{2}{3}x + \frac{1}{3}y$, is an estimator of what?
Is it a good estimator? Why or why not?
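One way to explore the two questions above is a small simulation. The sketch below assumes both statistics are meant as estimators of the common mean of x and y (which is 0 here) and compares their empirical mean and variance. The function name, trial count, and seed are illustrative choices, not part of the lecture.

```python
import random
import statistics

def compare_estimators(trials=50_000, seed=0):
    """Monte Carlo mean and variance of two estimators of the common mean of x and y."""
    rng = random.Random(seed)
    equal_weights = []    # (1/2) x + (1/2) y
    unequal_weights = []  # (2/3) x + (1/3) y
    for _ in range(trials):
        x = rng.gauss(0.0, 1.0)
        y = rng.gauss(0.0, 1.0)
        equal_weights.append(0.5 * (x + y))
        unequal_weights.append((2.0 * x + y) / 3.0)
    return {
        "equal": (statistics.fmean(equal_weights), statistics.pvariance(equal_weights)),
        "unequal": (statistics.fmean(unequal_weights), statistics.pvariance(unequal_weights)),
    }

results = compare_estimators()
for name, (mean, var) in results.items():
    print(f"{name:8s} mean = {mean:+.4f}  variance = {var:.4f}")
```

Both estimators average to roughly 0, so neither is biased. For independent unit-variance samples the variances are (1/4 + 1/4) = 1/2 and (4/9 + 1/9) = 5/9, so under a minimum-variance criterion the equal-weight average comes out ahead.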
Definitions Question: What is estimation theory? How do we define it?
Answer: Estimation theory is the study of estimators and their properties.
Definitions Question: What is a [statistical] detector (not to be confused with a communications detector)?
Answer: (Warning: the definition is a bit fuzzy) A detector is a statistic or process that determines the presence of a signal within noise
A [hypothesis] test is a method for determining what distribution a detector (statistic) belongs to
Definitions Example:
$n \sim \mathcal{N}(0,1)$, $x$ is any signal
Hypothesis Test
Null Hypothesis: $y = n$
Alternative Hypothesis: $y = x + n$
What is a good detector?
Optimal detector: $s = y^2$
Optimal test: $s > \lambda$ (threshold $\lambda$ is determined from a Chi-square distribution)
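The detector on this slide can be sketched in a few lines. Under the null hypothesis, s = y^2 is chi-square with one degree of freedom, so the threshold for a chosen false-alarm probability follows from the standard normal quantile. The false-alarm level of 0.05, the signal amplitude x = 3, and all function names below are assumptions made for illustration only.

```python
import random
from statistics import NormalDist

def chi2_1dof_threshold(p_false_alarm):
    """Threshold lam with Pr(n^2 > lam) = p_false_alarm for n ~ N(0,1).

    n^2 is chi-square with 1 degree of freedom, so
    Pr(n^2 > lam) = 2 * Pr(n > sqrt(lam)).
    """
    z = NormalDist().inv_cdf(1.0 - p_false_alarm / 2.0)
    return z * z

def detect(y, lam):
    """Energy detector: declare 'signal present' when s = y^2 exceeds lam."""
    return y * y > lam

def simulate(signal, lam, trials=20_000, seed=1):
    """Fraction of trials in which the detector fires for y = signal + n."""
    rng = random.Random(seed)
    hits = sum(detect(signal + rng.gauss(0.0, 1.0), lam) for _ in range(trials))
    return hits / trials

lam = chi2_1dof_threshold(0.05)
p_fa = simulate(signal=0.0, lam=lam)  # null hypothesis: y = n
p_d = simulate(signal=3.0, lam=lam)   # alternative: y = x + n, with a hypothetical x = 3
print(f"threshold = {lam:.3f}, false-alarm rate = {p_fa:.3f}, detection rate = {p_d:.3f}")
```

The empirical false-alarm rate should sit near the design value of 0.05, while the detection rate depends on how large the signal is relative to the noise.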
Definitions Question: What is detection theory? How do we define it?
Answer: Detection theory is the study of detectors and their properties.
Definitions Question: In engineering, how do we define the “best” or “optimal” of something (e.g., the best estimator or the best detector)?
Answer: Trick question. “Best” and “optimal” are always based on some criteria that WE define.
Applications
Definitions Question: What are some applications of estimation theory?
Applications RADAR (Radio Detection And Ranging) / Sonar
Detection: Is there a reflection from an aircraft?
Estimation: How far is the aircraft / what is its precise location?
Related (Waveform Design): Can I design waveforms to make the above easier/harder?
Detection Theory Estimation Theory
Credit: https://en.wikipedia.org/wiki/Radar
Applications Communications Detection: Did I receive a message?
Estimation: What is the message?
Related (Coding): Can I design codes to make the above easier/harder?
Credit: http://www.ohlone.edu/instr/speech/longdesc-diagramcommunication.html
Applications It is pervasive in signal processing
Estimation theory = applied statistics
Most modern signal processing tools involve statistics
- Array processing
- Compressive sensing (most proofs are probability based)
- Network science (probabilistic graphical models)
- Optimal filter design
- De-noising
- Tracking (e.g., Kalman filter)
- Statistical Modelling / Analysis
Applications Two examples:
Gaussian Random Variable
- What is an optimal estimate for the expected value?
Laplace Random Variable
- What is an optimal estimate for the expected value?
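The two questions above can be probed numerically. The sketch below assumes mean squared error as the criterion (one possible choice of "optimal") and compares the sample mean against the sample median for Gaussian and Laplace data with true mean 0. The sample size, trial count, and helper names are illustrative assumptions.

```python
import random
import statistics

def mse_of_mean_and_median(draw, true_mean, n=51, trials=2_000, seed=2):
    """Return (MSE of sample mean, MSE of sample median) over repeated experiments."""
    rng = random.Random(seed)
    se_mean = 0.0
    se_median = 0.0
    for _ in range(trials):
        data = [draw(rng) for _ in range(n)]
        se_mean += (statistics.fmean(data) - true_mean) ** 2
        se_median += (statistics.median(data) - true_mean) ** 2
    return se_mean / trials, se_median / trials

def draw_gaussian(rng):
    return rng.gauss(0.0, 1.0)

def draw_laplace(rng):
    # The difference of two independent unit-rate exponentials is Laplace(0, 1).
    return rng.expovariate(1.0) - rng.expovariate(1.0)

gauss_mse = mse_of_mean_and_median(draw_gaussian, 0.0)
laplace_mse = mse_of_mean_and_median(draw_laplace, 0.0)
print(f"Gaussian: mean MSE = {gauss_mse[0]:.4f}, median MSE = {gauss_mse[1]:.4f}")
print(f"Laplace:  mean MSE = {laplace_mse[0]:.4f}, median MSE = {laplace_mse[1]:.4f}")
```

In this experiment the sample mean has the lower error for Gaussian data and the sample median has the lower error for Laplace data, which suggests that the answer depends on the distribution as well as the criterion.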
Schedule
Schedule Part 1: Classical Estimation Theory
- Minimum Mean Square Error Estimators
- Minimum Variance Unbiased Estimators
- Maximum Likelihood Estimators
Part 2: Bayesian Estimation Theory
- Maximum a Posteriori (MAP) Estimators
- Minimax Estimators
Part 3: Detection Theory
- Neyman-Pearson Tests
- Generalized Maximum Likelihood Tests
Schedule Question: What is the difference between classical and Bayesian statistics? Why are these differences important?
Schedule Quick warning! In the first few classes, I am going to throw a lot of information at you (some of which you may know, some of which you may not).
I do not expect you to retain everything 100%. My goal is to expose you to these concepts and make you more comfortable with them.
Probability Review (with some things you may not have seen)
Probability Review Quick Note: Notation everywhere is different
We will try to stick with Kay’s notation
Probability Review Probability events:
- Probability of ‘event’ A: $\Pr(A)$
- Probability of ‘event’ A AND B: $\Pr(A \cap B) = \Pr(A, B)$
- Probability of ‘event’ A OR B: $\Pr(A \cup B)$
- Probability of ‘event’ A, given ‘event’ B: $\Pr(A \mid B)$
[Figure: Venn diagram of Event A and Event B inside the universe $\Omega$. For $\Pr(A \mid B)$, event B becomes the new universe, $\Omega = B$.]
Probability Review EXAMPLE: Probability events (fair coin flips):
Event $A$ → Coin 1 is heads
Event $B$ → Coin 2 is heads
- Probability of ‘event’ A: $\Pr(A) = 1/2$
- Probability of ‘event’ A AND B: $\Pr(A \cap B) = \Pr(A, B) = 1/4$
- Probability of ‘event’ A OR B: $\Pr(A \cup B) = 3/4$
- Probability of ‘event’ A, given ‘event’ B: $\Pr(A \mid B) = 1/2$
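The four values in the coin-flip example can be checked by enumerating the sample space. This is a minimal sketch that assumes two fair coins with four equally likely outcomes; the helper `pr` and the lambda names are illustrative.

```python
from fractions import Fraction
from itertools import product

outcomes = list(product("HT", repeat=2))  # sample space: (coin 1, coin 2)

def pr(event):
    """Probability of an event (a predicate on outcomes) under equally likely outcomes."""
    return Fraction(sum(1 for w in outcomes if event(w)), len(outcomes))

A = lambda w: w[0] == "H"  # coin 1 is heads
B = lambda w: w[1] == "H"  # coin 2 is heads

p_a = pr(A)
p_a_and_b = pr(lambda w: A(w) and B(w))
p_a_or_b = pr(lambda w: A(w) or B(w))
p_a_given_b = p_a_and_b / pr(B)  # restrict the universe to B
print(p_a, p_a_and_b, p_a_or_b, p_a_given_b)  # 1/2 1/4 3/4 1/2
```

Dividing by Pr(B) in the last step is the "new universe" idea: only outcomes inside B are counted.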
Probability Review Relationships between probability events:
- Chain Rule: $\Pr(A \cap B) = \Pr(A)\Pr(B \mid A) = \Pr(B)\Pr(A \mid B)$ (B is the new universe, $\Omega = B$; weight $\Pr(A \mid B)$ by the probability to be in B)
- Bayes Theorem: $\Pr(A \mid B) = \dfrac{\Pr(A)\Pr(B \mid A)}{\Pr(B)}$ (derive from the chain rule)
- “OR” Rule: $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$ (remove the overlap of Event A and Event B)
- Independent Events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$, so $\Pr(B) = \Pr(B \mid A)$
- Disjoint Events: $\Pr(A \cap B) = 0$ (Event A and Event B do not overlap)
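The step from the chain rule to Bayes Theorem is short enough to write out. The derivation below assumes only that $\Pr(B) > 0$ so that the division is defined.

```latex
\begin{align*}
\Pr(A)\Pr(B \mid A) &= \Pr(B)\Pr(A \mid B)
  && \text{(both sides equal } \Pr(A \cap B)\text{)} \\
\Pr(A \mid B) &= \frac{\Pr(A)\Pr(B \mid A)}{\Pr(B)}
  && \text{(divide both sides by } \Pr(B) > 0\text{)}
\end{align*}
```

For two fair coins with A = "coin 1 is heads" and B = "coin 2 is heads", this gives $\Pr(A \mid B) = \frac{(1/2)(1/2)}{1/2} = 1/2$.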
Probability Review Random Variables:
Continuous-valued random variables $X$
- Capital-case ($X$) means random (this is the notation we will use)
- Lower-case ($x$) means fixed value
Probability Density Functions (PDF) with parameter $\theta$
- $p_\theta(x)$
- $p_{X,\theta}(x)$
- $p(x; \theta)$ (Kay’s notation)
All three notations mean the same thing!
Probability Density Functions and Cumulative Distribution Functions
Probability Review Probability Density Functions (PDF)
Definition: A valid PDF is any function $p(x)$ that is both
- Non-negative: $p(x) \ge 0$
- Unit area: $\int_{-\infty}^{\infty} p(x)\,dx = 1$
Credit: https://commons.wikimedia.org/wiki/File:Normal_Distribution_PDF.svg
Probability Review Cumulative Distribution Functions (CDF)
Definition: A valid CDF is any function $F(x)$ that is both
- Monotonically non-decreasing
- Normalized: $F(-\infty) = 0$, $F(\infty) = 1$
From a PDF as
- $P(x; \theta) = \Pr(X \le x) = \int_{-\infty}^{x} p(\tau; \theta)\,d\tau$
Credit: https://en.wikipedia.org/wiki/Normal_distribution#/media/File:Normal_Distribution_CDF.svg
Probability Review Gaussian (or Normal) Random Variable
$X \sim \mathcal{N}(\mu, \sigma^2)$, with mean $\mu$ and variance $\sigma^2$
PDF is also known as the “bell curve”
$p(x; \mu, \sigma) = \dfrac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\dfrac{(x - \mu)^2}{2\sigma^2}\right)$
Credit: https://commons.wikimedia.org/wiki/File:Normal_Distribution_PDF.svg
Probability Review Cumulative Distribution Function (CDF):
$P(x; \theta) = \Pr(X \le x) = \int_{-\infty}^{x} p(\tau; \theta)\,d\tau$
Credit: https://en.wikipedia.org/wiki/Normal_distribution#/media/File:Normal_Distribution_CDF.svg
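The unit-area property and the PDF-to-CDF integral can be checked numerically for the Gaussian case. The sketch below uses a plain trapezoidal rule and truncates the infinite limits at ten standard deviations. The parameter values and helper names are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian PDF p(x; mu, sigma)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma**2)) / math.sqrt(2.0 * math.pi * sigma**2)

def integrate(f, lo, hi, steps=20_000):
    """Composite trapezoidal rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi)) + sum(f(lo + k * h) for k in range(1, steps))
    return total * h

mu, sigma = 1.0, 2.0       # hypothetical parameter values
lo = mu - 10.0 * sigma     # stands in for -infinity
area = integrate(lambda t: gaussian_pdf(t, mu, sigma), lo, mu + 10.0 * sigma)
cdf_at_2 = integrate(lambda t: gaussian_pdf(t, mu, sigma), lo, 2.0)
reference = NormalDist(mu, sigma).cdf(2.0)
print(f"area under PDF = {area:.6f}")
print(f"CDF at x = 2: integration = {cdf_at_2:.6f}, NormalDist = {reference:.6f}")
```

The comparison against `statistics.NormalDist.cdf` is only a sanity check that integrating the PDF up to x reproduces the CDF.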
Probability Review Example: Uniform Random Variable
$X \sim \text{uniform}(a, b)$
$p(x; a, b) = \begin{cases} \dfrac{1}{b - a} & \text{for } a \le x \le b \\ 0 & \text{otherwise} \end{cases}$
Used for many applications
Credit: http://www.epixanalytics.com/modelassist/CrystalBall/Model_Assist.htm#Distributions/Continuous_distributions/Uniform.htm
Probability Review Example: Beta Random Variable
$X \sim \text{beta}(\alpha, \beta)$
$p(x; \alpha, \beta) = \dfrac{1}{B(\alpha, \beta)}\, x^{\alpha - 1} (1 - x)^{\beta - 1}$, where $B(\alpha, \beta)$ is the Beta function
Used in control systems, population genetics, Bayesian inference
Credit: https://en.wikipedia.org/wiki/Beta_distribution#/media/File:Beta_distribution_pdf.svg
Probability Review Example: Chi-squared random variable
$X \sim \chi^2_N$
$p(x; N) = \begin{cases} \dfrac{x^{N/2 - 1} e^{-x/2}}{2^{N/2}\,\Gamma(N/2)} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$
Used in detection theory
Credit: https://en.wikipedia.org/wiki/Chi-squared_distribution#/media/File:Chi-square_pdf.svg
Probability Review Transformation of Random Variables (Common Transformations)
Let $x$ and $y$ be independent such that $x \sim \mathcal{N}(0,1)$ and $y \sim \mathcal{N}(0,1)$
$x + y \sim$ normal
$x^2 \sim$ chi-squared
𝑛𝑛 𝑛𝑛−1𝑛𝑛
𝑥𝑥+𝑦𝑦𝑥𝑥2+𝑦𝑦2
~ Student’s t
$\dfrac{x^2}{y^2} \sim F$
$\dfrac{x}{y} \sim$ Cauchy
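A few of these transformations can be sanity-checked by simulation. The sketch below draws independent standard normal pairs and checks one known property of each result: variance 2 for the sum, mean 1 for the square (chi-squared with one degree of freedom), and median 1 for the magnitude of the ratio (a standard Cauchy variable has no mean, so a probability is checked instead). The sample size and seed are arbitrary choices.

```python
import random
import statistics

rng = random.Random(3)
n = 100_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rng.gauss(0.0, 1.0) for _ in range(n)]

# x + y is normal with mean 0 and variance 1 + 1 = 2.
sum_var = statistics.pvariance([x + y for x, y in zip(xs, ys)])

# x^2 is chi-squared with 1 degree of freedom, which has mean 1.
sq_mean = statistics.fmean([x * x for x in xs])

# x / y is standard Cauchy, so Pr(|x / y| <= 1) = 1/2.
# Comparing |x| with |y| tests the same event without dividing.
ratio_within_one = sum(abs(x) <= abs(y) for x, y in zip(xs, ys)) / n

print(f"var(x + y) = {sum_var:.3f}, mean(x^2) = {sq_mean:.3f}, "
      f"Pr(|x/y| <= 1) = {ratio_within_one:.3f}")
```

The same event, $x^2 / y^2 \le 1$, also has probability 1/2 under the F distribution with (1, 1) degrees of freedom, so the last check covers both ratio rows.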
Expectations and Moments
Probability Review Random Variables
$X$ <- denoted by capital letter usually
Two types: discrete, continuous
Defined by a probability density function (PDF) and cumulative distribution function (CDF)
For discrete-valued random variables, the PDF is replaced by a probability mass function (PMF)
Probability Review Expectation
$E[X] = \int_{-\infty}^{\infty} x\, p(x)\,dx$
Expectation of a function
$E[g(X)] = \int_{-\infty}^{\infty} g(x)\, p(x)\,dx$
Expectation with an unknown parameter
$E[g(X); \theta] = \int_{-\infty}^{\infty} g(x)\, p(x; \theta)\,dx$
Useful!! It can be easy to find the expectation of a function of $X$ without finding the PDF of that function
Probability Review Moments
$X \sim \mathcal{N}(m, \sigma^2)$, $Y = g(X) = (X - m)^2$
- Mean: $E[X] = \int_{-\infty}^{\infty} x\, p(x)\,dx$
- 2nd Moment: $E[X^2] = \int_{-\infty}^{\infty} x^2\, p(x)\,dx$
- Variance: $E[(X - m)^2] = \int_{-\infty}^{\infty} (x - m)^2\, p(x)\,dx$
Note: In general, $X^2$, $(X - E[X])^2$, … do not have the same PDF as $X$
Probability Review Expectation Examples
Let
- $X \sim \mathcal{N}(m, \sigma^2)$
- $Y = g(X) = (X - m)^2$
Compute
- $E[X; m, \sigma^2] = m$
- $E[g(X); m, \sigma^2] = \sigma^2$
- $E[X + 10; m, \sigma^2] = E[X; m, \sigma^2] + 10 = m + 10$
- $E[g(2X); m, \sigma^2] = E[(2X - m)^2] = 4\sigma^2 + m^2$ (which reduces to $4\sigma^2$ when $m = 0$)
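These expectations can be checked with a quick Monte Carlo run. The sketch below assumes the illustrative values m = 1.5 and sigma = 2 and keeps g(x) = (x - m)^2 exactly as defined on the slide. With that definition, E[g(2X)] = E[(2X - m)^2] = 4 sigma^2 + m^2, which equals 4 sigma^2 only when m = 0.

```python
import random
import statistics

m, sigma = 1.5, 2.0  # hypothetical mean and standard deviation
rng = random.Random(4)
xs = [rng.gauss(m, sigma) for _ in range(200_000)]

def g(x):
    """g(x) = (x - m)^2, as defined on the slide."""
    return (x - m) ** 2

e_x = statistics.fmean(xs)                           # expect m
e_g = statistics.fmean([g(x) for x in xs])           # expect sigma^2
e_shift = statistics.fmean([x + 10.0 for x in xs])   # expect m + 10
e_g2 = statistics.fmean([g(2.0 * x) for x in xs])    # expect 4 sigma^2 + m^2

print(f"E[X] = {e_x:.3f} (theory {m})")
print(f"E[g(X)] = {e_g:.3f} (theory {sigma**2})")
print(f"E[X + 10] = {e_shift:.3f} (theory {m + 10.0})")
print(f"E[g(2X)] = {e_g2:.3f} (theory {4.0 * sigma**2 + m**2})")
```

None of these estimates required the PDF of g(X); averaging g over samples of X is enough.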
Probability Review Expectation Examples
Let
- $X \sim \text{exponential}(\lambda)$
- $p(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$
Compute
- $E[X; \lambda] = \int_0^\infty x\, p(x; \lambda)\,dx = \int_0^\infty x \lambda e^{-\lambda x}\,dx$
$E[X; \lambda] = \left[ \dfrac{-e^{-\lambda x}(\lambda x + 1)}{\lambda} \right]_0^\infty = \lim_{x \to \infty} \dfrac{-e^{-\lambda x}(\lambda x + 1)}{\lambda} + \dfrac{e^{0}(0 + 1)}{\lambda}$
The limit term is 0 by L’Hospital’s Rule, so
$E[X; \lambda] = \dfrac{1}{\lambda}$
Probability Review Expectation Examples
Let
- $X \sim \text{exponential}(\lambda)$
- $p(x; \lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \ge 0 \\ 0 & \text{for } x < 0 \end{cases}$
Compute
- $E[X^2; \lambda] = \int_0^\infty x^2\, p(x; \lambda)\,dx = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx$
$E[X^2; \lambda] = \left[ \dfrac{-e^{-\lambda x}(\lambda^2 x^2 + 2\lambda x + 2)}{\lambda^2} \right]_0^\infty = \lim_{x \to \infty} \dfrac{-e^{-\lambda x}(\lambda^2 x^2 + 2\lambda x + 2)}{\lambda^2} + \dfrac{e^{0}(0 + 0 + 2)}{\lambda^2}$
The limit term is 0, so
$E[X^2; \lambda] = \dfrac{2}{\lambda^2}$
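Both exponential moments can be confirmed by integrating numerically instead of by parts. The sketch below uses a trapezoidal rule on a truncated interval. The rate lambda = 2.5, the truncation point, and the function names are assumptions for illustration.

```python
import math

def exp_pdf(x, lam):
    """Exponential PDF p(x; lam)."""
    return lam * math.exp(-lam * x) if x >= 0.0 else 0.0

def moment(k, lam, steps=100_000):
    """Trapezoidal approximation of E[X^k], the integral of x^k p(x; lam) over [0, inf)."""
    hi = 60.0 / lam  # upper limit; the neglected tail is of order exp(-60)
    h = hi / steps
    f = lambda x: x**k * exp_pdf(x, lam)
    return h * (0.5 * (f(0.0) + f(hi)) + sum(f(i * h) for i in range(1, steps)))

lam = 2.5  # hypothetical rate
first = moment(1, lam)
second = moment(2, lam)
print(f"E[X]   = {first:.6f} (theory {1.0 / lam:.6f})")
print(f"E[X^2] = {second:.6f} (theory {2.0 / lam**2:.6f})")
```

Combining the two results gives the variance $E[X^2] - E[X]^2 = 2/\lambda^2 - 1/\lambda^2 = 1/\lambda^2$.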
Two Random Variables (and their relationships)
Several Random Variables Joint PDFs of Random Variables
$p(x, y)$: Joint PDF of 2 random variables
Vector form
Define $\mathbf{W} = \begin{bmatrix} X \\ Y \end{bmatrix}$, $\mathbf{w} = \begin{bmatrix} x \\ y \end{bmatrix}$
$p(\mathbf{w}) = p(x, y)$: Joint PDF of 2 random variables (short version)
Credit: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#/media/File:MultivariateNormal.png
Several Random Variables Chain Rule for 2 Random Variables
$p(x, y) = p(x)\,p(y \mid x) = p(y)\,p(x \mid y)$
If Random Variables are independent
$p(x, y) = p(x)\,p(y)$
Several Random Variables Moments of two random variables $X, Y$
- Mean: $E[X]$, $E[Y]$
- 2nd Moment: $E[X^2]$, $E[Y^2]$
- Variance: $E[(X - m_x)^2]$, $E[(Y - m_y)^2]$
- Cross-Correlation: $E[XY]$
- Covariance: $E[(X - m_x)(Y - m_y)] = E[XY] - E[X]E[Y]$
If $X, Y$ are uncorrelated
- Cross-Correlation: $E[XY] = E[X]E[Y]$
- Covariance: $E[(X - E[X])(Y - E[Y])] = 0$
- Variance sum: $\text{var}(X + Y) = \text{var}(X) + \text{var}(Y)$
Important Note:
- Independent implies uncorrelated, but uncorrelated does not imply independent
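A standard counterexample shows why zero correlation does not give independence. The sketch below assumes X ~ N(0,1) and Y = X^2 (a pairing chosen for illustration, not taken from the slides). Y is completely determined by X, yet the covariance of X and Y is zero because E[X^3] = 0.

```python
import random
import statistics

rng = random.Random(5)
xs = [rng.gauss(0.0, 1.0) for _ in range(100_000)]
ys = [x * x for x in xs]  # Y is a deterministic function of X

def covariance(a, b):
    """Sample estimate of E[AB] - E[A]E[B]."""
    mean_ab = statistics.fmean([p * q for p, q in zip(a, b)])
    return mean_ab - statistics.fmean(a) * statistics.fmean(b)

cov_xy = covariance(xs, ys)                      # close to 0: uncorrelated
cov_x2_y = covariance([x * x for x in xs], ys)   # close to 2: X^2 and Y move together
print(f"cov(X, Y) = {cov_xy:+.3f}, cov(X^2, Y) = {cov_x2_y:.3f}")
```

If X and Y were independent, every function of X would be uncorrelated with Y, so the nonzero second value exposes the dependence that the first value hides.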
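Several Random Variables Example: Two Normal Distributions
Consider the two random variables
$X \sim \mathcal{N}(0,1)$, $Y \sim \mathcal{N}(x, 1)$
Compute the joint PDF $p(x, y)$
$p(x, y) = p(x)\,p(y \mid x)$ (Chain Rule)
$p(x) = \dfrac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\dfrac{x^2}{2(1)^2}\right)$
$p(y \mid x) = \dfrac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\dfrac{(y - x)^2}{2(1)^2}\right)$
$p(x, y) = \dfrac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\dfrac{x^2}{2(1)^2}\right) \dfrac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\dfrac{(y - x)^2}{2(1)^2}\right)$
$p(x, y) = \dfrac{1}{2\pi} \exp\left(-\dfrac{x^2 + (y - x)^2}{2}\right) = \dfrac{1}{2\pi} \exp\left(-\dfrac{2x^2 + y^2 - 2yx}{2}\right)$
$p(x, y) = \dfrac{1}{2\pi} \exp\left(-\dfrac{1}{2} \begin{bmatrix} x \\ y \end{bmatrix}^T \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\right)$
$\rightarrow \begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\right)$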
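The covariance matrix in this example can be cross-checked two ways: by inverting the 2x2 matrix from the exponent, and by sampling X and then Y given X and estimating the covariance directly. The sketch below is a rough check under those assumptions; the sample size, seed, and helper names are illustrative.

```python
import random
import statistics

rng = random.Random(6)
n = 100_000
xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
ys = [rng.gauss(x, 1.0) for x in xs]  # Y given X = x is N(x, 1)

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    return statistics.fmean([(p - ma) * (q - mb) for p, q in zip(a, b)])

sample_cov = [[cov(xs, xs), cov(xs, ys)],
              [cov(ys, xs), cov(ys, ys)]]

# Invert the 2x2 matrix from the exponent, [[2, -1], [-1, 1]].
a, b, c, d = 2.0, -1.0, -1.0, 1.0
det = a * d - b * c
inverse = [[d / det, -b / det],
           [-c / det, a / det]]

print("inverse of exponent matrix:", inverse)
print("sample covariance:", [[round(v, 3) for v in row] for row in sample_cov])
```

Both routes should land near the matrix with rows (1, 1) and (1, 2): var(X) = 1, var(Y) = 2 because Y adds independent unit-variance noise to X, and cov(X, Y) = 1.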
Several Random Variables Example: Two Normal Distributions
$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\right)$
Question: How do we interpret this distribution?