ECE 6540, Lecture 01: Introduction and Review of Probability

Estimation Theory: Some Definitions

Question: What is a statistic? How do we define it?

Answer: A statistic is any function of sampled data.

The function itself must not depend on the data’s underlying probability distribution (for example, on its unknown parameters).

Examples:

$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,2)$

Are these statistics?

$x$: Yes!

$x + y$: Yes!

$x^2$: Yes!

$xx - y\ln(x + 2) + 3y$: Yes!

$E[x]$: No! (It is a property of the distribution, not a function of the sampled data.)

Question: What is an estimator? How do we define it?

Answer: An estimator is a statistic that estimates a specific value.


$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,1)$

A familiar statistic, $\frac{1}{2}(x + y)$, is an estimator of what?

Is it a good estimator? Why or why not?


$x \sim \mathcal{N}(0,1)$, $y \sim \mathcal{N}(0,1)$

A less familiar statistic, $\frac{2}{3}x + \frac{1}{3}y$, is an estimator of what?

Is it a good estimator? Why or why not?
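One way to explore the two questions above is a small simulation. The sketch below assumes that both statistics are read as estimators of the common mean of $x$ and $y$ (here 0), and that $x$ and $y$ are independent; the slides leave both points open. The function name `simulate`, the trial count, and the seed are illustrative choices, not part of the lecture.

```python
import random
import statistics

def simulate(w_x, w_y, trials=20000, seed=0):
    """Empirical mean and variance of w_x*x + w_y*y for independent x, y ~ N(0, 1)."""
    rng = random.Random(seed)
    values = [w_x * rng.gauss(0.0, 1.0) + w_y * rng.gauss(0.0, 1.0)
              for _ in range(trials)]
    return statistics.fmean(values), statistics.pvariance(values)

# The familiar statistic (x + y)/2; theoretical variance 1/4 + 1/4 = 1/2.
mean_a, var_a = simulate(1 / 2, 1 / 2)
# The less familiar statistic (2/3)x + (1/3)y; theoretical variance 4/9 + 1/9 = 5/9.
mean_b, var_b = simulate(2 / 3, 1 / 3)

print(f"(1/2)x + (1/2)y: mean {mean_a:+.3f}, variance {var_a:.3f}")
print(f"(2/3)x + (1/3)y: mean {mean_b:+.3f}, variance {var_b:.3f}")
```

Under these assumptions both statistics average to the true mean, but the equally weighted one has the smaller variance, which is one possible criterion for calling it the better estimator.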

Question: What is estimation theory? How do we define it?

Answer: Estimation theory is the study of estimators and their properties.

Question: What is a [statistical] detector (not to be confused with a communications detector)?

Answer: (Warning: the definition is a bit fuzzy.) A detector is a statistic or process that determines the presence of a signal within noise.

A [hypothesis] test is a method for determining what distribution a detector (statistic) belongs to.

Example:

$n \sim \mathcal{N}(0,1)$, $x$ is any signal

Hypothesis Test

Null Hypothesis: $y = n$

Alternative Hypothesis: $y = x + n$

What is a good detector?

Optimal detector: $s = |y|^2$

Optimal test: $s > \lambda$ (the threshold $\lambda$ is determined from a chi-square distribution)
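The detector above can be sketched in a few lines. This is a minimal illustration, assuming a single real sample $y$ and a desired false-alarm probability of 0.05 (a value chosen here, not given in the lecture). Under the null hypothesis $s = |y|^2$ is chi-square with one degree of freedom, so the threshold can be taken from the standard normal quantile: $\Pr(n^2 > \lambda) = \Pr(|n| > \sqrt{\lambda})$. The helper names are hypothetical.

```python
import random
from statistics import NormalDist

def chi2_1dof_threshold(p_false_alarm):
    """Threshold lam such that Pr(n**2 > lam) = p_false_alarm for n ~ N(0, 1)."""
    z = NormalDist().inv_cdf(1.0 - p_false_alarm / 2.0)
    return z * z

def detect(y, lam):
    """Decide for the alternative hypothesis when s = |y|^2 exceeds the threshold."""
    return y * y > lam

lam = chi2_1dof_threshold(0.05)  # about 3.84

# Empirical false-alarm rate under the null hypothesis y = n.
rng = random.Random(1)
trials = 50000
false_alarms = sum(detect(rng.gauss(0.0, 1.0), lam) for _ in range(trials))
false_alarm_rate = false_alarms / trials
print(f"threshold {lam:.3f}, empirical false-alarm rate {false_alarm_rate:.4f}")
```

The empirical false-alarm rate should land close to the chosen 0.05, which is the sense in which the threshold is "determined from" the chi-square distribution.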

Question: What is detection theory? How do we define it?

Answer: Detection theory is the study of detectors and their properties.

Question: In engineering, how do we define the “best” or “optimal” of something (e.g., the best estimator or the best detector)?

Answer: Trick question. “Best” and “optimal” are always based on some criteria that WE define.

Applications

Question: What are some applications of estimation theory?

RADAR (Radio Detection And Ranging) / Sonar

Detection (Detection Theory): Is there a reflection from an aircraft?

Estimation (Estimation Theory): How far is the aircraft / what is its precise location?

Related (Waveform Design): Can I design waveforms to make the above easier/harder?

Credit: https://en.wikipedia.org/wiki/Radar

Communications

Detection: Did I receive a message?

Estimation: What is the message?

Related (Coding): Can I design codes to make the above easier/harder?

Credit: http://www.ohlone.edu/instr/speech/longdesc-diagramcommunication.html

It is pervasive in signal processing

Estimation theory = applied statistics

Most modern signal processing tools involve statistics
- Array processing
- Compressive sensing (most proofs are probability based)
- Network science (probabilistic graphical models)
- Optimal filter design
- De-noising
- Tracking (e.g., Kalman filter)
- Statistical Modelling / Analysis

Two examples:

Gaussian Random Variable
- What is an optimal estimate for the expected value?

Laplace Random Variable
- What is an optimal estimate for the expected value?
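The two questions above can be probed numerically. The sketch below assumes mean squared error as the optimality criterion and compares two candidate estimators, the sample mean and the sample median, on zero-mean, unit-scale Gaussian and Laplace data. The sample size, trial count, and helper names are illustrative choices. It relies on two standard facts: the sample mean is the maximum likelihood estimate of a Gaussian mean, and the sample median is the maximum likelihood estimate of a Laplace location.

```python
import random
import statistics

def mse(estimator, sampler, n=101, trials=2000, seed=0):
    """Mean squared error of an estimator of the expected value (true value 0)."""
    rng = random.Random(seed)
    errors = []
    for _ in range(trials):
        data = [sampler(rng) for _ in range(n)]
        errors.append(estimator(data) ** 2)
    return statistics.fmean(errors)

def gaussian(rng):
    return rng.gauss(0.0, 1.0)

def laplace(rng):
    # The difference of two independent unit exponentials is Laplace(0, 1).
    return rng.expovariate(1.0) - rng.expovariate(1.0)

mse_gauss_mean = mse(statistics.fmean, gaussian)
mse_gauss_median = mse(statistics.median, gaussian)
mse_laplace_mean = mse(statistics.fmean, laplace)
mse_laplace_median = mse(statistics.median, laplace)

print(f"Gaussian: mean {mse_gauss_mean:.4f}, median {mse_gauss_median:.4f}")
print(f"Laplace : mean {mse_laplace_mean:.4f}, median {mse_laplace_median:.4f}")
```

In this experiment the sample mean has the lower error on Gaussian data and the sample median has the lower error on Laplace data, so the preferred estimator depends on the distribution as well as on the chosen criterion.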

Schedule

Part 1: Classical Estimation Theory
- Minimum Mean Square Error Estimators
- Minimum Variance Unbiased Estimators
- Maximum Likelihood Estimators

Part 2: Bayesian Estimation Theory
- Maximum a Posteriori Estimators
- Minimax Estimators

Part 3: Detection Theory
- Neyman-Pearson Tests
- Generalized Maximum Likelihood Tests

Question: What is the difference between classical and Bayesian statistics? Why are these differences important?

Quick warning!

In the first few classes, I am going to throw a lot of information at you (some of which you may know, some of which you may not).

I do not expect you to retain everything 100%. My goal is to expose you to these concepts and make you more comfortable with them.

Probability Review (with some things you may not have seen)

Quick Note: Notation everywhere is different

We will try to stick with Kay’s notation

Probability events:

Probability of ‘event’ A
- $\Pr(A)$

Probability of ‘event’ A AND B
- $\Pr(A \cap B) = \Pr(A, B)$

Probability of ‘event’ A OR B
- $\Pr(A \cup B)$

Probability of ‘event’ A, given ‘event’ B
- $\Pr(A \mid B)$


Figure: Venn diagrams of Event A and Event B inside the universe $\Omega$. For the conditional probability $\Pr(A \mid B)$, B becomes the new universe ($\Omega = B$).

EXAMPLE: Probability events (fair coin flips):

Event $A$: Coin 1 is heads

Event $B$: Coin 2 is heads

Probability of ‘event’ A
- $\Pr(A) = 1/2$

Probability of ‘event’ A AND B
- $\Pr(A \cap B) = \Pr(A, B) = 1/4$

Probability of ‘event’ A OR B
- $\Pr(A \cup B) = 3/4$

Probability of ‘event’ A, given ‘event’ B
- $\Pr(A \mid B) = 1/2$
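The coin-flip probabilities above can be checked by enumerating the four equally likely outcomes. This is a small sketch; the helper `pr` and the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

# All outcomes (coin 1, coin 2) of two fair coin flips, each equally likely.
outcomes = list(product("HT", repeat=2))

def pr(event):
    """Probability of an event given as a predicate over outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def A(o):
    return o[0] == "H"   # coin 1 is heads

def B(o):
    return o[1] == "H"   # coin 2 is heads

p_a = pr(A)
p_a_and_b = pr(lambda o: A(o) and B(o))
p_a_or_b = pr(lambda o: A(o) or B(o))
p_a_given_b = p_a_and_b / pr(B)   # conditioning: B becomes the new universe

print(p_a, p_a_and_b, p_a_or_b, p_a_given_b)
```

Because $\Pr(A \mid B) = \Pr(A)$ here, the two coin flips are independent events.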

Relationships between probability events:

Chain Rule
- $\Pr(A \cap B) = \Pr(A)\Pr(B \mid A) = \Pr(B)\Pr(A \mid B)$

Bayes Theorem
- $\Pr(A \mid B) = \frac{\Pr(A)\Pr(B \mid A)}{\Pr(B)}$

“OR” Rule
- $\Pr(A \cup B) = \Pr(A) + \Pr(B) - \Pr(A \cap B)$

Independent Events
- $\Pr(A \cap B) = \Pr(A)\Pr(B)$

Disjoint Events
- $\Pr(A \cap B) = 0$



Notes on these relationships:

Chain Rule: conditioning on B makes B the new universe ($\Omega = B$); weight $\Pr(A \mid B)$ by the probability of being in B.

Bayes Theorem: derived from the chain rule, $\Pr(A \cap B) = \Pr(A)\Pr(B \mid A) = \Pr(B)\Pr(A \mid B)$.

“OR” Rule: add the two probabilities, then remove the overlap $\Pr(A \cap B)$.

Independent Events: $\Pr(A \cap B) = \Pr(A)\Pr(B)$, so $\Pr(B) = \Pr(B \mid A)$.

Disjoint Events: Event A and Event B do not overlap.

Random Variables:

Continuous-valued random variables $X$
- Upper-case ($X$) means random (this is the notation we will use)
- Lower-case ($x$) means fixed value

Probability Density Functions (PDF) with parameter $\theta$
- $p_\theta(x)$
- $p_{X,\theta}(x)$
- $p(x;\theta)$ (Kay’s notation)

All three notations mean the same thing!

Probability Density Functions and Cumulative Distribution Functions

Probability Density Functions (PDF)

Definition: A valid PDF is any function $p(x)$ that is both
- Non-negative: $p(x) \geq 0$
- Unit area: $\int_{-\infty}^{\infty} p(x)\,dx = 1$

Credit: https://commons.wikimedia.org/wiki/File:Normal_Distribution_PDF.svg

Cumulative Distribution Functions (CDF)

Definition: A valid CDF is any function $F(x)$ that is both
- Monotonically increasing (non-decreasing)
- Normalized: $F(-\infty) = 0$, $F(\infty) = 1$

Obtained from a PDF as
- $P(x;\theta) = \Pr(X \leq x) = \int_{-\infty}^{x} p(\tau;\theta)\,d\tau$

Credit: https://en.wikipedia.org/wiki/Normal_distribution#/media/File:Normal_Distribution_CDF.svg

Gaussian (or Normal) Random Variable

$X \sim \mathcal{N}(\mu, \sigma^2)$, with mean $\mu$ and variance $\sigma^2$

The PDF is also known as the “bell curve”

$p(x;\mu,\sigma) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$

Credit: https://commons.wikimedia.org/wiki/File:Normal_Distribution_PDF.svg
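As a quick numerical check of the PDF and CDF definitions, the sketch below integrates the Gaussian PDF with a simple trapezoidal rule. The integration limits of $\pm 10$, the step count, and the function names are assumptions chosen for illustration; the tails beyond $\pm 10$ standard deviations are negligible for the standard normal.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian PDF p(x; mu, sigma)."""
    return math.exp(-(x - mu) ** 2 / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)

def integrate(f, a, b, steps=20000):
    """Trapezoidal approximation of the integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

area = integrate(gaussian_pdf, -10.0, 10.0)        # unit-area property
cdf_at_zero = integrate(gaussian_pdf, -10.0, 0.0)  # Pr(X <= 0)
cdf_at_one = integrate(gaussian_pdf, -10.0, 1.0)   # Pr(X <= 1)

print(f"area {area:.6f}, CDF(0) {cdf_at_zero:.6f}, CDF(1) {cdf_at_one:.6f}")
```

The area comes out as 1 to numerical precision, and the CDF values increase with $x$, matching the two CDF properties listed earlier.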

Cumulative Distribution Function (CDF):

$P(x;\theta) = \Pr(X \leq x) = \int_{-\infty}^{x} p(\tau;\theta)\,d\tau$

Credit: https://en.wikipedia.org/wiki/Normal_distribution#/media/File:Normal_Distribution_CDF.svg

Example: Uniform Random Variable

$X \sim \mathrm{uniform}(a, b)$

$p(x;a,b) = \begin{cases} \frac{1}{b-a} & \text{for } a \leq x \leq b \\ 0 & \text{otherwise} \end{cases}$

Used for many applications

Credit: http://www.epixanalytics.com/modelassist/CrystalBall/Model_Assist.htm#Distributions/Continuous_distributions/Uniform.htm

Example: Beta Random Variable

$X \sim \mathrm{beta}(\alpha, \beta)$

$p(x;\alpha,\beta) = \frac{1}{B(\alpha,\beta)}\,x^{\alpha-1}(1-x)^{\beta-1}$, where $B(\alpha,\beta)$ is the Beta function

Used in control systems, population genetics, Bayesian inference

Credit: https://en.wikipedia.org/wiki/Beta_distribution#/media/File:Beta_distribution_pdf.svg

Example: Chi-squared random variable

$X \sim \chi^2_N$

$p(x;N) = \begin{cases} \frac{x^{N/2-1} e^{-x/2}}{2^{N/2}\,\Gamma(N/2)} & \text{for } x > 0 \\ 0 & \text{otherwise} \end{cases}$

Used in detection theory

Credit: https://en.wikipedia.org/wiki/Chi-squared_distribution#/media/File:Chi-square_pdf.svg


Transformation of Random Variables (Common Transformations)

Let $x$ and $y$ be independent such that $x \sim \mathcal{N}(0,1)$ and $y \sim \mathcal{N}(0,1)$

$x + y \sim$ normal

$|x|^2 \sim$ chi-squared

𝑛𝑛 𝑛𝑛−1𝑛𝑛

𝑥𝑥+𝑦𝑦𝑥𝑥2+𝑦𝑦2

~ Student’s t

$\frac{|x|^2}{|y|^2} \sim F$

$\frac{x}{y} \sim$ Cauchy
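Some of these transformations can be sanity-checked by simulation. The sketch below checks three known consequences for independent standard normal $x$ and $y$: the sum has variance 2, the square has mean 1 (a chi-squared variable with one degree of freedom), and the ratio $x/y$ has median magnitude 1 (the quartiles of a standard Cauchy are at $\pm 1$). Sample size, seed, and variable names are illustrative assumptions.

```python
import random
import statistics

rng = random.Random(0)
n = 50000
pairs = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]

# x + y is normal with variance 1 + 1 = 2.
var_sum = statistics.pvariance([x + y for x, y in pairs])

# |x|^2 is chi-squared with one degree of freedom, so its mean is 1.
mean_square = statistics.fmean([x * x for x, _ in pairs])

# x / y is standard Cauchy, so |x / y| <= 1 about half the time.
# Comparing |x| <= |y| avoids dividing by a value near zero.
frac_ratio_within_one = sum(abs(x) <= abs(y) for x, y in pairs) / n

print(f"var(x + y) = {var_sum:.3f}")
print(f"E[x^2] = {mean_square:.3f}")
print(f"Pr(|x/y| <= 1) = {frac_ratio_within_one:.3f}")
```

These checks confirm moments and quantiles only; they do not by themselves prove the distribution families named above.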

Expectations and Moments

Random Variables

$X$ <- usually denoted by a capital letter

Two types: discrete, continuous

Defined by a probability density function (PDF) and a cumulative distribution function (CDF)

For discrete-valued random variables, the PDF is replaced by a probability mass function (PMF)

Expectation

$E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx$

Expectation of a function

$E[g(X)] = \int_{-\infty}^{\infty} g(x)\,p(x)\,dx$

Expectation with an unknown parameter

$E[g(X);\theta] = \int_{-\infty}^{\infty} g(x)\,p(x;\theta)\,dx$

Useful!! It can be easy to find the expectation of a function $g(X)$ without finding the PDF of $g(X)$

Moments

$X \sim \mathcal{N}(m, \sigma^2)$, $Y = g(X) = (X - m)^2$

- Mean: $E[X] = \int_{-\infty}^{\infty} x\,p(x)\,dx$
- 2nd Moment: $E[X^2] = \int_{-\infty}^{\infty} x^2\,p(x)\,dx$
- Variance: $E[(X - m)^2] = \int_{-\infty}^{\infty} (x - m)^2\,p(x)\,dx$

Note: In general, $X^2$, $(X - E[X])^2$, … do not have the same PDF as $X$

Expectation Examples

Let
- $X \sim \mathcal{N}(m, \sigma^2)$
- $Y = g(X) = (X - m)^2$

Compute
- $E[X; m, \sigma^2] = m$
- $E[g(X); m, \sigma^2] = \sigma^2$
- $E[X + 10; m, \sigma^2] = E[X; m, \sigma^2] + 10 = m + 10$
- $E[g(2X); m, \sigma^2] = E[(2X - m)^2] = 4\sigma^2 + m^2$ (which reduces to $4\sigma^2$ when $m = 0$)
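These expectations can be approximated by averaging over random draws, without ever deriving the PDF of $g(X)$. The sketch below is a Monte Carlo check with illustrative values $m = 1.5$ and $\sigma = 2$ (both assumptions chosen here) and with $g(x) = (x - m)^2$ as defined above.

```python
import random
import statistics

m, sigma = 1.5, 2.0   # illustrative values, not from the lecture

def g(x):
    return (x - m) ** 2

rng = random.Random(0)
samples = [rng.gauss(m, sigma) for _ in range(200000)]

e_x = statistics.fmean(samples)                        # theory: m
e_g = statistics.fmean(g(x) for x in samples)          # theory: sigma^2
e_shift = statistics.fmean(x + 10 for x in samples)    # theory: m + 10
e_g2 = statistics.fmean(g(2 * x) for x in samples)     # theory: 4 sigma^2 + m^2

print(f"E[X]     ~ {e_x:.3f}")
print(f"E[g(X)]  ~ {e_g:.3f}")
print(f"E[X+10]  ~ {e_shift:.3f}")
print(f"E[g(2X)] ~ {e_g2:.3f}")
```

With these values, $E[g(2X)] = E[(2X - m)^2] = \mathrm{var}(2X) + (2m - m)^2 = 4\sigma^2 + m^2 = 18.25$, and the estimate should land close to it.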


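Let
- $X \sim \mathrm{exponential}(\lambda)$
- $p(x;\lambda) = \begin{cases} \lambda e^{-\lambda x} & \text{for } x \geq 0 \\ 0 & \text{for } x < 0 \end{cases}$

Compute
- $E[X;\lambda] = \int_0^\infty x\,p(x)\,dx = \int_0^\infty x \lambda e^{-\lambda x}\,dx$

$E[X;\lambda] = \left[ \frac{-e^{-\lambda x}(\lambda x + 1)}{\lambda} \right]_0^\infty = \lim_{x \to \infty} \frac{-e^{-\lambda x}(\lambda x + 1)}{\lambda} + \frac{e^{-0}(1)}{\lambda}$

The limit term is $0$ by L’Hospital’s Rule, so

$E[X;\lambda] = \frac{1}{\lambda}$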




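Compute, for the same exponential random variable,
- $E[X^2;\lambda] = \int_0^\infty x^2\,p(x)\,dx = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx$

$E[X^2;\lambda] = \left[ \frac{-e^{-\lambda x}(\lambda^2 x^2 + 2\lambda x + 2)}{\lambda^2} \right]_0^\infty = \lim_{x \to \infty} \frac{-e^{-\lambda x}(\lambda^2 x^2 + 2\lambda x + 2)}{\lambda^2} + \frac{e^{-0}(2)}{\lambda^2}$

The limit term is $0$, so

$E[X^2;\lambda] = \frac{2}{\lambda^2}$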
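The closed-form moments of the exponential random variable can be cross-checked by numerical integration. The sketch below uses a trapezoidal rule with an illustrative rate $\lambda = 2$ and an upper limit of 20 in place of infinity; both values, and the function names, are assumptions made for this example.

```python
import math

lam = 2.0   # illustrative rate parameter

def pdf(x):
    """Exponential PDF p(x; lam) for x >= 0."""
    return lam * math.exp(-lam * x)

def integrate(f, a, b, steps=100000):
    """Trapezoidal approximation of the integral of f from a to b."""
    h = (b - a) / steps
    total = 0.5 * (f(a) + f(b))
    for k in range(1, steps):
        total += f(a + k * h)
    return total * h

upper = 20.0   # the integrand is negligible beyond this point for lam = 2
first_moment = integrate(lambda x: x * pdf(x), 0.0, upper)        # theory: 1 / lam
second_moment = integrate(lambda x: x * x * pdf(x), 0.0, upper)   # theory: 2 / lam**2
variance = second_moment - first_moment ** 2                      # theory: 1 / lam**2

print(f"E[X] ~ {first_moment:.6f}, E[X^2] ~ {second_moment:.6f}, var ~ {variance:.6f}")
```

Combining the two moments gives the variance $E[X^2] - E[X]^2 = 1/\lambda^2$, a standard identity that the slides do not state explicitly.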

Two Random Variables (and their relationships)

Several Random Variables

Joint PDFs of Random Variables

$p(x, y)$: Joint PDF of 2 random variables

Vector form

Define $\mathbf{W} = \begin{bmatrix} X \\ Y \end{bmatrix}$, $\mathbf{w} = \begin{bmatrix} x \\ y \end{bmatrix}$

$p(\mathbf{w}) = p(x, y)$: Joint PDF of 2 random variables (short version)

Credit: https://en.wikipedia.org/wiki/Multivariate_normal_distribution#/media/File:MultivariateNormal.png

Chain Rule for 2 Random Variables

$p(x, y) = p(x)\,p(y \mid x) = p(y)\,p(x \mid y)$

If the random variables are independent

$p(x, y) = p(x)\,p(y)$

Moments of two random variables $X$, $Y$
- Mean: $E[X]$, $E[Y]$
- 2nd Moment: $E[X^2]$, $E[Y^2]$
- Variance: $E[(X - m_x)^2]$, $E[(Y - m_y)^2]$
- Cross-Correlation: $E[XY]$
- Covariance: $E[(X - m_x)(Y - m_y)] = E[XY] - E[X]E[Y]$

If $X$, $Y$ are uncorrelated
- Cross-Correlation: $E[XY] = E[X]E[Y]$
- Covariance: $E[(X - E[X])(Y - E[Y])] = 0$
- Variance sum: $\mathrm{var}(X + Y) = \mathrm{var}(X) + \mathrm{var}(Y)$

Important Note:
- Independent implies uncorrelated, but uncorrelated does not imply independent
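A standard counterexample illustrates the note above: take $X \sim \mathcal{N}(0,1)$ and $Y = X^2$. The pair is uncorrelated, since the covariance equals $E[X^3] = 0$, yet $Y$ is completely determined by $X$. The sketch below is a simulation of that example, which is chosen here for illustration and is not taken from the lecture.

```python
import random
import statistics

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100000)]
ys = [x * x for x in xs]   # Y is a deterministic function of X

mean_x = statistics.fmean(xs)
mean_y = statistics.fmean(ys)
covariance = statistics.fmean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

# Dependence check: knowing that |X| > 1 changes the probability that Y > 1.
p_y_big = sum(y > 1.0 for y in ys) / len(ys)
ys_given_x_big = [y for x, y in zip(xs, ys) if abs(x) > 1.0]
p_y_big_given_x_big = sum(y > 1.0 for y in ys_given_x_big) / len(ys_given_x_big)

print(f"covariance ~ {covariance:.3f}")
print(f"Pr(Y > 1) ~ {p_y_big:.3f}, Pr(Y > 1 | |X| > 1) = {p_y_big_given_x_big:.3f}")
```

The sample covariance is near zero while the conditional probability differs sharply from the unconditional one, so the variables are uncorrelated but not independent.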

Example: Two Normal Distributions

Consider the two random variables

$X \sim \mathcal{N}(0,1)$, $Y \sim \mathcal{N}(x, 1)$ (that is, given $X = x$, $Y$ is normal with mean $x$ and variance 1)

Compute the joint PDF $p(x, y)$




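$p(x, y) = p(x)\,p(y \mid x)$ (Chain Rule)

$p(x) = \frac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\frac{x^2}{2(1)^2}\right)$

$p(y \mid x) = \frac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\frac{(y-x)^2}{2(1)^2}\right)$

$p(x, y) = \frac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\frac{x^2}{2(1)^2}\right) \frac{1}{\sqrt{2\pi(1)^2}} \exp\left(-\frac{(y-x)^2}{2(1)^2}\right)$

$p(x, y) = \frac{1}{2\pi} \exp\left(-\frac{x^2 + (y-x)^2}{2}\right) = \frac{1}{2\pi} \exp\left(-\frac{2x^2 + y^2 - 2yx}{2}\right)$

$p(x, y) = \frac{1}{2\pi} \exp\left(-\frac{1}{2} \begin{bmatrix} x \\ y \end{bmatrix}^T \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}\right)$

$\rightarrow \begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 2 & -1 \\ -1 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\right)$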




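$\begin{bmatrix} X \\ Y \end{bmatrix} \sim \mathcal{N}\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & 1 \\ 1 & 2 \end{bmatrix}\right)$

Question: How do we interpret this distribution?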
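One possible way to read this result is to generate $Y$ as $X$ plus independent unit-variance Gaussian noise, which matches $p(y \mid x)$ above, and compare the sample covariance with the matrix in the joint distribution. The sketch below does that and also inverts the $2 \times 2$ matrix from the exponent. The helper `inverse_2x2`, the sample size, and the seed are illustrative assumptions.

```python
import random
import statistics

def inverse_2x2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Matrix from the exponent of p(x, y); its inverse is the covariance matrix.
precision = [[2.0, -1.0], [-1.0, 1.0]]
covariance = inverse_2x2(precision)

# Draw X ~ N(0, 1), then Y given X = x as N(x, 1).
rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(100000)]
ys = [x + rng.gauss(0.0, 1.0) for x in xs]

mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
var_x = statistics.fmean((x - mean_x) ** 2 for x in xs)
var_y = statistics.fmean((y - mean_y) ** 2 for y in ys)
cov_xy = statistics.fmean((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

print("covariance from the derivation:", covariance)
print(f"sample estimates: var(X) {var_x:.3f}, var(Y) {var_y:.3f}, cov(X, Y) {cov_xy:.3f}")
```

Under this reading, the off-diagonal 1 reflects that $Y$ contains $X$, and $\mathrm{var}(Y) = 2$ is the variance of $X$ plus the variance of the added noise.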
