Fundamentals of Information Theory · Memory Channels · Continuous-Time Information Theory
An Introduction to Information Theory
Guangyue Han
The University of Hong Kong
July 2018 @ HKU
Outline of the Talk
- Fundamentals of Information Theory
- Research Directions: Memory Channels
- Research Directions: Continuous-Time Information Theory
Fundamentals of Information Theory
The Birth of Information Theory
C. E. Shannon. A mathematical theory of communication. Bell Syst. Tech. J., 27:379–423, 623–656, 1948.
Classical Information Theory
Information Theory Nowadays
Entropy: Definition
The entropy $H(X)$ of a discrete random variable $X$ is defined by

$$H(X) = -\sum_{x \in \mathcal{X}} p(x) \log p(x).$$

Let

$$X = \begin{cases} 1 & \text{with probability } p, \\ 0 & \text{with probability } 1-p. \end{cases}$$

Then,

$$H(X) = -p \log p - (1-p)\log(1-p) \triangleq H(p).$$
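The definition is easy to check numerically. A minimal Python sketch (the helper names `entropy` and `binary_entropy` are ours, not from the slides; logarithms are base 2, so entropy is measured in bits):

```python
import math

def entropy(probs):
    """Shannon entropy (in bits) of a discrete distribution given as
    an iterable of probabilities; terms with p(x) = 0 contribute 0."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def binary_entropy(p):
    """H(p) for a {0, 1}-valued variable taking 1 with probability p."""
    return entropy([p, 1 - p])

# A fair coin carries exactly 1 bit of uncertainty.
print(binary_entropy(0.5))  # 1.0
```

For a fair coin ($p = 1/2$) this gives exactly 1 bit, the maximum for a binary variable; $H(p) \to 0$ as $p \to 0$ or $p \to 1$.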
Entropy: Measure of Uncertainty
Joint Entropy and Conditional Entropy
The joint entropy $H(X,Y)$ of a pair of discrete random variables $(X,Y)$ with a joint distribution $p(x,y)$ is defined as

$$H(X,Y) = -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(x,y).$$

If $(X,Y) \sim p(x,y)$, the conditional entropy $H(Y|X)$ is defined as

$$\begin{aligned}
H(Y|X) &= \sum_{x \in \mathcal{X}} p(x)\, H(Y \mid X = x) \\
&= -\sum_{x \in \mathcal{X}} p(x) \sum_{y \in \mathcal{Y}} p(y|x) \log p(y|x) \\
&= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(y|x).
\end{aligned}$$
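Both quantities can be evaluated directly from a joint probability table. A small Python sketch (the joint distribution below is an illustrative choice of ours):

```python
import math

def H(probs):
    """Shannon entropy (in bits) of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Illustrative joint distribution p(x, y) on {0, 1} x {0, 1}.
pxy = {(0, 0): 0.5, (0, 1): 0.25, (1, 0): 0.125, (1, 1): 0.125}

# Joint entropy H(X, Y).
joint = H(pxy.values())

# Marginal p(x), then H(Y|X) = -sum_{x,y} p(x, y) log p(y|x).
px = {}
for (x, y), q in pxy.items():
    px[x] = px.get(x, 0.0) + q
cond = -sum(q * math.log2(q / px[x]) for (x, y), q in pxy.items())

print(joint, cond)
```

Here $H(X,Y) = 1.75$ bits and $H(Y|X) \approx 0.939$ bits; conditioning never increases entropy, so $H(Y|X) \le H(Y)$.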
All Entropies Together
Chain Rule
$$H(X,Y) = H(X) + H(Y|X).$$

Proof.

$$\begin{aligned}
H(X,Y) &= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(x,y) \\
&= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(x)\,p(y|x) \\
&= -\sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(x) - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(y|x) \\
&= -\sum_{x \in \mathcal{X}} p(x) \log p(x) - \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} p(x,y) \log p(y|x) \\
&= H(X) + H(Y|X).
\end{aligned}$$
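The identity can also be verified numerically on an arbitrary joint distribution; a Python sketch (the random $3 \times 4$ joint table is purely illustrative):

```python
import math, random

def H(probs):
    """Shannon entropy (in bits) of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

random.seed(0)
# Random joint distribution p(x, y) on a 3 x 4 alphabet.
w = [[random.random() for _ in range(4)] for _ in range(3)]
tot = sum(sum(row) for row in w)
p = [[v / tot for v in row] for row in w]

px = [sum(row) for row in p]                    # marginal p(x)
Hxy = H(v for row in p for v in row)            # H(X, Y)
Hx = H(px)                                      # H(X)
# H(Y|X) from its definition, -sum p(x, y) log p(y|x).
Hy_given_x = -sum(v * math.log2(v / px[i])
                  for i, row in enumerate(p) for v in row if v > 0)

# Chain rule: H(X, Y) = H(X) + H(Y|X).
print(abs(Hxy - (Hx + Hy_given_x)) < 1e-9)  # True
```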
Mutual Information
Original Definition

The mutual information $I(X;Y)$ between two discrete random variables $X, Y$ with joint distribution $p(x,y)$ is defined as

$$I(X;Y) = \sum_{x,y} p(x,y) \log \frac{p(x,y)}{p(x)p(y)}.$$

Alternative Definitions

$$\begin{aligned}
I(X;Y) &= H(X) + H(Y) - H(X,Y) \\
&= H(X) - H(X|Y) \\
&= H(Y) - H(Y|X).
\end{aligned}$$
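That the definitions agree is easy to confirm numerically. A Python sketch using, as an illustrative example of ours, the joint law of a binary symmetric channel with uniform input and crossover probability 0.1:

```python
import math

def H(probs):
    """Shannon entropy (in bits) of an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Joint law of a binary symmetric channel with uniform input X and
# crossover probability eps = 0.1 (an illustrative choice).
eps = 0.1
pxy = {(x, y): 0.5 * (eps if x != y else 1 - eps)
       for x in (0, 1) for y in (0, 1)}
px = {x: pxy[(x, 0)] + pxy[(x, 1)] for x in (0, 1)}
py = {y: pxy[(0, y)] + pxy[(1, y)] for y in (0, 1)}

# Original definition: sum p(x, y) log p(x, y) / (p(x) p(y)).
I_def = sum(q * math.log2(q / (px[x] * py[y])) for (x, y), q in pxy.items())
# Alternative definition: H(X) + H(Y) - H(X, Y).
I_alt = H(px.values()) + H(py.values()) - H(pxy.values())
print(I_def, I_alt)
```

Both routes give $I(X;Y) = 1 - H(0.1) \approx 0.531$ bits.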
Mutual Information and Entropy
Asymptotic Equipartition Property Theorem
AEP Theorem

If $X_1, X_2, \ldots$ are i.i.d. $\sim p(x)$, then

$$-\frac{1}{n} \log p(X_1, X_2, \ldots, X_n) \to H(X) \quad \text{in probability.}$$

Proof.

$$-\frac{1}{n} \log p(X_1, X_2, \ldots, X_n) = -\frac{1}{n} \sum_{i=1}^{n} \log p(X_i) \to -E[\log p(X)] = H(X),$$

where the convergence is the weak law of large numbers applied to the i.i.d. random variables $\log p(X_i)$.
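The convergence can be observed empirically. A Monte Carlo sketch in Python for a Bernoulli(0.3) source (source parameter, sample size, and seed are all illustrative choices):

```python
import math, random

random.seed(1)
p1 = 0.3  # P(X_i = 1)
H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))  # ≈ 0.881 bits

n = 200_000
sample = [1 if random.random() < p1 else 0 for _ in range(n)]
# For i.i.d. draws, -(1/n) log p(X_1, ..., X_n) = -(1/n) sum_i log p(X_i).
neg_log_p = -sum(math.log2(p1 if x == 1 else 1 - p1) for x in sample) / n
print(H, neg_log_p)  # the empirical value settles near H(X)
```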
The Shannon-McMillan-Breiman Theorem
The Shannon-McMillan-Breiman Theorem

Let $\{X_n\}$ be a finite-valued stationary ergodic process. Then, with probability 1,

$$-\frac{1}{n} \log p(X_1, X_2, \ldots, X_n) \to H(X),$$

where $H(X)$ here denotes the entropy rate of the process $\{X_n\}$, namely,

$$H(X) = \lim_{n \to \infty} H(X_1, X_2, \ldots, X_n)/n.$$

Proof.

There are many. The simplest is the sandwich argument by Algoet and Cover [1988].
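For a concrete non-i.i.d. instance, the theorem can be checked on a stationary two-state Markov chain, whose entropy rate has the closed form $\sum_i \pi_i H(P_i)$. A Python sketch (the transition matrix, path length, and seed are illustrative choices of ours):

```python
import math, random

random.seed(2)
# Illustrative two-state chain: transition matrix P and its
# stationary distribution pi (pi P = pi).
P = [[0.9, 0.1], [0.2, 0.8]]
pi = [2 / 3, 1 / 3]

# Closed-form entropy rate: sum_i pi_i * H(P[i]).
rate = sum(pi[i] * -sum(q * math.log2(q) for q in P[i] if q > 0)
           for i in range(2))

# Sample a long path started from stationarity; by the theorem,
# -(1/n) log p(X_1, ..., X_n) should be close to the entropy rate.
n = 200_000
x = 0 if random.random() < pi[0] else 1
log_p = math.log2(pi[x])
for _ in range(n - 1):
    y = x if random.random() < P[x][x] else 1 - x
    log_p += math.log2(P[x][y])
    x = y
print(rate, -log_p / n)
```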
Typical Set: Definition and Properties
Definition

The typical set $A_\varepsilon^{(n)}$ with respect to $p(x)$ is the set of sequences $(x_1, x_2, \ldots, x_n) \in \mathcal{X}^n$ with the property

$$H(X) - \varepsilon \le -\frac{1}{n} \log p(x_1, x_2, \ldots, x_n) \le H(X) + \varepsilon.$$

Properties

- $(1-\varepsilon)\,2^{n(H(X)-\varepsilon)} \le |A_\varepsilon^{(n)}| \le 2^{n(H(X)+\varepsilon)}$ for $n$ sufficiently large.
- $\Pr\{A_\varepsilon^{(n)}\} > 1-\varepsilon$ for $n$ sufficiently large.
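For a small alphabet the typical set can be enumerated exhaustively. A Python sketch for an i.i.d. Bernoulli(0.3) source with $n = 12$ and $\varepsilon = 0.25$ (small parameters chosen by us just to keep the enumeration fast; the properties above only bite for large $n$):

```python
import math
from itertools import product

p1, n, eps = 0.3, 12, 0.25
H = -(p1 * math.log2(p1) + (1 - p1) * math.log2(1 - p1))  # ≈ 0.881 bits

def prob(seq):
    """p(x_1, ..., x_n) for an i.i.d. Bernoulli(p1) source."""
    k = sum(seq)
    return p1 ** k * (1 - p1) ** (n - k)

# Enumerate all 2^n binary sequences and keep the typical ones.
typical = [s for s in product((0, 1), repeat=n)
           if H - eps <= -math.log2(prob(s)) / n <= H + eps]
mass = sum(prob(s) for s in typical)
print(len(typical), 2 ** n, mass)
```

Even at this small $n$ the typical set is a strict subset of the $2^{12} = 4096$ sequences yet already carries most of the probability mass.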
Typical Set: A Pictorial Description
Consider all the instances (x1, x2, . . . , xn) ∈ X^n of i.i.d. (X1, X2, · · · , Xn) with distribution p(x).
Source Coding (Data Compression)
I Represent each typical sequence with about nH(X) bits.
I Represent each non-typical sequence with about n log |X| bits.
I Then we have a one-to-one and easily decodable code.
Shannon’s Source Coding Theorem
The average number of bits needed is

E[l(X1, . . . , Xn)] = Σ_{x1,...,xn} p(x1, . . . , xn) l(x1, . . . , xn)

= Σ_{(x1,...,xn) ∈ A_ε^(n)} p(x1, . . . , xn) l(x1, . . . , xn) + Σ_{(x1,...,xn) ∉ A_ε^(n)} p(x1, . . . , xn) l(x1, . . . , xn)

≈ Σ_{(x1,...,xn) ∈ A_ε^(n)} p(x1, . . . , xn) nH(X) + Σ_{(x1,...,xn) ∉ A_ε^(n)} p(x1, . . . , xn) n log |X| ≈ nH(X).

Source Coding Theorem
For any information source distributed according to X1, X2, · · · ∼ p(x), the compression rate is always at least H(X), but it can be arbitrarily close to H(X).
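The averaging step can be checked exactly for a small Bernoulli source, since sequences with the same number of ones are equiprobable. A sketch (the parameters p = 0.3, n = 14, ε = 0.2 are assumed for illustration):

```python
from math import comb, log2

p, n, eps = 0.3, 14, 0.2
H = -p * log2(p) - (1 - p) * log2(1 - p)

# A sequence with k ones has probability p^k (1-p)^(n-k), so its
# normalized log-likelihood depends only on k.
def rate(k):
    return -(k * log2(p) + (n - k) * log2(1 - p)) / n

# Exact probability that an i.i.d. sequence lands in the typical set.
pr_typical = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                 for k in range(n + 1)
                 if H - eps <= rate(k) <= H + eps)

# Per-symbol cost of the two-part code: about H(X) bits per symbol
# for typical sequences, log2 |X| = 1 bit per symbol otherwise.
bits_per_symbol = pr_typical * H + (1 - pr_typical) * 1.0
print(pr_typical, bits_per_symbol)
```

As n grows, Pr{A_ε^(n)} → 1 and the per-symbol cost approaches H(X).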
Communication Channel: Definition
I A message W results in channel inputs X1(W), . . . , Xn(W).
I They are received as a random sequence Y1, . . . , Yn ∼ p(y1, . . . , yn|x1, . . . , xn).
I The receiver then guesses the transmitted index by an appropriate decoding rule Ŵ = g(Y1, . . . , Yn).
I The receiver makes an error if Ŵ is not the same as the W that was transmitted.
Communication Channel: An Example
Binary Symmetric Channel
p(Y = 0|X = 0) = 1− p, p(Y = 1|X = 0) = p,
p(Y = 0|X = 1) = p, p(Y = 1|X = 1) = 1− p.
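Such a channel is straightforward to simulate: each transmitted bit is flipped independently with probability p. A sketch (the crossover probability p = 0.1 and the block length are assumed values):

```python
import numpy as np

rng = np.random.default_rng(1)
p, n = 0.1, 100_000

x = rng.integers(0, 2, size=n)    # channel inputs
flips = rng.random(n) < p         # independent crossover events
y = x ^ flips                     # channel outputs

# The empirical crossover rate should be close to p.
empirical_p = np.mean(x != y)
print(empirical_p)
```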
Tradeoff between Speed and Reliability
Speed
To transmit 1: we transmit 1. With probability p we receive 0, and the error goes undetected. Note that the transmission rate is 1.

Reliability

To transmit 1: we transmit 11111. It is possible that we receive something else, such as 11011, but more likely than not we can correct the error by majority vote. The transmission rate, however, is only 1/5.
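The reliability of the five-fold repetition code can be computed exactly rather than simulated: majority decoding fails precisely when three or more of the five transmitted copies are flipped. A sketch (p = 0.1 is an assumed crossover probability):

```python
from math import comb

p, n = 0.1, 5

# Majority decoding of an n-fold repetition code over a BSC(p) fails
# exactly when more than half of the transmitted copies are flipped.
p_err = sum(comb(n, k) * p**k * (1 - p)**(n - k)
            for k in range(n // 2 + 1, n + 1))
print(p_err)  # about 0.00856, versus p = 0.1 with no coding
```

So the repetition code trades rate (1/5) for reliability (error probability drops from 0.1 to under 0.01); Shannon's theorem below shows this trade-off is not fundamental below capacity.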
Shannon’s Channel Coding Theorem: Statement
Channel Coding Theorem
For any discrete memoryless channel, asymptotically error-free transmission at any rate below the capacity

C = max_{p(x)} I(X; Y)

is always possible, but is not possible above the capacity.
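For a concrete channel, the maximization over p(x) can be done numerically. The sketch below (assuming a BSC with crossover probability 0.1 as the test channel, and logarithms base 2) grid-searches over binary input distributions (q, 1 − q) and recovers C = 1 − H(p) at q = 1/2:

```python
import numpy as np

def binary_entropy(p):
    # H(p) in bits, with the convention 0 log 0 = 0.
    p = np.clip(p, 1e-300, 1 - 1e-300)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

p = 0.1                               # BSC crossover probability
q = np.linspace(0.0, 1.0, 10_001)     # candidate input distributions (q, 1-q)

# For a BSC, Pr{Y = 1} = q(1-p) + (1-q)p and H(Y|X) = H(p), so
# I(X;Y) = H(q(1-p) + (1-q)p) - H(p).
mutual_info = binary_entropy(q * (1 - p) + (1 - q) * p) - binary_entropy(p)

capacity = mutual_info.max()
print(capacity, 1 - binary_entropy(0.1))
```

The same grid-search idea works for any channel with a small input alphabet; in general the maximization is a convex problem.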
Shannon’s Channel Coding Theorem: Proof
I For each typical input n-sequence, there are approximately 2^{nH(Y|X)} possible typical output sequences, all of them equally likely.
I We wish to ensure that no two X input sequences produce the same Y output sequence. Otherwise, we will not be able to decide which X sequence was sent.
I The total number of possible typical Y sequences is approximately 2^{nH(Y)}. This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to the different input X sequences.
I The total number of disjoint sets is less than or equal to 2^{n(H(Y)−H(Y|X))} = 2^{nI(X;Y)}. Hence, we can send at most approximately 2^{nI(X;Y)} distinguishable sequences of length n.
Capacity of Binary Symmetric Channels
The capacity of a binary symmetric channel with crossover probability p is C = 1 − H(p), where

H(p) = −p log p − (1 − p) log(1 − p).

Proof.

I(X; Y) = H(Y) − H(Y|X)
= H(Y) − Σ_x p(x) H(Y|X = x)
= H(Y) − Σ_x p(x) H(p)
= H(Y) − H(p)
≤ 1 − H(p),

with equality when the input distribution is uniform, since then the output Y is also uniform and H(Y) = 1.
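Evaluating the closed form at a few crossover probabilities (a sketch; the values of p are chosen for illustration): a noiseless channel (p = 0) carries 1 bit per use, while p = 1/2 makes the output independent of the input and the capacity drops to 0.

```python
from math import log2

def binary_entropy(p):
    # H(p) in bits; 0 log 0 = 0 by continuity.
    if p in (0.0, 1.0):
        return 0.0
    return -p * log2(p) - (1 - p) * log2(1 - p)

def bsc_capacity(p):
    # Capacity of a BSC with crossover probability p.
    return 1.0 - binary_entropy(p)

for p in (0.0, 0.1, 0.5):
    print(p, bsc_capacity(p))
```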
Capacity of Additive White Gaussian Channels
The capacity of an additive white Gaussian channel Y = X + Z, where E[X²] ≤ P and Z ∼ N(0, 1), is C = (1/2) log(1 + P).

Proof.

I(X; Y) = H(Y) − H(Y|X)
        = H(Y) − H(X + Z|X)
        = H(Y) − H(Z|X)
        = H(Y) − H(Z)
        ≤ (1/2) log 2πe(1 + P) − (1/2) log 2πe
        = (1/2) log(1 + P).
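The last two lines use that a Gaussian maximizes differential entropy for a given variance. As a sanity check (our own Monte Carlo illustration, not part of the talk), one can estimate h(Y) = −E[log f_Y(Y)] by sampling a Gaussian input and recover C = (1/2) log(1 + P) in nats:

```python
import numpy as np

rng = np.random.default_rng(0)
P, n = 2.0, 1_000_000

X = rng.normal(0.0, np.sqrt(P), n)   # capacity-achieving input X ~ N(0, P)
Z = rng.normal(0.0, 1.0, n)          # noise Z ~ N(0, 1)
Y = X + Z                            # hence Y ~ N(0, 1 + P)

# Monte Carlo estimate of h(Y) = -E[log f_Y(Y)] using the true density of Y
f_Y = np.exp(-Y**2 / (2 * (1 + P))) / np.sqrt(2 * np.pi * (1 + P))
h_Y = -np.mean(np.log(f_Y))

h_Z = 0.5 * np.log(2 * np.pi * np.e)   # h(Z) for Z ~ N(0, 1), in nats
C_est = h_Y - h_Z                      # estimate of I(X; Y)
C_true = 0.5 * np.log(1 + P)           # the slide's formula, in nats
assert abs(C_est - C_true) < 0.01
```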
Memory Channels
Memoryless Channels
I Channel transitions are characterized by time-invariant transition probabilities {p(y|x)}.
I Channel inputs are independent and identically distributed.
I Representative examples include (memoryless) binary symmetric channels and additive white Gaussian channels.
Capacity of Memoryless Channels
Shannon’s channel coding theorem

C = sup_{p(x)} I(X; Y)
  = sup_{p(x)} ∑_{x,y} p(x, y) log [p(x, y) / (p(x) p(y))].

The Blahut-Arimoto algorithm (BAA)
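The BAA computes this supremum by alternating between the input distribution r(x) and the induced posterior q(x|y). A minimal NumPy sketch (the function name and tolerances are ours; it assumes every output symbol has positive probability under some input):

```python
import numpy as np

def blahut_arimoto(W, tol=1e-10, max_iter=10_000):
    """Capacity (in bits) of a discrete memoryless channel with
    transition matrix W[x, y] = p(y | x), via the Blahut-Arimoto
    alternating optimization."""
    m = W.shape[0]
    r = np.full(m, 1.0 / m)                  # start from the uniform input
    for _ in range(max_iter):
        q = r[:, None] * W                   # joint p(x, y)
        q /= q.sum(axis=0, keepdims=True)    # posterior q(x | y)
        log_q = np.zeros_like(q)
        np.log(q, where=q > 0, out=log_q)    # safe log: left 0 where q == 0
        # update r(x) proportional to exp( sum_y p(y|x) log q(x|y) )
        r_new = np.exp((W * log_q).sum(axis=1))
        r_new /= r_new.sum()
        done = np.abs(r_new - r).max() < tol
        r = r_new
        if done:
            break
    # evaluate I(X; Y) at the final input distribution
    q = r[:, None] * W
    py = q.sum(axis=0)
    mask = q > 0
    C = np.sum(q[mask] * np.log2(q[mask] / np.outer(r, py)[mask]))
    return C, r

# a BSC with crossover 0.1: capacity should be 1 - H(0.1)
W = np.array([[0.9, 0.1],
              [0.1, 0.9]])
C, r = blahut_arimoto(W)
```

For this symmetric channel the iteration keeps the uniform input, so it terminates immediately with C = 1 − H(0.1).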
Memory Channels
I Channel transitions are characterized by probabilities {p(y_i | x_1, …, x_i, y_1, …, y_{i−1}, s_i)}, where channel outputs may depend on previous and current channel inputs, previous outputs, and the current channel state; for example, inter-symbol interference channels, flash memory channels, and Gilbert-Elliot channels.
I Channel inputs may have to satisfy certain constraints which necessitate dependence among channel inputs; for example, (d, k)-RLL constraints and, more generally, finite-type constraints.
I Such channels are widely used in a variety of real-life applications, including magnetic and optical recording, solid-state drives, and communications over band-limited channels with inter-symbol interference.
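As an illustration of such an input constraint, the sketch below (our own helper, using the common convention that every run of 0s between consecutive 1s has length between d and k, while the boundary runs are only bounded above) checks a binary word against a (d, k)-RLL constraint:

```python
def is_dk_rll(bits, d, k):
    """Check a (d, k)-RLL constraint: every run of 0s between two
    consecutive 1s has length in [d, k]; the leading and trailing
    runs of 0s are only required to have length at most k."""
    ones = [i for i, b in enumerate(bits) if b == 1]
    if not ones:                        # all-zero word: only the k-bound applies
        return len(bits) <= k
    between = [j - i - 1 for i, j in zip(ones, ones[1:])]
    lead, trail = ones[0], len(bits) - 1 - ones[-1]
    return all(d <= r <= k for r in between) and lead <= k and trail <= k

assert is_dk_rll([1, 0, 0, 1, 0, 1], d=1, k=2)    # zero-runs of length 2 and 1
assert not is_dk_rll([1, 1, 0], d=1, k=2)         # adjacent 1s violate d = 1
assert not is_dk_rll([1, 0, 0, 0, 1], d=1, k=2)   # run of 3 zeros violates k = 2
```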
Capacity of Memory Channels
Despite a great deal of effort by Zehavi and Wolf [1988], Mushkin and Bar-David [1989], Shamai and Kofman [1990], Goldsmith and Varaiya [1996], Arnold, Loeliger, Vontobel, Kavcic and Zeng [2006], Holliday, Goldsmith, and Glynn [2006], Vontobel, Kavcic, Arnold and Loeliger [2008], Pfister [2011], Permuter, Asnani and Weissman [2013], Han [2015], ..., the capacity of memory channels remains unknown in general.
Capacity of Memory Channels
Shannon’s channel coding theorem

C = sup_{p(x)} I(X; Y)
  = sup_{p(x)} lim_{n→∞} (1/n) ∑_{x_1^n, y_1^n} p(x_1^n, y_1^n) log [p(x_1^n, y_1^n) / (p(x_1^n) p(y_1^n))].

The Generalized Blahut-Arimoto algorithm (GBAA) by Vontobel, Kavcic, Arnold and Loeliger [2008]
Convergence of the GBAA
The GBAA will converge if the following conjecture is true.

Concavity Conjecture [Vontobel et al. 2008]

I(X; Y) and H(X|Y) are both concave with respect to a chosen parameterization.

Unfortunately, the concavity conjecture is not true in general [Li and Han, 2013].
A Randomized Algorithm [Han 2015]
With appropriately chosen step sizes a_n = 1/n^a, a > 0,

θ_{n+1} = θ_n + a_n g_{n^b}(θ_n),

where

I θ_0 is randomly selected from the parameter space Θ;
I g_{n^b}(θ) is a simulator for I′(X(θ); Y(θ));
I 0 < β < α < 1/3, b > 0, 2a + b − 3bβ > 1, where α, β are some “hidden” parameters involved in the definition of g_{n^b}(θ).
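The update θ_{n+1} = θ_n + a_n g(θ_n) is a stochastic-approximation (Robbins-Monro-type) iteration. The toy sketch below is our own illustration of this update on a known concave objective with a noisy gradient, standing in for the actual simulator g_{n^b}:

```python
import numpy as np

rng = np.random.default_rng(1)

def noisy_grad(theta):
    """Toy stand-in for the simulator: the gradient of the concave
    function f(theta) = -(theta - 0.3)**2 plus zero-mean noise."""
    return -2.0 * (theta - 0.3) + rng.normal(0.0, 0.1)

a = 0.7                        # step-size exponent: a_n = 1 / n**a
theta = rng.uniform(0.0, 1.0)  # theta_0 drawn at random from the parameter space
for n in range(1, 200_001):
    theta += (1.0 / n**a) * noisy_grad(theta)

# with decaying steps the noise averages out and the iterates
# settle near the maximizer theta* = 0.3
assert abs(theta - 0.3) < 0.02
```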
Our Simulator of I′(X; Y)

Define

q = q(n) ≜ n^β, p = p(n) ≜ n^α, k = k(n) ≜ n/(n^α + n^β).

For any j with iq + (i−1)p + 1 ≤ j ≤ iq + ip, define

W_j = −( p′(Y_{j−⌊q/2⌋})/p(Y_{j−⌊q/2⌋}) + ⋯ + p′(Y_j | Y^{j−1}_{j−⌊q/2⌋})/p(Y_j | Y^{j−1}_{j−⌊q/2⌋}) ) log p(Y_j | Y^{j−1}_{j−⌊q/2⌋}),

and furthermore

ζ_i ≜ W_{iq+(i−1)p+1} + ⋯ + W_{iq+ip},  S_n ≜ ∑_{i=1}^{k(n)} ζ_i.

Our simulator for I′(X; Y):

g_n(X_1^n, Y_1^n) = H′(X_2|X_1) + S_n(Y_1^n)/(kp) − S_n(X_1^n, Y_1^n)/(kp).
Convergence of Our Algorithm
Convergence and convergence rate with concavity

If I(X; Y) is concave with respect to θ, then θ_n converges to the unique capacity-achieving distribution θ* almost surely. And for any τ with 2a + b − 3bβ − 2τ > 1, we have

|θ_n − θ*| = O(n^{−τ}).
The Ideas for the Proofs

Analyticity result [Han, Marcus, 2006]

The entropy rate of hidden Markov chains is analytic.

Refinements of the Shannon-McMillan-Breiman theorem [Han, 2012]

Limit theorems for the sample entropy of hidden Markov chains hold.

The analyticity result states that $I(X;Y) = H(X) + H(Y) - H(X,Y)$ is a "nicely behaved" function. The refinement results confirm that, using Monte Carlo simulations, $I(X;Y)$ and its derivatives can be well approximated.
Continuous-Time Information Theory
Continuous-Time Gaussian Non-Feedback Channels

Consider the following continuous-time Gaussian channel:
$$Y(t) = \sqrt{\mathrm{snr}} \int_0^t X(s)\, ds + B(t), \qquad t \in [0,T],$$
where $\{B(t)\}$ is the standard Brownian motion.
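This channel is easy to sample on a grid: each increment of $B(t)$ over a step of length $dt$ is $N(0, dt)$. A minimal Euler-discretization sketch (the function name and grid are our choices, not from the slides):

```python
import numpy as np

def simulate_channel(x, snr, T, rng):
    """Euler sketch of Y(t) = sqrt(snr) * int_0^t X(s) ds + B(t) on [0, T].

    x is the input X(s) sampled on a uniform grid of len(x) steps;
    the increments dB ~ N(0, dt) discretize standard Brownian motion.
    """
    n = len(x)
    dt = T / n
    dy = np.sqrt(snr) * x * dt + rng.normal(scale=np.sqrt(dt), size=n)
    return np.concatenate(([0.0], np.cumsum(dy)))  # Y on the grid, Y(0) = 0

rng = np.random.default_rng(1)
y = simulate_channel(np.ones(1000), snr=4.0, T=1.0, rng=rng)
# for X(s) = 1, y[-1] fluctuates around sqrt(snr) * T = 2 with std. dev. ~ 1
```

The signal enters through the integral of $X$, so the drift of $Y$ carries the information while the Brownian part plays the role of white noise.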
Continuous-Time Gaussian Non-Feedback Channels

Theorem (Duncan 1970)

The following I-CMMSE relationship holds:
$$I(X_0^T; Y_0^T) = \frac{1}{2}\,\mathbb{E}\int_0^T \big(X(s) - \mathbb{E}[X(s) \mid Y_0^s]\big)^2\, ds.$$

Theorem (Guo, Shamai and Verdu 2005)

The following I-MMSE relationship holds:
$$\frac{d}{d\,\mathrm{snr}}\, I(X_0^T; Y_0^T) = \frac{1}{2}\,\mathbb{E}\int_0^T \big(X(s) - \mathbb{E}[X(s) \mid Y_0^T]\big)^2\, ds.$$
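A quick sanity check of the I-MMSE identity in the simplest special case: take $X(s) \equiv X$ with $X \sim N(0,1)$, so that $Y(T)$ is a sufficient statistic, $I(X_0^T; Y_0^T) = \tfrac{1}{2}\log(1 + \mathrm{snr}\,T)$, and the smoothing error $\mathbb{E}[(X - \mathbb{E}[X \mid Y_0^T])^2] = 1/(1 + \mathrm{snr}\,T)$ is constant in $s$. These closed forms are standard Gaussian facts, not taken from the slides.

```python
import numpy as np

T = 2.0

def mutual_info(snr):
    # Constant input X ~ N(0,1): I(X_0^T; Y_0^T) = 0.5 * log(1 + snr * T)
    return 0.5 * np.log(1.0 + snr * T)

def half_integrated_mmse(snr):
    # (1/2) * int_0^T E[(X - E[X|Y_0^T])^2] ds = 0.5 * T / (1 + snr * T)
    return 0.5 * T / (1.0 + snr * T)

snr, h = 1.5, 1e-6
lhs = (mutual_info(snr + h) - mutual_info(snr - h)) / (2 * h)  # dI/d snr
rhs = half_integrated_mmse(snr)
# the finite-difference derivative matches the MMSE side of the identity
```

Note the contrast with Duncan's identity: the I-MMSE side conditions on the whole path $Y_0^T$ (smoothing), while I-CMMSE conditions causally on $Y_0^s$ (filtering).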
Continuous-Time Gaussian Feedback Channels

Consider the following continuous-time Gaussian feedback channel:
$$Y(t) = \sqrt{\mathrm{snr}} \int_0^t X(s, M, Y_0^s)\, ds + B(t), \qquad t \in [0,T],$$
where $\{B(t)\}$ is the standard Brownian motion.
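The only change from the non-feedback case is that the input at time $s$ may depend on the message $M$ and the output path so far, so an Euler simulation must generate the input step by step. A sketch, where `encoder` is a hypothetical stand-in for $X(s, M, Y_0^s)$ and the toy rule at the end is merely illustrative, not a capacity-achieving scheme:

```python
import numpy as np

def simulate_feedback_channel(encoder, m, snr, T, n, rng):
    """Euler sketch of Y(t) = sqrt(snr) * int_0^t X(s, M, Y_0^s) ds + B(t).

    encoder(s, m, y_past) is a hypothetical placeholder for X(s, M, Y_0^s):
    at each step the input may depend on the message M and the outputs so far.
    """
    dt = T / n
    y = np.zeros(n + 1)
    for i in range(n):
        x = encoder(i * dt, m, y[: i + 1])  # feedback: input sees past outputs
        y[i + 1] = y[i] + np.sqrt(snr) * x * dt + rng.normal(scale=np.sqrt(dt))
    return y

# Toy encoder that steers the output toward the message value.
rng = np.random.default_rng(2)
y = simulate_feedback_channel(lambda s, m, yp: m - yp[-1], m=1.0,
                              snr=4.0, T=1.0, n=1000, rng=rng)
```

Because the input is a functional of $Y_0^s$, the output process is no longer Gaussian in general, which is what makes the feedback theory genuinely harder.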
Continuous-Time Gaussian Feedback Channels

Theorem (Kadota, Zakai and Ziv 1971)

The following I-CMMSE relationship holds:
$$I(M; Y_0^T) = \frac{1}{2}\,\mathbb{E}\int_0^T \big(X(s, M, Y_0^s) - \mathbb{E}[X(s, M, Y_0^s) \mid Y_0^s]\big)^2\, ds.$$

Theorem (Han and Song 2016)

The following I-MMSE relationship holds:
$$\frac{d}{d\,\mathrm{snr}}\, I(M; Y_0^T) = \frac{1}{2}\int_0^T \mathbb{E}\Big[\big(X(s) - \mathbb{E}[X(s) \mid Y_0^T]\big)^2\Big]\, ds + \mathrm{snr}\int_0^T \mathbb{E}\Big[\big(X(s) - \mathbb{E}[X(s) \mid Y_0^T]\big)\, \frac{d}{d\,\mathrm{snr}}\, X(s)\Big]\, ds.$$
Capacity of Continuous-Time Gaussian Channels

For either the continuous-time Gaussian channel
$$Y(t) = \sqrt{\mathrm{snr}} \int_0^t X(s)\, ds + B(t), \qquad t \in [0,T],$$
or the continuous-time Gaussian feedback channel
$$Y(t) = \sqrt{\mathrm{snr}} \int_0^t X(s, M, Y_0^s)\, ds + B(t), \qquad t \in [0,T],$$
the capacity is $P/2$.
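Assuming $P$ denotes the average power constraint $\mathbb{E}\int_0^T X^2(s)\,ds \le PT$ (the slides leave $P$ implicit), the converse half of this claim follows in one line from the I-CMMSE identities. A standard argument, sketched here:

```latex
I(X_0^T; Y_0^T)
  = \frac{1}{2}\,\mathbb{E}\int_0^T \big(X(s) - \mathbb{E}[X(s)\mid Y_0^s]\big)^2\, ds
  \le \frac{1}{2}\,\mathbb{E}\int_0^T X^2(s)\, ds
  \le \frac{PT}{2},
```

since the conditional mean minimizes the mean-square error (in particular, it does no worse than the zero estimator). Dividing by $T$ bounds the rate by $P/2$; the same argument runs through the Kadota-Zakai-Ziv identity in the feedback case, which is why feedback does not increase the capacity here.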
Thank you!