
TRANSCRIPT

Page 1: Discrete Memoryless Channel

Discrete Memoryless Channel and its Capacity

by

Purnachand Simhadri, Asst. Professor

Electronics and Communication Engineering Department

K L University


Page 2: Discrete Memoryless Channel

Outline

1 Discrete Memoryless channel

Probability Model

Binary Channel

2 Mutual Information

Joint Entropy

Conditional Entropy

Definition

3 Capacity of DMC

Transmission Rate

Definition


Page 4: Discrete Memoryless Channel

Discrete Memoryless Channel: Properties

Properties

The input of a DMC is a symbol belonging to an alphabet of M symbols, transmitted with probability $p_{ti}$ $(i = 1, 2, 3, \ldots, M)$.

The output of a DMC is a symbol belonging to the same alphabet of M symbols, received with probability $p_{rj}$ $(j = 1, 2, 3, \ldots, M)$.

Due to errors caused by noise in the channel, the output may differ from the input during a symbol interval.


Page 8: Discrete Memoryless Channel

Discrete Memoryless Channel: Properties (contd.)

Properties (contd.)

In an ideal channel, the output is equal to the input.

In a non-ideal channel, the output can differ from the input with a given transition probability $p_{ij} = P(Y = y_j / X = x_i)$ $(i, j = 1, 2, 3, \ldots, M)$.

In a DMC, the output of the channel depends only on the input of the channel at the same instant, and not on the inputs before or after.


Page 12: Discrete Memoryless Channel

Discrete Memoryless Channel: Probability Model

All the transition probabilities from $x_i$ to $y_j$ are gathered in a transition matrix (also called the channel matrix) to model the DMC.

$$p_{ti} = P(X = x_i), \qquad p_{rj} = P(Y = y_j), \qquad p_{ij} = P(Y = y_j / X = x_i)$$

and $P(x_i, y_j) = P(y_j / x_i)\, P(X = x_i) = p_{ij} \cdot p_{ti}$

$$\Rightarrow\ p_{rj} = \sum_{i=1}^{M} p_{ti}\, p_{ij} \qquad (1)$$


Page 14: Discrete Memoryless Channel

Discrete Memoryless Channel: Probability Model

Equation (1) can be written in matrix form as

$$\underbrace{\begin{bmatrix} p_{r1}\\ p_{r2}\\ \vdots\\ p_{rM}\end{bmatrix}}_{P_Y^r}
=
\underbrace{\begin{bmatrix} p_{11} & p_{21} & \cdots & p_{M1}\\ p_{12} & p_{22} & \cdots & p_{M2}\\ \vdots & \vdots & \ddots & \vdots\\ p_{1M} & p_{2M} & \cdots & p_{MM}\end{bmatrix}}_{\text{transpose of the channel matrix } P_{Y/X}}
\underbrace{\begin{bmatrix} p_{t1}\\ p_{t2}\\ \vdots\\ p_{tM}\end{bmatrix}}_{P_X^t}
\qquad (2)$$

Equation (2) can be compactly written as

$$P_Y^r = P_{Y/X}^{T}\, P_X^t \qquad (3)$$

where the channel matrix $P_{Y/X}$ has entries $[P_{Y/X}]_{ij} = p_{ij}$.

Note that

$$\sum_{j=1}^{M} p_{ij} = 1 \quad\text{and}\quad p_e = \sum_{i=1}^{M} p_{ti} \sum_{\substack{j=1\\ j \ne i}}^{M} p_{ij} \qquad (4)$$


Page 15: Discrete Memoryless Channel

Discrete Memoryless Channel: Binary Channel

Channels designed to transmit and receive one of M symbols are called discrete M-ary channels (M > 2). If M = 2, the channel is called a binary channel. In the binary case we can statistically model the channel as below.

[Binary channel diagram: input symbols 0 and 1 map to output symbols 0 and 1 with transition probabilities $P_{00}$, $P_{01}$, $P_{10}$, $P_{11}$; input probabilities $P_0^t$, $P_1^t$; output probabilities $P_0^r$, $P_1^r$.]

$$P(Y = j / X = i) = p_{ij}, \qquad p_{00} + p_{01} = 1, \qquad p_{10} + p_{11} = 1$$

$$P(X = 0) = p_{t0}, \quad P(X = 1) = p_{t1}, \quad P(Y = 0) = p_{r0}, \quad P(Y = 1) = p_{r1}$$


Page 17: Discrete Memoryless Channel

Discrete Memoryless Channel: Binary Channel

For a binary channel,

$$p_{r0} = p_{t0}\,p_{00} + p_{t1}\,p_{10}, \qquad p_{r1} = p_{t0}\,p_{01} + p_{t1}\,p_{11}$$

and $P_e = p_{t0}\,p_{01} + p_{t1}\,p_{10}$.

Binary Symmetric Channel

A binary channel is said to be a binary symmetric channel if $p_{00} = p_{11}$ $(\Rightarrow p_{01} = p_{10})$.

Let $p_{00} = p_{11} = p \Rightarrow p_{01} = p_{10} = 1 - p$;

then, for a binary symmetric channel,

$$P_e = p_{t0}\,p_{01} + p_{t1}\,p_{10} = p_{t0}(1 - p) + p_{t1}(1 - p) = 1 - p$$
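A minimal sketch of the same relations for a binary symmetric channel; the crossover value p and the input distribution are made-up example numbers, not from the slides.

```python
# Hypothetical binary symmetric channel: p00 = p11 = p (correct reception),
# p01 = p10 = 1 - p.  p and the input distribution are illustrative values.
p = 0.9
p_t0, p_t1 = 0.7, 0.3

p_r0 = p_t0 * p + p_t1 * (1 - p)          # p_r0 = p_t0*p00 + p_t1*p10
p_r1 = p_t0 * (1 - p) + p_t1 * p          # p_r1 = p_t0*p01 + p_t1*p11
P_e  = p_t0 * (1 - p) + p_t1 * (1 - p)    # error probability, = 1 - p for a BSC

print(p_r0, p_r1, P_e)                    # approx. 0.66 0.34 0.10
```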



Page 21: Discrete Memoryless Channel

Mutual Information: Joint Entropy

In a DMC there are two statistical processes at work: the input to the channel and the noise, which in turn affects the output of the channel. So it is worth considering the joint and conditional distributions of the input and output.

Thus there are a number of entropies, or information contents, to be considered when studying the characteristics of a discrete memoryless channel. First, the entropy of the input is

$$H(X) = -\sum_{i=1}^{M} p_{ti} \log_2 p_{ti} \ \text{bits/symbol}$$

The entropy of the output is

$$H(Y) = -\sum_{j=1}^{M} p_{rj} \log_2 p_{rj} \ \text{bits/symbol}$$


Page 22: Discrete Memoryless Channel

Mutual Information: Joint Entropy

The joint distribution of the input and output can be obtained from the transition probabilities and the input distribution as

$$P(x_i, y_j) = P(y_j / x_i)\, P(X = x_i) = p_{ij} \cdot p_{ti}$$

Joint Entropy

The joint entropy $H(X,Y)$ is defined as

$$H(X,Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j)$$
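The following sketch (values invented for illustration, helper name H is mine) computes H(X), H(Y), and H(X,Y) from a hypothetical joint distribution, so the properties on the next slide can be checked numerically.

```python
import numpy as np

# Hypothetical joint distribution P(x_i, y_j) of a binary input/output pair;
# the entries are invented for illustration and sum to 1.
Pxy = np.array([[0.40, 0.10],
                [0.05, 0.45]])

def H(probs):
    """Entropy in bits of an array of probabilities (0*log 0 taken as 0)."""
    p = np.asarray(probs, float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

Px = Pxy.sum(axis=1)    # marginal distribution of X
Py = Pxy.sum(axis=0)    # marginal distribution of Y

print("H(X)   =", H(Px))
print("H(Y)   =", H(Py))
print("H(X,Y) =", H(Pxy))   # >= max(H(X), H(Y)) and <= H(X) + H(Y)
```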


Page 24: Discrete Memoryless Channel

Mutual Information: Joint Entropy

Joint Entropy: Properties

The joint entropy of a set of variables is greater than or equal to each of the individual entropies of the variables in the set:

$$H(X,Y) \ge \max\big(H(X), H(Y)\big)$$

The joint entropy of a set of variables is less than or equal to the sum of the individual entropies of the variables in the set:

$$H(X,Y) \le H(X) + H(Y)$$

This inequality is an equality if and only if X and Y are statistically independent.


Page 26: Discrete Memoryless Channel

Mutual Information: Conditional Entropy

Let the conditional distribution of X, given that the channel output is $Y = y_j$, be $P(X / Y = y_j)$; then the average uncertainty about X given that $Y = y_j$ is

$$H(X / Y = y_j) = -\sum_{x_i \in X} P(X = x_i / Y = y_j) \log_2 P(X = x_i / Y = y_j)$$

The conditional entropy of X conditioned on Y is the expected value of the entropy of the distribution $P(X / Y = y_j)$:

$$\begin{aligned}
H(X/Y) &= E\big[H(X / Y = y_j)\big] \\
&= \sum_{y_j \in Y} P(Y = y_j)\, H(X / Y = y_j) \\
&= \sum_{y_j \in Y} P(y_j) \Big[-\sum_{x_i \in X} P(x_i / y_j) \log_2 P(x_i / y_j)\Big] \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i / y_j)\, P(y_j) \log_2 P(x_i / y_j) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)
\end{aligned}$$


Page 28: Discrete Memoryless Channel

Mutual Information: Conditional Entropy

Conditional Entropy: Definition

The conditional entropy $H(X/Y)$ is defined as

$$H(X/Y) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j)$$

Similarly, the conditional entropy $H(Y/X)$ is defined as

$$H(Y/X) = -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(y_j / x_i)$$

Conditional entropy is also called equivocation.

$H(X/Y)$ gives the amount of uncertainty remaining about the channel input X after the channel output Y has been observed.
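A short sketch (reusing the same hypothetical joint distribution as the earlier example) that computes the equivocation H(X/Y) directly from the definition and checks it against H(X) and the chain rule derived on the following slides.

```python
import numpy as np

# Same hypothetical joint distribution as in the earlier sketch.
Pxy = np.array([[0.40, 0.10],
                [0.05, 0.45]])
Px, Py = Pxy.sum(axis=1), Pxy.sum(axis=0)

def H(probs):
    p = np.asarray(probs, float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# H(X/Y) = -sum_ij P(x_i, y_j) log2 P(x_i / y_j),  with P(x_i / y_j) = P(x_i, y_j) / P(y_j)
H_X_given_Y = -sum(Pxy[i, j] * np.log2(Pxy[i, j] / Py[j])
                   for i in range(2) for j in range(2) if Pxy[i, j] > 0)

print("H(X/Y) =", H_X_given_Y)                       # the equivocation
print("H(X)   =", H(Px))                             # note H(X/Y) <= H(X)
print("H(X,Y) =", H(Pxy), "=", H_X_given_Y + H(Py))  # chain rule H(X,Y) = H(X/Y) + H(Y)
```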


Page 30: Discrete Memoryless Channel

Mutual Information: Conditional Entropy

There is less information in the conditional entropy $H(X/Y)$ than in the entropy $H(X)$:

$$H(X/Y) - H(X) \le 0$$

Proof:

$$\begin{aligned}
H(X/Y) - H(X) &= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) + \sum_{x_i \in X} P(x_i) \log_2 P(x_i) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) + \sum_{x_i \in X} \Big(\sum_{y_j \in Y} P(x_i, y_j)\Big) \log_2 P(x_i) \\
&= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 \frac{P(x_i)}{P(x_i / y_j)}
\end{aligned}$$


Page 31: Discrete Memoryless Channel

Mutual Information: Conditional Entropy

Using the inequality $\log_2 a \le (a - 1)\log_2 e$ (the positive constant $\log_2 e$ does not affect the sign), it follows that:

$$\begin{aligned}
H(X/Y) - H(X) &\le \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \left(\frac{P(x_i)}{P(x_i / y_j)} - 1\right) \log_2 e \\
&= \Big[\sum_{x_i \in X} \sum_{y_j \in Y} \frac{P(x_i, y_j)}{P(x_i / y_j)}\, P(x_i) - \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j)\Big] \log_2 e \\
&= \Big[\sum_{x_i \in X} P(x_i) \sum_{y_j \in Y} P(y_j) - 1\Big] \log_2 e \\
&= (1 - 1)\log_2 e = 0
\end{aligned}$$

$$\Rightarrow H(X/Y) \le H(X) \qquad\text{and}\qquad H(Y/X) \le H(Y)$$


Page 32: Discrete Memoryless Channel

Mutual Information: Conditional Entropy, Relation with Joint Entropy

The conditional entropy $H(X/Y)$ is given by

$$\begin{aligned}
H(X/Y) &= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 \left(\frac{P(x_i, y_j)}{P(y_j)}\right) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j) + \sum_{y_j \in Y} \Big(\sum_{x_i \in X} P(x_i, y_j)\Big) \log_2 P(y_j) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i, y_j) + \sum_{y_j \in Y} P(y_j) \log_2 P(y_j) \\
&= H(X,Y) - H(Y)
\end{aligned}$$

$$\Rightarrow H(X,Y) = H(X/Y) + H(Y), \qquad\text{similarly}\quad H(X,Y) = H(Y/X) + H(X)$$


Page 33: Discrete Memoryless Channel

Mutual Information: Definition

Definition

The mutual information $I(X,Y)$ of X and Y is defined as

$$I(X,Y) = H(X) - H(X/Y)$$

$I(X,Y)$ gives the uncertainty about the input X resolved by observing the output Y. In other words, it is the portion of the information of X that depends on Y.

Properties

Symmetric: $I(X,Y) = I(Y,X)$

$$I(X,Y) = H(X) - H(X/Y) = H(Y) - H(Y/X) = H(X) + H(Y) - H(X,Y)$$

Nonnegative: $I(X,Y) \ge 0$
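A brief numerical check (illustrative values only, reusing the hypothetical joint distribution from the earlier sketches) that the form $H(X)+H(Y)-H(X,Y)$ and the direct double-sum form agree, and that the result is nonnegative.

```python
import numpy as np

# Hypothetical joint distribution (illustrative values).
Pxy = np.array([[0.40, 0.10],
                [0.05, 0.45]])
Px, Py = Pxy.sum(axis=1), Pxy.sum(axis=0)

def H(probs):
    p = np.asarray(probs, float).ravel()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

# Form 1: I(X,Y) = H(X) + H(Y) - H(X,Y)
I1 = H(Px) + H(Py) - H(Pxy)

# Form 2: sum_ij P(x_i, y_j) log2 [ P(x_i, y_j) / (P(x_i) P(y_j)) ]
I2 = sum(Pxy[i, j] * np.log2(Pxy[i, j] / (Px[i] * Py[j]))
         for i in range(2) for j in range(2) if Pxy[i, j] > 0)

print(I1, I2)   # the two forms agree, and I(X,Y) >= 0
```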


Page 35: Discrete Memoryless Channel

Mutual Information: Definition

Proof of Property 1 (symmetry):

$$\begin{aligned}
I(X,Y) &= H(X) - H(X/Y) \\
&= -\sum_{x_i \in X} P(x_i) \log_2 P(x_i) + \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) \\
&= -\sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i) + \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 P(x_i / y_j) \\
&= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 \frac{P(x_i / y_j)}{P(x_i)} \\
&= \sum_{x_i \in X} \sum_{y_j \in Y} P(x_i, y_j) \log_2 \frac{P(x_i, y_j)}{P(x_i)\,P(y_j)} \\
&= I(Y,X)
\end{aligned}$$

The last expression is the Kullback-Leibler divergence between the two probability distributions $P(x_i, y_j)$ and $P(x_i)P(y_j)$.


Page 36: Discrete Memoryless Channel

Mutual Information: Definition

Kullback-Leibler divergence

In probability theory and information theory, the Kullback-Leibler divergence (also called information divergence, information gain, or relative entropy) is a non-symmetric measure of the difference between two probability distributions P and Q:

$$D_{KL}(P \,\|\, Q) = \sum_i P(i) \log_2 \frac{P(i)}{Q(i)}$$

KL divergence measures the expected number of extra bits required to code samples from P when using a code based on Q, rather than a code based on P.

Thus, mutual information gives the number of bits gained by exploiting the dependency between X and Y, rather than treating X and Y as independent.
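A small sketch of the KL divergence in bits (the helper name D_KL is mine); applying it to the hypothetical joint distribution used earlier and the product of its marginals reproduces I(X,Y).

```python
import numpy as np

def D_KL(P, Q):
    """Kullback-Leibler divergence D(P||Q) in bits (assumes Q > 0 wherever P > 0)."""
    P = np.asarray(P, float).ravel()
    Q = np.asarray(Q, float).ravel()
    mask = P > 0
    return float((P[mask] * np.log2(P[mask] / Q[mask])).sum())

# Mutual information as the KL divergence between the joint distribution and
# the product of its marginals (same hypothetical joint distribution as before).
Pxy = np.array([[0.40, 0.10],
                [0.05, 0.45]])
Px, Py = Pxy.sum(axis=1), Pxy.sum(axis=0)
print(D_KL(Pxy, np.outer(Px, Py)))   # equals I(X,Y) computed earlier
```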


Page 39: Discrete Memoryless Channel

Mutual Information: Definition

Proof of Property 2 (nonnegativity):

It is known that $H(X) \ge H(X/Y)$

$$\Rightarrow H(X) - H(X/Y) \ge 0$$

If X and Y are statistically independent, then $H(X/Y) = H(X) \Rightarrow I(X,Y) = 0$.

Therefore,

$$I(X,Y) = H(X) - H(X/Y) \ge 0$$

with equality when X and Y are statistically independent.


Page 40: Discrete Memoryless Channel

Mutual Information: Four Cases

Case 1: X and Y are statistically independent.

[Venn diagram: H(X) and H(Y) shown as disjoint circles.]

$$H(X,Y) = H(X) + H(Y), \qquad I(X,Y) = 0$$


Page 41: Discrete Memoryless Channel

Mutual Information: Four Cases

Case 2: Y is completely dependent on X. Case 3: X is completely dependent on Y.

[Venn diagrams: in Case 2 the H(Y) circle lies inside H(X); in Case 3 the H(X) circle lies inside H(Y).]

Case 2: $I(X,Y) = H(Y)$, $\ H(X,Y) = H(X)$. Case 3: $I(X,Y) = H(X)$, $\ H(X,Y) = H(Y)$.


Page 42: Discrete Memoryless Channel

Mutual Information: Four Cases

Case 4: X and Y are neither statistically independent nor is one completely dependent on the other.

[Venn diagram: overlapping circles H(X) and H(Y); the overlap is I(X,Y), the non-overlapping parts are H(X/Y) and H(Y/X).]

$$H(X,Y) = H(X) + H(Y/X) = H(Y) + H(X/Y)$$

$$I(X,Y) = H(X) - H(X/Y) = H(Y) - H(Y/X)$$



Page 44: Discrete Memoryless Channel

Capacity of DMC: Transmission Rate

$H(X)$ is the amount of uncertainty about X; in other words, the information gained about X when we are told X.

$H(X/Y)$ is the amount of uncertainty remaining about X once Y has been observed; in other words, the amount of information still required to resolve X after we are told Y.

$I(X,Y)$ is the amount of uncertainty about X resolved by observing the output Y.

So the amount of information that can be transmitted over a channel is precisely the amount of uncertainty resolved by observing the channel output.


Page 45: Discrete Memoryless Channel

Capacity of DMC: Transmission Rate

Thus it is possible to transmit approximately $I(X,Y)$ bits of information per channel use without any uncertainty about the input at the output of the channel:

$$\Rightarrow I_t = I(X,Y) = H(X) - H(X/Y) \ \text{bits/channel use}$$

If the symbol rate of the source is $R_s$, then the rate of information that can be transmitted over the channel, such that the input can be resolved approximately without errors, is

$$D_t = \big[H(X) - H(X/Y)\big]\, R_s \ \text{bits/sec}$$


Page 46: Discrete Memoryless Channel

Capacity of DMC: Transmission Rate

For an ideal channel, X = Y, so there is no uncertainty about X when we observe Y:

$$\Rightarrow H(X/Y) = 0 \quad\Rightarrow\quad I(X,Y) = H(X) - H(X/Y) = H(X)$$

So all of the information is transmitted on each channel use: $I_t = I(X,Y) = H(X)$.

If the channel is so noisy that X and Y are independent, the uncertainty about X remains the same regardless of the observation of Y:

$$\Rightarrow H(X/Y) = H(X) \quad\Rightarrow\quad I(X,Y) = H(X) - H(X/Y) = 0$$

i.e., no information passes through the channel: $I_t = I(X,Y) = 0$.


Page 48: Discrete Memoryless Channel

Capacity of DMC: Definition

The capacity of a DMC is the maximum rate of information transmission over the channel. The maximum rate of transmission occurs when the source is matched to the channel.

Definition

The capacity of a DMC is defined as the maximum rate of information transmission over the channel, where the maximum is taken over all possible input distributions $P(X)$:

$$\begin{aligned}
C &= \max_{P(X)} I(X,Y)\, R_s \ \text{bits/sec} \\
&= \max_{P(X)} \big[H(X) - H(X/Y)\big]\, R_s \ \text{bits/sec} \\
&= \max_{P(X)} \big[H(Y) - H(Y/X)\big]\, R_s \ \text{bits/sec}
\end{aligned}$$


Page 50: Discrete Memoryless Channel

Capacity of DMC: Noiseless Binary Channel

Consider a noiseless binary channel as shown below.

[Noiseless binary channel diagram: $P_{00} = 1$, $P_{11} = 1$, $P_{01} = 0$, $P_{10} = 0$, with input probabilities $P(x_0)$, $P(x_1)$ and output probabilities $P(y_0)$, $P(y_1)$.]

$$P(x_0, y_0) = P(x_0)P_{00} = P(x_0), \qquad P(x_1, y_1) = P(x_1)P_{11} = P(x_1)$$

$$P(x_0, y_1) = P(x_0)P_{01} = 0, \qquad P(x_1, y_0) = P(x_1)P_{10} = 0$$

$$P(y_0) = P(x_0)P_{00} + P(x_1)P_{10} = P(x_0), \qquad P(y_1) = P(x_0)P_{01} + P(x_1)P_{11} = P(x_1)$$

$$P(x_0/y_0) = \frac{P(x_0, y_0)}{P(y_0)} = \frac{P(x_0)}{P(x_0)} = 1, \qquad P(x_0/y_1) = \frac{P(x_0, y_1)}{P(y_1)} = \frac{0}{P(x_1)} = 0$$

$$P(x_1/y_0) = \frac{P(x_1, y_0)}{P(y_0)} = \frac{0}{P(x_0)} = 0, \qquad P(x_1/y_1) = \frac{P(x_1, y_1)}{P(y_1)} = \frac{P(x_1)}{P(x_1)} = 1$$


Page 52: Discrete Memoryless Channel

Capacity of DMC: Noiseless Binary Channel

$$\begin{aligned}
H(X/Y) &= -\sum_{i=0}^{1} \sum_{j=0}^{1} P(x_i, y_j) \log_2 P(x_i/y_j) \\
&= -\big[P(x_0, y_0) \log_2 P(x_0/y_0) + P(x_0, y_1) \log_2 P(x_0/y_1) \\
&\qquad + P(x_1, y_0) \log_2 P(x_1/y_0) + P(x_1, y_1) \log_2 P(x_1/y_1)\big] \\
&= 0
\end{aligned}$$

$$\Rightarrow I(X,Y) = H(X) - H(X/Y) = H(X)$$

Therefore, the capacity of the noiseless binary channel is

$$C = \max_{P(X)} I(X,Y) = \max_{P(X)} H(X) = 1 \ \text{bit/channel use}$$

i.e., over a noiseless binary channel at most one bit of information can be sent per channel use, which is the maximum information content of a binary source.


Page 54: Discrete Memoryless Channel

Capacity of DMC: Noisy Binary Symmetric Channel

Consider a noisy binary symmetric channel as shown below.

[Binary symmetric channel diagram: $P_{00} = p$, $P_{11} = p$, $P_{01} = 1 - p$, $P_{10} = 1 - p$, with input probabilities $P(x_0)$, $P(x_1)$ and output probabilities $P(y_0)$, $P(y_1)$.]

$$P(x_0, y_0) = P(x_0)P_{00} = P(x_0)\,p, \qquad P(x_1, y_1) = P(x_1)P_{11} = P(x_1)\,p$$

$$P(x_0, y_1) = P(x_0)P_{01} = P(x_0)(1 - p), \qquad P(x_1, y_0) = P(x_1)P_{10} = P(x_1)(1 - p)$$

$$P(y_0) = P(x_0)P_{00} + P(x_1)P_{10} = P(x_0)\,p + P(x_1)(1 - p)$$

$$P(y_1) = P(x_0)P_{01} + P(x_1)P_{11} = P(x_0)(1 - p) + P(x_1)\,p$$


Page 56: Discrete Memoryless Channel

Capacity of DMC: Noisy Binary Symmetric Channel

$$\begin{aligned}
H(Y/X) &= -\sum_{i=0}^{1} \sum_{j=0}^{1} P(x_i, y_j) \log_2 P(y_j/x_i) \\
&= -\big[P(x_0, y_0) \log_2 P(y_0/x_0) + P(x_0, y_1) \log_2 P(y_1/x_0) \\
&\qquad + P(x_1, y_0) \log_2 P(y_0/x_1) + P(x_1, y_1) \log_2 P(y_1/x_1)\big] \\
&= -\big[P(x_0)\,p \log_2 p + P(x_0)(1-p) \log_2(1-p) \\
&\qquad + P(x_1)(1-p) \log_2(1-p) + P(x_1)\,p \log_2 p\big] \\
&= -\big[p \log_2 p + (1-p) \log_2(1-p)\big] = H(p, 1-p)
\end{aligned}$$

$$\Rightarrow I(X,Y) = H(Y) - H(Y/X) = H(Y) - H(p, 1-p)$$

Therefore, the capacity of the noisy binary symmetric channel is

$$C = \max_{P(X)} I(X,Y) = \max_{P(X)} \big[H(Y) - H(p, 1-p)\big] = 1 - H(p, 1-p) \ \text{bits/channel use}$$
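A minimal sketch evaluating $C = 1 - H(p, 1-p)$ for a few values of p (values chosen only for illustration; the helper name Hb is mine).

```python
import math

def Hb(p):
    """Binary entropy H(p, 1 - p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# C = 1 - H(p, 1 - p) bits/channel use for a few illustrative values of p = p00 = p11
for p in (1.0, 0.9, 0.75, 0.5):
    print(p, 1 - Hb(p))
# p = 1   -> C = 1 (noiseless channel)
# p = 0.5 -> C = 0 (output independent of the input)
```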


Page 58: Discrete Memoryless Channel

Capacity of DMC: Noisy Binary Symmetric Channel

To achieve the capacity of $1 - H(p, 1-p)$ over a noisy binary symmetric channel, the input distribution should make $H(Y) = 1$. $H(Y) = 1$ if $P(y_0) = P(y_1) = \tfrac{1}{2}$:

$$\Rightarrow P(x_0)\,p + P(x_1)(1-p) = \tfrac{1}{2} \quad\text{and}\quad P(x_0)(1-p) + P(x_1)\,p = \tfrac{1}{2}$$

$$\Rightarrow (1 - 2p)\big(P(x_1) - P(x_0)\big) = 0 \quad\Rightarrow\quad P(x_1) = P(x_0) = \tfrac{1}{2}$$

Thus, over a binary symmetric channel, the maximum information rate is achieved when the source symbols are equally likely.
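A short numerical check (with an illustrative crossover value p; the helper name Hbits is mine) that sweeping the input distribution of a BSC gives the largest $I(X,Y) = H(Y) - H(p, 1-p)$ at the equally likely input.

```python
import math

def Hbits(probs):
    return -sum(q * math.log2(q) for q in probs if q > 0)

p = 0.9                                   # p00 = p11, an illustrative value
for pt0 in (0.1, 0.3, 0.5, 0.7, 0.9):     # sweep the input distribution
    pt1 = 1 - pt0
    py0 = pt0 * p + pt1 * (1 - p)         # output distribution
    py1 = pt0 * (1 - p) + pt1 * p
    I = Hbits([py0, py1]) - Hbits([p, 1 - p])   # I(X,Y) = H(Y) - H(p, 1-p)
    print(pt0, round(I, 4))
# I(X,Y) peaks at pt0 = 0.5, where H(Y) = 1 and I = 1 - H(p, 1-p), the capacity.
```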


Page 60: Discrete Memoryless Channel

Capacity of DMC: Noisy Binary Symmetric Channel

[Figure: capacity of the binary symmetric channel, $C = 1 - H(p, 1-p)$, plotted versus p.]
