Some aspects of information theory for a computer scientist
Eric Fabre
http://people.rennes.inria.fr/Eric.Fabre
http://www.irisa.fr/sumo
11 Sep. 2014


Page 1:

Some aspects of information theory for a computer scientist

Eric Fabre
http://people.rennes.inria.fr/Eric.Fabre
http://www.irisa.fr/sumo

11 Sep. 2014

Page 2:

Outline

1. Information: measure and compression

2. Reliable transmission of information

3. Distributed compression

4. Fountain codes

5. Distributed peer-to-peer storage

Page 3:

1. Information: measure and compression

Page 4:

Let’s play…

One card is drawn at random from the following set. Guess the color of the card with a minimum of yes/no questions.

One strategy
• is it hearts ?
• if not, is it clubs ?
• if not, is it diamonds ?

Wins in
• 1 question, with probability ½
• 2 questions, with prob. ¼
• 3 questions, with prob. ¼
→ 1.75 questions on average

Is there a better strategy ?

Page 5:

Observation

Lessons
- more likely means easier to guess (carries less information)
- the amount of information depends only on the log likelihood of an event
- guessing with yes/no questions = encoding with bits = compressing

(code table: Hearts → 1, Clubs → 01, Diamonds → 001, Spades → 000)

Page 6:

Important remark:

• codes like the one below are not permitted

• they cannot be uniquely decoded if one transmits sequences of encoded values of X, e.g. the sequence 11 can encode “Diamonds” or “Hearts,Hearts”

• one would need one extra symbol to separate “words”

(code table of the forbidden code: Hearts → 1, Clubs → 0, Diamonds → 11, Spades → 00)

Page 7:

Entropy

Source of information = random variable
Notation: variables X, Y, … taking values x, y, …

information carried by the event “X = x” : h(x) = − log2 P(X = x)

average information carried by X : H(X) = E[ h(X) ] = − Σx P(X = x) log2 P(X = x)

H(X) measures the average difficulty to encode/describe/guess random outcomes of X
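A quick numerical check of this definition (a minimal Python sketch; the card proportions 1/2, 1/4, 1/8, 1/8 for hearts, clubs, diamonds, spades are an assumption inferred from the guessing probabilities of the previous slide):

import math

def entropy(probs):
    # H(X) = - sum_x P(X=x) log2 P(X=x), in bits
    return -sum(p * math.log2(p) for p in probs if p > 0)

# assumed card-color proportions
card_colors = {"hearts": 1/2, "clubs": 1/4, "diamonds": 1/8, "spades": 1/8}

print(entropy(card_colors.values()))            # 1.75 bits
print(1*(1/2) + 2*(1/4) + 3*(1/8) + 3*(1/8))    # 1.75 questions on average for the strategy

The 1.75 questions per draw used by the guessing strategy coincide with H(X), which is why no better strategy exists.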

Page 8:

Properties

H(X,Y) ≤ H(X) + H(Y), with equality iff X and Y independent (i.e. P(X,Y) = P(X) P(Y))

H(X) ≥ 0, with equality iff X not random

H(X) ≤ log2 |X|, with equality iff the distribution of X is uniform

Bernoulli distribution B(p) : H = − p log2 p − (1−p) log2 (1−p), maximal (1 bit) at p = 1/2

Page 9:

Conditional entropy

H(Y|X) = Σx P(X = x) H(Y | X = x) = H(X,Y) − H(X)
uncertainty left on Y when X is known

Property : H(Y|X) ≤ H(Y), with equality iff Y and X independent

Page 10:

Example : X = color, Y = value

For each color x, compute H(Y | X = x), then average over x to get H(Y|X) ; recalling H(X) and H(X,Y), one checks the chain rule H(X,Y) = H(X) + H(Y|X).

Exercise : check that … (formula not recovered from the slide)

Page 11:

A visual representation

Page 12:

Data compression

CoDec for source X, with R bits/sample on average

rate R is achievable iff there exist CoDec pairs (fn, gn) of rate R

with vanishing error probability : P( gn(fn(X1…Xn)) ≠ X1…Xn ) → 0 as n → ∞

Usage: there was no better strategy for our card game !

Theorem (Shannon, ‘48) :
- a lossless compression scheme for source X must have a rate R ≥ H(X) bits/sample on average
- the rate H(X) is (asymptotically) achievable

Page 13:

Proof

Necessity : if R is achievable, then R ≥ H(X) ; quite easy to prove.
Sufficiency : for R > H(X), one must actually build a lossless coding scheme using R bits/sample on average.

Solution 1 (a minimal sketch is given below)
• use a known optimal lossless coding scheme for X : the Huffman code
• then prove H(X) ≤ L < H(X) + 1, where L is its average codeword length
• over n independent symbols X1,…,Xn, one gets H(X) ≤ Ln/n < H(X) + 1/n, so the rate tends to H(X)

Solution 2 : encoding only “typical sequences”
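The sketch announced for Solution 1 (assuming the card distribution of the first slides; heapq-based Huffman construction, then comparison of the average length L with H(X)):

import heapq, math

def huffman_code(probs):
    # build a Huffman code for a dict {symbol: probability}; returns {symbol: bitstring}
    heap = [(p, i, {s: ""}) for i, (s, p) in enumerate(probs.items())]
    heapq.heapify(heap)
    counter = len(heap)                      # tie-breaker so dicts are never compared
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}   # prefix the two merged subtrees
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, counter, merged))
        counter += 1
    return heap[0][2]

probs = {"hearts": 1/2, "clubs": 1/4, "diamonds": 1/8, "spades": 1/8}
code = huffman_code(probs)
H = -sum(p * math.log2(p) for p in probs.values())
L = sum(probs[s] * len(w) for s, w in code.items())
print(code)        # a prefix code, e.g. hearts -> '0', clubs -> '10', ...
print(H, L)        # H(X) <= L < H(X) + 1 ; here both equal 1.75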

Page 14:

Typical sequences

Let X1,…,Xn be independent, with the same law as X

By the law of large numbers, one has the a.s. convergence
− (1/n) log2 P(X1,…,Xn) → H(X)

A sequence x1,…,xn is typical iff
| − (1/n) log2 P(x1,…,xn) − H(X) | ≤ ε

or equivalently
2^−n(H(X)+ε) ≤ P(x1,…,xn) ≤ 2^−n(H(X)−ε)

Set of typical sequences : Aε(n)

Page 15:

AEP : asymptotic equipartition property

• one has P( (X1,…,Xn) ∈ Aε(n) ) → 1 as n → ∞
• and |Aε(n)| ≤ 2^n(H(X)+ε)

So non-typical sequences count for 0, and there are approximately 2^nH(X) typical sequences, each of probability ≈ 2^−nH(X), out of the K^n = 2^n log2 K possible sequences, where K is the alphabet size.

Optimal lossless compression :
• encode a typical sequence with nH(X) bits

• encode a non-typical sequence with n log2 K bits

• add 0 / 1 as prefix to mean typ. / non-typ.
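A small simulation of the law of large numbers behind typicality (an illustrative Python sketch with an assumed 4-letter source):

import math, random

probs = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}     # toy source with H(X) = 1.75 bits
H = -sum(p * math.log2(p) for p in probs.values())

def per_symbol_information(n, rng):
    # draw X1..Xn i.i.d. and return -(1/n) log2 P(X1..Xn)
    seq = rng.choices(list(probs), weights=list(probs.values()), k=n)
    return -sum(math.log2(probs[x]) for x in seq) / n

rng = random.Random(0)
for n in (10, 100, 10_000):
    print(n, round(per_symbol_information(n, rng), 3), "H(X) =", H)
# as n grows, -(1/n) log2 P(X1..Xn) concentrates around H(X):
# almost every drawn sequence is typical and has probability close to 2^(-n H(X))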

Page 16:

Practical coding schemes

Encoding by typicality is impractical !

Practical codes :
• Huffman code
• arithmetic coding (adapted to data flows)
• etc.
All require knowledge of the source distribution to be efficient.

Universal codes :
• do not need to know the source distribution
• for long sequences X1…Xn, converge to the optimal rate H(X) bits/symbol
• example: the Lempel-Ziv algorithm (used in ZIP, Compress, etc.)
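For illustration, a minimal LZ78-flavoured parser (a sketch only; ZIP actually combines LZ77 parsing with Huffman coding): it learns its dictionary of phrases from the data itself, so it needs no prior knowledge of the source distribution.

def lz78_parse(text):
    # LZ78-style parsing: emit (index of longest known phrase, next character)
    dictionary = {"": 0}
    out, phrase = [], ""
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch                            # extend the current phrase
        else:
            out.append((dictionary[phrase], ch))    # new phrase = known phrase + one char
            dictionary[phrase + ch] = len(dictionary)
            phrase = ""
    if phrase:
        out.append((dictionary[phrase], ""))        # flush the last phrase
    return out

def lz78_decode(pairs):
    phrases, out = [""], []
    for idx, ch in pairs:
        phrases.append(phrases[idx] + ch)
        out.append(phrases[-1])
    return "".join(out)

msg = "abababababababababab"
pairs = lz78_parse(msg)
print(pairs)                          # phrases get longer and longer: the rate drops towards H(X)
assert lz78_decode(pairs) == msg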

Page 17:

2. Reliable transmission of information

Page 18:

Mutual information

Definition : I(X;Y) = H(X) + H(Y) − H(X,Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)

Properties : I(X;Y) ≥ 0, with equality iff X and Y are independent

I(X;Y) measures how many bits X and Y have in common (on average)

Page 19:

Noisy channel

Channel = input alphabet, output alphabet, transition probability P(B|A)
(diagram: input letter A → noisy channel → output letter B)
Observe that the input distribution P(A) is left free.

Capacity : C = max over P(A) of I(A;B)   bits / use of channel

Maximizing over P(A) maximizes the coupling between input and output letters, and favors the letters that are the least altered by noise.

Page 20:

Example

The erasure channel : a proportion p of the bits are erased
(diagram: input A ∈ {0,1}, output B ∈ {0,1,erasure})

Define the erasure variable E = f(B), with E=1 when an erasure occurred, and E=0 otherwise

H(A|B) = P(E=1) H(A|B,E=1) + P(E=0) H(A|B,E=0) = p H(A), and I(A;B) = H(A) − H(A|B) = (1−p) H(A)
So C = max over P(A) of (1−p) H(A) = 1 − p bits / use of channel
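A small numerical check of C = 1 − p (a sketch; the function names and the grid search over input distributions are illustrative choices, not from the slides):

import math

def h2(q):
    # binary entropy in bits
    return 0.0 if q in (0.0, 1.0) else -q * math.log2(q) - (1 - q) * math.log2(1 - q)

def mutual_information_bec(pi, p):
    # binary erasure channel: P(A=1) = pi, erasure probability p
    # knowing B reveals A unless B is an erasure, so H(A|B) = p * H(A) and I = (1-p) H(A)
    return (1 - p) * h2(pi)

p = 0.3
best = max((mutual_information_bec(pi / 100, p), pi / 100) for pi in range(101))
print(best)    # (0.7, 0.5): the maximum over the input distribution is C = 1 - p, at uniform input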

Page 21:

Protection against errors

Idea: add extra bits to the message, to augment its inner redundancy (this is exactly the converse of data compression)

Coding scheme

• X takes values in { 1, 2, … , M = 2^nR }

• rate of the codec R = log2(M) / n transmitted bits / channel use

• R is achievable iff there exists a sequence of CoDecs (fn, gn) of rate R such that
  P( gn(B1…Bn) ≠ X ) → 0,
  where A1…An = fn(X) is the codeword sent over the noisy channel and B1…Bn is the received word

(diagram: X → fn → A1…An → noisy channel → B1…Bn → gn)

Page 22:

Error correction (for a binary channel)

Repetition

• useful bit U sent 3 times : A1=A2=A3=U

• decoding by majority (see the sketch after this slide)
• detects and corrects one error… but R’ = R/3

Parity checks

• X = k useful bits U1…Uk, expanded into n bits A1…An

• rate R = k/n

• for example: add extra redundant bits Vk+1…Vn that are linear combinations of the U1…Uk

• examples :
  • ASCII code k=7, n=8
  • ISBN
  • social security number
  • credit card number
Questions : how ??? and how many extra bits ???
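The sketch announced above: a toy simulation of the 3-fold repetition code over a binary symmetric channel (the flip probability 0.1 is an illustrative assumption, not from the slides):

import random

def send_repeated(bit, flip_prob, rng):
    # send the useful bit three times over a binary symmetric channel, decode by majority
    received = [bit ^ (rng.random() < flip_prob) for _ in range(3)]
    return int(sum(received) >= 2)

rng = random.Random(0)
flip_prob, trials = 0.1, 100_000
errors = sum(send_repeated(0, flip_prob, rng) != 0 for _ in range(trials))
print(errors / trials)
# a single bit is corrupted with probability 0.1; majority decoding fails only when
# at least 2 of the 3 copies are flipped: 3 * 0.1^2 * 0.9 + 0.1^3 = 0.028... at rate R = 1/3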

Page 23:

How ?

Almost all channel codes are linear : Reed-Solomon, Reed-Muller, Golay, BCH, cyclic codes, convolutional codes… They use finite field theory and algebraic decoding techniques.

The Hamming code

• 4 useful bits U1…U4

• 3 redundant bits V1…V3

• rate R = 4/7
• detects and corrects 1 error (exercise…)
• trick : 2 codewords differ by at least 3 bits

Codeword = [ U1 U2 U3 U4 V1 V2 V3 ]

Generating matrix (of a linear code) :

G = [ 1 0 0 0 0 1 1
      0 1 0 0 1 0 1
      0 0 1 0 1 1 0
      0 0 0 1 1 1 1 ]

[ U1 … U4 ] G = [ U1 … U4 V1 … V3 ]
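A minimal sketch of this Hamming(7,4) code, using the generating matrix above and syndrome decoding (the parity-check matrix H below is derived from G = [I | P] as [P^T | I], an assumption of this sketch rather than something shown on the slide):

# Hamming(7,4): encode 4 information bits, correct any single flipped bit
G = [  # G = [I4 | P], codeword = [U1..U4, V1, V2, V3]
    [1, 0, 0, 0, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 1],
    [0, 0, 1, 0, 1, 1, 0],
    [0, 0, 0, 1, 1, 1, 1],
]
H = [  # parity-check matrix: each row says one Vi equals a sum (mod 2) of the Uj's
    [0, 1, 1, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0, 1],
]

def encode(u):
    # u = 4 information bits -> 7-bit codeword (all arithmetic mod 2)
    return [sum(u[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]

def correct(r):
    # the syndrome of a single error equals the column of H at the error position
    syndrome = [sum(H[i][j] * r[j] for j in range(7)) % 2 for i in range(3)]
    if any(syndrome):
        columns = [[H[i][j] for i in range(3)] for j in range(7)]
        r = r[:]
        r[columns.index(syndrome)] ^= 1
    return r

word = encode([1, 0, 1, 1])
noisy = word[:]
noisy[2] ^= 1                      # flip one bit
assert correct(noisy) == word      # the single error is located and corrected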

Page 24:

How much ?

(figure: what people believed before ‘48 vs. what Shannon proved in ‘48)

Usage: measures the efficiency of an error correcting code for some channel

Theorem (Shannon, ‘48) :
- any achievable transmission rate R must satisfy R ≤ C transmitted bits / channel use
- any transmission rate R < C is achievable

Page 25:

Proof

Necessity : if a coding scheme is (asymptotically) error-free, then its rate satisfies R ≤ C ; rather easy to prove.
Sufficiency : any rate R < C is achievable ; this demands to build a coding scheme !

Idea = random coding !

• take the input distribution P(A) that achieves the capacity C
• build a random codeword w = a1…an by drawing letters according to this distribution
  (w is a typical sequence)
• sending w over the channel yields the output w’ = b1…bn,
  which is a typical sequence for P(B), and the pair (w,w’) is jointly typical for P(A,B)

Page 26:

(figure: the M typical sequences w1,…,wM chosen as codewords A1…An; each received word B1…Bn falls in a "cone" of about 2^nH(B|A) output sequences jointly typical with the sent codeword, among about 2^nH(B) possible typical output sequences)

• if M is small enough, the output cones do not overlap (with high probability)
• maximal number of input codewords : M ≈ 2^nH(B) / 2^nH(B|A) = 2^nI(A;B) = 2^nC
which proves that any R < C is achievable !

Page 27:

Perfect coding

Perfect code = error-free and achieves capacity. What does it look like ?

• by the data processing inequality

nR = H(X) = I(X;X) ≤ I(A1…An;B1…Bn) ≤ nC

• if R = C, then I(A1…An;B1…Bn) = nC

• possible iff the letters Ai of the codeword are independent, and each I(Ai;Bi) = C, i.e. each Ai carries R = C bits

(diagram: k useful bits → fn → n transmitted bits → noisy channel → gn)

For a binary channel with R = k/n : a perfect code spreads the information uniformly over the larger number of bits.

Page 28:

In practice

• Random coding is impractical : it relies on a (huge) codebook for coding/decoding
• Algebraic (linear) codes were preferred for long : more structure, coding/decoding with algorithms
• But in practice, they remained much below optimal rates !
• Things changed in 1993 when Berrou & Glavieux invented the turbo-codes,
  followed by the rediscovery of the low-density parity-check (LDPC) codes, invented by Gallager in his PhD… in 1963 !
• Both code families behave like random codes… but come with low-complexity coding/decoding algorithms

Page 29:

Can feedback improve capacity ?

Principle
• the outputs of the channel are revealed to the sender
• the sender can use this information to adapt its next symbol
(diagram: sender → channel → receiver, with a feedback link back to the sender)

Theorem: Feedback does not improve channel capacity.
But it can greatly simplify coding, decoding, and transmission protocols.

Page 30:

2nd PART

Information theory was designed for point-to-point communications, which was soon considered as a limitation…

broadcast channel : each user has a different channel
multiple access channel : interferences

Spread information : which structure for this object ? how to regenerate / transmit it ?

Page 31:

What is the capacity of a network ?

Are network links just pipes, with capacity, in which information flows like a fluid ?

How many transmissions does it take to broadcast from A to C,D and from B to C,D ?

(figure: the "butterfly" network with sources A and B, sinks C and D, and intermediate nodes E and F; packet a is sent by A, packet b by B, and the middle link E-F forwards the combination a+b)

By network coding, one transmission over link E-F can be saved. (Medard & Koetter, 2003)

Page 32:

Outline

1. Information: measure and compression

2. Reliable transmission of information

3. Distributed compression

4. Fountain codes

5. Distributed peer-to-peer storage

Page 33:

3. Distributed source coding

Page 34:

Collecting spread information

• X, Y are two distant but correlated sources
• transmit their values to a unique receiver (perfect channels)
• no communication between the encoders

(figure: X and Y are compressed by two separate encoders, encoder 1 at rate R1 and encoder 2 at rate R2; the encoders are distant and do not communicate; a joint decoder must recover the pair X,Y; the Venn diagram splits the information into H(X|Y), I(X;Y), H(Y|X))

• Naive solution = ignore the correlation, compress and send each source separately : rates R1 = H(X), R2 = H(Y)
• Can one do better, and take advantage of the correlation of X and Y ?

Page 35:

Example

• X = weather in Brest, Y = weather in Quimper
• the probability that the weathers are identical is 0.89
• one wishes to send the observed weather of 100 days in both cities

• One has H(X) = 1 = H(Y), so naïve encoding requires 200 bits
• I(X;Y) = 0.5, so not sending the “common information” twice saves 50 bits

Joint distribution P(X,Y) :

            Y = sun   Y = rain
X = sun      0.445      0.055
X = rain     0.055      0.445
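A quick check of these figures from the joint table (a minimal Python sketch):

import math

def H(probs):
    # entropy in bits of a list of probabilities
    return -sum(p * math.log2(p) for p in probs if p > 0)

joint = [[0.445, 0.055],
         [0.055, 0.445]]

px = [sum(row) for row in joint]             # marginal of X (Brest)
py = [sum(col) for col in zip(*joint)]       # marginal of Y (Quimper)
Hxy = H([p for row in joint for p in row])
I = H(px) + H(py) - Hxy

print(H(px), H(py))     # 1.0 and 1.0 bit
print(round(Hxy, 3))    # ~1.5 bits
print(round(I, 3))      # I(X;Y) ~ 0.5 bit per day, hence the 50 bits saved over 100 days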

Page 36:

Necessary conditions

Question: what are the best possible achievable transmission rates ?

(figure: same setting as before: X and Y encoded separately at rates R1 and R2, no communication between the encoders, a joint decoder recovering (X,Y); Venn diagram with H(X|Y), I(X;Y), H(Y|X))

• Jointly, both coders must transmit the full pair (X,Y), so

R1+R2 ≥ H(X,Y)

• Each coder alone must transmit the private information that is not accessible through the other variable, so

R1 ≥ H(X|Y) and R2 ≥ H(Y|X)

A pair (R1,R2) is achievable if there exist separate encoders fnX and fnY of the sequences X1…Xn and Y1…Yn respectively, and a joint decoder gn, that are asymptotically error-free.

Page 37:

Result

Theorem (Slepian & Wolf, ‘75) : The achievable region is defined by

• R1 ≥ H(X|Y)

• R2 ≥ H(Y|X)

• R1+R2 ≥ H(X,Y)

(figure: the achievable region in the (R1,R2) plane: it lies above the lines R1 = H(X|Y), R2 = H(Y|X) and R1+R2 = H(X,Y), with corner points (H(X), H(Y|X)) and (H(X|Y), H(Y)))

The achievable region is easily shown to be convex, upper-right closed.

Page 38:

Compression by random binning

• encode only the typical sequences w = x1…xn
• throw them at random into 2^nR bins, with R > H(X)

(figure: bins numbered 1, 2, 3, …, 2^nR; the bin index is the codeword, i.e. R bits/symbol)

Encoding of w = the number b of the bin where w lies

Decoding : if w is the unique typical sequence in bin number b, output w ; otherwise, output “error”

Error probability : vanishes as n → ∞, since a bin contains on average 2^nH(X) / 2^nR < 1 typical sequences when R > H(X)

Page 39:

Proof of Slepian-Wolf

• fX and fY are two independent random binnings of rates R1 and R2, for x = x1…xn and y = y1…yn respectively

• to decode the pair of bin numbers (bX,bY) = (fX(x),fY(y)), g outputs the unique pair (x,y) of jointly typical sequences in box (bX,bY), or “error” if there is more than one such pair

• R2 > H(Y|X) : given x, there are 2^nH(Y|X) sequences y that are jointly typical with x

• R1+R2 > H(X,Y) : the number of boxes 2^n(R1+R2) must be greater than 2^nH(X,Y)

(figure: a 2^nR1 × 2^nR2 grid of boxes indexed by the bin numbers of x and y; the dots mark the jointly typical pairs (x,y); decoding succeeds when box (bX,bY) contains a single such pair)
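A toy illustration of one corner of the Slepian-Wolf region (X sent in full, Y sent only through its bin index). All parameters below (n = 16 symbols, m = 12 bin bits, flip probability 0.11 as in the weather example) are illustrative assumptions; the decoder simply searches for the sequence closest to x whose bin matches:

import itertools, random

random.seed(1)
n, m = 16, 12           # rate for Y: 12/16 = 0.75 bit/symbol instead of H(Y) = 1
flip = 0.11             # P(Y differs from X), so H(Y|X) = h2(0.11) ~ 0.5 bit/symbol

# random binning = a random linear hash of y into one of 2^m bins
bin_matrix = [[random.randint(0, 1) for _ in range(n)] for _ in range(m)]
def bin_index(y):
    return tuple(sum(a * b for a, b in zip(row, y)) % 2 for row in bin_matrix)

x = [random.randint(0, 1) for _ in range(n)]           # known in full at the decoder
y = [b ^ (random.random() < flip) for b in x]           # correlated source, sent as a bin only

def decode(bx, x, max_flips=4):
    # return the sequence "jointly typical" with x (few flips) that falls in bin bx
    for d in range(max_flips + 1):                       # closest-first search
        for pos in itertools.combinations(range(n), d):
            cand = x[:]
            for i in pos:
                cand[i] ^= 1
            if bin_index(cand) == bx:
                return cand
    return None

print(decode(bin_index(y), x) == y)    # usually True: y is conveyed with 12 bits instead of 16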

Page 40:

Example

X = color
Y = value

(Venn diagram with values 0.5, 1.25, 1.25 : I(X;Y) = 0.5, H(X|Y) = 1.25, H(Y|X) = 1.25)

Questions:

1. Is there an instantaneous* transmission protocol for rates RX=1.25=H(X|Y), RY=1.75=H(Y) ?

• send Y (always) : 1.75 bits
• what about X ?

(caution: the code for X should be uniquely decodable)

(table: the four values of Y are encoded with the codewords 0, 10, 110, 111; the corresponding codewords for X are left as “?”, to be found)

(*) i.e. for sequences of length n=1

2. What about RX=RY=1.5 ?


Page 41:

In practice

The Slepian-Wolf theorem extends to N sources. It long remained an academic result, since no practical coders existed.

At the beginning of the 2000s, practical coders and applications appeared :
• compression of correlated images (e.g. same scene, 2 angles)
• sensor networks (e.g. measurement of a temperature field)
• channels with side information
• acquisition of structured information, without communication

Page 42:

4. Fountain codes

Page 43:

Network protocols

TCP/IP (transmission control protocol)

(figure: packets 1,2,3,… sent over the network, which acts as an erasure channel; the receiver returns “ack” acknowledgements and missing packets are retransmitted)

Drawbacks
• slow for huge files over long-range connections (e.g. cloud backups…)
• a feedback channel… but feedback does not improve capacity !
• a repetition code… the worst rate among error correcting codes !
• designed by engineers who ignored information theory ? :o)

However
• the erasure rate of the channel (thus its capacity) is unknown / changing
• feedback makes protocols simpler
• there exist faster protocols (UDP) for streaming feeds

Page 44:

A fountain of information bits…

How to quickly and reliably transmit K packets of b bits ?

Fountain code :
• from the K packets, generate and send a continuous flow of packets
• some get lost, some go through ; no feedback
• as soon as about K(1+ε) of them are received (any of them), decoding becomes possible

Fountain codes are an example of rateless codes (no predefined rate), or universal codes : they adapt to the channel capacity.

Page 45:

Random coding…

Packet tn sent at time n is a random linear combination of the K packets s1…sK to transmit :
tn = Gn,1 s1 + … + Gn,K sK   (sums are bitwise XOR)
where the Gn,k are IID random binary variables.

(figure: the K source packets s1…sK, of b bits each, are multiplied by a random binary K × K’ matrix G to produce the K’ transmitted packets : [s1 … sK] G = [t1 … tK’])

Page 46:

Decoding

(figure: only N of the K’ transmitted packets are received, say r1…rN ; this is equivalent to a random code with another K × N binary generating matrix G’ : [s1 … sK] G’ = [r1 … rN])

Some packets are lost, and N out of K’ are received. This is equivalent to another random code with generating matrix G’.

How big should N be to enable decoding ?

Page 47:

Decoding

• For N=K, what is the probability that G’ is invertible ?

One has [s1 … sK] G’ = [r1 … rN], where G’ is a random K × N binary matrix. If G’ is invertible, one can decode by [s1 … sK] = [r1 … rN] G’^-1.

Answer: the probability converges quickly to ≈ 0.289 (as soon as K > 10).

• What about N = K+E ? What is the probability P that at least one K × K sub-matrix of G’ is invertible ?
Answer: P = 1 − δ(E), where δ(E) ≤ 2^-E ( δ(E) < 10^-6 for E = 20 )

exponential convergence to 1 with E, regardless of K.

Complexity

• K/2 operations per generated packet, so O(K^2) for encoding
• decoding : O(K^3) for the matrix inversion
• one would like better complexities… linear ?
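A minimal sketch of this random fountain code over GF(2): random combinations on the encoder side, Gaussian elimination on the decoder side (the packet size, K and E below are illustrative choices):

import random

def gf2_solve(A, b):
    # solve A s = b over GF(2) by Gaussian elimination; returns s, or None if A is not full rank
    n_rows, n_cols = len(A), len(A[0])
    A = [row[:] + [bit] for row, bit in zip(A, b)]          # augmented matrix
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if A[i][c]), None)
        if pivot is None:
            return None                                     # would need more received packets
        A[r], A[pivot] = A[pivot], A[r]
        for i in range(n_rows):
            if i != r and A[i][c]:
                A[i] = [x ^ y for x, y in zip(A[i], A[r])]
        r += 1
        if r == n_cols:
            break
    return [A[i][-1] for i in range(n_cols)]

random.seed(2)
K = 20
source = [[random.randint(0, 1) for _ in range(8)] for _ in range(K)]   # K packets of 8 bits

def encode_one():
    # one fountain packet: the XOR of a random subset of the source packets, plus its combination
    combo = [random.randint(0, 1) for _ in range(K)]
    payload = [0] * 8
    for k, used in enumerate(combo):
        if used:
            payload = [a ^ b for a, b in zip(payload, source[k])]
    return combo, payload

received = [encode_one() for _ in range(K + 20)]       # N = K + E received packets, E = 20
G = [combo for combo, _ in received]
# solve G' s = r once per bit position (fails only with probability < 2^-E)
decoded = list(zip(*[gf2_solve(G, [payload[j] for _, payload in received]) for j in range(8)]))
print([list(row) for row in decoded] == source)        # True with overwhelming probability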

Page 48:

LT codes

Invented by Michael Luby (2003), and inspired by LDPC codes (Gallager, 1963).

Idea : linear combinations of packets should be “sparse”

Encoding

• for each packet tn, randomly select a “degree” dn according to some distribution ρ(d) on degrees
• choose at random dn packets among s1…sK and take as tn the sum (XOR) of these dn packets
• some nodes have low degree, others have high degree : this makes the graph a small world

(figure: bipartite graph connecting the source packets s1…sK to the encoded packets t1…tN)

Pages 49-55:

Decoding LT codes

Idea = a simplified version of turbo-decoding (Berrou) that resembles crossword solving :
• find a received packet of degree 1 : it directly reveals one source packet
• XOR this recovered packet out of every other received packet that involves it, lowering their degrees
• repeat, each step hopefully creating a new packet of degree 1

Example (animated over pages 49-55) : the received values 1 0 1 1 are peeled off step by step, recovering the source packets 1, 0, 1, …

How to choose the degrees ?
• each iteration should yield a single new node of degree 1
  achieved by the distribution ρ(1) = 1/K and ρ(d) = 1/(d(d−1)) for d = 2…K
• the average degree is then loge K, so the decoding complexity is K loge K
• in reality :
  • one needs a few nodes of high degree, to ensure that every packet is connected to at least one check-node
  • one needs a few more small-degree nodes, to ensure that decoding starts
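A compact sketch of LT encoding with the ideal soliton distribution and of the peeling decoder described above (K, N and the seed are illustrative; the ideal soliton is fragile, which is why Luby's robust soliton adds the extra high-degree and small-degree packets just mentioned):

import random

def soliton_degree(K, rng):
    # ideal soliton distribution: rho(1) = 1/K, rho(d) = 1/(d(d-1)) for d = 2..K
    u = rng.random()
    cum, d = 1.0 / K, 1
    while u > cum and d < K:
        d += 1
        cum += 1.0 / (d * (d - 1))
    return d

rng = random.Random(3)
K, N = 10, 25
source = [rng.randint(0, 1) for _ in range(K)]

# LT encoding: each received packet is the XOR of a random subset of 'degree' source packets
packets = []
for _ in range(N):
    neighbours = set(rng.sample(range(K), soliton_degree(K, rng)))
    value = 0
    for k in neighbours:
        value ^= source[k]
    packets.append([neighbours, value])

# peeling decoder: a degree-1 packet reveals a source packet; XOR it out of every packet
# that contains it, which lowers their degrees and creates new degree-1 packets
recovered = {}
while len(recovered) < K:
    ready = next((p for p in packets if len(p[0]) == 1), None)
    if ready is None:
        break                                    # stuck: more received packets would be needed
    (k,) = ready[0]
    recovered[k] = ready[1]
    for p in packets:
        if k in p[0]:
            p[0].discard(k)
            p[1] ^= recovered[k]

print(len(recovered) == K and [recovered[i] for i in range(K)] == source)   # usually True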

Page 56:

In practice…

Performance
• both encoding and decoding are in K log K (instead of K^2 and K^3)
• for large K > 10^4, the observed overhead E represents from 5% to 10%
• Raptor codes (Shokrollahi, 2003) do better : linear time complexity

Applications
• broadcast to many users :
  a fountain code adapts to the channel of each user ; no need to rebroadcast packets missed by some user
• storage on many unreliable devices :
  e.g. RAID (redundant array of inexpensive disks), data centers, peer-to-peer distributed storage

Page 57:

5. Distributed P2P storage

Page 58:

Principle

(figure: the raw data s1…sK is expanded into redundant data t1…tN ; each packet t1…tN is stored on a distinct storage device (disks, peers, …))

Idea = the raw data is split into packets s1…sK, expanded with some ECC into t1…tN ; each newly created packet is stored independently ; the original data is erased.

Problems
• disks can crash, peers can leave : eventual data loss
• the original data can be recovered if enough packets remain…
  … but missing packets need to be restored

Restoration
• perfect : the packet that is lost is exactly replaced
• functional : new packets are built, to preserve data recoverability
• intermediate : maintain the systematic part of the data

(figure: replacement packets t’2 … t’N are rebuilt on new peers)

Page 59:

Which codes ?

Target : one should rebuild the missing blocks… without first rebuilding the original data ! (that would require too much bandwidth)

Fountain/random codes :
• random linear combinations of the remaining blocks among t1…tn
• will not preserve the appropriate degree distribution

MDS codes : maximum distance separable codes
• can rebuild s1…sk from any subset of exactly k blocks in t1…tn
• example : Reed-Solomon codes

Page 60:

Example

(figure: k = 2 sets of α = 2 blocks, (a b) and (c d), are expanded into n = 4 sets of α blocks : (a b), (c d), (a+c b+d), (b+c a+b+d). To reconstruct the lost node (a b), β = 3 blocks are requested : d, b+d and a+b+d, from which b and then a are recovered.)

Page 61:

Example

(figure: same code, with the n = 4 nodes storing (a b), (c d), (a+c b+d), (b+c a+b+d) ; this time the repair is functional : β = 3 requested blocks, a, b+c and a+b+d, are combined into new blocks b+d and c+d, preserving recoverability without reproducing the lost node exactly)

Result (Dimakis et al., 2010) : for functional repair, given k, n and d ≥ k (the number of nodes contacted for a repair), network coding techniques allow one to optimally balance α (the number of blocks stored per node) and β (the bandwidth necessary for reconstruction).

Page 62:

6. Conclusion

Page 63:

A few lessons

Ralf Koetter* : “Communications aren’t anymore about transmitting a bit, but about transmitting evidence about a bit.”

(*) one of the inventors of Network Coding

Random structures spread information uniformly.

Information theory gives bounds on how much one can learn about some hidden information…

One does not have to build the actual protocols/codes that will reveal this information.

Page 64:

Management of distributed information… in other fields

Digital communications (network information theory)
- A, B : random variables, possibly correlated
- one wishes to compute at B the value f(A,B)
- how many bits should be exchanged ?
- how many communication rounds ?

Compressed sensing (signal processing)
- the signal can be described by sparse coefficients
- random (sub-Nyquist) sampling

Communications complexity (computer science)
- A, B : variables, taking values in a huge space
- how many bits should A send to B in order to check A = B ?
- solution by random coding
(figure: A sends n bits to B, which checks whether A = B)

Page 65:

thank you !