
Page 1: Introduction to information complexity

Introduction to information complexity

June 30, 2013

Mark Braverman, Princeton University

Page 2: Introduction to information complexity

Part I: Information theory

• Information theory, in its modern form, was introduced in the 1940s to study the problem of transmitting data over physical channels.

[Diagram: Alice transmits data to Bob over a communication channel.]

Page 3: Introduction to information complexity

Quantifying “information”

• Information is measured in bits.
• The basic notion is Shannon's entropy.
• The entropy of a random variable is the (typical) number of bits needed to remove the uncertainty of the variable.
• For a discrete variable $X$: $H(X) := \sum_x \Pr[X=x] \log \frac{1}{\Pr[X=x]}$.
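As a quick illustration of the definition above (not from the slides), here is a minimal Python sketch that computes the entropy of a discrete distribution given as a dictionary of probabilities:

```python
import math

def entropy(dist):
    """Shannon entropy, in bits, of a distribution {value: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

# A fair coin carries 1 bit of uncertainty; a uniform 8-sided die carries 3 bits.
print(entropy({"H": 0.5, "T": 0.5}))          # 1.0
print(entropy({i: 1 / 8 for i in range(8)}))  # 3.0
```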

Page 4: Introduction to information complexity

Shannon's entropy

• Important examples and properties:
  – If $X$ is a constant, then $H(X) = 0$.
  – If $X$ is uniform on a finite set $S$ of possible values, then $H(X) = \log |S|$.
  – If $X$ is supported on at most $k$ values, then $H(X) \le \log k$.
  – If $Y$ is a random variable determined by $X$, then $H(Y) \le H(X)$.

Page 5: Introduction to information complexity

Conditional entropy

• For two (potentially correlated) variables $X, Y$, the conditional entropy of $X$ given $Y$ is the amount of uncertainty left in $X$ given $Y$:
  $H(X \mid Y) := \mathbb{E}_{y \sim Y}\, H(X \mid Y = y)$.
• One can show $H(XY) = H(Y) + H(X \mid Y)$.
• This important fact is known as the chain rule.
• If $X \perp Y$ (independent), then $H(XY) = H(X) + H(Y)$.
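A small numerical check of the chain rule (a sketch, not from the slides; the joint distribution below is an arbitrary example of a uniform bit X observed through a 25%-noise channel as Y):

```python
import math
from collections import defaultdict

def H(dist):
    """Entropy, in bits, of a distribution {outcome: probability}."""
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def marginal(joint, index):
    m = defaultdict(float)
    for outcome, p in joint.items():
        m[outcome[index]] += p
    return m

def cond_entropy_X_given_Y(joint):
    """H(X | Y) for a joint distribution {(x, y): probability}."""
    by_y = defaultdict(dict)
    for (x, y), p in joint.items():
        by_y[y][x] = p
    total = 0.0
    for y, slice_y in by_y.items():
        py = sum(slice_y.values())
        total += py * H({x: p / py for x, p in slice_y.items()})
    return total

# X uniform bit; Y = X with probability 3/4, flipped with probability 1/4.
joint = {(0, 0): 3/8, (0, 1): 1/8, (1, 0): 1/8, (1, 1): 3/8}
print(H(joint))                                               # H(XY) ~ 1.811
print(H(marginal(joint, 1)) + cond_entropy_X_given_Y(joint))  # H(Y) + H(X|Y): same value
```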

Page 6: Introduction to information complexity

Example

• Let $X$ and $Y$ be built from the independent uniform bits $B_1, \dots, B_5$ shown on the next slide's diagram.
• Then the values of $H(X)$, $H(Y)$, $H(X \mid Y)$, and $H(Y \mid X)$ can be read off that diagram.

Page 7: Introduction to information complexity

Mutual information

[Venn diagram: two overlapping circles for $H(X)$ and $H(Y)$. The part of $H(X)$ outside the overlap is $H(X \mid Y)$, the part of $H(Y)$ outside the overlap is $H(Y \mid X)$, and the overlap is $I(X;Y)$. The regions are labeled with the bits $B_1$, $B_1 \oplus B_2$, $B_2 \oplus B_3$, $B_4$, $B_5$ from the example.]

Page 8: Introduction to information complexity

Mutual information

• The mutual information is defined as $I(X;Y) := H(Y) - H(Y \mid X) = H(X) - H(X \mid Y)$.
• "By how much does knowing $X$ reduce the entropy of $Y$?"
• Always non-negative: $I(X;Y) \ge 0$.
• Conditional mutual information: $I(X;Y \mid Z) := H(Y \mid Z) - H(Y \mid XZ)$.
• Chain rule for mutual information: $I(XY;Z) = I(X;Z) + I(Y;Z \mid X)$.
• Simple intuitive interpretation.
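A sketch (not from the slides) computing $I(X;Y) = H(X) + H(Y) - H(XY)$ directly from a joint distribution, reusing the entropy helper from above:

```python
import math
from collections import defaultdict

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def mutual_information(joint):
    """I(X;Y) = H(X) + H(Y) - H(XY) for a joint distribution {(x, y): probability}."""
    px, py = defaultdict(float), defaultdict(float)
    for (x, y), p in joint.items():
        px[x] += p
        py[y] += p
    return H(px) + H(py) - H(joint)

# The noisy-bit example again: observing Y removes about 0.19 bits of uncertainty about X.
joint = {(0, 0): 3/8, (0, 1): 1/8, (1, 0): 1/8, (1, 1): 3/8}
print(mutual_information(joint))  # ~0.189, and never negative
```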

Page 9: Introduction to information complexity

Information theory

• The reason information theory is so important for communication is that information-theoretic quantities readily operationalize.
• Can attach operational meaning to Shannon's entropy: $H(X)$ is "the cost of transmitting $X$".
• Let $C(X)$ be the (expected) cost of transmitting a sample of $X$.

Page 10: Introduction to information complexity

Is $C(X) = H(X)$?

• Not quite.
• Let $X$ be a uniform trit, $X \in \{1, 2, 3\}$, so $H(X) = \log 3 \approx 1.585$.
• Encoding $1 \to 0$, $2 \to 10$, $3 \to 11$ gives $C(X) = 5/3 \approx 1.67$.
• It is always the case that $C(X) \ge H(X)$.
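A sketch (not from the slides) comparing the expected length of this code with the entropy of the trit:

```python
import math

code = {1: "0", 2: "10", 3: "11"}   # the prefix code from the slide
dist = {1: 1/3, 2: 1/3, 3: 1/3}     # a uniform trit

C = sum(dist[x] * len(code[x]) for x in dist)          # expected code length
H = sum(p * math.log2(1 / p) for p in dist.values())   # entropy
print(C, H)  # 1.666... vs 1.585...: indeed C(X) >= H(X)
```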

Page 11: Introduction to information complexity

But $C(X)$ and $H(X)$ are close

• Huffman coding: $C(X) \le H(X) + 1$.
• This is a compression result: "an uninformative message turned into a short one".
• Therefore: $H(X) \le C(X) \le H(X) + 1$.
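A runnable sketch of Huffman's construction (not from the slides) that can be used to check the bound $H(X) \le C(X) \le H(X) + 1$ on small examples:

```python
import heapq, itertools, math

def huffman_code(dist):
    """Build a Huffman code for a distribution {symbol: probability}."""
    tiebreak = itertools.count()   # prevents heapq from ever comparing the code dicts
    heap = [(p, next(tiebreak), {s: ""}) for s, p in dist.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        p1, _, c1 = heapq.heappop(heap)   # merge the two least likely subtrees
        p2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c1.items()}
        merged.update({s: "1" + w for s, w in c2.items()})
        heapq.heappush(heap, (p1 + p2, next(tiebreak), merged))
    return heap[0][2]

dist = {"a": 0.5, "b": 0.25, "c": 0.125, "d": 0.125}
code = huffman_code(dist)
C = sum(dist[s] * len(w) for s, w in code.items())
H = sum(p * math.log2(1 / p) for p in dist.values())
print(code, C, H)   # here C = H = 1.75 exactly, since all probabilities are powers of 1/2
```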

Page 12: Introduction to information complexity

Shannon's noiseless coding

• The cost of communicating many copies of $X$ scales as $H(X)$.
• Shannon's source coding theorem:
  – Let $C(X^n)$ be the cost of transmitting $n$ independent copies of $X$. Then the amortized transmission cost is $\lim_{n \to \infty} C(X^n)/n = H(X)$.
• This equation gives $H(X)$ operational meaning.
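For the uniform trit the amortization is easy to see concretely: a block of $n$ trits has $3^n$ equally likely values, so $\lceil n \log 3 \rceil$ bits suffice and the per-copy cost tends to $\log 3$. A tiny sketch (not from the slides):

```python
import math

H = math.log2(3)                       # entropy of one uniform trit
for n in (1, 10, 100, 1000):
    per_copy = math.ceil(n * H) / n    # bits per trit when block-coding n trits at once
    print(n, per_copy)                 # approaches 1.58496... = log2(3)
```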

Page 13: Introduction to information complexity

$H(X)$ operationalized

[Diagram: Alice holds $X_1, \dots, X_n, \dots$ and sends them to Bob over a communication channel, using $\approx H(X)$ bits per copy to transmit the $X_i$'s.]

Page 14: Introduction to information complexity

$H(X)$ is nicer than $C(X)$

• $H$ is additive for independent variables: $H(X_1 X_2 \dots X_n) = \sum_i H(X_i)$.
• Let $T_1, \dots, T_n$ be independent trits.
• $H(T_1 T_2 \dots T_n) = n \log 3$.
• Works well with concepts such as channel capacity.

Page 15: Introduction to information complexity

Operationalizing other quantities

• Conditional entropy $H(X \mid Y)$ (cf. the Slepian-Wolf theorem).

[Diagram: Alice holds $X_1, \dots, X_n, \dots$ and Bob holds the correlated $Y_1, \dots, Y_n, \dots$; Alice can transmit her $X_i$'s over the channel using $\approx H(X \mid Y)$ bits per copy.]

Page 16: Introduction to information complexity

Operationalizing other quantities

• Mutual information $I(X;Y)$:

[Diagram: Alice holds $X_1, \dots, X_n, \dots$ and Bob holds $Y_1, \dots, Y_n, \dots$; $\approx I(X;Y)$ bits per copy over the channel suffice to sample the $Y_i$'s correlated with the $X_i$'s.]

Page 17: Introduction to information complexity

Information theory and entropy

• Allows us to formalize intuitive notions.
• Operationalized in the context of one-way transmission and related problems.
• Has nice properties (additivity, chain rule, …).
• Next, we discuss extensions to more interesting communication scenarios.

Page 18: Introduction to information complexity

Communication complexity

• Focus on the two-party randomized setting.

[Diagram: Alice (A) with input $X$ and Bob (B) with input $Y$ share randomness $R$; A & B implement a functionality $F(X,Y)$.]

Page 19: Introduction to information complexity

Communication complexity

[Diagram: Alice (A) with input $X$ and Bob (B) with input $Y$, with shared randomness $R$. Goal: implement a functionality $F(X,Y)$. A protocol computing $F(X,Y)$ exchanges messages $m_1(X,R)$, $m_2(Y,m_1,R)$, $m_3(X,m_1,m_2,R)$, ...]

• Communication cost = # of bits exchanged.
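A minimal sketch (not from the slides) of what such a randomized protocol looks like in code, for the equality function: the shared randomness picks random parity checks, Alice sends the check bits, Bob replies with the verdict. The function names and the choice of two hashes are illustrative assumptions.

```python
import random

def equality_protocol(x: bytes, y: bytes, n_hashes: int = 2, seed: int = 0):
    """Alice sends n_hashes random-parity bits of X; Bob replies with the verdict."""
    shared = random.Random(seed)                       # the shared randomness R
    width = 8 * max(len(x), len(y))
    masks = [shared.getrandbits(width) for _ in range(n_hashes)]

    def parity(msg: bytes, mask: int) -> int:
        return bin(int.from_bytes(msg, "big") & mask).count("1") % 2

    m1 = [parity(x, m) for m in masks]                 # Alice -> Bob: n_hashes bits
    m2 = (m1 == [parity(y, m) for m in masks])         # Bob -> Alice: 1 bit
    return m2, n_hashes + 1                            # verdict, bits exchanged

print(equality_protocol(b"hello", b"hello"))  # (True, 3)
print(equality_protocol(b"hello", b"world"))  # almost always (False, 3); errs w.p. ~2^-n_hashes
```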

Page 20: Introduction to information complexity

Communication complexity

• Numerous applications/potential applications.
• Considerably more difficult to obtain lower bounds than for transmission (still much easier than for other models of computation!).

Page 21: Introduction to information complexity

Communication complexity

• (Distributional) communication complexity with input distribution $\mu$ and error $\varepsilon$: $D^{\mu}_{\varepsilon}(F)$, the least cost of a protocol whose error w.r.t. $\mu$ is at most $\varepsilon$.
• (Randomized/worst-case) communication complexity: $R_{\varepsilon}(F)$, error at most $\varepsilon$ on all inputs.
• Yao's minimax: $R_{\varepsilon}(F) = \max_{\mu} D^{\mu}_{\varepsilon}(F)$.

Page 22: Introduction to information complexity

Examples

• Equality: $EQ(X,Y) := 1_{X=Y}$.
• .

Page 23: Introduction to information complexity

Equality

• $F$ is $EQ(X,Y) := 1_{X=Y}$.
• $\mu$ is a distribution where w.p. $1/2$ $X = Y$, and w.p. $1/2$ $(X,Y)$ are random.

[Protocol diagram: Alice sends MD5(X) [128 bits]; Bob replies "X = Y?" [1 bit]. What is the error?]

• Shows that $D^{\mu}_{\varepsilon}(EQ) \le 129$ bits.
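The pictured protocol is easy to transcribe into code (a sketch, not part of the slides):

```python
import hashlib

def md5_equality_protocol(x: bytes, y: bytes):
    """Alice sends MD5(X) (128 bits); Bob answers whether it matches MD5(Y) (1 bit)."""
    alice_msg = hashlib.md5(x).digest()                  # 16 bytes = 128 bits, Alice -> Bob
    bob_answer = (hashlib.md5(y).digest() == alice_msg)  # 1 bit, Bob -> Alice
    return bob_answer, 128 + 1                           # verdict, total bits exchanged

print(md5_equality_protocol(b"same", b"same"))       # (True, 129)
print(md5_equality_protocol(b"same", b"different"))  # (False, 129), barring an MD5 collision
```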

Page 24: Introduction to information complexity

Examples

• I.
• .
• In fact, using information complexity:
• .

Page 25: Introduction to information complexity

Information complexity

• Information complexity :: communication complexity
  as
  Shannon's entropy :: transmission cost.

Page 26: Introduction to information complexity

Information complexity

• The smallest amount of information Alice and Bob need to exchange to solve $F$.
• How is information measured?
• Communication cost of a protocol?
  – Number of bits exchanged.
• Information cost of a protocol?
  – Amount of information revealed.

Page 27: Introduction to information complexity

Basic definition 1: The information cost of a protocol

• Prior distribution: $(X, Y) \sim \mu$.

[Diagram: Alice (A) with $X$ and Bob (B) with $Y$ run a protocol $\pi$, producing the protocol transcript $\Pi$.]

$IC(\pi, \mu) := I(\Pi; Y \mid X) + I(\Pi; X \mid Y)$
= (what Alice learns about $Y$) + (what Bob learns about $X$)
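A sketch (not from the slides) of how this definition can be evaluated numerically. The protocol here is an assumption chosen only for illustration: the trivial protocol in which Alice sends $X$, so the transcript equals $X$.

```python
import math
from collections import defaultdict

def H(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

def cmi(joint, a, b, c):
    """I(A;B|C) = H(AC) + H(BC) - H(ABC) - H(C) from a joint distribution {tuple: prob};
       a, b, c are tuples of coordinate indices."""
    def project(idx):
        m = defaultdict(float)
        for outcome, p in joint.items():
            m[tuple(outcome[i] for i in idx)] += p
        return m
    return H(project(a + c)) + H(project(b + c)) - H(project(a + b + c)) - H(project(c))

# Prior mu: uniform bits X, Y that agree with probability 3/4.
mu = {(0, 0): 3/8, (0, 1): 1/8, (1, 0): 1/8, (1, 1): 3/8}
# Trivial protocol: Alice sends X, so the transcript Pi is X itself.
joint = {(x, y, x): p for (x, y), p in mu.items()}     # outcomes are (X, Y, Pi)

X, Y, Pi = (0,), (1,), (2,)
ic = cmi(joint, Pi, Y, X) + cmi(joint, Pi, X, Y)
print(ic)  # 0 + H(X|Y) ~ 0.811: Bob learns X in full, Alice learns nothing new about Y
```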

Page 28: Introduction to information complexity

Example

• $F$ is $EQ(X,Y) := 1_{X=Y}$.
• $\mu$ is a distribution where w.p. $1/2$ $X = Y$, and w.p. $1/2$ $(X,Y)$ are random.

[Protocol diagram: Alice sends MD5(X) [128 bits]; Bob replies "X = Y?" [1 bit].]

Information cost $\approx$ 1 + 65 = 66 bits
= (what Alice learns about Y) + (what Bob learns about X)

Page 29: Introduction to information complexity

Prior matters a lot for information cost!

• If $\mu$ is a singleton (both inputs are fixed and known), then $IC(\pi, \mu) = 0$ for every protocol $\pi$.

Page 30: Introduction to information complexity

Example

• $F$ is $EQ(X,Y) := 1_{X=Y}$.
• $\mu$ is a distribution where $(X,Y)$ are just uniformly random.

[Protocol diagram: Alice sends MD5(X) [128 bits]; Bob replies "X = Y?" [1 bit].]

Information cost $\approx$ 0 + 128 = 128 bits
= (what Alice learns about Y) + (what Bob learns about X)

Page 31: Introduction to information complexity

Basic definition 2: Information complexity

• Communication complexity: $D^{\mu}_{\varepsilon}(F) = \min_{\pi\ \text{computes}\ F\ \text{with error}\ \le\ \varepsilon\ \text{over}\ \mu} \mathrm{CC}(\pi)$.
• Analogously: $IC_{\mu}(F, \varepsilon) = \inf_{\pi\ \text{computes}\ F\ \text{with error}\ \le\ \varepsilon\ \text{over}\ \mu} IC(\pi, \mu)$.

(The infimum, rather than a minimum, is needed here!)

Page 32: Introduction to information complexity

Prior-free information complexity

• Using minimax, we can get rid of the prior.
• For communication, we had: $R_{\varepsilon}(F) = \max_{\mu} D^{\mu}_{\varepsilon}(F)$.
• For information: $IC(F, \varepsilon) := \max_{\mu} IC_{\mu}(F, \varepsilon)$.

Page 33: Introduction to information complexity

Operationalizing IC: Information equals amortized communication

• Recall [Shannon]: $H(X) = \lim_{n \to \infty} C(X^n)/n$.
• Turns out [B.-Rao'11]: $IC_{\mu}(F, \varepsilon) = \lim_{n \to \infty} D^{\mu^n}_{\varepsilon}(F^n)/n$, for $\varepsilon > 0$. [Error $\varepsilon$ allowed on each copy.]
• For $\varepsilon = 0$: $IC_{\mu}(F, 0) = \lim_{n \to \infty} D^{\mu^n}_{0^{+}}(F^n)/n$.
• [$\lim_{n \to \infty} D^{\mu^n}_{0}(F^n)/n$ is an interesting open problem.]

Page 34: Introduction to information complexity

Entropy vs. Information Complexity

                  | Entropy                          | IC
Additive?         | Yes                              | Yes
Operationalized   | $H(X) = \lim_n C(X^n)/n$         | $IC_{\mu}(F,\varepsilon) = \lim_n D^{\mu^n}_{\varepsilon}(F^n)/n$
Compression?      | Huffman: $C(X) \le H(X) + 1$     | ???!

Page 35: Introduction to information complexity

Can interactive communication be compressed?

• Is it true that $D^{\mu}_{\varepsilon}(F) = O(IC_{\mu}(F, \varepsilon))$?
• Less ambitiously: is $D^{\mu}_{\varepsilon}(F)$ bounded by some reasonable function of $IC_{\mu}(F, \varepsilon)$?
• (Almost) equivalently: given a protocol $\pi$ with $IC(\pi, \mu) = I$, can Alice and Bob simulate $\pi$ using $\approx I$ bits of communication?
• Not known in general…

Page 36: Introduction to information complexity

Applications

• Information = amortized communication means that to understand the amortized communication cost of a problem, it is enough to understand its information complexity.

Page 37: Introduction to information complexity

Example: the disjointness function

• $S$, $T$ are subsets of $\{1, \dots, n\}$.
• Alice gets $S$, Bob gets $T$.
• Need to determine whether $S \cap T = \emptyset$.
• In binary notation (characteristic vectors $X, Y \in \{0,1\}^n$) we need to compute $\mathrm{Disj}(X,Y) = \neg \bigvee_{i=1}^{n} (X_i \wedge Y_i)$.
• An OR operator applied to $n$ copies of the 2-bit AND function.
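In code, the reduction to $n$ copies of AND is immediate (a sketch, not from the slides):

```python
def disjointness(x, y):
    """DISJ on characteristic 0/1 vectors: an OR (here: any) over n copies of 2-bit AND."""
    return not any(xi and yi for xi, yi in zip(x, y))

# S = {0, 2} and T = {1, 3} over a universe of size 4:
print(disjointness([1, 0, 1, 0], [0, 1, 0, 1]))  # True  (the sets are disjoint)
print(disjointness([1, 0, 1, 0], [0, 0, 1, 1]))  # False (both sets contain element 2)
```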

Page 38: Introduction to information complexity

Set intersection

• $S$, $T$ are subsets of $\{1, \dots, n\}$.
• Alice gets $S$, Bob gets $T$.
• Want to compute $S \cap T$.
• This is just $n$ copies of the 2-bit AND.
• Understanding the information complexity of AND gives tight bounds on both problems!

Page 39: Introduction to information complexity

Exact communication bounds [B.-Garg-Pankratov-Weinstein'13]

• $R_{\varepsilon}(\mathrm{Disj}_n) \le n + 1$ (trivial).
• $R_{\varepsilon}(\mathrm{Disj}_n) = \Omega(n)$ [Kalyanasundaram-Schnitger'87, Razborov'92].
New:
• $R_{\varepsilon}(\mathrm{Disj}_n) = C_{\mathrm{DISJ}} \cdot n \pm o(n)$, where $C_{\mathrm{DISJ}} \approx 0.4827$.

Page 40: Introduction to information complexity

Small set disjointness

• $S$, $T$ are subsets of $\{1, \dots, n\}$, with $|S|, |T| \le k$.
• Alice gets $S$, Bob gets $T$.
• Need to determine whether $S \cap T = \emptyset$.
• Trivial: $O(k \log n)$.
• [Hastad-Wigderson'07]: $O(k)$.
• [BGPW'13]: $\frac{2}{\ln 2}\, k \pm o(k) \approx 2.885\, k$.

Page 41: Introduction to information complexity

Open problem: Computability of IC

• Given the truth table of $F(X,Y)$, $\varepsilon$, and $\mu$, compute $IC_{\mu}(F, \varepsilon)$.
• Via $IC_{\mu}(F, \varepsilon) = \lim_{n \to \infty} D^{\mu^n}_{\varepsilon}(F^n)/n$ one can compute a sequence of upper bounds.
• But the rate of convergence as a function of $n$ is unknown.

Page 42: Introduction to information complexity

Open problem: Computability of IC

• Can compute the $r$-round information complexity of $F$, $IC^{r}_{\mu}(F, \varepsilon)$.
• But the rate of convergence as a function of $r$ is unknown.
• Conjecture: $IC^{r}_{\mu}(F, \varepsilon) \approx IC_{\mu}(F, \varepsilon) + O_{F}(1/r^{2})$.
• This is the relationship for the two-bit AND.

Page 43: Introduction to information complexity


Thank You!