semantic information theory in 20 minutes


TRANSCRIPT

Page 1: Semantic information theory in 20 minutes

Towards a Theory of Semantic Communication

Jie Bao, Prithwish Basu, Mike Dean, Craig Partridge, Ananthram Swami, Will Leland and Jim Hendler

RPI, Raytheon BBN, and ARL

IEEE Network Science Workshop 2011,

West Point, June 23rd, 2011

Page 2: Semantic information theory in 20 minutes

Outline

• Background
• A general semantic communication model
• Semantic data compression (source coding)
• Semantic reliable communication (channel coding)
• Path ahead

Page 3: Semantic information theory in 20 minutes

Shannon, 1948

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning;... These semantic aspects of communication are irrelevant to the engineering problem.”

Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379–423, 623–656, 1948.

[Figure: Shannon's communication diagram, with "message" at each end and "signal" on the channel.]

Page 4: Semantic information theory in 20 minutes

However, are these just bits?

• Movie streams
• Software code
• DNA sequences
• Emails
• Tweets
• …

“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; ...” “These semantic aspects of communication are irrelevant to the engineering problem”?

Page 5: Semantic information theory in 20 minutes

Our Contributions

• We develop a generic model of semantic communication, extending the classic model-theoretical work of Carnap and Bar-Hillel (1952);

• We discuss the role of semantics in reducing source redundancy, and potential approaches for lossless and lossy semantic data compression;

• We define the notions of semantic noise and semantic channel, and obtain the semantic capacity of a channel.

Page 6: Semantic information theory in 20 minutes

From IT to SIT

[Figure: side-by-side comparison. Left, (classical) information theory: the Shannon model (1948), message → signal → signal → message. Right, semantic information theory: the same Shannon model wrapped in a semantic channel carrying expressed messages (e.g., commands and reports).]

Page 7: Semantic information theory in 20 minutes

A 3-level Model (adapted from Weaver)

[Figure: three nested levels. Level A (technical): source → transmitter → physical channel → receiver → destination, carrying the technical message under technical noise. Level B (semantic): a semantic transmitter and semantic receiver exchange the intended and expressed messages under semantic noise, drawing on shared knowledge and each side's local knowledge. Level C (effectiveness): context, utility, trust, etc.]

Page 8: Semantic information theory in 20 minutes

A Semantic Communication Model

[Figure: both sender and receiver observe the world. The sender comprises a world model (Ws), background knowledge (Ks), an inference procedure (Is), and a message generator (Ms); the receiver mirrors this with Wr, Kr, Ir, and a message interpreter (Mr). Messages {m}, drawn from message syntax M, flow from sender to receiver, with possible feedback.]

Page 9: Semantic information theory in 20 minutes

Semantic Sources

• Key: a semantic source tells something that is “true”
  – Engineering bits are neither true nor false!

• Goals: 1) more correctness (sent as “true” → received as “true”); 2) less ambiguity

Page 10: Semantic information theory in 20 minutes

Which message is more “surprising”?

Rex is not a tyrannosaurus

Rex is not a dog

Page 11: Semantic information theory in 20 minutes

Semantics of Messages

• If a message (an expression) is more commonly true, it contains less semantic information
  – inf(Sunny & Cold) > inf(Cold)
  – inf(Cold) > inf(Cold or Warm)

Shannon information: how often a message appears.
Semantic information: how often a message is true.

Page 12: Semantic information theory in 20 minutes

Semantics of Messages

• Carnap & Bar-Hillel (1952), “An outline of a theory of semantic information”:

  m(exp) = |mod(exp)| / |all models|

  inf(exp) = −log₂ m(exp)

• Example
  – m(A ∨ B) = 3/4, m(A ∧ B) = 1/4
  – inf(A ∨ B) ≈ 0.415, inf(A ∧ B) = 2
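These two numbers are easy to reproduce mechanically. Below is a minimal Python sketch (not the authors' calculator linked on the next slide) that enumerates the four models over the atoms A and B and evaluates m(exp) and inf(exp):

```python
# Minimal sketch of the Carnap & Bar-Hillel measure over two
# propositional atoms; all four models are assumed equally likely.
from itertools import product
from math import log2

ATOMS = ["A", "B"]
MODELS = [dict(zip(ATOMS, vals))
          for vals in product([True, False], repeat=len(ATOMS))]

def m(expr):
    """Logical probability: the fraction of models in which expr holds."""
    return sum(1 for world in MODELS if expr(world)) / len(MODELS)

def inf(expr):
    """Semantic information content: inf(exp) = -log2 m(exp)."""
    return -log2(m(expr))

a_or_b = lambda w: w["A"] or w["B"]
a_and_b = lambda w: w["A"] and w["B"]

print(m(a_or_b), round(inf(a_or_b), 3))   # 0.75 0.415
print(m(a_and_b), inf(a_and_b))           # 0.25 2.0
```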

Page 13: Semantic information theory in 20 minutes

Knowledge Entropy

• Extending Carnap & Bar-Hillel (1952)
  – Models have a probability distribution
  – Background knowledge may be present

Example (seven equally likely day-models): m(Weekend) = 2/7, m(Saturday) = 1/7
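Plugging these into inf(exp) = −log₂ m(exp) from the previous slide:

\[
\mathrm{inf}(\text{Weekend}) = -\log_2 \tfrac{2}{7} \approx 1.81 \text{ bits},
\qquad
\mathrm{inf}(\text{Saturday}) = -\log_2 \tfrac{1}{7} \approx 2.81 \text{ bits},
\]

so the more specific message carries exactly one extra bit.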

Page 14: Semantic information theory in 20 minutes

Semantic Information Calculator

• http://www.cs.rpi.edu/~baojie/sit/index.php


Page 15: Semantic information theory in 20 minutes

• Semantic Information and Coding
  – Data compression (source coding)
  – Reliable communication (channel coding)

Page 16: Semantic information theory in 20 minutes

Compression with Shared Knowledge

• Background knowledge (A → B), when shared, helps compress the source
  – Side information in the form of entailment
  – Not addressed by classical information theory
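Concretely: if both ends share the rule A → B, a sender holding {A, B} need only transmit A, and the receiver restores B by entailment. A minimal Python sketch (the rule table and the set-of-facts encoding are illustrative assumptions, not the paper's construction):

```python
# Minimal sketch of entailment as shared side information; the rule
# table and the set-of-facts encoding are illustrative assumptions.
SHARED_RULES = {"A": {"B"}}  # both ends know: A entails B

def encode(facts):
    """Drop every fact the receiver can re-derive from shared rules."""
    entailed = set()
    for fact in facts:
        entailed |= SHARED_RULES.get(fact, set())
    return facts - entailed

def decode(received):
    """Re-derive the entailed facts (one derivation step suffices here)."""
    restored = set(received)
    for fact in received:
        restored |= SHARED_RULES.get(fact, set())
    return restored

message = {"A", "B"}
sent = encode(message)          # {'A'}: "B" never goes over the wire
assert decode(sent) == message  # losslessly restored at the receiver
```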

Page 17: Semantic information theory in 20 minutes

Lossless Message Compression

• Intuition: compress by removing synonyms
  – “pig” = “swine”
  – a → (a∧b) ∨ (b∧c) ≡ a → b

• Theorem: there is a semantically lossless code for source X with message entropy H ≥ H(Xeq)
  – Xeq are the equivalence classes of X

• Other lossless compression strategies may exist
  – e.g., by using semantic ambiguity
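The bound is easy to check on a toy source. The sketch below (synonym table and probabilities are assumed for illustration) collapses “pig”/“swine” into one equivalence class and compares H(X) with H(Xeq):

```python
# Minimal sketch: collapse semantically equivalent messages ("pig" =
# "swine") into one class; the probabilities are assumed for illustration.
from math import log2

p = {"pig": 0.25, "swine": 0.25, "dog": 0.5}          # message distribution
canon = {"pig": "pig", "swine": "pig", "dog": "dog"}  # equivalence classes

def H(dist):
    """Shannon entropy in bits."""
    return -sum(q * log2(q) for q in dist.values() if q > 0)

p_eq = {}
for msg, q in p.items():
    p_eq[canon[msg]] = p_eq.get(canon[msg], 0.0) + q

print(H(p))     # 1.5 bits over raw messages
print(H(p_eq))  # 1.0 bit over equivalence classes: the code's lower bound
```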

Page 18: Semantic information theory in 20 minutes

Other Source Coding Strategies

• Lossless model compression
  – e.g., using minimal models

• Lossy message compression
  – Sometimes, semantic loss is intentional compression
    • “How about having lunch at 1pm? See you soon!” => “lunch@1? C U”
    • Textual description of an image

• Ongoing work

Page 19: Semantic information theory in 20 minutes

Semantic Errors and Noises

Examples:
• From engineering noise: “copy machine” or “coffee machine”?
• Semantic mismatch: the source and receiver use different background knowledge or inference
• Lost in translation: the word “uncle” in English has no exact correspondence in Chinese

Note:
• Not all syntactical errors are semantic!
• Nor are all semantic errors syntactical.

[Figure: W (model) → X (sent message) → noise → Y (received message) → W′ (model); goal: W′ ⊨ x.]

Page 20: Semantic information theory in 20 minutes

Semantic Channel Coding Theorem

• Theorem: if the transmission rate is smaller than Cs (the semantic channel capacity), then a semantic-error-free coding scheme exists

[Figure: the channel W (model) → X (sent message) → Y (received message) → W′ (model), with goal W′ ⊨ x, annotated with three terms: the engineering channel capacity, the encoder's semantic ambiguity, and the decoder's “smartness”.]
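Reading those three labels against the channel, the capacity plausibly takes the form below, reconstructed from the linked tech report; treat the exact expression as an assumption rather than a quote from the slide. Here I(X;Y) is the engineering term, H(W|X) the encoder's semantic ambiguity, and the average inference entropy the decoder's “smartness”:

\[
C_s \;=\; \sup_{P(X \mid W)} \Big\{\, I(X;Y) \;-\; H(W \mid X) \;+\; \overline{H_S(Y)} \,\Big\}
\]

Ambiguity at the encoder (many candidate messages per model) costs capacity, while a smarter decoder (more inference per received message) recovers some of it.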

Page 21: Semantic information theory in 20 minutes

Path Ahead

• Extensions
  – First-order languages [probabilistic logics]
  – Inconsistent KBs (misinformation) [paraconsistent logics]
  – Lossy source coding [clustering and similarity measurement]
  – Semantic mismatches

• Applications
  – Semantic compression for RDF/OWL
  – Semantic retrieval, e.g., extending TF*IDF

Page 22: Semantic information theory in 20 minutes

Questions?

Image courtesy: http://www.addletters.com/pictures/bart-simpson-generator/900788.htm

• Slides: http://www.slideshare.net/baojie_iowa
• Tech report: http://www.cs.rpi.edu/~baojie/pub/2011-03-28_nsw_tr.pdf
• Contact: [email protected]

Page 23: Semantic information theory in 20 minutes

backup


Page 24: Semantic information theory in 20 minutes

Measuring Semantic Information

• Statistical approach: inference may change the distribution of symbols, hence the entropy of the source.

• Model-theoretical approach (our approach): the less “likely” a message is to be true, the more information it contains.

• Algorithmic approach: what is the minimal program needed to describe messages and their deductions?

• Situation-theoretical approach: measuring the divergence of messages from “truth”.

Page 25: Semantic information theory in 20 minutes

Shannon: Information = “surpriseness”

H(tyrannosaurus) > H(dog)

Captured from: http://www.wordcount.org/main.php

Page 26: Semantic information theory in 20 minutes

Model Semantics

• tyrannosaurus
• dog

Page 27: Semantic information theory in 20 minutes

Conditional Knowledge Entropy

• When there is background knowledge, the set of possible worlds decreases.
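Reusing the seven equally likely day-models from the knowledge-entropy slide, learning “it is a weekend” cuts the possible worlds from seven to two:

\[
H(\text{day}) = \log_2 7 \approx 2.81 \text{ bits}
\quad\longrightarrow\quad
H(\text{day} \mid \text{weekend}) = \log_2 2 = 1 \text{ bit}.
\]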

Page 28: Semantic information theory in 20 minutes

Semantic Noise and Channel Coding

[Figure: the “Xerox” scenario as a noisy semantic channel W → X → Y → W′ (model → sent message → received message → model): the intended “copy machine” is expressed as “Xerox”, the word is transmitted faithfully (probability 1.0), but the received “Xerox” is interpreted as “copy machine” with probability 0.9 and as “coffee machine” with probability 0.1.]

Scenario developed based on reports in http://english.visitkorea.or.kr/enu/AK/AK_EN_1_6_8_5.jsp and  http://blog.cleveland.com/metro/2011/03/identifying_photocopy_machine.html
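With the figure's numbers, the interpretation step alone introduces measurable uncertainty; a quick Python sketch (reading the 0.9/0.1 split as the decoder's interpretation distribution is my assumption):

```python
# Quick sketch: entropy of the receiver's interpretation of the received
# word "Xerox", using the 0.9/0.1 split shown in the figure.
from math import log2

interp = {"copy machine": 0.9, "coffee machine": 0.1}
ambiguity = -sum(p * log2(p) for p in interp.values())
print(f"{ambiguity:.3f} bits")  # ~0.469 bits of semantic ambiguity
```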

Page 29: Semantic information theory in 20 minutes

Compressing by Semantic Ambiguity

[Figure: two weather-report sources. (a) Sunny 0.5, Light Rain 0.25, Heavy Rain 0.25; H(X) = 1.5. (b) With the ambiguous status message “Rain” added: Sunny 0.5, Rain 0.2, Light Rain 0.2, Heavy Rain 0.1; H(X′) = 1.76.]
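Both entropies check out numerically; a quick sketch (my pairing of the figure's probabilities with its labels is an assumption, though it reproduces both printed values):

```python
# Quick check of the two entropies printed in the figure.
from math import log2

def H(probs):
    """Shannon entropy in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

panel_a = [0.5, 0.25, 0.25]     # Sunny, Light Rain, Heavy Rain
panel_b = [0.5, 0.2, 0.2, 0.1]  # Sunny, Rain, Light Rain, Heavy Rain

print(H(panel_a))            # 1.5
print(round(H(panel_b), 2))  # 1.76
```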