Semantic Information Theory in 20 Minutes
TRANSCRIPT
Towards a Theory of Semantic Communication
Jie Bao, Prithwish Basu, Mike Dean, Craig Partridge, Ananthram Swami, Will Leland and Jim Hendler
RPI, Raytheon BBN, and ARL
IEEE Network Science Workshop 2011, West Point, June 23rd, 2011
Outline
• Background
• A general semantic communication model
• Semantic data compression (source coding)
• Semantic reliable communication (channel coding)
• Path ahead
Shannon, 1948
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning;... These semantic aspects of communication are irrelevant to the engineering problem.”
Claude E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27:379-423, 623-656, 1948.
[Figure: Shannon's model. A message is encoded as a signal, transmitted, and decoded back into a message.]
However, are these just bits?
• Movie streams
• Software code
• DNA sequences
• Emails
• Tweets
• ...
“The fundamental problem of communication is that of reproducing at one point either exactly or approximately a message selected at another point. Frequently the messages have meaning; ...”
“These semantic aspects of communication are irrelevant to the engineering problem”?
Our Contributions
• We develop a generic model of semantic communication, extending the classic model-theoretical work of Carnap and Bar-Hillel (1952);
• We discuss the role of semantics in reducing source redundancy, and potential approaches for lossless and lossy semantic data compression;
• We define the notions of semantic noise and semantic channel, and obtain the semantic capacity of a channel.
From IT to SIT
[Figure: from (classical) information theory to semantic information theory. The Shannon model (message -> signal -> signal -> message) is extended with a semantic channel carrying expressed messages (e.g., commands and reports).]
A 3-level Model (adapted from Weaver)
[Figure: Level A (Technical): source -> semantic transmitter -> transmitter -> physical channel (technical noise) -> receiver -> semantic receiver -> destination, carrying the technical message. Level B (Semantic): the semantic transmitter turns the intended message into an expressed message and the semantic receiver interprets it, subject to semantic noise; both sides draw on shared knowledge plus their own local knowledge. Level C (Effectiveness): context, utility, trust, etc.]
A Semantic Communication Model
[Figure: the sender comprises a world model Ws, background knowledge Ks, an inference procedure Is, and a message generator Ms; the receiver mirrors this with Wr, Kr, Ir, and a message interpreter Mr. Both make observations of the world. Messages {m}, drawn from a message syntax M, flow from sender to receiver, possibly with feedback.]
Semantic Sources
• Key: a semantic source tells something that is “true”
– Engineering bits are neither true nor false!
• Goals: 1) more correctness (sent as “true” -> received as “true”); 2) less ambiguity
Which message is more “surprising”?
Rex is not a tyrannosaurus
Rex is not a dog
Semantics of Messages
• If a message (an expression) is more commonly true, it contains less semantic information:
– inf(Sunny & Cold) > inf(Cold)
– inf(Cold) > inf(Cold or Warm)
Shannon information: how often a message appears.
Semantic information: how often a message is true.
Semantics of Messages
• Carnap & Bar-Hillel (1952) - “An outline of a theory of semantic information”
m(exp) = |mod(exp)| / |all models|
inf(exp) = -log2 m(exp)
• Example:
– m(A v B) = 3/4, m(A ^ B) = 1/4
– inf(A v B) = 0.415, inf(A ^ B) = 2
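As a sanity check on these definitions, here is a minimal model-counting sketch (my own Python illustration, not the authors' calculator), which enumerates truth assignments to compute m(exp) and inf(exp):

    from itertools import product
    from math import log2

    def m(expr, atoms):
        # m(exp): fraction of all truth assignments (models) in which exp holds
        worlds = list(product([False, True], repeat=len(atoms)))
        hits = sum(1 for w in worlds if expr(dict(zip(atoms, w))))
        return hits / len(worlds)

    def inf(expr, atoms):
        # inf(exp) = -log2 m(exp)
        return -log2(m(expr, atoms))

    atoms = ["A", "B"]
    print(inf(lambda v: v["A"] or v["B"], atoms))   # A v B: m = 3/4, inf ~ 0.415
    print(inf(lambda v: v["A"] and v["B"], atoms))  # A ^ B: m = 1/4, inf = 2.0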
Knowledge Entropy
• Extending Carnap & Bar-Hillel (1952):
– Models have a distribution
– Background knowledge may be present
• Example: with a uniform distribution over the days of the week, m(Weekend) = 2/7 and m(Saturday) = 1/7
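One way to reproduce the Weekend/Saturday numbers: weight each possible world (here, a day of the week) by a probability and sum the weights of the worlds where the expression holds. The uniform distribution below is my reading of the example:

    from math import log2

    # Possible worlds: the seven days, assumed uniformly distributed
    days = {d: 1/7 for d in ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]}

    def m(expr):
        # Total probability mass of the worlds in which expr holds
        return sum(p for d, p in days.items() if expr(d))

    weekend = lambda d: d in ("Sat", "Sun")
    saturday = lambda d: d == "Sat"

    print(m(weekend), -log2(m(weekend)))    # 2/7, inf ~ 1.81 bits
    print(m(saturday), -log2(m(saturday)))  # 1/7, inf ~ 2.81 bits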
Semantic Information Calculator
• http://www.cs.rpi.edu/~baojie/sit/index.php
• Semantic Information and Coding
– Data compression (source coding)
– Reliable communication (channel coding)
Compression with Shared Knowledge
• Background knowledge (A -> B), when shared, helps compress the source (a toy sketch follows):
– Side information in the form of entailment
– Not addressed by classical information theory
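A toy sketch of the idea (the set-of-atoms message format and the single rule are my own assumptions): when both sides share the rule A -> B, any conjunct the receiver can re-derive need not be sent.

    # Shared background knowledge: the atom(s) on the left entail the atom on the right
    RULES = {("A",): "B"}

    def compress(msg):
        # Drop atoms the receiver can re-derive from the shared rules
        entailed = {b for a, b in RULES.items() if set(a) <= msg}
        return msg - entailed

    def decompress(msg):
        # Close the received message under the shared rules (one pass suffices here)
        closed = set(msg)
        for a, b in RULES.items():
            if set(a) <= closed:
                closed.add(b)
        return closed

    sent = compress({"A", "B"})    # {'A'}: "B" is entailed, so it is omitted
    print(sent, decompress(sent))  # {'A'} {'A', 'B'}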
Lossless Message Compression
• Intuition: compress by removing synonyms
– “pig” = “swine”
– a -> (a ^ b) v (b ^ c) is logically equivalent to a -> b
• Theorem: there is a semantically lossless code for source X with message entropy H >= H(Xeq), where Xeq comprises the equivalence classes of X (a numeric sketch follows)
• Other lossless compression strategies may exist
– e.g., by using semantic ambiguity
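Numerically, merging semantically equivalent messages can only lower the entropy. A quick sketch with made-up frequencies for the pig/swine example:

    from math import log2
    from collections import defaultdict

    def H(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    # Hypothetical source frequencies; "pig" and "swine" are synonyms
    X = {"pig": 0.3, "swine": 0.2, "cow": 0.5}
    eq_class = {"pig": "PIG", "swine": "PIG", "cow": "COW"}

    Xeq = defaultdict(float)
    for word, p in X.items():
        Xeq[eq_class[word]] += p

    print(H(X))    # ~ 1.485 bits
    print(H(Xeq))  # 1.0 bit, so H(X) >= H(Xeq)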
Other Source Coding Strategies
• Lossless model compression
– e.g., using minimal models
• Lossy message compression
– Sometimes, semantic loss is intentional compression:
– “How about having lunch at 1pm? See you soon!” => “lunch@1? C U”
– Textual description of an image
• Ongoing work
Semantic Errors and Noise
Examples:
• From engineering noise: “copy machine” or “coffee machine”?
• Semantic mismatch: the source/receiver use different background knowledge or inference
• Lost in translation: the word “uncle” in English has no exact correspondence in Chinese
Note:
• Not all syntactical errors are semantic!
• Nor are all semantic errors syntactical.
[Figure: the semantic channel: W (model) -> X (sent message) -> noisy channel -> Y (received message) -> W’ (model). Goal: W’ |= x.]
Semantic Channel Coding Theorem
• Theorem: if the transmission rate is smaller than Cs (the semantic channel capacity), semantic-error-free coding exists
Cs = sup over P(X|W) of { I(X;Y) - H(W|X) + avg Hs(Y) }
– I(X;Y): the engineering channel capacity term
– H(W|X): the encoder's semantic ambiguity
– Hs(Y): the decoder's “smartness” (its inference ability), averaged over received messages
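To make the encoder's semantic ambiguity term concrete, here is a sketch that computes H(W|X) for a toy joint distribution over world models W and sent messages X; the distribution (echoing the Xerox example) and variable names are illustrative assumptions, not from the talk:

    from math import log2

    def H(ps):
        # Shannon entropy of a list of probabilities
        return -sum(p * log2(p) for p in ps if p > 0)

    # Toy joint distribution P(W, X): "Xerox" ambiguously expresses two models
    P = {("copy machine", "Xerox"): 0.45, ("copy machine", "copier"): 0.45,
         ("coffee machine", "Xerox"): 0.05, ("coffee machine", "brewer"): 0.05}

    worlds = {w for w, x in P}
    msgs = {x for w, x in P}
    P_X = {x: sum(p for (w2, x2), p in P.items() if x2 == x) for x in msgs}

    # Encoder's semantic ambiguity: H(W|X) = sum_x P(x) * H(W | X = x)
    H_W_given_X = sum(P_X[x] * H([P.get((w, x), 0) / P_X[x] for w in worlds])
                      for x in msgs)
    print(round(H_W_given_X, 3))  # ~ 0.234 bits, penalizing the ambiguous "Xerox"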
Path Ahead
• Extensions
– First-order languages [probabilistic logics]
– Inconsistent KBs (misinformation) [paraconsistent logics]
– Lossy source coding [clustering and similarity measurement]
– Semantic mismatches
• Applications
– Semantic compression for RDF/OWL
– Semantic retrieval, e.g., extending TF*IDF
Questions?
Image courtesy: http://www.addletters.com/pictures/bart-simpson-generator/900788.htm
• Slides: http://www.slideshare.net/baojie_iowa
• Tech report: http://www.cs.rpi.edu/~baojie/pub/2011-03-28_nsw_tr.pdf
• Contact: [email protected]
Backup
Measuring Semantic Information
• Statistical approach: Inference may change the distribution of symbols, hence the entropy of the source.
• Model-theoretical approach: The less “likely” a message is to be true, the more information it contains.
• Algorithmic approach: What’s the minimal program needed to describe messages and their deductions?
• Situation-theoretical approach: measuring the divergence of messages from “truth”.
(Our approach: the model-theoretical one.)
Shannon: Information = “surpriseness”
H(tyrannosaurus) > H(dog)
[Figure: word-frequency ranking captured from http://www.wordcount.org/main.php]
Model Semantics
[Figure: the possible models of “tyrannosaurus” vs. those of “dog”.]
Conditional Knowledge Entropy
• When there is background knowledge, the set of possible worlds decreases.
Semantic Noise and Channel Coding
[Figure: the Xerox scenario as a semantic channel W -> X -> Y -> W’. The sender's model “copy machine” is expressed as “Xerox”; the channel delivers “Xerox” intact (probability 1.0); the receiver interprets it as “copy machine” with probability 0.9 and as “coffee machine” with probability 0.1.]
Scenario developed based on reports in http://english.visitkorea.or.kr/enu/AK/AK_EN_1_6_8_5.jsp and http://blog.cleveland.com/metro/2011/03/identifying_photocopy_machine.html
Compressing by Semantic Ambiguity
[Figure: two weather sources. X ranges over {Sunny: 0.5, Light Rain: 0.25, Heavy Rain: 0.25} with H(X) = 1.5 bits; X’ additionally uses the ambiguous report “Rain”, ranging over {Sunny: 0.5, Rain: 0.2, Light Rain: 0.2, Heavy Rain: 0.1} with H(X’) = 1.76 bits.]
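The two entropies in the figure check out directly; a quick verification sketch:

    from math import log2

    def H(dist):
        return -sum(p * log2(p) for p in dist.values() if p > 0)

    X  = {"Sunny": 0.5, "Light Rain": 0.25, "Heavy Rain": 0.25}
    Xp = {"Sunny": 0.5, "Rain": 0.2, "Light Rain": 0.2, "Heavy Rain": 0.1}

    print(H(X))   # 1.5 bits
    print(H(Xp))  # ~ 1.76 bits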