Lecture 1: Semantic Analysis in Language Technology


MARINA SANTINI

PROGRAM: COMPUTATIONAL LINGUISTICS AND LANGUAGE TECHNOLOGY

DEPT OF LINGUISTICS AND PHILOLOGY

UPPSALA UNIVERSITY, SWEDEN

12 NOV 2013

Semantic Analysis in Language Technology

Lecture 1: Introduction

Course Website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm


Acknowledgements

Thanks to Mats Dahllöf for the many slides I borrowed from his previous course and for structuring such interesting and comprehensive content.


INTENDED LEARNING OUTCOMES

ASSIGNMENTS AND EXAMINATION

READING LIST

DEMOS

Practical Information


Course Website & Contact Details

Course website: http://stp.lingfil.uu.se/~santinim/sais/sais_fall2013.htm

Contact details: [email protected] [email protected] [email protected]


Check the website regularly and make sure to refresh the page: we are building up this course together, so this page will be continuously updated!


About the Course

Introduction to Semantics in Language Technology and NLP.

Focus on methods used in Language Technology and NLP to perform the following tasks:

Sentiment Analysis (SA)
Information Extraction (IE)
Word Sense Disambiguation (WSD)
Predicate-Argument Structure (PAS)


Intended Learning Outcomes

In order to pass the course, a student must be able to:

describe systems that perform the following tasks, apply them to authentic linguistic data, and evaluate the results:

1. detect and extract attitudes and opinions from text, i.e. Sentiment Analysis (SA);

2. use semantic analysis in the context of Information Extraction (IE);

3. disambiguate instances of polysemous lemmas, i.e. Word Sense Disambiguation (WSD);

4. use robust methods to extract the Predicate-Argument Structure (PAS).


Compulsory Readings

1. Bing Liu (2012) Sentiment Analysis and Opinion Mining, Morgan & Claypool.

2. Richard Johansson and Pierre Nugues. 2008. Dependency-based Syntactic–Semantic Analysis with PropBank and NomBank, CoNLL 2008: Proceedings of the 12th Conference on Computational Natural Language Learning.

3. Daniel Jurafsky and James H. Martin (2009), Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition. Second Edition, Pearson Education.

4. Daniel Gildea and Daniel Jurafsky. 2002. Automatic Labeling of Semantic Roles, Computational Linguistics 28:3, 245-288.

5. M Palmer, D Gildea, P Kingsbury. 2005. The proposition bank: An annotated corpus of semantic roles, Computational Linguistics 31 (1), 71-106.

6. Additional suggested readings will be listed at the end of each lecture.


Demos & Tutorials

This list will be continuously updated, also with your contributions…


Assignments and Examination

Four Assignments:

1. Essay writing: independent study of a system, an approach, or a field within semantics-oriented language technology. The study will be presented both as a written essay and an oral presentation. The essay work will also include a feedback step where the work of another group is reviewed.
2. Assignment on Predicate-Argument Structure (PAS)
3. Assignment on Sentiment Analysis (SA)
4. Assignment on Word Sense Disambiguation (WSD)

General Info: No lab sessions; supervision by email. Essay and assignments must be submitted to [email protected]

Examination: A written report is submitted for each assignment. All four assignments are necessary to pass the course. Grade G will be given to students who pass each assignment; Grade VG to those who pass the essay assignment and at least one of the other ones with distinction.


IMPORTANT!

Start thinking about a topic you are interested in for your essay writing assignment!


Practical Organization

45min + 15 min break

Lectures on Course webpage and SlideShare

Email all your questions to me: [email protected]

IMPORTANT: Send me an email to [email protected], so I make sure that I have all the correct email addresses. If you do not get an acknowledgement of receipt, please give me a shout!


Interaction and Cooperation

Communicate with me and with your classmates to exchange ideas if you have problems understanding notions, concepts, or practical implementations.

Recommendation: share your knowledge with your peers and blow off steam.

Cheating is not permitted


SEMANTICS IN LANGUAGE TECHNOLOGY

APPLICATIONS

LEXICAL SEMANTICS

REPRESENTATION OF MEANING

SUMMARY

Semantics in Language Technology - Overview


Semantics in Language Technology


Logic and Semantics

Aristotelian logic – important ever since. Syllogisms, e.g.:
Premise: No reptiles have fur.
Premise: All snakes are reptiles.
Conclusion: No snakes have fur.

Modern logic develops, late 19th Century – more general and systematic.

Formal semantics in linguistics and philosophy based on logic (20th Century).
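The syllogism above can be written in first-order logic (an illustrative sketch; the predicate names are mine):

```latex
\forall x\,\big(\mathit{Reptile}(x) \rightarrow \neg \mathit{HasFur}(x)\big) \\
\forall x\,\big(\mathit{Snake}(x) \rightarrow \mathit{Reptile}(x)\big) \\
\therefore\ \forall x\,\big(\mathit{Snake}(x) \rightarrow \neg \mathit{HasFur}(x)\big)
```

The conclusion follows by chaining the two implications, which is exactly the kind of inference a logic-based semantic component can automate.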


Formal and Computational Semantics

Computational semantics “is the study of how to automate the process of constructing and reasoning with meaning representations of natural language expressions.” (Wikipedia).

Early systems rule-based, most famous example: “Montague grammar” (1970). Sophisticated mechanisms for translation of English into a very rich logic.

Language technology: Recent interest in data-driven and machine learning-based methods.


Semantics in NLP

NLP semantics is typically more limited in scope than NL semantics as analysed in linguistics and philosophy.

NLP applications often handle semantic aspects without having explicitly semantic components, e.g. in machine translation.

Other aspects of language – morphology, syntax, etc. – can be seen as support systems for semantics: The purpose of language lies in the use of expressions as carriers of semantic meaning. And that is what many NLP systems have to respect, e.g. MT, retrieval, classification, etc.


Semantics and Truth (i)

Semantics, meanings and states of affairs:

What a sentence means: a structure involving (lexical) concepts and relations among them. Can be articulated as a semantic representation.

E.g. I ate a turkey sandwich, in predicate logic:
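One common event-style (neo-Davidsonian) rendering, as an illustrative sketch (the predicate and role names are mine):

```latex
\exists e\, \exists x\, \big( \mathit{Eating}(e) \,\land\, \mathit{Agent}(e, \mathit{Speaker}) \,\land\, \mathit{TurkeySandwich}(x) \,\land\, \mathit{Theme}(e, x) \big)
```

Here the eating event and the sandwich are existentially quantified individuals, and the thematic roles link them to the predicate.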

A sentence and the semantic representation of a sentence is also the representation of a possible state of affairs.


Semantics and Truth (ii)

Correspondence theory of truth: if the content of a sentence corresponds to an actual state of affairs, it is true; otherwise, it is false.

Ignoring philosophical complications, in many cases we can extract knowledge from texts.

E.g. Warmer climate entails increased release of carbon dioxide by inland lakes. (From a uu.se press release.)

Related issue: Which texts should we trust?

Many sentences are difficult to formalize in logic. (Modality, conditionality, vague quantification, tense, etc.)


Representation of Meaning


Formalizing Meaning

Linguistic content has – at least to a certain degree – a logical structure that can be formalized by means of logical calculi – meaning representations.

The representation languages should be simple and unambiguous – in contrast to complex and ambiguous NL.

Logical calculi come with accounts of logical inference. They are useful for reasoning-based applications.

Meaning formalization faces far-reaching conceptual and computational difficulties.


Compositionality

Linguistic content is compositional: simple expressions have a given (lexical) meaning; the meaning of complex expressions is determined by the meanings of their constituents. People produce and understand new phrases and sentences all the time. (NLP must also deal with these.)

Compositionality is studied in detail in compositional syntax-driven semantics. Work in this field is typically about hand-coded rule systems for small fragments of NL.


Compositional Aspects


Compositional Aspects – Argument Structure


Discourse-Related Aspects


Compositional semantics in Language Technology


First-Order Predicate Logic (i)

“flexible, well-understood, and computationally tractable approach to the representation of knowledge [and] meaning” (J&M. 2009: 589)

expressive
verifiability against a knowledge base (related to database languages)
inference
model-theoretic semantics


First-Order Predicate Logic (ii)

Boolean operators: negation and connectives
Existential/universal quantification
Individual constants
Predicates (taking a number of arguments)


When to assume compositionality?


Multi-Word Expressions

MWEs (a.k.a. multiword units or MUs) are lexical units encompassing a wide range of linguistic phenomena, such as idioms (e.g. kick the bucket = to die), collocations (e.g. cream tea = a small meal eaten in Britain, with small cakes and tea), regular compounds (cosmetic surgery), graphically unstable compounds (e.g. self-contained <> self contained <> selfcontained – all graphical variants have a huge number of hits in Google), light verbs (e.g. do a revision vs. revise), lexical bundles (e.g. in my opinion), etc. While easily mastered by native speakers, the correct interpretation of MWEs remains challenging both for non-native speakers and for language technology (LT), due to their complex and often unpredictable nature.


Cross-linguality
Use Case: Information Access

In multi-ethnic societies, like the Swedish society, it is common that many non-native speakers use public websites – e.g. the Arbetsförmedlingen or Pensionsmyndigheten websites – to access information that is vital to their living and integration in the host country. National regulations are often accompanied by special terminology and new coinages. For instance, the Swedish expression /egenremiss/ (14,900 hits, Google.se April 2013) – or alternatively, as an MWE, /egen remiss/ (8,210 hits, Google.se April 2013) – denotes a referral to a specialist doctor written by patients themselves. This expression is made up from two common Swedish words, /egen/ 'own (adj)' and /remiss/ 'referral'. It is a recent expression (probably coined around 2010) and not yet recorded in any official dictionary, nor in Wiktionary or other multilingual online lexical resources. However, it is very frequent in query logs belonging to a Swedish public health service website. When trying to implement a cross-lingual search based on the automatic translation of query logs, it turned out that none of the existing multilingual lexical resources contained this expression.


Use Case: Personal Use & Text Understanding

The use of expressions that are marked for style, genre, domain, or register (and/or other textual categories), or of expressions which are misspelled or idiomatic for some textual category, is beyond the competence of a novice reader or a non-native speaker. Additionally, in a web search or in social networks, one cannot tell if the texts one reads are good or bad the way a first-language reader can. When readers/users read a language they do not know at all, they can use automatic translation, online dictionaries, or other lexical resources. However, what they cannot determine well is the *type* of text they are reading. They cannot tell if the text is verbose, terse, formal, informal, stupid, funny, bad, or good.

For instance, the phrase "es ist zum Kotzen" ('it makes you puke') signals vernacular, unrefined text, and is itself a controversial expression. The phrase "isch alle" ('I'm worn out', Berlin dialect), instead, signals that this line in the text is spoken by a Berliner.


Semantics vs Pragmatics/Discourse (i)

What does a word, a phrase, a text segment mean as an NL expression? (“Linguistic meaning” – semantics.) Conventional, static, systemic aspect of meaning.

What does the author intend to convey by means of a word, a phrase, a text segment? (“Speaker meaning” – pragmatics/discourse.)

Contextual, dynamic aspect of meaning.

The two aspects depend on each other, of course.


Semantics vs Pragmatics/Discourse (ii)


Semantics vs Pragmatics/Discourse (iii)


Applications


Semantics-oriented NLP applications

Machine translation: The translation of a text segment should mean the same as the original (to emphasize linguistic meaning) or should convey the same content (to emphasize speaker meaning).

Information extraction is to extract components of the information conveyed by a text.

Question answering is extraction – combined with inference – of an answer to a given question.

Text classification, in typical cases, relates to the meanings of the texts being classified.


Semantics and Generation

Generation: semantic representation → NL. Less challenging than analysis – the structure of the input is under control. Needed in e.g. dialogue systems.

Interlingua – semantic representation in machine translation:
Analysis: source language → interlingua.
Generation: interlingua → target language.
Would be economical if many languages are involved. The idea has not proved very successful so far.


Reference

Reference is very important – what statements are about.

Referring expressions are very common.
Reference is a discourse phenomenon.
Resolving reference is a crucial step in e.g.:
extraction, e.g. in sentiment analysis
translation, e.g. to get agreement right (English it vs French il/elle vs Swedish den/det).


Reference – An Example


Kinds of Referring Expressions

Indefinite noun phrases. E.g. a book. Introduce new entities.

Pronouns. E.g. he. Typically coreferent with a previous referring expression (antecedent).

Names. E.g. Bill Gates.
Demonstratives. E.g. this room.
Other definite noun phrases. E.g. the first chapter. Reference to a somehow known entity, often previously mentioned.


Named Entity Recognition (NER)

To identify expressions being used as names. (What characterizes a “name”?)

Also to identify what kind of name it is: E.g. of a person, or a place, or a stretch of time, or a chemical compound, or a gene, etc.

“State-of-the-art NER systems for English produce near-human performance. For example, the best system entering MUC-7 scored 93.39% of F-measure while human annotators scored 97.60% and 96.95%” (Wikipedia).
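The F-measure quoted above is the harmonic mean of precision and recall; a minimal sketch (the function name is mine):

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# A system with, say, precision 0.96 and recall 0.91 lands near the
# MUC-7 top score cited above:
print(round(f_measure(0.96, 0.91), 4))  # → 0.9343
```

Note that a high F-measure requires both precision and recall to be high: the harmonic mean is dragged down by whichever of the two is lower.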


Anaphora and Deixis Resolution

Pronouns (they), pronominal adverbs (there, then), and definite NP’s refer to entities by means of contextually given information.

E.g. by referring to previously mentioned referents – anaphora.

E.g. by reference based on the participants, time, and place of the discourse – deixis (e.g. I, you, here, yesterday).

Anaphora and deixis resolution is a much more challenging task than NER. The reference of name-like graph words is much more predictable. Compare Barack Obama and he.


Sentiment Analysis – an extraction task

What views do people express in blogs and reviews? That's interesting for politicians and marketing people.

Opinions are often expressed in a personal and informal way. E.g. Peter bought me a Baileys marzipan chocolate thing which I washed down with Gluehwein and that, in combination with the bright lights and cheery faces really made me feel warm inside! (From a blog post.)

Sentiment analysis: to extract the referent of a "sentiment" and the polarity positive–negative associated with it. E.g. Baileys marzipan chocolate – positive.
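A toy lexicon-based sketch of polarity scoring (the lexicon and function name are mine, not from the course; real systems also handle negation, intensifiers, and the target of the sentiment):

```python
# Tiny polarity lexicon (illustrative only): +1 positive, -1 negative.
POLARITY = {"warm": 1, "cheery": 1, "bright": 1, "cold": -1, "awful": -1}

def sentence_polarity(sentence: str) -> int:
    """Sum of word polarities: > 0 positive, < 0 negative, 0 neutral."""
    score = 0
    for word in sentence.lower().split():
        score += POLARITY.get(word.strip(".,!?"), 0)
    return score

print(sentence_polarity("The bright lights and cheery faces made me feel warm inside!"))  # → 3
```

Even this crude word-counting baseline illustrates why resolving what the sentiment refers to (the Baileys chocolate, not the lights) is the genuinely hard, extraction-like part of the task.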


Lexical Semantics


Lexical Concepts

Words are often grammatically simple, but carry a structured conceptual content. Definitions "unpack" the content of concepts:
friend – a person whom one knows well, is loyal to, etc.
turkey – a kind of animal, a bird, etc.
sandwich – a kind of food item, contains bread, etc.
eat – a relation (holding in/of an event) between an organism and a food item; the food is chewed and ingested, etc.


Lexical Concepts - Decomposition


Lexical Concepts – Relations (i)


Lexical Concepts – Relations (ii)


Synonymy

Synonymy holds between two words (word tokens) which express the same or similar concepts.

Unsupervised detection of synonymy can be based on the Distributional Hypothesis: "words with similar distributions have similar meanings" – i.e. the theory that words that occur in the same contexts tend to have similar meanings. The underlying idea that "a word is characterized by the company it keeps" was popularized by Firth.

“Random Indexing” is a method here. (“a high-dimensional model can be projected into a space of lower dimensionality without compromising distance metrics if the resulting dimensions are chosen appropriately”)

Synonymy knowledge useful in e.g. translation, text classification, and information extraction. Also “query expansion” in retrieval.
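The distributional idea can be illustrated with cosine similarity over co-occurrence count vectors (the toy counts and words are mine; Random Indexing would first project such vectors into a lower-dimensional space):

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length count vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Toy co-occurrence counts with the context words (drink, road, red):
vectors = {"wine": [5, 0, 3], "beer": [4, 0, 1], "car": [0, 6, 2]}

# Words with similar distributions (wine/beer) score higher than
# words with dissimilar distributions (wine/car):
print(cosine(vectors["wine"], vectors["beer"]) > cosine(vectors["wine"], vectors["car"]))  # → True
```

In a real setting the vectors are built from large corpora and have thousands of context dimensions, which is precisely why dimensionality-reduction methods like Random Indexing matter.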


Lexical Ambiguity


Lexical Ambiguity - WSD


Word Ambiguity: Homography vs Polysemy (i)


Word Ambiguity: Homography vs Polysemy (ii)


Word Senses

Discerning word senses (for a lemma) – a lexicographical task, a matter of sophisticated linguistic judgements.
Theoretical principles. Practical purpose.
Different dictionaries make different analyses.
English: WordNet – a standard resource.


Senses of day in WordNet, for instance (i)


Senses of day in WordNet, for instance (ii)


Word Sense Disambiguation (WSD)

A distributional hypothesis for WSD: words representing the same sense have more similar distributions than words representing different senses.

I.e. distribution similarity implies sense similarity.

We can use this for supervised learning of WSD. This requires data in the form of a sense-tagged corpus (based on a given sense inventory, e.g. the one given by WordNet).
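A minimal supervised-WSD sketch over a hand-made sense-tagged toy corpus (the data, sense labels, and function names are mine): each sense is profiled by the context words seen with it, and a new context is assigned the sense with the most word overlap.

```python
from collections import Counter

# Toy sense-tagged corpus for the lemma "bank" (illustrative only).
TAGGED = [
    ("money deposit account loan", "bank/finance"),
    ("cash teller account", "bank/finance"),
    ("river water shore fishing", "bank/river"),
    ("muddy river grass", "bank/river"),
]

def train(tagged):
    """Collect, per sense, counts of the context words it occurs with."""
    profiles = {}
    for context, sense in tagged:
        profiles.setdefault(sense, Counter()).update(context.split())
    return profiles

def disambiguate(profiles, context):
    """Pick the sense whose profile overlaps most with the context words."""
    words = context.split()
    return max(profiles, key=lambda s: sum(profiles[s][w] for w in words))

profiles = train(TAGGED)
print(disambiguate(profiles, "opened an account at the bank"))  # → bank/finance
```

Real systems replace raw overlap counts with a trained classifier (e.g. Naive Bayes or an SVM) over richer context features, but the supervised setup is the same: sense-tagged examples in, a sense predictor out.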


Manual Sense-Tagging

More difficult than typical grammatical tagging.
As we saw in the day example, senses and their distinctions can be quite subtle. Definitions and examples are often far from obvious.

Expensive: requires competent people and standardised procedures.

Quality measure: inter-annotator agreement. E.g. "Cohen's kappa coefficient is a statistical measure of inter-rater agreement or inter-annotator agreement for qualitative (categorical) items. It is generally thought to be a more robust measure than simple percent agreement calculation, since κ takes into account the agreement occurring by chance."
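Cohen's kappa can be computed directly from two annotators' label sequences (a sketch; the function name and toy labels are mine):

```python
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    """kappa = (p_o - p_e) / (1 - p_e): observed vs chance agreement."""
    n = len(labels_a)
    # Observed agreement: fraction of items both annotators labelled alike.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (p_o - p_e) / (1 - p_e)

a = ["s1", "s1", "s2", "s2", "s1"]  # annotator A's sense tags
b = ["s1", "s1", "s2", "s1", "s1"]  # annotator B's sense tags
print(round(cohens_kappa(a, b), 3))  # → 0.545
```

Here the raw percent agreement is 80%, but kappa is only about 0.55 because much of that agreement could arise by chance given the skewed label distribution.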


Summary


Conclusions (i)

Logic-based semantics is a theoretical foundation for NLP semantics, but implemented systems are typically more coarse-grained and of a more limited scope.

Meaning depends both on literal content and contextual information. This is a challenge for most NLP tasks.

Most NLP applications have to be highly sensitive to semantics.


Conclusions (ii)

Finding and interpreting names and other referential expressions is a central issue for NLP semantics.

Disambiguation of polysemous lexical tokens is also a central issue for NLP semantics.

Accessing the content of lexical tokens is also useful.

Meaning representation involves predicate-argument structure, which captures a basic aspect of NL compositionality.


Start thinking about a topic of interest for your essay writing! Tell me your thoughts next time…


Suggested Readings

Term Logic (Wikipedia)

Predicate Logic (Wikipedia)

Jurafsky and Martin (2009): Ch. 17 "Representation of Meaning", Ch. 18 "Computational Semantics", Ch. 19 "Lexical Semantics", Ch. 20 "Computational Lexical Semantics"

Clark et al. (2010): Ch 15 ”Computational Semantics”

Indurkhya and Damerau (2010): Ch 5 ”Semantic Analysis”


This is the end… Thanks for your attention !