lecture 5 - opinion analysis...institut mines-télécom social data and opinion analysis 5...
TRANSCRIPT
Institut Mines-Télécom
Lecture 5 - Opinion
Analysis
Chloé Clavel
Institut Mines-Télécom
Introduction
30/01/2019 Modèle de présentation Télécom ParisTech2
Institut Mines-Télécom
Introduction
Analyse de
données
socialesLes données textuelles 3
Different terminologies
• Opinion extraction, opinion mining, sentiment analysis,
subjectivity analysis, affect sensing, emotion detection
Applications
• Social Network analysis
• Huma-agent interaction : ex: chatbot
Institut Mines-Télécom
Social data and opinion analysis
4
Social data:
• Expressions of the citizens on the web
Context :
• Renewal of opportunities for criticism and action via the
Internet
Lecture : « La démocratie
Internet »
Dominique Cardon
Institut Mines-Télécom
Social data and opinion analysis
5
Challenges
• Analysis of societal trends
• Analysis of citizens' opinions on candidates in elections
• Review of movie reviews (movie reviews)
• Analysis of the opinions of Internet users on a product
• Analysis of the e-reputation of a brand, a product
• Identify target clients / referral systems
• Evaluate the success of communication campaign
Institut Mines-Télécom
Social Data and opinion analysis
6
Disciplines :
• Sociology:
─ qualitative / manual / sociological analysis of small
corpora selected to form a panel of studies
• Computer science :
─ development of automatic large corpus analysis methods
Institut Mines-Télécom
Human-agent/robot interaction
Robotics and artificial agents• Analyze and reproduce human behaviors to interact socially with humans.
Animated conversational agents,
• Robots & "emotional avatar"
7
[GRETA Platform,
Pelachaud]
[Softbank robotics]
Institut Mines-Télécom
Human-agent interaction
Virtual assistant
Text mining et Opinion mining8
https://www.youtube.com/watch?v=TaY9zt_qx_c
Institut Mines-Télécom
Human-machine interaction
Analyse de données socialesLes données textuelles 9
Kirobo: the Japanese robot who left 18 months in
space to keep an astronaut company
Institut Mines-Télécom
Interaction humain-agent: LiveChat et
relation client
10
Institut Mines-Télécom
Human-robot interaction
Berenson robot at Quai Branly
•
"Visitors were invited to observe and interact with
Berenson's behavior, helping to define the criteria for
aesthetic appreciation of this amateur art robot. "
11
Institut Mines-Télécom
Outline of the lecture
Terminology and theoretical models
Knowledge-based methods for opinion analysis
ML methods for opinion analysis
• How to obtain labelled data?
• Opinion features
30/01/2019 Modèle de présentation Télécom ParisTech12
Institut Mines-Télécom
Terminology and theoretical
models
30/01/2019 Modèle de présentation Télécom ParisTech13
Institut Mines-Télécom
Terminology issue
Sentiment/opinion-related phenomena
• Emotion, opinion, sentiment, mood, attitude,
interpersonal stance, personality traits, affect,
judgment, appreciation, argumentation, engagement
30/01/2019 Modèle de présentation Télécom ParisTech14
Institut Mines-Télécom
Terminology issue
Modalities of expression
• Verbal content, prosody, Gesture, Posture, Facial
expressions, physiological signal
30/01/2019 Modèle de présentation Télécom ParisTech15
Institut Mines-Télécom
Terminology issue
Scherer's definitions [Scherer, 2005]
• Emotion: short phenomenon, physiological reaction, appraisal of a major event (stimulus)
• Mood: diffuse non-caused low-intensity long-duration change in subjective feeling
• Interpersonal stances: affective stance toward another person in a specific interaction
• Attitudes: enduring, affectively colored beliefs, dispositions towards objects or persons
• Personality traits: stable personality dispositions and typical behavior tendencies
PRACTICE : link the following terms to the most relevant phenomenon
• liking, gloomy, contemptuous, jealous, sad
30/01/2019 Modèle de présentation Télécom ParisTech16
Institut Mines-Télécom
Terminology issue
Munezero's definitions [MUN14]
• Affect : précédant les émotions avant la prise de
conscience de leur ressenti
• les émotions diffèrent des sentiments de par leur durée
(les émotions sont des phénomènes plus brefs) et de
par la présence d’une cible (les émotions ne sont pas
toujours reliées à un objet);
• les opinions sont des interprétations personnelles
d’une information et ne sont pas nécessairement
chargées émotionnellement comme les sentiments.
30/01/2019 Modèle de présentation Télécom ParisTech17
Institut Mines-Télécom
Terminology and applications
Challenges:
• choose the relevant phenomenon according to the
application and the data [Clavel and Callejas, 2016]
Examples
• Detect when the student is frustrated or bored in e-
learning system or detect when the user is upset in a
human-machine dialogue system => emotion
30/01/2019 Modèle de présentation Télécom ParisTech18
Institut Mines-Télécom
Terminology and applications
Examples
• Detect depressed persons at home in robot companion
systems => mood
• Detect extrovert personality in job interview data =>
personality traits
30/01/2019 Modèle de présentation Télécom ParisTech19
Institut Mines-Télécom
Theoretical models and formalization
Use theoretical models to formalize the structure of
an opinion/appraisal expression
• Example : Appraisal theory from systemic functional
linguistics [Martin and White, 2005]
─ An appraisal expression is a source that evaluates a
target -> 3 components.
30/01/2019 Modèle de présentation Télécom ParisTech20
Institut Mines-Télécom
Theoretical models
Properties of the model
• Identify the opinion target
• Distinguish the evaluative stances :
• Affect : personal reaction referring to an emotional state
o ‘I feel highly unfriendly attitude towards me’
• Judgment : assigning qualities - e.g. tenacity - to individuals
according to normative principles
o ‘The shop assistant’s behavior was really unfriendly’
• Appreciation : Evaluation of an object - e.g. a product or a
process
o ‘Plastic bags are environment unfriendly’
30/01/2019 Modèle de présentation Télécom ParisTech21
Institut Mines-Télécom
Standardization
Well defined for emotions
• Emotion : Emotion Markup Language
• http://www.w3.org/TR/2014/REC-emotionml-20140522/
Still little about opinions and sentiments
• Opinion and sentiments
http://www.w3.org/community/sentiment/ : Linked Data
• Models for Emotion and Sentiment Analysis
Community Group
30/01/2019 Modèle de présentation Télécom ParisTech22
Institut Mines-Télécom
Opinion detection methods -
Challenges
30/01/2019 Modèle de présentation Télécom ParisTech23
Institut Mines-Télécom
Challenges :
“This film should be brilliant. It sounds like a great plot,
the actors are first grade, and the supporting cast is
good as well, and Stallone is attempting to deliver a
good performance. However, it can’t hold up.”
« Well as usual Keanu Reeves is nothing special, but
surprisingly, the very talented Laurence Fishbourne is
not so good either, I was surprised. »
24
EXO: Is the review positive or negative? Highlight the expressions corresponding to
the expression of an opinion. Do they seem positive or negative in general?
Institut Mines-Télécom
Challenges
“This film should be brilliant. It sounds like a
great plot, the actors are first grade, and the
supporting cast is good as well, and Stallone is
attempting to deliver a good performance.
However, it can’t hold up.”
Well as usual Keanu Reeves is nothing special,
but surprisingly, the very talented Laurence
Fishbourne is not so good either, I was
surprised.
25
Institut Mines-Télécom
Challenges
more complex than a simple positive vs. negative
word counts.
• conditional tense
• discourse markers
• negation processing (I don't like this movie)
• modifiers and intensifiers (the plot is not very good)
• dealing with metaphors (global warming vs. climate
change [Ahmad et al., 2011])
30/01/2019 Modèle de présentation Télécom ParisTech26
Institut Mines-Télécom
Détection d’opinions : enjeux et difficultés
Les données textuelles 27
Identification de la cible de l’opinion
─ « Je suis satisfait des contacts que j’ai eus avec le service
client mais pas des tarifs pratiqués »
─ Concepts détectés
• Opinion : satisfaction
• Thématiques: contact et prix
─ Enjeu : pouvoir détecter automatiquement ce sur quoi
porte l’opinion
─ Résolution d’anaphore : “il les adore”
Institut Mines-Télécom
Knowledge-based methods
for opinion detection
30/01/2019 Modèle de présentation Télécom ParisTech28
Institut Mines-Télécom
1st type of method: Keyword Detection
Analyse de
données
socialesLes données textuelles 29
Keyword spotting: the most naive approach but
also the most accessible
• The text is classified in the category of opinions
corresponding to the presence of words clearly
associated with an opinion or an emotion
• « je suis content » => positive
Limits :
• Do not treat negation
─ « je ne suis pas content » => positive
• Ignore words that are implicitly positive or negative
─ « le réchauffement climatique » (global warming)
Institut Mines-Télécom
2nd type of method:
Analyse de
données
socialesLes données textuelles 30
Lexical affinity
• Assign to the different words a probability of belonging
to a category of opinion or emotion (« probabilistic
affinity »)
─ Ex : « réchauffement » is considered the negative class
with a probability of 75%
─ ‘accident’ 75% probability of being indicating a negative
effect, as in ‘car accident’ or ‘hurt by accident’ but not in “I
met him by accident” [Cambria & White (2014)]
• These probabilities are learned on annotated corpora
Institut Mines-Télécom
2nd type of method:
Analyse de
données
socialesLes données textuelles 31
Limits :
• Operates at the level of the word and not at the level of the sentence
─ does not deal with negation or the semantic context
─ Ex tiré de [Moilanen 2007]
« The senators supporting(+) the leader(+) failed(-) to praise(+) his hopeless(-) HIV(-) prevention program.”
• The probabilities learned depend strongly on the corpus used for learning the probabilities and thus on its domain
─ Ex : “the power of the graphism” (for video games)
─ Vs. “the contract power” (for electricity)
Institut Mines-Télécom
Lexicon of opinions in English
• SentiWordNet http://sentiwordnet.isti.cnr.it/
─ Relies on Wordnet: lexical database
• Principle : synsets
• English Version: http://wordnetweb.princeton.edu/perl/webwn
• French Version: Wordnet Libre du Français (WOLF) :
http://alpage.inria.fr/~sagot/wolf.html
32
Institut Mines-Télécom
Lexicon of opinions in English
• SentiWordNet http://sentiwordnet.isti.cnr.it/
─ Principle : add to each synset a positive score, a negative score AND
an objective score between 0 and 1
• [estimable(J,3)] “may be computed or estimated”
Pos 0 Neg 0 Obj 1
• [estimable(J,1)] “deserving of respect or high regard”
Pos .75 Neg 0 Obj .25
33
Institut Mines-Télécom
Lexicon of opinions in English
• SentiWordNet
34
Institut Mines-Télécom
Lexicon of opinions in English
Analyse de
données
socialesLes données textuelles 35
Wordnet affect
• Selecting a subset of wordnet Affective label + valence
Tiré de https://www.proxem.com/Download/Research/BDL-CA07-WordNet_et_son_ecosysteme-
Francois_Chaumartin.pdf
Institut Mines-Télécom
Lexicon of opinions in English
LIWC (Linguistic Inquiry and Word Count) Pennebaker, J.W.,
Booth, R.J., & Francis, M.E. (2007). Linguistic Inquiry and Word
Count: LIWC 2007. Austin, TX
Home page: http://www.liwc.net/
2300 mots, >70 classes
Version française : http://sites.univ-
provence.fr/wpsycle/outils_recherche/liwc/FrenchLIWC
Dictionary_V1_1.dic
36
Institut Mines-Télécom
French LIWC
37
Institut Mines-Télécom
3rd type of methods: semantic rules
38
« manque de qualité de service »
« il n’y a vraiment pas eu de contact », …
Concept
INSATISFACTION
(manque|~negation-patt|(il/#NEG/y/avoir/~negation-
patt))/(#PREP_DE)?/ (conseil|contact|~services-lex)
Semantic rules:
Lexicon of feeling (ex: SentiWordNet)
Extraction rules [Moilanen 2007] [Taboaba et al.] [SenticPatterns]
Institut Mines-Télécom
Reminder
Analyse de
données
socialesLes données textuelles 39
Give the regular expression parsing the
sentences fulfilling the following criteria:
• The first word is an uppercase
• The last character is a full stop
• The sentence contains one or more words (characters
a...z et A...Z) separated by a space.
^[A-Z][A-Za-z]*(\ [A-Za-z]+)*\.$
From http://www.ulb.ac.be/di/ssd/ggeeraer/lg/extexpreg_print.pdf
to check regular
expression :
regexplanet.com
Institut Mines-Télécom
3rd type of methods: semantic rules
Compositional approach[Moilanen 2007] :
• Representation of the sentence in the form of constituents
• Calculates the overall polarity of an output constituent from the input constituents
40
« The senators supporting the leader
failed to praise his hopeless HIV
prevention program »
Institut Mines-Télécom
3rd type of methods: semantic rules
Compositional approaches[Moilanen 2007] :
41
• Propagation rules:
•the polarity of a neutral constituent is
"erased" by that of a non-neutral
constituent
• {(+)(N)} → (+)
• {(-)(N)} → (-)
• Inversion rules:
• (+) → (-) ; (-) → (+) in order to deal with
negation, for example
• Polarity conflict resolution rules:
•when the two polarities are conflicting at
different levels of the syntactic structure
Institut Mines-Télécom
3rd type of methods: semantic rules
Les données textuelles 42
Compostional approach
─ Recognition of affect, judgment and appreciation in Text –
Neviarouskaya et al., COLING 2010
• Affect : personal emotional state
o ‘I feel highly unfriendly attitude towards me’
• Judgment : social or ethical appraisal of other’s behavior
o ‘The shop assistant’s behavior was really unfriendly’
• Appreciation : evaluation of phenomena
o ‘Plastic bags are environment unfriendly’
Institut Mines-Télécom
3rd type of methods: semantic
rules
Analyse de
données
socialesLes données textuelles 43
Taboaba et al. : Lexicon-Based methods for
sentiment analysis
Principle:
• Attributes an SO (Semantic Orientation) between -5
and 5 to adjectives, nouns, verbs and adverbs
Institut Mines-TélécomAnalyse de
données
socialesDonnées textuelles44
Institut Mines-Télécom Les données textuelles 45
Institut Mines-TélécomAnalyse de
données
socialesLes données textuelles 46
Taboaba et al. : Lexicon-Based methods for sentiment
analysis
Principle:
• Intensifier management: modification of SO
EXO :
If sleazy has an SO at 3, what is the SO
of somewhat sleazy ?
If excellent has an SO at 5, what is the
SO of most excellent ?
Institut Mines-TélécomAnalyse de
données
socialesLes données textuelles 47
Taboaba et al. : Lexicon-Based methods for sentiment
analysis
Principle:
• Negation processing :
─ switch negation for the simplest cases (good(+3), not good(-3))
─ Search for negation in more complicated cases
• Ex: « Nobody gives a good performance in this movie »
• Manage the « Irrealis blocking »: ex: « would »
─ « This should have been a great movie »(SO = 3 -> SO =0)
Institut Mines-Télécom
Knowledge-based methods for
opinion analysis in human-agent
interaction
30/01/2019 Modèle de présentation Télécom ParisTech48
Institut Mines-Télécom
La problématique
Text mining et Opinion mining49
How are you ?
Actually, I am in a good mood
today
Good. What did you do today?
I went to the cinema with some of
my friends. That was good.
Agent
Agent
User
Institut Mines-Télécom Text mining et Opinion mining50
Do you like this painting ?
Yes I do
Agent
User
Institut Mines-Télécom
Knowledge-based methods for opinion
analysis in human-agent interactions
Example : bottom-up rule-based approach based on
three levels of the utterance
30/01/2019 Modèle de présentation Télécom ParisTech51
C. Langlet and C. Clavel, Improving social relationships in face-to-face human-
agent interactions: when the agent wants to know users likes and dislikes , in
ACL 2015
Institut Mines-Télécom
Knowledge-based methods for opinion
analysis in human-agent interactions
Attribution of chunk semantic role (target and
source)
• Rules formalizing different Target/Source/Evaluation
structures
30/01/2019 Modèle de présentation Télécom ParisTech52
From Caroline Langlet’s Phd dissertation
Syntagme referring to an attitude with an attribute position using or not a prepositional syntagme (I see)
Institut Mines-Télécom
Knowledge-based methods for opinion
analysis in human-agent interactions
Extraction patterns and semantic rules in order to
integrate the human-agent interaction context
• rules at the adjacency pair level (agent utterance, user
utterance)
30/01/2019 Modèle de présentation Télécom ParisTech53
Question fermée (sans marque de négation)
Question(Att(Src = User,Tgt = T, orientationPol = pos|neg))
— Réponse de type Accord Att(Src = User,Tgt = T, Pol = orientationPol)
— Réponse de type Désaccord Att(Src = User,Tgt = T,Pol = reverse(orientationPol))
Institut Mines-Télécom
Knowledge-based methods for opinion
analysis in human-agent interactions
30/01/2019 Modèle de présentation Télécom ParisTech54
Question fermée (avec marque de négation) ex : You don’t like Picasso ?”
Question(Att(Src = User,Tgt = T, orientationPol = pos|neg))
— Réponse de type Accord Att(Src = User,Tgt = T,Pol = reverse(orientationPol))
ex: Yes, I do
— Réponse de type Désaccord Att(Src = User,Tgt = T,Pol = orientationPol)
ex: No, I don’t
Institut Mines-Télécom
Knowledge-based methods for opinion
analysis in human-agent interactions
30/01/2019 Modèle de présentation Télécom ParisTech55
Assertion (sans marque de négation) ex : This movie is so bad
Assertion(Eval1(Src1 = Agent, Tgt = T,Pol = pos|neg))
— Réponse de type Accord Eval2(Src2 = User,Tgt = T,Pol2 = Pol1)
ex: yes it is
— Réponse de type Désaccord Eval2(Src2 = User,Tgt = T,Pol2 = reverse(Pol1))
ex: No it isn’t
Assertion (avec marque de négation) ex : This movie is not really good
Assertion(Eval1(Src1 = Agent, Tgt1 = T,Pol1 = pos|neg))
— Réponse de type Accord Eval2(Src2 = User,Tgt2 = Tgt1, Pol2 = reverse(Pol1))
ex: yes it is
— Réponse de type Désaccord Eval2(Src2 = User,Tgt2 = Tgt1, Pol2 = Pol1)ex: No it isn’t
Institut Mines-Télécom
Machine learning for opinion
analysis : Introduction and
specific challenges
Building labelled dataset
30/01/2019 Modèle de présentation Télécom ParisTech56
Institut Mines-Télécom
Machine learning for opinion analysis
Two tasks :
• classication of documents in opinion categories (ex :
positive/negative) using
─ supervised machine learning approaches : SVM (Support Vector Machine),
Naïve bayes classier (see Lab), deep learning approaches
• sequential annotation of opinion, source and target using sequential
approaches
─ (ex: Conditional Random Fields, recurrent neural networks)
30/01/2019 Modèle de présentation Télécom ParisTech57
Figure 2: from (Irsoi and Cardie)
Institut Mines-Télécom
ML vs. Knowledge-based
ML advantages
• few linguistic expertise is required to build the model
from the annotated data,
• a higher interoperability of the models
ML drawback
• require a labelled dataset (big dataset for deep learning
approaches) while:
─ annotating data in opinions is a difficult task
• difficult interpretation of trained models
• difficult to transfer model on different data (the model is
corpus-dependent)
30/01/2019 Modèle de présentation Télécom ParisTech58
Institut Mines-Télécom
How to build a labelled database?
Importance of data
• Quality of models learned <= sufficient quantity and quality of data
• Collect data close to the intended application
• Sometimes difficult, eg fear
59
Institut Mines-Télécom
Opinion annotation : challenges
Subjective phenomenon
─ We do not all have the same perception of an opinion expressed by the
other
Complex phenomenon:
─ Opinions and emotions and other phenomena mixed in natural
interactions (eg fear and anger)
• for the same event: variants of the reactions given the personalities :
60
Their First Murder, 1941 © Weegee
Institut Mines-Télécom
Opinion/emotion annotation
Challenge
─ Define among the wide variety
of existing phenomena the
classes that will be recognized
by the system (anger vs.
discontent)
─ Access the contents of the
corpus for a better
understanding of the behavior
of the system
61
Cone of Plutchik's emotions
from[Plutchik, 1984].
Institut Mines-Télécom
Annotation tools
Text :
• Gate
• Glose
62
Institut Mines-Télécom
Measure the reliability of annotations
Opinion/Emotion phenomenon = subjective
phenomenon
• Annotate multiple annotators
• Assess the degree of reliability of the annotations
63
Institut Mines-Télécom
Measure the reliability of annotations
Measures
• Cohen’s kappa [Carletta, 1996]:
─ agreement corrected for what it would be under the mere
fact of chance
─ Po is the proportion of agreement observed and Pe the
probability that the annotators agree by chance
64
Institut Mines-Télécom
Mesure de la fiabilité des annotations
Kappa values ?
• When annotators agree as much as chance
• When the annotators agree totally
Exercice
• 50 audio sequences annotated by 2 people (Ann1 /
Ann2) in 2 categories positive / negative
• Calculating kappa between the two annotators
30/01/2019 Reconnaissance acoustique des émotions65
Ann1\Ann2 Positive Negative
Positive 20 5
Negative 10 15
EXO
Institut Mines-Télécom
Measure the reliability of annotations
Po = (20+15)/50 = 0,7
Calculate Pe:
─ Ann1 uses positive label 50% of the time
─ Ann2 uses positive label 60% of the time
─ Probability that Ann1 and Ann2 use the positive label:
0.5*0.6=0.3
─ Probability that Ann1 and Ann2 use the negative label :
0.5*0.4 = 0.2
─ Probability to agree by chance : 0.2+0.3 = 0.5
Kappa computation:
• Kappa = 0.2/0.5 = 0.4
30/01/2019 Reconnaissance acoustique des émotions66
Institut Mines-Télécom
Measure the reliability of annotations
Moderate agreement = standard for emotions
[Landis et Koch, 1977]
Other measure: Cronbach's Alpha [Cronbach,
1951] for dimensions
67
Institut Mines-Télécom
Machine learning for opinion
analysis : text representation
30/01/2019 Modèle de présentation Télécom ParisTech68
Institut Mines-Télécom
Text representation : Reminder
How to transform documents/words into vectors?
30/01/2019 Modèle de présentation Télécom ParisTech69
─ Cosine similarity
Institut Mines-Télécom
Text representation : Reminder
How to transform documents/words into vectors?
• Word embeddings : ex : word2vec
30/01/2019 Modèle de présentation Télécom ParisTech70
Institut Mines-Télécom
Word embeddings and sentiment analysis
Analyse de
données
socialesLes données textuelles 71
Two options
• Used existing word embedding learned on big database
• Create a word embedding for the task of sentiment analysis,
─ ex: SSWETang, Duyu, et al. "Learning Sentiment-Specific Word Embedding for Twitter Sentiment
Classification." ACL (1). 2014.
Institut Mines-Télécom
Word embeddings and sentiment analysis
Analyse de
données
socialesLes données textuelles 72
• Create a word embeddings for the task of sentiment analysis,
─ ex: technics from LDA : Latent Dirichlet Allocation (probalistic model of topic)
─ Maas, Andrew L., et al. "Learning word vectors for sentiment analysis.“ Proceedings of the
49th Annual Meeting of the Association for Computational Linguistics: Human Language
Technologies-Volume 1. Association for Computational Linguistics, 2011.• LEARNED from Movie reviews
• RESULTS : Words the most similar to “romantic”
Sentiment
+ SemanticSemantic only Latent Semantic
Analysis
extravagant
Institut Mines-Télécom
Machine learning for opinion
analysis : conditional random
fields
30/01/2019 Modèle de présentation Télécom ParisTech73
Institut Mines-Télécom
Machine learning for opinion analysis
Two tasks :
• classication of documents in opinion categories (ex :
positive/negative) using
─ supervised machine learning approaches : SVM (Support Vector Machine),
Naïve bayes classier (see Lab), deep learning approaches
• sequential annotation of opinion, source and target using sequential
approaches
─ (ex: Conditional Random Fields, recurrent neural networks)
30/01/2019 Modèle de présentation Télécom ParisTech74
Figure 2: from (Irsoi and Cardie)
Institut Mines-Télécom
Conditional Random Fields
Analyse de
données
socialesLes données textuelles 75
Generalization of Hidden Markov models
• Markovian graphical model :
• Non-directed
• Each Yi is linked to all the Xj in the graph
• Non-generative model : a model of the conditional probability of the
target Y, given an observation x,
• P(y/x) depends on
─ the structure of Y graph
─ potential functions that depends on
• feature functions given by the linguist expert
• Weights of these feature functions in the model
Institut Mines-Télécom
Conditional Random Fields
Analyse de
données
socialesLes données textuelles 76
Feature functions
• To integrate more or less linguistic knowledge
• The model identifies the knowledge that plays a role in the labelling
task
Institut Mines-Télécom
Conditional Random Fields
Analyse de
données
socialesLes données textuelles 77
Linear CRF
• Y graph is a first order linear chain
• Each Yi is still linked to all the Xj in the graph
• Non-generative model : a model of the conditional probability of the
target Y, given an observation x,
Institut Mines-Télécom
Conditional Radom Fields
Analyse de
données
socialesLes données textuelles 78
mise en place d'un modèle à base de CRF :• Définition de champs de variables aléatoires qui modélisent
le domaine
• Choix d'une structure de graphe sur ces variables
─ Ex: linear CRF
• Choix d'un modèle pour exprimer la probabilité conditionnelle
qui relie les variables entre elles
• Choix des fonctions caractéristiques qui décrivent le mieux
les connaissances du domaine à injecter dans le modèle
Institut Mines-Télécom
Etiqueteur probabiliste : les CRF – les
Champs Aléatoires Conditionnels
Analyse de
données
socialesLes données textuelles 83
Les boîtes à outils existantes pour les CRFs• CRF suite http://www.chokkan.org/software/crfsuite/ et son
wrapper python http://python-
crfsuite.readthedocs.org/en/latest/
• Wapiti : https://wapiti.limsi.fr/
Institut Mines-Télécom
Machine learning for opinion
analysis : deep learning
approaches
30/01/2019 Modèle de présentation Télécom ParisTech84
Institut Mines-Télécom
deep learning and sentiment analysis
Analyse de
données
socialesLes données textuelles 85
REF : R. Socher, A. Perelygin, J. Wu, J.Chuang, C. D. Manning, A. Y. Ng, and C.Potts, Recursive deep models forsemantic compositionality over asentiment treebank, in Proceedings of the2013 Conference on Empirical Methodsin Natural Language Processing.Stroudsburg, PA : Association forComputational Linguistics, October 2013,pp. 1631? 1642.
Institut Mines-Télécom
deep learning and sentiment analysis
Analyse de
données
socialesLes données textuelles 86
• Use of recursive tensor networks : Representation of
the sentence by a tree (use of the Stanford parser):
Institut Mines-Télécom
deep learning and sentiment analysis
• Database: Treebank Sentiment sentences of film critics parsed
with the Stanford parser
• Annotation of the nodes of the tree in (-, +, 0) to provide the
structure necessary for the application of a recursive model
87
Institut Mines-Télécom
deep learning and sentiment analysis
Analyse de
données
socialesLes données textuelles 88
─ Decision : recursively apply the activation functions:
─ Learning of the function g of the passage to the parent in
the binary tree representing the sentence
Institut Mines-Télécom
Références
[MUN 14] MUNEZERO M. D., SUERO MONTERO
C., SUTINEN E., PAJUNEN J., “Are They Different?
Affect, Feeling, Emotion, Sentiment, and Opinion
Detection in Text”, IEEE Transactions on Affective
Computing, 2014.
30/01/2019 Modèle de présentation Télécom ParisTech89