Who cares about sarcastic tweets? Investigating the impact of sarcasm on sentiment analysis

University of Sheffield, NLP

Diana Maynard, Mark Greenwood

University of Sheffield, UK

Who cares about sarcastic tweets?

Twitter is full of mindless drivel

● OMMMFG!!! JUST HEARD EMINEM'S “RAPGOD”. SMFH!!! these other dudes might as well stop rapping if they not on this level

● i've got dressed but only because I need biscuits

● I used to be so bad at naming any k idol group members pmsl I would get so confused and now I'm pro ;)))

● im gonna learn to be a lifegaurd hopfully so while everyone else is working in a shop actually doing stuff il be sitting on a pool side.yay

What are people reading about?

● Of the top 10 Twitter accounts with the highest number of followers:

– 7 pop stars

– 2 social media sites

– and Barack Obama

● Why on earth do we care about this stuff?

Even the mindless drivel could be useful

● OMMMFG!!! JUST HEARD EMINEM'S “RAPGOD”. SMFH!!! these other dudes might as well stop rapping if they not on this level

● i've got dressed but only because I need biscuits

● I used to be so bad at naming any k idol group members pmsl I would get so confused and now I'm pro ;)))

● im gonna learn to be a lifegaurd hopfully so while everyone else is working in a shop actually doing stuff il be sitting on a pool side.yay

➔ English people like biscuits. A lot.

➔ What do young people think about their future careers?

➔ People who like K Idol and RapGod also like Apple products

Sarcasm is a part of British culture

● The BBC has its own webpage on sarcasm designed to teach non-native English speakers how to be sarcastic successfully in conversation

http://www.bbc.co.uk/worldservice/learningenglish/radio/specials/1210_how_to_converse/page13.shtml

How do you know when someone is being sarcastic?

• Use of hashtags in tweets such as #sarcasm, #irony, #whoknew etc.

It's not like I wanted to eat breakfast anyway #sarcasm

• Large collections of tweets based on hashtags can be used to make a training set for machine learning

• But you still have to know what to do with sarcasm once you've found it

• Sarcasm generally entails saying the opposite of what you mean

– But it doesn't necessarily just invert the polarity of an opinion

– “It's not like I wanted to eat breakfast anyway” is negative when uttered sarcastically, but non-opinionated when uttered neutrally.
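The hashtag-based collection step can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function name `label_tweet` is hypothetical, the tag list contains only the three hashtags named above, and stripping the marker hashtag from the text is an assumption made here so that a classifier cannot simply memorise the label.

```python
import re

# Marker hashtags named in the slides; a real list would likely be longer (assumption).
SARCASM_TAGS = {"#sarcasm", "#irony", "#whoknew"}

HASHTAG_RE = re.compile(r"#\w+")


def label_tweet(text):
    """Return (text without marker hashtags, is_sarcastic)."""
    tags = {t.lower() for t in HASHTAG_RE.findall(text)}
    is_sarcastic = bool(tags & SARCASM_TAGS)

    # Drop only the marker hashtags; other hashtags may carry sentiment.
    def keep_or_drop(match):
        return "" if match.group().lower() in SARCASM_TAGS else match.group()

    cleaned = HASHTAG_RE.sub(keep_or_drop, text)
    # Normalise the whitespace left behind by removed hashtags.
    return " ".join(cleaned.split()), is_sarcastic


if __name__ == "__main__":
    print(label_tweet("It's not like I wanted to eat breakfast anyway #sarcasm"))
    print(label_tweet("i've got dressed but only because I need biscuits"))
```

Labels obtained this way only say that sarcasm is present; they do not say what the sarcasm does to the sentiment.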

My friend Barry likes Apple products

Or does he?

Understanding sarcasm is hard

Sarcastic or not?

How about now?

It often requires world knowledge

Capitalisation indicates sarcasm

But not always

What does sarcasm do to polarity?

● Sarcasm often indicated by hashtags in tweets such as #sarcasm, #irony, #whoknew etc.

● It's very hard to identify sarcasm outside these parameters

● In general, when someone is being sarcastic, they're saying the opposite of what they mean

● So as long as you know which bit of the utterance is the sarcastic bit, you can simply reverse the polarity

Eating breakfast food for lunch. Living the dream. #toast #rebel #sarcasm

● If there is no polarity on the original statement, the sarcastic version is probably negative

It's not like I wanted to eat breakfast anyway #sarcasm

● If there's more than one hashtag, you need to look at the combination, and any sentiments they express
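As a first approximation, the effect of sarcasm on polarity described above can be written as a small function. This is only a sketch: the name `apply_sarcasm` and the three string labels are illustrative choices, and the polarity of the literal statement is assumed to come from some upstream sentiment component.

```python
def apply_sarcasm(polarity, sarcastic):
    """Adjust a literal polarity label for sarcasm.

    polarity: "positive", "negative" or "neutral" (labels are an assumption).
    sarcastic: True if the statement is marked as sarcastic.
    """
    if not sarcastic:
        return polarity
    if polarity == "positive":
        return "negative"
    if polarity == "negative":
        return "positive"
    # No polarity on the original statement: the sarcastic
    # version is probably negative.
    return "negative"


if __name__ == "__main__":
    # "Living the dream." is literally positive, sarcastically negative.
    print(apply_sarcasm("positive", True))
    # "It's not like I wanted to eat breakfast anyway" is literally neutral.
    print(apply_sarcasm("neutral", True))
```

This ignores the question of which part of the tweet the sarcasm applies to, which matters once several hashtags are involved.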

Getting the scope of hashtags right

Eating breakfast food for lunch. Living the dream.

#toast #rebel #sarcasm

Analysing Hashtags

What's in a hashtag?

● Hashtags often contain smushed words

– #SteveJobs

– #CombineAFoodAndABand

– #southamerica

● For NER we want the individual tokens so we can link them to the right entity

● For opinion mining, individual words in the hashtags often indicate sentiment, sarcasm etc.

– #greatidea

– #worstdayever

● We need to retokenise hashtags so that we can use the content in our application

How to analyse hashtags?

● Camelcasing makes it relatively easy to separate the words, using an adapted tokeniser, but many people don't bother

● We use a simple approach based on dictionary matching the longest consecutive strings, working left to right

● We use a combination of dictionaries (Linux dictionary, slang dictionary, plus gazetteers of Named Entities, modified manually)

– #lifeisgreat -> #-life-is-great

– #lovinglife -> #-loving-life

● It's not foolproof, however

– #greatstart -> #-greats-tart

● In an experiment with 2010 English hashtags (4538 tokens): P=98.12%, R=96.41% , F1= 97.25%.

● We could use a language modelling approach based on bigrams and trigrams, but since hashtags are often novel, it might not help much
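The two retokenisation strategies above (camel-case splitting, then greedy longest dictionary match from left to right) can be sketched as follows. This is an illustration under stated assumptions, not the actual implementation: `WORDS` is a toy stand-in for the combined dictionaries and gazetteers, the function names are hypothetical, and emitting unmatched characters one at a time is a fallback chosen here for simplicity.

```python
import re

# Toy dictionary standing in for the combined dictionaries (assumption).
WORDS = {"life", "is", "great", "greats", "start", "tart", "loving",
         "worst", "day", "ever", "idea"}


def split_camel(body):
    """Split on camel-case boundaries, e.g. 'SteveJobs' -> ['Steve', 'Jobs']."""
    return re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", body)


def segment(hashtag, words=WORDS):
    """Retokenise a hashtag into a list of word tokens."""
    body = hashtag.lstrip("#")
    parts = split_camel(body)
    if len(parts) > 1:
        # Camel-casing already marks the word boundaries.
        return parts
    text = body.lower()
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for j in range(len(text), i, -1):
            if text[i:j] in words:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # No dictionary entry starts here: emit one character (assumption).
            tokens.append(text[i])
            i += 1
    return tokens


if __name__ == "__main__":
    for tag in ["#SteveJobs", "#lifeisgreat", "#lovinglife", "#greatstart"]:
        print(tag, "->", "#-" + "-".join(segment(tag)))
```

With this toy dictionary the sketch reproduces the failure case as well: because "greats" is a longer match than "great", #greatstart comes out as greats + tart.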

Identifying the scope of sarcasm

I am not happy that I woke up at 5:15 this morning. #greatstart #sarcasm

You are really mature. #lying #sarcasm

Rules for identifying scope

I am not happy that I woke up at 5:15 this morning. #greatstart #sarcasm

● negative sentiment + positive hashtag + sarcasm hashtag

● The positive hashtag becomes negative with sarcasm

You are really mature. #lying #sarcasm

● positive sentiment + sarcasm hashtag + sarcasm hashtag

● The positive sentiment is turned negative by both sarcasm hashtags

● When in doubt, it's usually safe to assume that a sarcastic statement carries negative sentiment
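The two scope rules and the fallback can be sketched in code. This is a simplified reading of the slides rather than the actual rule set: the tag lists, the `resolve` function and the string labels are all hypothetical, and the sentence polarity is assumed to be supplied by an upstream sentiment component.

```python
# Illustrative tag lists (assumptions); "lying" is treated as a sarcasm
# marker because the slide counts it as a sarcasm hashtag.
SARCASM_TAGS = {"sarcasm", "irony", "lying"}
TAG_POLARITY = {"greatstart": "positive", "greatidea": "positive",
                "worstdayever": "negative"}


def flip(polarity):
    """Reverse a polarity; anything without polarity becomes negative."""
    return {"positive": "negative", "negative": "positive"}.get(polarity, "negative")


def resolve(text_polarity, hashtags):
    """Combine sentence polarity with hashtags to get the tweet polarity."""
    tags = [h.lstrip("#").lower() for h in hashtags]
    if not any(t in SARCASM_TAGS for t in tags):
        return text_polarity
    tag_polarities = [TAG_POLARITY[t] for t in tags if t in TAG_POLARITY]
    if tag_polarities:
        # Sarcasm scopes over the sentiment hashtag, not the sentence.
        flipped_tag = flip(tag_polarities[-1])
        if text_polarity in ("neutral", flipped_tag):
            return flipped_tag
        # Sentence and flipped hashtag disagree: when in doubt, negative.
        return "negative"
    # Only sarcasm markers present: sarcasm scopes over the sentence.
    return flip(text_polarity)


if __name__ == "__main__":
    # "I am not happy that I woke up at 5:15 this morning."
    print(resolve("negative", ["#greatstart", "#sarcasm"]))
    # "You are really mature."
    print(resolve("positive", ["#lying", "#sarcasm"]))
```

In the first example the sentence keeps its negative polarity because the sarcasm attaches to #greatstart; in the second there is no sentiment hashtag, so the sentence itself is reversed.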

Experiments with sarcastic hashtags

● Collected a corpus of 134 tweets containing #sarcasm

● Manually annotated sentences with sentiment

– 266 sentences, of which 68 opinionated (25%)

– 62 negative, 6 positive (yes, this is biased...)

● Adding sarcasm detection improved accuracy of polarity detection from 27.27% to 77.28%

● Even though we know these sentences are sarcastic, we don't always get polarity right

● After implementing rules for sarcasm scope, 91% accuracy

Conclusions

● Unlike most work on sarcasm detection, we don't try to identify sarcasm where it's not explicitly indicated

● We instead examine the effect that known sarcasm has on the sentiment expressed in tweets

● We retokenise hashtags so that we can make use of information within them in order to identify sarcasm scope

● We develop a set of rules for determining sarcasm scope, and improve polarity detection as a result

● Lots more work could be done on this topic, but it's a #greatstart #really

Questions?

