Sentiment analysis for Serbian language

Post on 11-May-2015

DESCRIPTION

Sentiment analysis of the Serbian language, with a stemmer for Serbian.

TRANSCRIPT

Sentiment analysis of sentences in Serbian language

Nikola Milošević

Why analyze sentiment in Serbian?

● Great industrial need
– Ads websites
– Automated market research
– Customer satisfaction

● NLP tools for Serbian are not developed
– Need for tools and resources
– Almost no accessible tools through API

Serbian language

● Belongs to the Indo-European language family
● Slavic language
● Highly inflectional
● 3 pronunciation types
● 3 dialect groups
● Write as you speak
● Latin and Cyrillic writing systems

Sentiment analysis work-flow

Tokenization and preprocessing

● Process of breaking a stream of text up into words

● Stop-word filtering
● Negation handling
– Adding NE_ prefix to words after a negation
– Applied to all words up to the next punctuation mark

● Irregular verbs
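
The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration, not the author's implementation: the stop-word and negation lists are tiny hypothetical samples, the function name `preprocess` is invented here, and dropping the negation word itself once its effect is encoded in the NE_ prefix is an assumption. Irregular verbs are not handled.

```python
import re

# Hypothetical, deliberately tiny word lists for illustration only.
STOP_WORDS = {"i", "je", "u", "na", "da", "se", "su", "za", "ali"}
NEGATIONS = {"ne", "nije", "nisam", "nema"}
PUNCTUATION = set(".,!?;:")

# A token is either a run of word characters or a single punctuation mark.
TOKEN_RE = re.compile(r"\w+|[.,!?;:]")


def preprocess(text):
    """Tokenize, mark negated words with NE_, and drop stop words."""
    result = []
    negated = False
    for token in TOKEN_RE.findall(text.lower()):
        if token in PUNCTUATION:
            negated = False  # negation scope ends at punctuation
            continue
        if token in NEGATIONS:
            negated = True  # assumption: the negation word itself is dropped
            continue
        if token in STOP_WORDS:
            continue
        result.append("NE_" + token if negated else token)
    return result


print(preprocess("Film nije dobar, ali muzika je odlična."))
```

In this sketch the negation scope covers every word between the negation and the next punctuation mark, so "nije dobar" yields the single feature `NE_dobar`, which a bag-of-words classifier can then treat as distinct from `dobar`.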

Stemming

● Process for reducing inflected words to their stem, base or root form

● Kešelj and Šipka (2008)
● Hand crafted rule based stemmer
● ~300 rules
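
To show the general shape of a rule-based suffix-stripping stemmer, here is a minimal sketch. The handful of rules below are hypothetical examples chosen for illustration and are not taken from the rule set used in the presentation or from Kešelj and Šipka; the minimum stem length is likewise an assumption.

```python
# Hypothetical (suffix, replacement) rules; NOT the real ~300-rule set.
SUFFIX_RULES = [
    ("ovima", ""),
    ("ama", ""),
    ("ima", ""),
    ("om", ""),
    ("em", ""),
    ("a", ""),
    ("e", ""),
    ("i", ""),
    ("u", ""),
]

# Assumed guard so that very short words are not stripped to nothing.
MIN_STEM_LENGTH = 3

# Longest suffixes are tried first so "ama" wins over "a".
_ORDERED_RULES = sorted(SUFFIX_RULES, key=lambda rule: len(rule[0]), reverse=True)


def stem(word):
    """Strip the longest matching suffix, if the remaining stem is long enough."""
    word = word.lower()
    for suffix, replacement in _ORDERED_RULES:
        stem_length = len(word) - len(suffix)
        if word.endswith(suffix) and stem_length >= MIN_STEM_LENGTH:
            return word[:stem_length] + replacement
    return word


for form in ["knjiga", "knjige", "knjigama", "gradovima", "oni"]:
    print(form, "->", stem(form))
```

With these toy rules, the inflected forms "knjiga", "knjige" and "knjigama" all map to the same stem, while a short word such as "oni" is left unchanged by the length guard.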

Sentiment analysis

● Aim to build binary sentiment analysis
● General Serbian language
● No annotated corpus for Serbian
● Annotation work (~1000 small texts)
● Supervised machine learning

Naive Bayes

● Algorithm that learns fast
● Bag of words approach
● Assumption of conditional independence
● Laplace smoothing
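
The points above can be illustrated with a small bag-of-words Naive Bayes classifier with Laplace (add-one) smoothing. This is a sketch under stated assumptions, not the presentation's PHP implementation: the class name `NaiveBayes`, the labels "pos" and "neg", the choice to ignore words never seen in training, and the four toy training examples are all invented for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial Naive Bayes over token lists, with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # smoothing constant (1.0 = add-one)
        self.word_counts = defaultdict(Counter)  # label -> token -> count
        self.doc_counts = Counter()              # label -> number of documents
        self.vocabulary = set()

    def train(self, documents):
        """documents: iterable of (tokens, label) pairs."""
        for tokens, label in documents:
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)

    def predict(self, tokens):
        """Return the label with the highest log-posterior score."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            # Log prior plus a sum of log likelihoods (conditional independence).
            score = math.log(self.doc_counts[label] / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                if token not in self.vocabulary:
                    continue  # assumption: words unseen in training are ignored
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha) / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, already tokenized and negation-marked.
model = NaiveBayes()
model.train([
    (["odličan", "film"], "pos"),
    (["dobar", "film"], "pos"),
    (["loš", "film"], "neg"),
    (["NE_dobar", "film"], "neg"),
])
print(model.predict(["dobar"]), model.predict(["NE_dobar"]))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing term keeps a word that never occurred with a given label from driving that label's probability to zero.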

Implementation

● Web API with presentation layer
● JSON communication
● Secured page for annotating
● Built using PHP and MySQL
● Web & Android

Results

● Stemmer
– Smallest and most precise stemmer
– 90% correct on news articles
– Problems: small words, irregular inflections, voice changes

● Sentiment analyzer
– 80% correct
– Problems: irony, ambiguity, small training data

Future work

● Stemmer
– Use Snowball framework
– Build multi-step stemmer

● Sentiment analyzer
– POS tagging
– Complex negation handling
– SVM algorithm

Thank you

● Available from http://inspiratron.org

● Contact: nikola.milosevic@postgrad.manchester.ac.uk
