Learning Subjective Nouns using Extraction Pattern Bootstrapping
Ellen Riloff, School of Computing, University of Utah; Janyce Wiebe, Theresa Wilson, Computing Science, University of Pittsburgh
CoNLL-03
Introduction (1/2)
Many Natural Language Processing applications can benefit from being able to distinguish between factual and subjective information.
Subjective remarks come in a variety of forms, including opinions, rants, allegations, accusations, and speculation.
Question answering (QA) systems should distinguish between factual and speculative answers.
Multi-document summarization systems need to summarize different opinions and perspectives.
Spam filtering systems must recognize rants and emotional tirades, among other things.
Introduction (2/2)
In this paper, we use the Meta-Bootstrapping (Riloff and Jones 1999) and Basilisk (Thelen and Riloff 2002) algorithms to learn lists of subjective nouns.
Both bootstrapping algorithms automatically generate extraction patterns to identify words belonging to a semantic category.
We hypothesize that extraction patterns can also identify subjective words.
The pattern "expressed <direct_object>" often extracts subjective nouns such as "concern", "hope", and "support".
Both bootstrapping algorithms require only a handful of seed words and unannotated texts for training; no annotated data is needed at all.
Annotation Scheme
The goal of the annotation scheme is to identify and characterize expressions of private states in a sentence.
A private state is a general covering term for opinions, evaluations, emotions, and speculations.
Example: "The time has come, gentlemen, for Sharon, the assassin, to realize that injustice cannot last long." → the writer expresses a negative evaluation.
Annotators are also asked to judge the strength of each private state; a private state can have low, medium, high, or extreme strength.
Corpus and Agreement Results
Our data consist of English-language versions of foreign news documents from FBIS (the Foreign Broadcast Information Service).
The annotated corpus used to train and test our subjective classifiers (the experiment corpus) consists of 109 documents with a total of 2197 sentences.
We use a separate annotated tuning corpus to establish experiment parameters.
Extraction Patterns
In the last few years, two bootstrapping algorithms have been developed to create semantic dictionaries by exploiting extraction patterns.
Extraction patterns represent lexico-syntactic expressions that typically rely on shallow parsing and syntactic role assignment, e.g., "<subject> was hired".
A bootstrapping process looks for words that appear in the same extraction patterns as the seeds and hypothesizes that those words belong to the same semantic category.
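As a minimal illustration of the "<verb> <direct_object>" idea, the sketch below approximates pattern matching with a regular expression over raw text. The real systems use shallow parsing and syntactic role assignment; the regex and the skip-word list here are hypothetical simplifications.

```python
import re

# Toy stand-in for the pattern "expressed <direct_object>". Real
# extraction patterns use shallow parsing and syntactic roles; the
# regex and the skip-word list here are illustrative only.
def extract_direct_objects(text, verb="expressed"):
    # Optionally skip determiners/adjectives before the head noun.
    skip = r"(?:(?:his|her|their|its|the|a|an|deep|great)\s+)*"
    return re.findall(rf"\b{verb}\s+{skip}(\w+)", text, flags=re.IGNORECASE)

sentences = [
    "The senator expressed deep concern over the proposal.",
    "They expressed hope that talks would resume.",
    "The minister expressed support for the plan.",
]
extracted = [n for s in sentences for n in extract_direct_objects(s)]
print(extracted)  # ['concern', 'hope', 'support']
```

Even this crude matcher pulls out the subjective nouns "concern", "hope", and "support" mentioned earlier, which is the intuition behind using extraction patterns for subjectivity.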
Meta-Bootstrapping (1/2)
The Meta-Bootstrapping process begins with a small set of seed words that represent a targeted semantic category (e.g., "seashore" is a location) and an unannotated corpus.
Step 1: MetaBoot automatically creates a set of extraction patterns for the corpus by applying syntactic templates.
Step 2: MetaBoot computes a score for each pattern based on the number of seed words among its extractions.
The best pattern is saved, and all of its extracted noun phrases are automatically labeled with the targeted semantic category.
Meta-Bootstrapping (2/2)
MetaBoot then re-scores the extraction patterns using the original seed words plus the newly labeled words, and the process repeats (mutual bootstrapping).
When the mutual bootstrapping process is finished, all nouns that were put into the semantic dictionary are re-evaluated:
Each noun is assigned a score based on how many different patterns extracted it.
Only the five best nouns are allowed to remain in the dictionary.
The mutual bootstrapping process then starts over again using the revised semantic dictionary.
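The loop described on these two slides can be sketched as follows. This is a toy rendition under simplifying assumptions: patterns are scored by a plain seed-hit count rather than the RlogF-style metric of the original work, and the pattern-to-noun table is given up front instead of being generated from a corpus.

```python
from collections import defaultdict

def mutual_bootstrap(pattern_extractions, seeds, iterations=3, keep_top=5):
    # pattern_extractions: pattern -> set of noun phrases it extracts.
    known = set(seeds)
    used_patterns = []
    for _ in range(iterations):
        # Score each unused pattern by how many known words it extracts,
        # then save the best one (a stand-in for the original scoring).
        best = max(
            (p for p in pattern_extractions if p not in used_patterns),
            key=lambda p: len(pattern_extractions[p] & known),
            default=None,
        )
        if best is None:
            break
        used_patterns.append(best)
        # Label everything the best pattern extracts, then re-score (repeat).
        known |= pattern_extractions[best]
    # Final re-evaluation: rank nouns by how many saved patterns
    # extracted them and keep only the best few in the dictionary.
    counts = defaultdict(int)
    for p in used_patterns:
        for noun in pattern_extractions[p]:
            counts[noun] += 1
    return sorted(counts, key=lambda n: (-counts[n], n))[:keep_top]

patterns = {  # hypothetical toy data
    "expressed <dobj>": {"concern", "hope", "support"},
    "voiced <dobj>": {"concern", "support", "objection"},
    "bought <dobj>": {"car", "house"},
}
print(mutual_bootstrap(patterns, {"concern", "hope"}, iterations=2))
# ['concern', 'support', 'hope', 'objection']
```

Note how nouns extracted by more saved patterns ("concern", "support") rank ahead of single-pattern nouns, mirroring the re-evaluation step above.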
Basilisk (1/2)
Step 1: Basilisk automatically creates a set of extraction patterns for the corpus and scores each pattern based on the number of seed words among its extractions. Basilisk puts the best patterns into a pattern pool.
Step 2: All nouns extracted by a pattern in the pattern pool are put into a candidate word pool. Basilisk scores each noun based on the set of patterns that extracted it and their collective association with the seed words.
Step 3: The top 10 nouns are labeled with the targeted semantic class and are added to the dictionary.
Basilisk (2/2)
The bootstrapping process then repeats, using the original seeds and the newly labeled words.
The major difference between Basilisk and Meta-Bootstrapping:
Basilisk scores each noun based on collective information gathered from all patterns that extracted it.
Meta-Bootstrapping identifies a single best pattern and assumes that everything it extracts belongs to the same semantic category.
In comparative experiments, Basilisk outperformed Meta-Bootstrapping.
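Basilisk's collective noun scoring can be sketched as below, assuming an AvgLog-style formula (the average of log2(seed hits + 1) over the patterns that extracted the noun); the pattern table is hypothetical toy data.

```python
import math

def basilisk_score(noun, pattern_extractions, seeds):
    # Collective evidence from every pattern that extracted the noun:
    # average of log2(seed hits + 1) over those patterns (AvgLog-style).
    extractors = [p for p, ns in pattern_extractions.items() if noun in ns]
    if not extractors:
        return 0.0
    total = sum(math.log2(len(pattern_extractions[p] & seeds) + 1)
                for p in extractors)
    return total / len(extractors)

patterns = {  # hypothetical toy data
    "expressed <dobj>": {"concern", "hope", "support"},
    "voiced <dobj>": {"concern", "support", "objection"},
    "bought <dobj>": {"car", "house"},
}
seeds = {"concern", "hope"}
print(round(basilisk_score("support", patterns, seeds), 3))  # 1.292
print(basilisk_score("car", patterns, seeds))  # 0.0
```

Unlike Meta-Bootstrapping's single-best-pattern step, every extracting pattern contributes to a noun's score here, which is the difference the slide highlights.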
Experimental Results (1/2)
We created the bootstrapping corpus by gathering 950 new texts from FBIS, and manually selected 20 high-frequency words as seed words.
We ran each bootstrapping algorithm for 400 iterations, generating 5 words per iteration. Basilisk generated 2000 nouns and Meta-Bootstrapping generated 1996 nouns.
Experimental Results (2/2)
Next, we manually reviewed the 3996 words proposed by the algorithms and classified each word as strong subjective, weak subjective, or objective.
[Graph: x-axis — the number of words generated; y-axis — the percentage of those words that were manually classified as subjective.]
Subjective Classifier (1/3)
To evaluate the subjective nouns, we train a Naïve Bayes classifier using the nouns as features. We also incorporate previously established subjectivity clues and add some new discourse features.
Subjective noun features: We define four features, BA-Strong, BA-Weak, MB-Strong, and MB-Weak, to represent the sets of subjective nouns produced by the bootstrapping algorithms.
For each set, we create a three-valued feature based on the presence of 0, 1, or >=2 words from that set in the sentence.
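The three-valued SubjNoun feature can be sketched as below; whitespace tokenization and the example noun set are simplifying assumptions.

```python
def subjnoun_feature(sentence_tokens, noun_set):
    # Count how many tokens come from the learned set, then bucket
    # into the three feature values: "0", "1", ">=2".
    hits = sum(1 for tok in sentence_tokens if tok.lower() in noun_set)
    if hits == 0:
        return "0"
    if hits == 1:
        return "1"
    return ">=2"

# e.g., against a hypothetical stand-in for the BA-Strong set:
print(subjnoun_feature("They expressed hope and concern".split(),
                       {"hope", "concern"}))  # >=2
```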
Subjective Classifier (2/3)
WBO features: from Wiebe, Bruce, and O'Hara (1999), a machine learning system to classify subjective sentences.
Manual features:
Entries from (Levin 1993; Ballmer and Brennenstuhl 1981).
Some FrameNet lemmas with the frame element "experiencer" (Baker et al. 1998).
Adjectives manually annotated for polarity (Hatzivassiloglou and McKeown 1997).
Some subjective clues listed in (Wiebe 1990).
Subjective Classifier (3/3)
Discourse features: We use discourse features to capture the density of clues in the text surrounding a sentence.
First, we compute the average number of subjective clues and objective clues per sentence.
Next, we characterize the number of subjective and objective clues in the previous and next sentences as higher-than-expected (high), lower-than-expected (low), or expected (medium).
We also define a feature for sentence length.
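The high/medium/low bucketing can be sketched as below; the cutoff of expected ± 0.5 is an illustrative assumption, since the slides do not give the exact thresholds.

```python
def density_feature(clue_count, expected, tolerance=0.5):
    # Bucket a neighboring sentence's clue count relative to the
    # corpus-wide average number of clues per sentence.
    if clue_count > expected + tolerance:
        return "high"    # higher than expected
    if clue_count < expected - tolerance:
        return "low"     # lower than expected
    return "medium"      # about as expected

# e.g., with an average of 2.0 subjective clues per sentence:
print(density_feature(4, expected=2.0))  # high
print(density_feature(0, expected=2.0))  # low
```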
Classification Result (1/3)
We evaluate each classifier using 25-fold cross-validation on the experiment corpus and use a paired t-test to measure significance at the 95% confidence level.
We compute accuracy (Acc) as the percentage of sentences that match the gold standard, and precision (Prec) and recall (Rec) with respect to subjective sentences.
Gold standard: a sentence is subjective if it contains at least one private-state expression of medium or higher strength.
The objective class consists of everything else.
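The metrics as defined above, with precision and recall computed with respect to the subjective class, can be sketched as (the example labels are hypothetical):

```python
def evaluate(gold, predicted, positive="subjective"):
    # Accuracy over all sentences; precision and recall with
    # respect to the subjective (positive) class.
    correct = sum(g == p for g, p in zip(gold, predicted))
    tp = sum(g == p == positive for g, p in zip(gold, predicted))
    pred_pos = sum(p == positive for p in predicted)
    gold_pos = sum(g == positive for g in gold)
    acc = correct / len(gold)
    prec = tp / pred_pos if pred_pos else 0.0
    rec = tp / gold_pos if gold_pos else 0.0
    return acc, prec, rec

gold = ["subjective", "subjective", "objective", "subjective"]
pred = ["subjective", "objective", "objective", "subjective"]
print(evaluate(gold, pred))  # (0.75, 1.0, 0.6666666666666666)
```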
Classification Result (2/3)
We train a Naïve Bayes classifier using only the SubjNoun features. This classifier achieves good precision (77%) but only moderate recall (64%).
We find that the subjective nouns are good indicators when they appear, but not every subjective sentence contains a subjective noun.
Classification Result (3/3)
There is a synergy between these feature sets: using both types of features achieves better performance than either one alone.
In Table 8, Row 1, we use the WBO + SubjNoun + manual + discourse features. This classifier achieves 81.3% precision, 77.4% recall, and 76.1% accuracy.
Conclusion
We demonstrate that weakly supervised bootstrapping techniques can learn subjective terms from unannotated texts.
Bootstrapping algorithms can learn not only general semantic categories, but any category for which words appear in similar linguistic phrases.
The experiments suggest that reliable subjectivity classification requires a broad array of features.