Opinion Integration and Summarization Yue Lu University of Illinois at Urbana-Champaign

Post on 27-Dec-2015

Page 1

Opinion Integration and Summarization

Yue Lu
University of Illinois at Urbana-Champaign

Page 2

http://sifaka.cs.uiuc.edu/yuelu2/

Opinions are needed in all kinds of decision processes

• Business intelligence: “What do people complain about iPhone?”
• Health informatics: “How do people like the new drug?”
• Political science: “How is the new policy received?”

Page 3

Online opinions cover all kinds of topics

Topics: People, Events, Products, Services, …

Sources: Blogs, Microblogs, Forums, Reviews, …

Scale figures shown next to the source logos (logos not recoverable):
• 65M msgs/day
• 53M blogs, 1307M posts
• 115M users, 10M groups
• 45M reviews

Page 4

After collecting opinions using Google

How could I read them all?

Page 5

Online opinions are complicated

Dimensions: Aspect, Sentiment, Quality

Quality ranges from high quality to low quality

Page 6

Vision: Opinion Integration & Summarization

Input: Online Opinions on Topic = t (Sentence 1, Sentence 2, …, Sentence 100, …, Sentence 900)

Steps: Opinion Integration, Sentiment Analysis, Quality Prediction

Output: Integrated Summary

Aspect     Opinion Sentences            Sentiment            Quality
Aspect 1   Sentence 512, Sentence 823   positive, negative   high, medium
Aspect 2   Sentence 21, Sentence 153    neutral, positive    low, high
…          …                            …                    …

Major Challenge: develop general techniques that work for arbitrary topics…

Page 7

Existing work cannot scale to different topics

• Review summarization
– Unsupervised feature extraction + opinion polarity identification: [Hu&Liu 04], [Popescu&Etzioni 05], …
– Supervised aspect extraction: [Zhuang et al] …
• Hidden aspect discovery: [Hofmann99] [Chen&Dumais00] [Blei et al03] [Zhai et al04] [Li&McCallum06] [Titov&McDonald08] …
• Sentiment classification
– Binary classification: [Pang&Lee02] [Kim&Hovy04] [Cui et al06] …
– Rating classification: [Pang&Lee05] [Snyder&Barzilay07] …
• Opinion quality prediction: [Zhang&Varadarajan `06] [Kim et al. `06] [Liu et al. `08] [Ghose&Ipeirotis `10] …

These approaches rely heavily on domain-specific
• Hand-labeled training data
• Hand-written heuristics/rules

How to develop general techniques that work for arbitrary topics?

Page 8

New idea: exploit naturally available resources

ONLINE OPINIONS on Topic = t (Sentence 1, Sentence 2, …, Sentence 100, …, Sentence 900) are turned into an INTEGRATED SUMMARY with the help of:

• Structured Ontology [COLING'10]
• Overall Sentiment Ratings [WWW’09] [KDD’10] [WWW’11]
• Expert Articles [WWW’08]
• Social Networks [WWW'10]

Page 9

Intuition: scalable to different topics

Scale of the available resources (source logos not recoverable):
• 3.5 M things
• 45M reviews
• 22 M topics
• 500 M users
• >3 M users
• >3 K products/y
• 3.5 M articles

Opportunities?
• Provide domain-specific guidance
• Alleviate heavy dependence on human labor

Challenges?
• Cannot directly apply supervised machine learning
• Need for new methods

Page 10

My Work

Online Opinions on Topic = t (Sentence 1, Sentence 2, …, Sentence 100, …, Sentence 900) are turned into an Integrated Summary in three steps:

• Opinion Integration: [WWW’08] [COLING'10]
• Sentiment Analysis: [WWW’09] [KDD’10] [WWW’11]
• Quality Prediction: [WWW’10]

Aspect     Opinion Sentences            Sentiment            Quality
Aspect 1   Sentence 512, Sentence 823   positive, negative   high, medium
Aspect 2   Sentence 21, Sentence 153    neutral, positive    low, high
…          …                            …                    …

Page 11

Roadmap

Integrated Summary pipeline:
• Opinion Integration: [WWW’08] [COLING'10]
• Sentiment Analysis: [WWW’09] [KDD’10] [WWW’11]
• Quality Prediction: [WWW’10]

Covered in this part:
• [WWW’11] “Automatic Construction of a Context-Aware Sentiment Lexicon: an Optimization Approach”

Page 12

A well-known challenge: sentiments are domain dependent

Example: “unpredictable” in Domain = Movie vs. Domain = Laptop

Existing Work
• Linguistic heuristics: [Hatzivassiloglou&McKeown `97], [Kanayama&Nasukawa `06], …
• Morphology, synonymy: [Neviarouskaya et al `09], [Mohammad et al `09], …
• Seed sentiment words: [Turney&Littman `03], …
• Document-level sentiment rating: [Choi&Cardie `09], …

Page 13

Sentiments are also aspect dependent

Example (Domain = Laptop): “large” for Aspect = Screen vs. Aspect = Battery

Page 14

New problem: constructing aspect-dependent sentiment lexicon

Input: Laptop Collection + “Aspects”
• SCREEN: screen, LCD, display, …
• BATTERY: battery, power, charger, …
• PRICE: price, cost, money, …
• …

Output: “Aspect-Adj”: sentiment_score
SCREEN-large    +1
SCREEN-great    +1
BATTERY-large   -1
…               …

A challenging problem due to increased sparseness

Page 15

Our idea: exploit multiple resources

Resources used to score the entries SCREEN-large, SCREEN-great, BATTERY-large, …:
• General Sentiment Lexicon: excellent, awesome, … / bad, terrible, …
• Overall Sentiment Ratings of reviews (Screen: text… Battery: text…)
• Dictionary: Synonyms (large ~ big, …), Antonyms (large <-> tiny, …)
• Language Heuristics: 1. “and” clue, 2. “but” clue, 3. “negation” clue

Challenges:
1. Signals in different formats
2. Contradictory signals
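The “and” and “but” clues can be illustrated with a minimal sketch. It assumes the reviews have already been reduced to (entry, conjunction, entry) triples; the function name `clue_pairs` and the triple format are hypothetical, and the “negation” clue is left out.

```python
def clue_pairs(triples):
    """Split (entry_a, conjunction, entry_b) triples into similar/opposite pairs.

    Assumption for illustration: "and" suggests the same polarity and
    "but" suggests opposite polarity; other conjunctions are ignored.
    """
    similar, opposite = set(), set()
    for a, conj, b in triples:
        pair = tuple(sorted((a, b)))  # store each unordered pair once
        if conj == "and":
            similar.add(pair)
        elif conj == "but":
            opposite.add(pair)
    return similar, opposite


# Made-up triples in the spirit of the laptop examples
triples = [
    ("SCREEN-large", "and", "SCREEN-great"),
    ("SCREEN-great", "but", "BATTERY-large"),
]
similar, opposite = clue_pairs(triples)
```

Pairs collected this way could feed the similar-sentiment and opposite-sentiment signals; how contradictory clues are reconciled is left to the optimization step.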

Page 16

A Novel Optimization Framework

Objective function designed to encode signals from multiple resources:

S = argmin { λprior × [prior term] + λsim × [similar-sentiment term] + λoppo × [opposite-sentiment term] + λrating × [rating term] }, subject to Constraints

[The individual terms, the constraints, and a δ symbol were images on the slide and are not recoverable from the transcript.]

S: Aspect-Dependent Sentiment Lexicon
SCREEN-large    S1
SCREEN-great    S2
BATTERY-large   S3
…               …

Page 17

1. Sentiment prior (the λprior term of the objective)

G: General-purpose Sentiment Lexicon
SCREEN-great    1
SCREEN-bad      -1
BATTERY-great   1
…               …

S: Aspect-Dependent Sentiment Lexicon
SCREEN-large    S1
SCREEN-great    S2
BATTERY-large   S3
…               …
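A minimal sketch of how the prior G could be assembled, assuming a general-purpose lexicon stored as an adjective-to-polarity dictionary. The function name `build_prior` and the toy lexicon are hypothetical; entries whose adjective is not in the general lexicon simply get no prior.

```python
def build_prior(candidates, general_lexicon):
    """Return {entry: prior_score} for entries whose adjective is in the general lexicon."""
    prior = {}
    for entry in candidates:
        _aspect, adjective = entry.split("-", 1)
        if adjective in general_lexicon:
            prior[entry] = general_lexicon[adjective]
    return prior


# Toy general-purpose lexicon (illustrative values only)
general_lexicon = {"great": 1, "excellent": 1, "bad": -1, "terrible": -1}
candidates = ["SCREEN-great", "SCREEN-bad", "BATTERY-great", "BATTERY-large"]
G = build_prior(candidates, general_lexicon)
```

Here BATTERY-large receives no prior because “large” is not a general sentiment word, which is the case the other signals are meant to cover.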

Page 18

2. Overall sentiment rating (the λrating term of the objective)

O: Review Overall Ratings
R1   1
R2   1
R3   -1
R4   0
…    …

X: Review Word Matrix*
R1   SCREEN-bright    0.2
R1   BATTERY-large    0.3
R1   SCREEN-great     0.5
R2   SCREEN-awesome   0.4
…    …                …

Predicted Ratings = X × S, which should approximate (~) O
R1   0.8
R2   0.5
R3   -0.7
R4   0.1
…    …

S: Aspect-Dependent Sentiment Lexicon
SCREEN-large    S1
SCREEN-great    S2
BATTERY-large   S3
…               …
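The product of the review word matrix and the lexicon can be sketched for a single review as a weighted sum. The row for R1 uses the weights shown on the slide; the lexicon scores below are made up for illustration, so the result is not meant to reproduce the slide's predicted value.

```python
def predict_rating(review_row, lexicon):
    """Weighted sum of lexicon scores over the entries that occur in one review."""
    return sum(weight * lexicon.get(entry, 0.0) for entry, weight in review_row.items())


# Row of X for review R1 (weights from the slide)
r1 = {"SCREEN-bright": 0.2, "BATTERY-large": 0.3, "SCREEN-great": 0.5}

# Hypothetical lexicon scores, not taken from the talk
S = {"SCREEN-bright": 1.0, "BATTERY-large": -1.0, "SCREEN-great": 1.0}

predicted_r1 = predict_rating(r1, S)  # 0.2 - 0.3 + 0.5
```

The rating signal then asks for lexicon scores that keep such predictions close to the observed overall ratings O.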

Page 19

3. Similar sentiments (the λsim term of the objective)

A: Similar-Sentiment Matrix (from synonyms and “and” clues)
SCREEN-large    SCREEN-big        1
SCREEN-bad      SCREEN-terrible   1
BATTERY-small   BATTERY-tiny      1
…               …                 …

S: Aspect-Dependent Sentiment Lexicon
SCREEN-large    S1
SCREEN-great    S2
BATTERY-large   S3
…               …

Page 20

4. Opposite sentiment (the λoppo term of the objective)

B: Opposite-Sentiment Matrix (from antonyms and “but” clues)
SCREEN-large       SCREEN-small   1
SCREEN-excellent   BATTERY-big    1
BATTERY-small      BATTERY-big    1
…                  …              …

For such pairs: Sign is different, Abs Value is similar

Constraints (subject to): separate the representation of Sj
- Sign: only one of Sj+, Sj- is active
- Abs Value: value of the active one

S: Aspect-Dependent Sentiment Lexicon
SCREEN-large    S1
SCREEN-great    S2
BATTERY-large   S3
…               …
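The split representation can be sketched as follows. The decomposition into a positive and a negative part follows the slide; the `opposite_gap` measure is only one plausible way to express "sign is different, absolute value is similar" and is an assumption, not the objective term from the paper.

```python
def split_score(score):
    """Return (s_plus, s_minus) with score == s_plus - s_minus and at most one part non-zero."""
    return (score, 0.0) if score >= 0 else (0.0, -score)


def opposite_gap(score_a, score_b):
    """Small when the two scores have different signs but similar absolute values.

    Hypothetical measure: compare the positive part of one score with the
    negative part of the other, and vice versa.
    """
    a_plus, a_minus = split_score(score_a)
    b_plus, b_minus = split_score(score_b)
    return abs(a_plus - b_minus) + abs(a_minus - b_plus)


gap_opposite = opposite_gap(0.8, -0.8)  # different sign, same magnitude
gap_same = opposite_gap(0.8, 0.8)       # same sign
```

Because each part is non-negative and only one is active, a comparison like this stays expressible with linear constraints, which fits the linear-programming transformation mentioned later in the talk.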

Page 21

A Novel Optimization Framework

S = argmin { λprior × [prior term] + λsim × [similar-sentiment term] + λoppo × [opposite-sentiment term] + λrating × [rating term] }, subject to constraints

Signals behind each term:
1. λprior: General sentiment lexicon
2. λrating: Overall rating
3. λsim: Synonyms, “and” clues
4. λoppo: Antonyms, “but” clues

Weights set as the degree we trust each signal

• Transform to linear programming
• Solved efficiently using GAMS/CPLEX

Page 22

Evaluation: Data Sets

                           Hotel Data    Printer Data
Source                     TripAdvisor   Customer Survey
# doc                      4792          3511
# aspects                  7             25
AVG length                 270           24
# judged doc               750           3511
# judged lexicon entry     705           NA
# judged doc-aspect pair   2145          4634

Evaluation (1): Lexicon Quality
Evaluation (2): Doc-Aspect Sentiment, aggregate the sentiment of lexicon entries to doc level

Page 23

Evaluation (1): Lexicon Quality
OPT > Global > Dictionary

Hotel Data

Method   Precision   Recall   F-Score   Description
Random   0.4932      0.2784   0.3559    Guess 1, 0, -1 uniformly
MPQA     0.9631      0.3702   0.5348    General dictionary only
INQ      0.8757      0.4397   0.5855    General dictionary only
Global   0.7073      0.5929   0.6451    Overall ratings only [Lu et al. WWW09]
OPT      0.8125      0.6823   0.7417    Our method with equal weights, i.e. λprior:λrating:λsim:λoppo = 1:1:1:1

F-Score gain of OPT: 15% over Global, 27% over INQ, 39% over MPQA

Interesting sample results using OPT:
• Hotel Data: ROOM-private, FOOD-excelent
• Printer Data: INK-fast, SUPPORT-fast
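The F-Score column and the percentage gains can be checked with a few lines, assuming the usual harmonic-mean F-Score and gains computed relative to the weaker method. The function names are illustrative.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def relative_gain(new, old):
    """Relative improvement of `new` over `old` (0.15 means 15%)."""
    return (new - old) / old


# Values taken from the Hotel Data lexicon-quality table
opt_f = f_score(0.8125, 0.6823)
gain_over_global = relative_gain(0.7417, 0.6451)
gain_over_inq = relative_gain(0.7417, 0.5855)
gain_over_mpqa = relative_gain(0.7417, 0.5348)
```

Rounded to whole percentages, the three gains agree with the 15%, 27% and 39% annotations on the slide.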

Page 24

Tuning weights further improves performance

λprior   λsim   λoppo   λrating   F-Score

OPT default: equal weights
1        1      1       1         0.7417

Dropping one term
0        1      1       1         0.6549
1        0      1       1         0.7309
1        1      0       1         0.7408
1        1      1       0         0.6453

More weight on important terms
2        1      1       2         0.7431
3        1      1       3         0.7544
6        1      1       6         0.7510
8        1      1       8         0.7506

Page 25

Evaluation (2): Doc-Aspect Sentiment
OPT > Global > Dictionary

Printer Data

Method   Precision   Recall   F-Score   MSE
Random   0.4844      0.2629   0.3408    0.7142
MPQA     0.7579      0.1597   0.2639    0.5740
INQ      0.7879      0.3502   0.4849    0.5365
Global   0.7645      0.5448   0.6362    0.5091
OPT      0.8222      0.5276   0.6428    0.4680

OPT vs. Global / INQ / MPQA: F-Score +1% / +33% / +144%, MSE -8% / -13% / -18%

Hotel Data

Method   Precision   Recall   F-Score   MSE
Random   0.4368      0.3689   0.3999    0.5670
MPQA     0.8128      0.5289   0.6408    0.4700
INQ      0.7800      0.6294   0.6966    0.4561
Global   0.6975      0.7730   0.7333    0.4426
OPT      0.7283      0.7756   0.7512    0.4160

OPT vs. Global / INQ / MPQA: F-Score +2% / +8% / +17%, MSE -6% / -9% / -11%

Page 26

Roadmap

Integrated Summary pipeline:
• Opinion Integration: [WWW’08] [COLING'10]
• Sentiment Analysis: [WWW’09] [KDD’10] [WWW’11]
• Quality Prediction: [WWW’10]

Covered in this part:
• [WWW’10] “Exploiting Social Context for Review Quality Prediction”

Page 27

Existing Work of Quality Prediction

• As a supervised learning problem: [Zhang&Varadarajan `06] [Kim et al. `06] [Liu et al. `08] [Ghose&Ipeirotis `10]
• Features: textual features, meta-data features
• Labeled reviews (Very Helpful √ / Not Helpful ×) are used to predict the quality of unlabeled reviews (?)

Page 28

Base model: Linear Regression

Quality(review i) = Weights × FeatureVector(review i), using Textual Features

w = argmin { loss over the labeled reviews } [formula image not recoverable]

Closed-form: w = [formula image not recoverable]

Labels are expensive to obtain!
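A minimal sketch of such a base model, assuming a plain squared-error loss so that the closed form is the standard normal-equation solution of (XᵀX) w = Xᵀy. The slide's exact loss is not recoverable, so this is an assumption; the helper names and the toy data are made up.

```python
def solve_linear_system(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(n):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [M[i][n] / M[i][i] for i in range(n)]


def fit_least_squares(X, y):
    """Return w minimising sum_i (w . x_i - y_i)^2 via the normal equations."""
    d = len(X[0])
    XtX = [[sum(row[j] * row[k] for row in X) for k in range(d)] for j in range(d)]
    Xty = [sum(row[j] * yi for row, yi in zip(X, y)) for j in range(d)]
    return solve_linear_system(XtX, Xty)


# Toy labeled reviews: features are [bias, one made-up textual feature]
X = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
y = [1.0, 2.0, 3.0]
w = fit_least_squares(X, y)
```

The fit needs a quality label for every training review, which is the cost the slide points out.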

Page 29

We also observe…

Social Context = Reviewer Identity + Social Network

Intuitions:
• Quality([review]) is related to Quality([its reviewer])
• Quality([reviewer]) is related to its Social Network

How to use them to help prediction?

Our idea: social context can help!

Page 30

Our approach: add social context as graph-based regularizers

w = argmin { Baseline Loss function + β × Graph Regularizer }

• Baseline loss function: computed on labeled data
• β: trade-off parameter
• Graph regularizer: designed to “favor” our intuitions, computed on labeled and unlabeled data

Advantages:
• Semi-supervised: make use of unlabeled data
• Applicable to reviews without social context

How to design the regularizers?

Page 31

Hypothesis 1: Reviewer Consistency

Reviews 1 and 2 are written by one reviewer, reviews 3 and 4 by another:

Quality(review 1) ~ Quality(review 2)
Quality(review 3) ~ Quality(review 4)

Reviewers are consistent!

Page 32

Regularizer for Reviewer Consistency

Reviewer Regularizer = ∑ [ Quality(review 1) - Quality(review 2) ]², summed over pairs of reviews by the same author

Same-Author Graph (A): edges 1-2 and 3-4

Closed-form solution! w = [expression in terms of the Graph Laplacian and the Review-Feature Matrix; formula image not recoverable]

[Zhou et al. 03] [Zhu et al. 03] [Belkin et al 06]
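The link between the pairwise sum and the graph Laplacian can be checked numerically: for an undirected graph with Laplacian L = D - A, the quadratic form qᵀLq equals the sum of squared differences over the edges. The sketch below uses the four-review same-author graph from the slide with made-up quality values; function names are illustrative.

```python
def laplacian(n, edges):
    """Graph Laplacian L = D - A of an undirected graph on n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i, j in edges:
        L[i][i] += 1.0
        L[j][j] += 1.0
        L[i][j] -= 1.0
        L[j][i] -= 1.0
    return L


def quadratic_form(L, q):
    """Compute q^T L q."""
    n = len(q)
    return sum(q[i] * L[i][j] * q[j] for i in range(n) for j in range(n))


def reviewer_regularizer(q, edges):
    """Sum of squared quality differences over same-author pairs."""
    return sum((q[i] - q[j]) ** 2 for i, j in edges)


edges = [(0, 1), (2, 3)]   # reviews 1-2 share an author, reviews 3-4 share an author
q = [0.9, 0.7, 0.2, 0.5]   # made-up predicted qualities
pairwise = reviewer_regularizer(q, edges)
via_laplacian = quadratic_form(laplacian(4, edges), q)
```

Since the predicted quality is linear in w, the regularizer stays quadratic in w, which is why a closed-form solution is available for this hypothesis.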

Page 33

Hypothesis 2: Trust Consistency

Quality([truster]) - Quality([trusted reviewer]) ≤ 0

I trust people with quality at least as good as mine!

Page 34

Regularizer for Trust Consistency

Trust Regularizer = ∑ max[0, Quality([truster]) - Quality([trusted reviewer])]², summed over the edges of the Trust Graph

• No closed-form solution…
• Still convex
• Solved with Gradient Descent
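A small sketch of the penalty and of gradient descent on it. As a simplification, it optimises free quality values directly and omits the regression loss; in the talk the penalty is a regularizer added to the baseline loss and optimised over the weights w. The step size, iteration count and names are assumptions.

```python
def trust_penalty(q, trust_edges):
    """Sum over (truster, trusted) edges of max(0, q[truster] - q[trusted])^2."""
    return sum(max(0.0, q[u] - q[v]) ** 2 for u, v in trust_edges)


def trust_gradient(q, trust_edges):
    """Gradient of trust_penalty with respect to each quality value."""
    grad = [0.0] * len(q)
    for u, v in trust_edges:
        gap = q[u] - q[v]
        if gap > 0:  # only violated edges contribute
            grad[u] += 2 * gap
            grad[v] -= 2 * gap
    return grad


def descend(q, trust_edges, step=0.1, iterations=100):
    """Plain gradient descent on the trust penalty alone."""
    q = list(q)
    for _ in range(iterations):
        grad = trust_gradient(q, trust_edges)
        q = [qi - step * gi for qi, gi in zip(q, grad)]
    return q


trust_edges = [(0, 1)]   # reviewer 0 trusts reviewer 1
q_start = [0.9, 0.3]     # made-up qualities that violate the hypothesis
q_end = descend(q_start, trust_edges)
```

The penalty is zero whenever the trusted reviewer is at least as good as the truster, so only violating pairs are pulled together.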

Page 35

Hypothesis 3 & 4

Two further graphs are derived from the Trust Graph:
• Hypothesis 3: Co-citation Consistency (Co-citation Graph)
• Hypothesis 4: Link Consistency (Link Graph)

Page 36

Mathematical Formulations

[The four regularizer formulas were images and are not recoverable; only the solution methods survive.]

1. Reviewer Consistency: Closed form
2. Trust Consistency: Gradient descent
3. Co-citation Consistency: Closed form
4. Link Consistency: Closed form

Page 37

Evaluation: Data Sets from Ciao UK

Statistics               Cellphone   Beauty   Digital Camera
# Reviews                1943        4849     3697
Reviews/Reviewer ratio   2.21        2.84     1.06
Trust Graph Density      0.0075      0.014    0.0006

Summary                         Cellphone   Beauty   Digital Camera
Social Context                  rich        rich     sparse
Gold-std Quality Distribution   balanced    skewed   balanced

Page 38

Our methods are most effective with limited labeled data

[Figure (Cellphone): % of MSE Difference relative to the Baseline (y-axis, 0% down to -16%, with an arrow labeled “Better”) against Percentage of labeled Data (10%, 25%, 50%, 100%), with one series each for Reg:Link, Reg:Reviewer, Reg:Cocitation and Reg:Trust.]

Page 39

Our methods are most effective with rich social context

[Figure: % of MSE Difference relative to the Baseline (y-axis, -1% down to -15%, with an arrow labeled “Better”) for Cellphone, Beauty and Digital Camera, with one series each for Reg:Link, Reg:Reviewer, Reg:Cocitation and Reg:Trust. Annotation on the slide: Reviews/Reviewer ratio = 1.06.]

Page 41

Summary of this talk

Integrated Summary pipeline:
• Opinion Integration: [WWW’08] [COLING'10]
• Sentiment Analysis: [WWW’09] [KDD’10] [WWW’11]
• Quality Prediction: [WWW’10]

Covered in this talk:
1. Sentiment Analysis: construct aspect-dependent sentiment lexicon
2. Quality Prediction: exploit social context

Page 42

Future Directions

Data scale: 65M msgs/day; 53M blogs, 1307M posts; 115M users, 10M groups; 45M reviews

• Integrative Analysis
• Efficient Algorithms for Real-time Interaction
• Task-support Applications

Page 43

Summary of my other work: Text Information Management

• Text Mining
– [IRJ 10] “Investigation of Topic Models”
– Opinion Integration and Summarization: [WWW 08] [WWW 09] [WWW 10] [COLING 10] [KDD 10] [WWW 11]
• Bioinformatics
– [NAR 07] “An open system for microarray clustering”
– [NAR 10] “Bio literature mining”
• Information Retrieval
– [IRJ 09] “Bio literature IR”
– [TREC 07]

Page 44

Thank you! & Questions?

Page 45

Backup Slides

Page 46

References

[WWW'11] Yue Lu, Malu Castellanos, Umeshwar Dayal, ChengXiang Zhai. "Automatic Construction of a Context-Aware Sentiment Lexicon: An Optimization Approach", To Appear at WWW’11.

[COLING'10] Yue Lu, Huizhong Duan, Hongning Wang and ChengXiang Zhai. "Exploiting Structured Ontology to Organize Scattered Online Opinions", In Proceedings of the 23rd International Conference on Computational Linguistics, Pages: 734-742.

[KDD’10] Hongning Wang, Yue Lu, and ChengXiang Zhai. "Latent Aspect Rating Analysis on Review Text Data: A Rating Regression Approach", In Proceedings of the 16th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Pages: 783-792.

[WWW'10] Yue Lu, Panayiotis Tsaparas, Alexandros Ntoulas, and Livia Polanyi. "Exploiting Social Context for Review Quality Prediction", In Proceedings of the 19th International World Wide Web Conference, Pages: 691-700.

[WWW'09] Yue Lu, ChengXiang Zhai and Neel Sundaresan. "Rated Aspect Summarization of Short Comments", In Proceedings of the 18th International World Wide Web Conference, Pages: 131-140.

[WWW'08] Yue Lu and ChengXiang Zhai. "Opinion Integration Through Semi-supervised Topic Modeling", In Proceedings of the 17th International World Wide Web Conference, Pages: 121-130.

Page 47

Other Publications

[IRJ’10] (Topic models) Yue Lu, Qiaozhu Mei, ChengXiang Zhai. "Investigating Task Performance of Probabilistic Topic Models - An Empirical Study of PLSA and LDA", Information Retrieval.

[NAR’10] (Bioinformatics) X. He, Y. Li, R. Khetani, B. Sanders, Yue Lu, X. Ling, C.-X. Zhai, B. Schatz. "BSQA: Integrated Text Mining Using Entity Relation Semantics Extracted from Biological Literature of Insects", Nucleic Acids Research.

[IRJ’09] (Biomedical IR) Yue Lu, Hui Fang and ChengXiang Zhai. "An Empirical Study of Gene Synonym Query Expansion in Biomedical Information Retrieval", Information Retrieval, Volume 12, Issue 1 (2009), Pages: 51-68.

[TREC'07] (Biomedical IR) Yue Lu, Jing Jiang, Xu Ling, Xin He, ChengXiang Zhai. "Language Models for Genomics Information Retrieval: UIUC at TREC 2007 Genomics Track", In Proceedings of the 16th Text REtrieval Conference.

[NAR’07] (Bioinformatics) Yue Lu, Xin He and Sheng Zhong. "Cross-species microarray analysis with the OSCAR system suggests an INSR->Pax6->NQO1 neuro-protective pathway in ageing and Alzheimer's disease", Nucleic Acids Research 105-114.

Page 48

Generating Candidate Lexicon Entries

Input: The LCD is great but battery is so large.

Parsed: [The/DT LCD/NN is/VBZ great/JJ] but/CC [battery/NN is/VBZ so/RB large/JJ] ./.

Aspect Tagged: [The/DT (LCD/NN):SCREEN is/VBZ great/JJ] but/CC [(battery/NN):BATTERY is/VBZ so/RB large/JJ] ./.

Candidates: SCREEN-great, BATTERY-large

Candidate entries whose sentiment scores are still unknown (?): SCREEN-large, SCREEN-great, BATTERY-large, …
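The steps above can be sketched on already POS-tagged text. This is a simplified illustration, not the parser-based procedure from the talk: it assumes word/TAG tokens as input, treats a coordinating conjunction (CC) as a clause boundary, and pairs each aspect noun with the adjectives that follow it in the same clause. The keyword lists come from the aspect definitions earlier in the deck; the function names are hypothetical.

```python
ASPECT_KEYWORDS = {
    "SCREEN": {"screen", "lcd", "display"},
    "BATTERY": {"battery", "power", "charger"},
}


def parse_tagged(text):
    """Turn 'The/DT LCD/NN ...' into [(word, tag), ...]."""
    return [tuple(token.rsplit("/", 1)) for token in text.split()]


def candidate_entries(tagged_tokens):
    """Pair each aspect noun with the adjectives that follow it in the same clause."""
    candidates, aspect = [], None
    for word, tag in tagged_tokens:
        if tag == "CC":                      # a conjunction starts a new clause
            aspect = None
        elif tag.startswith("NN"):
            for name, keywords in ASPECT_KEYWORDS.items():
                if word.lower() in keywords:
                    aspect = name
        elif tag == "JJ" and aspect is not None:
            candidates.append(f"{aspect}-{word.lower()}")
    return candidates


tagged = parse_tagged(
    "The/DT LCD/NN is/VBZ great/JJ but/CC battery/NN is/VBZ so/RB large/JJ ./."
)
entries = candidate_entries(tagged)
```

Adjectives that precede their noun (for example "great LCD") are not handled by this sketch; a real parser would be needed for that.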

Page 49

Hypotheses Testing (1): Reviewer Consistency

[Figure (Cellphone): Density of the Difference in Review Quality. One curve shows Qg(review 1) - Qg(review 2) for reviews from the same reviewer, the other Qg(review 1) - Qg(review 3) for reviews from different reviewers.]

Hypothesis 1: Reviewer Consistency is supported by data

Page 50

Hypotheses Testing (2-4): Social Network-based Consistencies

[Figure (Cellphone): Density of the Difference in Reviewer Quality, Qg(B) - Qg(A), with one curve for each case: B is not linked to A, B trusts A, B is co-cited with A, B is linked to A.]

Hypotheses 2-4: Social Network-based Consistencies are supported by data