need-based product review mining


Upload: chanda-olsen

Post on 01-Jan-2016


Weiwei

[email protected]

Need-based Product Review Mining

Outline

Introduction
  Traditional Product Review Mining
  Change to "Need-based Product Review Mining"
Research Area
Technology Related
  Need Recognition
  Feature (explicit & implicit) Extraction
  Opinion Extraction
  Scoring and Ranking
Conclusion

Introduction

Traditional Product Review Mining

Product-centric (product-based)
Process:
  1. Select a product
  2. Review mining
  3. Structural visualization
Papers: Liu.B [KDD04, WWW05], Dave.K [WWW03], Turney.P [ACL02], Liu [KDD08], etc.
An example: CRO

An example: CRO

[1] Select a product (or input a product)
[2] Review mining of the corresponding product
[3] Structural visualization

Change to "Need-based Mining"

Motivation: analysis of online purchasing
  "Customers seek to satisfy a particular need" [Kotler03]
  Compared with traditional in-store purchasing:
    In a store, a clerk helps you
    Online, sellers use chat software to interact with customers
  What are they talking about?
    Helping to translate the customer's "need" into a specific product

Change to "Need-based Mining"

Motivation: why do this? Why not let customers do it alone?
  They don't know what the product attributes mean
  They only have a need in mind
  They need a recommended list of products that satisfy their need
How to translate the need is the problem

Need-based Mining

Need-based (or user-centric)
  Focus on multiple products of a product category (not a single product)
  Associate a "need" with a set of attributes of the product
  Recommend products by sentiment analysis towards those attributes

Research Area

Research Framework

[Figure: two frameworks side by side, labeled "CSS Research Framework" and "Traditional Product Review Mining Research Framework". Node labels: Customer, Product Review, Product, and, in one of the two diagrams, Need.]

Research Framework

[Figure: research framework, split into an Online part and an Offline part. Box labels: Review DB; a need; Feature extraction; Merge similar features; Ontology construction; Need recognition; Opinion extraction (<feature, opinion> pairs, including implicit feature identification); Sentiment analysis; Aggregation function; Product scoring; a ranked list of products.]

Technology Related

1. Need Recognition
2. Feature Extraction
3. Opinion Extraction (sentiment analysis)
4. Scoring and Ranking

1. Need Recognition

Need definition: "Feeling of want that provides a basis for behavior or action" (name)
  These words are implicitly related to a set of features of a product category
  Each feature has a weight
  Need = <n, F, W>
Some examples:
  "a camera for climbing": <"Climbing", {size, weight, wide-angle}, {0.3, 0.5, 0.2}>
  "a sun-resistant cosmetic": <"sun-resistant", {whitening, price}, {0.8, 0.2}>
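The Need = <n, F, W> triple can be written down as a small data structure. This is only a sketch: the class name `Need`, the method `weight_of`, and the check that weights sum to 1 are assumptions made for illustration (both slide examples happen to sum to 1), not something the slides specify.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Need:
    """A need as the triple <n, F, W> from the slide."""
    name: str                  # n: word or phrase naming the need
    features: Tuple[str, ...]  # F: features associated with the need
    weights: Tuple[float, ...] # W: one weight per feature

    def __post_init__(self):
        if len(self.features) != len(self.weights):
            raise ValueError("each feature needs exactly one weight")
        # Assumption: weights are normalized, as in both slide examples.
        if abs(sum(self.weights) - 1.0) > 1e-9:
            raise ValueError("weights are expected to sum to 1")

    def weight_of(self, feature: str) -> float:
        """Weight of a feature, or 0.0 if the need does not involve it."""
        return dict(zip(self.features, self.weights)).get(feature, 0.0)


# The two examples from the slide.
climbing = Need("Climbing", ("size", "weight", "wide-angle"), (0.3, 0.5, 0.2))
cosmetic = Need("sun-resistant", ("whitening", "price"), (0.8, 0.2))
```

Keeping the triple immutable makes it safe to reuse one recognized need across many products when scoring.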

Need Recognition

[Figure: a need name linked to feature clusters 1 to 4.]
Degree of association (DOA) calculation: PMI, LSA, etc.

Introduction to PMI

Fact object (f) and discriminator object (d)
Based on co-occurrence of words
$$\mathrm{PMI}(f, d) = \lg \frac{P(f, d)}{P(f)\,P(d)}$$
PMI = 0: independent; PMI > 0: dependent
Estimation: PMI-IR [21]
$$\mathrm{PMI\text{-}IR}(f, d) = \log \frac{\mathrm{hits}(f\ \mathrm{Constraint}\ d)}{\mathrm{hits}(f)\,\mathrm{hits}(d)}$$
Constraint: NEAR, AND, etc. [23 AltaVista]
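As a rough illustration of the two formulas on this slide, PMI(f, d) = log P(f, d) / (P(f) P(d)) and its hit-count estimate PMI-IR(f, d) = log hits(f Constraint d) / (hits(f) hits(d)), the sketch below computes both from raw counts. The function names, the base-2 logarithm, and the choice to return negative infinity for a zero co-occurrence count are assumptions; the hit counts would come from a search engine or a review corpus, which is not modeled here.

```python
import math


def pmi(count_fd: int, count_f: int, count_d: int, total: int) -> float:
    """PMI(f, d) from document counts in a corpus of `total` documents.

    P(f, d) / (P(f) * P(d)) simplifies to count_fd * total / (count_f * count_d).
    """
    if count_f <= 0 or count_d <= 0 or total <= 0:
        raise ValueError("counts for f, d and the corpus must be positive")
    if count_fd == 0:
        return float("-inf")  # assumption: never co-occurring means -inf
    return math.log2(count_fd * total / (count_f * count_d))


def pmi_ir(hits_fd: int, hits_f: int, hits_d: int) -> float:
    """PMI-IR style estimate: log of hits(f Constraint d) / (hits(f) * hits(d))."""
    if hits_f <= 0 or hits_d <= 0:
        raise ValueError("hit counts for f and d must be positive")
    if hits_fd == 0:
        return float("-inf")
    return math.log2(hits_fd / (hits_f * hits_d))


independent = pmi(10, 100, 100, 1000)  # ratio is exactly 1
dependent = pmi(50, 100, 100, 1000)    # ratio is exactly 5
```

Because PMI-IR drops the corpus size, its values are only comparable across candidate features for the same discriminator, which is how a ranking of features for one need would use them.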

DOA Calculation

How to find and quantify the association between two objects?
Ideal condition: the product reviews corpus is the full set
$$\mathrm{PMI} = \lg \frac{P(A, B)}{P(A)\,P(B)}$$

DOA Calculation

Actual condition: a reviews corpus plus a PMI-IR [21]-like algorithm

Need Recognition

[Figure: a need name linked to feature clusters 1 to 4.]
Find the feature set F and the weights W:
F -> {F1, F3}, W -> {W1, W3}

Feature Set and Weight

Feature selection condition: degree of association (DOA) ≥ δ (threshold)
Weight calculation for the set:
$$W_i = \frac{\mathrm{DOA}(F_i, n)}{\sum_{F_j \in F} \mathrm{DOA}(F_j, n)}$$
Need description: Need = <name, F, W>
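The selection and weighting step can be sketched as follows: keep the features whose DOA with the need name reaches the threshold δ, then set each weight to that feature's DOA divided by the sum of DOA values over the selected set. The function name `recognize_need`, the dictionary input, and the alphabetical ordering of the output are assumptions; the DOA scores themselves would come from PMI, LSA, or a similar measure.

```python
from typing import Dict, List, Tuple


def recognize_need(name: str, doa: Dict[str, float],
                   threshold: float) -> Tuple[str, List[str], List[float]]:
    """Build <name, F, W> from DOA scores between the need name and features."""
    # Feature selection condition: DOA >= threshold.
    selected = {f: score for f, score in doa.items() if score >= threshold}
    total = sum(selected.values())
    if total == 0:
        return name, [], []  # nothing passes the threshold
    features = sorted(selected)
    # W_i = DOA(F_i, n) / sum of DOA(F_j, n) over the selected features.
    weights = [selected[f] / total for f in features]
    return name, features, weights


# Hypothetical DOA scores for the need "climbing".
scores = {"size": 3.0, "weight": 5.0, "price": 0.5, "wide-angle": 2.0}
need = recognize_need("climbing", scores, threshold=1.0)
```

With these made-up scores, "price" falls below the threshold and the remaining weights are 0.3, 0.5, and 0.2, matching the shape of the camera example earlier in the deck.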

2. Feature Extraction (Ontology Construction)

Related work: supervised methods, unsupervised methods.
Disadvantages:
  Similar-feature clustering problem (concept relationship discovery)
  Implicit feature recognition problem

Feature Extraction

What is a feature (a broader notion than an attribute)?
  Not only the product parameters (attributes)
  All the aspects of the product that are commented on
  "Official parameter specification" + "consumer comment aspects"
  Features are infinite
  Explicit features and implicit features

Feature Extraction (Explicit Feature) [17]

[Figure: explicit feature extraction pipeline. Box labels: Relevant product review corpus; Irrelevant product review corpus; Candidate feature set; Noise filter; Feature set; Similar feature clustering; Supplementary.]

Similar Feature Clustering

Related work: [14], [15], [18]: "reinforcement clustering of heterogeneous web objects"
First problem: how to pre-define similar features?
  Synonymous features; the same aspect of the product
Experiments:
  Content only to cluster similar features
  Link only to cluster similar features
  Content plus link to cluster similar features

Experiments

[Figure: features and opinions.]

Content-based method

Similarity calculation: Sim(f, w), defined in terms of Substr(f, w), CountOp(f, w), and Cooccur(f, w)
Clustering: PAM
Measurement: Entropy, Precision, Recall, F-measure.

Link-based method

Similarity calculation: SimRank[19]

Clustering: PAM

Measurement: Entropy, Precision, Recall, F-Measure.
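To make the link-based similarity concrete, here is a minimal sketch of the basic SimRank iteration [19] on a small feature/opinion graph: two nodes are similar if their neighbors are similar, with s(a, a) = 1. The graph, the decay factor 0.8, the iteration count, and treating the feature-opinion links as undirected are all assumptions for illustration; the slides do not say how the graph was built or which SimRank variant was used.

```python
from typing import Dict, List


def simrank(neighbors: Dict[str, List[str]], decay: float = 0.8,
            iterations: int = 10) -> Dict[str, Dict[str, float]]:
    """Naive SimRank: s(a, b) = decay * mean of s(x, y) over neighbor pairs."""
    nodes = list(neighbors)
    sim = {a: {b: 1.0 if a == b else 0.0 for b in nodes} for a in nodes}
    for _ in range(iterations):
        new = {a: {} for a in nodes}
        for a in nodes:
            for b in nodes:
                if a == b:
                    new[a][b] = 1.0
                elif not neighbors[a] or not neighbors[b]:
                    new[a][b] = 0.0
                else:
                    total = sum(sim[x][y]
                                for x in neighbors[a] for y in neighbors[b])
                    new[a][b] = decay * total / (
                        len(neighbors[a]) * len(neighbors[b]))
        sim = new
    return sim


# Hypothetical graph: features link to the opinion words used about them.
graph = {
    "screen": ["bright", "clear"],
    "display": ["bright", "clear"],
    "battery": ["long-lasting"],
    "bright": ["screen", "display"],
    "clear": ["screen", "display"],
    "long-lasting": ["battery"],
}
scores = simrank(graph)
```

In this toy graph "screen" and "display" share all their opinion words and come out similar, while "battery" shares none and stays at zero; a clustering step such as PAM would then run on these pairwise scores.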

Content plus link method

[Figure, shown step by step over six slides: features and opinions.]

Experiment Results

             Link method   Content-based   Content plus link
Entropy      5.9891        3.4160          2.4037
Precision    0.6443        0.6860          0.7308
Recall       0.5104        0.6895          0.7312
F-measure    0.5688        0.6872          0.7310

Feature Extraction (Implicit Feature)

[Figure: opinions linked to features.]
Confidence value: Opinion(w) -> F(i)

Feature Relationship Learning

Synonymy; hypernymy/hyponymy; part/whole

3. Opinion Extraction (a subtask of Sentiment Analysis)

Sentiment analysis
  Subjective/objective text classification, sentiment tracking, product opinion mining, etc.
Opinion extraction
  Context-based opinion polarity identification

What is an opinion?

Opinion
  Words or phrases that express a semantic orientation (positive, negative, or neutral)
  Context-independent opinions ("good", "bad", etc.)
  Context-dependent opinions ("big", "small", etc.)
Opinion semantic orientation identification
  Context-independent opinions
  Context-dependent opinions

Related Works

Context-independent opinion
  WordNet-based method [1, 2, 5]: seed list, incremental
  PMI-SO method [Turney 24]: seed list ("excellent", "awful", etc.)
Context-dependent opinion
  Syntactic rules (conjunction, disjunction, etc.) [Ding 20]
  Semantic clustering based [Liu 5], [Yang, "Study of Structurizing Chinese Product Review"]
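The PMI-based semantic orientation idea for context-independent opinions can be sketched as the difference between a word's PMI with a positive seed and its PMI with a negative seed, which reduces to a single log ratio of hit counts. The function name `so_pmi`, the smoothing constant 0.01, and all the counts below are assumptions for illustration; the seed words "excellent" and "awful" follow the slide.

```python
import math


def so_pmi(near_pos: int, near_neg: int, hits_pos: int, hits_neg: int,
           smoothing: float = 0.01) -> float:
    """Semantic orientation as PMI(word, pos seed) - PMI(word, neg seed).

    near_pos / near_neg: hits for the word near the positive / negative seed.
    hits_pos / hits_neg: hits for each seed alone (must be positive).
    The smoothing term avoids division by zero for unseen pairs.
    """
    if hits_pos <= 0 or hits_neg <= 0:
        raise ValueError("seed hit counts must be positive")
    return math.log2(((near_pos + smoothing) * hits_neg)
                     / ((near_neg + smoothing) * hits_pos))


# Hypothetical counts: seeds "excellent" and "awful" each seen 1000 times.
so_reliable = so_pmi(near_pos=80, near_neg=5, hits_pos=1000, hits_neg=1000)
so_flimsy = so_pmi(near_pos=5, near_neg=80, hits_pos=1000, hits_neg=1000)
```

A positive value suggests a positive orientation and a negative value a negative one. This only covers context-independent words; words like "big" or "small" need the feature they modify, which this score ignores.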

Problems

Find the context of an opinion word
  1. Word level
     E.g. "good", "bad", etc. (context-independent opinions)
  2. <feature, opinion> pair level
     E.g. "The camera is too heavy": <camera, heavy> is negative
  3. Sentence level
     E.g. "The camera is very shining but I don't like it."
     Almost no existing research considers this problem
     Splitting by "but" gives <camera, shining> as positive (it is actually negative here)

Future works

Try to tackle these problems, especially the third (sentence level).

4. Product Scoring and Ranking

Related work: product recommendation based on reviews [9], [12], etc.
Problems:
  Only one feature is considered at a time [12, Red Opal]
    A need always involves several features
  All reviews are treated as equal [all]
    Different reviews express different needs
  Only numerical scores (usually total scores) are considered [3, 4, 12]
    In a review, the polarity of feature fa may be negative and that of fb positive, yet the reviewer gives 3 stars overall

Product Scoring and Ranking

Need-based product recommendation
  Focus on multiple features at a time
  Weight each review by the degree to which it satisfies the given need
  Topic-based opinion extraction
Need = <n, F, W>
  n: a word or phrase that reveals the consumer need
  F: feature set
  W: weight of each feature

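Product Scoring and Ranking

Product scoring:
$$P.\mathrm{Score} = \sum_{i \in [1, n]} NA_i \cdot SO_i$$
$$SO_i = \sum_{f_j \in F} w_j \cdot SO_{f_j}$$
NA_i (need association): defined in terms of the counts N_n and N_{F_i}
Product ranking: product scores, NA (need association), etc.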
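Reading the scoring scheme on this slide as P.Score = Σ_i NA_i · SO_i with SO_i = Σ_j w_j · SO_{f_j}, a product's score can be sketched as below. Several things here are assumptions: NA_i is taken as a given number per review because its formula is not legible in the slide, a feature not mentioned in a review is treated as neutral (0), and the dictionary layout and function names are invented for the example.

```python
from typing import Dict, List


def review_orientation(feature_so: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """SO_i: weighted sum of per-feature orientations over the need's features."""
    # Assumption: a feature the review does not mention counts as neutral.
    return sum(w * feature_so.get(f, 0.0) for f, w in weights.items())


def product_score(reviews: List[dict], weights: Dict[str, float]) -> float:
    """P.Score: sum over reviews of NA_i * SO_i."""
    return sum(r["na"] * review_orientation(r["so"], weights) for r in reviews)


def rank_products(products: Dict[str, List[dict]],
                  weights: Dict[str, float]) -> List[str]:
    """Product names ordered from highest to lowest score."""
    scored = {name: product_score(revs, weights)
              for name, revs in products.items()}
    return sorted(scored, key=scored.get, reverse=True)


# Weights W from the "camera for climbing" need; review data is made up.
weights = {"size": 0.3, "weight": 0.5, "wide-angle": 0.2}
products = {
    "camera A": [{"na": 1.0, "so": {"size": 1.0, "weight": 1.0}}],
    "camera B": [{"na": 0.5, "so": {"weight": -1.0}}],
}
ranking = rank_products(products, weights)
```

Unlike a star rating, this score only moves with opinions on the features the need cares about, so a glowing review about an unrelated feature contributes nothing.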

Reference

[1] Liu.B. Opinion Observer: Analyzing and Comparing Opinions on the Web. WWW05

[2] Liu.B. Mining and Summarizing Customer Reviews. KDD04

[3] Turney.P. Thumbs Up or Thumbs Down?: Semantic Orientation Applied to Unsupervised Classification of Reviews. ACL02

[4] Dave.K. Mining the Peanut Gallery: Opinion Extraction and Semantic Classification of Product Reviews. WWW03

[5] Liu. CRO: a system for online review structurization. KDD08

[6] Kotler.P. Marketing Management. Prentice Hall 2003.

[7] Orman.L. Consumer Support System. Communications of ACM 2007

[8] Lee.T. Need-based Analysis of Online Customer Reviews. ICEC07

[9] Lee.T. Needs-Centric Searching and Ranking Based on Customer Reviews. ICEC08

[10] Lee.T. Use-centric mining of customer reviews. WITS04

[11] Lee.T. Constraint-based Ontology Induction from Online Customer Reviews. Group Decision and Negotiation

[12] Scaffidi.C. Red Opal: Product-Feature Scoring from Reviews. ACM-EC 07

[13] Scaffidi.C. Application of a Probability-based Algorithm to Extraction of Product Features from Online Reviews. CMU Technical Report 06

[14] H. J Zeng. A unified framework for clustering heterogeneous web objects. ICWISE 02.

[15] Q.Su. Hidden Sentiment Association in Chinese Web Opinion Mining. WWW 08

[16] X.Y Du. A Survey on Ontology Learning Research. Journal of Software 06

[17] W.Wei. Extracting Feature and Opinion Words Effectively from Chinese Product Review. FSKD 08

[18] J.D Wang. ReCoM: Reinforcement Clustering of Multi-Type Interrelated Data Objects. SIGIR 03

[19] G.Jeh. SimRank: A Measure of Structural-Context Similarity. SIGKDD 02

[20] X.W Ding. A Holistic Lexicon-Based Approach to Opinion Mining. WSDM 08

[21] P.Turney. Mining the Web for Synonyms: PMI-IR Versus LSA on TOEFL. ECML 01

[22] Manning.C. Foundations of Statistical Natural Language Processing. MIT Press 1999

[23] AltaVista: AltaVista Advanced Search Cheat Sheet. Alta Vista Company 01

[24] A.Maria. Extracting Product Features and Opinions from Reviews. EMNLP 05

End.

Any questions?