mingyu derek ma - a sentiment analysis method to …a sentiment analysis method to better utilize...

1
A Sentiment Analysis Method to Better Utilize User Profile and Product Information Mingyu MA (Derek) supervised by Prof. Qin Lu [email protected] derek.ma Hierarchical LSTM with Attention Document d UMN U(d) ^ PMN Joint Mechanism P(d) ^ Document d (text) (numeric vector) Sentiment Prediction Abstract Introduction Model Related Works Evaluation We proposed a new Joint User and Product Memory Network (JUPMN) utilizing user profile and product information in separate ways into sentiment classification. Inspired by the successful utilization of memory network, our model first creates document representations using hierarchical LSTM model and then feeds the document vectors into new carefully designed user and product memory networks to reflect corresponding features. The evaluation of JUPMN on three benchmark review datasets shows that JUPMN outperforms the state-of-the-art model and further analysis of experimental results is employed. Input Output ATT W External Memory ak dk-1 Attention Layer k user profile product info reviews document Task: Sentiment Analysis. Input: a paragraph of reviewing text Output: ratings/sentiment Can consider user profile and product information, while they are fundamentally different. We should not consider them as single united representation NN as classifier for text classification RNN, LSTM Neural-network- based Approaches Linear or kernel methods on lexical features Traditional Way Focus more on important text and add more associate data like eye- tracking data Attention Utilizing User Profile and Product Information as single representation > JUPMN Joint User and Product Memory Network Part I: Hierarchical LSTM network with Attention Document d (embedded by hierarchical LSTM) Softmax Sentiment Prediction WU wU U M N WP wP U(d) (embedded by hierarchical LSTM) ... ^ Attention Layer 3 d2 a3 Attention Layer 2 d1 a2 Attention Layer 1 d0 a1 P M N Attention Layer 3 d2 a3 Attention Layer 2 d1 a2 Attention Layer 1 a1 d0 P(d) (embedded by hierarchical LSTM) ... ^ d3 U d3 P Part II: Memory Network Attention Layer Benchmark datasets: IMDB, Yelp13 and Yelp14 JUPMN vs Comparison Models Outperforms the state-of-the-art model Best accuracy, RMSE and MAE Config : Importance of User vs Product Memory Network Config : Number of Hops Config : Memory Size Config : Joint Weights JUPMN with Different Configurations Config : Importance of User vs Product Memory Network User profile influences sentiments of movie reviews more Product information influences sentiments of restaurants reviews more Config : Number of Hops Smaller hop works better Config : Memory Size Larger memory helps until 75 Config : Joint Weights Weighted version works better Weight help to balance the influences of UMN and PMN Left: document number statistics Average joint weight for three datasets Performance vs Memory Size

Upload: others

Post on 25-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Mingyu Derek MA - A Sentiment Analysis Method to …A Sentiment Analysis Method to Better Utilize User Profile and Product Information Mingyu MA (Derek) supervised by Prof. Qin Lu

A Sentiment Analysis Method to Better Utilize User Profile and Product InformationMingyu MA (Derek) supervised by Prof. Qin Lu [email protected] derek.ma

Hierarchical LSTM with Attention

Document d

UMN

U(d)^

PMN

Joint Mechanism

P(d)^

Document d (text)

(numeric vector)

Sentiment Prediction

Document d

UMN

U(d)^

PMN

Joint Mechanism

P(d)^

Abstract

Introduction

Model

Related Works

Evaluation

We proposed a new Joint User and Product Memory Network (JUPMN) utilizing user profile and product information in separate ways into sentiment classification. Inspired by the successful utilization of memory network, our model first creates document representations using hierarchical LSTM model and then feeds the document vectors into new carefully designed user and product memory networks to reflect corresponding features. The evaluation of JUPMN on three benchmark review datasets shows that JUPMN outperforms the state-of-the-art model and further analysis of experimental results is employed.

Input

Output

ATT WExternal Memory

ak

dk-1

Attention Layer k

user profile

product inforeviews

document

Task: Sentiment Analysis. Input: a paragraph of reviewing textOutput: ratings/sentiment

Can consider user profile and product information, while they are fundamentally different. We should not consider them as single united representation

NN as classifier for text classificationRNN, LSTMNeural-network-based Approaches

Linear or kernel methods on lexical featuresTraditional Way

Focus more on important text and add more associate data like eye-tracking dataAttention

Utilizing User Profile and Product Information as single representation

>

JUPMNJoint User and Product

Memory Network

Part I: Hierarchical LSTM network with Attention

Document d(embedded by hierarchical LSTM)

Softmax Sentiment Prediction

WU

wU

UMN

WP

wP

U(d)(embedded by

hierarchical LSTM)

...

^

Attention Layer

3d2

a3

Attention Layer

2d1

a2

Attention Layer

1d0

a1

PMN

Attention Layer

3d2

a3

Attention Layer

2d1

a2

Attention Layer

1

a1

d0

P(d)(embedded by

hierarchical LSTM)

...

^

d3

U d3

P

Part II: Memory Network

Attention Layer

Benchmark datasets: IMDB, Yelp13 and Yelp14

JUPMN vs Comparison Models

Outperforms the state-of-the-art modelBest accuracy, RMSE and MAE

Config①: Importance of User vs Product Memory Network

Config②: Number of Hops

Config③: Memory Size

Config④:Joint Weights

JUPMN with Different Configurations

Config①: Importance of User vs Product Memory Network

• User profile influences sentiments of moviereviews more

• Product information influences sentiments of restaurants reviews more

Config②: Number of Hops• Smaller hop works betterConfig③: Memory Size• Larger memory helps until 75

Config④: Joint Weights• Weighted version works better• Weight help to balance the

influences of UMN and PMN

Left: document number statistics

Average joint weight for three datasets

Performance vs Memory Size