Sentiment Analysis on Amazon Movie Reviews Dataset
SENTIMENT ANALYSIS
AMAZON MOVIE REVIEW DATASET
IS 688 – WEB MINING
INSTRUCTOR: CHRISTOPHER MARKSON
TEAM MEMBERS: Maham | Amit | Mashael | Karan | Nidhish
OUTLINE
• Data Source, Collection & Parsing
• Model Selection & Optimizing Parameters
• Methods / Code Sample
• Results Overview & Value
DATA SOURCE, COLLECTION & PARSING
Amazon movie reviews, published by Jure Leskovec, Assistant Professor of Computer Science at Stanford University, on his personal site.
PROBLEMS
• Format was not R-friendly
• Only partial information was available; data context was missing
• We had reviews but no information about the movie
WORKAROUND / SOLUTION
• Wrote a parser in R to convert the JSON txt file into CSV
• Developed a NodeJS middleware to gather information about each movie
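The parsing step can be illustrated with a minimal sketch. The team wrote their parser in R; the version below uses Python purely for illustration and assumes one JSON object per line. The function name `json_lines_to_csv` and the field names `productId`, `score`, and `text` are hypothetical and are not taken from the slides.

```python
import csv
import io
import json


def json_lines_to_csv(lines, fieldnames):
    """Convert an iterable of JSON-object strings into CSV text.

    Assumption: each non-blank line holds one complete JSON object.
    """
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = json.loads(line)
        # Missing fields become empty cells instead of raising an error.
        writer.writerow({name: record.get(name, "") for name in fieldnames})
    return out.getvalue()


# Hypothetical sample records, for illustration only.
sample = [
    '{"productId": "B000001", "score": 5.0, "text": "Great movie, loved it"}',
    "",
    '{"productId": "B000002", "score": 2.0, "text": "Slow and boring"}',
]
csv_text = json_lines_to_csv(sample, ["productId", "score", "text"])
print(csv_text)
```

Using `csv.DictWriter` rather than manual string joining means review text containing commas or quotes is escaped correctly.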
PREPARED FILES
After parsing and gathering more data using Amazon Web Service, we got the following 2 files:
• Reviews
• Movie Details
MODEL SELECTION & OPTIMIZATION
• Basic Sentiment Score for each review, using the syuzhet package
• Provides 4 methods: bing, afinn, nrc, stanford; AFINN has 2477 weighted words and phrases
• Mainly uses the coreNLP and stringr libraries; captures the emotional trajectory of a review
• Create WordCloud for each movie, using the wordcloud package
• Combined all reviews into one variable, calculated term frequency & generated WordCloud images
• Used tm (text mining), SnowballC (text stemming), RColorBrewer (color palettes) alongside
• Pointwise Mutual Information (PMI) Sentiment Score for each movie, using the RCurl package
• Wrote our own function
• Movie_Title vs Excellent/Poor, Movie_Genre vs Excellent/Poor
• Final score was the ratio of Movie_Title / Movie_Genre
• Aggregated all the Sentiment Scores
• Took the median of all users' review scores
• Took the median of all users' review text sentiment scores
• Assigned an overall Sentiment Score to each movie
• Took the median of:
• User Review Score Aggr,
• User Review Text Sentiment Score Aggr,
• Movie_Title vs Genre PMI Score
METHODS / CODE SAMPLE
Basic Sentiment Score
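The code shown on this slide does not appear in the transcript. As a rough illustration of lexicon-based scoring in the AFINN style (sum the weights of the words found in a review), here is a minimal Python sketch. It is not the syuzhet package: `TOY_LEXICON` and its weights are made-up stand-ins for the real word list, and `sentiment_score` is a hypothetical helper.

```python
import re

# Hypothetical mini-lexicon in the AFINN style (integer weight per word).
# The real AFINN list is far larger; these entries are illustrative only.
TOY_LEXICON = {
    "excellent": 3,
    "love": 3,
    "good": 2,
    "boring": -2,
    "poor": -2,
    "terrible": -3,
}


def sentiment_score(text, lexicon=TOY_LEXICON):
    """Sum the lexicon weights of every word in the text (unknown words count 0)."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(lexicon.get(word, 0) for word in words)


reviews = [
    "Excellent cast and a good story.",
    "Poor pacing, boring plot.",
    "It is a movie.",
]
scores = [sentiment_score(review) for review in reviews]
print(scores)  # [5, -4, 0]
```

A positive total suggests a favorable review, a negative total an unfavorable one, and zero means no lexicon words were matched.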
WordCloud
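The term-frequency step behind a word cloud (combine all reviews for a movie, then count terms) might look like the Python sketch below. It is a simplified stand-in for the tm and wordcloud workflow named earlier: the `STOPWORDS` set is a small illustrative list, stemming is omitted, and rendering the image is left out.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "and", "is", "was", "it", "of", "this"}


def term_frequencies(reviews, stopwords=STOPWORDS):
    """Combine all reviews into one string and count non-stop-word terms."""
    combined = " ".join(reviews).lower()
    words = re.findall(r"[a-z']+", combined)
    return Counter(word for word in words if word not in stopwords)


# Hypothetical reviews for a single movie.
movie_reviews = [
    "The acting was great and the story was great",
    "Great soundtrack, weak story",
]
freqs = term_frequencies(movie_reviews)
print(freqs.most_common(2))  # [('great', 3), ('story', 2)]
```

The resulting counts are what a word-cloud renderer would use to size each term.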
Aggregation
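The aggregation described earlier (a per-movie median of review scores, a per-movie median of text sentiment scores, then a median across those two values and the PMI score) can be sketched in Python as follows. The tuple layout, the `overall_scores` name, and the sample values are assumptions for illustration. The slides do not say whether the three inputs were rescaled first, so this sketch uses them as given.

```python
from collections import defaultdict
from statistics import median


def overall_scores(reviews, pmi_by_movie):
    """Compute one overall score per movie.

    reviews: iterable of (movie_id, star_score, text_sentiment) tuples.
    pmi_by_movie: dict mapping movie_id to its PMI-based score.
    """
    stars = defaultdict(list)
    sentiments = defaultdict(list)
    for movie_id, star, sentiment in reviews:
        stars[movie_id].append(star)
        sentiments[movie_id].append(sentiment)
    return {
        movie_id: median(
            [
                median(stars[movie_id]),       # user review score aggregate
                median(sentiments[movie_id]),  # review text sentiment aggregate
                pmi_by_movie[movie_id],        # title vs genre PMI score
            ]
        )
        for movie_id in stars
    }


# Hypothetical data: (movie_id, star_score, text_sentiment).
reviews = [("m1", 5, 2), ("m1", 4, 1), ("m1", 1, 3), ("m2", 2, -1), ("m2", 3, -3)]
pmi = {"m1": 1.5, "m2": 0.5}
result = overall_scores(reviews, pmi)
print(result)
```

Medians are less sensitive than means to a handful of extreme reviews, which may be why the team chose them.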
PMI
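The team's own PMI function is not reproduced in the transcript. One common way to score a term against "excellent" and "poor" is Turney-style semantic orientation, computed from co-occurrence hit counts. The Python sketch below assumes that approach, which the slides do not confirm; all hit counts are invented, the `so_pmi` name and the `smoothing` constant are assumptions, and the code that would fetch counts from a search service is omitted.

```python
import math


def so_pmi(hits_with_excellent, hits_with_poor, hits_excellent, hits_poor,
           smoothing=0.01):
    """Semantic orientation: PMI(term, 'excellent') - PMI(term, 'poor').

    Expressed in hit counts, the difference of the two PMI values reduces to
    log2( hits(term, excellent) * hits(poor)
          / (hits(term, poor) * hits(excellent)) ).
    The smoothing constant avoids division by zero for unseen pairs.
    """
    return math.log2(
        ((hits_with_excellent + smoothing) * hits_poor)
        / ((hits_with_poor + smoothing) * hits_excellent)
    )


# Invented hit counts, for illustration only.
title_score = so_pmi(800, 100, hits_excellent=1000, hits_poor=1000)
genre_score = so_pmi(400, 100, hits_excellent=1000, hits_poor=1000)

# Final score as the ratio of title score to genre score, as on the slides.
# The ratio is undefined when the genre score is zero.
final_score = title_score / genre_score
print(round(title_score, 2), round(genre_score, 2), round(final_score, 2))
```

Dividing by the genre score expresses how a title is perceived relative to its genre baseline rather than in absolute terms.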
RESULT OVERVIEW & VALUE
The Count of Monte Cristo [Region 2]
Far from Home
Phonics Volume 1
• Alongside aggregate user reviews, Amazon can present
• an overall rating score, and
• a Word Cloud local to that product
• This would save users the time of reading through all the reviews and let them easily picture the overall user sentiment regarding that product.
THANK YOU