Yelp Big Data Analytics
Avi Dey - 008458892
Poonkodi Ponnambalam - 009438611
Marc Nipuna Dominic Savio - 009305205
1.6 Million Reviews
366K Users
61K Businesses
481K Business Attributes
500K Tips
Yelp Challenge Dataset
Token Analysis
1-star reviews
Most frequent words 1-star review
5-star reviews
Most frequent words 5-star review
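The token analysis above can be illustrated with a minimal sketch. It assumes each review is a dict with `stars` and `text` fields; the stop-word list, the `top_tokens` helper, and the sample reviews are all hypothetical and are not taken from the slides or the dataset.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"the", "a", "an", "and", "was", "is", "it", "to", "of", "i", "we", "in", "for", "my"}

def top_tokens(reviews, stars, n=3):
    """Return the n most frequent non-stop-word tokens in reviews with the given star rating."""
    counts = Counter()
    for review in reviews:
        if review["stars"] != stars:
            continue
        # Lowercase the text and keep runs of letters and apostrophes as tokens.
        tokens = re.findall(r"[a-z']+", review["text"].lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts.most_common(n)

# Made-up sample reviews for illustration only.
sample_reviews = [
    {"stars": 1, "text": "Rude staff and the food was cold."},
    {"stars": 1, "text": "Cold food, rude service."},
    {"stars": 5, "text": "Great food and great service!"},
    {"stars": 5, "text": "Amazing place, great staff."},
]

one_star = top_tokens(sample_reviews, stars=1)
five_star = top_tokens(sample_reviews, stars=5)
print("1-star:", one_star)
print("5-star:", five_star)
```

Running the same function once per star rating gives the two word lists that the 1-star and 5-star slides compare.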
People rate more when they have positive experiences
Users find 1-star ratings useful
Most of the users are lonely!!
How many friends do users have?
Almost half of the reviews left are useful
Food and Shopping businesses are reviewed most
● Is it possible to use machine learning to predict ratings for a business?
● Can we use this human-labelled data to train a predictive model?
Machine Learning - Rating Prediction
● Ratings are numeric and continuous. Use Linear Regression
● Y = a + b1 * x1 + b2 * x2 + b3 * x3 + ... + bk * xk
● Y: Continuous output value
● x1, x2, ..., xk: Features of the input
● a: Intercept; b1, b2, ..., bk: Weights of the features
● Stars = F(Biz attributes)
● Stars = a + b1 * (wifi) + b2 * (drive_thru) + ... etc.
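As a rough illustration of the model above, here is a minimal ordinary-least-squares sketch in plain Python. The slides do not say which tool or solver was used, so the `fit_linear_regression` helper, the normal-equations approach, and the toy data are all assumptions made for this example.

```python
def fit_linear_regression(X, y):
    """Fit y = a + b1*x1 + ... + bk*xk by ordinary least squares.

    Solves the normal equations (A^T A) w = A^T y by Gauss-Jordan elimination,
    where A is X with a leading column of ones for the intercept a.
    Returns [a, b1, ..., bk].
    """
    A = [[1.0] + [float(v) for v in row] for row in X]
    n = len(A[0])
    # Augmented matrix [A^T A | A^T y].
    M = [
        [sum(r[i] * r[j] for r in A) for j in range(n)]
        + [sum(r[i] * t for r, t in zip(A, y))]
        for i in range(n)
    ]
    for col in range(n):
        # Partial pivoting: move the row with the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [v / p for v in M[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col:
                f = M[r][col]
                M[r] = [vr - f * vc for vr, vc in zip(M[r], M[col])]
    return [M[i][n] for i in range(n)]

# Toy data (made up): each row is [wifi, drive_thru] as 0/1 flags.
X = [[0, 0], [1, 0], [0, 1], [1, 1], [1, 0]]
y = [3.5, 3.75, 3.0, 3.25, 3.75]

a, b_wifi, b_drive_thru = fit_linear_regression(X, y)
print(f"Stars = {a:.2f} + {b_wifi:.2f} * wifi + {b_drive_thru:.2f} * drive_thru")
```

Because the toy ratings were generated from an exact linear rule, the fit recovers an intercept near 3.5 and weights near 0.25 and -0.5; real review data would leave residual error.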
Rating Predictions
Hurting features (negative weights):
-0.5056 * drive_thru
-0.3192 * dj
-0.1692 * delivery
-0.1387 * good_for_groups
-0.0919 * accepts_credit_cards
Helpful features (positive weights):
+0.4745 * upscale
+0.4431 * intimate
+0.4003 * classy
+0.3434 * hipster
+0.341 * romantic
+0.2848 * valet
+0.2679 * coat_check
+0.2641 * dogs_allowed
+0.2112 * by_appointment_only
+0.1932 * background_music
+0.1496 * wheelchair_accessible
+0.1413 * divey
+0.1374 * good_for_dancing
Predict Ratings for a business
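A prediction from the fitted weights could look like the sketch below. The weights are copied from the feature lists on the previous slide; the intercept is not given in the source, so `ASSUMED_INTERCEPT` is a placeholder, and both the `predict_stars` helper and the clamping to the 1 to 5 star range are assumptions added for illustration.

```python
# Weights as listed on the slide (feature name -> coefficient).
WEIGHTS = {
    "drive_thru": -0.5056, "dj": -0.3192, "delivery": -0.1692,
    "good_for_groups": -0.1387, "accepts_credit_cards": -0.0919,
    "upscale": 0.4745, "intimate": 0.4431, "classy": 0.4003,
    "hipster": 0.3434, "romantic": 0.341, "valet": 0.2848,
    "coat_check": 0.2679, "dogs_allowed": 0.2641,
    "by_appointment_only": 0.2112, "background_music": 0.1932,
    "wheelchair_accessible": 0.1496, "divey": 0.1413,
    "good_for_dancing": 0.1374,
}

# ASSUMPTION: the slides do not report the intercept; 3.5 is a placeholder.
ASSUMED_INTERCEPT = 3.5

def predict_stars(attributes, intercept=ASSUMED_INTERCEPT):
    """Predict a star rating from a dict of attribute name -> True/False."""
    score = intercept + sum(
        WEIGHTS[name]
        for name, present in attributes.items()
        if present and name in WEIGHTS
    )
    # Keep the result inside Yelp's 1 to 5 star range (a choice made for this sketch).
    return min(5.0, max(1.0, score))

fast_food = predict_stars({"drive_thru": True, "delivery": True})
date_spot = predict_stars({"romantic": True, "valet": True})
print(f"fast food: {fast_food:.2f}, date spot: {date_spot:.2f}")
```

With the placeholder intercept, the drive-thru example scores lower than the romantic one, which matches the direction of the weights; the absolute values depend entirely on the assumed intercept.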
Advice to Improve your Business
Don’t:
-Have a Drive-Thru
-Be cash only
-Have free delivery
-Have a noisy environment
Advice to Improve your Business
Do:
-Take Appointments
-Offer valet services, coat-check
-Play smooth/ambient background music
-DJ music isn’t always the best!
-Be upscale, romantic and intimate
-Be socially responsible
-Be wheelchair accessible
-Be dog friendly