yelp big data analytics


Upload: avi-dey

Post on 07-Aug-2015


TRANSCRIPT

Page 1: Yelp Big Data Analytics

Yelp Big Data Analytics

Avi Dey - 008458892

Poonkodi Ponnambalam - 009438611

Marc Nipuna Dominic Savio - 009305205

Page 2: Yelp Big Data Analytics

1.6 Million Reviews

366K Users

61K Businesses

481K Business Attributes

500K Tips

Yelp Challenge Dataset

Page 3: Yelp Big Data Analytics

Token Analysis

Page 4: Yelp Big Data Analytics

1-star reviews

Page 5: Yelp Big Data Analytics

Most frequent words in 1-star reviews

Page 6: Yelp Big Data Analytics

5-star reviews

Page 7: Yelp Big Data Analytics

Most frequent words in 5-star reviews
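The token analysis on the preceding slides could be reproduced along the following lines. This is a minimal sketch, not the authors' actual pipeline: it assumes each review is a dict with `stars` and `text` fields, and the stopword list, the tokenizing regex, and the sample reviews are made up for illustration.

```python
import re
from collections import Counter

# Illustrative stopword list (an assumption; a real analysis would use a fuller one).
STOPWORDS = {"a", "an", "and", "the", "was", "is", "very", "it", "to", "of"}


def most_frequent_words(reviews, stars, top_n=10):
    """Count the most common non-stopword tokens in reviews with the given rating."""
    counts = Counter()
    for review in reviews:
        if review["stars"] == stars:
            # Lowercase the text and keep runs of letters and apostrophes as tokens.
            tokens = re.findall(r"[a-z']+", review["text"].lower())
            counts.update(t for t in tokens if t not in STOPWORDS)
    return counts.most_common(top_n)


# Made-up sample reviews, only to show the call shape.
sample_reviews = [
    {"stars": 1, "text": "Rude staff and cold food. Very rude."},
    {"stars": 5, "text": "Great food, great service!"},
    {"stars": 1, "text": "Cold coffee."},
]

top_five_star = most_frequent_words(sample_reviews, stars=5, top_n=1)
one_star_counts = dict(most_frequent_words(sample_reviews, stars=1))
```

Running the same function once with `stars=1` and once with `stars=5` gives the two word lists that the slides compare.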

Page 8: Yelp Big Data Analytics

People rate more when they have positive experiences

Page 9: Yelp Big Data Analytics

Users find 1 star ratings useful

Page 10: Yelp Big Data Analytics

Most of the users are lonely!

How many friends do users have?

Page 11: Yelp Big Data Analytics

Almost half of the reviews left are useful
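One way to arrive at a figure like this is to count the reviews that received at least one "useful" vote. The sketch below assumes that definition of "useful" and assumes each review carries a `votes` dict with a `useful` count; neither is confirmed by the slides, and the sample data is made up.

```python
def useful_share(reviews):
    """Return the fraction of reviews with at least one 'useful' vote."""
    if not reviews:
        return 0.0
    useful = sum(1 for r in reviews if r.get("votes", {}).get("useful", 0) > 0)
    return useful / len(reviews)


# Made-up sample: two of the four reviews have a useful vote.
sample_votes = [
    {"votes": {"useful": 3}},
    {"votes": {"useful": 0}},
    {"votes": {"useful": 1}},
    {"votes": {}},
]

share = useful_share(sample_votes)
```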

Page 12: Yelp Big Data Analytics

Food and Shopping businesses are reviewed most

Page 13: Yelp Big Data Analytics

● Is it possible to use machine learning and predict ratings for a business?

● Can we use this human-labelled data to train a predictive model?

Machine Learning - Rating Prediction

Page 14: Yelp Big Data Analytics

● Ratings are numeric and continuous. Use Linear Regression.

● Y = a + b1 * x1 + b2 * x2 + b3 * x3 + ... + bk * xk
● Y: continuous output value
● x1, x2, ..., xk: features of the input
● a: intercept; b1, b2, ..., bk: weights of the features

● Stars = F(Biz attributes)
● Stars = a + b1 * (wifi) + b2 * (drive_thru) + ... etc.
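The regression setup above can be sketched in plain Python as ordinary least squares solved through the normal equations. This is an illustration under stated assumptions rather than the tool the authors used: the function name is hypothetical, attributes are encoded as 0/1, and the four toy businesses are invented so that the exact weights are known in advance.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) w = X^T y.

    Returns [a, b1, ..., bk], where a is the intercept.
    """
    # Prepend a constant 1.0 so the first weight plays the role of the intercept a.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    n = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(n)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(n)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(n):
        pivot_row = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot_row][col]) < 1e-12:
            raise ValueError("features are linearly dependent")
        A[col], A[pivot_row] = A[pivot_row], A[col]
        pivot = A[col][col]
        A[col] = [v / pivot for v in A[col]]
        for r in range(n):
            if r != col:
                factor = A[r][col]
                A[r] = [vr - factor * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][n] for i in range(n)]


# Toy data (invented): columns are (wifi, drive_thru), generated from
# stars = 3.5 + 0.5 * wifi - 1.0 * drive_thru.
toy_attributes = [(0, 0), (1, 0), (0, 1), (1, 1)]
toy_stars = [3.5, 4.0, 2.5, 3.0]

a, b_wifi, b_drive_thru = fit_linear_regression(toy_attributes, toy_stars)
```

On the toy data the fit recovers the intercept and the two weights it was generated from, which is the same kind of output as the coefficient lists on the following slide.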

Rating Predictions

Page 15: Yelp Big Data Analytics

Hurting features:
-0.5056 * drive_thru
-0.3192 * dj
-0.1692 * delivery
-0.1387 * good_for_groups
-0.0919 * accepts_credit_cards

Helpful features:
+0.4745 * upscale
+0.4431 * intimate
+0.4003 * classy
+0.3434 * hipster
+0.341 * romantic
+0.2848 * valet
+0.2679 * coat_check
+0.2641 * dogs_allowed
+0.2112 * by_appointment_only
+0.1932 * background_music
+0.1496 * wheelchair_accessible
+0.1413 * divey
+0.1374 * good_for_dancing

Rating Predictions

Page 16: Yelp Big Data Analytics

Predict Ratings for a business
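Given the learned weights, a prediction is the intercept plus the weights of the attributes a business has. The sketch below copies the coefficients from the feature lists in this deck and assumes every attribute is a 0/1 flag. The slides do not report the intercept `a`, so it is left as a required argument, and the value used in the example is a hypothetical placeholder.

```python
# Weights copied from the "Hurting features" and "Helpful features" lists.
COEFFICIENTS = {
    "drive_thru": -0.5056,
    "dj": -0.3192,
    "delivery": -0.1692,
    "good_for_groups": -0.1387,
    "accepts_credit_cards": -0.0919,
    "upscale": 0.4745,
    "intimate": 0.4431,
    "classy": 0.4003,
    "hipster": 0.3434,
    "romantic": 0.341,
    "valet": 0.2848,
    "coat_check": 0.2679,
    "dogs_allowed": 0.2641,
    "by_appointment_only": 0.2112,
    "background_music": 0.1932,
    "wheelchair_accessible": 0.1496,
    "divey": 0.1413,
    "good_for_dancing": 0.1374,
}


def feature_contribution(attributes):
    """Sum the weights of the attributes a business has (each x_i is 0 or 1)."""
    return sum(COEFFICIENTS.get(name, 0.0) for name in attributes)


def predict_stars(attributes, intercept):
    """Stars = a + sum of b_i over the attributes present."""
    return intercept + feature_contribution(attributes)


# Hypothetical intercept: NOT taken from the slides, used only to show the call.
HYPOTHETICAL_INTERCEPT = 3.5

fast_food_delta = feature_contribution({"drive_thru", "delivery"})
upscale_valet_stars = predict_stars({"upscale", "valet"}, HYPOTHETICAL_INTERCEPT)
```

`feature_contribution` is the part that depends only on the published weights; it shows how far a set of attributes moves the prediction up or down from the baseline.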

Rating Predictions

Page 17: Yelp Big Data Analytics

Advice to Improve your Business

Don’t:

-Have a Drive-Thru

-Be cash only

-Have free delivery

-Have a noisy environment

Page 18: Yelp Big Data Analytics

Advice to Improve your Business

Do:

-Take Appointments

-Offer valet services, coat-check

-Play smooth/ambient background music

-DJ music isn’t always the best!

-Be upscale, romantic and intimate

-Be socially responsible

-Be wheelchair accessible

-Be dog friendly
