Advanced Data Mining for recommendation systems using Decision Trees ISQS 7342 Dr. Zhangxi Lin By: Tej Pulapa

Upload: henry-miles

Post on 20-Jan-2018



TRANSCRIPT

Page 1: ISQS 7342 Dr. Zhangxi Lin By: Tej Pulapa. Recommendation System Predict consumer behavior - Is to discover…

Advanced Data Mining for Recommendation Systems Using Decision Trees

ISQS 7342 Dr. Zhangxi Lin

By: Tej Pulapa

Page 2

Recommendation System

Predicting consumer behavior means discovering the relationship between one’s personal characteristics (e.g. age, gender, hometown) and the probability that one will respond to a recommendation.

Such relationships can be used to direct recommendations to the customers whose profiles give them the highest probability of responding.

Consumer profiles have to be developed and updated constantly, based on the historical purchasing behavior of the consumers. Paying attention to the season is also important, to keep the recommendation system from becoming a joke on SNL.
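The relationship between personal characteristics and response probability can be illustrated with a minimal sketch. It is not the method from the slides: it simply groups historical records by profile and takes the observed response rate as the estimated probability. The records, age bands and hometown values are hypothetical.

```python
from collections import defaultdict

# Hypothetical historical records: (age_band, gender, hometown, responded)
history = [
    ("18-25", "F", "Lubbock", 1),
    ("18-25", "F", "Lubbock", 0),
    ("18-25", "F", "Lubbock", 1),
    ("26-40", "M", "Austin", 0),
    ("26-40", "M", "Austin", 0),
    ("26-40", "M", "Austin", 1),
    ("41-60", "F", "Dallas", 1),
    ("41-60", "F", "Dallas", 1),
]

def response_rates(records):
    """Return {profile: observed fraction of responders} per distinct profile."""
    counts = defaultdict(lambda: [0, 0])  # profile -> [responders, total]
    for *profile, responded in records:
        key = tuple(profile)
        counts[key][0] += responded
        counts[key][1] += 1
    return {key: hits / total for key, (hits, total) in counts.items()}

def best_profiles(records, top=1):
    """Return the `top` profiles with the highest observed response rate."""
    rates = response_rates(records)
    return sorted(rates, key=rates.get, reverse=True)[:top]

rates = response_rates(history)
top_profile = best_profiles(history)[0]
```

A new customer whose characteristics match top_profile would be the first candidate for a recommendation. With so few records per profile the rates are very noisy, which is one reason to keep updating profiles as new purchase history arrives.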

Page 3

Multi-Dimensional Profile Identifiers

We now have the following:

Historical data of consumer purchasing patterns

Demographic data

Income

Based on the information above, several profiles can be deduced. Let's assume each profile is identified by an arbitrary identifier, e.g. poor, middle class or rich; online shopper; city dweller or suburban dweller; or a techie gadget lover who will go without food.

What about a poor online shopper residing in a suburban area of Hollywood who has skipped dozens of meals to buy an iPhone?

Page 4

[Figure: profiles labeled 1 to 5]

Unique profiles: an idealistic condition

Constant changes in consumer wants

A consumer may belong partially to all of profiles 1, 2, 3, 4 and 5, with a certain percentage of agreement with each of the five.

For example: 10% of 1, 15% of 2, 35% of 3, 30% of 4 and 10% of 5. In this case a new profile is necessary, but it is also important to consider that profile 1 itself agrees with profiles 2 and 3 by 20% and 25% respectively.
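The partial-membership example can be written down as data. The sketch below uses the percentages from the slide; the 50% cutoff for deciding that no existing profile fits is an assumption added for illustration, as are the function names.

```python
# Agreement of one consumer with profiles 1..5 (percentages from the slide)
membership = {1: 10, 2: 15, 3: 35, 4: 30, 5: 10}

# Agreement of profile 1 with profiles 2 and 3 (percentages from the slide)
profile_overlap = {(1, 2): 20, (1, 3): 25}

# Assumed cutoff, not from the slides: if no profile reaches this share,
# the consumer is treated as needing a new profile.
DOMINANCE_THRESHOLD = 50

def dominant_profile(shares):
    """Return (profile_id, share) for the profile with the largest share."""
    best = max(shares, key=shares.get)
    return best, shares[best]

def needs_new_profile(shares, threshold=DOMINANCE_THRESHOLD):
    """True when even the best-matching profile falls below the threshold."""
    _, share = dominant_profile(shares)
    return share < threshold

def overlap(a, b):
    """Look up the agreement between two profiles in either order (0 if unknown)."""
    return profile_overlap.get((a, b), profile_overlap.get((b, a), 0))

total_share = sum(membership.values())
```

Here the best match is profile 3 at 35%, which is below the assumed cutoff, so the sketch agrees with the slide that a new profile is needed. The overlap table is why the shares cannot be read as fully independent.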

Page 5

This adds a large degree of complexity to the process of profiling consumers.

Each profile can be viewed as a composition of more than one element or attribute.

Reducing the available data to basic elements is essential. At this point, that can be viewed as reaching the leaf of a decision tree with, ideally, zero entropy.

Page 6
Page 7

Decision Trees

Instances describable by attribute-value pairs

Target function is discrete-valued

Disjunctive hypothesis may be required

Possibly noisy training data

In building a decision tree, we aim to decrease the entropy of the dataset until we reach leaf nodes, at which point the subset we are left with is pure: it has zero entropy and contains instances of a single class (all instances have the same value for the target attribute).

Entropy: synonymous with impurity or disorder in the data
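To make the entropy idea concrete, here is a minimal sketch using the standard Shannon entropy in bits and the information gain of a split. The slides do not state a formula, so this choice is an assumption, and the attribute name shops_online and the rows are hypothetical.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels; 0 for a pure set."""
    total = len(labels)
    return sum(-(n / total) * log2(n / total) for n in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target before a split minus the weighted entropy after it."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Hypothetical rows: does being an online shopper separate responders?
rows = [
    {"shops_online": "yes", "respond": 1},
    {"shops_online": "yes", "respond": 1},
    {"shops_online": "no", "respond": 0},
    {"shops_online": "no", "respond": 0},
]

before = entropy([row["respond"] for row in rows])      # evenly mixed classes
gain = information_gain(rows, "shops_online", "respond")
```

In this toy data the split on shops_online produces two pure subsets, so each would become a leaf. A tree builder typically picks, at every node, the attribute with the largest gain.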

Page 8

Decision Trees with Association Rule Methods

X → Y

where X is a set of items and Y is a single item

Association rule methods are an initial data exploration approach

Use with decision trees:

With the variable transformation node and formula builder, create new variables that reflect the basic association rule concept

X ∪ Y → Z
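A rule of the form X → Y is usually scored by its support and confidence. These measures are standard for association rules but are not named in the slides, so treat the sketch below as an illustration. The baskets and item names are hypothetical, and rule_flag stands in for the kind of derived variable a transformation step could pass to a decision tree.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated P(consequent | antecedent) for the rule antecedent -> consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    return support(transactions, set(antecedent) | {consequent}) / base

def rule_flag(basket, antecedent):
    """Derived 0/1 variable: 1 when the basket contains the whole antecedent."""
    return int(set(antecedent) <= basket)

# Hypothetical purchase baskets
transactions = [
    {"phone", "case"},
    {"phone", "case", "charger"},
    {"phone"},
    {"case"},
]

conf_xy = confidence(transactions, {"phone"}, "case")             # X -> Y
conf_xyz = confidence(transactions, {"phone", "case"}, "charger")  # X u Y -> Z
flags = [rule_flag(basket, {"phone", "case"}) for basket in transactions]
```

The flags list is the new column: one indicator per customer that a tree can split on like any other input variable.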

Page 9

Respond * GradStud >= 15

where Respond is binary, GradStud is the percentage of the population aged 25 and over with a graduate or professional degree, and 15 is the mean value. The result is a variable that flags those who responded to the recommendation and fall in the at-or-above-mean group for degree holders aged 25 and over.

TENURE >= 24 & ExpHouse > 0 or PlusCar > 3 = loyal_rich_cust

Tenure: months since first purchase

ExpHouse: homes worth more than $300K

PlusCar: number of extra cars
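The two derived variables can be sketched as plain functions. The function names are hypothetical, and the grouping of the second expression is an assumption: AND is taken to bind tighter than OR, as it does in most languages, since the slide gives no parentheses.

```python
def grad_responder(respond, grad_stud, mean_grad_stud=15):
    """1 when Respond * GradStud >= mean, else 0 (Respond is 0 or 1)."""
    return int(respond * grad_stud >= mean_grad_stud)

def loyal_rich_cust(tenure, exp_house, plus_car):
    """Assumed reading: (TENURE >= 24 AND ExpHouse > 0) OR PlusCar > 3."""
    return int((tenure >= 24 and exp_house > 0) or plus_car > 3)

# Hypothetical customers: (respond, grad_stud, tenure, exp_house, plus_car)
customers = [
    (1, 20, 30, 1, 0),
    (0, 20, 10, 0, 4),
    (1, 10, 30, 0, 0),
]

derived = [
    (grad_responder(r, g), loyal_rich_cust(t, h, c))
    for r, g, t, h, c in customers
]
```

Because Respond is 0 or 1, the product is either 0 or GradStud itself, so a non-responder can never pass the threshold. If the intended grouping of the second rule is TENURE >= 24 AND (ExpHouse > 0 OR PlusCar > 3), only the parentheses in loyal_rich_cust need to move.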

Page 10

Questions

Comments

Suggestions