ICDM 2002 Using Text Mining to Infer Semantic Attributes for Retail Data Mining Rayid Ghani & Andrew Fano Accenture Technology Labs, USA


Who are we?

Accenture Technology Labs

• R&D Group for Accenture

• ~ 50 researchers in Chicago, Palo Alto (California) and Sophia Antipolis (France)

• Research in Data Mining, Machine Learning, Ubiquitous Computing, Wearable Computing, Language Technologies, Virtual & Augmented Reality, Collaborative Workspaces…

Current State of Retail Data Mining

• Large amounts of data captured about transactions

• Each Retailer has terabytes of data in their data warehouse

• Several data mining algorithms applied to this data

Problem:

Today’s transaction data can’t answer important marketing questions.

What do your best selling items have in common?

What about the worst sellers?

What do the products a customer has purchased say about them?

What do the products your competitors sell say about them?

What’s Missing?

Captured data focuses on the transaction, not the product.

Product information captured with transactions is typically limited to little more than SKU, size, brand and price.

• But what does a SKU mean?

Current Data Mining Practice

• Treat products as generic unique entities/objects with no associated semantics

• Semantics are applied by humans AFTER the algorithm has done the learning, e.g. when interpreting association rules or decision trees

Product Semantics:

What does a product mean?

What does this shirt say about her?

Is it conservative or flashy?

Trendy or classic?

Formal or casual?

Where would we get this information?

Extract underlying attributes from product marketing descriptions

Marketing descriptions are designed to convey a particular image to customers. These descriptions implicitly contain these more elusive attributes.

DKNY Jeans Ruched Side-Tie Tee

Get back to basics with a fresh new look this season. The Ruched Side-Tie Tee has a drawstring tie at left hip with shirred detail down the side. Stretch provides a flattering, shapely fit. V-neck.

SKU: 655432

UPC: 4200006200

Item: DKNYTee

Price $49

Training the System

Product Descriptions → Domain Experts → Product descriptions marked up with attribute values → Supervised Learning Algorithm → Learned Statistical Models

Inferring Attributes via Text Classification

• Build one classifier for each attribute type

• Simple statistical classifier

– Naïve Bayes Multinomial model (McCallum & Nigam 1998)

– For all words (description) and attribute values:

• calculate P(word | attribute value) using the manually rated items

– Given a new item description:

• Calculate P(attribute value | item description) for all attribute values

• Use Maximum Likelihood
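The steps above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the whitespace tokenizer, the Laplace smoothing parameter `alpha`, the function names, and the toy "formality" training data are all assumptions made for the example.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(labeled_items, alpha=1.0):
    """Collect the counts behind P(value) and P(word | value)."""
    word_counts = defaultdict(Counter)   # value -> word -> count
    value_counts = Counter()             # value -> number of rated items
    vocabulary = set()
    for description, value in labeled_items:
        words = description.lower().split()
        word_counts[value].update(words)
        value_counts[value] += 1
        vocabulary.update(words)
    return word_counts, value_counts, vocabulary, alpha


def posterior(model, description):
    """Return P(attribute value | item description) for every attribute value."""
    word_counts, value_counts, vocabulary, alpha = model
    total_items = sum(value_counts.values())
    log_scores = {}
    for value, n_items in value_counts.items():
        denom = sum(word_counts[value].values()) + alpha * len(vocabulary)
        score = math.log(n_items / total_items)
        for word in description.lower().split():
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[value][word] + alpha) / denom)
        log_scores[value] = score
    # Normalize in log space for numerical stability.
    top = max(log_scores.values())
    exp_scores = {v: math.exp(s - top) for v, s in log_scores.items()}
    norm = sum(exp_scores.values())
    return {v: s / norm for v, s in exp_scores.items()}


def classify(model, description):
    """Pick the attribute value with the highest posterior probability."""
    probs = posterior(model, description)
    return max(probs, key=probs.get)


# Toy, made-up training data for a hypothetical "formality" attribute.
training = [
    ("classic crepe jacket with seam detail", "formal"),
    ("tailored skirt with clean lines", "formal"),
    ("stretch denim jean", "informal"),
    ("soft cotton tee and sweater", "informal"),
]
formality_model = train_naive_bayes(training)
prediction = classify(formality_model, "crepe skirt with jacket")
```

One such model would be trained per attribute type (formality, sportiness, and so on), each on its own set of rated descriptions.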

Naïve Bayes Results

[Bar chart: Classification Accuracy (axis 0 to 90) of Baseline vs. Naïve Bayes for each attribute: Age Group, Functionality, Formality, Conservative, Sportiness, Trendiness, Brand Appeal]

Can we get something for free?

Semi-supervised Learning

• Lots of product descriptions are available at minimal or no cost from retail websites

• Labeling them is expensive

• Can we utilize the unlabeled product descriptions to provide better performance?

Semi-Supervised Learning

• Apply algorithms that combine labeled and unlabeled data for classification

– Expectation-Maximization (Nigam et al. 1999)

– Co-Training (Blum & Mitchell 1999)

– Co-EM (Nigam & Ghani, 2000)

– ECOC + Co-Training (Ghani, 2002)

The EM Algorithm

Naïve Bayes

Learn from labeled data

Estimate labels

Probabilistically add to labeled data

E-Step

M-Step
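The loop in this diagram can be sketched as follows. This is one common way to combine EM with Naïve Bayes, written under stated assumptions rather than taken from the authors' code: the function names, the smoothing parameter `alpha`, the fixed iteration count, and the toy descriptions are all hypothetical.

```python
import math
from collections import defaultdict


def m_step(weighted_docs, values, vocabulary, alpha=1.0):
    """Re-estimate P(value) and P(word | value) from fractionally labeled docs.

    weighted_docs is a list of (words, {value: weight}) pairs.
    """
    prior = {v: alpha for v in values}
    counts = {v: defaultdict(float) for v in values}
    for words, weights in weighted_docs:
        for v, w in weights.items():
            prior[v] += w
            for word in words:
                counts[v][word] += w
    total = sum(prior.values())
    log_prior = {v: math.log(prior[v] / total) for v in values}
    log_cond = {}
    for v in values:
        denom = sum(counts[v].values()) + alpha * len(vocabulary)
        log_cond[v] = {w: math.log((counts[v][w] + alpha) / denom)
                       for w in vocabulary}
    return log_prior, log_cond


def e_step(model, words):
    """Return P(value | words) under the current model."""
    log_prior, log_cond = model
    scores = {v: log_prior[v] + sum(log_cond[v][w] for w in words
                                    if w in log_cond[v])
              for v in log_prior}
    top = max(scores.values())
    exp_scores = {v: math.exp(s - top) for v, s in scores.items()}
    norm = sum(exp_scores.values())
    return {v: s / norm for v, s in exp_scores.items()}


def em_naive_bayes(labeled, unlabeled, iterations=10):
    labeled_docs = [(text.lower().split(), {value: 1.0})
                    for text, value in labeled]
    unlabeled_docs = [text.lower().split() for text in unlabeled]
    values = sorted({value for _, value in labeled})
    vocabulary = {w for words, _ in labeled_docs for w in words}
    vocabulary.update(w for words in unlabeled_docs for w in words)
    # Start from a Naive Bayes model learned from the labeled data only.
    model = m_step(labeled_docs, values, vocabulary)
    for _ in range(iterations):
        # E-step: estimate probabilistic labels for the unlabeled descriptions.
        soft = [(words, e_step(model, words)) for words in unlabeled_docs]
        # M-step: retrain on labeled plus probabilistically labeled data.
        model = m_step(labeled_docs + soft, values, vocabulary)
    return model


# Toy, made-up data: "blazer" and "tee" occur only in unlabeled descriptions.
labeled = [("crepe jacket", "formal"), ("denim jean", "informal")]
unlabeled = ["crepe jacket blazer", "denim jean tee", "blazer jacket", "tee jean"]
model = em_naive_bayes(labeled, unlabeled)
blazer_probs = e_step(model, ["blazer"])
```

In this toy run the labeled data alone says nothing about "blazer"; the unlabeled descriptions, where it co-occurs with "crepe" and "jacket", pull it toward the formal class.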

EM Results

[Bar chart: Classification Accuracy (axis 0 to 100) of Baseline vs. Naïve Bayes vs. EM for each attribute: Age Group, Functionality, Formality, Conservative, Sportiness, Trendiness, Brand Appeal]

A Peek at the Learned Models

Extremely Conservative: Double-breasted, seasonless, trouser, classic, Blazer

Not Conservative (Flashy): leopard, chemise, straps, flirty

Informal: jean, denim, sweater, tee

Formal: jacket, skirt, lines, seam, crepe

Loungewear: chemise, silk, kimono, lounge, robe, gown

Extremely Sporty: sneaker, rubber, miraclesuit, athletic, Mesh

What can this be used for?

Applications

Example applications that we have built include:

• Recommender System

• Copywriter’s Workbench

• Competitive Comparisons

Recommender System

Retailer’s Web Site

Extracted Descriptions of Products Browsed

Product Semantics Knowledge Base

Learned Statistical Models

Evolving User Profile

Query the Knowledge Base for Matching Products

Recommend Matching Products to User

Advantages over Traditional Recommender Systems

This approach provides us some of the underlying attributes that characterize a customer’s preference.

We can therefore begin to explain the preference rather than simply rely on the co-occurrence of purchases (e.g. people who bought x also bought y).

This helps with:

• Handling new products/rapidly changing products

• Low Frequency Products

• Cross Category Recommendations
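A minimal sketch of attribute-based matching, assuming the user profile is a running average of the inferred attribute scores of products browsed and that candidates are ranked by cosine similarity. Both choices, along with the function names and the toy knowledge base, are assumptions for illustration; the slides do not specify how profiles are built or matched.

```python
import math


def update_profile(profile, product_attrs, n_seen):
    """Fold one browsed product into a running average of attribute scores."""
    keys = set(profile) | set(product_attrs)
    return {k: (profile.get(k, 0.0) * n_seen + product_attrs.get(k, 0.0))
               / (n_seen + 1)
            for k in keys}


def cosine(a, b):
    """Cosine similarity between two sparse attribute vectors."""
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(profile, knowledge_base, exclude=(), top_n=3):
    """Return the products whose attributes best match the user profile."""
    candidates = [(cosine(profile, attrs), name)
                  for name, attrs in knowledge_base.items()
                  if name not in exclude]
    return [name for score, name in sorted(candidates, reverse=True)[:top_n]]


# Hypothetical knowledge base: product -> inferred attribute scores in [0, 1].
knowledge_base = {
    "pinstripe blazer": {"formal": 0.9, "conservative": 0.8, "sporty": 0.1},
    "leopard chemise": {"formal": 0.2, "conservative": 0.1, "sporty": 0.1},
    "mesh sneaker": {"formal": 0.1, "conservative": 0.3, "sporty": 0.9},
    "crepe trouser": {"formal": 0.8, "conservative": 0.9, "sporty": 0.1},
}

profile = {}
browsed = ["pinstripe blazer"]
for n_seen, name in enumerate(browsed):
    profile = update_profile(profile, knowledge_base[name], n_seen)

suggestions = recommend(profile, knowledge_base, exclude=browsed, top_n=1)
```

Because matching relies only on inferred attributes, a product with no purchase history can be recommended as soon as its description has been classified.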

Cross-Category Recommendations

• Difficult for collaborative filtering and content-based systems

• Build a model of the user - personality, stylistic attributes

• Taste in clothing might also be suggestive of taste in other products, say furniture and home decoration

• Create models for different product classes and create mappings among these models

Application II

Competitive Comparison Tool

• Just as consumers may be profiled by what they buy, retailers can be profiled by what they sell

• Track and compare how the positioning of products from different retailers changes over time

• Brands can track how different retailers/stores position their products
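One way to profile a retailer, sketched under assumptions: average each inferred attribute over the retailer's catalog and compare the averages. The scoring scale, the function names, and the two toy catalogs below are hypothetical.

```python
from statistics import mean


def retailer_profile(catalog):
    """Average each inferred attribute over a retailer's catalog."""
    attributes = sorted({a for item in catalog for a in item})
    return {a: mean(item.get(a, 0.0) for item in catalog) for a in attributes}


def compare(profile_a, profile_b):
    """Per-attribute difference (A minus B), largest gaps first."""
    attributes = set(profile_a) | set(profile_b)
    gaps = {a: profile_a.get(a, 0.0) - profile_b.get(a, 0.0)
            for a in attributes}
    return sorted(gaps.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Hypothetical inferred attribute scores for two retailers' catalogs.
retailer_a = [{"trendy": 0.9, "formal": 0.2}, {"trendy": 0.7, "formal": 0.4}]
retailer_b = [{"trendy": 0.3, "formal": 0.8}, {"trendy": 0.1, "formal": 0.6}]

gaps = compare(retailer_profile(retailer_a), retailer_profile(retailer_b))
```

Tracking positioning over time would amount to computing `retailer_profile` on dated catalog snapshots and comparing consecutive results.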

Application III

Copywriter’s Toolkit

• Can this system be used to help write product descriptions?

• A tool for copywriters that provides feedback to help them position a product in a particular way.

• Writers can assess their descriptions and get word recommendations

Screenshot

Classy and chic, this long-sleeve pinstripe shirt has the glamorous appeal of a 40s movie star or European songstress.

• Shirring along front button placket.

• Double-button extended cuffs.

• 3 1/2" side-seam slits.

• Cotton/polyester; dry clean.

• By BCBG Max Azria; imported.

Increase Tone : skin, flirty, low-neck, slim-fit, straps,
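Word recommendations like the ones in this screenshot could be produced by ranking vocabulary words by how strongly they indicate the target attribute value. The sketch below uses smoothed log-odds over per-value word counts; the scoring rule, the function name `suggest_words`, and the toy counts are assumptions, not a description of the actual tool.

```python
import math
from collections import Counter


def suggest_words(word_counts, target, draft, top_n=3, alpha=1.0):
    """Rank words not yet in the draft by log-odds of target vs. other values."""
    vocabulary = {w for counts in word_counts.values() for w in counts}
    other = Counter()
    for value, counts in word_counts.items():
        if value != target:
            other.update(counts)
    target_total = sum(word_counts[target].values()) + alpha * len(vocabulary)
    other_total = sum(other.values()) + alpha * len(vocabulary)
    used = set(draft.lower().split())
    scored = []
    for word in vocabulary - used:
        p_target = (word_counts[target][word] + alpha) / target_total
        p_other = (other[word] + alpha) / other_total
        scored.append((math.log(p_target / p_other), word))
    scored.sort(reverse=True)
    return [word for score, word in scored[:top_n]]


# Hypothetical word counts per attribute value, as a trained model might hold.
word_counts = {
    "flashy": Counter({"flirty": 6, "straps": 5, "leopard": 4, "shirt": 1}),
    "conservative": Counter({"classic": 6, "blazer": 5, "trouser": 4, "shirt": 3}),
}
draft = "Classy and chic long-sleeve pinstripe shirt with straps"
suggestions = suggest_words(word_counts, "flashy", draft, top_n=2)
```

Words already present in the draft ("straps" here) are skipped, so the writer only sees additions that would shift the description toward the target tone.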

Summary

• “Understand” a product and hence the individual customer

• Use Text Learning (supervised and semi-supervised) to abstract from product (description) to subjective, domain-specific features to create enhanced product databases

• Create applications that have more semantic knowledge of products and can help understand consumer behavior

• Provide Data Mining algorithms with semantic attributes to operate on and build better and more domain specific models