Graphical Models for the Internet

Amr Ahmed and Alexander Smola, Yahoo Research, Santa Clara, CA

Post on 30-Mar-2015

Thus far ...

• Motivation
• Basic tools
  – Clustering
  – Topic Models
• Distributed batch inference
  – Local and global states
  – Star synchronization

Up next

• Inference
  – Online Distributed Sampling
  – Single machine multi-threaded inference
  – Online EM and Submodular Selection
• Applications
  – User tracking for behavioral targeting
  – Content understanding
  – User modeling for content recommendation

4. Online Model

Scenarios
• Batch Large-Scale
  – Covered in part 1
• Mini-batches
  – We already have a model
  – Data arrives in batches
  – We would like to keep the model up to date
• Time-sensitive
  – Data arrives one item at a time
  – Model should be up to date

Time

Time

4.1 Dynamic Clustering

The Chinese Restaurant Process
• Allows the number of mixtures to grow with the data
• Such models are called non-parametric
  – Means the number of effective parameters grows with the data
  – Still has hyper-parameters that control the rate of growth
    • α: how fast is a new cluster/mixture born?
    • G0: prior over mixture component parameters

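The Chinese Restaurant Process: Generative Process

For each data point x_i:
- Choose an existing table j with probability ∝ m_j and sample x_i ~ f(φ_j)
- Choose a new table K+1 with probability ∝ α, sample φ_{K+1} ~ G0, and sample x_i ~ f(φ_{K+1})

[Figure: three tables φ_1, φ_2, φ_3 seating 2, 3, and 1 customers. The next customer joins them with probability 2/(6+α), 3/(6+α), 1/(6+α), or opens a new table with probability α/(6+α).]

- The rich-get-richer effect
- Cannot handle sequential data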
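The seating rule of the Chinese Restaurant Process can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `crp_assignments` and the `seed` argument are assumptions, and the sketch only tracks table counts m_j (drawing table parameters from G0 and data from f is left out).

```python
import random

def crp_assignments(n_points, alpha, seed=0):
    """Seat n_points customers one at a time under a CRP with concentration alpha."""
    rng = random.Random(seed)
    counts = []        # counts[j] = m_j, the number of customers at table j
    assignments = []   # assignments[i] = table chosen by customer i
    for _ in range(n_points):
        # Existing table j has weight m_j; a new table has weight alpha.
        weights = counts + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            counts.append(1)       # a new table (cluster) is born
        else:
            counts[table] += 1     # rich-get-richer: popular tables attract more
        assignments.append(table)
    return assignments, counts
```

Calling `crp_assignments(100, 1.0)` returns one table index per customer together with the final table sizes; larger values of `alpha` tend to open more tables.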

Recurrent CRP (RCRP) [Ahmed and Xing 2008]
• Adapts the number of mixture components over time
  – Mixture components can die out
  – New mixture components are born at any time
  – Parameters of retained mixture components evolve according to Markovian dynamics

The Recurrent Chinese Restaurant Process: Generative Process

[Figure: tables φ_{1,1}, φ_{2,1}, φ_{3,1} at epoch T=1. φ_{3,1} is the dish eaten at table 3 at time epoch 1, or equivalently the parameters of cluster 3 at time epoch 1.]

Customers at time T=1 are seated as before:
- Choose table j with probability ∝ m_{j,1} and sample x_i ~ f(φ_{j,1})
- Choose a new table K+1 with probability ∝ α, sample φ_{K+1,1} ~ G0, and sample x_i ~ f(φ_{K+1,1})

The Recurrent Chinese Restaurant Process

[Figure sequence: epoch T=1 ends with tables φ_{1,1}, φ_{2,1}, φ_{3,1} and counts m'_{1,1}=2, m'_{2,1}=3, m'_{3,1}=1. These counts are carried into epoch T=2 as prior pseudo-counts.]

- At T=2, the first customer picks table 1, 2, or 3 with probability 2/(6+α), 3/(6+α), 1/(6+α), or a new table with probability α/(6+α)
- When an inherited table is used for the first time, sample its new parameters: φ_{1,2} ~ P(. | φ_{1,1})
- The next customer picks table 1, 2, or 3 with probability (2+1)/(6+1+α), 3/(6+1+α), 1/(6+1+α), or a new table with probability α/(6+1+α)
- And so on ...

At the end of epoch 2:
- Table 3 was never chosen and keeps φ_{3,1}: a died-out cluster
- A new table φ_{4,2} was opened: a newly born cluster

[Figure: epoch T=3 starts from tables φ_{1,2}, φ_{2,2}, φ_{4,2} with m'_{1,2}=2, m'_{2,2}=2, m'_{4,2}=1.]
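One epoch of RCRP seating can be sketched as follows. This is a hedged illustration only: the name `rcrp_epoch` is hypothetical, the previous epoch's counts are passed in as `prior_counts`, and the parameter evolution step φ_{k,t} ~ P(. | φ_{k,t-1}) is omitted so that only the counts are tracked.

```python
import random

def rcrp_epoch(n_points, prior_counts, alpha, seed=0):
    """Seat one epoch of customers, using the previous epoch's counts as a prior."""
    rng = random.Random(seed)
    prior = list(prior_counts)   # m'_{k,t-1}: counts inherited from the last epoch
    counts = [0] * len(prior)    # customers seated at each table in this epoch
    for _ in range(n_points):
        # Inherited table k has weight prior + current count; a new table has weight alpha.
        weights = [p + c for p, c in zip(prior, counts)] + [alpha]
        table = rng.choices(range(len(weights)), weights=weights)[0]
        if table == len(counts):
            prior.append(0)      # newly born cluster has no history
            counts.append(1)
        else:
            counts[table] += 1
    # Tables that attracted no customer in this epoch have died out.
    alive = [k for k, c in enumerate(counts) if c > 0]
    return counts, alive
```

With `prior_counts=[2, 3, 1]` the first customer faces the weights 2, 3, 1, and α, which matches the walkthrough above.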

Recurrent Chinese Restaurant Process
• Can be extended to model higher-order dependencies
• Can decay dependencies over time
  – Pseudo-count for table k at time t is

    $m'_{k,t} = \sum_{h=1}^{H} e^{-h/\rho} \, m_{k,t-h}$

    where H is the history size, ρ is the decay factor, and m_{k,t-h} is the number of customers sitting at table k at time epoch t-h

[Figure: worked example on the T=1, T=2, T=3 restaurant, computing the pseudo-count m'_{2,3} for table 2 at epoch T=3 from the table's customer counts at T=1 and T=2.]
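The decayed pseudo-count is easy to compute directly. The sketch below assumes the exponential-decay form m'_{k,t} = Σ_{h=1..H} exp(-h/ρ) · m_{k,t-h}; the function name `pseudo_count` and the argument layout (most recent epoch first) are assumptions made for illustration.

```python
import math

def pseudo_count(history, H, rho):
    """Decayed pseudo-count for one table.

    history[h-1] holds m_{k,t-h}, the table's customer count h epochs ago,
    so history[0] is the most recent epoch.
    """
    horizon = min(H, len(history))   # never look further back than we have data
    return sum(math.exp(-h / rho) * history[h - 1] for h in range(1, horizon + 1))
```

For example, `pseudo_count([2, 3], H=2, rho=1.0)` weights the most recent count 2 by exp(-1) and the older count 3 by exp(-2). A small `rho` forgets quickly, and a large `rho` keeps old epochs influential.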

4.2 Online Distributed Inference

Tracking Users' Interests

Characterizing User Interests
• Short term vs long-term
• Latent

[Figure: timeline from January to October showing one user's interests (Music, Housing, Furniture, Buying a car, Travel plans), each active over a different period. The interests are latent: only words such as "mileage", "fast", "mortgage", "Gaga", "seafood", "Barcelona", and "used" are observed.]

Problem formulation

Input
• Queries issued by the user or tags of watched content
• Snippet of page examined by user
• Time stamp of each action (day resolution)

Output
• Users' daily distribution over interests
• Dynamic interest representation
• Online and scalable inference
• Language independent

[Figure: a user's action stream over time (Flight, London, Hotel, weather; School, Supplies, Loan, semester; classes, registration, housing, rent) annotated with the latent interests Travel, Back to school, and Finance.]

• When to show a financing ad?
• When to show a hotel ad?

Mixed-Membership Formulation

Mixtures
• Diet: Recipe, Chocolate, Pizza, Food, Chicken, Milk, Butter, Powder
• Cars: Car, Blue, Book, Kelley, Prices, Small, Speed, large
• Job: job, Career, Business, Assistant, Hiring, Part-time, Receptionist
• Finance: Bank, Online, Credit, Card, debt, portfolio, Finance, Chase

Objects
• Each object has a degree of membership in every mixture
• Example object words: Job, Hiring, speed, price, part-time, Camry, Career, opening, bonus, package, card, diet, calories, loan, recipe, milk, Weight, lb, kg

In Graphical Notation

In Graphical Notation

In Polya-Urn Representation
• Collapse multinomial variables
• Fixed-dimensional Hierarchical Polya-Urn representation
  – Chinese restaurant franchise

[Figure: Polya-urn view of the model. Global topic trends sit above the topic word-distributions (the Job, Cars, Finance, and Diet word lists). Each user has user-specific topic trends (a mixing vector) and a stream of interactions: queries and keywords from pages viewed. Two example users are shown, "Food, Chicken, pizza, mileage" and "Car, speed, offer, camry, accord, career"; words such as "pizza", "hiring", and "mileage" are added one at a time as the process runs.]

Generative Process
• For each user interaction
  – Choose an intent from the local distribution
    • Sample word from the topic's word-distribution
  – Choose a new intent with probability ∝ λ
    • Sample a new intent from the global distribution
    • Sample word from the new topic's word-distribution
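The two-level choice in the generative process (reuse a local intent, or go to the global distribution) can be sketched as below. The function names, the weight `lam` for leaving the local urn, and the toy count vectors are all assumptions for illustration; the number of topics is fixed, in line with the fixed-dimensional representation.

```python
import random

def sample_intent(local_counts, global_counts, lam, rng):
    """Pick a topic for one user interaction.

    Returns (topic, from_global). Topic k is reused locally with weight
    local_counts[k]; with weight lam the user asks the global distribution.
    """
    k = len(local_counts)
    # Index k means "leave the local urn and ask the global distribution".
    choice = rng.choices(range(k + 1), weights=list(local_counts) + [lam])[0]
    if choice < k:
        return choice, False
    topic = rng.choices(range(k), weights=global_counts)[0]
    return topic, True

def generate_word(topic, topic_words, rng):
    """Draw a word uniformly from a toy word list standing in for the topic's distribution."""
    return rng.choice(topic_words[topic])
```

A user with no history (`local_counts` all zero) always falls through to the global trend, which is how globally popular interests reach new users in this sketch.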

Problems
- Static model
- Does not evolve user's interests
- Does not evolve the global trend of interests
- Does not evolve interest's distribution over terms

Build a dynamic model
• Connect each level using an RCRP

[Figure: the model unrolled over time t, t+1, t+2, t+3. A global process (counts m, pseudo-counts m') sits above the per-user processes for User 1, User 2, and User 3 (counts n, pseudo-counts n').]

Which time kernel to use at each level?

Pseudo counts = *

Decay factor

Observation 1
- Popular topics at time t are likely to be popular at time t+1
- φ_{k,t+1} is likely to evolve smoothly from φ_{k,t}

φ_{k,t+1} ~ Dir(β_{k,t+1})

[Figure: the Cars topic φ_{k,t} at time t (Car, Blue, Book, Kelley, Prices, Small, Speed, large) evolves into the Cars topic at time t+1 (Car, Altima, Accord, Book, Kelley, Prices, Small, Speed).]

Intuition
• Captures the current trend of the car industry (a new release, for example)

Observation 2
- The user prior at time t+1 is a mixture of the user's short-term and long-term interests

How do we get a prior that captures both long-term and short-term interest?

[Figure: the user's topic counts over Diet, Cars, Job, and Finance are aggregated over three windows (week, month, and all history) and combined with the weights μ, μ2, μ3, ranging from short-term to long-term, to form the prior for user actions at time t.]
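A prior that blends short-term and long-term interest can be sketched as a weighted sum of topic counts over several history windows. Everything specific here is an assumption: the function name `user_prior`, the three windows (week, month, all history), and the default weights are illustrative, not values from the slides.

```python
def user_prior(week, month, all_time, mu=(1.0, 0.5, 0.25)):
    """Combine per-topic counts from three history windows into one prior vector.

    week, month, all_time: per-topic counts over each window (same length).
    mu: hypothetical weights for (week, month, all_time); larger weight on the
        short window makes the prior react faster to recent behavior.
    """
    mu_week, mu_month, mu_all = mu
    return [mu_week * w + mu_month * m + mu_all * a
            for w, m, a in zip(week, month, all_time)]
```

For instance, `user_prior([1, 0], [2, 2], [4, 8])` gives equal prior mass to both topics: the first is favored by recent activity, the second by long-term history.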

Generative Process
• For each user interaction
  – Choose an intent from the local distribution
    • Sample word from the topic's word-distribution
  – Choose a new intent with probability ∝ λ
    • Sample a new intent from the global distribution
    • Sample word from the new topic's word-distribution

[Figure: the generative process at time t and time t+1. The local distribution is informed by the short-term priors, and the Cars topic has evolved (Car, Altima, Accord, Blue, Book, Kelley, Prices, Small, Speed). The construction is labeled as a Polya-Urn RCRF process.]

Simplified Graphical Model

[Figure: simplified graphical model at time t and time t+1, built up over several slides, with the Cars topic and one user's word stream shown as examples.]

• Topics evolve over time?
• User's intent evolves over time?
• Capture long- and short-term interests of users?

4.2 Online Distributed Inference

Work Flow

[Figure: daily work flow. The current users' models and the system state are combined with today's user interactions (hundreds of millions) in a daily update (inference) step, which produces the new users' models.]

Online Scalable Inference
• Online algorithm
  – Greedy 1-particle filtering algorithm
  – Works well in practice
  – Collapse all multinomials except Ωt
    • This makes distributed inference easier
  – At each time t:
• Distributed scalable implementation
  – Used the architecture from the first part as a subroutine
  – Added synchronous sampling capabilities

Distributed Inference (at time t)

[Figure: graphical model at time t. The global variables Ω and φ are shared by all clients; each client holds the per-user variables θ, z, and w.]

Fully Collapsed
• Collapse all multinomials
• After collapsing, clients are coupled through counts kept in shared memory
• Use star-synchronization

[Figure: labels local trend, global trend, and topic factor.]

Semi-Collapsed
• Topic word counts stay in shared memory
• Ωt is kept explicit, and each client holds a copy of it

Distributed Sampling Cycle

Every client runs the same cycle:
• Sample Z for its users
• Barrier
• Write counts to memcached
• One client collects the counts and samples Ω; the other clients do nothing
• Barrier
• Read Ω from memcached

Sampling Ωt requires a reduction step.

Sampling Ω
• Introduce auxiliary variable m_kt
  – How many times the global distribution was visited
  – m_kt ~ Antoniak
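The barrier-based cycle can be imitated on one machine with Python threads. This is a sketch under stated assumptions: threads stand in for client machines, a plain dict stands in for memcached, topic assignments are drawn uniformly instead of from the model, and "sample Ω" is replaced by normalizing the reduced counts. The sketch also adds a barrier between the write and the reduction so that the reducing client sees every client's counts.

```python
import random
import threading

def run_cycle(n_clients=4, n_topics=3, users_per_client=5, seed=0):
    """Run one sample-write-reduce-read cycle and return what each client read."""
    store = {}                                # stand-in for memcached
    lock = threading.Lock()
    barrier = threading.Barrier(n_clients)
    seen = [None] * n_clients

    def client(cid):
        rng = random.Random(seed + cid)
        # Sample Z for local users (placeholder: uniform topic choice).
        local = [0] * n_topics
        for _ in range(users_per_client):
            local[rng.randrange(n_topics)] += 1
        barrier.wait()
        # Write local counts to the shared store.
        with lock:
            store[f"counts/{cid}"] = local
        barrier.wait()                        # added here: all writes must land first
        # Reduction step: client 0 collects counts; the others do nothing.
        if cid == 0:
            total = [sum(store[f"counts/{c}"][k] for c in range(n_clients))
                     for k in range(n_topics)]
            norm = sum(total)
            store["omega"] = [t / norm for t in total]  # placeholder for sampling Omega
        barrier.wait()
        # Every client reads the same global value.
        seen[cid] = list(store["omega"])

    threads = [threading.Thread(target=client, args=(c,)) for c in range(n_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen
```

The barriers are what make the step synchronous: no client reads the global value until the single reducing client has written it.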

4.2 Online Distributed Inference

Behavioral Targeting

• Task is predicting conversion in display advertising
• Use two datasets
  – 6 weeks of user history
  – Last week's responses to ads are used for testing
• Baseline:
  – User raw data as features
  – Static topic model

Experimental Results

[Figure: interpretability, showing the inferred interests of User-1 and User-2.]

Performance in Display Advertising

[Figure: ROC measure against number of conversions.]

[Figure: weighted ROC measure, with comparisons against static and batch models and the effect of the number of topics.]

How Does It Scale?
• 2 billion instances with a 5M vocabulary, using 1000 machines
• One iteration took ~3.8 minutes

Distributed Inference Revisited

To collapse or not to collapse?
• Not collapsing
  – Keeps conditional independence
    • Good for parallelization
    • Requires synchronous sampling
  – Might mix slowly
• Collapsing
  – Mixes faster
  – Hinders parallelism
  – Use star-synchronization
    • Works well if siblings depend on each other via aggregates
    • Requires asynchronous communication

Inference Primitive
• Collapse a variable
  – Star synchronization for the sufficient statistics
• Sampling a variable
  – Local
    • Sample it locally (possibly using the synchronized statistics)
  – Shared
    • Synchronous sampling using a barrier
• Optimizing a variable
  – Same as in the shared variable case
  – Ex. Conditional topic models

Online Models
• Batch Large-Scale
  – Covered in part 1
• Mini-batches
  – We already have a model
  – Data arrives in batches
  – We would like to keep the model up to date
• Time-sensitive
  – Data arrives one item at a time
  – Model should be up to date

Time

Time

What Is Coming?

• Inference
  – Online Distributed Sampling
  – Single machine multi-threaded inference
  – Online EM and Submodular Selection
• Applications
  – User tracking for behavioral targeting
  – Content understanding
  – User modeling for content recommendation

4.2 Scalable SMC Inference

Storylines

News Stream
• Realtime news stream
  – Multiple sources (Reuters, AP, CNN, ...)
  – Same story from multiple sources
  – Stories are related
• Goals
  – Aggregate articles into a storyline
  – Analyze the storyline (topics, entities)
    • How does the story develop over time?
    • Who are the main entities?
    • What topics are addressed?

A Unified Model
• Jointly solves the three main tasks
  – Clustering
  – Classification
  – Analysis
• Building blocks
  – A topic model
    • High-level concepts (unsupervised classification)
  – Dynamic clustering (RCRP)
    • Discover tightly-focused concepts
  – Named entities
  – Story developments

Infinite Dynamic Cluster-Topic Hybrid

Topics
• Sports: games, Won, Team, Final, Season, League, held
• Politics: Government, Minister, Authorities, Opposition, Officials, Leaders, group
• Unrest: Police, Attack, run, man, group, arrested, move

Storylines (words; named entities)
• UEFA-soccer: Champions, Goal, Coach, Striker, Midfield, penalty; Juventus, AC Milan, Lazio, Ronaldo, Lyon
• Tax-Bill: Tax, Billion, Cut, Plan, Budget, Economy; Bush, Senate, Fleischer, White House, Republican
• Border-Tension: Nuclear, Border, Dialogue, Diplomatic, militant, Insurgency, missile; Pakistan, India, Kashmir, New Delhi, Islamabad, Musharraf, Vajpayee

The Graphical Model

• Topic model
• Topics per cluster
• RCRP for cluster
• Hierarchical DP for article
• Separate model for named entities
• Story specific correction

4.2 Fast SMC Inference

Inference via SMC

Online Inference Algorithm

• A particle filtering algorithm
• Each particle maintains a hypothesis
  – What are the stories
  – Document-story associations
  – Topic-word distributions
• Collapsed sampling
  – Sample (z_d, s_d) only for each document

Particle Filter Representation

Fold the document into the structure of each filter
- s and z are tightly coupled
- Alternatives
  - Sample s then sample z (high variance)
  - Sample z then sample s (doesn't make sense)
- Idea
  - Run a few iterations of MCMC over s and z
  - Take the last sample as the proposed value

[Figure: document at time t_d with story indicator s, per-word topic indicators z, words w, and entities.]

How good does each filter look now?
- Get rid of bad filters
- Replicate good ones
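Dropping bad particles and replicating good ones is the resampling step of a particle filter. The sketch below uses plain multinomial resampling, chosen here for brevity; the slides do not say which resampling scheme is used, and the function name `resample` is an assumption.

```python
import random

def resample(particles, weights, rng):
    """Multinomial resampling: draw a new particle set in proportion to the weights.

    Particles with low weight tend to disappear; particles with high weight
    tend to be replicated. Weights are reset to uniform afterwards.
    """
    total = sum(weights)
    probs = [w / total for w in weights]
    n = len(particles)
    chosen = rng.choices(particles, weights=probs, k=n)
    return chosen, [1.0 / n] * n
```

Because good particles are copied many times, the copies share most of their state, which motivates a storage scheme that avoids duplicating it.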

Efficient Computation and Storage
• Particles get replicated
  – Use thread-safe inheritance tree [extends Canini et al. 2009]
  – Inverted representation for fast lookup

[Figure: without sharing, replicated particles 1, 2, 3 each store a full copy of their state (State P1, State P2, State P3). With the inheritance tree, states hang off a shared root and particles are attached to tree nodes.]

[Figure: inverted representation. Tree nodes map entities to stories, for example "India: story 5, Pakistan: story 1", "Congress: story 2", "Bush: story 2, India: story 3", and "India: story 1, US: story 2, story 3"; a node can also be empty.]
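The inheritance-tree idea can be sketched as a chain of small dictionaries: a replicated particle gets a new empty node whose parent holds the shared state, and lookups walk up the tree. This is a simplified, single-threaded illustration (the version described in the slides is thread-safe); the class name `StateNode` and its methods are hypothetical.

```python
class StateNode:
    """One node of an inheritance tree: stores only changes relative to its parent."""

    def __init__(self, parent=None):
        self.parent = parent
        self.local = {}              # keys changed at this node, e.g. entity -> story

    def get(self, key, default=None):
        # Walk towards the root until some ancestor defines the key.
        node = self
        while node is not None:
            if key in node.local:
                return node.local[key]
            node = node.parent
        return default

    def set(self, key, value):
        # Writes never touch ancestors, so sibling particles are unaffected.
        self.local[key] = value

    def branch(self):
        # Replicating a particle costs one empty node instead of a full state copy.
        return StateNode(parent=self)
```

Usage: build the shared state on a root node, call `branch()` once per replica, and let each replica record only its own updates.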

Efficient Computation and Storage

• Why is this useful?
• Only focus on stories that mention at least one entity
  – Otherwise pre-compute and reuse
• We can use fast samplers for z as well [Yao et al., KDD 09]

Experiments

• Yahoo! News datasets over two months
  – Three sub-sampled sets with different characteristics
• Editorially-labeled documents
  – Cannot-link and must-link pairs
• Performance measured using clustering accuracy
• Baseline
  – A strong offline correlation clustering algorithm [WSDM 11]
    • Scaled with LSH to compute the neighborhood graph (similar to Petrovic 2010)

Structured Browsing

Topics
• Sports: games, Won, Team, Final, Season, League, held
• Politics: Government, Minister, Authorities, Opposition, Officials, Leaders, group
• Unrest: Police, Attack, run, man, group, arrested, move

Storylines (words; named entities)
• Border-Tension: Nuclear, Border, Dialogue, Diplomatic, militant, Insurgency, missile; Pakistan, India, Kashmir, New Delhi, Islamabad, Musharraf, Vajpayee
• UEFA-soccer: Champions, Goal, Leg, Coach, Striker, Midfield, penalty; Juventus, AC Milan, Real Madrid, Milan, Lazio, Ronaldo, Lyon
• Tax-bills: Tax, Billion, Cut, Plan, Budget, Economy, lawmakers; Bush, Senate, US, Congress, Fleischer, White House, Republican

More like the India-Pakistan story (Border-Tension)
• Based on topics: Middle-east-conflict
  – Peace, Roadmap, Suicide, Violence, Settlements, bombing; Israel, Palestinian, West bank, Sharon, Hamas, Arafat
• Based on "Nuclear" + topics [politics]: Nuclear programs
  – Nuclear, summit, warning, policy, missile, program; North Korea, South Korea, U.S, Bush, Pyongyang

More on personalization later in the talk

Quantitative Evaluation

Number of topics = 100

Effect of number of topics

Scalability

Model Contribution

• Named entities are very important
• Removing time increases processing time up to 2 seconds per document

Putting Things Together

Time vs. Machines

• Data arrives dynamically
• How to keep models up to date?

                 Batch               Mini-batches                     Truly online
Single-Machine   Gibbs, Variational  Online-LDA                       SMC
Multi-Machine    Star-Synch.         Star-Synch. + synchronous step   ?

4.3 User Preference

Online EM and Submodularity

Storyline Summarization
• How to summarize a storyline with few articles?
• How to personalize the summary?

[Figure: a storyline over time, with articles grouped under Earthquake & Tsunami, Nuclear Power, Rescue & Relief, and Economy.]

User Interaction

• Passive
  – We observe the user-generated content
  – Model the user based on that content using unsupervised techniques
• Explicit
  – We present users with content
  – Users give explicit feedback
    • Like/dislike
  – Learn user preference using supervised techniques
• Implicit
  – Mixture between the two
  – Present the user with items
  – Observe which items the user interacts with
  – Learn user preference using semi-supervised models

User Satisfaction

• Modular
  – Present the user with items she prefers
    • Regardless of the context
  – Targets relevance
  – Ex: vector space models
• Submodular
  – More of the same thing is not always better
    • Diminishing returns
  – Targets diversity
  – Ex: TDN [El-Arini et al., KDD 09]
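The contrast between modular and submodular objectives shows up clearly in a toy coverage example. The sketch below uses set coverage as the submodular function and greedy selection as the optimizer; both are standard illustrative choices assumed here, not the model from the slides, and the article data is made up.

```python
def coverage(selected, articles):
    """Number of distinct concepts covered by the selected articles (submodular)."""
    covered = set()
    for name in selected:
        covered |= articles[name]
    return len(covered)

def greedy_select(articles, k):
    """Pick up to k articles, each time adding the one with the largest coverage gain."""
    selected = []
    for _ in range(min(k, len(articles))):
        remaining = [name for name in articles if name not in selected]
        # Diminishing returns: an article repeating covered concepts adds nothing.
        best = max(remaining, key=lambda name: coverage(selected + [name], articles))
        selected.append(best)
    return selected
```

With `{"a": {"quake", "tsunami"}, "b": {"quake", "tsunami"}, "c": {"nuclear"}}` and `k=2`, the greedy choice is `["a", "c"]`: article "b" scores as well as "a" on its own, but adds nothing once "a" is selected. A modular score would rank "a" and "b" first regardless of overlap.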

Sequential Click-View Model
• Modeling views based on position
• Modeling clicks using position and information gain
• Coverage function's weights are learnt

[Figure: plot with labels threshold, information gain, and #clicks.]

[Figure: diagram with labels selected summary, story, features, modular, and submodular.]

Online Inference

• Treat missing views as hidden variables
  – Realistic interaction model
• Use the online EM algorithm
  – Infer the value of hidden variables
• Optimize parameters using SGD
  – Use additive weights
    • Background + story + category + user

How Does It Work?

5. Summary and Future Directions

Summary
• Tools
  – Load distribution, balancing and synchronization
  – Clustering, Topic Models
• Models
  – Dynamic non-parametric models
  – Sequential latent variable models
• Inference Algorithms
  – Distributed batch
  – Sequential Monte Carlo
• Applications
  – User profiling
  – News content analysis & recommendation

Future Directions
• Theoretical bounds and guarantees
• Network data
  – Graph partitioning
• Non-parametric models
  – Learning structure from data
• Working under communication constraints
• Data distribution for particle filters
