Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Using Markov Chains to Predict User Behavior Rivka Fogel

Post on 19-Oct-2014


Category:

Marketing



DESCRIPTION

Markov chains can take predictive theory to a new level, with large-scale applications for digital marketing. From social media network modeling to user pathing, site scoring and recommended pages, Markov chains can quantify, rank, and return likely outcomes on the web. In other words, they can demystify demographics. Here's how.

TRANSCRIPT

Page 1: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Using Markov Chains to Predict User Behavior

Rivka Fogel

Page 2: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Markov Chains: Probability without History

COPYRIGHT 2013 CATALYST. ALL RIGHTS RESERVED. JANUARY 23, 2014 | PAGE 2

Andrey Markov


Page 3: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

What Are Probability Spaces?


Focal Object / Function Co-Domain

Function/Possibility 1

Function/Possibility 2

• Also known as stochastic processes


Presenter
Presentation Notes
Stochastic definition: Stochastic processes are random processes that describe the evolution of a random value over time, as opposed to deterministic processes, whose evolution is fully determined by their starting conditions (for example, systems of ordinary differential equations).
Page 4: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Type 1: Time Series


First Event

Function/Possibility 1

Function/Possibility 2

Time

Also called “states”


Page 5: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Personalization

• To return more accurate SERPs (E) for that user


Identifying user-specific authorities

User E B A

C D


Page 6: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Type 2: Spatial Field


Shared Event

• Variable interactions are often statistically correlated


Page 7: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Addition of The Markov Property


E because of B or D, not because of A

B A

C D

• The probability of B causing E, as opposed to D causing E, is calculated using Bayes' theorem
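As a minimal sketch of that calculation, the snippet below applies Bayes' theorem to decide whether B or D more likely preceded E. The prior and likelihood values are hypothetical numbers chosen for illustration, and the helper name `posterior` is not from the slides.

```python
def posterior(priors, likelihoods):
    """Return P(cause | E) for each candidate cause via Bayes' theorem."""
    # Joint probability P(cause) * P(E | cause) for every candidate cause.
    joint = {c: priors[c] * likelihoods[c] for c in priors}
    # P(E), by the law of total probability over the candidate causes.
    evidence = sum(joint.values())
    return {c: joint[c] / evidence for c in joint}


# Hypothetical inputs: how often users are in B or D just before E,
# and how often each of those states leads on to E.
priors = {"B": 0.6, "D": 0.4}
likelihoods = {"B": 0.2, "D": 0.5}

post = posterior(priors, likelihoods)
print(post)
```

With these assumed numbers, D is the more probable predecessor of E even though B is the more common state overall, because D leads to E more often.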

The Next State Depends Only on the Current State:


Page 8: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: (not provided)

• The Markov Property enables the marketer to model paths without knowing every state.

• Where some keyphrase data is known, the model can also infer a missing keyphrase from other users’ paths in which the keyphrase is known.
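One way to picture the second bullet is a simple conditional frequency table built from sessions where the keyphrase is known. This is only a sketch under stated assumptions: the session data, the keyphrases, and the function name `keyphrase_given_landing` are all hypothetical, and a real model would condition on more of the path than the landing page.

```python
from collections import Counter, defaultdict


def keyphrase_given_landing(known_sessions):
    """Estimate P(keyphrase | landing page) from sessions with a known keyphrase."""
    counts = defaultdict(Counter)
    for keyphrase, landing in known_sessions:
        counts[landing][keyphrase] += 1
    # Normalize each landing page's counts into probabilities.
    return {
        landing: {k: n / sum(c.values()) for k, n in c.items()}
        for landing, c in counts.items()
    }


# Hypothetical sessions: (keyphrase, landing page) pairs from users whose
# keyphrase was reported.
known = [
    ("used sedan", "Inventory"),
    ("used sedan", "Inventory"),
    ("model x review", "Model Landing Page"),
    ("dealer near me", "Homepage"),
    ("model x review", "Inventory"),
]

table = keyphrase_given_landing(known)
# Most probable keyphrase for a (not provided) visit that landed on Inventory.
guess = max(table["Inventory"], key=table["Inventory"].get)
print(guess, table["Inventory"])
```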


Homepage

Keyphrase?

Bounce

Model Landing Page

Homepage Video View

Inventory

Gallery Page Video View


Presenter
Presentation Notes
In deterministic modeling (for the web), the user keyphrase is the focal point, and all subsequent stages are based on the focal point. In stochastic modeling, the Markov property makes all stages except the focal point and the preceding stage irrelevant to the current stage. You can also define the preceding stage as the focal point. This means that (not provided) is irrelevant when the focal point changes from the keyword to a SERP, landing page, or behavior (see relational Markov models/user behavior). Other users’ paths: see multichannel attribution.
Page 9: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Multichannel Attribution

• Identify A (or predict D) via multiple probability states within a Markovian chain.

Monitoring and prediction can be based on probability of a user’s path given other users’ paths

Probability of B A Probability of C

B 1 C Known Path 1

B 2 C Known Path 2

D

4

5


Presenter
Presentation Notes
The Markov chain formula is generative, so modeling is easily automated. Monitoring and prediction are defined by Bayes' theorem. E.g., the probability of the hypothesis given evidence from the initial source depends on the probability of the hypothesis given evidence from a different source.
Page 10: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Audience Segmentation


Probability of B A

Probability of C

B 1 C Known Path 1

B 2 C

D

4

5

Landing Page

Known Path 2

Referral Paths On-Site Paths


Page 11: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Relational Markov Properties

• Relational Markov Models group multiple types of objects – relations – and calculate the probability of the relation’s appearance in a state.

• They work off of Dynamic Bayesian Networks


Relational Markov Models allow states to be of different types.

E because of B or D’s type, not because of A or C’s type

State B

State D

Type 2 Type 1

State A

State C


Page 12: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Audience Segmentation 2


B

1 C

2

Paid

Organic

Known


Page 13: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: User Experience


Homepage Bounce

Model Landing Page

Homepage Video View

Inventory

Gallery Page Video View

Page Visit Video View Bounce

Types:


Presenter
Presentation Notes
For example, the probability of a user picking a landing page and then picking an object on that landing page can be calculated and compared with the probability of picking a different object, a different landing page, or a different path entirely. Modeled spatially, not temporally. Can be combined with probabilities as well.
Page 14: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Social Network Modeling

• This function will answer: if the user ended up converting/visiting the landing page, which [type(s)] of social interaction[s] came into play?


Site Landing

Page

Rich Media Play Rich Media

Host Page

User Share

Influencer

News Feed

Brand Social Profile


Presenter
Presentation Notes
Possible only via a spatial model because the nature of the co-domain means that you’d be modeling backwards
Page 15: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: HTTP Service Request Prediction

• Prefetch Page A given the probability that the user will want to see it.

• The keyphrase cluster is predicted by the function with co-domain B and is then used to predict the incidence of B where the first state isn’t known.


Probability of 3 A Keyphrase 1

1

3

2

Known Paths

Keyphrase Cluster

Keyphrase 2


Presenter
Presentation Notes
The keyphrase cluster is post-Hummingbird
Page 16: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Agent Suggestion

• Auto-suggests searches (Search C) and links (URL E) that the user is likely to want to access, based on user history and other users’ history


URL A URL B

URL C URL D URL E

Keyphrase Cluster or Authority

First words of Query

Search A

Search B Search C


Page 17: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Application: Search Engine Scoring

• The function identifies hubs of authority that are probable next steps in many systems (each with individual focus objects).
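The slide does not give a scoring formula. One plausible reading, offered here only as an assumption, is to score each page by its long-run visit probability under a random walk over the link graph, so that pages that are "probable next steps" from many places score highest. The transition matrix `A` and the function name `long_run_visit_probabilities` below are hypothetical.

```python
def long_run_visit_probabilities(A, steps=200):
    """Approximate the stationary distribution of a chain by repeated stepping."""
    n = len(A)
    s = [1.0 / n] * n  # start from a uniform distribution over pages
    for _ in range(steps):
        # One step of the chain: new s[j] = sum over i of s[i] * A[i][j].
        s = [sum(s[i] * A[i][j] for i in range(n)) for j in range(n)]
    return s


pages = ["Page A", "Page B", "Page C"]
# Hypothetical link-following probabilities; each row sums to 1.
A = [
    [0.0, 0.5, 0.5],  # from Page A
    [0.0, 0.0, 1.0],  # from Page B
    [0.5, 0.5, 0.0],  # from Page C
]

scores = long_run_visit_probabilities(A)
authority = pages[scores.index(max(scores))]
print(authority, scores)
```

In this toy graph, Page C ends up with the highest score because both other pages lead to it with high probability.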


Identifying Authority 2:

Page A Keyphrase Cluster

Page B

Link 2

Page C Link 1

Authority 1 Authority 2


Page 18: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Appendix: Formal Definitions


Page 19: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Where, Probability Spaces:

• The measurable space (S, Σ) and an object X on the measurable space

• The probability space is defined by the function P, the assignment of probabilities to events, where Ω is the set of possible outcomes and F is the set of events, each event containing zero or more outcomes: P(x) = Σ(t1-tk)P(t1) for all X on Ω

• The finite-dimensional distribution X: Xt1 Ω -> Xk

• That arrow is the pushforward measure, or the random distribution of events, or the matrix of transition probabilities P: PT1(.)=PT1(.)/x = Sk

– Where Bayes' theorem allows for: $P(H \mid E) = P(H)\,P(E \mid H) / P(E)$


Page 20: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Then, Markov Property:

• $P(X_{l+1} = s \mid X_l = s_l, X_{l-1} = s_{l-1}, \dots, X_0 = s_0) = P(X_{l+1} = s \mid X_l = s_l)$

– The random distribution of events is defined because the system is finite.

• So, in the matrix of transition probabilities, defined as $P^{l,l+1}_{ij} = P(X_{l+1} = j \mid X_l = i)$, $P^{l,l+1}$ is independent of $l$.

• That is, $\hat{s}(t) = \hat{s}(t-1)A$

– $s$ is the state space, $A$ is the matrix of transition probabilities, and $\lambda$ is the initial probability distribution of the states in $s$. $\hat{s}(t)$ is the probability vector for states at time “t.”
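The update in which the state vector at time t is the previous state vector multiplied by A can be sketched in a few lines. The three states and the transition probabilities below are hypothetical values for illustration; `step` is just a name for one vector-matrix multiplication.

```python
def step(s, A):
    """Advance the state probability vector one step: s(t) = s(t-1) A."""
    n = len(A)
    return [sum(s[i] * A[i][j] for i in range(n)) for j in range(n)]


states = ["Homepage", "Inventory", "Bounce"]
# Hypothetical transition matrix; row i holds P(next = j | current = i).
A = [
    [0.2, 0.5, 0.3],  # from Homepage
    [0.1, 0.6, 0.3],  # from Inventory
    [0.0, 0.0, 1.0],  # Bounce is treated as absorbing
]

s0 = [1.0, 0.0, 0.0]  # the user starts on the Homepage with certainty
s1 = step(s0, A)      # distribution after one click
s2 = step(s1, A)      # distribution after two clicks
print(s1, s2)
```

Because only the current vector and A are needed at each step, no earlier history has to be stored, which is the practical payoff of the Markov property.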


Page 21: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Markov Restatement 1: When a User’s History is Available

• $A(s, s') = C(s, s') / \sum_{s''} C(s, s'')$ and $\lambda(s) = C(s) / \sum_{s'} C(s')$

– $C(s, s')$ counts the instances where $s'$ follows $s$

– This can be applied to HTTP prediction and agent suggestion
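The count-based estimates above can be sketched as follows. The slide does not define C(s), so this sketch assumes it is the number of times state s appears in the observed paths; the paths themselves and the name `estimate_chain` are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_chain(paths):
    """Estimate transition probabilities and the initial distribution from paths."""
    transition_counts = defaultdict(Counter)  # C(s, s'): times s' follows s
    state_counts = Counter()                  # C(s): times s appears (assumption)
    for path in paths:
        state_counts.update(path)
        for s, s_next in zip(path, path[1:]):
            transition_counts[s][s_next] += 1
    # A(s, s') = C(s, s') / sum over s'' of C(s, s'')
    A = {
        s: {t: n / sum(c.values()) for t, n in c.items()}
        for s, c in transition_counts.items()
    }
    # Initial distribution: C(s) / sum over s' of C(s')
    total = sum(state_counts.values())
    lam = {s: n / total for s, n in state_counts.items()}
    return A, lam


# Hypothetical click paths from one user's history.
paths = [
    ["Homepage", "Inventory", "Gallery Page"],
    ["Homepage", "Inventory", "Bounce"],
    ["Homepage", "Bounce"],
]

A, lam = estimate_chain(paths)
# A prefetch candidate: the most probable next state after the Homepage.
prefetch = max(A["Homepage"], key=A["Homepage"].get)
print(prefetch, A["Homepage"], lam)
```

For HTTP request prediction, the page chosen as `prefetch` would be the one fetched ahead of the click; an agent-suggestion feature could instead show the top few entries of the same row.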


Page 22: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Markov Restatement 2: When the Evidence Comes from a User Pool

• The Markov function becomes a generative chain link system that can store counts and probabilities

• Where probabilities are used: $\hat{s}(t) = a_0\,\hat{\imath}(t-1)A + a_1\,\hat{\imath}(t-2)A^2 + a_2\,\hat{\imath}(t-3)A^3 + \dots$ and $\hat{s}(t) = \mathrm{Max}\big(a_0\,\hat{\imath}(t-1)A + a_1\,\hat{\imath}(t-2)A^2 + a_2\,\hat{\imath}(t-3)A^3 + \dots\big)$

– $\hat{s}(t)$ is normalized to select a list of probable states.

This can be applied to authority hubs as well, where collected user path traversal patterns are represented in a traversal connectivity matrix.
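A rough sketch of the weighted-history sum follows. It assumes that the history term for k steps back is an indicator vector for the state visited at that point, that the weights a_0, a_1, ... are chosen by the modeler, and that normalization means dividing by the total; none of these details is spelled out on the slide. The matrix, weights, and function names are hypothetical.

```python
def vec_mat(v, A):
    """Multiply a row vector by a square matrix."""
    n = len(A)
    return [sum(v[i] * A[i][j] for i in range(n)) for j in range(n)]


def predict(history, A, weights):
    """Score next states from recent history; history lists state indices, newest last."""
    n = len(A)
    scores = [0.0] * n
    for k, a_k in enumerate(weights):
        if k >= len(history):
            break
        v = [0.0] * n
        v[history[-1 - k]] = 1.0  # indicator of the state visited k+1 steps ago
        for _ in range(k + 1):    # apply A once per step back, i.e. A^(k+1)
            v = vec_mat(v, A)
        scores = [x + a_k * y for x, y in zip(scores, v)]
    total = sum(scores)
    return [x / total for x in scores]  # normalize so the scores sum to 1


states = ["Homepage", "Inventory", "Bounce"]
A = [
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
    [0.0, 0.0, 1.0],
]

# The user visited Homepage, then Inventory; recent steps get more weight.
pred = predict(history=[0, 1], A=A, weights=[0.7, 0.3])
ranked = sorted(zip(states, pred), key=lambda pair: pair[1], reverse=True)
print(ranked)
```

The ranked list plays the role of the "list of probable states"; a cutoff on the list length or on the score would decide how many suggestions are kept.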


Page 23: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Markov Restatement 3: When Groupings of States Are Estimated

• These are Relational Markov Models

• These groupings are also seen as abstractions. A(Q) forms a lattice of abstractions.

– A model is the tuple $(\mathcal{D}, R, Q, A, \pi)$, where each $D \in \mathcal{D}$ is a tree representing a hierarchy of values. $R$ is a set of relations. Each relation is defined by nodes on leaves of $D$. $Q$ is the set of states. $A$ is the transition probability matrix. $\pi$ is the initial probability vector, that is, the distribution of the initial state in the chain. States are defined as abstractions on $Q$.

– The rank of an abstraction $a = R(d_1, \dots, d_k)$ in the lattice is defined as $1 + \sum_{i=1}^{k} \mathrm{depth}(d_i)$. Depth is a node’s depth on the tree, and increases with the abstraction’s rank. The rank of $Q$ (the most general abstraction) is 0.

• States that have nodes on common leaves will more frequently appear in abstractions together.
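The rank definition can be illustrated with a small tree. The hierarchy below (a made-up product taxonomy), the one-argument relation, and the helper names `depth` and `rank` are assumptions for illustration; depth is counted as the number of edges from the root.

```python
def depth(node, parent):
    """Number of edges between a node and the root of its tree."""
    d = 0
    while parent[node] is not None:
        node = parent[node]
        d += 1
    return d


def rank(args, parents):
    """Rank of an abstraction R(d_1, ..., d_k): 1 + sum of the depths of its arguments."""
    return 1 + sum(depth(d, p) for d, p in zip(args, parents))


# Hypothetical value hierarchy, stored as child -> parent links.
products = {
    "AllProducts": None,
    "Cars": "AllProducts",
    "Sedans": "Cars",
}

# A one-argument relation such as ProductPage(product), at three levels of detail.
rank_general = rank(("AllProducts",), (products,))
rank_middle = rank(("Cars",), (products,))
rank_specific = rank(("Sedans",), (products,))
print(rank_general, rank_middle, rank_specific)
```

More specific abstractions sit deeper in the tree and so receive higher ranks, while the set of all states Q is assigned rank 0 by definition rather than by this formula.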


Page 24: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Further Reading

• Anderson, Corin R., Domingos, Pedro, and Weld, Daniel S. “Relational Markov Models and their Application to Adaptive Web Navigation.” Proceedings of the eighth ACM SIGKDD international conference on knowledge discovery and data mining. (2002): 143-152. Electronic. http://homes.cs.washington.edu/~pedrod/papers/kdd02a.pdf

• Downey, Allen. “Bayesian statistics made (as) simple (as possible).” Pycon US. 7 March 2012. http://pyvideo.org/video/608/bayesian-statistics-made-as-simple-as-possible

• Ildiko, Flesch and Lucas, Peter. “Markov Equivalence in Bayesian Networks.” Electronic. http://www.cs.ru.nl/P.Lucas/markoveq.pdf

• Sarukkai, Ramesh R. “Link prediction and path analysis using Markov chains.” Computer Networks 3 (June 2000): 377-386. Electronic. http://www.sciencedirect.com/science/article/pii/S138912860000044X


Page 25: Markov Chains for the Web - SEO, Usability, Search Engine Scoring, and More

Questions?