Transcript
Page 1: Find and be Found: Information Retrieval at LinkedIn

Find and be Found: Information Retrieval at LinkedIn

Shakti Sinha, Head, Search Relevance
Daniel Tunkelang, Head, Query Understanding

Page 2: Find and be Found: Information Retrieval at LinkedIn

Why do 225M+ people use LinkedIn?

Page 3: Find and be Found: Information Retrieval at LinkedIn

Profile: the professional identity of record.

Page 4: Find and be Found: Information Retrieval at LinkedIn

Job recommendations.

Page 5: Find and be Found: Information Retrieval at LinkedIn

Publishing platform for professional content.

Page 6: Find and be Found: Information Retrieval at LinkedIn

Search helps members find and be found.

Page 7: Find and be Found: Information Retrieval at LinkedIn

Search for people,

Page 8: Find and be Found: Information Retrieval at LinkedIn

Search for people, jobs,

Page 9: Find and be Found: Information Retrieval at LinkedIn

Search for people, jobs, groups, and more.

Page 10: Find and be Found: Information Retrieval at LinkedIn

Every search is personalized.

Page 11: Find and be Found: Information Retrieval at LinkedIn

Let’s talk a bit about how it all works.

§  Query Understanding

§  Ranking

More at http://data.linkedin.com/search.

Page 12: Find and be Found: Information Retrieval at LinkedIn

Query Understanding

Daniel Tunkelang, Head, Query Understanding

Page 13: Find and be Found: Information Retrieval at LinkedIn

Pre-retrieval: segment and tag queries.

lucene software engineer -> lucene “software engineer”
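The segment-and-tag step on this slide can be illustrated with a minimal sketch. It assumes a greedy longest-match against a small phrase dictionary; the dictionary contents, the tag names (SKILL, TITLE, UNKNOWN), and the function name are hypothetical and are not taken from the talk.

```python
# Hypothetical phrase dictionary mapping known phrases to entity tags.
KNOWN_PHRASES = {
    "software engineer": "TITLE",
    "lucene": "SKILL",
}

def segment_and_tag(query, phrases=KNOWN_PHRASES, max_len=3):
    """Greedily split a query into the longest known phrases and tag each one."""
    tokens = query.lower().split()
    segments = []
    i = 0
    while i < len(tokens):
        # Try the longest window first, falling back to a single token.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if n == 1 or candidate in phrases:
                segments.append((candidate, phrases.get(candidate, "UNKNOWN")))
                i += n
                break
    return segments

print(segment_and_tag("lucene software engineer"))
```

A production tagger would more plausibly be a statistical sequence model; the dictionary lookup only shows the shape of the input and output.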

Page 14: Find and be Found: Information Retrieval at LinkedIn

LinkedIn’s focus: entity-oriented search.

[Figure: search results grouped by entity type: Company, Employees, Jobs, Name Search.]

Page 15: Find and be Found: Information Retrieval at LinkedIn

Query tagging: key to query understanding.

§  Using human judgments to evaluate tag precision.
–  Extremely accurate (> 99%) for identifying person names.
–  Harder to distinguish company vs. title vs. skill (e.g., oracle dba).

§  Comparing CTR for tag matches vs. non-matches.
–  Difference can be large enough to suggest filtering vs. ranking.
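The CTR comparison for tag matches vs. non-matches can be sketched as a simple aggregation over impressions. The log format here, a list of (tag_matched, clicked) pairs, is an assumption made for illustration.

```python
def ctr_by_tag_match(impressions):
    """Compute click-through rate separately for tag-matching and non-matching results.

    impressions: iterable of (tag_matched, clicked) boolean pairs (hypothetical log format).
    """
    shown = {True: 0, False: 0}
    clicks = {True: 0, False: 0}
    for matched, clicked in impressions:
        shown[matched] += 1
        clicks[matched] += int(clicked)
    # Guard against division by zero when a bucket has no impressions.
    return {m: (clicks[m] / shown[m] if shown[m] else 0.0) for m in (True, False)}

log = [(True, True), (True, False), (False, False),
       (False, False), (False, True), (False, False)]
print(ctr_by_tag_match(log))
```

If the matched bucket's CTR is far higher than the unmatched one, dropping non-matching results altogether (filtering) may be preferable to merely ranking them lower.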

Page 16: Find and be Found: Information Retrieval at LinkedIn

Detecting navigational vs. exploratory queries.

Pre-retrieval
§  Sequence of query tags.

Post-retrieval
§  Distribution of scores / features.

Click behavior
§  Title searches >50x more likely to get 2+ clicks than name searches.
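One way to combine the pre-retrieval and post-retrieval signals is sketched below. It is a heuristic under stated assumptions, not the method described in the talk: the tag names, the entropy threshold, and the rule that an all-name tag sequence implies a navigational query are all hypothetical.

```python
import math

NAME_TAGS = ("FIRST_NAME", "LAST_NAME")  # hypothetical tag names

def score_entropy(scores):
    """Shannon entropy of the normalized score distribution over the top results."""
    total = sum(scores)
    if total <= 0:
        return 0.0
    probs = [s / total for s in scores if s > 0]
    return -sum(p * math.log(p) for p in probs)

def looks_navigational(tags, scores, entropy_threshold=0.5):
    """Guess whether a query targets one specific entity."""
    # Pre-retrieval signal: the query consists only of person-name tags.
    if tags and all(t in NAME_TAGS for t in tags):
        return True
    # Post-retrieval signal: one result dominates the score distribution.
    return score_entropy(scores) < entropy_threshold

print(looks_navigational(["TITLE"], [0.25, 0.25, 0.25, 0.25]))
```

A flat score distribution has high entropy and suggests an exploratory query; a single dominant score has low entropy and suggests a navigational one.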

Page 17: Find and be Found: Information Retrieval at LinkedIn

Query expansion for exploratory queries.

software patent lawyer

Query expansions derived from reformulations.

e.g., lawyer -> attorney
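Deriving expansions from reformulations might look like the following sketch. It assumes session logs are available as ordered lists of query strings, and it counts only consecutive queries that differ in exactly one token; both the log format and the `min_count` cutoff are hypothetical.

```python
from collections import Counter

def mine_expansions(sessions, min_count=2):
    """Count single-token substitutions between consecutive queries in a session."""
    counts = Counter()
    for queries in sessions:
        for before, after in zip(queries, queries[1:]):
            a, b = before.split(), after.split()
            if len(a) != len(b):
                continue  # only consider same-length rewrites
            diffs = [(x, y) for x, y in zip(a, b) if x != y]
            if len(diffs) == 1:
                counts[diffs[0]] += 1
    # Keep substitutions seen often enough to be trusted.
    return {pair: n for pair, n in counts.items() if n >= min_count}

sessions = [
    ["patent lawyer", "patent attorney"],
    ["software patent lawyer", "software patent attorney"],
    ["lawyer", "lawyers"],
]
print(mine_expansions(sessions))
```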

Page 18: Find and be Found: Information Retrieval at LinkedIn

Understanding misspelled queries.

daniel tankalong -> Did you mean daniel tunkelang?
marisa meyer -> Did you mean marissa mayer?
jonathan podemsky -> Did you mean johnathan podemsky?
infomation retrieval -> Did you mean information retrieval?
ingenero eletrico -> Did you mean ingeniero electrico?
desenista industrail -> Did you mean desenhista industrial?

Page 19: Find and be Found: Information Retrieval at LinkedIn

Spelling out the details.

INDEX built from:
§  entity data: people, companies
§  successful queries: tunkelang =>
§  reformulations: marisa => marissa
§  n-grams: dublin => du ub bl li in
§  metaphones: mark/marc => MRK
§  word pairs: johnathan podemsky

Example query: marisa meyer yoohoo
Candidates: marissa, marisa; meyer, mayer; yahoo, yoohoo
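The n-gram part of the index can be sketched as a character-bigram inverted index that returns spelling candidates for each query token. The vocabulary, the Jaccard overlap measure, and the 0.3 cutoff are assumptions chosen for illustration; the other signals on the slide (metaphones, word pairs, reformulations) are not modeled here.

```python
from collections import defaultdict

def bigrams(word):
    """Character bigrams, e.g. dublin -> du ub bl li in."""
    return {word[i:i + 2] for i in range(len(word) - 1)}

def build_index(vocabulary):
    """Map each bigram to the set of vocabulary words containing it."""
    index = defaultdict(set)
    for word in vocabulary:
        for gram in bigrams(word):
            index[gram].add(word)
    return index

def candidates(token, index, min_overlap=0.3):
    """Return vocabulary words whose bigram Jaccard overlap with token is high enough."""
    grams = bigrams(token)
    hits = defaultdict(int)
    for gram in grams:
        for word in index.get(gram, ()):
            hits[word] += 1
    scored = []
    for word, shared in hits.items():
        jaccard = shared / len(grams | bigrams(word))
        if jaccard >= min_overlap:
            scored.append((jaccard, word))
    # Best overlap first; ties broken alphabetically.
    return [w for _, w in sorted(scored, key=lambda t: (-t[0], t[1]))]

index = build_index(["marissa", "mayer", "yahoo", "dublin"])
for token in "marisa meyer yoohoo".split():
    print(token, candidates(token, index))
```

A real system would then score whole-query combinations of the per-token candidates, for instance using word-pair statistics.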

Page 20: Find and be Found: Information Retrieval at LinkedIn

Ranking

Shakti Sinha, Head, Search Relevance

Page 21: Find and be Found: Information Retrieval at LinkedIn

LinkedIn search is personalized.

kevin scott

Page 22: Find and be Found: Information Retrieval at LinkedIn

But global factors matter.

Page 23: Find and be Found: Information Retrieval at LinkedIn

Relevant results can be in or out of network.

§  Searcher’s network matters for relevance.
–  Within network results have higher CTR.

§  But the network is not enough.
–  About two thirds of search clicks come from out of network results.

Page 24: Find and be Found: Information Retrieval at LinkedIn

Personalized machine-learned ranking.

§  Data point is a triple (searcher, query, document).
–  Searcher features are important!

§  Labels: Is this document relevant to the query and the user?
–  Depends on the user’s network, location, etc.
–  Too much to ask a random person to judge.

§  Training data has to be collected from search logs.

Page 25: Find and be Found: Information Retrieval at LinkedIn

Search log data has biases.

§  Presentation bias
–  Results shown higher tend to get clicked more often.
–  Use FairPairs [Radlinski and Joachims, AAAI’06].

[Figure: adjacent result pairs labeled "not flipped" or "flipped"; a clicked result ("Clicked!") in a pair yields training data.]
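The pair-flipping idea can be illustrated with a simplified sketch in the spirit of FairPairs. Adjacent results are grouped into pairs starting at a random offset, each pair is swapped with probability 0.5, and a click on the lower result of a pair (with no click on the upper one) is read as a preference. This is a reading of the published idea for illustration, not LinkedIn's implementation; all function names are hypothetical.

```python
import random

def fair_pairs(ranking, rng):
    """Randomly flip adjacent pairs; return the presented order and the pair positions."""
    presented = list(ranking)
    start = rng.randint(0, 1)  # pair (0,1),(2,3),... or (1,2),(3,4),...
    pairs = []
    for i in range(start, len(presented) - 1, 2):
        if rng.random() < 0.5:
            presented[i], presented[i + 1] = presented[i + 1], presented[i]
        pairs.append((i, i + 1))
    return presented, pairs

def preferences_from_clicks(presented, pairs, clicked):
    """Emit (preferred, other) whenever only the lower result of a pair was clicked."""
    prefs = []
    for top, bottom in pairs:
        if presented[bottom] in clicked and presented[top] not in clicked:
            prefs.append((presented[bottom], presented[top]))
    return prefs

rng = random.Random(0)
presented, pairs = fair_pairs(["a", "b", "c", "d", "e", "f"], rng)
top, bottom = pairs[0]
print(preferences_from_clicks(presented, pairs, {presented[bottom]}))
```

Because each document in a pair is equally likely to appear on top, position alone cannot explain a systematic preference within the pair.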

Page 26: Find and be Found: Information Retrieval at LinkedIn

Search log data has biases.

§  Sample bias
–  User clicks or skips only what is shown.
–  What about low scoring results from existing model?
–  Add low-scoring results as ‘easy negatives’ so model learns bad results not presented to user.

[Figure: results across page 1, page 2, page 3, ..., page n; low-scoring results on later pages are assigned label 0.]
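Adding easy negatives can be sketched as sampling from the part of the existing model's ranking that the searcher never saw and labeling those documents 0. The data shapes (a list of (doc_id, label) pairs and a score-ordered list of doc ids) and the uniform sampling are assumptions for illustration.

```python
import random

def add_easy_negatives(logged_examples, model_ranking, num_shown, num_negatives, rng):
    """Augment logged (doc_id, label) pairs with unseen low-ranked docs labeled 0.

    model_ranking: doc ids ordered by the existing model's score, best first.
    num_shown: how many top results the searcher was actually shown.
    """
    unseen_tail = model_ranking[num_shown:]
    k = min(num_negatives, len(unseen_tail))
    negatives = rng.sample(unseen_tail, k)
    return list(logged_examples) + [(doc_id, 0) for doc_id in negatives]

logged = [("d1", 1), ("d2", 0)]
ranking = ["d1", "d2", "d3", "d4", "d5"]
print(add_easy_negatives(logged, ranking, num_shown=2, num_negatives=2, rng=random.Random(0)))
```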

Page 27: Find and be Found: Information Retrieval at LinkedIn

How to train your model.

Page 28: Find and be Found: Information Retrieval at LinkedIn

How to train your model.

§  Train simple models to resemble complex ones.
–  Build Additive Groves model [Sorokina et al, ECML ’07], which is good at detecting interactions.

§  Build tree with logistic regression leaves.

§  By restricting the tree to user and query features, only the regression model is evaluated for each document.

[Figure: decision tree with splits X2 = ? and X10 < 0.1234 ?, and a logistic regression model at each leaf:]
α0 + α1 P(x1) + ... + αn Q(xn)
β0 + β1 T(x1) + ... + βn xn
γ0 + γ1 R(x1) + ... + γn Q(xn)
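The tree-with-logistic-regression-leaves structure can be sketched as follows. The tree splits only on searcher and query features, so it is walked once per search to pick a leaf, and that leaf's logistic regression then scores every candidate document. Feature names, weights, and class names are hypothetical, and the feature transforms shown in the figure (P, Q, R, T) are omitted.

```python
import math
from dataclasses import dataclass
from typing import Dict

@dataclass
class Leaf:
    bias: float
    weights: Dict[str, float]

    def score(self, doc_features):
        """Logistic regression over document features."""
        z = self.bias + sum(w * doc_features.get(name, 0.0)
                            for name, w in self.weights.items())
        return 1.0 / (1.0 + math.exp(-z))

@dataclass
class Split:
    feature: str      # a searcher or query feature, never a document feature
    threshold: float
    left: "object"    # Leaf or Split, taken when value < threshold
    right: "object"   # Leaf or Split, taken otherwise

def select_leaf(node, context):
    """Walk the tree once per (searcher, query) context."""
    while isinstance(node, Split):
        node = node.left if context[node.feature] < node.threshold else node.right
    return node

def rank(tree, context, docs):
    """Score each document with the single selected leaf model; best first."""
    leaf = select_leaf(tree, context)
    return sorted(docs, key=lambda d: leaf.score(docs[d]), reverse=True)

tree = Split("query_is_name", 0.5,
             Leaf(0.0, {"title_match": 2.0}),
             Leaf(0.0, {"name_match": 2.0}))
docs = {"a": {"title_match": 1.0}, "b": {"name_match": 1.0}}
print(rank(tree, {"query_is_name": 1.0}, docs))
```

The per-document cost is one linear model evaluation, while the interaction between query type and document features is still captured by the choice of leaf.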

Page 29: Find and be Found: Information Retrieval at LinkedIn

Take-Aways

§  LinkedIn’s search problem is unique because of the deep role of personalization: users are an integral part of the corpus.

§  Query understanding allows us to optimize for entity-oriented search against semi-structured content.

§  Ranking requires us to contextually apply global and personalized user, query, and document features.

Page 30: Find and be Found: Information Retrieval at LinkedIn

Thank you!

Page 31: Find and be Found: Information Retrieval at LinkedIn

Want to learn more?

§  Check out http://data.linkedin.com/search.

§  Contact us:
–  Shakti: http://linkedin.com/in/sdsinha
–  Daniel: http://linkedin.com/in/dtunkelang
–  Asif: http://linkedin.com/in/asifmakhani

§  Did we mention that we’re hiring?
