Data By The People, For The People

Upload: daniel-tunkelang

Post on 09-May-2015

DESCRIPTION

Data By The People, For The People

Daniel Tunkelang, Director, Data Science at LinkedIn

Invited Talk at the 21st ACM International Conference on Information and Knowledge Management (CIKM 2012)

LinkedIn has a unique data collection: the 175M+ members who use LinkedIn are also the content those same members access using our information retrieval products. LinkedIn members performed over 4 billion professionally-oriented searches in 2011, most of those to find and discover other people. Every LinkedIn search and recommendation is deeply personalized, reflecting the user's current employment, career history, and professional network.

In this talk, I will describe some of the challenges and opportunities that arise from working with this unique corpus. I will discuss work we are doing in the areas of relevance, recommendation, and reputation, as well as the ecosystem we have developed to incent people to provide the high-quality semi-structured profiles that make LinkedIn so useful.

Bio: Daniel Tunkelang leads the data science team at LinkedIn, which analyzes terabytes of data to produce products and insights that serve LinkedIn's members. Prior to LinkedIn, Daniel led a local search quality team at Google. Daniel was a founding employee of faceted search pioneer Endeca (recently acquired by Oracle), where he spent ten years as Chief Scientist. He has authored fourteen patents, written a textbook on faceted search, created the annual workshop on human-computer interaction and information retrieval (HCIR), and participated in the premier research conferences on information retrieval, knowledge management, databases, and data mining (SIGIR, CIKM, SIGMOD, SIAM Data Mining). Daniel holds a PhD in Computer Science from CMU, as well as BS and MS degrees from MIT.

TRANSCRIPT

Page 1: Data By The People, For The People

Data By The People, For The People Daniel Tunkelang Director, Data Science LinkedIn

Page 2: Data By The People, For The People

Why do 175M+ people use LinkedIn?

Page 3: Data By The People, For The People

Identity: find and be found

Page 4: Data By The People, For The People

Insights: discover and share knowledge

Page 5: Data By The People, For The People

People use LinkedIn because of other people.

Page 6: Data By The People, For The People

People as Users + People as Data

Unique opportunities and challenges!

§  Search

§  Recommendations

§  Networking

Page 7: Data By The People, For The People

Search

Page 8: Data By The People, For The People

People search is personal!

Page 9: Data By The People, For The People

But not all relevance factors are personal.

[Figure labels: Good, Bad]

Page 10: Data By The People, For The People

People are semi-structured objects.

for i in [1..n]
    s ← w1 w2 … wi
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← wj wj+1 … wi
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs U {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] U {a}
    sort B[i] by prob
    truncate B[i] to size k
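The pseudocode on this slide reads like a beam-limited dynamic program over query prefixes. A minimal Python sketch under stated assumptions: `segment`, `Segmentation`, and `toy_probs` are hypothetical names, `pc` stands in for the slide's Pc, and the segment appended to a segmentation in B[j] is assumed to start at the word after position j (so that segments do not overlap). The probabilities are made up for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class Segmentation:
    segs: Tuple[str, ...]  # segments chosen so far, in query order
    prob: float            # product of the segment probabilities


def segment(words: List[str], pc: Callable[[str], float], k: int = 5) -> List[Segmentation]:
    """Return up to k segmentations of `words`, most probable first."""
    n = len(words)
    # beams[i] holds the best segmentations of the first i words.
    beams: Dict[int, List[Segmentation]] = {i: [] for i in range(n + 1)}
    for i in range(1, n + 1):
        # Case 1: the whole prefix w1..wi is a single segment.
        s = " ".join(words[:i])
        p = pc(s)
        if p > 0:
            beams[i].append(Segmentation((s,), p))
        # Case 2: extend a segmentation of w1..wj with the segment w(j+1)..wi
        # (assumption: the new segment starts right after position j).
        for j in range(1, i):
            s = " ".join(words[j:i])
            p = pc(s)
            if p > 0:
                for b in beams[j]:
                    beams[i].append(Segmentation(b.segs + (s,), b.prob * p))
        # Keep only the k most probable candidates for this prefix.
        beams[i].sort(key=lambda a: a.prob, reverse=True)
        del beams[i][k:]
    return beams[n]


# Toy segment probabilities (hypothetical values, not from the talk).
toy_probs = {
    "software": 0.1, "developer": 0.1, "software developer": 0.05,
    "new": 0.2, "york": 0.05, "new york": 0.1,
}
results = segment("software developer new york".split(),
                  lambda s: toy_probs.get(s, 0.0), k=5)
best = results[0]
print(best.segs)
```

Truncating each beam to k entries keeps the work bounded for long queries, at the cost of possibly discarding a prefix segmentation that would have led to the best full segmentation.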

Page 11: Data By The People, For The People

LinkedIn uses scale to derive structure.

Software Developer

Page 12: Data By The People, For The People

Social network is more than a ranking signal.

Page 13: Data By The People, For The People

People are a gateway to other entities.

Page 14: Data By The People, For The People

Search: Summary

People finding people.

People being found.

People finding content.

Through other people.

Page 15: Data By The People, For The People

Recommendations

Page 16: Data By The People, For The People

Recommendation products at LinkedIn

Similar Profiles

Events You May Be Interested In

News

Network updates

Connections

Page 17: Data By The People, For The People

LinkedIn’s recommender ecosystem

Recommendations drive:

> 50% of connections

> 50% of job applications

> 50% of group joins

Page 18: Data By The People, For The People

Inputs for recommender systems

Content Social Graph

Behavior

Page Views Actions

Queries

Page 19: Data By The People, For The People

Jobs You Might Be Interested In

Page 20: Data By The People, For The People

How LinkedIn matches people to jobs

[Diagram: matching a candidate to a job. Other labels in the diagram: Corpus Stats, User Base, Filtered, derived]

Job fields: title, geo, company, industry, description, functional area

Candidate fields:

General: expertise, specialties, education, headline, geo, experience

Current Position: title, summary, tenure length, industry, functional area, …

Example feature values:

Similarity (candidate expertise, job description): 0.56

Similarity (candidate specialties, job description): 0.2

Transition probability (candidate industry, job industry): 0.43

Title Similarity: 0.8

Similarity (headline, title): 0.7

. . .

Matching:

Binary: exact matches (geo, industry, …)

Soft: transition probabilities, similarity, …

Text

Transition probabilities, connectivity, yrs of experience to reach title, education needed for this title, …
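The slide names similarity and transition-probability features but does not say how they are computed. The following is one plausible sketch, not LinkedIn's method: it assumes bag-of-words cosine similarity for the text features and a simple count ratio for the transition probability. The function names and the sample counts are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, Tuple


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two texts."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def transition_probability(counts: Dict[Tuple[str, str], int],
                           src: str, dst: str) -> float:
    """Estimate P(dst | src) from observed (src, dst) move counts."""
    total = sum(c for (s, _), c in counts.items() if s == src)
    return counts.get((src, dst), 0) / total if total else 0.0


# Hypothetical data: industry moves observed across member careers.
industry_moves = {("banking", "software"): 3, ("banking", "banking"): 7}

features = {
    "similarity(headline, title)": cosine_similarity(
        "senior software developer", "software developer"),
    "transition(candidate industry, job industry)": transition_probability(
        industry_moves, "banking", "software"),
}
print(features)
```

How such feature values are weighted into a final match score is not stated on the slide, so the sketch stops at the feature dictionary.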

Page 21: Data By The People, For The People

Is job-hunting socially contagious?

[Posse, 2012]

Page 22: Data By The People, For The People

Social referral

Suggest based on connection strength and relevance to target user.

2x conversion!

[Amin et al, 2012]

Page 23: Data By The People, For The People

Suggested skill endorsements

Page 24: Data By The People, For The People

Recommendations: Summary

Content is king.

Connections provide social dimension.

Context determines where and when a recommendation is appropriate.

Page 25: Data By The People, For The People

Networking

Page 26: Data By The People, For The People

People You May Know

Page 27: Data By The People, For The People

Closing the triangles

§  Triads suggest and affect relationships. [Simmel, 1908], [Granovetter, 1973]

§  Triangle closing is a Big Data problem. [Shah, 2011]

§  Use machine learning to rank candidates.

[Figure: Alice, Bob, and Carol, with a possible connection marked "?"]
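Triangle closing can be illustrated with a small sketch. The talk says candidates are ranked with machine learning; the version below is a simplification that ranks friends-of-friends by the number of shared connections only. `suggest_connections` and the sample graph are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Set, Tuple


def suggest_connections(graph: Dict[str, Set[str]], user: str) -> List[Tuple[str, int]]:
    """Rank non-connections of `user` by number of shared connections."""
    connections = graph.get(user, set())
    shared: Counter = Counter()
    for friend in connections:
        for candidate in graph.get(friend, set()):
            # Skip the user and anyone the user is already connected to.
            if candidate != user and candidate not in connections:
                shared[candidate] += 1
    # Most shared connections first; ties broken alphabetically.
    return sorted(shared.items(), key=lambda item: (-item[1], item[0]))


# Alice knows Bob and Carol; Bob and Carol are not yet connected.
graph = {
    "Alice": {"Bob", "Carol"},
    "Bob": {"Alice"},
    "Carol": {"Alice"},
}
suggestions = suggest_connections(graph, "Bob")
print(suggestions)
```

In a learned ranker, the shared-connection count would be one feature among several rather than the whole score.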

Page 28: Data By The People, For The People

Shared connections as a signal

Page 29: Data By The People, For The People

Power of social proof

Page 30: Data By The People, For The People

More power of social proof

Page 31: Data By The People, For The People

Networking: Summary

Close triangles to suggest connections.

Connections as social proof.

Unleash the power of weak ties.

Page 32: Data By The People, For The People

Conclusion

§  People use LinkedIn because of other people.

§  Primary use cases:

– Find and be found.

– Discover and share knowledge.

§  People are at the heart of LinkedIn’s products:

– Search

– Recommendations

– Networking

Page 33: Data By The People, For The People

[Chart: LinkedIn Members (Millions), 2004 to 2011; values shown: 2, 4, 8, 17, 32, 55, 90]

175M+ members

25th most visited website worldwide (Comscore 6-12)

>2M company pages

62% non U.S.

2/sec

85% Fortune 500 Companies use LinkedIn to hire

Thank You!

We’re Hiring!

Learn more at http://data.linkedin.com/