Keeping It Professional: Relevance, Recommendations, and Reputation at LinkedIn
DESCRIPTION
Daniel Tunkelang (LinkedIn)
LinkedIn operates the world's largest professional network on the Internet, with more than 100 million members in over 200 countries. To connect its users to the people, opportunities, and content that best advance their careers, LinkedIn has developed a variety of algorithms that surface relevant content, offer personalized recommendations, and establish topic-sensitive reputation, all at massive scale. In this talk, I will discuss some of the most challenging technical problems we face at LinkedIn, and the approaches we are taking to address them.
Note: This talk was presented at the Carnegie Mellon University School of Computer Science Intelligence Seminar on September 20, 2011. As of May 2013, LinkedIn has over 225 million members.
TRANSCRIPT
1
Keeping It Professional: Relevance, Recommendations, and Reputation
Daniel Tunkelang
Principal Data Scientist at LinkedIn
2
Overview
What is LinkedIn?
Hard problems we’re tackling in:
Relevance
Recommendations
Reputation
Open problems
3
Identity: Connect, find and be found
LinkedIn Profile, Address Book, Search
Rolodex, Resume, Business Card
Insights: Be great at what you do
Homepage, LinkedIn Today, Groups
Newspapers, Trade Magazines, Events
Everywhere: Work wherever our members work
Mobile, APIs, Plug-Ins
Desktop
What is LinkedIn?
4
Identity: Profile of Record
5
Identity: Connect with Others
6
Identity: Join the Conversation
7
Insights: Power of Aggregation
Before employees worked at:
Yahoo! (169), Google (96), Oracle (78), Microsoft (72), IBM (43)
Before employees worked at:
Google (475), Microsoft (448), LinkedIn (169), Apple, Inc. (154), eBay (133)
8
Insights: Market Research
9
Insights: Data Stories
10
Everywhere
11
Hard Problems: Examples
Relevance
People Search
Recommendations
Job Matching
Reputation
Skills
12
People Search
13
120M+ members
2B searches in 2010
Based on (cf. http://sna-projects.com/)
People Search: Scale
14
People Search: Faceted Search
15
People Search: Network Facet
16
People Search: Type-Ahead
17
Query-Independent Signals: Network Rank, Profile Quality
Query-Dependent Signals: Field-Based Relevance
Personalized Signals: Network Distance
People Search: Relevance
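One way to read the three signal families on this slide is as inputs to a single ranking score. The following is a minimal sketch under stated assumptions, not LinkedIn's ranking function: the `Candidate` fields, the `WEIGHTS` values, the linear combination, and the `1 / distance` proximity transform are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Candidate:
    name: str
    network_rank: float      # query-independent signal, assumed scaled to [0, 1]
    profile_quality: float   # query-independent signal, assumed scaled to [0, 1]
    field_relevance: float   # query-dependent signal, assumed scaled to [0, 1]
    network_distance: int    # personalized signal: 1 = direct connection, 2, 3, ...


# Hypothetical weights, chosen only for illustration.
WEIGHTS = {
    "network_rank": 0.2,
    "profile_quality": 0.1,
    "field_relevance": 0.5,
    "proximity": 0.2,
}


def proximity(distance: int) -> float:
    """Turn a network distance into a score that decays with distance."""
    return 1.0 / max(distance, 1)


def score(c: Candidate) -> float:
    """Linear combination of the three signal families (an assumption)."""
    return (
        WEIGHTS["network_rank"] * c.network_rank
        + WEIGHTS["profile_quality"] * c.profile_quality
        + WEIGHTS["field_relevance"] * c.field_relevance
        + WEIGHTS["proximity"] * proximity(c.network_distance)
    )


def rank(candidates: List[Candidate]) -> List[Candidate]:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=score, reverse=True)
```

With everything else equal, a first-degree connection outranks a third-degree one, which is the effect the personalized signal is meant to have.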
18
People Search: Query-Independent Signals
19
People Search: Network Rank
20
People Search: Profile Quality
21
People Search: SEO
22
People Search: Query-Dependent Signals
23
People Search: Inferring Structure
24
People Search: Ambiguity
25
for i in [1..n]
    B[i] ← {}
    s ← w1 w2 … wi
    if Pc(s) > 0
        a ← new Segment()
        a.segs ← {s}
        a.prob ← Pc(s)
        B[i] ← {a}
    for j in [1..i-1]
        for b in B[j]
            s ← wj+1 wj+2 … wi
            if Pc(s) > 0
                a ← new Segment()
                a.segs ← b.segs ∪ {s}
                a.prob ← b.prob * Pc(s)
                B[i] ← B[i] ∪ {a}
    sort B[i] by prob
    truncate B[i] to size k
People Search: HMM + Segmentation
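The segmentation loop on this slide can be sketched in Python as a beam search over query prefixes. This is a sketch under assumptions rather than the production code: `pc` stands in for the slide's Pc(s) and is supplied by the caller, `beams[i]` plays the role of B[i] (the best segmentations of the first i words), and each extension is taken to cover the words after position j. The HMM part of the slide title is not covered here.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass(frozen=True)
class Segmentation:
    segs: Tuple[str, ...]  # the segments, in query order
    prob: float            # product of the segment probabilities


def segment(words: Sequence[str],
            pc: Callable[[str], float],
            k: int = 5) -> List[Segmentation]:
    """Return up to k of the most probable segmentations of `words`."""
    n = len(words)
    # beams[i] holds the top-k segmentations of words[:i]; beams[0] is unused.
    beams: List[List[Segmentation]] = [[] for _ in range(n + 1)]
    for i in range(1, n + 1):
        candidates: List[Segmentation] = []
        # Case 1: the whole prefix is a single segment.
        s = " ".join(words[:i])
        p = pc(s)
        if p > 0:
            candidates.append(Segmentation((s,), p))
        # Case 2: extend a segmentation of words[:j] with the segment words[j:i].
        for j in range(1, i):
            s = " ".join(words[j:i])
            p = pc(s)
            if p > 0:
                for b in beams[j]:
                    candidates.append(Segmentation(b.segs + (s,), b.prob * p))
        # Keep only the k most probable segmentations for this prefix.
        candidates.sort(key=lambda a: a.prob, reverse=True)
        beams[i] = candidates[:k]
    return beams[n]


# Toy probabilities, invented purely for illustration.
toy = {"software engineer": 0.4, "software": 0.2, "engineer": 0.2, "google": 0.3}
best = segment("software engineer google".split(), lambda s: toy.get(s, 0.0))[0]
# best.segs is ('software engineer', 'google')
```

Truncating each beam to size k bounds the work per prefix, at the cost of possibly discarding a segmentation that would have turned out best later.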
26
People Search: Personalized Signals
27
QCon 2010 presentation by John Wang on “LinkedIn Search: Searching the Social Graph in Real Time”
http://www.infoq.com/presentations/LinkedIn-Search
SIGIR 2011 Workshop on Entity-Oriented Search
http://research.microsoft.com/en-us/um/beijing/events/eos2011/
HCIR 2011 paper by Jonathan Koren on “Faceted Search Query Log Analysis” (forthcoming)
http://hcir.info/hcir-2011/
People Search: Further Reading
28
Job Matching
29
Job Features: Job Description, Location, Similar Jobs, …
Candidate Features: Profile Data, Network, Activity, …
Standardization: Companies, Job Titles, Education, …
Job Matching: Overview
30
Job Matching: Algorithm
Inputs: Corpus Stats, User Base, Filtered
Job: title, geo, company, industry, description, functional area, …
Candidate
General: expertise, specialties, education, headline, geo, experience
Current Position: title, summary, tenure length, industry, functional area, …
Derived: transition probabilities, connectivity, yrs of experience to reach title, education needed for this title, …
Matching
Binary: exact matches (geo, industry, …)
Soft: transition probabilities, similarity, …
Text
Example feature values
Similarity (candidate expertise, job description): 0.56
Similarity (candidate specialties, job description): 0.2
Transition probability (candidate industry, job industry): 0.43
Title Similarity: 0.8
Similarity (headline, title): 0.7
…
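The matching features named on this slide (binary exact matches, text similarities, and transition probabilities) could be computed along the following lines. This is an illustrative sketch only: the slide does not say which similarity measure is used, so bag-of-words cosine similarity is an assumption, as are the dictionary field names and the count-based estimate of the transition probability.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def cosine_similarity(a: str, b: str) -> float:
    """Bag-of-words cosine similarity between two texts (assumed measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[t] * vb[t] for t in va)
    norm = (math.sqrt(sum(v * v for v in va.values()))
            * math.sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0


def transition_probability(transitions: List[Tuple[str, str]],
                           src: str, dst: str) -> float:
    """Estimate P(dst | src) from observed (from_industry, to_industry) moves."""
    from_src = [t for t in transitions if t[0] == src]
    if not from_src:
        return 0.0
    return sum(1 for t in from_src if t[1] == dst) / len(from_src)


def match_features(candidate: Dict[str, str],
                   job: Dict[str, str],
                   transitions: List[Tuple[str, str]]) -> Dict[str, float]:
    """Build one feature vector for a (candidate, job) pair."""
    return {
        # Binary: exact match on a standardized field.
        "geo_match": 1.0 if candidate["geo"] == job["geo"] else 0.0,
        # Soft: text similarity between profile and job fields.
        "expertise_vs_description": cosine_similarity(
            candidate["expertise"], job["description"]),
        "headline_vs_title": cosine_similarity(
            candidate["headline"], job["title"]),
        # Soft: how often people move between the two industries.
        "industry_transition": transition_probability(
            transitions, candidate["industry"], job["industry"]),
    }
```

The sketch stops at the feature vector because the slide does not state how the features are combined into a final match score.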
31
Most people aren't looking for jobs.
Complicates evaluation, training.
Important not to offend users.
e.g., by offering Peter Norvig a postdoc.
You can’t always get what you want
Every employer wants the hottest candidates.
Job Matching: Challenges
32
KDD 2011 paper by Bekkerman & Gavish on “High-Precision Phrase-based Document Classification”
http://www.stanford.edu/~gavish/documents/phrase_based.pdf
SIGIR 2011 paper by Cetintas et al. on “Identifying Similar People in Professional Social Networks”
http://dl.acm.org/citation.cfm?id=2010123
Blog post on LinkedIn’s recommendation engine
http://blog.linkedin.com/2011/03/02/linkedin-products-you-may-like/
Job Matching: Further Reading
33
Skills
34
Skills: What are Skills?
35
Skills: Identifying Skills
36
Skills: Cluster and Disambiguate
angel
37
Skills: Assigning Skills to People
38
Skills: Who are the Experts?
39
• Relevance
• Combine query-independent, query-dependent, and personalized features.
• Recommendations
• Match people to jobs, groups, news, …
• Reputation
• Expertise relative to professional skills.
Summary: The 3 Rs
40
¿Open Problems?
41
Exploratory Search
Lookup: Fact retrieval, Known item search, Navigation, Transition, Verification, Question answering
Learn: Knowledge acquisition, Comprehension/Interpretation, Comparison, Aggregation/Integration, Socialize
Investigate: Accretion, Analysis, Exclusion/Negation, Synthesis, Evaluation, Discovery, Planning/Forecasting, Transformation
42
Explore / Exploit
43
Incentives for Online Reputation