Predicting Click Through Rate for Job Listings
Manish Gupta
Yahoo! HotJobs
Jan 22, 2009
CTR and its applications
• CTR = ratio of clicks (to get the full description of an entity) to views of its reduced version
• Rank results
• Impacts publisher revenue in pay-for-performance models
• Bidding in ad exchanges
• Trends can help detect click fraud
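The definition above can be written as a one-line ratio. The sketch below is only an illustration: the function name, the choice to return 0.0 for a listing with no views, and the counts are assumptions, not taken from the slides.

```python
def click_through_rate(clicks: int, views: int) -> float:
    """Return clicks / views for a listing shown in reduced form."""
    if clicks < 0 or views < 0:
        raise ValueError("counts must be non-negative")
    if views == 0:
        return 0.0  # assumption: a listing never shown gets CTR 0
    return clicks / views


# Hypothetical counts: 229 clicks on 10,000 views of the reduced listing.
example_ctr = click_through_rate(229, 10_000)
print(f"{example_ctr:.2%}")
```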
CTR for new job listings
• Avg CTR = 2.29%
• MLE would have high variance
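To see why the MLE (clicks / views) is noisy for a new listing, one can look at its standard error. The sketch below assumes each view is an independent Bernoulli trial with click probability equal to the 2.29% average; that binomial model is an assumption made here for illustration, not something the slides state.

```python
import math


def mle_standard_error(true_ctr: float, views: int) -> float:
    """Standard error of clicks/views under a binomial click model."""
    return math.sqrt(true_ctr * (1.0 - true_ctr) / views)


AVG_CTR = 0.0229  # average CTR quoted on the slide

# Few views (a new listing) give an error comparable to the CTR itself.
for views in (10, 100, 1_000, 10_000):
    se = mle_standard_error(AVG_CTR, views)
    print(f"views={views:>6}  std. error={se:.4f}  relative={se / AVG_CTR:.0%}")
```

Under this model the error shrinks only with the square root of the number of views, which is why a listing with little history needs features beyond its own click counts.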
CTR for job listings
Related work
• Regelson and Fain: Estimate CTR using topic clusters (job categories)
• Richardson et al.: Describe features for predicting CTR for ads
• Our baseline: avg CTR for a test job (2.29%)
Refined Problem definition
• Ideal: Predict CTR(job j, position p, user cluster u, context c)
  – Data sparsity; huge feature vector
• Predict CTR(job)
  – Use CTR versus position curve
• Predict CTR(job, position)
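One way to read the last step is to scale a job-level CTR by a factor taken from the CTR versus position curve. The sketch below assumes a multiplicative position effect and uses made-up curve values; neither the combination rule nor the numbers come from the slides.

```python
# Hypothetical average CTR observed at each result position (1 = top).
POSITION_CTR = {1: 0.050, 2: 0.035, 3: 0.025, 4: 0.020, 5: 0.015}


def position_multipliers(position_ctr):
    """Normalize the curve so the multipliers average to 1 across positions."""
    mean_ctr = sum(position_ctr.values()) / len(position_ctr)
    return {pos: ctr / mean_ctr for pos, ctr in position_ctr.items()}


def ctr_at_position(job_ctr, position, multipliers):
    """Assumed multiplicative model: CTR(job, position) = CTR(job) * m(position)."""
    return job_ctr * multipliers[position]


multipliers = position_multipliers(POSITION_CTR)
for pos in sorted(multipliers):
    print(pos, round(ctr_at_position(0.0229, pos, multipliers), 4))
```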
Data set
• Used HotJobs data from Aug 11, 2008 to Aug 31, 2008 to predict CTR of jobs on Sep 1, 2008
• 40K jobs from 7K+ companies
• 32K as train set and 8K as test set
• Jobs have location, company name, category, creation date, posting date, optional position-wise click history, job source, title, snippet & job description.
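The history window and the split sizes can be checked with a few lines of date arithmetic. The variable names below are illustrative only; the dates and counts are the ones on the slide.

```python
from datetime import date, timedelta

HISTORY_START = date(2008, 8, 11)
HISTORY_END = date(2008, 8, 31)
TARGET_DAY = date(2008, 9, 1)

# Every calendar day in the click-history window, both ends included.
history_days = [
    HISTORY_START + timedelta(days=offset)
    for offset in range((HISTORY_END - HISTORY_START).days + 1)
]

TRAIN_JOBS, TEST_JOBS = 32_000, 8_000
train_fraction = TRAIN_JOBS / (TRAIN_JOBS + TEST_JOBS)

print(len(history_days), "history days;", f"train fraction = {train_fraction:.0%}")
```

The window covers three full weeks, which lines up with the one/two/three-week click histories used by the features.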
Different models
• Weka: Linear Regression and SMOReg
• Treenet: Gradient Boosted Decision Trees
• Feature selection:
  – Weka: wrapper with evaluator=linear regression and search=GreedyStepwise
  – Treenet: Variable importance metrics
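The wrapper idea can be illustrated with a generic forward greedy search: at each step, add the feature that most improves a score, and stop when nothing helps. This is a minimal sketch and not Weka's implementation; in a real wrapper the `score` callable would train and evaluate the linear regression on the candidate subset, whereas here it is a toy stand-in with hypothetical feature names and gains.

```python
from typing import Callable, FrozenSet, List, Sequence


def greedy_forward_selection(
    features: Sequence[str],
    score: Callable[[FrozenSet[str]], float],
) -> List[str]:
    """Add one feature at a time while the score keeps improving."""
    selected: List[str] = []
    remaining = list(features)
    best_score = score(frozenset())
    while remaining:
        # Score every one-feature extension of the current subset.
        top_score, top_feature = max(
            (score(frozenset(selected + [f])), f) for f in remaining
        )
        if top_score <= best_score:
            break  # no extension improves the score
        selected.append(top_feature)
        remaining.remove(top_feature)
        best_score = top_score
    return selected


# Toy stand-in for "model quality on this subset": hypothetical per-feature
# gains minus a small penalty per feature used.
TOY_GAIN = {"same_company_ctr": 0.30, "same_title_ctr": 0.20, "title_word_count": 0.01}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(TOY_GAIN[f] for f in subset) - 0.05 * len(subset)


chosen = greedy_forward_selection(list(TOY_GAIN), toy_score)
print(chosen)
```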
Features
• Features from Similar Jobs (60)
  – CTR of jobs with the same title/company/state/city+state/category, and their cardinalities, for jobs posted in the past one/two weeks or all jobs, based on the click history of the past one/two/three weeks
• Features from Related Jobs (288)
  – CTR_mn of related jobs with m = |A-B| and n = |B-A|, and their cardinalities (0 ≤ m, n ≤ 5), for jobs posted in the past one/two weeks or all jobs, based on the click history of the past one/two/three weeks
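A rough sketch of both feature families follows. Two things are assumed here because the slide does not spell them out: that the group CTR pools clicks and views over the matching jobs, and that A and B are the sets of title words of the two jobs being compared. All names and data are hypothetical.

```python
from collections import defaultdict


def group_ctr(jobs, key):
    """Pooled CTR and cardinality of the jobs sharing each value of `key`."""
    clicks = defaultdict(int)
    views = defaultdict(int)
    count = defaultdict(int)
    for job in jobs:
        value = job[key]
        clicks[value] += job["clicks"]
        views[value] += job["views"]
        count[value] += 1
    return {
        value: (clicks[value] / views[value] if views[value] else 0.0, count[value])
        for value in count
    }


def title_difference(title_a, title_b):
    """Return (m, n) = (|A-B|, |B-A|), assuming A and B are title word sets."""
    a = set(title_a.lower().split())
    b = set(title_b.lower().split())
    return len(a - b), len(b - a)


# Hypothetical click history.
JOBS = [
    {"company": "Acme", "clicks": 3, "views": 100},
    {"company": "Acme", "clicks": 1, "views": 100},
    {"company": "Globex", "clicks": 5, "views": 50},
]

by_company = group_ctr(JOBS, "company")
m, n = title_difference("Senior Java Developer", "Java Developer Intern")
print(by_company, (m, n))
```

Under this reading, a related_11 job is one whose title differs by one word on each side.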
Features
• Job Title Features (11)
  – #words, #capitalized words, isAllCaps, hasHighPunct, hasLongWords, hasNumbers, vocabulary features
• Daily CTR Features for past 3 weeks (21)
• Other Features (10)
  – Job category, age, location specificity, job source, and job description page features
• Other potential features
  – High-marketing-pitch words, brand value of company, spam feedback, seasonal variations
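Several of the job title features can be computed directly from the title string. The sketch below covers six of them; the thresholds for "long word" and "high punctuation" are guesses made for illustration, since the slide does not give them, and the vocabulary features are left out.

```python
import string


def title_features(title, long_word_len=12, punct_ratio=0.2):
    """Surface features of a job title (thresholds are assumptions)."""
    words = title.split()
    letters = [ch for ch in title if ch.isalpha()]
    punct = sum(ch in string.punctuation for ch in title)
    return {
        "num_words": len(words),
        "num_capitalized_words": sum(w[0].isupper() for w in words),
        "is_all_caps": bool(letters) and all(ch.isupper() for ch in letters),
        "has_high_punct": bool(title) and punct / len(title) > punct_ratio,
        "has_long_words": any(len(w) >= long_word_len for w in words),
        "has_numbers": any(ch.isdigit() for ch in title),
    }


plain = title_features("Software Engineer")
shouty = title_features("URGENT!!! NURSES NEEDED $$$ 24/7")
print(plain)
print(shouty)
```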
Experiments and results
• Baseline: Predict avg CTR for a test job (2.29%)
• Predicting the category-wise avg CTR (A)
• Linear Regression over 390 features (B): uses only 142 regressors
• GBDT using Treenet over 390 features (C): uses 300 regressors (at 256_600_0.01_100)
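The two simplest predictors above, the global average and the category-wise average (A), can be sketched as follows. The data is invented, and RMSE is used only as an example of a regression metric; the slides do not say which metrics were reported.

```python
import math
from collections import defaultdict


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical (category, CTR) pairs.
TRAIN = [("IT", 0.04), ("IT", 0.02), ("Sales", 0.01), ("Sales", 0.03)]
TEST = [("IT", 0.035), ("Sales", 0.015)]

# Baseline: one average CTR for every test job.
global_mean = sum(ctr for _, ctr in TRAIN) / len(TRAIN)

# (A): average CTR per job category, falling back to the global mean.
by_category = defaultdict(list)
for category, ctr in TRAIN:
    by_category[category].append(ctr)
category_mean = {c: sum(v) / len(v) for c, v in by_category.items()}

actual = [ctr for _, ctr in TEST]
baseline_rmse = rmse(actual, [global_mean for _ in TEST])
category_rmse = rmse(actual, [category_mean.get(c, global_mean) for c, _ in TEST])
print(baseline_rmse, category_rmse)
```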
Analysis of regressor distribution
Important features
• Similar Jobs features
  – Same company, title, city+state using 1 week click history
• Other features
  – Creation date, job description page size, date of update, posting date, job category
• Related Jobs features
  – Related_11, related_12 jobs posted in past 1/3 weeks over 1/3 week click history
Pruning the feature set
Pruning the feature set
• Wrapper-based feature selection with linear regression and with Treenet’s variable importance (E): 11 features.
In absence of click history …
• Linear regression with 369 features, i.e. the 390 features less the 21 daily CTR features (F): uses 187 regressors.
• Treenet uses 282 regressors at 256_600_0.01_20 (G)
Analysis of regressor distribution
None of the sets alone helps!
Pruning the feature set
Variable importance curves
Conclusion and future work
• We built a machine learning system to predict CTR for job listings and presented our results using various regression metrics.
• More features
• Dyadic models to predict user-personalized CTR with (job feature vector, user feature vector) dyads
• Auto model updates to correct model drift
Thanks for your time