company recommendation for new graduates via implicit feedback multiple matrix factorization with...

Post on 12-Apr-2017


Company Recommendation for New Graduates via Implicit Feedback Multiple Matrix Factorization with Bayesian Optimization

IEEE Big Data 2016, Washington, D.C.

Masahiro Kazama (1), Issei Sato (2), Haruaki Yatabe (3), Tairiku Ogihara (3), Tetsuro Onishi (3), Hiroshi Nakagawa (2)
1. Recruit Technologies Co., Ltd.
2. University of Tokyo
3. Recruit Career Co., Ltd.

Outline

• Problem Settings
• Data Description
• Proposed Method
• Experiments
• Results
• Conclusion

Problem Setting

• Unique job hunting activities of Japanese students
• The starting time for job hunting is fixed
• All students apply at the same time

Example: job hunting schedule of students who graduate in 2015

• Start job hunting activities: Dec 1, 2013
• Start interviews: April 1, 2014
• Graduate/Join: April 1, 2015

Problem Setting

• Students have to send application sheets to many companies to get a job offer
• Many students spend much time on job hunting activities. This is a big social problem in Japan
• Many students send application sheets to the popular companies at the beginning. But those companies have a high competition rate, so many of these students cannot get a job offer

Popularity bias

• Browsing concentrates on some companies

[Figure: number of students (vertical axis) per company, ordered by popularity (horizontal axis); high-browsed companies (top 20%) and low-browsed companies (bottom 80%)]

Problem Setting

• It is important to find a company suitable for each student at an early stage of job hunting activities
• It is important to consider not only high-browsed companies but also low-browsed companies

Solutions

• We recommend suitable companies to students at an early stage
• We focus on low-browsed companies

Data

• Our company (Recruit Co., Ltd.) provides a job recruiting service
• Almost all students use our service
• We have three types of data
1. Browsing data
2. Entry data
3. Student/Company information

Browsing data

• Browsing data of students on our recruiting service
• Used for training our model
• Period: 2013/12/1 to 2014/3/31

Entry data

• Entry data of students on our recruiting service
• Used for evaluating our model
• Period: 2013/12/1 to 2014/3/31

Browsing (click) data

Rows are students (j1 to j3), columns are companies (i1 to i4), and each value is a number of clicks.

click    i1    i2    i3    i4
j1        0     4     0    21
j2       71    31     0    18
j3        3     1     2     0

Entry data

Rows are students (j1 to j3), columns are companies (i1 to i4), and 1 marks an entry.

entry    i1    i2    i3    i4
j1        0     1     0     0
j2        0     1     0     1
j3        1     0     1     0

Student/Company info

• Student: faculty, department, etc.
• Company: industry type, location, number of employees

Overview

Purpose

• Using browsing data and student/company information, we recommend suitable companies to students
• We focus on low-browsed companies

Solution

• Using browsing data → Implicit feedback recommendation
• Low-browsed item recommendation → Popularity bias
• Hyperparameter search → Bayesian optimization

Explicit vs. Implicit

              Explicit feedback                     Implicit feedback
Definition    The data the user explicitly gives    User action data used for guessing user preference
e.g.          Amazon 5-star rating                  Click log
Pros          Good quality                          Easy to get; much data
Cons          Difficult to get                      Noise; popularity bias

Popularity bias

• Browsing concentrates on some companies → High-browsed companies are more likely to be recommended

[Figure: number of students (vertical axis) per company, ordered by popularity (horizontal axis); high-browsed companies (top 20%) and low-browsed companies (bottom 80%). We want to recommend the low-browsed companies]
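The top 20% / bottom 80% split described on this slide can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function name `split_by_popularity`, the company ids, and the counts are all hypothetical, and ties are broken by company id purely for determinism.

```python
def split_by_popularity(students_per_company, top_fraction=0.2):
    """Split companies into high-browsed (top fraction) and low-browsed (the rest).

    students_per_company maps a company id to the number of students
    who browsed it. The 20% threshold follows the slide; everything
    else here is an illustrative assumption.
    """
    # Sort from most to least browsed; break ties by id so the result is stable.
    ranked = sorted(students_per_company,
                    key=lambda c: (-students_per_company[c], c))
    n_high = max(1, round(len(ranked) * top_fraction))
    return set(ranked[:n_high]), set(ranked[n_high:])


# Hypothetical counts, chosen only to show a long-tailed distribution.
counts = {"c01": 900, "c02": 450, "c03": 40, "c04": 35, "c05": 20,
          "c06": 12, "c07": 9, "c08": 5, "c09": 3, "c10": 1}
high, low = split_by_popularity(counts)
print(sorted(high))
print(sorted(low))
```

With ten companies, two land in the high-browsed set and the remaining eight form the low-browsed set that the talk wants to recommend.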

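Implicit feedback matrix factorization

Collaborative Filtering for Implicit Feedback Datasets (2008), Yifan Hu, Yehuda Koren, Chris Volinsky

• $r_{ui}$: number of clicks by user $u$ on company $i$ (e.g. $r_{ui} = 10$)
• Preference: $p_{ui} = 1$ if $r_{ui} > 0$, and $p_{ui} = 0$ if $r_{ui} = 0$
• Confidence: $c_{ui} = 1 + \alpha r_{ui}$, as defined in the cited paper

[Figure: example browsing data for students j1 to j3 and companies i1 to i3; j1 has one non-empty cell (41), j2 has one non-empty cell (2), and j3 has clicks 24, 3, 51]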
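To make the preference and confidence terms concrete, here is a small sketch following the formulation of the cited Hu, Koren and Volinsky paper, where click counts become a binary preference and a confidence weight on a squared-error loss. The click matrix reuses the example values from the browsing-data slide; the function names, the default `alpha`, and the regularization weight `lam` are assumptions for illustration, not values from the talk.

```python
def preference(r_ui):
    """Binary preference: 1 if there was at least one click, else 0."""
    return 1 if r_ui > 0 else 0


def confidence(r_ui, alpha=1.0):
    """Confidence grows linearly with the click count: c = 1 + alpha * r."""
    return 1.0 + alpha * r_ui


def weighted_loss(clicks, student_factors, company_factors, alpha=1.0, lam=0.1):
    """Confidence-weighted squared error over ALL cells plus L2 regularization."""
    loss = 0.0
    for u, row in enumerate(clicks):
        for i, r_ui in enumerate(row):
            # Predicted preference is the dot product of the two latent vectors.
            pred = sum(x * y for x, y in zip(student_factors[u], company_factors[i]))
            loss += confidence(r_ui, alpha) * (preference(r_ui) - pred) ** 2
    reg = sum(v * v for vec in student_factors + company_factors for v in vec)
    return loss + lam * reg


# Rows are students j1..j3, columns are companies i1..i4 (slide example).
clicks = [[0, 4, 0, 21],
          [71, 31, 0, 18],
          [3, 1, 2, 0]]
zeros_students = [[0.0, 0.0] for _ in clicks]
zeros_companies = [[0.0, 0.0] for _ in clicks[0]]
baseline = weighted_loss(clicks, zeros_students, zeros_companies)
print(baseline)
```

With all factors at zero, every clicked cell contributes its full confidence to the loss, so heavily clicked cells dominate. This is the mechanism that pulls the model toward high-browsed companies.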

Problem

• High-browsed companies are more likely to be recommended

[Figure: number of clicks (vertical axis) for companies i1 to i12 (horizontal axis); the high-browsed companies are likely to be recommended, while the low-browsed companies are the ones we want to recommend]

Proposed method

• Popularity of company i = number of users who browsed company i
• The confidence c is bigger when the company has fewer clicks → Low-browsed companies are likely to be recommended
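The popularity term is simple to compute from the click matrix. The slide's exact confidence formula did not survive in this transcript, so `discounted_confidence` below is a hypothetical stand-in that merely has the stated property (larger confidence for less popular companies). It should not be read as the authors' formula, and `alpha` and `beta` here are placeholder parameters.

```python
def company_popularity(clicks):
    """Number of students with at least one click on each company (per column)."""
    n_companies = len(clicks[0])
    return [sum(1 for row in clicks if row[i] > 0) for i in range(n_companies)]


def discounted_confidence(r_ui, popularity_i, alpha=1.0, beta=0.5):
    """HYPOTHETICAL weighting, not taken from the slides.

    Divides the click weight by popularity**beta, so the same number of
    clicks yields a larger confidence for a less popular company.
    """
    if r_ui == 0:
        return 1.0
    return 1.0 + alpha * r_ui / (popularity_i ** beta)


# Rows are students j1..j3, columns are companies i1..i4 (slide example).
clicks = [[0, 4, 0, 21],
          [71, 31, 0, 18],
          [3, 1, 2, 0]]
popularity = company_popularity(clicks)
print(popularity)
print(discounted_confidence(4, popularity_i=1), discounted_confidence(4, popularity_i=4))
```

Any monotonically decreasing function of popularity would serve the same illustrative purpose; the power-law form is chosen only because it is easy to tune with a single exponent.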

Proposed method with side information

• Student information
• Company information

Hyperparameter search

• Hyperparameters: weights of browsing α, β; regularization λ1, λ2, λ3
• When the number of hyperparameters is large, grid search doesn't work well
• Use Bayesian optimization for hyperparameter search

Bayesian optimization

[Figure: black box that maps an input x to an output y = f(x)]

Optimization for black-box functions (Mockus, 1978)
→ A Gaussian process is assumed for the distribution of the function f(x)
→ It suggests the next hyperparameter to evaluate

• x: hyperparameters α, β, λ1, λ2, λ3
• f(x): Recall
• We want to find the hyperparameters that maximize Recall
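The loop described on this slide (fit a Gaussian process to the evaluations so far, then pick the next point to try) can be sketched with the standard library alone. This is a deliberately small illustration under several assumptions: a single hyperparameter on a fixed candidate grid instead of the five used in the talk, an RBF kernel, expected improvement as the acquisition function, and a toy objective `toy_recall` standing in for the validation recall. None of these choices are stated in the slides.

```python
import math


def rbf(a, b, length=0.2):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior(xs, ys, x_new, noise=1e-4):
    """Posterior mean and variance of a GP (constant mean = average of ys)."""
    mu = sum(ys) / len(ys)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(xs)]
         for i, a in enumerate(xs)]
    k = [rbf(a, x_new) for a in xs]
    mean = mu + sum(ki * wi for ki, wi in zip(k, solve(K, [y - mu for y in ys])))
    var = rbf(x_new, x_new) - sum(ki * vi for ki, vi in zip(k, solve(K, k)))
    return mean, max(var, 1e-12)


def expected_improvement(mean, var, best):
    """Expected improvement over the best value seen so far (maximization)."""
    std = math.sqrt(var)
    z = (mean - best) / std
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (mean - best) * cdf + std * pdf


def bayes_opt(f, candidates, n_init=3, n_trials=10):
    """Maximize f over a candidate grid; returns (best_x, best_y, n_evaluations)."""
    step = max(1, len(candidates) // n_init)
    xs = candidates[::step][:n_init]          # evenly spread initial points
    ys = [f(x) for x in xs]
    for _ in range(n_trials):
        remaining = [c for c in candidates if c not in xs]
        if not remaining:
            break
        best = max(ys)

        def score(c):
            mean, var = gp_posterior(xs, ys, c)
            return expected_improvement(mean, var, best)

        x_next = max(remaining, key=score)    # most promising unevaluated point
        xs.append(x_next)
        ys.append(f(x_next))
    i = max(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i], len(ys)


def toy_recall(x):
    """Toy stand-in for validation recall as a function of one hyperparameter."""
    return -(x - 0.3) ** 2


grid = [i / 20 for i in range(21)]
best_x, best_y, n_evals = bayes_opt(toy_recall, grid)
print(best_x, best_y, n_evals)
```

In practice each call to the objective would train the factorization model and measure recall on the validation set, which is why spending a little computation on choosing the next trial is worthwhile compared with a grid over five hyperparameters.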

Data and Evaluation

Metric: Recall@100 (low-browsed)

            c01   c02   c03   c04   c05   c06   c07   c08   c09   c10
Browsing     10    20     1     8     5    10     3     7    23    13
Entry       ◯ marks on four of the ten companies

Data split:

• 60%: Training set for matrix factorization
• 20%: Validation set for Bayesian optimization (BO)
• 20%: Evaluation set
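A possible reading of this evaluation setup is sketched below. The slides do not spell out how Recall@100 is restricted to low-browsed companies or what unit is split 60/20/20, so both functions encode assumptions: `recall_at_k` ranks only low-browsed candidates and counts only entries to low-browsed companies, and `split_60_20_20` shuffles a generic list of items with a fixed seed. All names and example values are hypothetical.

```python
import random


def recall_at_k(ranked_companies, entered, low_browsed, k=100):
    """Recall@k restricted to low-browsed companies (assumed definition).

    ranked_companies: company ids ordered by predicted score, best first.
    entered: set of companies the student actually applied to.
    low_browsed: set of company ids in the low-browsed group.
    Returns None when the student has no low-browsed entries.
    """
    relevant = entered & low_browsed
    if not relevant:
        return None
    # Keep only low-browsed candidates, then take the top k of those.
    top_k = [c for c in ranked_companies if c in low_browsed][:k]
    return len(relevant & set(top_k)) / len(relevant)


def split_60_20_20(items, seed=0):
    """Shuffle and split into training, validation, and evaluation parts."""
    items = list(items)
    random.Random(seed).shuffle(items)
    n_train = int(len(items) * 0.6)
    n_valid = int(len(items) * 0.2)
    return (items[:n_train],
            items[n_train:n_train + n_valid],
            items[n_train + n_valid:])


# Hypothetical ranking and entries for one student.
ranking = ["c03", "c01", "c05", "c07", "c02"]
entries = {"c01", "c03", "c08"}
low = {"c03", "c05", "c07", "c08"}
score = recall_at_k(ranking, entries, low, k=2)
train, valid, test_part = split_60_20_20([f"c{n:02d}" for n in range(1, 11)])
print(score, len(train), len(valid), len(test_part))
```

Here the student has two low-browsed entries (c03 and c08) and only c03 appears in the top two low-browsed recommendations, so the score is one half.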

Results

[Figure: bar chart of recall (axis from 0 to 0.5) for four methods: BO + Hu et al., BO + Fang et al., Proposed, and Proposed with side information]

• Proposed models get better recall

Trials of Bayesian Optimization

• As the number of trials increases, we get better recall → we can find better hyperparameters

Conclusions

• We built a recommendation system that reduces popularity bias
• By using the side information, the recommendation performance for low-browsed companies improved
• Hyperparameter optimization was performed using Bayesian optimization
