Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen Support Vector Elastic Network “Sven the Terrible”

Page 1: Jacob Gardner · 2015. 3. 20. · Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen Support Vector Elastic Network “Sven the Terrible”

Quan Zhou, Wenlin Chen, Shiji Song, Jacob R. Gardner, Kilian Q. Weinberger, Yixin Chen

Support Vector Elastic Network

“Sven the Terrible”

Page 2:

Traditional Computer Science

Traditional CS: Data + Program → Computer → Output

Page 3:

Machine Learning

Traditional CS: Data + Program → Computer → Output

Machine Learning: Data + Output → Computer → Program

Page 4:

Support Vector Machines

$\mathbf{w}^\top \mathbf{x}$

$$\min_{\mathbf{w}} \; \underbrace{\frac{1}{2}\|\mathbf{w}\|_2^2}_{\text{L2 regularization}} + \underbrace{C \sum_{i=1}^{n} \max\left(0,\, 1 - y_i(\mathbf{w}^\top \mathbf{x}_i)\right)^2}_{\text{squared hinge loss}}$$

14644 Citations

Published in ML journals

Usable means MATLAB

Fast means parallel

Many GPU Implementations
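As a minimal sketch of the objective on this slide, the function below evaluates the L2 regularizer plus the squared hinge loss for a given weight vector. It is plain Python rather than the MATLAB used later in the talk; the name `svm_objective` and the list-based data layout are illustrative assumptions, not part of the talk.

```python
def svm_objective(w, X, y, C):
    """Return 0.5*||w||_2^2 + C * sum_i max(0, 1 - y_i * (w . x_i))^2.

    w: list of weights, X: list of sample rows, y: labels in {+1, -1}.
    """
    regularizer = 0.5 * sum(wj * wj for wj in w)
    loss = 0.0
    for xi, yi in zip(X, y):
        margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
        # Squared hinge: zero once the margin reaches 1, quadratic below it.
        loss += max(0.0, 1.0 - margin) ** 2
    return regularizer + C * loss
```

For example, with `w = [1.0, 0.0]`, samples `[[2.0, 0.0], [0.5, 0.0]]`, labels `[1, 1]`, and `C = 1.0`, only the second sample violates the margin, so the objective is the regularizer 0.5 plus a loss of 0.25.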

Page 6:

Elastic Net/Lasso

$$\min_{\beta} \; \|X\beta - y\|_2^2 + \lambda_2 \|\beta\|_2^2 \quad \text{such that } |\beta|_1 \le t$$

13856 Citations

Published in stats journals

Usable means R

Fast means Fortran

Zero GPU Implementations
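The constrained Elastic Net problem on this slide can be made concrete with a short sketch that evaluates the objective and checks the L1 budget for a candidate coefficient vector. This is plain Python; `elastic_net_objective` and `within_l1_budget` are hypothetical helper names chosen for illustration.

```python
def elastic_net_objective(beta, X, y, lam2):
    """Return ||X beta - y||_2^2 + lam2 * ||beta||_2^2 for row-major X."""
    residuals = [
        sum(xij * bj for xij, bj in zip(row, beta)) - yi
        for row, yi in zip(X, y)
    ]
    squared_error = sum(r * r for r in residuals)
    ridge_penalty = lam2 * sum(b * b for b in beta)
    return squared_error + ridge_penalty


def within_l1_budget(beta, t):
    """Check the constraint |beta|_1 <= t."""
    return sum(abs(b) for b in beta) <= t
```

With `beta = [1.0, -1.0]`, an identity design matrix, `y = [0.0, 0.0]`, and `lam2 = 0.5`, the squared error is 2 and the ridge penalty is 1. The same `beta` satisfies a budget of `t = 2.0` but not `t = 1.5`.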

Page 9:

Elastic Net/Lasso

$$\min_{\beta} \; \|X\beta - y\|_2^2 + \lambda_2 \|\beta\|_2^2 \quad \text{such that } |\beta|_1 \le t$$

L1 Budget: $t$

[Figure: Equivalence of regularization path. Two panels, Glmnet and SVEN (GPU); x-axis: L1 budget $t$; y-axis: coefficients $\beta_i$.]

Page 10:

Elastic Net

+ interpretable

- slow

- does not scale

SVM

+ parallel

+ scales to large data

+ multi-platform

- not interpretable

Page 11:

Reductions

Problem A → Problem B

Solution A ← Solution B

Elastic Net (Problem A): Input X,Y; Output β

SVM (Problem B): Input Xnew,Ynew; Output α

Page 12:

Reductions

Problem A → Problem B

Solution A ← Solution B

Elastic Net (Problem A): Input X,Y; Output β

SVM (Problem B): Input Xnew,Ynew; Output α

function beta = SVEN(X,Y,t,lambda)
% Reduce the Elastic Net (X,Y,t,lambda) to an SVM with 2p samples of dimension n.
[n,p] = size(X);
Xnew = [bsxfun(@minus,X,Y./t) bsxfun(@plus,X,Y./t)]';
Ynew = [ones(p,1); -ones(p,1)];
C = 1/(2*lambda);
% Train the SVM on the constructed data (trainsvmGPU is the solver named on the slide).
model = trainsvmGPU(Ynew,sparse(Xnew),['-q -s 1 -c ' num2str(C)]);
% Recover alpha from the SVM solution, then map it back to beta.
alpha = C * max(1 - Ynew.*(Xnew*model.w),0);
beta = t*(alpha(1:p) - alpha(p+1:2*p)) / sum(alpha);
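The reduction on this slide can be tried without a GPU solver. The sketch below is a plain Python stand-in under stated assumptions: instead of `trainsvmGPU`, it runs projected gradient descent on the dual of the squared hinge loss SVM without a bias term, $\min_{\alpha \ge 0} \|Z\alpha\|_2^2 + \lambda\|\alpha\|_2^2 - 2\sum_i \alpha_i$ with $C = 1/(2\lambda)$. The helper names `sven_reduce`, `solve_svm_dual`, and `sven`, the step size, and the iteration count are illustrative choices, not the authors' implementation.

```python
def sven_reduce(X, y, t):
    """Build the SVM data set: 2p samples of dimension n, as in Xnew/Ynew."""
    n, p = len(X), len(X[0])
    pos = [[X[i][j] - y[i] / t for i in range(n)] for j in range(p)]  # label +1
    neg = [[X[i][j] + y[i] / t for i in range(n)] for j in range(p)]  # label -1
    return pos + neg, [1.0] * p + [-1.0] * p


def solve_svm_dual(Xnew, Ynew, lam, iters=5000):
    """Projected gradient descent on ||Z a||^2 + lam*||a||^2 - 2*sum(a), a >= 0."""
    m, n = len(Xnew), len(Xnew[0])
    Z = [[Ynew[i] * v for v in Xnew[i]] for i in range(m)]  # row i = y_i * x_i
    # Upper bound on the largest Hessian eigenvalue, used as 1/step size.
    lip = 2.0 * (sum(v * v for row in Z for v in row) + lam)
    alpha = [0.0] * m
    for _ in range(iters):
        s = [sum(Z[i][k] * alpha[i] for i in range(m)) for k in range(n)]
        grad = [
            2.0 * sum(Z[i][k] * s[k] for k in range(n)) + 2.0 * lam * alpha[i] - 2.0
            for i in range(m)
        ]
        # Gradient step followed by projection onto alpha >= 0.
        alpha = [max(0.0, a - g / lip) for a, g in zip(alpha, grad)]
    return alpha


def sven(X, y, t, lam):
    """Elastic Net coefficients via the SVM reduction (sketch)."""
    Xnew, Ynew = sven_reduce(X, y, t)
    alpha = solve_svm_dual(Xnew, Ynew, lam)
    p = len(X[0])
    total = sum(alpha)
    return [t * (alpha[j] - alpha[p + j]) / total for j in range(p)]
```

As a sanity check, take one sample and one feature with `X = [[1.0]]`, `y = [2.0]`, and `lam = 1.0`. The unconstrained minimizer of $(\beta - 2)^2 + \beta^2$ is $\beta = 1$, so with budget `t = 0.5` the constrained solution sits on the boundary, and `sven(X, y, 0.5, 1.0)` returns a value close to 0.5. The final mapping also guarantees that the returned coefficients never exceed the L1 budget, because `alpha` is nonnegative.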

Page 13:

Results

[Figure: Equivalence of regularization path. Two panels, Glmnet and SVEN (GPU); x-axis: L1 budget $t$; y-axis: coefficients $\beta_i$.]

Page 14:

Results

n>>d datasets

Running time: $O(d^2)$

[Figure: four log-log scatter panels, one per dataset: MITFaces [n=489410, p=361], Yahoo [n=141397, p=519], YMSD [n=463715, p=90], FD [n=400000, p=900]. x-axis: SVEN (GPU) runtime (sec); y-axis: other alg. runtime (sec). Legend: glmnet, SVEN (CPU), Shotgun, L1_Ls. Each panel marks the regions "SVEN (GPU) faster" and "SVEN (GPU) slower".]

Or…

Page 15:

Results

d>>n datasets

Running time: $O(n^2)$

[Figure: eight log-log scatter panels, one per dataset: GLI85 [n=85, p=22283], arcene [n=900, p=10000], SMKCAN187 [n=187, p=19993], GLABRA180 [n=180, p=49151], PEMS [n=440, p=138672], scene15 [n=544, p=71963], dorothea [n=800, p=88119], E2006 [n=3308, p=72812]. x-axis: SVEN (GPU) runtime (sec); y-axis: other alg. runtime (sec). Legend: glmnet, SVEN (CPU), Shotgun, L1_Ls. Each panel marks the regions "SVEN (GPU) faster" and "SVEN (GPU) slower".]

Page 16:

Conclusion

Elastic Net and SVM are equivalent problems.

Many optimizations previously available only for SVMs now apply to the Elastic Net.

This leads to the fastest Elastic Net solver we are aware of.

Page 17:

Questions?

“Sven the Nice?”