Scikit-Learn - Or why I joined an open source software project

Scikit-Learn (or why I joined an open source software project), Gilles Louppe, Dept. of EE & CS, & GIGA-R, Université de Liège, Belgium, October 30, 2013


Page 1: Scikit-Learn - Or why I joined an open source software project

Scikit-Learn (or why I joined an open source software project)

Gilles Louppe

Dept. of EE & CS, & GIGA-R
Université de Liège, Belgium

October 30, 2013

Page 2

Publishing scientific software matters [1]

- Software is a central part of modern scientific discovery.

- Software developed in one field can often be applied to advance a different field if the underlying mathematics is common.

- The public availability of code is a cornerstone of the scientific method.

[1] Pradal C. et al., Publishing scientific software matters, 2013.

Page 3

if it’s not open and verifiable by others, it’s not science, or engineering, or whatever it is you call what we do [2]

[2] V. Stodden, The scientific method in practice.

Page 4

As a young PhD student full of illusions...

I wanted to write useful scientific software, for me and others

Page 5

Leverage existing software

... but I didn’t want to reinvent the wheel!

Page 6

... and then I joined an OSS project

- An open source Machine Learning library in Python
- Classical and well-established algorithms
  - Supervised and unsupervised algorithms
  - Model evaluation and selection
  - Data processing and feature engineering
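Three of the areas listed above (data processing, supervised learning, model evaluation and selection) can be combined in a few lines. The following is a minimal sketch, not taken from the slides. It assumes a recent scikit-learn release, where `cross_val_score` and `GridSearchCV` live in `sklearn.model_selection`; the iris dataset and the candidate values for `C` are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Small built-in dataset (an illustrative choice, not from the slides).
X, y = load_iris(return_X_y=True)

# Data processing and supervised learning chained into one estimator.
model = make_pipeline(StandardScaler(), SVC())

# Model evaluation: 5-fold cross-validated accuracy.
scores = cross_val_score(model, X, y, cv=5)
print("mean accuracy: %.3f" % scores.mean())

# Model selection: grid search over the SVC regularization parameter.
# make_pipeline names each step after its lowercased class name ("svc").
search = GridSearchCV(model, {"svc__C": [0.1, 1.0, 10.0]}, cv=5)
search.fit(X, y)
print("best C:", search.best_params_["svc__C"])
```

The pipeline behaves like any other estimator, so it can be passed to the evaluation and selection tools unchanged.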

Page 7

Collaborative development

Page 8

Software quality matters

Peer-reviewed and well-tested code
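To give an idea of what "well-tested" can mean for an estimator, here is a sketch of two small checks. They are hypothetical examples written for this transcript, not tests from the scikit-learn test suite; the function names and the synthetic data are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier


def make_toy_data():
    # Synthetic two-class problem: the label depends on the sign of feature 0.
    rng = np.random.RandomState(0)
    X = rng.normal(size=(60, 4))
    y = (X[:, 0] > 0).astype(int)
    return X, y


def test_predict_shape_and_labels():
    X, y = make_toy_data()
    clf = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    y_pred = clf.predict(X)
    # One prediction per sample, and only labels seen during fit.
    assert y_pred.shape == y.shape
    assert set(np.unique(y_pred)) <= set(clf.classes_)


def test_fit_is_reproducible():
    X, y = make_toy_data()
    a = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    b = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
    # The same seed and the same data should give identical predictions.
    assert np.array_equal(a.predict(X), b.predict(X))
```

Functions named `test_*` like these can be collected by a test runner such as pytest, or called directly.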

Page 9

Simple and consistent API

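from sklearn.ensemble import RandomForestClassifier

# X_train, y_train and X_test are assumed to be existing arrays
clf = RandomForestClassifier()
clf.fit(X_train, y_train)      # learn from the training data
y_pred = clf.predict(X_test)   # predict labels for unseen samples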

Page 10

Simple and consistent API

from sklearn.svm import SVC

# Same calls as before; only the import and the constructor change
clf = SVC()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

Page 11

Simple and consistent API

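from sklearn.linear_model import LassoCV

# LassoCV is a regressor, so y_train holds continuous targets here;
# the name clf is kept to mirror the previous slides
clf = LassoCV()
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)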
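The three snippets above differ only in the import and the constructor. A minimal runnable sketch of that point follows. It assumes a recent scikit-learn release (where `train_test_split` lives in `sklearn.model_selection`) and uses synthetic data, since the slides do not say where `X_train` comes from. The helper `fit_and_predict` and the dataset sizes are illustrative, not part of the talk.

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LassoCV
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC


def fit_and_predict(estimator, X, y):
    """Hypothetical helper: split the data, fit, predict on the held-out part."""
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    estimator.fit(X_train, y_train)
    return estimator.predict(X_test)


# Synthetic data (an assumption made for this sketch).
X_clf, y_clf = make_classification(n_samples=200, n_features=10, random_state=0)
X_reg, y_reg = make_regression(n_samples=200, n_features=10, noise=1.0,
                               random_state=0)

# The classifiers get class labels; the LassoCV regressor gets continuous targets.
experiments = [
    (RandomForestClassifier(random_state=0), X_clf, y_clf),
    (SVC(), X_clf, y_clf),
    (LassoCV(), X_reg, y_reg),
]

predictions = {}
for estimator, X, y in experiments:
    name = type(estimator).__name__
    # The same two calls (fit, then predict) work for every estimator.
    predictions[name] = fit_and_predict(estimator, X, y)
    print(name, predictions[name].shape)
```

Because every estimator exposes the same `fit` and `predict` methods, the helper never needs to know which model it was given.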

Page 12

Side effect 1: Learn and improve your skills

- Strict programming practices

- Software management (release cycle, git, etc.)

- Teamwork

Page 13

Side effect 2: People might start using your software

In research

In industry

Page 14

Side effect 3: You get to meet interesting people

(and eat pizzas!)

Page 15

Start with small contributions...

Page 16

Publish and share your research code

Join an open source software project

Page 17

Questions?