Data is oxygen for ML
Cracow, 8th December 2016
Hello, I’m Dima Boyko
Dima Boyko
Rails developer, Python developer, Data Scientist
Software engineer @inFakt
[email protected] /dimaboyko
What can a computer do better than a human?
What can a human do better than a computer?
What can a computer do better than a human? + What can a human do better than a computer?
“Field of study that gives computers the ability to learn without being explicitly programmed.”
Arthur Samuel, 1959
Machine learning
Machine learning is not the future
ALVIN
YouTube: https://youtu.be/ilP4aPDTBPE?t=39
1992!
Boost: ML can boost existing products by improving the quality and usability of some modules
Unlock: ML can unlock new product use cases
Drawbacks?
WOW / WTF ratio
Usage model
Data → Algorithm → Insight
Data
Using data: Red Roof Inn
2 to 3% of flights were canceled
500 canceled flights daily
90,000 passengers
Weather data
Results
10% more revenue during the season
Using data: Los Angeles Police Department
Historical data
Analysis & prediction
Reaction
Results
33% fewer thefts
21% fewer victims
Using data: UPS Cargo Delivery
16.9M shipments delivered daily
195 countries around the globe
Orion
• Mathematical model for operations research
• Huge processing power in real time
Results
6M litres less fuel used per year
13,000 tons less exhaust
+ Faster deliveries
Using data: inFakt Automated Accounting
~50,000 invoices booked monthly by accountants
AutoAccounting: brief product history
AutoAccounting: classification
• Data from last year
• Scikit-learn
• Infrastructure
Classification: 15% of invoices, 95% correct
• Data from last year
• Infrastructure
Classification: 55% of invoices, 95% correct
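One way to read figures such as "55% of invoices, 95% correct" is as coverage and accuracy at a confidence cut-off: only predictions the classifier is sure about are booked automatically, and the rest go to an accountant. The sketch below is an illustration under that assumption, not the actual AutoAccounting code. The function name `auto_book_stats`, the category labels, and the sample confidences are all hypothetical, and plain Python stands in for a real classifier.

```python
from typing import List, Tuple

# Hypothetical record: (predicted_category, confidence, true_category).
Prediction = Tuple[str, float, str]


def auto_book_stats(predictions: List[Prediction], threshold: float) -> Tuple[float, float]:
    """Return (coverage, accuracy) for predictions with confidence >= threshold.

    coverage = share of all invoices that would be booked automatically
    accuracy = share of those auto-booked invoices that are correct
    """
    accepted = [p for p in predictions if p[1] >= threshold]
    if not accepted:
        # Nothing is confident enough (or there is no data at all).
        return 0.0, 0.0
    correct = sum(1 for label, _, truth in accepted if label == truth)
    return len(accepted) / len(predictions), correct / len(accepted)


# Made-up sample data, for illustration only.
sample: List[Prediction] = [
    ("fuel", 0.99, "fuel"),
    ("office", 0.97, "office"),
    ("fuel", 0.80, "telecom"),
    ("telecom", 0.60, "telecom"),
    ("office", 0.55, "fuel"),
]

strict = auto_book_stats(sample, threshold=0.95)  # few invoices, high accuracy
loose = auto_book_stats(sample, threshold=0.50)   # all invoices, lower accuracy
```

Raising the threshold trades coverage for accuracy, so a target such as "95% correct" fixes the threshold and the share of invoices booked automatically follows from it.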
Keep it simple…
3% wrong (inconsistent)
8/10 human mistakes
Results
~70% of invoices booked automatically
600 hours / month saved for creative work
What’s next?
Auto
#worldwide #vendor_independent #simple
Open Source?
/OpenAutoX
Thanks! Any questions?
Dima Boyko
Software engineer @inFakt
[email protected] /dimaboyko