Machine Learning · 2012-01-03
Post on 27-Jul-2020
Machine Learning
Tom Maiaroto
@shift8creative
What is Machine Learning?
Algorithms & Approaches
Decision trees
Random forests
Artificial neural networks
k-NN (nearest neighbour)
Naive Bayesian classifier
So could machines one day rule the earth?
Maybe (ok probably not)
What can Machine Learning do for Apps?
• Spam filtering
• Auto-tagging
• All sorts of categorization
• Sentiment analysis
Languages Commonly Used
• Java
  o Java-ML, WEKA, Apache Mahout, many more...
• Python
  o NLTK, scikit-learn, PyML, a good deal more...
• C++
  o libDAI, Armadillo, Orange, tons more...
and then some others...
http://www.mloss.org
MongoDB Too!
• Map/Reduce
• Stored JavaScript
• Geo-spatial Indexing
• Replication
Geo-spatial Indexing
Did someone say nearest neighbour?
Design geeks, imagine the visualizations...
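The nearest-neighbour idea behind geo-spatial queries can be sketched in plain JavaScript. This is a brute-force illustration only, not how a geo-spatial index works internally; the `places` data and the `nearest` function are made up for the example. Inside MongoDB the equivalent lookup would be a `$near` query against a geo-spatial index.

```javascript
// Hypothetical sample data: coordinate pairs stored in a `loc` field.
const places = [
  { name: "A", loc: [0, 0] },
  { name: "B", loc: [3, 4] },
  { name: "C", loc: [1, 1] },
  { name: "D", loc: [10, 10] },
];

// Plain Euclidean distance; good enough for a flat 2d illustration.
function distance(a, b) {
  return Math.hypot(a[0] - b[0], a[1] - b[1]);
}

// Return the k places closest to `point`, nearest first (brute force).
function nearest(point, k) {
  return places
    .map((p) => ({ name: p.name, dist: distance(p.loc, point) }))
    .sort((x, y) => x.dist - y.dist)
    .slice(0, k);
}

const closest = nearest([0.9, 0.9], 2).map((p) => p.name);
console.log(closest); // C first, then A
```

An index avoids scanning every point, which is the whole reason to let the database do this instead of application code.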
Replication
• Store massive amounts of data
• Distributed performance benefits
• Dedicated databases for calculations
All the obvious benefits.
Map/Reduce
It's the brain.
It's not just for aggregation.
It's faster than you might think.
It runs in the database.
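For anyone new to the pattern, here is a minimal sketch of map/reduce, run in memory rather than in the database. The `docs` data and the `mapReduce` driver are invented for illustration. In MongoDB the `map` function receives the document as `this` and calls a global `emit`; here both are passed in explicitly so the sketch runs anywhere.

```javascript
// Hypothetical documents; in MongoDB these would live in a collection.
const docs = [
  { category: "cs", text: "quantum physics lab" },
  { category: "cae", text: "film music" },
  { category: "cs", text: "physics" },
];

// Map: emit one (category, 1) pair per word in the document.
function map(doc, emit) {
  doc.text.split(/\s+/).forEach(() => emit(doc.category, 1));
}

// Reduce: fold all values emitted for one key into a single value.
function reduce(key, values) {
  return values.reduce((sum, v) => sum + v, 0);
}

// A tiny in-memory driver standing in for the database engine.
function mapReduce(documents, mapFn, reduceFn) {
  const groups = {};
  documents.forEach((doc) =>
    mapFn(doc, (key, value) => {
      (groups[key] = groups[key] || []).push(value);
    })
  );
  const out = {};
  Object.keys(groups).forEach((key) => {
    out[key] = reduceFn(key, groups[key]);
  });
  return out;
}

const wordsPerCategory = mapReduce(docs, map, reduce);
console.log(wordsPerCategory); // word totals keyed by category
```

Swap the body of `reduce` (and add a finalize step) and the same skeleton computes scores instead of sums, which is why it is not just for aggregation.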
Map/Reduce
In the computer...
Example Time!
It's simple... Just take this...
Just kidding...
Let's Break Down a Naive Bayes Classifier
Classification/Naive Bayes
Training the System
Simple...
$inc
Classification/Naive Bayes
Just Keep Count of Words per Category
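The training code from the original slide is not in this transcript, so the following is a rough sketch of the idea only. It keeps the counts in plain objects; the `inc` helper mimics what a `$inc` update does (add 1, creating the field if it is missing). The names `counts`, `totals`, and `train`, and the sample sentences, are assumptions made for the example.

```javascript
// In-memory stand-in for a words collection: counts[word][category] = n.
const counts = {};
const totals = {}; // total words seen per category

// Mimics a $inc update: add 1, creating the field if it does not exist.
function inc(obj, key) {
  obj[key] = (obj[key] || 0) + 1;
}

// Train on one piece of text that is known to belong to `category`.
function train(category, text) {
  text.toLowerCase().split(/\W+/).filter(Boolean).forEach((word) => {
    counts[word] = counts[word] || {};
    inc(counts[word], category); // word count per category
    inc(totals, category);       // running total of words per category
  });
}

train("cs", "Physics and chemistry");
train("cae", "Music and film");
console.log(counts.and, totals);
```

Because every step is an increment, training can happen one document at a time with no need to reload what was learned before.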
Classification/Naive Bayes
Reduce:
Classification/Naive Bayes
Finalize:
Classification/Naive Bayes
Call the Command:
Classification/Naive Bayes
Results:
Can see total words.
Can also see word counts per category.
...and of course the scores per category...
cae = arts and entertainment
cs = science...
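How per-category scores can fall out of the word counts is sketched below. This is a generic Naive Bayes formulation, not the reduce/finalize code from the talk (which is not in this transcript). The sample counts, the add-one smoothing, and the equal-priors assumption are all choices made for the illustration.

```javascript
// Hypothetical trained counts: counts[word][category], as kept during training.
const counts = {
  physics: { cs: 4 },
  film: { cae: 5 },
  music: { cae: 3 },
  lab: { cs: 2 },
  modern: { cs: 1, cae: 1 },
};
const categories = ["cae", "cs"];

// Total words per category and vocabulary size, derived from the counts.
const totals = {};
categories.forEach((c) => { totals[c] = 0; });
Object.values(counts).forEach((perCat) =>
  categories.forEach((c) => { totals[c] += perCat[c] || 0; })
);
const vocabularySize = Object.keys(counts).length;

// Log-probability score per category, with add-one (Laplace) smoothing
// so an unseen word does not zero out a category. Equal priors are assumed.
function classify(text) {
  const words = text.toLowerCase().split(/\W+/).filter(Boolean);
  const scores = {};
  categories.forEach((c) => {
    scores[c] = words.reduce((sum, w) => {
      const n = (counts[w] && counts[w][c]) || 0;
      return sum + Math.log((n + 1) / (totals[c] + vocabularySize));
    }, 0);
  });
  const best = categories.reduce((a, b) => (scores[a] >= scores[b] ? a : b));
  return { scores, best };
}

console.log(classify("modern physics lab").best); // cs
console.log(classify("film music").best);         // cae
```

Summing logarithms instead of multiplying probabilities keeps long texts from underflowing to zero.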
Classification/Naive Bayes
• Accurate even with little training
• MongoDB on a small VM took 1.7 seconds
• Compared to, say, PHP: 33 seconds and timed out
• More training data == exponentially faster than PHP
Classification/Naive Bayes
• This wasn't even a full map/reduce
• Your mileage will vary based on formula
• You can cache certain values for speed
• Don't forget about stored JavaScript (but use it wisely)
Porter Stemming Algorithm
Thank You Martin Porter
http://tartarus.org/martin/PorterStemmer
Porter Stemming Algorithm
• Exists for nearly every language
• MongoDB will use JavaScript of course
• Decent execution time
Porter Stemming Algorithm
• About 2.5x faster than PHP class
• 663x faster than a web browser
• 7x slower than PHP PECL extension
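To give a feel for what the stemmer does, here is a sketch of only the first rule group (step 1a, plural endings) of the Porter algorithm. It is not the full stemmer and not the implementation benchmarked above; the function name `porterStep1a` is made up for the example.

```javascript
// Step 1a of the Porter stemmer only (plural endings).
// The full algorithm has several further steps that are not shown here.
function porterStep1a(word) {
  if (word.endsWith("sses")) return word.slice(0, -2); // caresses -> caress
  if (word.endsWith("ies")) return word.slice(0, -2);  // ponies -> poni
  if (word.endsWith("ss")) return word;                // caress -> caress
  if (word.endsWith("s")) return word.slice(0, -1);    // cats -> cat
  return word;
}

const stems = ["caresses", "ponies", "caress", "cats"].map(porterStep1a);
console.log(stems);
```

Stemming words before counting them means "cat" and "cats" train the same entry, which helps a classifier that has seen little data.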
Real World Application
Social Harvest
Analyzes social data from the internet to determine languages spoken, gender, age, sentiment analysis, and categories.
www.social-harvest.com
Who doesn't like pie charts?
Follow Tom
@shift8creative
www.shift8creative.com
www.social-harvest.com
www.union-of-rad.com
Thank You!