Introduction to Machine Learning for Oracle Database Professionals
DESCRIPTION
Basic Machine Learning introduction for Oracle folks.
TRANSCRIPT
![Page 1: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/1.jpg)
Practical Machine Learning for DBAs
Alex Gorbachev
Las Vegas, NV
April 2014
![Page 2: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/2.jpg)
Alex Gorbachev
• Chief Technology Officer at Pythian
• Blogger
• Cloudera Champion of Big Data
• OakTable Network member
• Oracle ACE Director
• Founder of BattleAgainstAnyGuess.com
• Founder of Sydney Oracle Meetup
• IOUG Director of Communities
• EVP, Ottawa Oracle User Group
![Page 3: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/3.jpg)
Agenda
• What’s Machine Learning
  – Typical Machine Learning applications
• Why use Oracle Database for Machine Learning
• Practical examples
  – Classifying PL/SQL code
  – Classifying database schemas into good and bad
  – SQL statements clustering
  – Detecting anomalies in database workload
![Page 4: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/4.jpg)
What is Machine Learning?
![Page 5: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/5.jpg)
data magic
![Page 6: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/6.jpg)
scientific data analysis
![Page 7: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/7.jpg)
modern practical AI
![Page 8: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/8.jpg)
building simplified models of the universe
using probabilistic models
![Page 9: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/9.jpg)
Tom Mitchell’s definition
• Machine Learning is the study of computer algorithms that improve automatically through experience.
• A computer program is said to learn from experience E with respect to some task T and some performance measure P, if its performance on T, as measured by P, improves with experience E.
![Page 10: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/10.jpg)
Why is it useful?
![Page 11: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/11.jpg)
Why is it useful?
![Page 12: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/12.jpg)
Why is it useful?
![Page 13: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/13.jpg)
Why is it useful?
![Page 14: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/14.jpg)
Classes of ML algorithms
• Supervised learning
  – Input: data + known facts; Output: predictions
• Unsupervised learning
  – Input: data; Output: hypothesis
• Other, less common algorithms such as reinforcement learning, recommenders, etc.
![Page 15: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/15.jpg)
Supervised Learning: Linear Regression
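
The slide itself carries only a chart. As a minimal sketch of the idea, the snippet below fits a straight line by ordinary least squares in plain Python. The data set (database size versus backup duration) and all names are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical observations: database size (GB) and backup duration (minutes).
sizes_gb = [100, 200, 300, 400]
backup_min = [12, 22, 32, 42]

slope, intercept = fit_line(sizes_gb, backup_min)

# Use the fitted line to predict the duration for an unseen 500 GB database.
predicted = slope * 500 + intercept
```

The "known facts" here are the measured durations; the prediction for 500 GB (about 52 minutes with this made-up data) is the supervised-learning output.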
![Page 16: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/16.jpg)
Supervised Learning: Classification
![Page 17: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/17.jpg)
Unsupervised Learning: Clustering
![Page 18: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/18.jpg)
Unsupervised Learning: Anomaly Detection
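
The slide carries only a chart. One of the simplest ways to flag anomalies is a three-sigma rule: learn the mean and standard deviation from a baseline and flag values that lie too far from it. The sketch below assumes a hypothetical workload metric (for example, executions per minute); it is an illustration of the idea, not the method used later in the talk.

```python
import statistics


def find_anomalies(history, recent, threshold=3.0):
    """Return values in `recent` more than `threshold` standard deviations
    away from the mean of `history`."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # sample standard deviation
    if stdev == 0:
        return []
    return [x for x in recent if abs(x - mean) / stdev > threshold]


# Hypothetical baseline of a workload metric and three new observations.
baseline = [100, 102, 98, 101, 99, 100, 103, 97]
anomalies = find_anomalies(baseline, [101, 250, 96])
```

No labels are needed: the baseline alone defines what "normal" looks like, which is why this counts as unsupervised learning.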
![Page 19: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/19.jpg)
Machine Learning workflow
• Gather
• Clean & transform
• Explore
• Model
• Interpret
• Produce value
} today’s focus
![Page 20: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/20.jpg)
Why Machine Learning in Oracle Database?
![Page 21: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/21.jpg)
Machine Learning in Oracle DB?
• That’s where the data is
• Data in an RDBMS is often clean
• Easy to transform data with SQL
• Powerful algorithms implemented
  – Oracle Data Mining option
  – Analytic SQL
![Page 22: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/22.jpg)
Machine Learning by example
Applying Machine Learning
to the business of DBAs
![Page 23: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/23.jpg)
Problem: Detect bad PL/SQL
• Goal: automated PL/SQL code grading
  – Classify as Good or Bad
• Typical classification task
  – Assignment of labels to the set of unlabeled items based on prior observations
![Page 24: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/24.jpg)
Classification process
• Parse input data
• Extract features
  – Manually, automatically, or already clearly defined (if a row is an item, its columns may be features)
• Train – calculate model based on labeled input
• Verify – test model on labeled input held out from training
• Apply labels to unlabeled input
• Classification is supervised learning
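
The steps above can be sketched end to end in a few lines of plain Python. This is a hand-rolled multinomial naive Bayes with Laplace smoothing, not Oracle Data Mining's implementation; the code snippets, labels (0 = good, 1 = bad) and function names are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(code):
    """Extract features: lower-cased word tokens."""
    return re.findall(r"[a-z_]+", code.lower())


def train(labeled):
    """Train: count tokens per label."""
    token_counts = {}
    label_counts = Counter()
    for code, label in labeled:
        label_counts[label] += 1
        token_counts.setdefault(label, Counter()).update(tokenize(code))
    vocab = {tok for counts in token_counts.values() for tok in counts}
    return token_counts, label_counts, vocab


def predict(model, code):
    """Apply: pick the label with the highest log-posterior."""
    token_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in sorted(label_counts):
        counts = token_counts[label]
        denom = sum(counts.values()) + len(vocab)  # Laplace smoothing
        score = math.log(label_counts[label] / total)
        for tok in tokenize(code):
            if tok in vocab:  # ignore tokens never seen in training
                score += math.log((counts[tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical labeled input: 0 = good, 1 = bad.
training = [
    ("begin insert into t values (1); commit; end;", 0),
    ("begin update t set x = 1; exception when no_data_found then raise; end;", 0),
    ("begin delete from t; exception when others then null; end;", 1),
    ("begin p; exception when others then null; end;", 1),
]
testing = [
    ("begin q; exception when others then null; end;", 1),
    ("begin insert into t values (2); commit; end;", 0),
]

model = train(training)

# Verify: compare predictions with the known labels of the held-out set.
correct = sum(predict(model, code) == label for code, label in testing)
accuracy = correct / len(testing)
```

With only four training items the result says nothing about real code quality; the point is the shape of the process: tokenize, train on labeled input, verify on held-out labeled input, then apply `predict` to unlabeled input.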
![Page 25: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/25.jpg)
Defining features: an easy task?
![Page 26: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/26.jpg)
Kittens vs …
![Page 27: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/27.jpg)
Kittens vs Puppies
![Page 28: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/28.jpg)
PL/SQL code features
• Automatically extract words from the text as features (tokenize)
  – EASY TO AUTOMATE
• Assign features intelligently
  – Code size
  – Author
  – Percent of comment lines
  – Presence of specific code patterns
  – DIFFICULT TO AUTOMATE
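
To make the two kinds of features concrete, here is a minimal Python sketch that computes both for one code unit: automatic word tokens, plus a few hand-picked features from the list above. The function name, the feature names and the sample code are hypothetical, and the pattern check is a naive substring match rather than a real PL/SQL parser.

```python
import re


def extract_features(code):
    """Return automatic (token) and hand-crafted features for one code unit."""
    lines = [ln.strip() for ln in code.splitlines() if ln.strip()]
    comment_lines = [ln for ln in lines if ln.startswith("--")]
    # Collapse whitespace so the pattern matches across line breaks.
    normalized = " ".join(code.lower().split())
    return {
        # Easy to automate: every word becomes a feature.
        "tokens": re.findall(r"[a-z_][a-z0-9_]*", code.lower()),
        # Harder to automate: someone has to decide these are meaningful.
        "code_size": len(code),
        "pct_comment_lines": 100.0 * len(comment_lines) / len(lines) if lines else 0.0,
        "has_when_others_null": "when others then null" in normalized,
    }


# Hypothetical sample code unit.
sample = """-- hypothetical sample procedure
begin
  update t set x = 1;
exception
  when others then null;
end;
"""

features = extract_features(sample)
```

The author feature is omitted because it comes from metadata rather than from the code text itself.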
![Page 29: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/29.jpg)
Classification model workflow
1. Create Oracle Text policy (define lexer)
2. Configure and build the model on training set
3. Apply model to the testing set
4. Assess model performance
5. Adjust model settings/features/size and repeat
![Page 30: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/30.jpg)
Basic probability lesson
• p(A) is the probability that A is true
[Diagram: total area is 1, divided into regions "A is true" and "A is false"]
![Page 31: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/31.jpg)
Basic probability lesson
• p(A) is the probability that A is true
• Axioms of Probability
![Page 32: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/32.jpg)
Basic probability lesson
• p(A) is the probability that A is true
• Axioms of Probability
• Bayes Law
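
The formulas on these slides were images. For reference, the standard textbook statements are given below; the notation is an assumption and may differ from what the slides showed.

```latex
% Axioms of probability
0 \le p(A) \le 1, \qquad p(\text{true}) = 1, \qquad p(\text{false}) = 0

p(A \lor B) = p(A) + p(B) - p(A \land B)

% Conditional probability and Bayes' Law
p(B \mid A) = \frac{p(A \land B)}{p(A)}
\quad\Longrightarrow\quad
p(B \mid A) = \frac{p(A \mid B)\, p(B)}{p(A)}
```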
![Page 33: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/33.jpg)
How can Bayes' Law work for us?
• A – presence of a feature like WHEN OTHERS THEN NULL in PL/SQL
• B – bad PL/SQL code
[Diagram: regions A and B within a total area of 1, with B|A marked]
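
A worked example may help here. The Python sketch below applies Bayes' Law to the two events above, with A as "the code contains WHEN OTHERS THEN NULL" and B as "the code is bad". All probabilities are hypothetical numbers chosen for illustration, not measurements.

```python
def posterior(p_a_given_b, p_b, p_a_given_not_b):
    """Bayes' Law: p(B|A) = p(A|B) * p(B) / p(A).

    p(A) is expanded with the law of total probability:
    p(A) = p(A|B) * p(B) + p(A|not B) * (1 - p(B)).
    """
    p_a = p_a_given_b * p_b + p_a_given_not_b * (1 - p_b)
    return p_a_given_b * p_b / p_a


# Hypothetical numbers, for illustration only.
p_bad = 0.2               # p(B): share of code units that are bad
p_pattern_if_bad = 0.5    # p(A|B): bad code containing the pattern
p_pattern_if_good = 0.05  # p(A|not B): good code containing the pattern

p_bad_if_pattern = posterior(p_pattern_if_bad, p_bad, p_pattern_if_good)
```

Under these assumed numbers, seeing the pattern raises the probability that the code is bad from 20% to roughly 71%. A naive Bayes classifier combines many such features, treating them as independent.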
![Page 34: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/34.jpg)
PL/SQL data source
• OBJECT_ID – case ID
• CODE – text column
• TARGET_VALUE – 0 is good and 1 is bad
• Training set – where mod(object_id, 10) < 5
• Testing set – where mod(object_id, 10) >= 5
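
The `mod(object_id, 10)` predicates give a deterministic, roughly 50/50 split on the case ID. A minimal Python sketch of the same rule follows; the ID range is hypothetical, and Python's `%` matches Oracle's `MOD` only for non-negative IDs.

```python
def split_by_case_id(object_ids):
    """Split case IDs the same way as the two WHERE clauses."""
    training = [oid for oid in object_ids if oid % 10 < 5]
    testing = [oid for oid in object_ids if oid % 10 >= 5]
    return training, testing


# Hypothetical range of object IDs.
training_ids, testing_ids = split_by_case_id(range(1000, 1020))
```

Because the rule depends only on the ID, a given object always lands in the same set, and no object appears in both.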
![Page 35: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/35.jpg)
Oracle Text policy

    begin
      -- Drop any existing policy and lexer preference; ignore "does not exist" errors
      begin
        ctx_ddl.drop_policy('plsql_nb_policy');
      exception when others then null;
      end;
      begin
        ctx_ddl.drop_preference('plsql_nb_lexer');
      exception when others then null;
      end;
      -- Define the lexer and the policy that uses it
      ctx_ddl.create_preference('plsql_nb_lexer', 'BASIC_LEXER');
      ctx_ddl.create_policy('plsql_nb_policy', lexer => 'plsql_nb_lexer');
    end;
    /
![Page 36: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/36.jpg)
Model settings

    CREATE TABLE plsql_nb_settings (
      setting_name  VARCHAR2(30),
      setting_value VARCHAR2(4000));

    BEGIN
      -- Populate settings table
      INSERT INTO plsql_nb_settings VALUES
        (dbms_data_mining.algo_name, dbms_data_mining.algo_naive_bayes);
      INSERT INTO plsql_nb_settings VALUES
        (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_on);
      INSERT INTO plsql_nb_settings VALUES
        (dbms_data_mining.odms_text_policy_name, 'plsql_nb_policy');
      -- INSERT INTO plsql_nb_settings VALUES
      --   (dbms_data_mining.NABS_PAIRWISE_THRESHOLD,0.01);
      -- INSERT INTO plsql_nb_settings VALUES
      --   (dbms_data_mining.NABS_SINGLETON_THRESHOLD,0.01);
      COMMIT;
    END;
    /
![Page 37: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/37.jpg)
Build model

    DECLARE
      xformlist dbms_data_mining_transform.TRANSFORM_LIST;
    BEGIN
      BEGIN
        DBMS_DATA_MINING.DROP_MODEL('PLSQL_NB');
      EXCEPTION WHEN OTHERS THEN NULL;
      END;

      dbms_data_mining_transform.SET_TRANSFORM(
        xformlist, 'code', null, 'code', null, 'TEXT(TOKEN_TYPE:NORMAL)');

      DBMS_DATA_MINING.CREATE_MODEL(
        model_name          => 'PLSQL_NB',
        mining_function     => dbms_data_mining.classification,
        data_table_name     => 'plsql_build',
        case_id_column_name => 'object_id',
        target_column_name  => 'target_value',
        settings_table_name => 'plsql_nb_settings',
        xform_list          => xformlist);
    END;
    /
![Page 38: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/38.jpg)
Test model

    SELECT target_value AS actual_target,
           PREDICTION(plsql_nb USING *) AS predicted_target,
           COUNT(*) AS cases_count
      FROM plsql_test
     GROUP BY target_value, PREDICTION(plsql_nb USING *)
     ORDER BY 1, 2;
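
The query returns a confusion matrix: one row per (actual, predicted) pair with its case count. A minimal Python sketch of turning those counts into the usual quality measures follows. The counts are hypothetical, not results from the talk's demo, and "bad" (1) is treated as the positive class.

```python
def classification_metrics(confusion):
    """Compute metrics from {(actual_target, predicted_target): cases_count}."""
    tp = confusion.get((1, 1), 0)  # bad code predicted bad
    tn = confusion.get((0, 0), 0)  # good code predicted good
    fp = confusion.get((0, 1), 0)  # good code predicted bad
    fn = confusion.get((1, 0), 0)  # bad code predicted good
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Hypothetical query output, keyed by (actual_target, predicted_target).
rows = {(0, 0): 80, (0, 1): 10, (1, 0): 5, (1, 1): 25}
metrics = classification_metrics(rows)
```

Accuracy alone can mislead when bad code is rare, so precision and recall for the "bad" class are worth watching when adjusting the model settings.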
![Page 39: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/39.jpg)
Demo
![Page 40: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/40.jpg)
![Page 41: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/41.jpg)
Skyline and Oculus by Etsy: black-box anomaly detection
![Page 42: Introduction to Machine Learning for Oracle Database Professionals](https://reader036.vdocument.in/reader036/viewer/2022081717/55632f9bd8b42ad7398b538b/html5/thumbnails/42.jpg)
Thanks and Q&A
Contact info
+1-877-PYTHIAN
To follow us
pythian.com/blog
@alexgorbachev @pythian
linkedin.com/company/pythian