Applying Machine Learning to Aerospace Training
Mikhail Klassen Chief Data Scientist
Royal Aeronautical Society Conference Simulation-Based Training in the Digital Generation
London, UK, 11–12 November 2015
Background in computational astrophysics and the study of star formation.
Ph.D. (Almost), McMaster University
B.Sc., Columbia University, Applied Physics & Applied Mathematics
Data Scientist Paladin:Paradigm Knowledge Solutions
Mikhail Klassen
Artist’s conception of a newborn star
Supercomputer simulation of star birth from Klassen et al. (2015, in prep.)
Data Science
Data science is a relatively new interdisciplinary field combining skills from:
• Mathematics, statistics
• Computer science, artificial intelligence, data mining
• Data visualization, databases
Teaching Machines to “Learn”
Supervised Learning
• Developing a statistical model that gets better the more examples provided to it
• Examples: Automatic classification, image recognition, handwriting digitization
Teaching Machines to “Learn”
Unsupervised Learning
• Automatic pattern extraction
• Examples: clustering, personalized recommendations
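As a minimal sketch of clustering, the pure-Python k-means below groups one-dimensional scores around two centroids without being told which group each score belongs to. The scores, the starting centroids, and the function name `kmeans_1d` are illustrative assumptions, not material from the talk.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids (Lloyd's algorithm)."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)), key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical simulator scores that fall into two visible groups.
scores = [52, 55, 58, 60, 85, 88, 90, 93]
centers, groups = kmeans_1d(scores, centroids=[50, 100])
print(centers, groups)
```

No labels are supplied at any point; the two groups emerge from the distances between the scores alone, which is what distinguishes this from the supervised case.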
What is Big Data?
“Big Data” refers to the exponential growth in data…
• …Volume: data sets are too large to fit in standard memory and challenge typical available storage
• …Velocity: data streams (e.g. Twitter, stock prices) pose challenges for real-time analysis
• …Variety: a mixture of structured and unstructured data poses challenges for database paradigms
Big Data in Aerospace
• Aircraft and other aerospace products are some of the most instrumented products in the world
• Etihad is using big data analytics to measure pilot aptitude
• GE sponsored a competition to optimize flight routes
• PASSUR Aerospace created RightETA to better predict arrival time at airports
Competency-Based Training
• Competency-based training is an approach to teaching and learning applicable wherever a subject can be finely decomposed into discrete skills and concepts, and where the mastery of these can be measured.
• In aerospace, this is in contrast with some traditional approaches that required reaching prescribed time quotas in a simulator or in the air
Measuring Achievement
The challenges include selecting the right metrics and knowing how to measure.
• Subject matter experts still vital
Approaches to measurement
• Item Response Theory
• Bayesian Knowledge Tracing
Item Response Theory
• Item Response Theory is a way of ‘measuring’ the skill level of a trainee based on their responses to assessment problems
• Does not assume that every assessment is equal
  ‣ Variable difficulty
  ‣ Variable discriminatory power
[Figure: Probability of Answering Correctly (0.0 to 1.0) versus Student Ability θ (-6 to 6), one curve per Problem Difficulty: b = -1 (Easy), b = 0 (Average), b = 1 (Hard)]
$$P(\theta) = \frac{1}{1 + e^{-(\theta - b)}}$$
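The effect of problem difficulty can be checked numerically. This is a minimal sketch assuming the one-parameter logistic form P(θ) = 1 / (1 + e^-(θ - b)) from the figure; the function name `p_correct` is an illustrative choice.

```python
import math


def p_correct(theta, b):
    """Probability of a correct answer for ability theta on a problem of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


# Evaluate the three difficulty levels from the figure for an average trainee (theta = 0).
for label, b in [("Easy", -1.0), ("Average", 0.0), ("Hard", 1.0)]:
    print(f"{label:8s} b = {b:+.0f}  P(correct) = {p_correct(0.0, b):.3f}")
```

When ability equals difficulty (θ = b) the probability is exactly one half, so raising b shifts the whole curve to the right without changing its shape.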
[Figure: Probability of Answering Correctly (0.0 to 1.0) versus Student Ability θ (-6 to 6), one curve per Problem Discrimination: a = 0.1, a = 1.0, a = 5.0]
$$P(\theta) = \frac{1}{1 + e^{-a\theta}}$$
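Combining discrimination a with difficulty b gives the usual two-parameter logistic model, P(θ) = 1 / (1 + e^-a(θ - b)), and a trainee's ability can then be estimated from a set of responses. The sketch below does this with a plain grid-search maximum-likelihood estimate; the grid search is a deliberately simple stand-in rather than the estimator any particular IRT package uses, and the item parameters and response patterns are hypothetical.

```python
import math


def p_correct(theta, a=1.0, b=0.0):
    """Two-parameter logistic model: discrimination a, difficulty b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


def estimate_ability(responses, grid=None):
    """Grid-search maximum-likelihood estimate of ability theta.

    responses: list of (a, b, correct) tuples, one per assessment item.
    """
    if grid is None:
        grid = [i / 100.0 for i in range(-600, 601)]  # theta from -6 to 6

    def log_likelihood(theta):
        total = 0.0
        for a, b, correct in responses:
            # Clamp to avoid log(0) when the curve saturates numerically.
            p = min(max(p_correct(theta, a, b), 1e-12), 1.0 - 1e-12)
            total += math.log(p if correct else 1.0 - p)
        return total

    return max(grid, key=log_likelihood)


# Hypothetical items (a, b) with two response patterns.
strong = [(1.0, -1.0, True), (1.0, 0.0, True), (1.0, 1.0, False)]
weak = [(1.0, -1.0, True), (1.0, 0.0, False), (1.0, 1.0, False)]
theta_strong = estimate_ability(strong)
theta_weak = estimate_ability(weak)
print(theta_strong, theta_weak)
```

A high-discrimination item (large a) changes the likelihood sharply near its difficulty, so it carries more information about trainees close to that level than a flat, low-discrimination item does.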
Knowledge Tracing
• Does not assume that a single parameter characterizes the trainee’s entire ability
• Instead, a trainee is measured against many individual skills or ‘knowledge components’
• After each assessment, the probability that the trainee has learned the skill is updated in a Bayesian way
• Over many assessments, we can build a clear picture of the trainee’s mastery of many discrete skills
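[Diagram: two states, Not Learned and Learned. The trainee starts in Learned with probability P(L0) and moves from Not Learned to Learned with probability P(T). A correct answer occurs with probability P(G) from Not Learned and 1 - P(S) from Learned.]
P(T): Probability the skill was learned at each opportunity to use it
P(L0): Probability the student had previously learned this skill
P(G): Probability the student will guess correctly if the skill is not known
P(S): Probability the student will ‘slip’ if the skill is already known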
Bayesian Knowledge Tracing
The Equations
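A minimal sketch of the standard Bayesian Knowledge Tracing update, written in terms of the four parameters P(L0), P(T), P(G) and P(S). The function name `bkt_update`, the parameter values, and the response sequence are hypothetical and chosen only for illustration.

```python
def bkt_update(p_learned, correct, p_transit, p_guess, p_slip):
    """Return the updated probability that the skill is learned after one response."""
    if correct:
        # A correct answer comes from knowing the skill and not slipping,
        # or from not knowing it and guessing.
        evidence = p_learned * (1.0 - p_slip)
        posterior = evidence / (evidence + (1.0 - p_learned) * p_guess)
    else:
        # An incorrect answer comes from knowing the skill and slipping,
        # or from not knowing it and failing to guess.
        evidence = p_learned * p_slip
        posterior = evidence / (evidence + (1.0 - p_learned) * (1.0 - p_guess))
    # Allow for learning at this opportunity, with probability P(T).
    return posterior + (1.0 - posterior) * p_transit


p = 0.3  # hypothetical P(L0)
history = []
for correct in [True, True, False, True]:
    p = bkt_update(p, correct, p_transit=0.1, p_guess=0.2, p_slip=0.1)
    history.append(p)
print([round(x, 3) for x in history])
```

Each correct response raises the estimate and each incorrect response lowers it, but never to 0 or 1, because guesses and slips always leave some doubt.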
Cohort Analysis
• When you already have training data for hundreds of candidates, you can use supervised learning models to find predictors for candidate success
• In our research on pilot e-learning, we use supervised learning to predict completion rates
• With each successive cohort, you get better results, and more predictive power
Primer on Predictive Analytics
The decision tree algorithm repeatedly splits a data set on input variables (“features”), selecting and giving primacy to those features with the most discriminative power.
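The core step, choosing the single most discriminative split, can be sketched in a few lines. This sketch uses Gini impurity as the split criterion, which is one common choice and an assumption here; the trainee records and the pass/fail labels are hypothetical.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Return (feature_index, threshold, weighted_impurity) for the best single split."""
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue  # a split must put records on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


# Hypothetical trainees: (module 1 score, module 2 score) and pass/fail outcome.
rows = [(90, 68), (85, 90), (87, 75), (70, 88), (72, 60), (94, 81)]
labels = [False, True, True, True, False, True]
feature, threshold, impurity = best_split(rows, labels)
print(feature, threshold, impurity)
```

In this made-up data the module 2 score separates the outcomes cleanly while the module 1 score does not, so the split lands on feature index 1. A full tree repeats the same search on each resulting subset.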
ID    | Trainee Name | … | Performance: Module 1 | Performance: Module 2 | Performance: Module 3 | … | Flight time (hours) | Predicted Final Evaluation
10234 | John Doe     | … | 90%                   | 68%                   | 80%                   | … | …                   | 85%
10235 | Jane Philips | … | 85%                   | 90%                   | 86%                   | … | …                   | 87%
10236 | Sam Wilson   | … | 87%                   | 75%                   | 91%                   | … | …                   | 79%
…     | …            | … | …                     | …                     | …                     | … | …                   | …
• Through comparison against past cohorts, these types of regression algorithms can predict final scores, even as the candidate is still mid-training
• This allows for early identification of weaknesses
• Because the feature weights of various training inputs have already been calculated, the system knows where remedial action is most effective
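A deliberately reduced, single-feature sketch of this idea: fit a least-squares line to a past cohort's module scores and final scores, then predict the final score of a trainee who is still mid-training. The cohort numbers are hypothetical, and a real system would use many features rather than one.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical past cohort: module score (%) and final evaluation (%).
module_scores = [60, 70, 80, 90]
final_scores = [65, 72, 79, 86]
slope, intercept = fit_line(module_scores, final_scores)

# Predict for a current trainee who scored 75% on the module.
predicted = slope * 75 + intercept
print(round(slope, 3), round(intercept, 3), round(predicted, 3))
```

The fitted slope plays the role of a feature weight: the larger it is, the more a gain on that module is expected to move the final evaluation.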
Analytics Engine
Adaptive Training
Assessing Potential Competence
KC1: 84%, KC2: 90%, KC3: 77%, KC4: 78%, KC5: 54%, KC6: 71%, …
Through evaluation across a range of core skills, knowledge tracing algorithms can identify areas for remediation or certify a candidate.
This is how competency-based training could work.
Data on career performance can then inform training metrics.
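The remediate-or-certify decision can be sketched as a simple threshold check over the knowledge component estimates. The six values are the ones shown on the slide; the 75% mastery threshold and the function name `needs_remediation` are hypothetical.

```python
MASTERY_THRESHOLD = 0.75  # hypothetical cut-off, not taken from the talk

# Knowledge component mastery estimates as shown on the slide.
mastery = {
    "KC1": 0.84, "KC2": 0.90, "KC3": 0.77,
    "KC4": 0.78, "KC5": 0.54, "KC6": 0.71,
}


def needs_remediation(mastery, threshold=MASTERY_THRESHOLD):
    """Return the components below the threshold, weakest first."""
    weak = [kc for kc, p in mastery.items() if p < threshold]
    return sorted(weak, key=mastery.get)


to_remediate = needs_remediation(mastery)
certified = not to_remediate  # certify only when nothing needs remediation
print(to_remediate, certified)
```

Ordering the weak components from lowest to highest gives an adaptive system an obvious place to start remedial training.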
Knowledge Components
Admission & Recruitment
Why would you want to use predictive analytics in admissions, hiring or recruitment?
• Avoid bias
• Predict outcome
Promises & Perils
[Bar chart: percentage of job performance explained by each assessment method, axis 0 to 30%]
Unstructured interviews: 14%
Reference checks: 7%
Number of years of work experience: 3%
Work sample test: 29%
General cognitive test: 26%
Structured interview: 26%
Adapted from Work Rules! by Laszlo Bock, Senior Vice President of People Operations at Google
Conclusion
• Machine learning and other AI-based systems are disrupting many industries and bringing us smarter, more targeted products and services
• Education & training are already feeling the wave of these technologies and will be dramatically transformed by them
• Data-driven adaptive training will become the industry standard as we move towards competency-based training