The Frontiers of Machine Learning in Health and Behavioral Science
Tufts University, Nov 15, 2012 Benjamin M. Marlin
Advanced Machine Learning Methods for Mobile Health Research
Benjamin M. Marlin
College of Information and Computer Sciences
University of Massachusetts [email protected]
Wireless Health Workshops Oct 25, 2016
Machine Learning Methods for Mobile Health Research
Wireless Health – Oct 25, 2016 Benjamin M. Marlin
Affiliations
Center for Data Science
Collaborators and Sponsors
Dr. Robert Malison, Yale University School of Medicine
Malai Natarajan, UMass CS PhD candidate
Abhinav Parate, UMass CS PhD graduate, now at HP Labs
Roy Adams, UMass CS PhD candidate
Prof. Edison Thomaz, UT Austin, ECE
Prof. Santosh Kumar, U of Memphis, Computer Science
Nazir Saleheen, U of Memphis PhD candidate, Computer Science
Prof. Deepak Ganesan, UMass Amherst, Computer Science
Wearable Sensors
Accelerometer
ECG
Temperature
Respiration
Accelerometer
GPS
Gyroscope
Magnetometer
Microphone
Scene Camera
Gaze Tracker
Eye Metrics
Accelerometer
GPS
GSR
Gyroscope
PPG
Part 1: The Role of Machine Learning in Mobile Health Research
Part 2: A Primer on Detection and Detector Learning
Part 3: Leveraging Domain Knowledge with Structured Prediction
Part 4: Addressing Lab-to-field Generalization Through Domain Adaptation
Part 1: The Role of Machine Learning in Mobile Health Research
Data
Patient Population
Health & Behavior
Sensor Systems
Data Analytics
Analytics Questions
• Prediction: What will happen in the future?
• Detection: What is happening right now?
• Control: How to direct a process to a desired state?
• Causation: What is the causal structure of a process?
Detectors
Data
Sensors
Subject
Monitoring
Detectors
Data
Sensors
Subject
Interventions
Detectors
Data
Sensors
Subject
Causal Analysis
EMA/Labs
Part 2: A Primer on Detection and Detector Learning
The Detection Problem: Suppose we have a dynamical system in which an event of interest either occurs or does not occur at each time instant $t$. Given a feature vector $x_t \in \mathbb{R}^D$ that partially describes the state of the system at time $t$, infer whether or not the event occurred at time $t$.
Feature Space
[Figure: feature space with axes Duration and Stretch]
A Detection Function: Given a feature vector $x_t \in \mathbb{R}^D$ that partially describes the state of a dynamical system at time $t$, a detection function $f: \mathbb{R}^D \to \{0,1\}$ maps it to a binary output: 0 indicates that the event of interest did not occur, and 1 indicates that it did occur.
Detection Function in Feature Space
[Figure: feature space with axes Duration and Stretch]
Detection Model: A detection model is a set of functions $F$ where, for each $f \in F$, $f: \mathbb{R}^D \to \{0,1\}$. It is typical for the set $F$ to consist of a single parametric function $f(x,w) = f_w(x)$ that depends on a vector of parameters $w \in \mathbb{R}^K$: $F = \{f_w \mid w \in \mathbb{R}^K\}$.
$$f_w(x) = \begin{cases} 1 & \text{if } \sum_{d=1}^{D} w_d x_d > w_0 \\ 0 & \text{otherwise} \end{cases}$$
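The linear threshold model above can be sketched in a few lines of Python. This is only an illustration: the function name `linear_detector` and the example weights and feature values are made up, not taken from the talk.

```python
from typing import Sequence


def linear_detector(x: Sequence[float], w: Sequence[float], w0: float) -> int:
    """Return 1 if the weighted feature sum exceeds the threshold w0, else 0."""
    score = sum(wd * xd for wd, xd in zip(w, x))
    return 1 if score > w0 else 0


# Hypothetical two-feature example; the weights and threshold are invented.
w = [0.5, 1.0]
w0 = 2.0
print(linear_detector([1.0, 2.0], w, w0))  # weighted sum 2.5 exceeds w0
print(linear_detector([1.0, 1.0], w, w0))  # weighted sum 1.5 does not
```

Each choice of `w` and `w0` picks out one member of the model $F$.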
Detection Model in Feature Space
[Figure: the set F of detection functions shown in feature space]
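Detector Learning Problem: Given a data set $\mathcal{D} = \{(x_n, y_n) \mid 1 \le n \le N\}$ consisting of feature vectors $x_n \in \mathbb{R}^D$ and event labels $y_n \in \{0,1\}$, select a function $f: \mathbb{R}^D \to \{0,1\}$ from $F$ that maps feature vectors $x \in \mathbb{R}^D$ to their event labels as accurately as possible. For a parametric model $f_w$, this problem reduces to finding the best model parameters $w^*$, where $[\cdot]$ equals 1 when its argument is true and 0 otherwise:
$$w^* = \arg\min_w \sum_{n=1}^{N} [f_w(x_n) \neq y_n]$$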
Detector Learning Problem (general loss): The error count can be replaced by any per-example loss function, so that the best parameters $w^*$ minimize the total loss on the data set:
$$w^* = \arg\min_w \sum_{n=1}^{N} \mathrm{loss}(f_w(x_n), y_n)$$
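As an illustration of minimizing a loss over parameters, the sketch below fits a linear detector with the logistic loss using plain batch gradient descent, then counts zero-one errors. The logistic loss, the learning rate, the step count, and the toy data are all assumptions chosen for the example; the slides do not prescribe a particular loss or optimizer here.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def score(x, w, b):
    """Linear score w.x + b (b plays the role of -w0)."""
    return sum(wd * xd for wd, xd in zip(w, x)) + b


def fit_logistic(X, y, lr=0.1, steps=500):
    """Minimize the mean logistic loss by batch gradient descent."""
    D, N = len(X[0]), len(X)
    w, b = [0.0] * D, 0.0
    for _ in range(steps):
        # Gradient of the logistic loss w.r.t. the score is (p - y).
        errs = [sigmoid(score(xn, w, b)) - yn for xn, yn in zip(X, y)]
        grad_w = [sum(e * xn[d] for e, xn in zip(errs, X)) / N for d in range(D)]
        grad_b = sum(errs) / N
        w = [wd - lr * g for wd, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def detect(x, w, b):
    return 1 if score(x, w, b) > 0 else 0


def zero_one_errors(X, y, w, b):
    """Number of examples whose predicted label differs from the true label."""
    return sum(detect(xn, w, b) != yn for xn, yn in zip(X, y))


# Made-up, linearly separable one-feature data set.
X = [[-2.0], [-1.0], [1.0], [2.0]]
y = [0, 0, 1, 1]
w, b = fit_logistic(X, y)
print(zero_one_errors(X, y, w, b))
```

A smooth surrogate such as the logistic loss is commonly used because the zero-one error count itself has no useful gradient.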
Detection as Classification
Supervised (learning to detect and predict): Classification, Regression
Unsupervised (learning to organize and represent): Clustering, Dimensionality Reduction
Detector Learning Example
[Figure: feature space with axes Duration and Stretch]
Full Detector Learning Process
Training: Collect Data, Clean Data, Collect Labels, Extract Features, Choose Model, Learn Detector
Testing: Collect Data, Clean Data, Collect Labels, Extract Features, Apply Detector, Evaluate Performance
Deploy: Collect Data, Clean Data, Extract Features, Apply Detector, then Monitoring, Intervention, or Analysis
Detector Learning Challenges in mHealth
Driving Question: How can we increase the accuracy of learned detectors while keeping energy, cost, and subject burden low?
Machine Learning Solutions
1. Structured Prediction: Leverage domain knowledge regarding the structure of event labels to improve detection accuracy.
2. Domain Adaptation: Use learning protocols that can account for limited ecological validity to improve lab-to-field generalization.
Part 3: Leveraging Domain Knowledge with Structured Prediction
Basic Ideas
• In many time series detection problems, there is structure in the event label values.
• If we can identify the constraints that this structure imposes on sequences of event labels, we can improve detection accuracy by jointly detecting events at multiple times.
• This is a detection strategy best suited for offline data analysis.
Types of Structures
• Coherence: Event labels that are close in time are more likely to take the same value.
• Transitions: Transitions between some pairs of event types (e.g., A-B) are more likely than transitions between other types (e.g., A-C).
• Sequences: Events are more likely to occur in some sequences than others (e.g., A-B-C-A-B-…).
• Hierarchy: Occurrence of high-level activities changes the likelihood that different types of lower-level events occur.
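Transition structure of this kind can be checked directly on a labeled sequence. The sketch below estimates empirical transition probabilities from one label sequence; the function name and the example sequence are invented for illustration.

```python
from collections import Counter, defaultdict


def transition_probabilities(labels):
    """Estimate P(next label | current label) from one label sequence."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(labels, labels[1:]):
        counts[prev][nxt] += 1
    probs = {}
    for prev, c in counts.items():
        total = sum(c.values())
        probs[prev] = {nxt: n / total for nxt, n in c.items()}
    return probs


# Made-up sequence following an A-B-C cycle with repeated labels.
seq = list("AABBCAABBC")
probs = transition_probabilities(seq)
print(probs["A"])          # A is followed only by A or B
print("C" in probs["A"])   # an A-C transition never occurs in this sequence
```

Repeated labels (coherence) show up as large self-transition probabilities, and preferred transitions show up as missing or rare off-diagonal entries.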
Case Study 1: Preferred Transitions for ECG
Natarajan, Annamalai, Edward Gaiser, Gustavo Angarita, Robert Malison, Deepak Ganesan, and Benjamin Marlin. "Conditional Random Fields for Morphological Analysis of Wireless ECG Signals." Proceedings of the 5th Annual conference on Bioinformatics, Computational Biology and Health Informatics. Newport Beach, CA: ACM, 2014.
Data Collection
Challenges
http://www.physionet.org/physiotools/ecgpuwave/
High uncertainty due to noise and strictly local reasoning lead to labeling errors that corrupt features.
Independent Detection
[Figure: graphical model with labels Y1 to Y5 and features X1 to X5; each label Yt is connected only to its own features Xt]
Proposed Solution: Structured Prediction
[Figure: graphical model with labels Y1 to Y5 and features X1 to X5]
[Figure: structured prediction example with labels Y1 to Y7, features X1 to X7, and label sequence P Q R S T N P]
Linear Chain CRF Probabilistic Model
Joint Label Distribution
Partition Function
Linear Chain CRF Probabilistic Model
Energy Function
Feature potentials
Transition potentials
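For reference, a linear-chain CRF is commonly written in the following textbook form. The notation is an assumption made for this sketch ($w^{F}$ for feature weights, $w^{T}$ for transition weights, $L$ for sequence length) and may differ from the notation on the slides.

```latex
% Joint label distribution and partition function
P(y \mid x) = \frac{\exp\bigl(-E(y, x)\bigr)}{Z(x)},
\qquad
Z(x) = \sum_{y'} \exp\bigl(-E(y', x)\bigr)

% Energy function: feature potentials plus transition potentials
E(y, x) = -\sum_{t=1}^{L} \sum_{d=1}^{D} w^{F}_{y_t, d}\, x_{t,d}
          \;-\; \sum_{t=1}^{L-1} w^{T}_{y_t, y_{t+1}}
```

The first sum scores how well each label matches its own feature vector. The second sum scores each adjacent pair of labels, which is where preferred transitions enter the model.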
Linear Chain CRF Inference and Learning
• Both MAP and marginal inference take time linear in the sequence length, $O(L)$, using max-product and sum-product belief propagation respectively (Lafferty, McCallum, Pereira, 2001).
• We fit the model parameters using the standard maximum likelihood approach. The optimization problem is convex and relies on marginal inference as a sub-routine.
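A minimal sketch of MAP inference on a chain, written as max-product in score (log-potential) form, often called the Viterbi algorithm. The score tables below are made up, and the function is an illustration rather than the implementation used in the study. Its cost is linear in the sequence length and quadratic in the number of labels.

```python
def viterbi(feature_scores, transition_scores):
    """Return the highest-scoring label sequence for a chain model.

    feature_scores[t][c]: score of label c at position t.
    transition_scores[a][b]: score of moving from label a to label b.
    """
    L = len(feature_scores)
    C = len(feature_scores[0])
    best = [list(feature_scores[0])]  # best[t][c]: best score of a path ending in c
    back = []                         # back[t-1][c]: best predecessor of c at t
    for t in range(1, L):
        row, ptr = [], []
        for c in range(C):
            prev = max(range(C), key=lambda a: best[t - 1][a] + transition_scores[a][c])
            row.append(best[t - 1][prev] + transition_scores[prev][c] + feature_scores[t][c])
            ptr.append(prev)
        best.append(row)
        back.append(ptr)
    # Trace back from the best final label.
    path = [max(range(C), key=lambda c: best[-1][c])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Made-up scores: the middle position weakly prefers label 1,
# but the transition scores reward staying in the same label.
features = [[2.0, 0.0], [0.4, 0.6], [2.0, 0.0]]
transitions = [[1.0, -1.0], [-1.0, 1.0]]
independent = [max(range(2), key=lambda c: f[c]) for f in features]
print(independent)                      # per-position argmax
print(viterbi(features, transitions))   # joint decoding
```

In this toy case the per-position decision flips the weakly supported middle label, while joint decoding keeps the sequence coherent.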
Experimental Protocol: ECG Morphology
1. Over-generate candidate peak locations using a peak detector.
2. Manually label randomly chosen clusters of peaks for each subject (~3000 peaks/subject, ~10 peaks/cluster).
3. Split clusters into train and test sets. Train models on the train set. Evaluate accuracy on the test set.
4. Consider both within-subject and across-subject protocols, and both CRF and MLR models.
Results: ECG Labeling
Within Subjects Across Subjects
Results: ECG Labeling vs Training Set Size
Case Study 2: Hierarchical Detection with Gestures
Challenges:
• Wrist-worn actigraphy provides weak evidence for the occurrence of gestures of interest like eating and smoking.
• Activity sessions are characterized by higher-order interactions between gesture labels.
Adams, Roy, Nazir Saleheen, Edison Thomaz, Abhinav Parate, Santosh Kumar, and Benjamin Marlin. "Hierarchical Span-Based Conditional Random Fields for Labeling and Segmenting Events in Wearable Sensor Data Streams." International Conference on Machine Learning. 2016.
Proposed Model
Event variables
Segment variables
Example Joint Labeling and Segmentation
Adding Inter-Label Duration Segments
Global Segmentation Coordination Factors
Segmentation Constraint:
Nesting Constraint:
Probabilistic Model
Segmentation and Nesting Factors (l>1)
Joint Label and Segment Distribution
Probabilistic Model
Feature Factors
Event Cardinality Factors
Probabilistic Model
Positional Factors
Segment Label Alternation
Inference and Learning
MAP inference in this model can be performed using a dynamic program in $O(L^2)$ time, similar to the semi-Markov CRF (Sarawagi and Cohen 2004).
We fit the model parameters using loss-augmented maximum-margin methods (Tsochantaridis et al. 2005).
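To show where the quadratic cost comes from, here is a simplified semi-Markov style dynamic program that jointly segments and labels a sequence. It is a sketch under strong assumptions, not the hierarchical model from the talk: there are no label-transition, nesting, or cardinality factors, and `segment_score` and the toy scores are hypothetical.

```python
def best_segmentation(L, num_labels, segment_score):
    """Find the best-scoring segmentation of positions 0..L-1.

    segment_score(start, end, label) scores labeling positions
    start..end-1 as one segment. Returns (score, [(start, end, label), ...]).
    """
    best = [0.0] + [float("-inf")] * L   # best[j]: best score for prefix of length j
    back = [None] * (L + 1)
    for end in range(1, L + 1):          # the two nested loops over (start, end)
        for start in range(end):         # enumerate O(L^2) candidate segments
            for label in range(num_labels):
                s = best[start] + segment_score(start, end, label)
                if s > best[end]:
                    best[end] = s
                    back[end] = (start, label)
    segments = []
    end = L
    while end > 0:
        start, label = back[end]
        segments.append((start, end, label))
        end = start
    segments.reverse()
    return best[L], segments


# Made-up per-position label scores, plus a fixed cost per segment
# that discourages many short segments.
obs = [[1.0, 0.0], [1.0, 0.0], [0.4, 0.6], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]


def segment_score(start, end, label):
    return sum(obs[t][label] for t in range(start, end)) - 0.5


total, segments = best_segmentation(len(obs), 2, segment_score)
print(segments)
```

Here the weak evidence for label 1 at position 2 is absorbed into the surrounding label-0 segment, because opening extra segments costs more than it gains.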
Related Models
[Figure: graphical model diagrams for the three models below]
(a) Linear-chain CRF (LC-CRF)
(b) Tree-structured CRF (T-CRF)
(c) Hierarchical Nested Segmentation model (HNS)
Experimental Protocol
1. Gesture, inter-gesture, and session labels obtained from video record or direct observation.
2. Bottom-level segmentation of sensor time series into fixed-length windows (6s, eating), or using adaptive segmentation (smoking).
3. Extract features from windows.
4. Leave-one-subject-out evaluation protocol.
5. Compare to tree-structured CRF and MLR models.
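The leave-one-subject-out step in the protocol above can be sketched as follows. The function name and the subject identifiers are invented for illustration; each fold holds out every window from one subject and trains on the rest.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Made-up example: six windows from three subjects.
subject_ids = ["s1", "s1", "s2", "s3", "s3", "s3"]
for held_out, train, test in leave_one_subject_out(subject_ids):
    print(held_out, train, test)
```

Splitting by subject rather than by window keeps data from the same person out of both train and test sets, so the score reflects generalization to unseen subjects.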
Dataset Details
Dataset:  PM | T | MP | RQS
Behavior: Smoking | Eating | Smoking | Smoking
Modality: Chest band | Wrist accel. | Chest band | Wrist accel.
Event Labeling Results
[Figure: F1 scores for event labeling on each dataset]
Session Labeling Results
[Figure: F1 scores for session labeling on each dataset]
Part 4: Addressing Lab-to-field Generalization Through Domain Adaptation
Basic Ideas
• Due to the issues with collecting high-quality event labels in the field, many mHealth event detection studies collect carefully labeled data in the lab, and then deploy learned detectors in the field.
• Because ecological validity is often very limited in the lab, there is often a significant gap in performance (or even applicability) between the lab and the field.
• In machine learning, this problem is called domain shift. Lab data represent the source domain, while field data represent the target domain.
Types of Domain Shift
• Prior Probability Shift: The relative occurrence of different types of events changes from the source to the target domain.
• Covariate Shift: The distribution of feature values associated with an event type changes from the source domain to the target domain.
• Label Granularity Shift: The temporal granularity of the labels changes from the source to the target domain.
Case Study 4: Domain Adaptation for Cocaine Detection
Natarajan, Annamalai, Gustavo Angarita, Edward Gaiser, Robert Malison, Deepak Ganesan, and Benjamin Marlin. "Domain Adaptation Methods for Improving Lab-to-field Generalization of Cocaine Detection using Wearable ECG." 2016 ACM International Joint Conference on Pervasive and Ubiquitous Computing. 2016
Study Protocol
Prior Probability Shift
Lab Field
Covariate Shift
Lab Field
Assessing Covariate Shift
Correcting Domain Shift with Re-Weighting
We can correct for prior probability shift and covariate shift by re-weighting lab data to better match the statistics of field data when learning detectors:
$$w^* = \arg\min_w \sum_{n=1}^{N} \gamma(x_n, y_n)\,[f_w(x_n) \neq y_n]$$
where $\gamma(x_n, y_n)$ is the weight assigned to lab example $n$.
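For prior probability shift, one common choice of per-example weight is the ratio of the field label frequency to the lab label frequency. The sketch below computes such weights and a weighted error count. It is an illustration only: the function names are hypothetical, the field prior is assumed to be known or separately estimated, and the numbers are invented.

```python
from collections import Counter


def prior_shift_weights(lab_labels, field_prior):
    """Weight for label y: assumed field frequency of y / lab frequency of y."""
    counts = Counter(lab_labels)
    n = len(lab_labels)
    return {y: field_prior[y] / (counts[y] / n) for y in counts}


def weighted_error(predictions, labels, weights):
    """Sum of weights over misclassified lab examples."""
    return sum(weights[y] for p, y in zip(predictions, labels) if p != y)


# Made-up numbers: events are half of the lab data
# but are assumed to be only 10% of the field data.
lab_labels = [1, 1, 0, 0]
weights = prior_shift_weights(lab_labels, {1: 0.1, 0: 0.9})
print(weights)
print(weighted_error([1, 0, 1, 0], lab_labels, weights))
```

With these weights, a mistake on a non-event example counts more than a mistake on an event example, reflecting how much more often non-events are assumed to occur in the field.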
Prior Probability Shift Correction
[Figure: Lab × Weights = Field]
Covariate Shift Correction
[Figure: Lab × Weights = Field]
Calculating Multivariate Covariate Shift Weights
Label Granularity Shift
• In our study protocol, we have event labels for cocaine use that are accurate at the minute level.
• In the field, we have self-reported intervals of cocaine use, which are not reliable. We also have daily utox measurements, which are considered ground truth, but are temporally coarse.
• Both the lab cocaine event labels and the utox labels provide ground truth for presence of cocaine use, but there is a significant shift in the temporal granularity of the labels.
Label Granularity Shift Correction
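One simple way to bridge fine-grained detections and a day-level label is to aggregate window-level probabilities into a single per-day probability. The noisy-OR rule below is an assumption chosen purely for illustration and is not necessarily the correction used in the study. It treats windows as independent, which is a strong simplification.

```python
def day_level_probability(window_probs):
    """Noisy-OR: probability that at least one window in the day contains the
    event, treating the window-level detections as independent."""
    p_none = 1.0
    for p in window_probs:
        p_none *= 1.0 - p
    return 1.0 - p_none


# Made-up window-level detector outputs for one day.
print(day_level_probability([0.5, 0.5]))
print(day_level_probability([0.0, 0.0, 0.0]))
```

The day-level output can then be compared against a daily ground-truth label such as a urine test result.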
Study Details
• Zephyr BioHarness chest band sensor paired to a smartphone.
• Extracted 24 ECG features per 5-minute sliding window.
• Cocaine detection and urine test prediction model: $\ell_2$-penalized logistic regression.
• 37 field days (28 days with a positive urine test).
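The sliding-window step can be sketched as below. The 5-minute window length comes from the slide. The 60-second step, the function name, and the one-sample-per-minute timestamps are assumptions made for the example.

```python
def sliding_windows(timestamps, window_s=300, step_s=60):
    """Yield (window_start, sample_indices) for each full window.

    timestamps: sorted sample times in seconds.
    window_s: window length in seconds (300 s = 5 minutes).
    step_s: hop between window starts (assumed; not given in the source).
    """
    if not timestamps:
        return
    start = timestamps[0]
    while start + window_s <= timestamps[-1]:
        idx = [i for i, t in enumerate(timestamps) if start <= t < start + window_s]
        yield start, idx
        start += step_s


# Made-up timestamps: one sample per minute for ten minutes.
timestamps = list(range(0, 601, 60))
for start, idx in sliding_windows(timestamps):
    print(start, idx)
```

Features such as the 24 ECG features mentioned above would then be computed from the samples indexed by each window.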
Results
Conclusions
1. Probabilistic structured prediction models lead to performance improvements for mHealth problems by using global reasoning to combat uncertainty due to low-power, low-cost sensing.
2. Domain adaptation methods can help to mitigate ecological validity issues in lab-to-field study designs.
Thank You!