machine learning challenges for automated prompting in smart homes
Post on 14-Jun-2015
Machine Learning Challenges for Automated Prompting in Smart Homes
Barnan Das
May 22, 2014
2
Older adult (65+) population in US
2009: 40 million
2030: 72 million
3
5 million Alzheimer’s patients
15 million unpaid caregivers
60% of caregivers report stress
4
5
“Machine learning algorithms trained on smart home sensor data can predict when an individual faces difficulty while performing everyday activities.”
6
7
Smart Home Studies
                  Study 1         Study 2
Participants      400             180
Activities        8               6
Activity Errors   Naturalistic    Naturalistic
8
Overview
Automated Prompting
• Emulating Caregiver Prompt Timing (Study 1)
  • Imbalanced Class Distribution
  • Class Overlap
• Detecting Activity Errors in Real Time (Study 1, 2)
  • One-Class Classification
9
Emulating Caregiver Prompt Timing
Raw Data
• Study 1
• 8 daily activities
• Prompts issued when errors were committed
Used by Algorithms
• 1 activity step = 17 engineered features + class label (0/1) = 1 training example
• Binary class {prompt, no-prompt}
10
Total # training examples: 3980
Class distribution: 3.94% prompt class
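The two figures above imply a heavily skewed dataset. A minimal sketch of the arithmetic follows; the counts are rounded estimates derived from the reported percentage, not numbers stated on the slide.

```python
# Figures taken from the slide: 3980 training examples, 3.94% in the prompt class.
total_examples = 3980
prompt_fraction = 0.0394

# Approximate counts (only the percentage is reported, so these are rounded).
prompt_count = round(total_examples * prompt_fraction)
no_prompt_count = total_examples - prompt_count

# Imbalance ratio: majority examples per minority example.
imbalance_ratio = no_prompt_count / prompt_count

# A classifier that always predicts "no-prompt" would reach this accuracy
# while never issuing a single prompt.
trivial_accuracy = no_prompt_count / total_examples

print(prompt_count, no_prompt_count)
print(round(imbalance_ratio, 1), round(trivial_accuracy, 4))
```

This is why plain accuracy is a poor yardstick here: roughly 96% accuracy is available without detecting any prompt situation at all.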
12
Imbalanced Class Distribution
13
Existing Solutions
Preprocessing
Sampling
• Over-sampling the minority class
• Under-sampling the majority class
Oversampling
• Spatial location of training examples in Euclidean space
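The sampling ideas listed above can be sketched in a few lines. This is an illustrative sketch, not the implementations evaluated in the talk: the function names and toy points are assumptions, and `interpolate` only shows the Euclidean-space idea behind SMOTE-style oversampling without the nearest-neighbor search that real SMOTE performs.

```python
import random

def random_oversample(minority, majority, rng):
    """Duplicate randomly chosen minority examples until the classes are the same size."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return minority + extra

def random_undersample(minority, majority, rng):
    """Keep a random majority subset the same size as the minority class."""
    return rng.sample(majority, len(minority))

def interpolate(a, b, rng):
    """Create a synthetic point on the line segment between two minority points.

    Real SMOTE picks b among the k nearest minority neighbors of a;
    that neighbor search is skipped in this sketch.
    """
    gap = rng.random()
    return tuple(x + gap * (y - x) for x, y in zip(a, b))

rng = random.Random(0)  # fixed seed so the sketch is repeatable
minority = [(0.9, 1.0), (1.1, 0.8)]
majority = [(0.1 * i, 0.0) for i in range(10)]

oversampled_minority = random_oversample(minority, majority, rng)
undersampled_majority = random_undersample(minority, majority, rng)
synthetic = interpolate(minority[0], minority[1], rng)
```

Over-sampling by duplication adds no new information, and under-sampling discards majority data; interpolation creates new points but relies on distances in feature space being meaningful.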
14
Proposed Approach
Preprocessing technique to oversample the minority class
• Approximate the discrete probability distribution using Chow-Liu’s algorithm
• Generate new minority class data points using Gibbs sampling
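A minimal sketch of the first step, assuming discrete attributes: Chow-Liu's algorithm approximates a joint distribution with a dependence tree, built as a maximum spanning tree over pairwise mutual information. The helper names and toy data below are illustrative assumptions, not the code used in the talk.

```python
import math
from collections import Counter

def mutual_information(data, i, j):
    """Empirical mutual information between discrete columns i and j."""
    n = len(data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    count_ij = Counter((row[i], row[j]) for row in data)
    return sum(
        (c / n) * math.log((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )

def chow_liu_tree(data):
    """Return parent[k] for every attribute k (the root has parent None).

    Grows a maximum spanning tree over pairwise mutual information with
    Prim's algorithm, rooted at attribute 0.
    """
    d = len(data[0])
    parent = {0: None}
    while len(parent) < d:
        # Pick the strongest edge from the tree to an attribute not yet in it.
        _, i, j = max(
            (mutual_information(data, i, j), i, j)
            for i in parent
            for j in range(d)
            if j not in parent
        )
        parent[j] = i
    return [parent[k] for k in range(d)]

# Toy minority-class data: attribute 1 copies attribute 0, attribute 2 is only weakly related.
toy_data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
tree = chow_liu_tree(toy_data)
```

The resulting tree gives the factorization P(x) = ∏ P(x_k | x_parent(k)), which is what a Gibbs sampler can then draw from.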
17
Gibbs Sampling
[Figure labels: minority class samples, majority class samples, Markov chains]
18
RACOG & wRACOG
(wrapper-based) RApidly COnverging Gibbs Sampler
                     RACOG                               wRACOG
Sample selection     Pre-defined lag on Markov chain     Highest probability of misclassification by wrapper classifier
Stopping criteria    Pre-defined number of iterations    No improvement of a performance measure
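The RACOG settings above (sample selection by a pre-defined lag, stopping after a pre-defined number of iterations) can be sketched as a Gibbs sampler over a dependence tree. This is a simplified illustration under stated assumptions: the tree is passed in as a hypothetical `parent` list, probabilities use Laplace smoothing, and `burn_in` is an assumed extra parameter that the slide does not mention. The wrapper-based selection of wRACOG is not shown.

```python
import random
from collections import Counter

def fit_tables(data, parent):
    """Count tables for P(x_i | x_parent(i)) plus the value set of each attribute."""
    values = [sorted({row[i] for row in data}) for i in range(len(parent))]
    joint = [Counter() for _ in parent]  # counts of (parent value, own value)
    marg = [Counter() for _ in parent]   # counts of parent value
    for row in data:
        for i, p in enumerate(parent):
            pv = None if p is None else row[p]
            joint[i][(pv, row[i])] += 1
            marg[i][pv] += 1
    return values, joint, marg

def cond_prob(i, v, pv, values, joint, marg):
    """Laplace-smoothed estimate of P(x_i = v | parent value pv)."""
    return (joint[i][(pv, v)] + 1) / (marg[i][pv] + len(values[i]))

def gibbs_oversample(data, parent, iterations, burn_in, lag, rng):
    values, joint, marg = fit_tables(data, parent)
    children = [[c for c, p in enumerate(parent) if p == i] for i in range(len(parent))]
    new_samples = []
    for start in data:                      # one Markov chain per minority sample
        x = list(start)
        for t in range(1, iterations + 1):  # stop after a pre-defined number of iterations
            for i, p in enumerate(parent):
                pv = None if p is None else x[p]
                weights = []
                for v in values[i]:
                    # P(x_i = v | rest) is proportional to P(v | parent) * prod P(child | v).
                    w = cond_prob(i, v, pv, values, joint, marg)
                    for c in children[i]:
                        w *= cond_prob(c, x[c], v, values, joint, marg)
                    weights.append(w)
                x[i] = rng.choices(values[i], weights=weights)[0]
            if t > burn_in and (t - burn_in) % lag == 0:
                new_samples.append(tuple(x))  # keep every lag-th state after burn-in
    return new_samples

minority_data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1), (0, 0, 0), (1, 1, 1)]
assumed_parent = [None, 0, 1]  # hypothetical dependence tree: 0 -> 1 -> 2
synthetic_samples = gibbs_oversample(
    minority_data, assumed_parent, iterations=50, burn_in=10, lag=5, rng=random.Random(0)
)
```

The lag spaces out the retained states so that consecutive synthetic samples from one chain are less correlated.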
19
Experimental Setup
Datasets: Study 1 (Prompting), 9 UCI Datasets
Approaches: Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG, wRACOG
Classifiers: C4.5 Decision Tree, SVM, K-Nearest Neighbor, Logistic Regression
20
Results (True Positive Rate)
[Bar chart: true positive rate (0 to 1) for Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG, wRACOG]
21
Results (G-mean)
[Bar chart: G-mean (0 to 1) for Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG, wRACOG]
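For reference, the two measures reported in these results can be computed as follows. True positive rate is the fraction of minority (prompt) examples that are detected, and G-mean is the geometric mean of the true positive rate and the true negative rate. The toy labels are illustrative assumptions, and the sketch assumes both classes occur in `y_true`.

```python
import math

def rates(y_true, y_pred, positive=1):
    """Return (true positive rate, true negative rate)."""
    tp = fn = tn = fp = 0
    for t, p in zip(y_true, y_pred):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of the true positive and true negative rates."""
    tpr, tnr = rates(y_true, y_pred, positive)
    return math.sqrt(tpr * tnr)

y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
always_negative = [0] * 10                     # ignores the minority class entirely
mixed = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]         # finds one prompt, raises one false alarm

tpr_trivial, _ = rates(y_true, always_negative)
tpr_mixed, tnr_mixed = rates(y_true, mixed)
```

A classifier that never predicts the minority class scores a G-mean of zero regardless of its accuracy, which is why the measure suits imbalanced data.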
23
Class Overlap
24
Class Overlap in Prompting Data
3-dimensional PCA plot of prompting data
25
Tomek Links
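A Tomek link is a pair of examples from opposite classes that are each other's nearest neighbor, which marks them as borderline or overlapping points. A minimal brute-force sketch follows, with toy points chosen for illustration.

```python
import math

def nearest_neighbor(points, i):
    """Index of the point closest to points[i], excluding itself."""
    return min(
        (j for j in range(len(points)) if j != i),
        key=lambda j: math.dist(points[i], points[j]),
    )

def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest neighbors with different labels."""
    links = []
    for i in range(len(points)):
        j = nearest_neighbor(points, i)
        if i < j and labels[i] != labels[j] and nearest_neighbor(points, j) == i:
            links.append((i, j))
    return links

points = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.1, 0.0), (5.0, 5.0)]
labels = [0, 1, 0, 0, 1]
links = tomek_links(points, labels)
```

In this toy set only points 0 and 1 form a link: points 2 and 3 are mutual neighbors but share a class, and point 4 is nobody's nearest neighbor. The quadratic search is fine for a sketch but would need an index structure for large datasets.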
26
Cluster-Based Under-Sampling
• Form clusters
• Under-sample clusters
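One way to read the two steps above is sketched below. The slides name only the steps, so the selection rule here is an assumption: majority points are dropped from clusters that mix both classes and whose minority share reaches a hypothetical `threshold`. Cluster assignments are taken as given input, as if produced by a prior clustering step.

```python
from collections import defaultdict

def cluster_based_undersample(labels, cluster_ids, threshold, minority=1):
    """Return the sorted indices of the examples to keep.

    labels[i] is the class of example i; cluster_ids[i] is its cluster,
    assumed to come from an earlier clustering step.
    """
    members = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        members[cid].append(idx)

    keep = []
    for idxs in members.values():
        minority_share = sum(1 for i in idxs if labels[i] == minority) / len(idxs)
        if threshold <= minority_share < 1:
            # Mixed cluster with enough minority presence: drop its majority points.
            keep.extend(i for i in idxs if labels[i] == minority)
        else:
            # Pure cluster or mostly-majority cluster: leave untouched.
            keep.extend(idxs)
    return sorted(keep)

labels = [1, 1, 0, 0, 0, 0, 0, 1]
cluster_ids = [0, 0, 0, 1, 1, 1, 1, 1]
kept = cluster_based_undersample(labels, cluster_ids, threshold=0.5)
```

Compared with random under-sampling, this removes majority examples only where they overlap with the minority class, leaving the rest of the majority distribution intact.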
27
ClusBUS Ensemble
28
Experimental Setup
Dataset: Study 1 (Prompting)
Clustering Algorithm: DBSCAN
Approaches: Baseline, SMOTE, ClusBUS, ClusBUS Ensemble
Classifiers: C4.5 Decision Tree, Naive Bayes, K-Nearest Neighbor, SVM
29
Result (True Positive Rate)
[Bar chart: true positive rate (0 to 1) by classifier (C4.5, Naïve Bayes, IBk, SMO) for Baseline, SMOTE, ClusBUS, ClusBUS Ensemble]
30
Result (G-mean)
[Bar chart: G-mean (0 to 1) by classifier (C4.5, Naïve Bayes, IBk, SMO) for Baseline, SMOTE, ClusBUS, ClusBUS Ensemble]
32
Detecting Activity Errors in Real Time
• Sensor events labeled with activity steps
• Availability of information on activity errors
33
Basic Idea
[Diagram: a one-class classifier is trained on normal activity data from participants with no reported errors and tested on activity data with errors from participants who committed errors]
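The train-on-normal, test-on-errors setup can be illustrated with a deliberately simple stand-in. The detector below is not the one-class SVM used in the talk; it is a hypothetical centroid-and-radius model, chosen only to show that a one-class classifier sees no error examples during training.

```python
import math

class CentroidOneClass:
    """Toy one-class model: anything farther from the training centroid than
    most of the training data is labeled an error."""

    def fit(self, normal_points, quantile=0.95):
        dims = len(normal_points[0])
        n = len(normal_points)
        self.center = [sum(p[k] for p in normal_points) / n for k in range(dims)]
        distances = sorted(math.dist(p, self.center) for p in normal_points)
        # Radius that covers roughly the requested share of the training data.
        self.radius = distances[min(n - 1, int(quantile * n))]
        return self

    def predict(self, points):
        return [
            "normal" if math.dist(p, self.center) <= self.radius else "error"
            for p in points
        ]

# Training data: only normal activity (no error labels are needed).
normal_activity = [(0.0, 0.0), (0.1, 0.1), (-0.1, 0.0), (0.0, -0.1), (0.1, 0.0)]
model = CentroidOneClass().fit(normal_activity)

# Test data: a mix of normal-looking and unusual feature vectors.
predictions = model.predict([(0.05, 0.0), (3.0, 3.0)])
```

The design choice matters for this problem because error examples are rare and varied, while normal activity data is plentiful.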
34
DERT Data
Raw Data
• 580 participants
• 6 daily activities
• Annotated for error start times
Used by Algorithms
• 1 sensor event = >70 engineered features + class label (1) = 1 training example
• One-class {normal}
35
One-Class SVM
[Figure: illustration in a two-feature space with axes x1 and x2]
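For reference, the textbook one-class SVM formulation (Schölkopf et al.) is given below. The slides do not state which formulation, kernel, or parameter values were used, so the symbols here are assumptions: n training examples x_i, a feature map Φ, slack variables ξ_i, an offset ρ, and a parameter ν in (0, 1] that upper-bounds the fraction of training points treated as outliers.

```latex
\min_{w,\;\xi,\;\rho} \quad \frac{1}{2}\lVert w \rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_{i} - \rho
\qquad \text{subject to} \qquad
  w \cdot \Phi(x_{i}) \ge \rho - \xi_{i}, \quad \xi_{i} \ge 0, \quad i = 1,\dots,n

f(x) = \operatorname{sgn}\bigl(w \cdot \Phi(x) - \rho\bigr)
```

A test point with f(x) = +1 falls inside the learned region of normal data, and f(x) = -1 marks it as an outlier.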
36
Model Selection
37
Activity Error Classification
WHY? To characterize change in daily activities of older adults
HOW? Sensor data
           Error Types   Accuracy*
Study 1    4             73%
Study 2    9             54%
*Using C4.5 decision tree and 10-fold CV
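The 10-fold cross-validation mentioned in the footnote splits the data into ten parts, each used once for testing while the other nine are used for training. A minimal index-splitting sketch follows; it does not shuffle or stratify, and whether the original evaluation did either is not stated in the slides.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k contiguous folds over n examples."""
    # Spread the remainder so fold sizes differ by at most one.
    fold_sizes = [n // k + (1 if fold < n % k else 0) for fold in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n))
        yield train, test
        start += size

folds = list(k_fold_indices(n=25, k=10))
test_sizes = [len(test) for _, test in folds]
```

The reported accuracy is then the average over the ten test folds, so every example contributes to the estimate exactly once.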
41
Activity Error Models
One-Class Multi-Class
42
Ensembles
[Diagram: a test sample is passed to the One-Class SVM and to an error model (one-class or multi-class); the two outputs are combined with a logical AND to give Normal/Error]
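The logical AND combination can be sketched as follows. The reading assumed here is that a sample is reported as an error only when both the outlier detector and the error model flag it; the two toy detectors and their feature names are hypothetical stand-ins, not the trained models from the talk.

```python
def ensemble_label(sample, outlier_detector, error_model):
    """Report an error only when both components flag the sample (logical AND)."""
    flagged_by_both = outlier_detector(sample) and error_model(sample)
    return "error" if flagged_by_both else "normal"

# Hypothetical stand-ins for the one-class SVM and the activity error model.
def toy_outlier_detector(sample):
    return sample["seconds_on_step"] > 60

def toy_error_model(sample):
    return sample["unexpected_sensor_events"] >= 2

samples = [
    {"seconds_on_step": 20, "unexpected_sensor_events": 0},  # neither flags
    {"seconds_on_step": 90, "unexpected_sensor_events": 0},  # only the outlier detector flags
    {"seconds_on_step": 90, "unexpected_sensor_events": 3},  # both flag
]
ensemble_labels = [
    ensemble_label(s, toy_outlier_detector, toy_error_model) for s in samples
]
```

Requiring agreement trades some recall for fewer false alarms, since an outlier that the error model does not recognize is passed through as normal.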
43
Experimental Setup
Datasets: Study 1 (400 participants), Study 2 (180 participants)
Approaches: Baseline, OCSVM, OCSVM + OCEM, OCSVM + MCEM
44
Results: Study 1
[Bar charts: recall and precision per activity (Sweeping and Dusting, Taking Medication, Watering Plants, Cooking) for Baseline, OCSVM, OCSVM+OCEM, OCSVM+MCEM]
45
Results: Study 2
[Bar charts: recall and precision per activity (Sweeping and Dusting, Cleaning Countertops, Taking Medication, Watering Plants, Washing Hands, Cooking) for Baseline, OCSVM, OCSVM+OCEM, OCSVM+MCEM]
46
Clinical Evaluation
• Evaluation of algorithm-predicted false positives
• Psychology clinician looked at participants’ videos
[Chart labels: 18%, Continuation of previous error, Actually true positives, 33%]
47
Conclusion
Summary
• Emulate caregiver intervention.
• Class imbalance and overlap.
• Detect activity errors in real time.
Significance
• Validated primary hypothesis.
• Foundation of a real-world prompting system.
Future Work
• RACOG and wRACOG for continuous values.
• ClusBUS in other domains.
• Precise annotation for activity errors.
48
Publications
Book Chapter
B. Das, N.C. Krishnan, D.J. Cook, “Handling Imbalanced and Overlapping Classes in Smart Environments Prompting Dataset”, Springer book on Big Data, 2014.
B. Das, N.C. Krishnan, D.J. Cook, “Automated Activity Intervention to Assist with Activities of Daily Living”, IOS Press book on Agent-Based Approaches to Ambient Intelligence, 2012.
Journal
B. Das, N.C. Krishnan, D.J. Cook, “Real-Time Activity Error Prediction to Assist Older Adults in Smart Homes: An Outlier Detection-Based Approach”, AI in Medicine, 2014. (Submitted)
B. Das, N.C. Krishnan, D.J. Cook, “RACOG and wRACOG: Two Probabilistic Oversampling Techniques”, IEEE Transactions on Knowledge and Data Engineering, 2014.
A.M. Seelye, M. Schmitter-Edgecombe, B. Das, D.J. Cook, “Application of cognitive rehabilitation theory to the development of smart prompting technologies”, IEEE Reviews in Biomedical Engineering, 2012.
B. Das, D.J. Cook, M. Schmitter-Edgecombe, A.M. Seelye, “PUCK: An Automated Prompting System for Smart Environments”, Journal on Personal and Ubiquitous Computing, 2012.
49
Publications
Conference
B. Das, N.C. Krishnan, D.J. Cook, “wRACOG: A Gibbs Sampling-Based Oversampling Technique”, International Conference on Data Mining, 2013.
S. Dernbach, B. Das, N.C. Krishnan, B.L. Thomas, D.J. Cook, “Simple and Complex Activity Recognition Through Smart Phones”, International Conference on Intelligent Environments, 2012.
B. Das, C. Chen, A.M. Seelye, D.J. Cook, “An Automated Prompting System for Smart Environments”, International Conference on Smart Homes and Health Telematics, 2011.
E. Nazerfard, B. Das, L.B. Holder, D.J. Cook, “Conditional Random Fields for Activity Recognition in Smart Environments”, ACM Symposium on Human Informatics, 2010.
C. Chen, B. Das, D.J. Cook, “A Data Mining Framework for Activity Recognition in Smart Environments”, International Conference on Intelligent Environments, 2010.
Workshop
B. Das, N.C. Krishnan, D.J. Cook, “Handling Imbalanced and Overlapping Classes in Smart Environments”, ICDM Workshop in Data Mining in Bioinformatics and Healthcare, 2013.
B. Das, A.M. Seelye, B.L. Thomas, D.J. Cook, L.B. Holder, “Using Smart Phones for Context-Aware Prompting in Smart Environments”, International Workshop on Consumer eHealth Platforms, Services and Applications, 2012.
B. Das, D.J. Cook, “Data Mining Challenges in Automated Prompting Systems”, Interactions with Smart Objects Workshop, 2011.
B. Das, C. Chen, N. Dasgupta, D.J. Cook, “Automated Prompting in Smart Home Environment”, ICDM Workshop on Data Mining Services, 2010.
C. Chen, B. Das, D.J. Cook, “Energy Prediction Using Resident’s Activity”, International Workshop on Knowledge Discovery from Sensor Data, 2010.
50
Acknowledgement
Dr. Diane Cook, Prafulla Dawadi, Adri Seelye
Dr. Larry Holder, Dr. Ehsan Nazerfard, Carolyn Parsey
Dr. Narayanan C. Krishnan (CK), Dr. Kyle Feuz, Christa Simon
Dr. Maureen Schmitter-Edgecombe, Brian Thomas, Alyssa Weakley
Dr. Behrooz Shirazi, Chris Cain, Jennifer Williams
Dr. Alex Mihailidis, Shirin Shahsavand
Dr. Aaron Crandall
Dr. Hassan Ghasemzadeh
And, all previous colleagues, collaborators and friends…
51