Machine Learning Challenges for Automated Prompting in Smart Homes

Barnan Das, May 22, 2014

Post on 14-Jun-2015


DESCRIPTION

As the world's population ages, there is an increased prevalence of diseases related to aging, such as dementia. Caring for individuals with dementia is frequently associated with extreme physical and emotional stress, which often leads to depression. Smart home technology and advances in machine learning techniques can provide innovative solutions to reduce caregiver burden. One key service that caregivers provide is prompting individuals with memory limitations to initiate and complete daily activities. We hypothesize that sensor technologies combined with machine learning techniques can automate the process of providing reminder-based interventions or prompts. This dissertation focuses on addressing machine learning challenges that arise while devising an effective automated prompting system.

Our first goal is to emulate natural interventions provided by a caregiver to individuals with memory impairments, by using a supervised machine learning approach to classify pre-segmented activity steps into prompt or no-prompt classes. However, the lack of training examples representing prompt situations causes an imbalanced class distribution. We propose two probabilistic oversampling techniques, RACOG and wRACOG, that help the classifier learn the "prompt" class better. Moreover, there are certain prompt situations where the sensor triggering signature is quite similar to situations in which the participant would probably need no prompt. The absence of sufficient data attributes to differentiate between the prompt and no-prompt classes causes class overlap. We propose ClusBUS, a clustering-based under-sampling technique that identifies ambiguous data regions. ClusBUS preprocesses the data in order to give more importance to the minority class during classification.

Our second goal is to automatically detect activity errors in real time, while an individual performs an activity. We propose a collection of one-class classification-based algorithms, known as DERT, that learn only from normal activity patterns, without using any training examples of activity errors. When evaluated on unseen activity data, DERT is able to identify abnormalities or errors, which can be potential prompt situations. We validate the effectiveness of the proposed algorithms in predicting potential prompt situations on the sensor data of ten activities of daily living, collected from 580 participants who were part of two smart home studies.

TRANSCRIPT

Page 1: Machine Learning Challenges For Automated Prompting In Smart Homes

Machine Learning Challenges for Automated Prompting in Smart Homes

Barnan Das

May 22, 2014

Page 2: Machine Learning Challenges For Automated Prompting In Smart Homes

Older adult (65+) population in US

2009: 40mn
2030: 72mn

Page 3: Machine Learning Challenges For Automated Prompting In Smart Homes

5 million Alzheimer’s patients
15 million unpaid caregivers
60% of caregivers report stress

Page 4: Machine Learning Challenges For Automated Prompting In Smart Homes

Page 5: Machine Learning Challenges For Automated Prompting In Smart Homes

“Machine learning algorithms trained on smart home sensor data can predict when an individual faces difficulty while performing everyday activities.”

Page 6: Machine Learning Challenges For Automated Prompting In Smart Homes

Page 7: Machine Learning Challenges For Automated Prompting In Smart Homes

Smart Home Studies

                  Study 1        Study 2
Participants      400            180
Activities        8              6
Activity Errors   Naturalistic   Naturalistic

Page 8: Machine Learning Challenges For Automated Prompting In Smart Homes

Overview

Automated Prompting
• Emulating Caregiver Prompt Timing (Study 1)
  • Imbalanced Class Distribution
  • Class Overlap
• Detecting Activity Errors in Real Time (Study 1, 2)
  • One-Class Classification

Page 9: Machine Learning Challenges For Automated Prompting In Smart Homes

Emulating Caregiver Prompt Timing

Raw Data
• Study 1
• 8 daily activities
• Prompts issued when errors were committed

Used by Algorithms
• 1 activity step = 1 training example: 17 engineered features plus a 0/1 class label
• Binary class: {prompt, no-prompt}

Page 10: Machine Learning Challenges For Automated Prompting In Smart Homes

Class Distribution

Total # training examples: 3980
prompt class: 3.94%
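As a rough check on these figures, the number of prompt examples implied by 3980 training examples at 3.94% can be worked out directly. The rounding to whole examples is an assumption; the slide states only the total and the percentage.

```python
total_examples = 3980      # total training examples, from the slide
prompt_fraction = 0.0394   # 3.94% prompt class, from the slide

# Rounded estimate; the exact count is not given on the slide.
prompt_examples = round(total_examples * prompt_fraction)
no_prompt_examples = total_examples - prompt_examples

# Majority-to-minority ratio, a common way to express class imbalance.
imbalance_ratio = no_prompt_examples / prompt_examples
print(prompt_examples, no_prompt_examples, round(imbalance_ratio, 1))
```

Under that rounding, there are roughly 24 no-prompt examples for every prompt example.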

Page 11: Machine Learning Challenges For Automated Prompting In Smart Homes


Page 12: Machine Learning Challenges For Automated Prompting In Smart Homes

Imbalanced Class Distribution

Page 13: Machine Learning Challenges For Automated Prompting In Smart Homes

Existing Solutions

Preprocessing

Sampling
• Over-sampling the minority class
• Under-sampling the majority class

Oversampling
• Spatial location of training examples in Euclidean space
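To make "spatial location in Euclidean space" concrete, here is a minimal sketch of SMOTE-style interpolation, the kind of existing oversampling the deck later compares against. The function name, the neighbour count `k` and the toy data are illustrative assumptions, not the reference SMOTE implementation.

```python
import math
import random

def smote_like_sample(minority, k=2, rng=None):
    """Create one synthetic point on the segment between a minority
    example and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    base = rng.choice(minority)
    # Nearest minority neighbours of the chosen example, by Euclidean distance.
    neighbours = sorted(
        (p for p in minority if p is not base),
        key=lambda p: math.dist(base, p),
    )[:k]
    other = rng.choice(neighbours)
    gap = rng.random()  # position along the segment, in [0, 1)
    return [b + gap * (o - b) for b, o in zip(base, other)]

# Toy minority class with two numeric attributes (hypothetical data).
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
synthetic = smote_like_sample(minority, rng=random.Random(1))
```

The synthetic point depends only on where existing minority examples sit in space, which is the property the slide highlights.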

Page 14: Machine Learning Challenges For Automated Prompting In Smart Homes

Proposed Approach

Preprocessing technique to oversample minority class
• Approximate discrete probability distribution using Chow-Liu’s algorithm
• Generate new minority class data points using Gibbs sampling
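The first step can be sketched as follows: Chow-Liu's algorithm approximates a joint discrete distribution with a tree that maximises the total pairwise mutual information between attributes. This is a minimal sketch assuming rows of discrete attribute values and a Prim-style tree growth rooted at attribute 0; the function names and toy data are illustrative, not the author's code.

```python
import math
from collections import Counter

def mutual_information(rows, i, j):
    """Empirical mutual information between discrete attributes i and j."""
    n = len(rows)
    count_i = Counter(r[i] for r in rows)
    count_j = Counter(r[j] for r in rows)
    count_ij = Counter((r[i], r[j]) for r in rows)
    return sum(
        (c / n) * math.log((c * n) / (count_i[a] * count_j[b]))
        for (a, b), c in count_ij.items()
    )

def chow_liu_tree(rows):
    """Return (parent, child) edges of a maximum mutual-information
    spanning tree, grown from attribute 0 (Prim's algorithm)."""
    d = len(rows[0])
    in_tree = {0}
    edges = []
    while len(in_tree) < d:
        best = max(
            ((u, v) for u in in_tree for v in range(d) if v not in in_tree),
            key=lambda e: mutual_information(rows, e[0], e[1]),
        )
        edges.append(best)
        in_tree.add(best[1])
    return edges

# Toy data: attribute 2 copies attribute 0, attribute 1 is independent.
rows = [(0, 0, 0), (0, 1, 0), (1, 0, 1), (1, 1, 1)]
edges = chow_liu_tree(rows)
```

In the toy data the strongly dependent pair (0, 2) is linked first, which is the behaviour the tree approximation relies on.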

Page 15: Machine Learning Challenges For Automated Prompting In Smart Homes

Gibbs Sampling

[Figure: Markov chains; legend: minority class samples, majority class samples]
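A minimal sketch of one Gibbs sampling sweep over discrete attributes: each attribute is resampled in turn from its conditional distribution given the current values of all the others. The `joint` function, the toy two-attribute distribution and the chain length are assumptions for illustration; in the approach described here the joint would come from the tree-structured approximation.

```python
import random

def gibbs_sweep(state, domains, joint, rng):
    """Resample every attribute once from its full conditional under `joint`.
    `joint` must give a positive value to at least one candidate per attribute."""
    state = list(state)
    for i, values in enumerate(domains):
        # Unnormalised conditional: joint probability with attribute i set to v.
        weights = [joint(state[:i] + [v] + state[i + 1:]) for v in values]
        state[i] = rng.choices(values, weights=weights, k=1)[0]
    return state

def toy_joint(x):
    # Hypothetical joint over two binary attributes that tend to agree.
    return 0.4 if x[0] == x[1] else 0.1

rng = random.Random(0)
state = [0, 1]           # a Markov chain starts from an existing sample
chain = []
for _ in range(200):
    state = gibbs_sweep(state, [[0, 1], [0, 1]], toy_joint, rng)
    chain.append(tuple(state))
```

Each element of `chain` is a candidate synthetic sample; successive states are correlated, which is why a sample selection rule is needed.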

Page 16: Machine Learning Challenges For Automated Prompting In Smart Homes

RACOG & wRACOG

(wrapper-based) RApidly COnverging Gibbs Sampler

                    RACOG                              wRACOG
Sample selection    Pre-defined lag on Markov chain    Highest probability of misclassification by wrapper classifier
Stopping criteria   Pre-defined number of iterations   No improvement of a performance measure
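The two sample selection rules can be sketched side by side. Both functions are illustrative assumptions: the burn-in parameter is not mentioned on the slide, and `prob_minority` stands in for whatever score the wrapper classifier provides.

```python
def select_by_lag(chain, burn_in, lag):
    """RACOG-style selection (sketch): discard the first `burn_in` states,
    then keep every `lag`-th state of the Markov chain."""
    return chain[burn_in::lag]

def select_by_misclassification(samples, prob_minority, top_k):
    """wRACOG-style selection (sketch): keep the generated minority samples
    that the wrapper classifier is least likely to label as minority,
    i.e. those with the highest probability of misclassification."""
    return sorted(samples, key=prob_minority)[:top_k]

# Hypothetical usage with stand-in data.
chain = list(range(20))
lagged = select_by_lag(chain, burn_in=5, lag=5)
hardest = select_by_misclassification([0.9, 0.2, 0.6], lambda s: s, top_k=2)
```

The lag-based rule needs its parameters fixed in advance, whereas the wrapper-based rule adapts to what the classifier currently gets wrong.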

Page 17: Machine Learning Challenges For Automated Prompting In Smart Homes

Experimental Setup

Datasets: Study 1 (Prompting), 9 UCI Datasets
Approaches: Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG, wRACOG
Classifiers: C4.5 Decision Tree, SVM, K-Nearest Neighbor, Logistic Regression

Page 18: Machine Learning Challenges For Automated Prompting In Smart Homes

Results (True Positive Rate)

[Bar chart: true positive rate (axis 0 to 1) for Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG and wRACOG]

Page 19: Machine Learning Challenges For Automated Prompting In Smart Homes

Results (G-mean)

[Bar chart: G-mean (axis 0 to 1) for Baseline Classifier, SMOTE, SMOTEBoost, RUSBoost, Baseline Prompting, RACOG and wRACOG]
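For reference, the two reported measures can be computed as in the sketch below. True positive rate is the fraction of prompt examples correctly identified, and G-mean is the geometric mean of the true positive and true negative rates. The function name and the toy labels are illustrative, not taken from the experiments.

```python
import math

def tpr_and_gmean(y_true, y_pred, positive=1):
    """Return (true positive rate, G-mean) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tpr = tp / (tp + fn)   # sensitivity on the minority (prompt) class
    tnr = tn / (tn + fp)   # specificity on the majority (no-prompt) class
    return tpr, math.sqrt(tpr * tnr)

# Toy example: 2 prompt and 4 no-prompt cases (hypothetical labels).
tpr, gmean = tpr_and_gmean([1, 1, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1])
```

Unlike plain accuracy, G-mean drops to zero if either class is entirely misclassified, which makes it informative on imbalanced data.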

Page 20: Machine Learning Challenges For Automated Prompting In Smart Homes


Page 21: Machine Learning Challenges For Automated Prompting In Smart Homes

Class Overlap

Page 22: Machine Learning Challenges For Automated Prompting In Smart Homes

Class Overlap in Prompting Data

3-dimensional PCA plot of prompting data

Page 23: Machine Learning Challenges For Automated Prompting In Smart Homes

Tomek Links
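A Tomek link is a pair of examples from different classes that are each other's nearest neighbour, so such pairs mark overlapping or borderline regions. The sketch below finds them by brute force; the function name and toy data are illustrative assumptions.

```python
import math

def tomek_links(points, labels):
    """Return index pairs (i, j), i < j, that are mutual nearest
    neighbours and carry different class labels."""
    def nearest(i):
        return min(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(points[i], points[j]),
        )

    links = []
    for i in range(len(points)):
        j = nearest(i)
        if i < j and nearest(j) == i and labels[i] != labels[j]:
            links.append((i, j))
    return links

# Toy data: points 0 and 1 are close but belong to different classes.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.2, 5.0)]
labels = [0, 1, 0, 0]
links = tomek_links(points, labels)
```

The brute-force search is quadratic in the number of examples, which is acceptable for a sketch but not for large datasets.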

Page 24: Machine Learning Challenges For Automated Prompting In Smart Homes

Cluster-Based Under-Sampling
• Form clusters
• Under-sampling clusters
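The two steps can be sketched as below, assuming cluster assignments are already available from some clustering algorithm. The rule used here (drop majority examples from any cluster whose minority share reaches a threshold) and the threshold value are assumptions for illustration; the slides do not spell out the exact ClusBUS criterion.

```python
from collections import defaultdict

def cluster_undersample(labels, clusters, minority=1, threshold=0.5):
    """Return sorted indices to keep. Majority examples are removed from
    clusters where the minority share is at least `threshold` (assumed rule)."""
    members = defaultdict(list)
    for idx, cluster_id in enumerate(clusters):
        members[cluster_id].append(idx)

    keep = []
    for idxs in members.values():
        share = sum(labels[i] == minority for i in idxs) / len(idxs)
        for i in idxs:
            # Minority examples always stay; majority ones stay only
            # in clusters that are not dominated by the minority class.
            if labels[i] == minority or share < threshold:
                keep.append(i)
    return sorted(keep)

# Toy data: cluster 1 mixes two minority examples with one majority example.
labels = [0, 0, 0, 1, 1, 0]
clusters = [0, 0, 0, 1, 1, 1]
kept = cluster_undersample(labels, clusters)
```

Only the majority example inside the mixed cluster is discarded, so clear majority regions are left untouched.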

Page 25: Machine Learning Challenges For Automated Prompting In Smart Homes

ClusBUS Ensemble

Page 26: Machine Learning Challenges For Automated Prompting In Smart Homes

Experimental Setup

Dataset: Study 1 (Prompting)
Clustering Algorithm: DBSCAN
Approaches: Baseline, SMOTE, ClusBUS, ClusBUS Ensemble
Classifiers: C4.5 Decision Tree, Naive Bayes, K-Nearest Neighbor, SVM

Page 27: Machine Learning Challenges For Automated Prompting In Smart Homes

Result (True Positive Rate)

[Bar chart: true positive rate (axis 0 to 1) per classifier (C4.5, Naïve Bayes, IBk, SMO) for Baseline, SMOTE, ClusBUS and ClusBUS Ensemble]

Page 28: Machine Learning Challenges For Automated Prompting In Smart Homes

Result (G-mean)

[Bar chart: G-mean (axis 0 to 1) per classifier (C4.5, Naïve Bayes, IBk, SMO) for Baseline, SMOTE, ClusBUS and ClusBUS Ensemble]

Page 29: Machine Learning Challenges For Automated Prompting In Smart Homes


Page 30: Machine Learning Challenges For Automated Prompting In Smart Homes

Detecting Activity Errors in Real Time
• Sensor events labeled with activity steps
• Availability of information on activity errors

Page 31: Machine Learning Challenges For Automated Prompting In Smart Homes

Basic Idea

Train: Participants with no reported errors (normal activity data)
Test: Participants who committed errors (activity data with errors)
Model: One-Class Classifier
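The train-on-normal, test-on-errors idea can be illustrated with a deliberately simple stand-in: a centroid-distance novelty detector fitted only on normal data. This is not the one-class SVM used in the slides; the function names, the quantile parameter and the toy feature vectors are all assumptions.

```python
import math

def fit_normal_model(normal_rows, quantile=0.95):
    """Fit on normal data only: centroid plus a distance radius that
    covers roughly `quantile` of the training examples."""
    d = len(normal_rows[0])
    centroid = [sum(r[k] for r in normal_rows) / len(normal_rows) for k in range(d)]
    dists = sorted(math.dist(r, centroid) for r in normal_rows)
    radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    return centroid, radius

def is_error(row, model):
    """Flag a test example as an error if it falls outside the normal radius."""
    centroid, radius = model
    return math.dist(row, centroid) > radius

# Hypothetical feature vectors from participants with no reported errors.
normal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
model = fit_normal_model(normal)
flags = [is_error((0.5, 0.5), model), is_error((5.0, 5.0), model)]
```

No error examples are needed at training time; anything sufficiently unlike the normal pattern is flagged at test time.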

Page 32: Machine Learning Challenges For Automated Prompting In Smart Homes

DERT Data

Raw Data
• 580 participants
• 6 daily activities
• Annotated for error start times

Used by Algorithms
• 1 sensor event = 1 training example: >70 engineered features plus a class label (1)
• One-class: {normal}

Page 33: Machine Learning Challenges For Automated Prompting In Smart Homes

One-Class SVM

[Figure: two-dimensional illustration with axes x1 and x2]
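For readers unfamiliar with the model, one widely used formulation of the one-class SVM is given below. The slide does not state which formulation was used, so the symbols here (feature map \Phi, offset \rho, slack variables \xi_i, and the parameter \nu that bounds the fraction of training points treated as outliers) are assumptions.

```latex
\min_{w,\,\xi,\,\rho}\ \frac{1}{2}\lVert w \rVert^{2}
  + \frac{1}{\nu n}\sum_{i=1}^{n}\xi_i - \rho
\quad \text{s.t.} \quad
  w \cdot \Phi(x_i) \ge \rho - \xi_i,\qquad \xi_i \ge 0,\qquad i = 1,\dots,n

f(x) = \operatorname{sgn}\bigl(w \cdot \Phi(x) - \rho\bigr)
```

A test point with f(x) = +1 is treated as normal and one with f(x) = -1 as an outlier, which in this setting corresponds to a potential activity error.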

Page 34: Machine Learning Challenges For Automated Prompting In Smart Homes

Model Selection

Page 35: Machine Learning Challenges For Automated Prompting In Smart Homes

Activity Error Classification

WHY? To characterize change in daily activities of older adults
HOW? Sensor data

           Error Types   Accuracy*
Study 1    4             73%
Study 2    9             54%

*Using C4.5 decision tree and 10-fold CV
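As a reminder of how 10-fold cross-validation partitions the data, the sketch below generates train and test indices for each fold. It omits shuffling and stratification, and the function name is an illustrative assumption rather than the tooling used in the study.

```python
def k_fold_indices(n, k=10):
    """Yield (train, test) index lists for k-fold cross-validation.
    Example i goes to fold i % k; no shuffling or stratification."""
    folds = [list(range(i, n, k)) for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test

# Hypothetical dataset of 25 examples.
splits = list(k_fold_indices(25, k=10))
```

Each example appears in exactly one test fold, and accuracy is averaged over the ten folds.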

Page 36: Machine Learning Challenges For Automated Prompting In Smart Homes

Activity Error Models
• One-Class
• Multi-Class

Page 37: Machine Learning Challenges For Automated Prompting In Smart Homes

Ensembles

[Diagram: a test sample is passed to the One-Class SVM and to an Error Model (One-Class or Multi-Class); their outputs are combined with a logical AND to give Normal/Error]
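The combination step can be sketched as follows, assuming the logical AND means a sample is reported as an error only when both the one-class SVM and the error model flag it. The function and parameter names are hypothetical stand-ins for the two trained models.

```python
def ensemble_predict(sample, ocsvm_flags_error, error_model_flags_error):
    """Report "Error" only if both models flag the sample (assumed AND rule)."""
    both_flag = ocsvm_flags_error(sample) and error_model_flags_error(sample)
    return "Error" if both_flag else "Normal"

# Stand-in models: simple threshold rules on a single number.
flags_high = lambda x: x > 10
flags_even = lambda x: x % 2 == 0

results = [ensemble_predict(x, flags_high, flags_even) for x in (4, 11, 12)]
```

Requiring agreement trades some recall for fewer false alarms, since a sample flagged by only one model is treated as normal.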

Page 38: Machine Learning Challenges For Automated Prompting In Smart Homes

Experimental Setup

Datasets: Study 1 (400 participants), Study 2 (180 participants)
Approaches: Baseline, OCSVM, OCSVM + OCEM, OCSVM + MCEM

Page 39: Machine Learning Challenges For Automated Prompting In Smart Homes

Results: Study 1

[Bar charts: Recall (axis 0 to 1) and Precision (axis 0 to 0.14) per activity (Sweeping and Dusting, Taking Medication, Watering Plants, Cooking) for Baseline, OCSVM, OCSVM+OCEM and OCSVM+MCEM]

Page 40: Machine Learning Challenges For Automated Prompting In Smart Homes

Results: Study 2

[Bar charts: Recall (axis 0 to 0.6) and Precision (axis 0 to 0.06) per activity (Sweeping and Dusting, Cleaning Countertops, Taking Medication, Watering Plants, Washing Hands, Cooking) for Baseline, OCSVM, OCSVM+OCEM and OCSVM+MCEM]

Page 41: Machine Learning Challenges For Automated Prompting In Smart Homes

Clinical Evaluation
• Evaluation of algorithm-predicted false positives
• Psychology clinician looked at participant’s videos

[Chart: 18% continuation of previous error; 33% actually true positives]

Page 42: Machine Learning Challenges For Automated Prompting In Smart Homes

Conclusion

Summary
• Emulate caregiver intervention.
• Class imbalance and overlap.
• Detect activity errors in real-time.

Significance
• Validated primary hypothesis.
• Foundation of a real-world prompting system.

Future Work
• RACOG and wRACOG for continuous values.
• ClusBUS in other domains.
• Precise annotation for activity errors.

Page 43: Machine Learning Challenges For Automated Prompting In Smart Homes

Publications

Book Chapter

B. Das, N.C. Krishnan, D.J. Cook, “Handling Imbalanced and Overlapping Classes in Smart Environments Prompting Dataset”, Springer book on Big Data, 2014.

B. Das, N.C. Krishnan, D.J. Cook, “Automated Activity Intervention to Assist with Activities of Daily Living”, IOS Press book on Agent-Based Approaches to Ambient Intelligence, 2012.

Journal

B. Das, N.C. Krishnan, D.J. Cook, “Real-Time Activity Error Prediction to Assist Older Adults in Smart Homes: An Outlier Detection-Based Approach”, AI in Medicine, 2014. (Submitted)

B. Das, N.C. Krishnan, D.J. Cook, “RACOG and wRACOG: Two Probabilistic Oversampling Techniques”, IEEE Transactions on Knowledge and Data Engineering, 2014.

A.M. Seelye, M. Schmitter-Edgecombe, B. Das, D.J. Cook, “Application of cognitive rehabilitation theory to the development of smart prompting technologies”, IEEE Reviews in Biomedical Engineering, 2012.

B. Das, D.J. Cook, M. Schmitter-Edgecombe, A.M. Seelye, “PUCK: An Automated Prompting System for Smart Environments”, Journal on Personal and Ubiquitous Computing, 2012.

Page 44: Machine Learning Challenges For Automated Prompting In Smart Homes

Publications

Conference

B. Das, N.C. Krishnan, D.J. Cook, “wRACOG: A Gibbs Sampling-Based Oversampling Technique”, International Conference on Data Mining, 2013.

S. Dernbach, B. Das, N.C. Krishnan, B.L. Thomas, D.J. Cook, “Simple and Complex Activity Recognition Through Smart Phones”, International Conference on Intelligent Environments, 2012.

B. Das, C. Chen, A.M. Seelye, D.J. Cook, “An Automated Prompting System for Smart Environments”, International Conference on Smart Homes and Health Telematics, 2011.

E. Nazerfard, B. Das, L.B. Holder, D.J. Cook, “Conditional Random Fields for Activity Recognition in Smart Environments”, ACM Symposium on Human Informatics, 2010.

C. Chen, B. Das, D.J. Cook, “A Data Mining Framework for Activity Recognition in Smart Environments”, International Conference on Intelligent Environments, 2010.

Workshop

B. Das, N.C. Krishnan, D.J. Cook, “Handling Imbalanced and Overlapping Classes in Smart Environments”, ICDM Workshop in Data Mining in Bioinformatics and Healthcare, 2013.

B. Das, A.M. Seelye, B.L. Thomas, D.J. Cook, L.B. Holder, “Using Smart Phones for Context-Aware Prompting in Smart Environments”, International Workshop on Consumer eHealth Platforms, Services and Applications, 2012.

B. Das, D.J. Cook, “Data Mining Challenges in Automated Prompting Systems”, Interactions with Smart Objects Workshop, 2011.

B. Das, C. Chen, N. Dasgupta, D.J. Cook, “Automated Prompting in Smart Home Environment”, ICDM Workshop on Data Mining Services, 2010.

C. Chen, B. Das, D.J. Cook, “Energy Prediction Using Resident’s Activity”, International Workshop on Knowledge Discovery from Sensor Data, 2010.

Page 45: Machine Learning Challenges For Automated Prompting In Smart Homes

Acknowledgement

Dr. Diane Cook, Prafulla Dawadi, Adri Seelye

Dr. Larry Holder, Dr. Ehsan Nazerfard, Carolyn Parsey

Dr. Narayanan C. Krishnan (CK), Dr. Kyle Feuz, Christa Simon

Dr. Maureen Schmitter-Edgecombe, Brian Thomas, Alyssa Weakley

Dr. Behrooz Shirazi, Chris Cain, Jennifer Williams

Dr. Alex Mihailidis, Shirin Shahsavand

Dr. Aaron Crandall

Dr. Hassan Ghasemzadeh

And all previous colleagues, collaborators and friends…

Page 46: Machine Learning Challenges For Automated Prompting In Smart Homes
