Feature Selection and Classification in Supporting Report-Based Self-Management for People with Chronic Pain
Authors: Yan Huang, Huiru Zheng, Chris Nugent, Paul McCullagh, Norman Black, Kevin E. Vowles, and Lance McCracken
Advisor: Ben-Jye Chang Student: YU-HSIEN CHO
Source: IEEE Transactions on Information Technology in Biomedicine, Jan. 2011
OUTLINE
• I. Introduction
• II. Issue
• III. Motivations
• IV. Approaches
• V. Numerical results
• VI. Conclusion
Introduction
• The older population has increased, and two thirds of people who have reached retirement age have at least two chronic conditions.
• A machine learning approach is applied to self-reporting data collected from an integrated biopsychosocial treatment, in order to identify an optimal set of features for supporting self-management.
Fig. 1. Assessment interface of the PSMS and the assessment workflow for self-management
• We assess the feasibility of applying automated classification techniques to identify "low" and "better" health status levels from self-reporting data and explore an appropriate classification algorithm.
Issue
• How does the number of selected questions affect the classification performance for a person's health status level?
• Which ranking method and which classification model have the best performance?
Motivations
• Traditional health care is expensive, consumes significant resources, and is inconvenient.
• For PWCP, self-management of their health care has been shown to be effective in improving their QoL.
PWCP: People With Chronic Pain
QoL: Quality of Life
Approaches
A. Dataset
187 subjects who suffered from chronic pain
8 types of questionnaire; the total number of questions was 329; answers had values
"Pretreatment" stage labeled as "low health level"; "posttreatment" stage labeled as "better health level"
16 (8.6%) of the patients withdrew; 171 (91.4%) of the patients completed the treatment
Training set: 114 patients; testing set: 57 patients
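The dataset figures above can be cross-checked with a few lines of arithmetic. This is only a consistency sketch: the variable names are hypothetical, and it assumes each training patient contributes one pretreatment and one posttreatment record, which would explain the 228 instances quoted later in the slides.

```python
# Figures taken from the slide; variable names are illustrative only.
enrolled = 187
withdrew = 16
completed = enrolled - withdrew

withdrew_pct = round(100 * withdrew / enrolled, 1)
completed_pct = round(100 * completed / enrolled, 1)

train_patients, test_patients = 114, 57

# Assumption: one "low" (pretreatment) and one "better" (posttreatment)
# record per patient, so the training data holds two instances per patient.
records_per_patient = 2
train_instances = train_patients * records_per_patient

n_questions = 329
print(completed, withdrew_pct, completed_pct, train_instances, n_questions)
```

Under that assumption the training data forms a 228 × 329 table of answers.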
B. Methods
Four feature selection methods are used to rank the questions.
1. SVM-RFE (Support Vector Machine with Recursive Feature Elimination):
The ranking criterion for feature i :
Steps:
Step 1: Train an SVM on the dataset.
Step 2: Rank features according to the criterion c.
Step 3: Eliminate the lowest ranked feature.
Step 4: If more than one feature remains, return to step 1.
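The four steps above can be sketched in Python. This is a minimal illustration, not the authors' implementation. The ranking criterion is not shown in this transcript, so the sketch assumes the standard linear SVM-RFE criterion c_i = w_i^2, where w_i is the weight of feature i. The SVM is a small full-batch subgradient trainer written for this example, and the names `train_linear_svm`, `svm_rfe`, and the toy data are hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Full-batch subgradient descent on the L2-regularised hinge loss.

    Labels in y must be +1 or -1. Returns the weight vector w.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]
        grad_b = 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # only margin violators contribute to the hinge term
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w


def svm_rfe(X, y):
    """Return feature indices ordered from most to least important."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > 1:                      # Step 4: repeat while > 1 feature
        X_sub = [[row[j] for j in remaining] for row in X]
        w = train_linear_svm(X_sub, y)             # Step 1: train an SVM
        scores = [wj * wj for wj in w]             # Step 2: assumed criterion c_i = w_i^2
        worst = scores.index(min(scores))          # Step 3: drop the lowest ranked feature
        eliminated.append(remaining.pop(worst))
    return remaining + eliminated[::-1]


# Toy data: feature 0 is constant, feature 1 is weakly informative,
# feature 2 separates the two classes on its own.
X = [[0.0, 1.0, 2.0], [0.0, 0.0, 2.0], [0.0, -1.0, -2.0], [0.0, 0.0, -2.0]]
y = [1, 1, -1, -1]
ranking = svm_rfe(X, y)
print(ranking)
```

The feature eliminated last comes first in the returned ranking. In practice a library SVM would replace the hand-written trainer.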
2. OneR: 1-level decision tree
Example for Q.1 (answer value, count; the counts sum to 114):
1: 21
2: 16
3: 22
4: 20
5: 35
Steps:
For each feature fi:
    For each value v from the domain of fi:
        Select the set of instances where feature fi has value v
        Let c = the most frequent class in that set
        Add the clause "if feature fi has value v then the class is c" to the rule for feature fi
Output the rule with the highest classification accuracy.
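The OneR steps above translate almost line for line into Python. The sketch below is illustrative only: the function name `one_r`, the class labels "pre" and "post", and the toy answers are assumptions made for the example, not data from the study.

```python
from collections import Counter, defaultdict


def one_r(instances, labels):
    """Return (feature_index, rule, accuracy) for the best single-feature rule.

    `rule` maps each observed feature value to its most frequent class.
    """
    best = None
    for f in range(len(instances[0])):             # for each feature fi
        counts = defaultdict(Counter)
        for row, label in zip(instances, labels):  # group class counts by value v
            counts[row[f]][label] += 1
        # c = the most frequent class among instances where fi == v
        rule = {v: c.most_common(1)[0][0] for v, c in counts.items()}
        correct = sum(c[rule[v]] for v, c in counts.items())
        accuracy = correct / len(labels)
        if best is None or accuracy > best[2]:     # keep the most accurate rule
            best = (f, rule, accuracy)
    return best


# Toy data: two questions answered on a 1-5 scale by six records.
instances = [[1, 3], [2, 3], [1, 2], [4, 3], [5, 2], [4, 3]]
labels = ["pre", "pre", "pre", "post", "post", "post"]
feature, rule, accuracy = one_r(instances, labels)
print(feature, rule, accuracy)
```

To rank all questions rather than pick one, the per-feature accuracies computed inside the loop can be sorted in descending order.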
3. Information Gain: based on Shannon's information theory; it can be calculated from (1)–(3)
A represents a feature (question) of an instance, which has n values
Two classes (pre- and posttreatment), each with 114 instances
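Equations (1)–(3) are not reproduced in this transcript. The sketch below assumes the standard definitions: class entropy H(C), conditional entropy H(C|A), and information gain IG(A) = H(C) - H(C|A). The function names and the toy answers are hypothetical; only the class sizes (114 per class) come from the slide.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy H(C) of a list of class labels, in bits."""
    total = len(labels)
    return -sum((k / total) * log2(k / total) for k in Counter(labels).values())


def information_gain(values, labels):
    """IG(A) = H(C) - H(C|A) for one feature A given as a list of answer values."""
    total = len(labels)
    conditional = 0.0
    for v in set(values):
        subset = [c for a, c in zip(values, labels) if a == v]
        conditional += len(subset) / total * entropy(subset)  # p(A=v) * H(C|A=v)
    return entropy(labels) - conditional


# Two balanced classes of 114 instances each, as on the slide.
labels = ["pre"] * 114 + ["post"] * 114
class_entropy = entropy(labels)

# Hypothetical answers: one question that separates the classes perfectly,
# and one question that everyone answers identically.
ig_perfect = information_gain([1] * 114 + [5] * 114, labels)
ig_useless = information_gain([3] * 228, labels)
print(class_entropy, ig_perfect, ig_useless)
```

With two balanced classes the class entropy is 1 bit, so the information gain of any question lies between 0 and 1.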
4. χ² Statistic:
m: number of answers for one question (feature)
ni: frequency of answer i
Pi: probability of answer i
n: total frequency for all the questions' answers, 228 × 329
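The equation itself did not survive in this transcript. Assuming the symbols listed above refer to the usual Pearson form, with n_i as the observed frequency and n P_i as the expected frequency, the statistic would read as follows. This is a reconstruction from the symbol list, not the authors' verified equation.

```latex
\chi^2 = \sum_{i=1}^{m} \frac{\left(n_i - n P_i\right)^2}{n P_i}
```

Under this reading, a question whose answer frequencies deviate strongly from the expected frequencies receives a large score and is ranked higher.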
C. Classification Performance Assessment
• Purpose: classify the person's appropriate health status
• Classifiers: C4.5, Naive Bayes, SVM, MLP
1. Overall accuracy:
2. Area Under the ROC Curve (AUC):
Suggested as a tool for evaluating the performance of a classification algorithm
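Both measures can be computed directly from a classifier's outputs. The sketch below assumes the usual definitions: overall accuracy as the fraction of correctly classified instances, and AUC as the probability that a randomly chosen positive instance is scored above a randomly chosen negative one, with ties counted as one half. The function names, the class labels, and the scores are hypothetical.

```python
def overall_accuracy(y_true, y_pred):
    """Fraction of instances whose predicted class equals the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores, positive):
    """AUC via pairwise comparison of positive and negative scores."""
    pos = [s for t, s in zip(y_true, scores) if t == positive]
    neg = [s for t, s in zip(y_true, scores) if t != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for the "better" class.
y_true = ["better", "better", "low", "low"]
scores = [0.9, 0.4, 0.6, 0.2]
y_pred = ["better" if s >= 0.5 else "low" for s in scores]

acc = overall_accuracy(y_true, y_pred)
area = auc(y_true, scores, positive="better")
print(acc, area)
```

Accuracy depends on the chosen decision threshold (0.5 here), whereas AUC summarises the ranking quality over all thresholds, which is why the two are reported together.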
Numerical results
There were no significant differences between the feature ranking methods in overall classification accuracy, so any of the four feature ranking methods can be used.
There were significant differences between the classifiers for each ranking method.
The MLP classifier was identified as the best option for building the classification model for PSMS, in the sense that both its overall accuracy and its AUC were very high.
Conclusion
• The classification result gives PWCP feedback information for their self-management.
• With this feedback, they can change their behavior, lifestyle, and care plan in order to achieve effective self-management of their chronic condition.
Thank you for listening