Vital Sign Quality Assessment using Ordinal Regression of Time Series Data
Risa B. Myers
Comp 600
September 30, 2013
Christopher M. Jermaine, PhD
Rice University, Department of Computer Science
John C. Frenzel, MD
University of Texas MD Anderson Cancer Center
Patient Monitoring
http://ak4.picdn.net/shutterstock/videos/1240198/preview/stock-footage-looping-animation-of-a-medical-hospital-monitor-of-normal-vital-signs-hd.jpg
What Vital Signs Are
• Physiological Measures
– Temperature
– Blood Pressure
– Heart Rate
– Respiration Rate
Vital signs vs. EKG
[Figure: sample traces labeled Systolic BP, Diastolic BP, and Heart Rate; time scales labeled Seconds and Minutes, one marked ✔ and the other ✗]
Volatility
• New term, with respect to vital signs
• Changes
• Not just variance
Anesthesia Vital Signs
Motivation
• Computer Science
– Learn to interpret pattern-less signals
• Biomedical
– Assess quality of care
– Clinical Decision Support
• Interpret patient data
• Discover underlying causes
• Predict outcomes and events
Goals
• Interpret vital sign data in a patient chart
• Assign a volatility label
• Mimic an expert’s assessment
• Predict outcome
Contributions
• Novel approach to ordinal regression for time series data lacking characteristic patterns
• Ability to identify outlier time series
• Model that can mimic expert assessment
Terms
Vital Sign Quality Assessment using Ordinal Regression of Time Series Data
Time Series
• Ordered series of data
• Some relationship exists
Average monthly high temperatures in Houston (°F): 63, 66, 72, 79, 85, 90, 92, 93, 88, 81, 72, 65
Source: www.weather.com
Ordinal Regression
[Figure: Ordinal Temperature Labels. Temperature axis in °Fahrenheit with ticks from 0 to 120, divided into the ordered labels Cool, Nice, Hot, Really Hot]
Ordinal Regression
Classification vs. Ordinal Regression
Classes have order
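To make the distinction concrete, here is a minimal Python sketch using the temperature labels from the earlier slide. The cut points in `CUTS` are hypothetical (the boundaries in the original figure are not recoverable), and the two loss functions are illustrative names, not part of the talk.

```python
import bisect

# Ordered labels from the temperature slide, coolest to hottest.
LABELS = ["Cool", "Nice", "Hot", "Really Hot"]
# Hypothetical cut points in °F; the slide's actual boundaries are not given.
CUTS = [60, 80, 95]

def ordinal_label(temp_f):
    """Return the index (0..3) into LABELS for a temperature."""
    return bisect.bisect_right(CUTS, temp_f)

def zero_one_loss(true_idx, pred_idx):
    """Classification view: every mistake costs the same."""
    return int(true_idx != pred_idx)

def squared_loss(true_idx, pred_idx):
    """Ordinal view: a mistake costs more the further apart the labels are."""
    return (true_idx - pred_idx) ** 2
```

Under the classification view, calling a "Really Hot" day "Cool" is no worse than calling it "Hot"; under the ordinal view it is nine times worse.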
Labeled Vital Signs
State of the Art
• Bayesian modeling of time series
– Sykacek & Roberts: hierarchical Bayesian model to perform feature extraction and classify time segments using a latent feature space
• Small # of real examples
• Time Series
– kNN-DTW
– Complexity-invariant classification
– Shapelets
– …
kNN-DTW
C. Cassisi, P. Montalto, M. Aliotta, A. Cannata, and A. Pulvirenti, “Similarity Measures and Dimensionality Reduction Techniques for Time Series Data Mining,” no. 3, InTech, 2012.
1NN-DTW
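A minimal sketch of the 1NN-DTW baseline, assuming an absolute-difference local cost and no warping-window constraint (neither choice is stated on the slide). The function names are illustrative.

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance with absolute-difference local cost."""
    n, m = len(a), len(b)
    # cost[i][j] = best cumulative cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # repeat b[j-1]
                                 cost[i][j - 1],      # repeat a[i-1]
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

def predict_1nn(train, query, dist=dtw_distance):
    """train: list of (series, label). Returns the label of the nearest series."""
    _, label = min(train, key=lambda pair: dist(pair[0], query))
    return label
```

Because the alignment may repeat points, two series that differ only in timing (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) are at distance zero.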
Complexity Invariance
G. Batista, X. Wang, and E. J. Keogh, “A complexity-invariant distance measure for time series,” SIAM Conf Data Mining, 2011.
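A minimal sketch of the complexity-invariant distance from the cited paper: Euclidean distance multiplied by the ratio of the larger to the smaller complexity estimate, where the complexity estimate is the square root of the summed squared successive differences. Handling of flat series (complexity zero) is an assumption added here so the function never divides by zero.

```python
import math

def euclidean(q, c):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(q, c)))

def complexity_estimate(q):
    """Square root of the summed squared steps between consecutive points."""
    return math.sqrt(sum((q[i + 1] - q[i]) ** 2 for i in range(len(q) - 1)))

def cid_distance(q, c):
    """Euclidean distance scaled by the ratio of the two complexity estimates."""
    if len(q) != len(c):
        raise ValueError("series must have equal length")
    ce_q, ce_c = complexity_estimate(q), complexity_estimate(c)
    lo, hi = min(ce_q, ce_c), max(ce_q, ce_c)
    if hi == 0:   # assumption: both series flat, no correction applied
        return euclidean(q, c)
    if lo == 0:   # assumption: one flat, one not, treated as infinitely far
        return math.inf
    return euclidean(q, c) * (hi / lo)
```

Passing `cid_distance` as the `dist` argument of any nearest-neighbor routine gives the 1NN-CID baseline.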
Shapelets
L. Ye and E. Keogh, “Time series shapelets,” presented at the 15th ACM SIGKDD international conference, New York, New York, USA, 2009, p. 947.
Biomedical Labeling
• Vital sign analysis
– Yang et al.: classification of anesthesia time series segments
• Patterns, durations, frequencies and sequences of patterns defined by an anesthesiologist
• (Ordinal) regression
– Meyfroidt et al.: length of stay prediction after cardiac surgery
• Vital signs derived values + additional patient and case data
• Off-the-shelf classifiers
• Regression problem, but use RMSE for evaluation
• Best result: better than nurses, better than standard risk model, comparable to physicians’ predictions
The AR-OR Model
• Autoregressive – Ordinal Regression Model
• Generates ordinal labels using statistical properties of time series
• Assumes patients with the same volatility label have similar state profiles
AR-OR Model Components
1. Autoregression – Time series representation
2. Segmenting – State assignment
3. Ordinal Regression – Integer valued output
1. Autoregression
Linear combination of previous values + noise
Autoregression in AR-OR
• Order = 1
• Coefficients = 1
Average monthly high temperatures: 63, 66, 72, 79, 85, 90, 92, 93, 88, 81, 72, 65
Change in average monthly high temperatures: 3, 6, 7, 6, 5, 2, 1, -5, -7, -9, -7
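With order 1 and a coefficient of 1, the model's prediction for each point is simply the previous point, so what remains to be modeled is the point-to-point change. A short Python sketch reproducing the temperature example (the function name is illustrative):

```python
# Average monthly high temperatures from the slide.
temps = [63, 66, 72, 79, 85, 90, 92, 93, 88, 81, 72, 65]

def first_differences(series):
    """x[t] - x[t-1] for each consecutive pair of points."""
    return [b - a for a, b in zip(series, series[1:])]

changes = first_differences(temps)
print(changes)
```

The result has one fewer element than the input, matching the 11 changes listed for the 12 monthly values.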
2. States via HMM
• Hidden Markov Model
– States (hidden)
– Emissions (visible)
– Transition Matrix
2. State Assignment
Inference
2. Segmenting
State 1: 41%, State 2: 25%, State 3: 7%, State 4: 19%, State 5: 8%
3. Regression
Generative Process
K: Number of states
L: Number of labels
D: Number of time series
Φ: Autoregression coefficients
R: Autoregressive order
p: State transition probabilities
μ: State means
Σ: State covariance matrices
r: Regression coefficients
p_0: Initial state probabilities
ω: Goalpost
σ²_ω: Goalpost variance
σ²_r: Regression variance
M_i: Time series length
s: State
f: Fraction of time in each state
x: Observations
v’: Real valued label
v: Ordinal label
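The notation above can be read as a recipe for generating one labeled series. The following Python sketch is a simplified illustration, not the model as implemented in the talk: it assumes univariate Gaussian emissions on the point-to-point changes (standard deviations instead of covariance matrices), fixed rather than sampled goalposts, and Gaussian regression noise. All function and argument names are hypothetical.

```python
import random

def sample_series(length, p0, trans, means, sds, rng):
    """Sample hidden states from a Markov chain and one change value per step."""
    states, changes = [], []
    s = rng.choices(range(len(p0)), weights=p0)[0]      # initial state from p_0
    for _ in range(length):
        states.append(s)
        changes.append(rng.gauss(means[s], sds[s]))     # emission for state s
        s = rng.choices(range(len(p0)), weights=trans[s])[0]
    return states, changes

def state_fractions(states, k):
    """f: fraction of time steps spent in each of the k states."""
    return [states.count(j) / len(states) for j in range(k)]

def ordinal_from_real(v_real, goalposts):
    """Map the real-valued label v' to 1..L using sorted goalposts (L - 1 of them)."""
    return 1 + sum(v_real > g for g in goalposts)

def generate(length, p0, trans, means, sds, r, sigma_r, goalposts, seed=0):
    rng = random.Random(seed)
    states, changes = sample_series(length, p0, trans, means, sds, rng)
    f = state_fractions(states, len(p0))
    # Real-valued label: regression on the state fractions plus noise.
    v_real = sum(rk * fk for rk, fk in zip(r, f)) + rng.gauss(0.0, sigma_r)
    return changes, states, f, v_real, ordinal_from_real(v_real, goalposts)
```

A series that spends most of its time in a state with a large regression coefficient gets a large real-valued label and therefore a high ordinal label.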
Bayesian Approach
• Probability Density Function of the form
• X - training data set
– Observed values
• Y - hidden variables
– States, hidden label, …
• Θ - model parameters
– State means, covariances, transition matrix, …
Data
• MD Anderson Cancer Center
• Surgical vital sign
– Systolic Blood Pressure
• 3 anesthetists
• 200 time series
• Labels: 1 (stable) to 5 (highly volatile)
Implementation
• Markov chain Monte Carlo
– Iterative process
– Sampling from probability distributions
• Gibbs Sampling
– Conjugate priors
– Rejection Sampling
• Two phases
– Learning model parameters
– Labeling unknown series
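To illustrate how Gibbs sampling with conjugate priors works, here is a toy Python sketch for a much simpler model than AR-OR: estimating the mean and precision of normally distributed data. The priors and their hyperparameters are hypothetical, chosen only because they are conjugate; rejection sampling, which the slide also mentions, is not shown.

```python
import random

def gibbs_normal(data, n_iter=2000, burn_in=500, seed=0,
                 mu0=0.0, tau0=1e-3, a0=1e-3, b0=1e-3):
    """Toy Gibbs sampler for the mean (mu) and precision (tau) of normal data.

    Hypothetical priors: mu ~ Normal(mu0, 1/tau0), tau ~ Gamma(shape=a0, rate=b0).
    Returns the post-burn-in samples of mu.
    """
    rng = random.Random(seed)
    n, total = len(data), sum(data)
    tau = 1.0                      # arbitrary starting value
    mu_samples = []
    for it in range(n_iter):
        # mu | tau, data is Normal (conjugate update of the normal prior).
        prec = tau0 + n * tau
        mean = (tau0 * mu0 + tau * total) / prec
        mu = rng.gauss(mean, prec ** -0.5)
        # tau | mu, data is Gamma (conjugate update of the gamma prior).
        shape = a0 + n / 2.0
        rate = b0 + 0.5 * sum((x - mu) ** 2 for x in data)
        tau = rng.gammavariate(shape, 1.0 / rate)   # second argument is a scale
        if it >= burn_in:
            mu_samples.append(mu)
    return mu_samples
```

Each iteration resamples one variable conditioned on the current value of the other; conjugacy is what makes both conditional distributions easy to sample directly.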
Final Label
• Assign label based on the mode of last n iterations
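A minimal Python sketch of this rule, assuming the sampler's labels are stored in iteration order. Breaking ties in favor of the smallest label is an assumption; the slide does not say how ties are handled.

```python
from collections import Counter

def final_label(sampled_labels, n):
    """Mode of the last n sampled labels; ties go to the smallest label."""
    if n <= 0:
        raise ValueError("n must be positive")
    counts = Counter(sampled_labels[-n:])
    best = max(counts.values())
    return min(label for label, c in counts.items() if c == best)
```

For example, `final_label([1, 1, 1, 3, 4, 3, 3, 2], 5)` looks only at `[3, 4, 3, 3, 2]` and ignores the early samples.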
Comparison
1. Upper Bound – 2 experts predicting 1
2. AR-OR Model*
3. 1NN-DTW
4. 1NN-Complexity-Invariant Distance
5. Linear Regression on variance
6. Guess the most common label
*My model
Results
[Figure: bar chart of results for Upper Bound, AR-OR*, 1NN-DTW, 1NN-CID, Linear Regression, and Guess 3; value axis from 0 to 3; series: All, Outliers]
Current Work
• Other time series without patterns
– ICU
• Expanded model
– Demographics
– Time series features
– Multiple time series
• Direct comparisons
– Demographic data only
– Demographics + 1st and 2nd order features
– Demographics + time series features + time series
• More objective labels
– Length of stay
– Expiration
Next Steps
• Focus on feature selection
• Solving a clinical problem
• Expand model
– History
• Medications
• Lab results
References and Acknowledgements
1. P. Sykacek and S. Roberts, “Bayesian time series classification,” presented at the Advances in Neural Information Processing 14, Boston, MA, 2002, pp. 937–944.
2. P. Yang, G. Dumont, and J. M. Ansermino, “Online pattern recognition based on a generalized hidden Markov model for intraoperative vital sign monitoring,” Int. J. Adapt. Control Signal Process., vol. 24, 2010.
3. G. Meyfroidt, F. Güiza, D. Cottem, W. De Becker, K. Van Loon, J.-M. Aerts, D. Berckmans, J. Ramon, M. Bruynooghe, and G. Van Den Berghe, “Computerized prediction of intensive care unit discharge after cardiac surgery: development and validation of a Gaussian processes model,” BMC Med Inform Decis Mak, vol. 11, p. 64, 2011.
Supported in part by the NSF under grant number 0964526 and by a training fellowship from the Keck Center of the Gulf Coast Consortia, on Rice University’s NLM Training Program in Biomedical Informatics (NLM Grant No. T15LM007093).
Take-aways
• Time series data are difficult to analyze
• Using time series data produces better results than approaches like Linear Regression
• Machine learning approaches can approximate expert assessments
• Opportunity & need for clinical decision support
Provider Labels
Apply Bayes’ Theorem
• To learn the model parameters
• To learn the label for the test time series
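In the notation of the Bayesian Approach slide (X training data, Y hidden variables, Θ model parameters), Bayes' theorem gives the posterior that the sampler targets. The formula on the original slide is not recoverable, so this is the standard form rather than a transcription:

```latex
p(\Theta, Y \mid X)
  \;=\; \frac{p(X, Y \mid \Theta)\, p(\Theta)}{p(X)}
  \;\propto\; p(X, Y \mid \Theta)\, p(\Theta)
```

The denominator p(X) does not depend on Θ or Y, which is why sampling methods only need the right-hand side up to a constant. For a test series, the unknown label can be treated as one more hidden variable and sampled in the same way.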
Autoregression in the AR-OR Model
• Time series values used to determine the state means and variances
• Each state has a set of AR coefficients
• Simplified
– AR(1)
– Coefficients = 1
• Values are the differences between points
MSE – All Test Cases
Provider  Gold Standard  AR-OR  1NN-DTW  1NN-CID  LR    Guess 3
1         0.52           0.50   1.38     0.65     0.63  0.61
2         0.81           0.94   1.25     1.39     1.20  1.01
3         0.58           0.58   1.10     0.89     0.80  0.76
TPR – All Test Cases
Provider  Gold Standard  AR-OR  1NN-DTW  1NN-CID  LR    Guess 3
1         0.57           0.58   0.35     0.50     0.55  0.55
2         0.44           0.35   0.42     0.30     0.35  0.39
3         0.52           0.53   0.39     0.46     0.41  0.41
MSE – Outliers
Provider  Gold Standard  AR-OR  1NN-DTW  1NN-CID  LR    Guess 3
1         2.71           2.08   4.81     2.51     4.00  4.00
2         2.32           1.99   2.20     1.80     3.49  4.00
3         1.68           1.16   3.54     2.15     3.55  4.00
TPR – Outliers
Provider  Gold Standard  AR-OR  1NN-DTW  1NN-CID  LR    Guess 3
1         0.01           0.11   0.05     0.05     0.00  0.00
2         0.00           0.06   0.28     0.41     0.00  0.00
3         0.04           0.06   0.01     0.06     0.00  0.00
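The two measures in these tables can be sketched in a few lines of Python. Two things here are assumptions inferred from the numbers rather than stated in the slides: that TPR means the fraction of labels matched exactly, and that the outlier cases are series labeled 1 or 5. Both are consistent with the Guess 3 column, which shows an MSE of 4.00 and a TPR of 0.00 on outliers. The example labels are made up.

```python
def mse(true_labels, predicted):
    """Mean squared error between ordinal labels treated as integers."""
    return sum((t - p) ** 2 for t, p in zip(true_labels, predicted)) / len(true_labels)

def exact_match_rate(true_labels, predicted):
    """Fraction of predictions equal to the true label (assumed meaning of TPR)."""
    return sum(t == p for t, p in zip(true_labels, predicted)) / len(true_labels)

# Hypothetical outlier labels: every series is at one end of the 1..5 scale.
extreme = [1, 5, 5, 1, 5]
guess = [3] * len(extreme)

print(mse(extreme, guess), exact_match_rate(extreme, guess))
```

Always guessing 3 is off by exactly two labels on every extreme case, so its squared error is 4 and it never matches.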
State Fraction Equation
• Time spent in state S
States for time series i
Indicator function
Length of time series i
State S
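The equation these labels annotate did not survive extraction. A reconstruction consistent with the labels and with the Generative Process notation (M_i for series length, s for states, f for state fractions) would be the following; the double subscript s_{i,j} for the state of series i at step j is an assumed notation:

```latex
f_i(S) \;=\; \frac{1}{M_i} \sum_{j=1}^{M_i} I\!\left(s_{i,j} = S\right)
```

The indicator I(·) is 1 when step j of series i is in state S and 0 otherwise, so f_i(S) is the share of the series spent in state S and the fractions over all states sum to 1.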
Ordinal Regression in the AR-OR Model
Real valued outcome
Number of states
State fraction function
Ordinal regression noise
Regression coefficient for state k
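The equation these labels annotate is missing from the transcript. A reconstruction consistent with the labels and the Generative Process notation (K states, regression coefficients r, real-valued label v’) would be the following; the Gaussian form of the noise, with the regression variance σ²_r, is an assumption:

```latex
v'_i \;=\; \sum_{k=1}^{K} r_k \, f_i(k) \;+\; \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}\!\left(0, \sigma_r^2\right)
```

The ordinal label v is then obtained by locating v’ between consecutive goalposts ω, which is the usual threshold construction for ordinal regression and is assumed here rather than taken from the slide.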
Autoregression
Observed data
Order of the regression
Regression coefficient
Constant
Noise
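The equation these labels annotate is missing from the transcript. The standard autoregressive form matching the labels, written with the Generative Process notation (R for the order, Φ for the coefficients), is:

```latex
x_t \;=\; c \;+\; \sum_{j=1}^{R} \phi_j \, x_{t-j} \;+\; \varepsilon_t
```

With the simplification used in the AR-OR model (R = 1 and φ_1 = 1), this reduces to x_t − x_{t−1} = c + ε_t, which is why the model works on the differences between consecutive points.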
State Assignments
Bootstrapping
• Randomly sample test set with replacement
– 30% of records
• Remaining records are training set
• Repeat
• Alternative to k-fold cross-validation
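One round of the procedure above can be sketched in Python as follows. The reading that the number of draws equals 30% of the records is an assumption, as are the function and parameter names.

```python
import random

def bootstrap_split(n_records, test_fraction=0.3, rng=None):
    """Draw test indices with replacement; records never drawn form the training set."""
    rng = rng or random.Random()
    n_draws = round(test_fraction * n_records)
    test = [rng.randrange(n_records) for _ in range(n_draws)]   # may contain repeats
    drawn = set(test)
    train = [i for i in range(n_records) if i not in drawn]
    return train, test

# Repeat the split several times, as the slide describes.
splits = [bootstrap_split(200, 0.3, random.Random(seed)) for seed in range(5)]
```

Because sampling is with replacement, the test set can contain the same record more than once, so the training set is usually somewhat larger than 70% of the records.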