bridging the gap between applications and tools: modeling multivariate time series
DESCRIPTION
Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series. X Liu, S Swift & A Tucker, Department of Computer Science, Birkbeck College, University of London. MTS Applications at Birkbeck: Screening, Forecasting, Explanation.
TRANSCRIPT
![Page 1: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/1.jpg)
Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series
X Liu, S Swift & A Tucker
Department of Computer Science
Birkbeck College
University of London
![Page 2: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/2.jpg)
![Page 3: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/3.jpg)
![Page 4: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/4.jpg)
MTS Applications at Birkbeck
Screening
Forecasting
Explanation
![Page 5: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/5.jpg)
Forecasting
Predicting Visual Field Deterioration of Glaucoma Patients
Function Prediction for Novel Proteins from Multiple Sequence/Structure Data
![Page 6: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/6.jpg)
Explanation
Input (observations):
t - 0 : Tail Gas Flow in_state 0
t - 3 : Reboiler Temperature in_state 1
Output (explanation):
t - 7 : Top Temperature in_state 0 with probability=0.92
t - 54 : Feed Rate in_state 1 with probability=0.71
t - 75 : Reactor Temperature in_state 0 with probability=0.65
![Page 7: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/7.jpg)
The Gaps
Screening: Automatic / Semi-Automatic Analysis of Outliers
Forecasting: Analysing Short Multivariate Time Series
Explanation: Coping with Huge Search Spaces
![Page 8: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/8.jpg)
The Problem - What/Why/How
Short-Term Forecasting of Visual Field Progression Using a Statistical MTS Model
The Vector Auto-Regressive Process - VAR(P)
There Could be Problems if the MTS is Short
A Modified Genetic Algorithm (GA) can be Used: VARGA
The Prediction of Visual Field Deterioration Plays an Important Role in the Management of the Condition
![Page 9: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/9.jpg)
Background - The Dataset
The interval between tests is about 6 months
Typically, 76 points are measured
The number of tests can range between 10 and 44
Values Range Between 60 = very good and 0 = blind
[Figure: numbered map of the 76 visual field test points (Right Eye), with a legend marking the points used in this paper and the usual position of the blind spot]
![Page 10: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/10.jpg)
Background - The VAR Process
Vector Auto-Regressive Process of Order P: VAR(P)
$$x(t) = \sum_{i=1}^{P} A_i\, x(t-i) + \varepsilon(t)$$
x(t): VF Test for Data Points at Time t (K×1)
A_i: Parameter Matrix at Lag i (K×K)
x(t-i): VF Test for Data Points at lag i from t (K×1)
ε(t): Observational Noise at time t (K×1)
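As a minimal sketch of how a VAR(P) model produces a one-step-ahead forecast, the function below sums the lagged terms A_i x(t-i) and sets the noise term to zero. The function name `var_forecast` and the numeric values are hypothetical and chosen only for illustration; they are not taken from the slides.

```python
def var_forecast(history, coeffs):
    """One-step-ahead VAR(P) forecast with the noise term set to zero.

    history: past observation vectors, oldest first, each of length K.
    coeffs:  [A_1, ..., A_P], each a K x K matrix given as a list of rows.
    """
    k = len(history[-1])
    forecast = [0.0] * k
    for lag, a in enumerate(coeffs, start=1):
        past = history[-lag]  # x(t - lag)
        for row in range(k):
            forecast[row] += sum(a[row][col] * past[col] for col in range(k))
    return forecast


# Hypothetical K = 2, P = 2 example (illustrative values only).
A1 = [[0.5, 0.1], [0.0, 0.4]]
A2 = [[0.2, 0.0], [0.1, 0.1]]
history = [[10.0, 20.0], [12.0, 18.0]]  # x(t-2), x(t-1)
prediction = var_forecast(history, [A1, A2])
```

With K = 76 points and P lags, the model has P × K × K parameters, which suggests why short series could be a problem for conventional estimation.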
![Page 11: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/11.jpg)
The Genetic Algorithm
Generate a Population of random Chromosomes (Solutions)
Repeat for a number of Generations
Cross Over the current Population
Mutate the current Population
Select the Fittest for the next Population
Loop
The best solution to the problem is the Chromosome in the last generation which has the highest Fitness
“A Search/Optimisation method that solves a problem through maintaining and improving a population of suitable candidate solutions using biological metaphors”
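The generate / cross over / mutate / select loop described on this slide might be sketched as follows. This is a generic illustration, not the authors' implementation: the function name `run_ga`, the parameter defaults, the choice to mutate only the children, and the selection rule (keep the fittest of parents and children) are all assumptions, and the fitness used in the example (number of ones in the bit string) is a toy problem.

```python
import random


def run_ga(fitness, n_bits, pop_size=20, generations=50, mp=0.02, seed=0):
    """Toy genetic algorithm over bit-list chromosomes (maximises fitness)."""
    rng = random.Random(seed)
    # Generate a population of random chromosomes.
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Cross over: pair up shuffled parents and swap tails at a random point.
        parents = population[:]
        rng.shuffle(parents)
        children = []
        for a, b in zip(parents[0::2], parents[1::2]):
            x = rng.randint(1, n_bits - 1)
            children.append(a[:x] + b[x:])
            children.append(b[:x] + a[x:])
        # Mutate: invert each bit of each child with probability mp.
        children = [[bit ^ 1 if rng.random() < mp else bit for bit in child]
                    for child in children]
        # Select: keep the pop_size fittest of parents and children.
        ranked = sorted(population + children, key=fitness, reverse=True)
        population = ranked[:pop_size]
    # The best solution is the fittest chromosome in the last generation.
    return max(population, key=fitness)


best = run_ga(fitness=sum, n_bits=16)
```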
![Page 12: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/12.jpg)
GAs - Chromosome Example
X: 0-127, encoded as 0000000-1111111
Y: 0-31, encoded as 00000-11111
Chromosome (X.Y): 0000000.00000-1111111.11111
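The chromosome example packs X (0 to 127, seven bits) and Y (0 to 31, five bits) into a single bit string. A small sketch of that encoding, with the hypothetical helper names `encode` and `decode`, could look like this:

```python
def encode(x, y):
    """Pack x (0-127) and y (0-31) into a 12-bit chromosome string."""
    if not (0 <= x <= 127 and 0 <= y <= 31):
        raise ValueError("x must be in 0-127 and y in 0-31")
    return format(x, "07b") + format(y, "05b")


def decode(chromosome):
    """Split a 12-bit chromosome back into its (x, y) values."""
    return int(chromosome[:7], 2), int(chromosome[7:], 2)


chromosome = encode(5, 3)
```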
![Page 13: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/13.jpg)
GAs - Mutation
Each Bit (gene) of a Chromosome is Given a Chance MP of inverting
A ‘1’ becomes a ‘0’, and a ‘0’ becomes a ‘1’
Before: 01101101
After: 00101111 (the second and seventh bits were inverted)
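Bit mutation with probability MP can be sketched as below. The helper names `mutate` and `flip_bits` are hypothetical; `flip_bits` exists only to reproduce the slide's example deterministically, since real mutation is random.

```python
import random


def invert(bit):
    """Return '0' for '1' and '1' for '0'."""
    return "1" if bit == "0" else "0"


def mutate(chromosome, mp, rng=random):
    """Invert each bit of the chromosome independently with probability mp."""
    return "".join(invert(bit) if rng.random() < mp else bit
                   for bit in chromosome)


def flip_bits(chromosome, positions):
    """Invert the bits at the given 0-based positions."""
    return "".join(invert(bit) if i in positions else bit
                   for i, bit in enumerate(chromosome))


# The slide's example: the second and seventh bits of 01101101 are inverted.
mutated = flip_bits("01101101", {1, 6})
```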
![Page 14: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/14.jpg)
GAs - Crossover (2)
Parents: A = 01011101, B = 11101010
Crossover point: X = 4
Children: C = 01011010, D = 11101101
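The crossover example on this slide (parents A and B, crossover point X = 4, children C and D) corresponds to single-point crossover, sketched here with a hypothetical function name:

```python
def crossover(a, b, x):
    """Single-point crossover: swap the tails of a and b after position x."""
    return a[:x] + b[x:], b[:x] + a[x:]


# Parents and crossover point from the slide.
c, d = crossover("01011101", "11101010", 4)
```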
![Page 15: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/15.jpg)
VARGA - Representation
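Chromosome: A_1, A_2, ..., A_m, ..., A_p
Each A_m is a K×K parameter matrix with elements a^m_{ij}:
$$A_m = \begin{pmatrix} a^m_{11} & \cdots & \cdots \\ \cdots & a^m_{ij} & \cdots \\ \cdots & \cdots & a^m_{KK} \end{pmatrix}, \qquad m = 1, \dots, p$$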
![Page 16: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/16.jpg)
VARGA - The Genetic Algorithm
GA With Extra Mutation
Order Mutation After Gene Mutation
Parents and Children Mutate (Both)
Genes are Bound Natural Numbers
Fitness is -ve Forecast Error
Minimisation Problem - Roulette Wheel
Run for EACH Patient
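The slide states that fitness is the negative forecast error and that selection uses a roulette wheel. One way this might be sketched is shown below. Roulette-wheel selection needs non-negative weights, so the negative errors are shifted by the worst fitness; that shift, the small constant, and the function name `roulette_select` are assumptions for illustration and are not taken from the slides.

```python
import random


def roulette_select(errors, n, rng=random):
    """Pick n indices, favouring chromosomes with a lower forecast error."""
    fitness = [-e for e in errors]      # fitness is the negative error
    worst = min(fitness)
    # Shift so every weight is non-negative; the worst gets a tiny weight.
    weights = [f - worst + 1e-9 for f in fitness]
    return rng.choices(range(len(errors)), weights=weights, k=n)


# Hypothetical forecast errors for three chromosomes.
picks = roulette_select([10.0, 20.0, 40.0], 1000, random.Random(1))
```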
![Page 17: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/17.jpg)
Evaluation - Methods for Comparison
SPlus: Yule Walker Equations, AIC and Whittle's Recursion, NK(P+1), Standard Package
Holt-Winters Univariate Forecasting Method: Is the Data Univariate? (GA Solution)
Pure Noise Model: VAR(0), Worst Case Forecast (Non-Differenced = 0)
54 out of the Possible 82 Patients' VF Records Could not be Used: SPlus Implementation
![Page 18: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/18.jpg)
Results - Graph Comparison
[Figure: Scores for Cases 0 to 6. x-axis: Case Number (0 to 6); y-axis: Score (0 to 2000); series: HW, S-Plus, VARGA, Noise]
The Lower the Score - the Better
Score is the One Step Ahead Forecast Error
![Page 19: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/19.jpg)
Results - Table Summary
Average = The Average One Step Forecast Error for the 28 Patients (Both GA’s Fitness)
(The Lower - The Better)

| Method | Order (number of patients at each order) | Average Score |
|---|---|---|
| VARGA | 26 of 1, 2 of 2 | 559.82 |
| S-Plus | 12 of 0, 14 of 1, 1 of 2, 1 of 3 | 616.12 |
| HW | N/A | 683.79 |
| Noise | 28 of 0 | 816.53 |
![Page 20: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/20.jpg)
Conclusion - Results
VARGA Has a Better Performance
VARGA Can Model Short MTS
The Visual Field Data is Definitely Multivariate
Data Has a High Proportion of Noise
![Page 21: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/21.jpg)
Conclusion - Remarks
Non-Linear Methods and Transformations
Performance Enhancements for the GA
Improve Crossover
Irregularly Spaced Methods
Space-Time Series Methods
Time Dependent Relationships Between Variables
![Page 22: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/22.jpg)
Generating Explanations in MTS
Useful to know probable explanations for a given set of observations within a time series
E.g. Oil Refinery: ‘Why has a temperature become high whilst a pressure has fallen below a certain value?’
A possible paradigm which facilitates Explanation is the Bayesian Network
Evolutionary Methods to learn BNs
Extend work to Dynamic Bayesian Networks
![Page 23: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/23.jpg)
Dynamic Bayesian Networks
Static BNs repeated over t time slices
Contemporaneous / Non-Contemporaneous Links
Used for Prediction / Diagnosis within dynamic systems
$$P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid \pi_i)$$
where π_i denotes the parents of X_i
![Page 24: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/24.jpg)
Assumptions - 1
Assume all variables take at least one time slice to impose an effect on another.
The more frequently a system generates data, the more likely this will be true.
Contemporaneous Links can be excluded from the DBN
The variables at time t will be considered independent of one another
![Page 25: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/25.jpg)
Representation
P pairs of the form (ParentVar, TimeLag)
Each pair represents a link from a node at a previous time slice to the node in question at time t.
Examples:
Variable 1: { (1,1); (2,2); (0,3) }
Variable 4: { (4,1); (2,5) }
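The (ParentVar, TimeLag) representation can be held in a simple mapping from each variable to its list of pairs. The sketch below uses the two examples from the slide; the names `parents` and `links_into` are hypothetical, and the choice of t = 10 is arbitrary.

```python
# Parent lists from the slide's examples: variable -> [(ParentVar, TimeLag)].
parents = {
    1: [(1, 1), (2, 2), (0, 3)],
    4: [(4, 1), (2, 5)],
}


def links_into(variable, t, parents):
    """List the links ((parent, parent_time), (variable, t)) into a node."""
    return [((p, t - lag), (variable, t))
            for p, lag in parents.get(variable, [])]


# Links into variable 1 at an arbitrary time slice t = 10.
links = links_into(1, 10, parents)
```

Because every TimeLag is at least 1, no link connects two nodes in the same time slice, which matches the assumption that contemporaneous links are excluded.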
![Page 26: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/26.jpg)
Search Space
Given the first assumption and proposed representation the Search Space for each variable will be:
$$2^{\text{MaxLag} \times N}$$
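To give a feel for the size of this search space, the sketch below evaluates 2 to the power MaxLag × N, assuming N is the number of variables (each of the MaxLag × N possible parent/lag pairs is either present or absent). The figures of 11 variables and lags of up to 120 one-minute steps come from the oil refinery slide; the function name is hypothetical.

```python
def search_space_size(n_variables, max_lag):
    """Number of candidate parent sets for one variable: 2^(MaxLag * N)."""
    return 2 ** (n_variables * max_lag)


# Oil refinery setting from the slides: 11 variables, lags up to 120 minutes.
size = search_space_size(11, 120)
exponent = size.bit_length() - 1  # recovers MaxLag * N = 1320
```

Exhaustive enumeration is clearly out of reach at this scale, which is consistent with the deck's use of evolutionary algorithms and hill climbing for structure search.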
![Page 27: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/27.jpg)
Algorithm
Multivariate Time Series
Structure Search: Evolutionary Algorithms, Hill Climbing etc.
Parameter Calculation given structure
Dynamic Bayesian Network Library for Different Operating States
Explanation Algorithm (e.g. using Stochastic Simulation)
User
![Page 28: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/28.jpg)
Generating Synthetic Data
(1)
(2)
![Page 29: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/29.jpg)
Oil Refinery Data
Data recorded every minute
Hundreds of variables
Selected 11 interrelated variables
Discretised each variable into k states
Large Time Lags (up to 120 minutes between some variables)
Different Operating States
![Page 30: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/30.jpg)
Results
[Figure: node labels SOT, FF, TGF, TT, RinT]
![Page 31: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/31.jpg)
Explanations - using Stochastic Simulation
![Page 32: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/32.jpg)
Explanations - using Stochastic Simulation
[Figure: P(y=1) (y-axis, 0 to 1) plotted against time-x (x-axis, 1 to 97); series: SOF-SP, SOT, TT, BPF-SP, BPF]
![Page 33: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/33.jpg)
Explanation
Input (observations):
t - 0 : Tail Gas Flow in_state 0
t - 3 : Reboiler Temperature in_state 1
Output (explanation):
t - 7 : Top Temperature in_state 0 with probability=0.92
t - 54 : Feed Rate in_state 1 with probability=0.71
t - 75 : Reactor Temperature in_state 0 with probability=0.65
![Page 34: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/34.jpg)
Future Work
Exploring the use of different searches and metrics
Improving accuracy (e.g. different discretisation policies, continuous DBNs)
Using the library of DBNs in order to quickly classify the current state of a system
Automatically Detecting Changing Dependency Structure
![Page 35: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/35.jpg)
Acknowledgements
BBSRC
BP-AMOCO
British Council for Prevention of Blindness
EPSRC
Honeywell Hi-Spec Solutions
Honeywell Technology Center
Institute of Ophthalmology
Moorfields Eye Hospital
MRC
![Page 36: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/36.jpg)
Intelligent Data Analysis
X Liu
Department of Computer Science
Birkbeck College
University of London
![Page 37: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/37.jpg)
Intelligent Data Analysis
An interdisciplinary study concerned with effective analysis of data
Intelligent application of data analytic tools
Application of “intelligent” data analytic tools
![Page 38: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/38.jpg)
IDA Requires
Careful thinking at every stage of an analysis process (strategic aspects)
Intelligent application of relevant domain knowledge
Assessment and selection of appropriate analysis methods
![Page 39: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/39.jpg)
IDA Conferences
IDA-95, Baden-Baden
IDA-97, London
IDA-99, Amsterdam
IDA-2001, Lisbon
![Page 40: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/40.jpg)
IDA in Medicine and Pharmacology
IDAMAP-96, Budapest
IDAMAP-97, Nagoya
IDAMAP-98, Brighton
IDAMAP-99, Washington DC
IDAMAP-2000, Berlin
![Page 41: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/41.jpg)
Other IDA Activities
IDA Journal (Elsevier 1997)
Journal Special Issues (1997 -)
Introductory Books (Springer 1999)
The Dagstuhl Seminar (Germany 2000)
European Summer School (Italy 2000)
Special Sessions at Conferences
![Page 42: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/42.jpg)
Concluding Remarks
Strategies for data analysis and mining
Strategies for human-computer collaboration in IDA
Principles for exploring and analysing “big data”
Benchmarking interesting real-world data-sets as well as computational methods
A long term interdisciplinary effort
![Page 43: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/43.jpg)
The Screening Architecture
![Page 44: Bridging the Gap between Applications and Tools: Modeling Multivariate Time Series](https://reader035.vdocument.in/reader035/viewer/2022070401/56813767550346895d9efa40/html5/thumbnails/44.jpg)
Results from a GP Clinic