

Study of Learning Entropy for Onset Detection of Epileptic Seizures in EEG Time Series

Ivo Bukovsky, Matous Cejnek, Jan Vrba
Department of Instrumentation and Control
Faculty of Mechanical Engineering, CTU in Prague
Prague, Czech Republic
[email protected]

Noriyasu Homma
Tohoku University Graduate School of Medicine
Sendai, Japan
[email protected]

Abstract—This paper presents a case study of a non-Shannon entropy, i.e., Learning Entropy (LE), for instant detection of the onset of epileptic seizures in individual EEG time series. Contrary to entropy methods of EEG evaluation that are based on probabilistic computations, we present an LE-based approach that evaluates the conformity of individual samples of data to the contemporarily learned governing law of a learning system; thus, LE can detect changes of dynamics on individual samples of data. For comparison, the principle and the results are compared to the Sample Entropy approach. The promising results indicate the potential of LE for feature extraction enhancement toward early detection of epileptic seizures on an individual-data-sample basis.

Keywords—adaptive novelty detection; Sample Entropy; non-Shannon entropy; Learning Entropy; EEG time series; epileptic seizure; onset detection; incremental learning; higher order neural units

I. INTRODUCTION

Further advances in feature extraction and prediction (or at least as early as possible detection) of onsets of events in the behavior of dynamical systems, such as epileptic seizures, life-threatening arrhythmias in the cardiovascular system, fault or perturbation detection in technical control systems, etc., are fundamental and still intriguing tasks. Such events can be detected by novelty detection approaches that are, in principle, either statistical ones, e.g., [1], or learning-system ones, e.g., neural network approaches [2]. As regards the evaluation of EEG for epileptic seizure detection, we can refer, e.g., to [3], [4] for probabilistic entropy-based approaches and, e.g., to [5], [6] for learning-system approaches. The most common terms that sound similar to Learning Entropy (LE) [7]–[9], which is proposed for EEG in this paper, though they are distinct from LE in principle, are as follows:

• Shannon-based entropies are data-window and probabilistic-based computations that are widely used for time series analyses.

• Sample Entropy is a signal complexity evaluation algorithm (a floating-window, probability-based quantification of signal complexity) [10], [11].

• Entropy Learning is a Shannon-inspired neural network learning algorithm based on minimizing the complexity (entropy) of neural weights in a network.

• Learning Entropy is a recently introduced non-Shannon novelty detection algorithm based on the observation of unusual learning effort of incrementally learning systems. Alternatively, LE is a relative measure of novelty (information) recognized as the unusual learning effort of a pre-trained learning system on individual data samples [7].

To highlight and compare the principle of LE to other currently fundamental approaches, we discuss the distinction of LE from a representative of probabilistic approaches, i.e., the Sample Entropy (SampEn) [11]–[14], including comparisons on the same EEG data sets with epileptic seizures.

Unless stated otherwise, italic lowercase letters (e.g., $y$) stand for scalars, lowercase bold letters (e.g., $\mathbf{x}$) stand for vectors, and bold capitals (e.g., $\mathbf{W}$) denote matrices or higher-dimensional arrays. A small letter such as $n$ usually denotes the length of a corresponding vector; time indexes appear in round brackets, as in $y(k)$, when necessary for clarity; and array indexes come as lower indexes, e.g., $w_i$.

II. APPLIED APPROACHES

A. Sample Entropy

If we denote the full length of the time series $y$ to be $N$, then the SampEn trajectory can be calculated for the (discrete) time range $k = n_w, \ldots, N$. The trajectory of SampEn is calculated as the SampEn of sliding windows within the time series $y$, i.e., for each window

$\mathbf{y}_w(k) = \left[ y(k-n_w+1), \, y(k-n_w+2), \, \ldots, \, y(k) \right]$ , (1)

where $n_w$ is the length of the sliding window. Then, we calculate the point of the trajectory according to the SampEn definition as follows

$\mathrm{SampEn}(k) = -\ln \dfrac{A(k)}{B(k)}$ , (2)

where $B(k)$ is the number of template-vector pairs within the window that are closer than a tolerance $r$,

$B(k) = \#\left\{ (i,j),\ i \neq j : \left\| \mathbf{x}_m(i) - \mathbf{x}_m(j) \right\| < r \right\}$ , (3)

where $\mathbf{x}_m(i) = [y(i), \ldots, y(i+m-1)]$ are embedded vectors of length $m$, $\|\cdot\|$ denotes the Euclidean norm, and then $A(k)$ is the analogous count for templates extended by one sample,

$A(k) = \#\left\{ (i,j),\ i \neq j : \left\| \mathbf{x}_{m+1}(i) - \mathbf{x}_{m+1}(j) \right\| < r \right\}$ , (4)

where

$\mathbf{x}_{m+1}(i) = [y(i), \ldots, y(i+m)]$ . (5)

It is apparent that the principle of SampEn is a window-based (probabilistic) approach that evaluates the change of dynamics over a whole interval of the time series rather than at individual samples of data. Next, the principle of Learning Entropy is reviewed, showing that LE evaluates the novelty of each new individual sample of data.
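For illustration, a minimal NumPy sketch of Eqs. (2)–(5) for a single window may look as follows; the tolerance choice $r = 0.2\,\mathrm{std}(\mathbf{y}_w)$ is a common default assumed here, as it is not specified above.

```python
import numpy as np

def sample_entropy(y, m=2, r=None):
    """SampEn of a single window y per Eqs. (2)-(5): -ln(A/B), where
    B counts pairs of m-length templates closer than r (Euclidean norm)
    and A counts the same for (m+1)-length templates."""
    y = np.asarray(y, dtype=float)
    if r is None:
        r = 0.2 * np.std(y)  # common tolerance choice (an assumption here)

    def count_pairs(mm):
        # embed the window into overlapping mm-dimensional templates
        t = np.array([y[i:i + mm] for i in range(len(y) - mm + 1)])
        d = np.linalg.norm(t[:, None, :] - t[None, :, :], axis=2)
        # count pairs i != j closer than r (subtract the self-matches)
        return np.sum(d < r) - len(t)

    B = count_pairs(m)      # Eq. (3)
    A = count_pairs(m + 1)  # Eq. (4)
    return -np.log(A / B) if A > 0 and B > 0 else np.inf
```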

B. Learning Entropy for Adaptive Novelty Detection

This section reviews the approximate LE algorithm, more precisely the Approximate Individual Sample Learning Entropy (AISLE) [7], further shortly LE, for incrementally learning systems with supervised learning. The sample-by-sample adaptation in principle allows LE to perform novelty detection, and thus information evaluation, on every individual sample of data.

The novelty detection in EEG time series via LE is here based on the evaluation of an incrementally learning predictor

$\tilde{y}(k+h) = f\left( \mathbf{x}(k), \mathbf{w}(k) \right)$ , (6)

where $\mathbf{w}$ is a long vector of all adaptable parameters (neural weights), $h$ is the prediction horizon, and $\mathbf{x}$ is the augmented input vector

$\mathbf{x}(k) = \left[ 1, \, y(k), \, y(k-1), \, \ldots, \, y(k-n+1) \right]^{\mathrm{T}}$ , (7)

where $y(k), \ldots, y(k-n+1)$ are measured values and the augmenting unit $x_0 = 1$ allows for an absolute term in the model, which can function as a neural bias in neural network models; furthermore, it allows a HONU to involve lower-polynomial terms (see subsection II.C). After each new sample, the weights are incrementally updated as

$\mathbf{w}(k+1) = \mathbf{w}(k) + \Delta\mathbf{w}(k)$ , (8)

and the recent average magnitude of each learning increment at time $k$ is calculated as

$\overline{\left| \Delta w_i(k) \right|} = \dfrac{1}{m} \displaystyle\sum_{j=1}^{m} \left| \Delta w_i(k-j) \right|$ , (9)

where $\Delta\mathbf{w}$ is a vector of learning increments of all adaptable parameters and $m$ is the length of the recent learning history. The Learning Entropy is then evaluated as the ratio of unusually large magnitudes of learning increments with respect to the recent learning history as follows

$E_A(k) = \dfrac{1}{n_\alpha \cdot n_w} \displaystyle\sum_{\alpha \in \boldsymbol{\alpha}} \sum_{i=1}^{n_w} \left\lceil \left| \Delta w_i(k) \right| > \alpha \cdot \overline{\left| \Delta w_i(k) \right|} \right\rceil$ , (10)

where $\lceil \cdot \rceil$ equals one if the inequality holds and zero otherwise, $n_w$ is the number of weights, and $\boldsymbol{\alpha} = [\alpha_{\min}, \ldots, \alpha_{\max}]$ is a vector of $n_\alpha$ detection sensitivities that overcomes the otherwise single-scale nature of this adaptive novelty detection.
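A minimal NumPy sketch of Eqs. (9)–(10) may look as follows; the learning increments are assumed to be logged by the learning rule of subsection II.C, and the particular sensitivity values are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def learning_entropy(dW, m=20, alphas=(2.0, 4.0, 6.0, 8.0)):
    """AISLE per Eqs. (9)-(10): for every sample k, the fraction of
    (sensitivity, weight) pairs for which |dw_i(k)| exceeds alpha times
    its recent average magnitude. dW has shape (N, n_w): the learning
    increments of all n_w weights for each of the N samples."""
    dW = np.abs(np.asarray(dW, dtype=float))
    N, n_w = dW.shape
    le = np.zeros(N)
    for k in range(m, N):
        recent = dW[k - m:k].mean(axis=0)                  # Eq. (9)
        hits = sum(np.sum(dW[k] > a * recent) for a in alphas)
        le[k] = hits / (len(alphas) * n_w)                 # Eq. (10)
    return le
```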

C. Applied Learning Systems

The learning capability of a learning system, and also the quality of its pretraining, is crucial for the use of LE (e.g., [7]). Two types of fundamental neural network architectures are investigated here for the purpose of seizure onset detection, and their details are given in Tab. 1. These are the static LNU and QNU (i.e., HONUs of polynomial orders $r=1$ and $r=2$), where $\mathbf{w}$ represents a correspondingly long vector of all weights of a HONU [15]–[17].

Tab. 1: Mathematical Details of the Used Neural Models

neural model | details of neural model input-output mapping
LNU ($r=1$)  | $\tilde{y} = \sum_{i=0}^{n} w_i x_i = \mathbf{w} \cdot \mathbf{x}$
QNU ($r=2$)  | $\tilde{y} = \sum_{i=0}^{n} \sum_{j=i}^{n} w_{ij} x_i x_j$, i.e., $\mathbf{w}$ multiplied by the long column vector of all terms $x_i x_j,\ j \ge i$

The LE algorithm has shown interesting performance in novelty detection with HONUs ($r=1,2$) with a fundamental learning rule such as pure gradient descent [7]–[9], which is also implemented in this study; i.e., defining the error $e(k) = y(k) - \tilde{y}(k)$, we calculate the weight increments for the LE algorithm and for the above HONUs as follows

$\Delta \mathbf{w}(k) = \mu \, e(k) \, \dfrac{\partial \tilde{y}(k)}{\partial \mathbf{w}}$ , (11)

where $\mu$ is the learning rate; for the LNU, the partial derivative reduces to $\mathbf{x}(k)$, and for the QNU to the column vector of quadratic terms from Tab. 1.

Fig. 1: Comparison of SampEn and LE (of orders 1-4) on dataset 1.

Fig. 2: Comparison of SampEn and LE (of orders 1-4) on dataset 2.

In the following experimental analysis, it is shown that even a linear adaptive predictor (LNU) displays interesting performance for LE, so it was not necessary to implement a nonlinear predictor this time.
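For clarity, the mappings of Tab. 1 and the increment rule (11) can be sketched as follows; the function names are illustrative, not taken from any published implementation.

```python
import numpy as np

def lnu_predict(w, x):
    """LNU (r = 1): linear mapping from Tab. 1, y~ = w . x."""
    return np.dot(w, x)

def quadratic_terms(x):
    """Long column of the upper-triangular products x_i * x_j (j >= i),
    i.e., the quadratic augmentation used by the QNU in Tab. 1."""
    i, j = np.triu_indices(len(x))
    return x[i] * x[j]

def qnu_predict(w, x):
    """QNU (r = 2): y~ = sum_i sum_{j>=i} w_ij * x_i * x_j."""
    return np.dot(w, quadratic_terms(x))

def gd_increment(mu, e, x_terms):
    """Eq. (11): pure gradient-descent increment dw = mu * e * dy~/dw,
    where x_terms is x itself for the LNU or quadratic_terms(x) for
    the QNU."""
    return mu * e * x_terms
```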

III. EXPERIMENTAL ANALYSIS

For the experimental study, we used EEG measured on a sewer rat suffering from epilepsy. To avoid high dimensionality of the input vector due to high sampling, the EEG recordings were pragmatically re-sampled from the original fast sampling of 5000 samples per second (s.p.s.) to 500 s.p.s., which was found in practice to be the longest sampling period at which no significant changes were visually observed in the signal due to re-sampling. Before re-sampling, the data was filtered with a 0.1-2200 Hz bandpass filter.
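A sketch of this preprocessing with SciPy might look as follows; the Butterworth design and the filter order are assumptions, as only the passband and the two sampling rates are stated above.

```python
from scipy import signal

FS_RAW, FS_NEW = 5000, 500  # original and target sampling rates (s.p.s.)

def preprocess(eeg, fs=FS_RAW):
    """Bandpass-filter and re-sample the raw recording as described in
    the text (a sketch; the 4th-order Butterworth is an assumption)."""
    # 0.1-2200 Hz bandpass (the upper edge stays below fs/2 = 2500 Hz)
    sos = signal.butter(4, [0.1, 2200.0], btype="bandpass", fs=fs,
                        output="sos")
    filtered = signal.sosfiltfilt(sos, eeg)
    # 10x decimation: 5000 -> 500 s.p.s., with built-in anti-aliasing
    return signal.decimate(filtered, fs // FS_NEW)
```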

A. Sample Entropy for Epileptic Seizure Onset Detection

The trajectory of SampEn was calculated for the EEG data sets to compare its capability to detect the onsets of epileptic seizures. The SampEn was calculated for a window of 200 samples and an input (template) vector size of 2 samples. The results are shown in Fig. 1 and Fig. 2. The Python PyEEG package was used to implement the SampEn [18].
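With the sample_entropy() sketch from subsection II.A (the study itself used PyEEG [18]), the SampEn trajectory with this setup can be computed, e.g., as follows; the eeg array below is a random placeholder standing in for a re-sampled recording.

```python
import numpy as np

eeg = np.random.randn(2000)         # placeholder for a re-sampled recording
n_w, m = 200, 2                     # window and template sizes from the text
sampen = np.full(len(eeg), np.nan)  # NaN where the window is not yet full
for k in range(n_w - 1, len(eeg)):
    window = eeg[k - n_w + 1:k + 1]           # Eq. (1)
    sampen[k] = sample_entropy(window, m=m)   # Eqs. (2)-(5)
```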

B. Learning Entropy for Epileptic Seizure Onset Detection

The LE was estimated for a static LNU (first-order HONU) prediction model. The predictor was pre-trained on the first 200 samples in 30 epochs. The learning rate $\mu$ and the input vector length $n$ were fixed for all experiments, and the prediction horizon was set to one sample ahead. The LE was estimated for a history of $m = 20$ samples with a multi-scale setup of detection sensitivities $\boldsymbol{\alpha}$. The results are shown in Fig. 1 and Fig. 2.

IV. DISCUSSION

The results in Fig. 1 and Fig. 2 on the two exemplar EEG recordings with the onset of epilepsy are interesting from the point of view of the observed behavior of SampEn vs. LE. While SampEn was rather settled before the onset of epilepsy for dataset 1, it oscillated substantially for dataset 2. It is interesting that LE with the linear predictor (LNU) is settled before the onset of epilepsy in both cases. Theoretically, this might be because the learning system was able to find a governing law in both recordings prior to the epilepsy onset, so the variations of LE were much smaller than the variations of SampEn, and the LE profile was kept almost constant in both cases prior to the onset. To support this assumption, we plot the Recurrence Plot (RP) [4], [19] as the scatter graph of the RP matrix whose points are defined as follows

$\mathrm{RP}_{i,j} = \left\lceil \left\| \mathbf{x}(i) - \mathbf{x}(j) \right\| < \varepsilon \right\rceil$ . (12)

The RP thus compares the periodicity of embedded states of the time series and can give us some notion for a complexity comparison of both EEG signals prior to the onsets of epilepsy. Similarly, we propose to draw another RP-inspired graph, the so-called False Neighbor Plot (FNP). The FNP visualizes false neighbors from the point of view of the input-output mapping of a learning system, i.e., the predictor (6), so it visualizes the matrix whose elements are defined as follows

$\mathrm{FNP}_{i,j} = \left\lceil \left\| \mathbf{x}(i) - \mathbf{x}(j) \right\| < \varepsilon \right\rceil \cdot \left\lceil \left| y(i) - y(j) \right| > \varepsilon_y \right\rceil$ , (13)

i.e., a point is drawn where two input vectors are close to each other while their corresponding outputs differ considerably.

Thus, the FNP gives us a notion of the uncertainty in input-output data for the given setups. Both RP and FNP were calculated for both datasets with the same setups and the same embedding dimension. As follows from their brief comparison in Fig. 3 and Fig. 4 in the Appendix, the complexity of both EEG signals prior to the onset of epilepsy appears similar, which supports the conclusion that both dynamical behaviors could indeed be learned similarly by a learning system prior to the onset, so both Learning Entropy profiles can indeed be settled before the onset of epilepsy.
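Both plots reduce to simple pairwise-distance computations; a minimal sketch of the reconstructed definitions (12) and (13) may look as follows. The embedding dimension and the thresholds are kept as parameters, since their particular values are assumptions here; note that the pairwise matrices require O(N^2) memory, so the plots are best computed on the pre-onset segments.

```python
import numpy as np

def embed(y, n):
    """Overlapping n-dimensional embedding vectors x(i) of the series y."""
    y = np.asarray(y, dtype=float)
    return np.array([y[i:i + n] for i in range(len(y) - n)])

def recurrence_plot(y, n, eps):
    """RP per Eq. (12): 1 where two embedded states are closer than eps."""
    x = embed(y, n)
    d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    return (d < eps).astype(int)

def false_neighbor_plot(y, n, eps, eps_y):
    """FNP per Eq. (13): 1 where two input vectors of predictor (6) are
    closer than eps while their outputs differ by more than eps_y."""
    x = embed(y, n)                          # inputs x(i) = [y(i)..y(i+n-1)]
    t = np.asarray(y, dtype=float)[n:]       # corresponding outputs y(i+n)
    d_in = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=2)
    d_out = np.abs(t[:, None] - t[None, :])
    return ((d_in < eps) & (d_out > eps_y)).astype(int)
```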

V. CONCLUSION

In this paper, we reviewed the adaptive novelty detection algorithm Learning Entropy (i.e., entropy of learning) and demonstrated its principal distinction from Sample Entropy (SampEn) in application to the detection of epileptic seizure onsets in EEG signals. LE showed visually better performance on the tested data than SampEn. Another advantage is that LE evaluates the novelty of every individual sample of data, while the contemporary system behavior is encoded in the neural weights of a learning system. Thus, the LE detection could be performed here, e.g., with a floating window of 20 samples, in comparison to SampEn, where the window has to be significantly longer (at least 200 samples are recommended for EEG). The computation time of LE with adaptive linear filters was more than 100 times shorter for all EEG recordings than the computation time of a common implementation of SampEn on the same platform (both Python, i5). This paper presented a new method for adaptive novelty detection in EEG using learning systems, its principle, and its potentials. In the future, a more extensive study shall be made, including more challenging multi-channel EEG recordings.

ACKNOWLEDGEMENT

This work was supported by the Grant Agency of the Czech Technical University in Prague, grant No. SGS15/189/OHK2/3T/12, and also by JSPS KAKENHI Grants No. 25293258 and No. 26540112.

REFERENCES

[1] M. Markou and S. Singh, “Novelty detection: a review—part 1: statistical approaches,” Signal Process., vol. 83, no. 12, pp. 2481–2497, Dec. 2003.

[2] M. Markou and S. Singh, “Novelty detection: a review—part 2: neural network based approaches,” Signal Process., vol. 83, no. 12, pp. 2499–2521, Dec. 2003.

[3] V. Srinivasan, C. Eswaran, and N. Sriraam, “Approximate Entropy-Based Epileptic EEG Detection Using Artificial Neural Networks,” IEEE Trans. Inf. Technol. Biomed., vol. 11, no. 3, pp. 288–295, May 2007.

[4] G. Ouyang, X. Li, C. Dang, and D. A. Richards, “Using recurrence plot for determinism analysis of EEG recordings in genetic absence epilepsy rats,” Clin. Neurophysiol., vol. 119, no. 8, pp. 1747–1755, Aug. 2008.

[5] A. Petrosian, D. Prokhorov, R. Homan, R. Dasheiff, and D. Wunsch, “Recurrent neural network based prediction of epileptic seizures in intra- and extracranial EEG,” Neurocomputing, vol. 30, no. 1–4, pp. 201–218, Jan. 2000.

[6] A. H. Shoeb, “Application of machine learning to epileptic seizure onset detection and treatment,” Thesis, Massachusetts Institute of Technology, 2009.

[7] I. Bukovsky, “Learning Entropy: Multiscale Measure for Incremental Learning,” Entropy, vol. 15, no. 10, pp. 4159–4187, Sep. 2013.

[8] I. Bukovsky, C. Oswald, M. Cejnek, and P. M. Benes, “Learning entropy for novelty detection: a cognitive approach for adaptive filters,” in Sensor Signal Processing for Defence (SSPD), 2014, pp. 1–5.

[9] I. Bukovsky, N. Homma, M. Cejnek, and K. Ichiji, “Study of Learning Entropy for Novelty Detection in lung tumor motion prediction for target tracking radiation therapy,” in 2014 International Joint Conference on Neural Networks (IJCNN), 2014, pp. 3124–3129.

[10] S. M. Pincus, “Approximate entropy as a measure of system complexity,” Proc. Natl. Acad. Sci. U. S. A., vol. 88, no. 6, pp. 2297–2301, Mar. 1991.

[11] J. S. Richman and J. R. Moorman, “Physiological time-series analysis using approximate entropy and sample entropy,” Am. J. Physiol. Heart Circ. Physiol., vol. 278, no. 6, pp. H2039–H2049, Jun. 2000.

[12] D. E. Lake, J. S. Richman, M. P. Griffin, and J. R. Moorman, “Sample entropy analysis of neonatal heart rate variability,” Am. J. Physiol. Regul. Integr. Comp. Physiol., vol. 283, no. 3, pp. R789–R797, Sep. 2002.

[13] J. M. Yentes, N. Hunt, K. K. Schmid, J. P. Kaipust, D. McGrath, and N. Stergiou, “The Appropriate Use of Approximate Entropy and Sample Entropy with Short Data Sets,” Ann. Biomed. Eng., vol. 41, no. 2, pp. 349–365, Feb. 2013.

[14] I. Voicu and J.-M. Girault, “Multi-scale sample entropy and recurrence plots distinguish healthy from suffering foetus,” in Acoustics 2012, 2012.

[15] M. M. Gupta, L. Jin, and N. Homma, Static and Dynamic Neural Networks: From Fundamentals to Advanced Theory. New York: Wiley, 2003.

[16] M. M. Gupta, N. Homma, Z.-G. Hou, M. Solo, and I. Bukovsky, “Higher order neural networks: fundamental theory and applications,” in Artificial Higher Order Neural Networks for Computer Science and Engineering: Trends for Emerging Applications, 2010, pp. 397–422.

[17] M. M. Gupta et al., “Fundamentals of Higher Order Neural Networks for Modeling and Simulation,” in Artificial Higher Order Neural Networks for Modeling and Simulation, M. Zhang, Ed. Hershey, PA, USA: IGI Global, 2013, pp. 103–133.

[18] F. S. Bao, X. Liu, and C. Zhang, “PyEEG: An Open Source Python Module for EEG/MEG Feature Extraction,” Comput. Intell. Neurosci., vol. 2011, 2011.

[19] J.-P. Eckmann, S. O. Kamphorst, and D. Ruelle, “Recurrence plots of dynamical systems,” Europhys. Lett., vol. 4, no. 9, pp. 973–977, 1987.

APPENDIX

Fig. 3: The Recurrence Plots with the same setups (same embedding dimension) visually indicate that there is no significant difference in complexity (in the sense of periodicity) between the two datasets prior to the onset of epilepsy.

Fig. 4: The False Neighbor Plots with the same setups (same input embedding) visually indicate that there is no significant difference in complexity (in the sense of input-output mapping uncertainty) between the two datasets prior to the onset of epilepsy.