POLITECNICO DI MILANO School of Industrial and Information Engineering

Master of Science in Biomedical Engineering

IMPLEMENTATION AND TESTING OF ATRIAL FIBRILLATION DETECTORS FOR A MOBILE PHONE APPLICATION

Supervisors: Prof. Luca Mainardi, Prof. Gari Clifford

Roberta Colloca 769961

Academic year 2011/2012

Abstract

Atrial Fibrillation (AF) is the most common sustained heart rhythm disorder. It is associated with an increased risk of hospitalization, stroke and death, and its incidence is expected to rise with the ageing of the population.

Given the silent nature of AF, current diagnostic methods, which rely on symptoms or other indirect medical evaluations, lead to substantial under-detection of the disorder. For this reason, population screening should be considered. The key requirements of a device for a screening application, namely accessibility, sensitivity, ease of use and cost-effectiveness, are met by smart-phones and their increasing computational power.

This thesis examines several signal processing methods for automatic detection of AF, with the final goal of bringing the most suitable one to a mobile platform. Attention is focused in particular on methods based on the analysis of the heart rate variability signal, which are preferable for external devices because they provide detection that is more robust to noise.

The three most promising detectors examined (on the basis of results reported in the literature) are implemented in Matlab.

Each AF detector is then tested using a common underlying evaluation protocol, to allow a more objective comparison of final performance. In particular, the automatic methods are evaluated by varying the window length and by exploring different assumptions on the target data. Standard Physionet databases are used for training and testing the AF detectors.

The predictive power of the algorithms combined with other simple AF indicators is then investigated, in an attempt to improve on the performance obtained by each detector individually. For this purpose, three different supervised learning classifiers are employed: Logistic Regression, Random Forest and Support Vector Machines.

Finally, the effect of adding simulated ectopic beats on detection is considered, as ectopy significantly alters the harmonic content of the ECG signal. Different percentages of abnormalities are therefore introduced in normal ECGs using a Hidden Markov Model approach. The drop in performance is then quantified for each detector and for each classifier.

Sommario

Atrial Fibrillation (AF) is the most common cardiac arrhythmia and is characterized by extremely chaotic atrial activity. In the presence of this anomaly, the overall efficiency of the cardiac pump is reduced, because the irregular transmission of electrical impulses compromises the function of the syncytia. Numerous studies confirm a statistically significant increase in the risk of cardiovascular complications and death associated with AF. The incidence data are also worrying: the prevalence is 1% and is expected to grow, owing to the progressive ageing of the population and the increasing use of diagnostic aids dedicated to identifying AF. The problem was underestimated for a long time, and its important clinical implications are only now being established. At present, diagnosis is closely tied to symptoms; in other cases it emerges from clinical examinations performed for different reasons. Unfortunately, this rhythm disorder is often silent, and yet silent AF is not associated with a lower risk of myocardial infarction, heart failure or stroke. The diagnostic methods still in use are therefore strongly inadequate, which leads to underestimating the real incidence of the arrhythmia and to aggravating its effects in cases where necessary treatment is not given. The resulting increase in healthcare costs, mainly due to hospital admissions for cases discovered late and/or already degenerated into more severe stages, is not a secondary concern in a context of limited resources that must meet a constantly growing demand. Screening for AF could therefore be a valid answer to this problem: randomized clinical trials show that screening the population most at risk (age >60 years) increases the number of diagnosed cases compared with routine clinical practice. Moreover, early diagnosis gives better outcomes in reducing the risks associated with the arrhythmia, given the effectiveness of existing treatments. Research into automatic methods for detecting AF, which can support clinical decision-making while reducing the impact on healthcare resources, is currently very active. A device suitable for a screening application should be accessible, simple to use and low-cost while offering good accuracy, in order to be widely accepted in the intended setting. It is therefore natural to look to mobile technology: new smartphones have processors powerful enough to run complex algorithms. They are also a thoroughly familiar tool for users to operate and to communicate data with, and they undergo continuous development and innovation. A smartphone application for AF screening would be an effective and interoperable means of diagnosing the arrhythmia, enabling prompt treatment and cutting the costs of late interventions. These aspects are even more essential in developing countries, where speaking of limited health resources paints an almost optimistic picture.

This thesis aims to identify and compare accurate automatic methods for AF detection that are appropriate for a final implementation on a mobile platform. The first chapter introduces the problem of Atrial Fibrillation: it briefly covers the elements of atrial electrophysiology needed to understand the complex mechanisms underlying this arrhythmia, which are still not fully clear. The three crucial hypotheses held responsible for the onset of AF are presented. The chapter then explains the incidence and risk factors of the arrhythmia and concludes by describing its symptoms, the therapeutic options and the diagnostic techniques in use, whose main limitations are highlighted. The second chapter gives ample space to techniques that identify the arrhythmia from features extracted from the heart rate variability signal, and concludes with the selection of three algorithms that use different approaches to detect AF and were judged appropriate for the purposes of this work. The three algorithms were implemented in Matlab so that they could be tested and compared. The code will be published on the Physionet.org website to make it available to everyone. The third chapter is entirely devoted to retraining and testing the algorithms. Standard Physionet databases were used for this purpose: the MIT-BIH Atrial Fibrillation database, the MIT-BIH Normal Sinus Rhythm database and the MIT-BIH Arrhythmia database. The performance of the methods was analysed while varying the time window and exploring different assumptions about the reference data, using the same test protocol wherever possible so that the results are as comparable as possible.

In the fourth chapter, the features extracted from the heart rate variability signal by the previously selected and implemented algorithms are combined with other simple AF predictors. In particular, three complex mathematical models for classification, Logistic Regression, Random Forests and Support Vector Machines, were used, after optimization of their internal parameters (tuning phase) and training, for binary classification of the same observations, in an attempt to improve on the results obtained by the individual algorithms.

The fifth chapter addresses the problem of ectopic beats, which modify the spectral content of the heart rate variability signal and can therefore confound AF detection methods, since these detect, to a first approximation, an increased irregularity in the electrocardiographic (ECG) trace. An approach based on Hidden Markov Models is used to modify normal ECG traces, i.e. traces characterized by normal sinus rhythm, by introducing different percentages of ectopic beats, in order to quantify how the performance of the three algorithms and of the classifiers varies under these conditions.

In the sixth chapter, conclusions are drawn on the work carried out, and the next steps towards the final implementation of the best AF detector on a smart-phone are outlined.

Summary

Atrial Fibrillation (AF) is the most common sustained arrhythmia and is characterized by predominantly uncoordinated atrial activations. When AF occurs, electrical signals follow an abnormal conduction pathway, which causes an irregular ventricular response and affects the global functionality of the heart.

AF is associated with an increased risk of cardiovascular complications and death. Statistics on its incidence are also alarming: its prevalence has been estimated at 1% of the general population, and it is expected to rise with the ageing of the population and the improvement of diagnostic tools to detect it.

The problem of AF was underestimated for a long time: only recently have considerable efforts been directed to understanding this progressive arrhythmia, and very good results have been achieved over the past ten years.

At present, AF is mostly diagnosed from symptoms or revealed incidentally during medical evaluations performed for unrelated reasons.

Unfortunately, this arrhythmia is often silent, and the absence of symptoms is not associated with a lower consequential risk of myocardial infarction, heart failure or stroke. This leads to substantial under-detection of the disorder and may offer a biased view of the clinical epidemiology of AF, with several negative consequences: for example, greater exposure to cardiovascular complications, since appropriate preventive measures are not taken promptly.

The increase in healthcare resource utilization for such undiagnosed cases is no less important a problem for healthcare systems which, despite having limited resources, must cope with ever-increasing demand.

To address this global challenge, screening of the general population should be considered. Several randomized clinical trials demonstrate that systematic screening (e.g. in subjects aged 60 or over) would detect additional AF cases compared to routine practice. Earlier detection would also result in improved outcomes within the screened population, since existing treatments are quite effective in managing symptoms and reducing the consequential risks associated with this arrhythmia.

There is a pressing need to provide clinicians with accurate and automatic methods to detect Atrial Fibrillation whilst reducing costs related to healthcare delivery.

A monitoring device suitable for a screening application should be easily accessible, cost-effective and easy to use in order to be widely accepted in a clinical scenario. Mobile phones are likely to meet these requirements. The mobile community offers users a familiar means of communication and a worldwide-connected network, and it is committed to continuous development and innovation. Since a significant proportion of global healthcare cost is related to cardiac arrhythmias, and the incidence of Atrial Fibrillation is expected to rise as the population ages in developed countries and grows in developing countries, mobile Health could meet the need of governments, private insurers and individuals to prevent and promptly treat such debilitating conditions, while supporting cost-effective and interoperable ways of delivering healthcare.

Biomedical research is very prolific in this field: since the electrocardiographic signal (ECG) carries a great deal of diagnostic information for the detection of AF, a large number of publications on AF detectors using ECG recordings have appeared recently.

Unfortunately, such methods are very rarely comparable, as claimed results are often obtained using different evaluation protocols and/or different sets of data.

This thesis investigates and compares several techniques for detecting AF, with the final goal of bringing the most suitable one to a mobile platform.

The first chapter introduces the reader to the burden of Atrial Fibrillation: the basics of atrial electrophysiology are introduced in order to better interpret both the current clinical knowledge about human AF and the reasons why many questions regarding AF remain unanswered. The three crucial arrhythmic mechanisms held to be responsible for AF are described: the multiple-circuit, single-circuit and rapid ectopic activity hypotheses. This chapter also considers AF incidence, risk factors and clinical presentation. Finally, the most common diagnostic options are explored and their limits highlighted.

In the second chapter, the physiological concepts behind the development of an AF detection algorithm are described, together with some of the most recent signal processing techniques for detecting Atrial Fibrillation. Attention is focused on detectors based on ventricular response analysis, which are the least confounded by artifacts and noise. The chapter ends with the selection of three algorithms that use different approaches to detect AF: their theoretical performance makes them eligible for subsequent implementation on a smart-phone.
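As a rough illustration of what ventricular response analysis measures, the sketch below computes two simple irregularity indices on a series of RR intervals. It is not one of the three selected detectors: the function names and the RR values are hypothetical, chosen only to show that an irregular series scores higher than a regular one.

```python
from statistics import mean, pstdev


def coefficient_of_variation(rr_s):
    """Standard deviation of the RR intervals divided by their mean."""
    return pstdev(rr_s) / mean(rr_s)


def mean_abs_delta_rr(rr_s):
    """Mean absolute difference between consecutive RR intervals (seconds)."""
    deltas = [b - a for a, b in zip(rr_s, rr_s[1:])]
    return mean([abs(d) for d in deltas])


# Hypothetical RR series in seconds (illustrative values, not real data).
regular = [0.80, 0.82, 0.79, 0.81, 0.80, 0.81]    # resembles sinus rhythm
irregular = [0.62, 1.05, 0.48, 0.91, 0.55, 1.20]  # resembles an AF episode

for name, series in (("regular", regular), ("irregular", irregular)):
    print(name, coefficient_of_variation(series), mean_abs_delta_rr(series))
```

Both indices depend only on beat timing, which is why approaches of this kind tolerate a noisy ECG better than methods that need to resolve the low-amplitude atrial activity.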

The three methods are implemented in Matlab for subsequent evaluation and comparison. The code will be published on the Physionet.org website.

Chapter 3 is fully dedicated to retraining and testing the three algorithms. Standard Physionet databases are used for this purpose. In particular, the AF detectors are retrained on the MIT-BIH Atrial Fibrillation database, using a wide range of window lengths. This dataset was preferred because it mainly contains AF episodes, mostly paroxysmal, and normal sinus rhythm. As the final goal is to detect as many true AF events as possible while still maintaining good performance in rejecting normal sinus rhythm, this dataset provides exactly the diversity of heart rhythms required. The algorithms are then tested on the MIT-BIH Normal Sinus Rhythm database and the MIT-BIH Arrhythmia database using a common underlying evaluation protocol, to allow a more objective comparison of final performance.
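The window-based evaluation described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis protocol: windows are assumed to be non-overlapping and measured in beats, a window is taken as AF in the reference when its fraction of AF beats reaches a minimum value (in the spirit of the minAF% parameter that appears in the figure captions), and the names `label_windows` and `sensitivity_specificity` are hypothetical.

```python
def label_windows(beat_is_af, window_len, min_af_fraction):
    """Label each non-overlapping window of beats as AF (True) or not (False).

    A window counts as AF when at least min_af_fraction of its beats carry
    an AF reference annotation. A trailing partial window is discarded.
    """
    labels = []
    for start in range(0, len(beat_is_af) - window_len + 1, window_len):
        window = beat_is_af[start:start + window_len]
        labels.append(sum(window) / window_len >= min_af_fraction)
    return labels


def sensitivity_specificity(reference, predicted):
    """Return (sensitivity, specificity) for two equal-length boolean lists."""
    pairs = list(zip(reference, predicted))
    tp = sum(r and p for r, p in pairs)
    tn = sum((not r) and (not p) for r, p in pairs)
    fp = sum((not r) and p for r, p in pairs)
    fn = sum(r and (not p) for r, p in pairs)
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Hypothetical per-beat reference: 25 non-AF beats followed by 15 AF beats.
beats = [False] * 25 + [True] * 15
reference = label_windows(beats, window_len=10, min_af_fraction=0.5)
detector_output = [False, True, True, True]  # hypothetical detector decisions
print(reference, sensitivity_specificity(reference, detector_output))
```

Raising `min_af_fraction` in this example relabels the mixed third window as non-AF, which shows how the assumption made on the reference data changes the measured performance without any change to the detector.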

In chapter 4, the features extracted from the heart rate variability signal by the selected detectors are combined with other simple AF predictors, in an attempt to improve on the predictive power of the single detectors. Three supervised learning classifiers are adopted for this purpose: Logistic Regression, Random Forests and Support Vector Machines. The models are trained on the MIT-BIH AF database, and feature importance analysis is performed on the held-out data during five-fold cross validation.
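The five-fold scheme can be sketched with the standard library alone. This is a sketch under assumptions: splitting by record (so that no recording contributes to both training and held-out data) is a choice made here for illustration, and the record identifiers and the name `k_fold_splits` are hypothetical.

```python
import random


def k_fold_splits(items, k=5, seed=0):
    """Yield (training, held_out) lists for each of k folds.

    Items are shuffled once with a fixed seed, then dealt into k folds of
    nearly equal size; each fold is held out exactly once.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    for i in range(k):
        held_out = folds[i]
        training = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield training, held_out


# Hypothetical record identifiers, used only to demonstrate the split.
records = [f"rec{i:02d}" for i in range(10)]
for training, held_out in k_fold_splits(records, k=5):
    print(len(training), sorted(held_out))
```

A classifier would be fitted on `training` and scored on `held_out` in each iteration, and the five scores averaged.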

In chapter 5, the effect of adding simulated ectopic beats on detection is considered, as ectopy significantly alters the harmonic content of the ECG signal. Ectopic beats are therefore likely to cause misclassification of non-AF events.

Different percentages of abnormalities are introduced in normal ECGs (ECGs characterized by normal sinus rhythm) using a Hidden Markov Model approach. The drop in performance is then quantified for each detector and for each classifier.
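The idea of contaminating normal rhythm with ectopy can be illustrated with a much simpler stand-in than the Hidden Markov Model approach used in the thesis, whose details are not given in this summary. The sketch below works on an RR interval series rather than on the ECG itself and uses a two-state Markov chain (normal or ectopic); the function name, the `prematurity` parameter and the compensatory-pause rule are assumptions made for illustration.

```python
import random


def insert_ectopic_beats(rr_s, p_ectopic, prematurity=0.6, seed=0):
    """Return (modified RR series, per-beat ectopic flags).

    From the normal state, a beat becomes ectopic with probability p_ectopic;
    the beat after an ectopic one is always normal. An ectopic beat shortens
    its RR interval to prematurity * RR and lengthens the next interval by
    the same amount (compensatory pause), so total duration is preserved.
    """
    rng = random.Random(seed)
    out = list(rr_s)
    flags = [False] * len(out)
    i = 0
    while i < len(out) - 1:
        if rng.random() < p_ectopic:
            shortening = out[i] * (1.0 - prematurity)
            out[i] -= shortening
            out[i + 1] += shortening
            flags[i] = True
            i += 2  # skip the lengthened beat: it stays in the normal state
        else:
            i += 1
    return out, flags


# Hypothetical perfectly regular rhythm: 200 beats, 0.8 s apart.
normal_rr = [0.8] * 200
noisy_rr, ectopic = insert_ectopic_beats(normal_rr, p_ectopic=0.1)
print(sum(ectopic), "ectopic beats inserted")
```

Running a detector on `noisy_rr` for increasing values of `p_ectopic` gives the kind of performance-versus-ectopy curve that the chapter quantifies.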

In chapter 6, conclusions on the work are drawn and the future steps towards the final aim of implementing the best AF detector on a smart-phone for screening applications are set out.

Acknowledgements

The realization of this work would not have been possible without the people who, in different ways and at different times, gave me their invaluable contribution. I would like to thank all of them and I hope not to forget anyone!

In primis, I would like to thank my academic supervisors: Luca Mainardi, for his support, the knowledge he passed on and, above all, his passion for biosignal processing: this amazing experience, which he made possible, would have been much more difficult to face without his teaching; and Gari Clifford, for pushing me to face my limits and always give my best: I learnt so much during these last months, in particular that there are still so many things I want to learn.

I’d like to thank Professor Leif Sörnmo for his valuable hints.

I’d like to thank the members of the “Intelligent Patient Monitoring Group” of Oxford University: Julien for his help; Alistair for all his help and support until the very last days, thousands of suggestions and practical machine learning lessons; Joachim and Ahmar for our “over-weekend research”; Carmelo for his endless patience and help, precious friendship and the huge amounts of caffeine and sublime-quality food so generously provided. Thanks to everyone for the beautiful moments spent together: it was an honour to be surrounded by such competent researchers in such an amazing, welcoming working environment!

I’d like to thank Eileen, Dave and John with all my heart: your support and your love have been invaluable, and I will never forget them.

I’d like to thank my friends for their support, always.

Most of all I’d like to thank my family: thanks mum, dad, Andrea, you are my biggest strength.

Finally I’d like to thank Adriano, for always being by my side in every single important moment, making achievements even more special.

The conception and realization of this work would not have been possible without the people who, in various ways and at different times, gave their contribution. I would like to thank each one individually, hoping not to forget anyone.

Thanks, first of all, to my supervisor Luca Mainardi for his support, the knowledge he passed on and above all his passion for biological signal processing: without his precious teaching, this extraordinary experience, which he himself made possible, would have been much harder to complete.

I thank my co-supervisor Gari Clifford for showing his trust in me and making me independent in my work, by continually putting me to the test and pushing me to face my fears and my limits: I have learnt so much in recent months, and the desire to know more has only grown.

I thank Professor Leif Sörnmo for his precious suggestions.

Warm thanks to all the members of the “Intelligent Patient Monitoring Group” in Oxford: Julien for his help; Alistair for his help, support and machine learning lessons; Joachim and Ahmar for our “over weekend research”; Carmelo for his infinite patience, precious friendship and the substantial doses of caffeine and exquisite food kindly offered; and everyone, truly, for the wonderful moments spent together: it was an honour to work in an environment so stimulating, competent and at the same time so welcoming!

Special thanks to Eileen, Dave and John: your support and affection have been priceless, and I will never forget them.

I warmly thank my friends, those who after so many years are still here. Thanks to Luca and the girls: with you I have always felt at home. Thanks Eli, for always being there and also for saving me countless times from deadlines and from my bureaucratic nightmares!

Thanks to my family, thanks to mum, dad, Andrea: you are my anchor.

And then thanks to you, who are by my side in all our important choices and make the results even more special.

Contents

1. Atrial Fibrillation: mechanisms, incidence, diagnosis
    1.1. Introduction
    1.2. Atrial electrophysiology
    1.3. Mechanisms of Atrial Fibrillation
        1.3.1. Multiple-circuit hypothesis
        1.3.2. Single-circuit reentry
        1.3.3. Rapid ectopic activity
        1.3.4. Atrial Fibrillation begets Atrial Fibrillation
    1.4. A clinical view
        1.4.1. Incidence and Risk Factors
        1.4.2. Clinical presentation and treatment
        1.4.3. The under-detection problem
        1.4.4. Identification and diagnosis
        1.4.5. AF detection and m-Health

2. Ventricular Response Analysis
    2.1. Introduction
    2.2. Physiological concepts for development of detection algorithms
        2.2.1. Atrial activity analysis
        2.2.2. Ventricular response analysis
    2.3. AF detection algorithms: description and comparison
        2.3.1. Analysis of the dynamics of RR interval series for the detection of atrial fibrillation episodes (Cerutti et al.)
            2.3.1.1. Methods
            2.3.1.2. Discussion
        2.3.2. Long-Term Monitoring for detection of Atrial Fibrillation (Linker DT.)
            2.3.2.1. Methods
            2.3.2.2. Discussion
        2.3.3. Automatic detection of atrial fibrillation using the coefficient of variation and density histograms of RR and ∆RR intervals (Tateno et al.)
            2.3.3.1. Methods
            2.3.3.2. Discussion
        2.3.4. A detector for a chronic implantable atrial tachyarrhythmia monitor (Sarkar et al.)
            2.3.4.1. Methods
            2.3.4.2. Discussion
        2.3.5. Robust detection of Atrial Fibrillation for a long-term telemonitoring system (Logan et al.)
            2.3.5.1. Methods
            2.3.5.2. Discussion
        2.3.6. High accuracy of automatic detection of atrial fibrillation using wavelet transform of heart rate intervals (Duverney et al.)
            2.3.6.1. Methods
            2.3.6.2. Discussion
        2.3.7. Atrial fibrillation detection algorithms for very long term ECG monitoring (Petrucci et al.)
            2.3.7.1. Methods
            2.3.7.2. Discussion
        2.3.8. Accurate estimation of entropy in very short physiological time series: the problem of atrial fibrillation detection in implanted ventricular devices (Lake and Moorman)
            2.3.8.1. Methods
            2.3.8.2. Discussion
        2.3.9. Improvements in atrial fibrillation detection for real-time monitoring (Babaeizadeh et al.)
            2.3.9.1. Methods
            2.3.9.2. Discussion
        2.3.10. Comparison table and algorithms selection
        2.3.11. Lorenz: implementation details

3. Algorithms evaluation
    3.1. Overview
    3.2. General steps for AF detection
    3.3. Databases
            3.3.0.1. MIT-BIH Atrial Fibrillation database
            3.3.0.2. MIT-BIH Normal Sinus Rhythm database
            3.3.0.3. MIT-BIH Arrhythmia database
    3.4. Practical considerations
        3.4.1. Window size
        3.4.2. Threshold
        3.4.3. Reference data: minimum percentage of AF and minimum number of AF beats
    3.5. K-Fold Cross Validation
        3.5.1. Folds generation
    3.6. Performance metrics
    3.7. Results
        3.7.1. Performance on training set
            3.7.1.1. CosEn (Lake and Moorman)
            3.7.1.2. MAD (Linker DT.)
            3.7.1.3. Lorenz (Sarkar et al.)
            3.7.1.4. Discussion and comparison
        3.7.2. Performance on test sets
            3.7.2.1. MIT-BIH NSR database
            3.7.2.2. MIT-BIH Arrhythmia database
    3.8. Conclusions

4. Classification of AF: a machine learning approach
    4.1. Introduction
    4.2. Supervised learning setting
    4.3. Practical considerations
        4.3.1. Reference selection
        4.3.2. Window size selection
    4.4. Methods
        4.4.1. Logistic Regression
            4.4.1.1. Model selection methods
            4.4.1.2. Training on the MIT-BIH AF database
            4.4.1.3. Feature importance analysis
        4.4.2. Support Vector Machines
            4.4.2.1. Non-linear SVM
            4.4.2.2. The kernel trick
            4.4.2.3. Training on the MIT-BIH AF database
        4.4.3. Random Forests
            4.4.3.1. Training on the MIT-BIH AF database
            4.4.3.2. Feature importance analysis
    4.5. Comparative results
        4.5.1. Sensitivity-weighted results
        4.5.2. Results with LIBSVM

5. Ectopy simulation
    5.1. Introduction
    5.2. Identifying the problem
    5.3. Methods
    5.4. Results
    5.5. Conclusions

6. Summary, considerations and future work
    6.1. Summary and considerations
    6.2. Future work

A. Appendix
    A.1. MIT-BIH database
        A.1.1. The annotation definitions

B. Appendix
    B.1. SVM results including MAD

List of Figures

1.1. Mechanism of Atrial Fibrillation: multiple-circuit reentry. AF is charac-terized by uncoordinated atrial activations, causing inconsistent impulsetransmission through the AV node and a consequential irregular rapidventricular rate. According to Moe’s hypothesis, AF can be attributed tomultiple wavelets varying continuously in location and time within the atria. 28

1.2. Prevalence of Atrial Fibrillation (Rotterdam study [14]) . . . . . . . . . . 30

2.1. Physiological concepts for development of Atrial Fibrillation detectors. Ona surface ECG tracing, AF may cause a sudden disappearance of regularly-occurring P-waves, normally associated with synchronous activation of theatria. P-waves are replaced by a fluctuating baseline, composed by fibrilla-tory wavelets of low-amplitude varying from patient to patient. Moreover,ventricular activity becomes less effective and weakly predictable: as aconsequence, RR interval series becomes irregular. . . . . . . . . . . . . . 38

2.2. Steps of Linker DT. algorithm for AF detection. . . . . . . . . . . . . . . . 412.3. Distribution patterns in a Lorenz plot of �RR intervals for NSR and AF.

On the x axis there is the actual �RR interval, on the y axis the previous�RR interval. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

2.4. The two-dimensional histogram, numeric representation of a Lorenz plotof �RR intervals. The histogram is divided into 13 discrete segments:for each rhythm, there will be an higher probability of points’ positioningwithin specific segments. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46

2.5. Histograms of RR intervals computed over 10s windows. Significant dif-ferences in RR distribution can be seen between AF (top figure) and NSR(bottom figure) in terms of standard deviation. . . . . . . . . . . . . . . . 47

2.6. Discrete Wavelet Transform of the RR series at different scales.The ampli-tude of coefficients highlights onset and offset of high heart rate variabilityregions. Pattern changes are more evident at the highest scales. . . . . . . 49

15

List of Figures

2.7. Power Spectral Density of the HRV signal on a log-log plot. NSR (on the left) has a single linear downsloping pattern, AF (on the right) results in two different slopes. Computation of the Hurst exponent allows discrimination between the two conditions.

2.8. Prematurity histograms during Normal Sinus Rhythm, Iterative Atrial Tachycardia and Atrial Fibrillation. In these graphs, differences in distributions are evident: in Atrial Fibrillation the histogram is sparser than in Normal Sinus Rhythm and unimodal with respect to Iterative Atrial Tachycardia.

3.1. 10 s ECG for record 08215 of the MIT-BIH Atrial Fibrillation database. The reference rhythm annotation “(N” and the unaudited beat annotations “·” are displayed.

3.2. MinAF% represents the minimum percentage of AF beats a reference segment must contain to classify the entire segment as True Positive. For percentages of AF beats lower than minAF%, a segment is classified as a True Negative.

3.3. AUROC values across the 5 folds of the MIT-BIH AF database for CosEn (Lake and Moorman), MAD (Linker DT.) and Lorenz (Sarkar et al.): mean AUROC values (solid line) and corresponding standard deviations (shaded regions) across the 5-fold CV.

3.4. Number of changes in True Negatives varying minAF% from 10% to 60% as a function of the window length. The colorbar encodes the number of TP segments which turned into TN segments when changing minAF% from 10% to 60%.

3.5. Accuracy on the MIT-BIH AF database for CosEn. Accuracy is plotted as a function of window length and threshold when minNbeat = 10 beats.

3.6. Positive Predictive Value on the MIT-BIH AF database for MAD. PPV is plotted as a function of window length and threshold when minNbeat = 10 beats.

3.7. Accuracy on the MIT-BIH AF database for Lorenz. Accuracy is plotted as a function of window length and threshold when minNbeat = 10 beats.

3.8. Sensitivity and Total Error on the MIT-BIH Atrial Fibrillation database by varying minNbeat.


4.1. A dataset D for classification problems counts N observations. Each observation is composed of M predictive variables (features) and, when known, the outcome Y. The classification problem is binary when instances belong to two classes only (Y ∈ {0, 1}).

4.2. Logistic curve.

4.3. The LARS algorithm in the case of M = 2 features. The model starts in µ0 (x1 and x2 are the variables and y is the outcome). The residual (the green line) makes the least angle with x1 (x1 is the most correlated with the residual), so the model starts moving in the x1 direction. At µ1, there is equiangularity/equicorrelation of the residual with x1 and x2: so, we start moving in the direction that preserves this property.

4.4. Number of folds containing each feature in the final model for each exponentially separated window size (minNbeat=50).

4.5. Average feature importance across the 5 folds on the held out test set of the MIT-BIH AF, using LR (minNbeat=50).

4.6. Possible separating hyperplanes for a set of linearly separable data.

4.7. Support vectors, decision boundary and margin. The support vectors are the most critical points to be classified in the dataset: they are the closest to the separation plane, thus having direct bearing on its location.

4.8. Non linear SVM. The upper dataset cannot be linearly separated; but, if we embed the data in a higher dimensional space, e.g. x ↦ x², in this new domain a line can easily be drawn to divide red and green points.

4.9. Performance metrics as a function of window length and threshold, across the 5 folds on the held out test set of the MIT-BIH AF (minNbeat=10). Thresholds are specified in the y axis: 0.2 (4.9a) and 0.4 (4.9b).

4.10. Performance metrics as a function of window length and threshold, across the 5 folds on the held out test set of the MIT-BIH AF (minNbeat=10). Thresholds are specified in the y axis: 0.6 (4.10a) and 0.8 (4.10b).

4.11. Average feature importance across the 5 folds on the held out test set of the MIT-BIH AF, using RF (minNbeat=10).




5.1. A ventricular ectopic beat (4th from the left) for record 109 of the MIT-BIH Arrhythmia database.

5.2. State sequence and observations for record 16265 of the MIT-BIH NSR database. Top figure shows the state transitions: short beat (state index 1), regular beat (state index 2), long beat (state index 3). Central figure depicts the original RR interval series. Bottom figure shows the RR sequence when ectopy is added by using the transition probability matrix (eq. 5.5).

5.3. Cumulative error for different percentages of simulated ectopic beats. Drop in performance for each AF detector is depicted by varying minNbeat: from 10 beats (left figure), 30 (central figure) to 50 (right figure) on the MIT-BIH NSR database.

5.4. Cumulative error for different percentages of simulated ectopic beats. Drop in performance for each classifier (i.e. SVM, LR, RF) is depicted by varying minNbeat: from 10 beats (left figure), 30 (central figure) to 50 (right figure) on the MIT-BIH NSR database.


List of Tables

2.1. Results of the Cvar test applied to RR series and ΔRR series on the MIT-BIH Atrial Fibrillation database.

2.2. Results of the KS test applied to RR series and ΔRR series on the MIT-BIH AF database.

2.3. AF vs. Not-AF rhythms transitions rules.

2.4. COSEn as function of age (years). COSEn values show significant changes between NSR and AF. Slight variations are also related to age, due to the physiological decrease of complexity in cardiac activity that occurs with ageing. However, it does not significantly affect AF detection power.

2.5. AF detectors comparison table.
* Performance metrics on the MIT-BIH AF database are reported in the comparative study by Larburu et al. [23].
¹ s100 and s200 are series 100 and series 200 of the MIT-BIH Arrhythmia database (see Section 3.2 for more details).
² The AF database (81 patients; 476 h of AF, 76 h of Atrial Tachycardia, 1397 h of NSR) is not available online.
³ MIT-BIH NSR database and MIT-BIH RR interval NSR database, available on Physionet.org.
⁴ A validation set of 50 subjects was selected with these characteristics: 19 with chronic AF 13 (CAF), 15 with paroxysmal AF (PAF) and 16 with permanent normal sinus rhythm (NSR).
⁵ Algorithm based on RR histogram analysis.
⁶ Algorithm based on Prematurity histogram.

2.6. Selected algorithms based on ventricular response analysis.

2.7. Performance metrics on the MIT-BIH AF database before (1) and after (2) applying frequency compensation and the criteria for outlier points.

3.1. MIT-BIH Atrial Fibrillation database profile: for each record, the number of Atrial Fibrillation and Atrial Flutter episodes and duration are reported.


3.2. MIT-BIH Normal Sinus Rhythm database profile: for each record, the number of normal and ectopic beats and the total duration are reported.

3.3. MIT-BIH Arrhythmia database profile: rhythm type for each half-hour record is reported in its entirety.

3.4. Threshold range of variation for each AF detector.

3.5. Minimum percentage and minimum number of AF beats used to build the reference.

3.6. Performance metrics for CosEn on the MIT-BIH AF database.

3.7. Performance metrics for MAD on the MIT-BIH AF database.

3.8. Performance metrics for Sarkar et al. on the MIT-BIH AF database.

3.9. Optimal window size by varying minNbeat.

3.10. Performance metrics on the MIT-BIH NSR database by varying minNbeat.

3.11. Performance metrics on the MIT-BIH NSR database by varying minAF%.

3.12. Performance on the MIT-BIH Arrhythmia database, assuming minAF%=10%. Performance metrics on the Arrhythmia database are reported for the series 100 (top table) and the series 200 (bottom table).

3.13. Performance on the MIT-BIH Arrhythmia database, assuming minNbeat=10. Performance metrics on the Arrhythmia database are reported for the series 100 (top table) and the series 200 (bottom table).

3.14. TP, TN, FP and FN in the MIT-BIH Arrhythmia database. Results refer to minNbeat=10 beats.

4.1. Optimal logistic regression coefficients for each value of minNbeat.

4.2. Range of values for Gamma (γ) and Capacity (C) to perform a grid search. The choice of exponentially growing values is a practical method to search a large space in a reasonable amount of computation time.

4.3. Results of SVM on the MIT-BIH AF database compared to Lorenz performance. Results are shown for minNbeat equal to 10 beats (4.3a), 30 beats (4.3b) and 50 beats (4.3c). SVM performance metrics are reported for the WLopt and for the WL which proved optimal for Lorenz.

4.4. Thresholds applied to the test sets, to round the probability estimates for LR, RF and SVM.

4.5. Comparison between LR, RF, SVM and each AF detector: CosEn, MAD, Lorenz. Performance metrics on the MIT-BIH NSR database are reported by varying minNbeat.


4.6. Comparative results for SVM, RF, LR and each AF detector, on the MIT-BIH Arrhythmia database. Results are distinguished between series 100 and series 200 and are reported for minNbeat=10 beats.

4.9. Comparison between SVM and each AF detector (i.e. CosEn, MAD, Lorenz). Performance metrics on the MIT-BIH NSR database are reported by varying minNbeat.

4.10. Comparative results for SVM, CosEn, MAD, Lorenz on the MIT-BIH Arrhythmia database. Results are distinguished between series 100 and series 200 (see Section 3.2) for minNbeat=10 beats.

5.1. Modified Lown criteria for Holter classification of ventricular extrasystoles [35].

5.2. Mean and standard deviation of simulated ectopy and corresponding transition probability matrices.

5.3. AF detectors Specificity by varying the percentage of simulated ectopy for minNbeat=50 beats.

5.4. Specificity for LR, RF, SVM by varying the percentage of simulated ectopy for minNbeat=50 beats.

A.1. Beat annotation keys.

A.2. Rhythm annotation keys.

B.1. Results for LIBSVM, including MAD, on the MIT-BIH NSR database (B.1a), on the MIT-BIH Arrhythmia database series 100 (B.1b) and series 200 (B.1c) by using minNbeat=10 beats.


Abbreviations and units

Acc Accuracy

AF Atrial Fibrillation

AFL Atrial Flutter

AR Auto Regressive

AUROC Area Under the Receiver Operating Characteristic curve

AV Atrio Ventricular

bpm Beats per minute

bs beat-segment

C Capacity (misclassification cost)

CE Conditional Entropy

Ceff Computational Efficiency

CosEn Coefficient of sample Entropy

CV Cross Validation

Cvar Coefficient of Variation

Cvel Conduction Velocity

db database

DWT Discrete Wavelet Transform

ECG Electrocardiogram

ERP Effective Refractory Period


Err Total Error

FA Fibrillazione Atriale (Italian for Atrial Fibrillation)

FN False Negative

FP False Positive

γ Gamma (kernel width)

HMM Hidden Markov Model

HR Heart Rate

HRV Heart Rate Variability

Hz Hertz

K Kernel function

KS Kolmogorov Smirnov

ll log-likelihood

LR Logistic Regression

MAD Median Absolute Deviation

MCCE Minimum of the Corrected Conditional Entropy

min Minutes

minAF% Minimum percentage of atrial fibrillation beats

minNbeat Minimum Number of atrial fibrillation beats

ms Milliseconds

NN Normal-to-Normal beat

NPV Negative Predictive Value

NSR Normal Sinus Rhythm

PAF Paroxysmal AF

PPV Positive Predictive Value


PVC Premature Ventricular Contraction

RBF Radial Basis Function

RF Random Forests

ROC Receiver Operating Characteristic

RR R-peak to R-peak

s Seconds

SampEn Sample Entropy

Se Sensitivity

Sp Specificity

SVM Support Vector Machine

SVT Supraventricular Tachycardia

VEB Ventricular Ectopic Beat

VF Ventricular Fibrillation

VT Ventricular Tachycardia

TN True Negative

TP True Positive

Th Threshold

WL Window Length

WLopt Optimal Window Length


1. Atrial Fibrillation: mechanisms, incidence, diagnosis

1.1. Introduction

Atrial fibrillation is defined as a "tachyarrhythmia characterized by predominantly uncoordinated atrial activation with consequent deterioration of atrial mechanical function" [16]. It is the most common sustained arrhythmia and its prevalence is increasing with the aging of the population and the improvement of the diagnostic tools used to detect it. It is estimated that AF will affect more than 5 million Americans by 2050 [10]. Considerable efforts have been directed to understanding this progressive arrhythmia, and very good results have been achieved over the past ten years. Nonetheless, it remains a widespread cause of hospitalization, its treatment is still one of the biggest issues in arrhythmology, and it leads to excessive healthcare resource utilization.

This first chapter introduces the reader to the burden of Atrial Fibrillation. A brief look at atrial electrophysiology is necessary to interpret both the current clinical knowledge about human AF and the reasons why many questions regarding AF remain unanswered. Some basic properties of normal atrial electrophysiology and anatomy are therefore reviewed. The fundamental concepts of atrial wavelength and re-entry are introduced and the three crucial arrhythmic mechanisms held to be responsible for AF are described: the multiple-circuit, single-circuit and rapid ectopic activity hypotheses. These theories are not disconnected and contribute, to various degrees, to defining the clinical scenario. This chapter also considers AF incidence and risk factors and explains the paradigm "AF begets AF". Clinical presentation and the most common diagnostic and therapeutic options are explored.

Finally, the major limitations of the current standard techniques for AF diagnosis are presented, together with the pressing need for accurate detection methods that are suitable for screening applications, able to increase the number of AF cases detected and able to face the major challenge of the paroxysmal nature of AF. A screening detector must meet the requirements of cost-effectiveness, global acceptance and ease of use: attention immediately turns to the potential offered by mobile Health.

1.2. Atrial electrophysiology

In human physiology, the two upper chambers of the heart, called atria, receive blood returning from the other areas of the body and prevent circulatory inertia by providing uninterrupted venous flow to the heart during ventricular systole. In fact, no inlet valves stop blood flow during atrial systole, and atrial contraction occurs gently enough to avoid significant back pressure, but it is effective and well-timed enough to force any remaining blood into the ventricles, which are filled completely. The so-called "atrial kick" increases the efficiency of ventricular ejection thanks to an augmented preload. So if atrial contraction is too quick or ill-timed, overall ventricular function is heavily affected.

There are some anatomical and physiological properties of normal atria which play a key role in the initiation and maintenance of Atrial Fibrillation:

• Atrial myocardium is composed almost entirely of fast-channel tissue.
The large Na+ currents that depolarize the myocytes lead to an atrial conduction velocity of about 1 meter per second. Once an overshoot potential of about +30 mV is reached, atrial cells remain inactivated for a time interval called the refractory period, during which a successive action potential cannot be triggered. This property allows the development of very complex conduction patterns in the atria: if the action potential is short, reactivation can occur earlier, leading to an extremely rapid atrial rate, as observed in AF.

• Atrial chambers are characterized by a complex three-dimensional structure.
The venae cavae, the muscle structures and the atrioventricular valves, together with discrete electrical connections between the right and left atria, all contribute to the delicate functioning of the heart, but they also represent physical obstacles playing a major role in atrial fibrillation initiation. In fact, atrial anisotropy, which is a common feature of normal atrial tissue, facilitates complex conduction patterns, thus resulting in an increased dispersion of atrial refractoriness.

• Atrial action potential morphology and duration are spatially heterogeneous.
Such differences are both inter- and intra-atrial and have been shown to be an important factor in Atrial Fibrillation.



1.3. Mechanisms of Atrial Fibrillation

AF has a complex and not fully understood pathophysiology. Once an AF episode begins, it can be short or persistent, depending on the presence of various initiators that act as an engine driving AF continuation (focal drivers). AF maintenance can depend strongly on the sustained firing of these foci, or AF can persist even in their absence if it is favored by structural and electrical changes in the atria. An increase in atrial size and a shortening of refractoriness, for example, are likely to promote the generation of multiple reentrant wavelets, by decreasing the atrial wavelength of reentry. The atrial wavelength, a crucial concept in understanding atrial fibrillation, is defined as the product of the Effective Refractory Period (ERP) and the Conduction Velocity (Cvel) [4]. In other words, at a given conduction velocity, it represents the distance covered by a depolarization wavefront in one refractory period. This means that re-entry can take place only if the path travelled by an electrical impulse is at least as long as the wavelength. For example, if the circuit is smaller than the wavelength, the impulse arrives back at its starting point while the tissue is still refractory: since another action potential cannot be triggered, the impulse dies out. Long wavelengths are associated with fewer and more stable wavefronts, while short wavelengths result in smaller circuits which propagate and interact in a random fashion within the atria. Intuitively, the atrial chambers can accommodate more circuits when wavelengths are shorter and atrial size is increased.
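The wavelength relation can be made concrete with a small numerical sketch. The Python fragment below is only an illustration under stated assumptions: the function names, the ERP values and the path lengths are hypothetical and chosen for the example (only the conduction velocity of about 1 m/s comes from the text), and real atrial tissue is not a single fixed circuit.

```python
def atrial_wavelength_cm(erp_ms: float, cvel_cm_per_s: float) -> float:
    """Wavelength = ERP x conduction velocity, returned in centimetres."""
    return erp_ms * cvel_cm_per_s / 1000.0


def reentry_possible(path_length_cm: float, erp_ms: float,
                     cvel_cm_per_s: float) -> bool:
    """Re-entry can be sustained only if the circuit is at least one
    wavelength long; otherwise the impulse returns to refractory tissue."""
    return path_length_cm >= atrial_wavelength_cm(erp_ms, cvel_cm_per_s)


if __name__ == "__main__":
    # Assumed illustrative values: ERP = 200 ms, Cvel = 100 cm/s (1 m/s).
    print(atrial_wavelength_cm(200.0, 100.0))    # wavelength in cm
    print(reentry_possible(10.0, 200.0, 100.0))  # circuit shorter than wavelength
    # Shorter ERP and slower conduction (assumed values) shrink the wavelength,
    # so the same 10 cm circuit can now sustain re-entry.
    print(reentry_possible(10.0, 100.0, 50.0))
```

With these assumed numbers, shortening the ERP or slowing conduction reduces the wavelength, which is why more circuits fit within atria of a given size.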

Long-held theories explaining AF perpetuation imply the existence of:

1. Multiple-circuit reentry

2. Single-circuit reentry

3. Rapid ectopic activity

1.3.1. Multiple-circuit hypothesis

This leading hypothesis suggests that AF maintenance can be explained by multiple wavelets varying continuously in location and time within the atria: they can collide with each other and thus disappear, or split into smaller wavelets which re-excite the atria over and over again. Reentry can persist only in the presence of excitable tissue ahead of the propagating wavefronts: this is why an increased atrial size, a shortened refractory period and the presence of anatomical obstacles all allow more, and smaller, atrial reentry circuits to coexist. Action potential heterogeneity promotes the generation of multiple wavelets as well. The higher the number of wavelets in the heart at any moment, the longer the arrhythmia will persist: if the atria cannot work as a functional syncytium, the consequence is a fibrillatory response. Moe first proposed this widely accepted theory in 1962 [27]. Activation mapping during human AF has since improved the understanding of the initiation and maintenance of the arrhythmia, both supporting and challenging Moe's leading idea.

Figure 1.1.: Mechanism of Atrial Fibrillation: multiple-circuit reentry. AF is characterized by uncoordinated atrial activations, causing inconsistent impulse transmission through the AV node and a consequent irregular, rapid ventricular rate. According to Moe's hypothesis, AF can be attributed to multiple wavelets varying continuously in location and time within the atria.

1.3.2. Single-circuit reentry

Mapping studies have shown that even a single re-entrant wavefront can produce AF [21]. The single-circuit reentry model is based on the existence of a single stable source (mother circuit), acting as a periodic background focus: in this case, it is the complex three-dimensional structure of the atrium, with its lattice of pectinate muscles, that provides an excellent substrate for self-sustained circus movement reentry, breaking the wavefront into multiple meandering wavelets.

1.3.3. Rapid ectopic activity

Haïssaguerre and colleagues [12] have also shown that Atrial Fibrillation is often triggered by ectopic activity arising from the pulmonary vein region (hyperectopia theory). When an ectopic beat meets a refractory zone, it can propagate through the faster-recovering tissue in another direction, leading to the generation of abnormal "re-entrant rotors". It is important to remember that when electrical impulses arrive rapidly, atrial tissue responds in a 1:1 fashion, producing, as expected, a regular tachycardia, up to a threshold rate. Beyond this critical point (e.g. when the cycle length of the driver is shorter than the ERP of the tissue), a fibrillatory response will occur.
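The threshold behaviour described above can be written as a one-line rule. The sketch below is a deliberately simplified illustration, not a physiological model: the function names and the numeric values are assumptions, and the tissue is treated as having a single uniform ERP.

```python
def rate_bpm(cycle_length_ms: float) -> float:
    """Convert a driver cycle length in milliseconds to a rate in beats per minute."""
    return 60000.0 / cycle_length_ms


def atrial_response(driver_cycle_length_ms: float, tissue_erp_ms: float) -> str:
    """Classify the tissue response to a rapidly firing driver.

    Assumption: the tissue follows the driver 1:1 as long as each impulse
    arrives after the refractory period has ended; faster drivers produce
    fibrillatory conduction.
    """
    if driver_cycle_length_ms <= 0 or tissue_erp_ms <= 0:
        raise ValueError("cycle length and ERP must be positive")
    if driver_cycle_length_ms >= tissue_erp_ms:
        return "1:1 conduction (regular tachycardia)"
    return "fibrillatory conduction"


if __name__ == "__main__":
    # Assumed illustrative values: tissue ERP of 200 ms.
    for cycle_length in (250.0, 150.0):
        print(rate_bpm(cycle_length), atrial_response(cycle_length, 200.0))
```

In this simplified picture, a driver firing every 250 ms is followed 1:1, while one firing every 150 ms exceeds what tissue with a 200 ms ERP can follow.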

1.3.4. Atrial Fibrillation begets Atrial Fibrillation

The classic understanding of atrial fibrillation pathophysiology regarded all these theories of potential AF mechanisms as alternative and mutually exclusive hypotheses. But clinical experience shows that they can be strongly linked when analyzing the history of a patient, and that they interact to varying extents in different scenarios. It is well known that "Atrial Fibrillation begets Atrial Fibrillation" [19]: up to 24% of patients with paroxysmal AF will develop persistent AF. The progressive nature of this arrhythmia led to the firm, and later well-proven, belief that AF itself induces irreversible remodelling of electrophysiological features. Functional, electrical, structural and biochemical changes occur in the atria, such as decreased ERP and poorer ERP rate-adaptation, with consequent contractile dysfunction that causes atrial dilation [39]. Remodelling links all the potential AF mechanisms. When paroxysmal AF is initiated by rapid ectopic activity or single-circuit reentry, it causes anatomical and electrical changes which favor spatially heterogeneous shortening of refractoriness. This tends to move AF towards multiple-circuit re-entry. Ectopic activity, as well as increased vagal tone, plays a key role as a trigger acting on pathological conditions capable of sustaining AF: heart failure, atrial stretch and the age-related development of interstitial fibrosis all provide a fertile substrate for reentry. However, the nature of the interaction between triggers and substrate in the genesis of AF is still not completely understood.

1.4. A clinical view

1.4.1. Incidence and Risk Factors

Atrial Fibrillation is the most common heart rhythm disorder and leads to a reduced quality of life in patients with or without other heart diseases. Over 2 million people in Europe and two to three million Americans are currently diagnosed with AF, and its incidence is destined to rise with the aging of the population and the improvement of the technology used to detect this progressive arrhythmia [10]. In China, more than 9 million people suffer from AF [37], and a prevalence of 1% has been estimated in the general population, with a higher age-adjusted incidence in men [10]-[31]. AF is associated with an increased risk of hospitalization, myocardial infarction, heart failure and all-cause death: studies show a doubling of mortality rates in people who suffer from AF compared with those who do not [39], and a higher risk of stroke, of about 5% per year [13]. This widespread rhythm disorder is prevalent in patients with cardiovascular disease, even if it is not uncommon to diagnose AF when no particular clinical picture is present, especially in younger subjects, who are most frequently diagnosed with "lone AF" [39].

Figure 1.2.: Prevalence of Atrial Fibrillation (Rotterdam study [14])

Classical risk factors include hypertension, valvular diseases, heart failure, inflammation and diabetes [33]: any condition likely to cause changes in atrial electrophysiology whose effect is to shorten the refractory period and increase its dispersion, thus facilitating the initiation and maintenance of AF. However, it is worth paying attention to more recent and less frequently mentioned risk factors, including obesity, sleep apnoea and high-dose steroid therapies, which may all contribute to providing the substrate for reentry [8]. Only recently have there also been important insights into genetic determinants: gene mutations related to familial AF seem to be responsible for an increased activity of repolarizing K+ currents, linked to shorter action potential duration, ERP and atrial wavelength [15].

1.4.2. Clinical presentation and treatment

AF occurs because electrical signals are not systematically triggered via the SA node and follow an abnormal conduction pathway. They are generated all over the atria at a rate of up to 300–600 beats per minute. The ventricular response depends on the atrial rate and on the filtering function of the atrioventricular (AV) node, which conducts only as its refractory period allows. Irregular impulses to the lower chambers cause a fast heart rate (up to 150 beats per minute) and less effective contractions. In fact, the loss of the "atrial kick" and of its contribution to ventricular filling reduces cardiac output.
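The filtering role of the AV node can be illustrated with a toy simulation. The following Python sketch rests on strong assumptions that are not taken from the text: atrial impulses are drawn as a random (Poisson-like) stream, the AV node has a fixed refractory period, and concealed conduction is ignored. The function name and all parameter values are hypothetical.

```python
import random


def simulate_ventricular_response(atrial_rate_bpm: float,
                                  av_refractory_ms: float,
                                  duration_s: float,
                                  seed: int = 0) -> list:
    """Return simulated RR intervals (ms) for a toy AV-node filter.

    Atrial impulses arrive at random times with the given mean rate; an
    impulse is conducted to the ventricles only if the AV node has been
    quiet for at least `av_refractory_ms` since the last conducted beat.
    """
    rng = random.Random(seed)
    mean_interval_ms = 60000.0 / atrial_rate_bpm
    t = 0.0
    last_conducted = None
    rr_intervals = []
    while t < duration_s * 1000.0:
        # Exponentially distributed gaps give an irregular atrial input.
        t += rng.expovariate(1.0 / mean_interval_ms)
        if last_conducted is None:
            last_conducted = t
        elif t - last_conducted >= av_refractory_ms:
            rr_intervals.append(t - last_conducted)
            last_conducted = t
    return rr_intervals


if __name__ == "__main__":
    # Assumed values: 400 atrial impulses per minute, 400 ms AV refractory period.
    rr = simulate_ventricular_response(400.0, 400.0, 60.0)
    mean_rr = sum(rr) / len(rr)
    print(f"beats: {len(rr)}, mean heart rate: {60000.0 / mean_rr:.0f} bpm")
    print(f"shortest RR: {min(rr):.0f} ms, longest RR: {max(rr):.0f} ms")
```

Even in this crude model the ventricular rate is much lower than the atrial rate and the RR intervals are irregular, which is the property that RR-based AF detectors exploit.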

AF has heterogeneous clinical presentations and is traditionally classified on the basis of episode duration:

• Paroxysmal: episodes begin suddenly and then stop on their own, usually in less than 24 hours.

• Persistent: abnormal heart rhythm continues for more than a week.

• Permanent: a normal sinus rhythm cannot be restored with treatment.

Once the pattern of the disorder is characterized (e.g. persistent or paroxysmal) and underlying causes are determined (e.g. hypertension, heart failure etc.), a suitable management strategy can be applied. Anti-arrhythmic treatment has three main purposes:

• Prevention of thromboembolic complications.
The thromboembolic risk increases significantly when AF duration exceeds 48 h. Thus, patient treatment should be completed, when possible, within this time frame. In such cases, the main issues are when to start and when to interrupt pharmacological treatments. Drug therapies may convert patients with symptomatic AF into patients with asymptomatic AF [4], leading physicians to erroneously discontinue the treatment with devastating consequences or, vice versa, to treat patients with life-long anticoagulation to avoid negative effects.

• Rate control strategy.
Control of ventricular response is one of the initial objectives of pharmacological management of AF. The aim is to minimize symptoms, thus providing better hemodynamic function. Digoxin, beta-blockers and rate-limiting calcium antagonists are the most commonly applied [86] to achieve adequate rate control. Amiodarone or AV node ablation is sometimes indicated in refractory cases [4].

• Rhythm control.
This approach attempts to restore and maintain normal sinus rhythm. Pharmacological treatment comprises class IC or III antiarrhythmic drugs. Both oral and intravenous administration routes are effective. Electrical cardioversion for rhythm control is generally associated with an early response. It is generally performed under deep sedation in patients with AF duration greater than 48 h, and requires a short hospital stay. Biphasic shock is 98% effective in restoring sinus rhythm [4].

1.4.3. The under-detection problem

Currently, AF is mostly diagnosed from symptoms or revealed incidentally during other medical evaluations: most patients suffering from AF complain of palpitations, chest pain, fatigue, breathlessness, dizziness and loss of consciousness. However, because AF can be asymptomatic, diagnosis by symptoms leads to a large under-representation of the disorder and may offer a biased view of the clinical epidemiology of AF, since only 1 out of 3 patients may have been admitted to hospital [16]. The prevalence of symptomless AF is difficult to assess, and under-detection has several negative consequences for patients:

• Exposure to a significant risk of cardioembolic stroke before appropriate prevention measures are taken;

• Difficulty in assessing the efficacy of rhythm and rate control interventions.
The large incidence of asymptomatic AF, and the fact that pharmacological treatments can convert patients with symptomatic AF into patients with asymptomatic AF, cast doubts on the best timing to interrupt drug administration, inducing physicians to treat patients with life-long anticoagulation to avoid negative effects;

• Overestimation of successful maintenance of sinus rhythm.

Another important aspect to consider is that the type of AF, permanent or paroxysmal, does not modify the risk of stroke in a statistically significant way: the problem is that paroxysmal AF is more difficult to detect [8]. It has been estimated that only 1 in 12 paroxysms is symptomatic [10].



1.4.4. Identification and diagnosis

Atrial Fibrillation can present in subjects with suspicious clinical setting, but it canalso be symptomless. Some times it is revealed once serious AF complications havealready occurred, such as thromboembolism or heart failure: "many patients presentingwith stroke are also found to be in AF, indicating a missed opportunity to diagnosethe pre-existing AF and administer appropriate antithrombotic therapy" [16]. Most ofthe times it is diagnosed by associating symptoms (palpitations, dizziness, dyspnoea)or accidentally during other medical evaluations, performed for uncorrelated reasons."In patients with medical problems, such as heart failure, stroke or thromboembolism,coincidental AF is detected. Whether AF was the cause or effect of the acute problem(e.g. stroke or heart failure) may be uncertain"[16]. Currently, standard techniques fordiagnosis are ambulatory-ECG, Holter recordings of 24-48 hours during patient’s dailyroutine or event recorders (implantable or not), activated by the subject when symptomsoccur [16].

In particular, in patients with recurrent paroxysms, clinical guidelines suggest to per-form a 24-hour Holter monitor [16]. Clearly, this practice can become useless if parox-ysmal events occur at intervals larger than 24 hours. For this latter category, eventmonitors ("cardio-memos" or implantable systems) are used to diagnose AF.

Electrocardiogram is always highly recommended to confirm the diagnosis in all pa-tients, whether symptomatic or not, thanks to the large potential information containedthere in.

At this point, a simple question arises: how to face the problem of asymptomatic AF? How to choose the most appropriate strategy, in terms of diagnostic tools and treatment, if we don’t know in advance whether a subject has AF and which type he is suffering from?

Several studies have demonstrated that patient-triggered event recorders have a higher diagnostic yield for AF than Holter monitors, while automatically-triggered event recorders have a higher diagnostic power than patient-triggered event recorders [4]. Nevertheless, the problem of undiagnosed AF remains unsolved, with all its negative consequences.

1.4.5. AF detection and m-Health

To address this global challenge, screening of the general population should be taken into account. Systematic screening (e.g. in subjects aged 65 or over) would detect additional AF cases compared to routine practice, as shown in the randomized study by Fitzmaurice et al. [7]. In particular, it would face the major issue of the paroxysmal and often asymptomatic nature of AF, which is associated with the same negative outcomes as persistent and permanent AF. Earlier detection would result in improved outcomes within the screened population, since existing treatments are quite effective in managing symptoms and reducing the consequential risks associated with this arrhythmia [30]. The pressing need is to provide clinicians with accurate methods to detect Atrial Fibrillation whilst reducing costs related to health care resource utilization. A monitoring device suitable for a screening application should be easily accessible, cost-effective and easy to use in order to be widely accepted in a clinical scenario. The use of mobile phones is likely to meet these requirements. In fact, the mobile community offers users a familiar way of communicating and a worldwide-connected network, and it is committed to continuous development and innovation. Since significant proportions of global healthcare cost are related to cardiac arrhythmias, and the incidence of Atrial Fibrillation is destined to rise as age increases in developed countries and population grows in developing countries, m-Health could really meet the need of governments, private insurers and individuals to prevent and treat such debilitating conditions in a timely manner, while supporting cost-effective and interoperable ways of delivering health care. According to global end-user research commissioned by GSMA between March and June 2012 to explore the perception of m-Health, 89% of caregivers, 75% of patients and 73% of consumers believe that m-Health solutions can convey significant benefits.

A mobile application to reveal Atrial Fibrillation could be the right answer to the rising incidence of this global burden, given the increasing aging of the general population. This work is intended to be a contribution, a drop in the wide ocean of smart and effectively applicable solutions that can be offered to make healthcare more efficient, in the perspective of a world where over-stretched systems will have to cope with ever-increasing demands and under-served populations, consequently cared for with poorer resources.


2. Ventricular Response Analysis

2.1. Introduction

Electrocardiography is a powerful tool to reveal initiation, maintenance and termination of AF, thanks to the high potential diagnostic information contained in the surface ECG signal. ECG characteristics directly reflect the nature of the pathophysiologic events occurring in both cardiac chambers. Indeed, recording an ECG is highly recommended whenever symptomatic or asymptomatic AF is suspected, because all the mechanisms involved in sustaining AF, such as ERP and wavelength shortening, variation in the regulatory system and hyperectopia, change the appearance of the bio-signal. Recording a surface ECG has unquestionable advantages: it is painless for the patient, does not require special rooms, and long-term recordings can be performed with minimal risk compared to invasive diagnostic techniques. For all these reasons, the signal processing community is very active in developing methods to reveal AF by ECG analysis, as proved by the large number of works that have appeared in the literature over the last years.

This chapter describes the physiological concepts behind the development of an AF detection algorithm and presents some of the most recent signal processing techniques to reveal Atrial Fibrillation.

AF detectors commonly follow two different approaches in revealing the arrhythmia: Atrial activity analysis and Ventricular response analysis. Pros and cons of these two families are presented.

A considerable part of the chapter is dedicated to an overview of ten different algorithms to detect AF, with almost total focus on methods based on ventricular response analysis. The comparison between the proposed detectors finally leads to the choice of three algorithms whose theoretical performance makes them eligible for a further implementation on smartphone.

The three methods are implemented in Matlab for successive evaluation and comparison.

The chapter ends with some implementation details for one of the selected algorithms.



2.2. Physiological concepts for development of detection algorithms

Atrial Fibrillation is associated with rapid uncoordinated atrial activations: the contraction rate can reach 600 bpm in humans. On a surface ECG tracing, this may cause a sudden disappearance of regularly-occurring P-waves, normally associated with synchronous activation of the atria. P-waves are replaced by a fluctuating baseline, composed of fibrillatory wavelets of low amplitude, varying from patient to patient. In some cases, this atrial pattern may look similar to Atrial Flutter, if characterized by greater regularity and higher amplitude of baseline deflections. However, it is important to remember that Atrial Fibrillation and Atrial Flutter represent distinct clinical conditions, since they are treated differently. Random impulses generated within the atria are then transmitted through the AV node: the electrical isolation between atrium and ventricle prevents the complete block of the heart’s pumping action, since the AV node conducts only according to its refractory period. However, ventricular activity becomes less effective and weakly predictable: the RR interval series becomes irregular, hence the heart rate bears a signature of AF that a trained observer can easily recognize. Unfortunately, ventricular response irregularity is neither 100% specific nor 100% sensitive, because other arrhythmias, such as multifocal atrial tachycardia, frequent premature atrial complexes and sinus arrhythmia, have a similar irregular pattern. Vice versa, in the case of AV block with artificial ventricular pacing, AF can be present in association with a regular ventricular rate. A study demonstrated that AF under-detection occurs in paced patients, because RR intervals are regular even during AF.

In literature, AF detectors can be separated into two major classes:

• Methods based on atrial activity analysis

• Methods based on ventricular response analysis

2.2.1. Atrial activity analysis

Methods based on atrial activity analyze the morphology of the P-wave, which can convey significant diagnostic information about AF. Unfortunately, this approach is challenged by the difficulty of detecting f-waves automatically, due to the simultaneous presence of the ventricular activity, which has a much larger amplitude, and to potential superimposed noise and artifacts. For this reason, the ECG signal needs to be pre-processed to cancel ventricular activity (i.e., the QRST complex) using, for example, a template matching algorithm.

Spectral overlap exists between atrial and ventricular activity, so they cannot be separated using linear filtering. A recent study proposed a non-linear signal processing method to isolate the atrial signal: an echo state neural network which estimates the time-varying, nonlinear transfer function between two leads, one containing atrial activity and another without, for the purpose of obtaining the residual ECG of interest [32]. Of course, continuous adjustment of the set of weights has to be performed for each new processed sample.

Atrial analysis requires a stable, high-quality signal, which is difficult to obtain in real-time long-term recordings. Also, even supposing this is possible, a standard 12-lead ECG system may sometimes not be enough for studying atrial activity, because the number of electrodes is too small and their location is not optimal to highlight f-wave peaks. New protocols have been proposed to emphasize the atrial activity: one is the Atriocardiogram, which maintains the standard ECG equipment while repositioning some electrodes [17]. Another problem in the study of the atrial activity is that no overall consensus exists when linking the amplitude of the f-waves to the etiology of AF [4]: the atrial pattern is different for paroxysmal, persistent or permanent AF, but it also shows significant inter-patient variations, while it tends to remain stable in the same patient over 24 h if his conditions remain the same. Although highly effective, atrial activity analysis is definitely not the most suitable approach for an automatic screening application, since simplicity of use, global acceptance and robustness to noisy tracings are important requirements.

2.2.2. Ventricular response analysis

Methods based on the ventricular response, instead, are intended to capture the irregular, rapid nature of AF. The analysis is concentrated on the inconsistent AV node conduction patterns, which cause irregular, uncorrelated RR intervals. Limits and advantages mirror those of the other family of algorithms: the QRS spike is the most prominent feature of an ECG and so the least confounded by muscle noise, hence methods based on RR irregularity should be preferred for external devices. It must be said that this greater robustness against artifacts and other potential disturbances assumes that a highly accurate QRS detector is used, since extra and missed beats would affect the algorithms’ performance. As previously said, chaotic ventricular response is not the exclusive hallmark of Atrial Fibrillation: this aspect increases the false-positive detection rate. Accuracy could be improved by using a combination of atrial and ventricular analysis, but this falls outside the leading purpose of this work, which is to find an appropriate method to face the AF under-detection problem and to draw clinicians’ attention to additional AF cases, not to substitute them during the entire diagnostic process.

Figure 2.1.: Physiological concepts for development of Atrial Fibrillation detectors. On a surface ECG tracing, AF may cause a sudden disappearance of regularly-occurring P-waves, normally associated with synchronous activation of the atria. P-waves are replaced by a fluctuating baseline, composed of fibrillatory wavelets of low amplitude, varying from patient to patient. Moreover, ventricular activity becomes less effective and weakly predictable: as a consequence, the RR interval series becomes irregular.

2.3. AF detection algorithms: description and comparison

This section summarizes the state of the art in Atrial Fibrillation detection based on ventricular response analysis, since this approach provides a better signal-to-noise discrimination.

Ten processing techniques are presented. For each detector, the underlying principle to reveal Atrial Fibrillation and the results are briefly discussed.

2.3.1. Analysis of the dynamics of RR interval series for the detection of atrial fibrillation episodes (Cerutti et al.)

The method proposed by Cerutti et al. was developed for automatic, real-time detection of AF. Changes in the dynamics of the RR interval series are captured by using two indices derived from the Auto-Regressive modeling of the series (P and M) and the Minimum of the Corrected Conditional Entropy (MCCE). In fact, during AF, the physiological RR variability is perturbed and the corresponding spectral features of the series change: it becomes less predictable, producing a more irregular, pseudo white-noise pattern. MCCE, P and M encode this irregularity and can be used to reveal AF, since significant differences were found between Normal Sinus Rhythm (NSR) epochs and Atrial Fibrillation.

2.3.1.1. Methods

The RR sequence can be modelled by an auto-regressive series

$$ RR(t) = \sum_{k=1}^{p} a_k \, RR(t-k) + n(t) $$

where $p$ is the model order, $\{a_k\}$ are the model coefficients and $n(t)$ is a white noise $WN(0, \sigma^2)$.

Two parameters are extracted from the AR modeling:

$$ P = \left( 1 - \frac{\sigma_e^2}{\sigma_{rr}^2} \right) \cdot 100 $$

$P$ represents the percentage of power which can be predicted by the model ($\sigma_e^2$ is the variance of the model prediction error $e(t)$ and $\sigma_{rr}^2$ is the variance of the RR series).

$$ M = \max_k \{ |z_k| \} $$

$M$ is the maximum modulus among the model poles $z_k$.

The parameter MCCE is a corrected form of the Conditional Entropy (CE) that can be obtained from the Shannon Entropy (E):

$$ CE(L) = E(L) - E(L-1) = -\sum_{L} p_L \log p_L + \sum_{L-1} p_{L-1} \log p_{L-1} $$

where $p_L$ is the probability of a given sequence of length $L$ among all sequences of that length.

Changes of these three parameters correlate significantly (P-value < 0.001) with rhythm modifications. From NSR to AF:

• P decreases due to the reduction of the predictability of the temporal series, because the variance of the prediction error increases;

• M decreases because the positioning of the poles varies (poles move towards the origin of the frequency plane);

• MCCE increases because the series becomes more irregular.

A simple thresholding can be applied to discriminate between AF and NSR.
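As an illustration of how the indices P and M defined above could be obtained, the following is a minimal Python sketch, not the implementation used by Cerutti et al. It assumes a batch least-squares fit (the original method is adaptive, with a forgetting factor), removal of the mean RR before fitting, and a hypothetical default model order of 4; the function name `ar_indices` is illustrative.

```python
import numpy as np

def ar_indices(rr, order=4):
    """Return (P, M) for an AR(order) model fitted to an RR series by least squares."""
    x = np.asarray(rr, dtype=float)
    x = x - x.mean()                      # assumption: the mean RR is removed before fitting
    n = len(x)
    # Column k-1 holds RR(t-k) for t = order, ..., n-1
    lagged = np.column_stack([x[order - k:n - k] for k in range(1, order + 1)])
    target = x[order:]
    a, *_ = np.linalg.lstsq(lagged, target, rcond=None)
    error = target - lagged @ a           # prediction error e(t)
    # P: percentage of the series power predicted by the model
    p_index = (1.0 - error.var() / x.var()) * 100.0
    # Poles are the roots of z^p - a_1 z^(p-1) - ... - a_p
    poles = np.roots(np.concatenate(([1.0], -a)))
    m_index = float(np.max(np.abs(poles)))
    return float(p_index), m_index
```

On a strongly structured series both indices are expected to be high, while on a white-noise-like series both should drop, which is the direction of change reported above for the NSR to AF transition.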



2.3.1.2. Discussion

The optimal threshold of each feature is crossed a few tens of beats after AF onset, making this algorithm suitable for real-time applications. The possibility to vary the model order and another parameter of the AR modeling, the forgetting factor, makes the method more flexible: as an example, we can choose a faster adaptation, but always at the expense of robustness.

In the comparative study, Cerutti et al. showed the lowest computation time (0.36 seconds per hour of data analyzed) and a very high Sensitivity: the algorithm is therefore eligible to be implemented in a screening system for AF detection.

2.3.2. Long-Term Monitoring for detection of Atrial Fibrillation (Linker DT.)

The following algorithm was developed for long-term monitoring of AF on a portable monitoring device. It approaches the detection problem by using a statistical framework which reveals the irregular nature of AF through a non-linear feature, MAD, defined as the median of the variation in the absolute standard deviation from the mean of heart rate in three adjacent segments of the RR interval series, all having the same length. MAD seems to offer significant discrimination between AF and NSR.

2.3.2.1. Methods

The ECG signal is used to derive the RR interval series. This sequence is successively split up into adjacent segments of the same length. Three consecutive segments are used to compute a particular feature, MAD, that is compared to a threshold to label the corresponding RR sequence as AF. The main steps used to compute MAD are displayed in Figure 2.2.

2.3.2.2. Discussion

The choice of MAD, as median of the variations in the absolute standard deviation from the mean of heart rate in consecutive RR segments, should increase the algorithm’s robustness to outliers: in contrast to the mean, which changes linearly with the sample values of a statistical distribution, the median is not sensitive to sample values at the extreme ends of the same distribution.

In the comparative study [23], this method showed the highest Sensitivity and the smallest window length (10 s): two very important aspects which should help in detecting additional AF cases that would otherwise remain undiagnosed, such as paroxysmal events, whose onset is often unexpected and of short duration.
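The robustness argument can be checked numerically with a small Python sketch. The numbers below are made up for illustration only (they are not taken from any database), and the example shows only the mean versus median behaviour, not the full MAD computation of Figure 2.2.

```python
import statistics

# Hypothetical per-segment values (arbitrary units)
values = [0.80, 0.82, 0.79, 0.81, 0.80]
# The same values plus one spurious sample, e.g. caused by a missed beat
with_outlier = values + [2.50]

# How far each statistic moves when the outlier is added
mean_shift = abs(statistics.mean(with_outlier) - statistics.mean(values))
median_shift = abs(statistics.median(with_outlier) - statistics.median(values))
```

In this example the mean moves by roughly 0.28 while the median moves by about 0.005, which is the property the feature relies on.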



Figure 2.2.: Steps of Linker DT. algorithm for AF detection.



Cvar test          RR series   ∆RR series
Sensitivity (%)    86.6        83.9
Specificity (%)    84.3        83.7
PPV (%)            79.8        78.7
NPV (%)            89.8        87.9

Table 2.1.: Results of the Cvar test applied to RR series and ∆RR series on the MIT-BIH Atrial Fibrillation database.

However, in this study, not many details are available about the evaluation protocol and no mention is made of the optimal threshold used to demonstrate the algorithm’s performance. No further information about it was found in the reference material containing the description of this algorithm [24].

Anyway, there is the possibility to vary the threshold value for the computation of the final feature: a receiver operating characteristic (ROC) curve may be helpful to choose the minimum time window, while maximizing Sensitivity and Specificity of this detector.

This method was developed to be simple and easy to implement, and to have low memory requirements.

2.3.3. Automatic detection of atrial fibrillation using the coefficient of variation and density histograms of RR and ∆RR intervals (Tateno et al.)

Tateno et al. proposed two methods for automatic detection of Atrial Fibrillation. The first one is based on the Coefficient of Variation (Cvar), which significantly changes during AF: this feature is applied both to RR intervals and ∆RR intervals, with better results on the second sequence, and is defined as the ratio between the standard deviation of the intervals contained in a segment and the mean RR interval in the same segment. Results on the MIT-BIH Atrial Fibrillation database are not excellent, as shown in Table 2.1.

For this reason, the Kolmogorov-Smirnov (KS) test is applied to the ∆RR series (obtained by differentiating successive RR intervals), with very good performance on the MIT-BIH AF database. Since this test doesn’t reject rhythms with frequent Premature Ventricular Contractions (PVC), an additional improvement to be applied to RR intervals is proposed in order to increase the algorithm’s Specificity.

2.3.3.1. Methods

The KS test is applied to segments of Nseg beats centered on each beat of the records (Nseg = 20, 50, 100, 200). For each beat, the density histogram of the ∆RR intervals is determined. Then it is compared with standard density histograms using the KS test: the greatest distance D between the cumulative probability distributions of the standard histogram and the test histogram is measured, and the KS test returns a P-value, indicating the significance of the dissimilarity between the distributions:

$$ P\text{-}value = 2 \sum_{j=1}^{\infty} (-1)^{j-1} \exp\left( -2 j^2 \lambda^2 \right) $$

where $\lambda = \left( \sqrt{N_e} + 0.12 + 0.11/\sqrt{N_e} \right) \cdot D$, $N_e = N_1 N_2 / (N_1 + N_2)$, $N_1$ is the number of data points in the standard distribution and $N_2$ in the test distribution ($N_2 = N_{seg}$).

The best results are obtained applying the KS test to ∆RR intervals, as shown in Table 2.2.

KS test            RR series   ∆RR series
Sensitivity (%)    66.3        94.4
Specificity (%)    99.0        97.2
PPV (%)            98.0        96.1
NPV (%)            80.4        96.0

Table 2.2.: Results of the KS test applied to RR series and ∆RR series on the MIT-BIH AF database.

This algorithm showed problems in rejecting rhythms with PVC. However, cumulative probability distributions of the RR interval segments in PVC have a distinctive morphology which can be used to distinguish AF from frequent PVC, obtaining an improved Positive Predictive Value.

2.3.3.2. Discussion

In the comparative study [23], the KS test applied to the ∆RR series showed the highest Specificity (96.08%), the lowest error (5.32%) and the best performance during the robustness test. Motion artifacts were added in order to obtain a range of SNR values from -30 to 30 dB: Tateno et al. obtained Sensitivity = 85.79% and Specificity = 85.79% at 0 dB SNR. If Specificity is an important requirement, Tateno et al. should be preferred. The low error, combined with a good Sensitivity, makes it a good choice for ambulatory conditions. The algorithm performance depends on two parameters, the P-value threshold (Pc) and Nseg: an increase in Pc causes a Specificity enhancement at Sensitivity’s expense. One can also vary Nseg, but paying attention to the fact that, over short intervals, fluctuations can mimic the irregular nature of AF, increasing the percentage of False Positive detections. The implementation of this algorithm also requires determination of the standard density histograms to be compared with test distributions: this is a delicate point, since standard histograms will constitute the templates for Atrial Fibrillation detection.
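The P-value computation described in Section 2.3.3.1 can be sketched in Python as follows. This is a hedged illustration, not the code of Tateno et al.: it compares two raw samples of ∆RR values instead of pre-built density histograms, truncates the infinite series to a fixed number of terms, and returns 1 for very small λ, where the truncated alternating series does not converge (a numerical safeguard that is an assumption of this sketch). The function names are illustrative.

```python
import math
import numpy as np

def ks_distance(standard, test):
    """Greatest distance D between the empirical cumulative distributions."""
    a = np.sort(np.asarray(standard, dtype=float))
    b = np.sort(np.asarray(test, dtype=float))
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.max(np.abs(cdf_a - cdf_b)))

def ks_p_value(d, n1, n2, terms=100):
    """P-value of the KS test for distance d and sample sizes n1 (standard), n2 (test)."""
    ne = n1 * n2 / (n1 + n2)
    lam = (math.sqrt(ne) + 0.12 + 0.11 / math.sqrt(ne)) * d
    if lam < 0.3:
        # Assumption: the series converges too slowly here; the true value is close to 1
        return 1.0
    total = sum((-1) ** (j - 1) * math.exp(-2.0 * j * j * lam * lam)
                for j in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

A test segment would then be labelled as AF when the P-value obtained against the AF template exceeds the threshold Pc, with both the template and Pc to be determined on training data.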

2.3.4. A detector for a chronic implantable atrial tachyarrhythmia monitor (Sarkar et al.)

The detector described in this paper was developed for an implantable device, ensuring accurate long-term Atrial Fibrillation monitoring. The irregularity of RR intervals is still the underlying detection principle: in particular, a certain number of ∆RR intervals, corresponding to a fixed time interval, is used to populate a scatter plot, called Lorenz plot, in which the distribution of the points is a telltale sign of rhythm modifications. Figure 2.3 displays representative examples of distribution patterns for NSR and AF: points gathered around the origin mean that no clinically significant HRV is occurring, since differences in consecutive RR intervals are small; a sparser distribution, instead, reveals significant changes in time of the instantaneous heart rate, which is a measurable expression of the irregular ventricular response observed during AF.

A numeric representation of the Lorenz plot, a two-dimensional histogram, is used to extract a feature, AFEvidence, which accounts for the positioning of the points within the plot. This feature is compared with a threshold to reveal the heart disturbance.

2.3.4.1. Methods

The Lorenz plot of ∆RR intervals is a scatter plot: on the x axis we find the current ∆RR interval (∆RR(i) = RR(i) − RR(i−1)), on the y axis the previous one (∆RR(i−1) = RR(i−1) − RR(i−2)) (see Figure 2.3). So, this plot depicts the uncorrelated nature of RR intervals in the direction of change of 3 consecutive interbeat intervals: for each point, its magnitude encodes the type of irregularity occurring in RR(i), RR(i−1), RR(i−2), while magnitude coupled with phase encodes the incoherence of change of the same intervals.

During normal sinus rhythm, most of the points will be gathered around the origin ofthe Lorenz plot, but, when AF occurs the distribution will become sparser.
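The construction of the Lorenz plot points can be sketched in Python as follows. The function names are illustrative, and the mean distance from the origin is used here only as a simple, assumed measure of how sparse the distribution is; it is not the AFEvidence metric of Sarkar et al.

```python
import numpy as np

def lorenz_points(rr):
    """Return (x, y) with x = dRR(i) and y = dRR(i-1) for an RR interval series."""
    d = np.diff(np.asarray(rr, dtype=float))   # d[j] = RR(j+1) - RR(j)
    return d[1:], d[:-1]                       # current difference, previous difference

def mean_radius(rr):
    """Mean distance of the Lorenz plot points from the origin (illustrative only)."""
    x, y = lorenz_points(rr)
    return float(np.mean(np.hypot(x, y)))
```

For a regular rhythm the points collapse onto the origin and the mean radius is close to zero, while an irregular RR series yields a much larger value.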

Let’s consider a numeric representation of this plot, e.g. a two-dimensional histogram, which is divided into 13 discrete segments, as shown in Figure 2.4: for each rhythm, there will be a higher probability of points falling within specific segments.

As an example, in case of NSR, points will be mostly concentrated around the origin: therefore, the origin bin of the histogram, denoted by “0”, will contain a large number of points, while bins in other segments will be nearly empty. On the contrary, during AF, points will spread across the whole histogram, filling a higher number of bins in each segment from “1” to “12”.

Figure 2.3.: Distribution patterns in a Lorenz plot of ∆RR intervals for NSR and AF. On the x axis there is the current ∆RR interval, on the y axis the previous ∆RR interval.

The counts of bins populated for each segment are used to compute AFEvidence, a detection metric that quantifies the likelihood of being in presence of Atrial Fibrillation:

$$ AFEvidence = IrregularityEvidence - OriginCount - 2 \cdot PACEv \qquad (2.1) $$

$$ IrregularityEvidence = \sum_{n=1}^{12} BC_n \qquad (2.2) $$

$$ PACEv = \sum_{n=1}^{4} (PC_n - BC_n) + \sum_{n=5,6,10} (PC_n - BC_n) - \sum_{n=7,8,12} (PC_n - BC_n) \qquad (2.3) $$

where $OriginCount$ corresponds to the number of points contained in the origin bin, $BC_n$ is the number of non-empty bins contained in the $n$th segment, and $PC_n$ is the number of points contained in the $n$th segment.

2.3.4.2. Discussion

The method was evaluated on the MIT-BIH AF database and showed Sensitivity = 97.5% and Positive Predictive Value = 99%, computed by comparing detection and clinical truth in 2-minute blocks and labelling a target segment as AF if at least 60% of beats were in AF.

Figure 2.4.: The two-dimensional histogram, numeric representation of a Lorenz plot of ∆RR intervals (x axis: dRR(i); y axis: dRR(i−1)). The histogram is divided into 13 discrete segments: for each rhythm, there will be a higher probability of points falling within specific segments.

Detector performance is worsened by significant amounts of Atrial Tachycardia (AT). This condition can be improved by adding a supplemental detector to distinguish AT with regular ventricular response from AF. However, results showed a reduction of the overall accuracy by 6%, caused by overestimation of AT in patients with low AF/AT burden.

The detector of Sarkar et al. is also designed to discriminate quite well AF from other kinds of arrhythmias which are not characterized by an irregular ventricular response. The excellent performance of this algorithm makes it eligible for a further implementation on phone: it must be said that the optimal window size proposed by the authors is quite long (2 minutes, at least), so it would be interesting to test how the detection power varies when reducing the window length and the minimum percentage of AF in a target segment (which is currently 60%), since a higher resolution is needed to reveal paroxysmal events of short duration and unexpected onset.

Figure 2.5.: Histograms of RR intervals computed over 10 s windows. Significant differences in RR distribution can be seen between AF (top figure) and NSR (bottom figure) in terms of standard deviation.

The description of the algorithm steps in the corresponding paper [36] lacks some details. As an example, the ∆RR histogram (see Figure 2.4) spans [−600; +600] ms. How are out-of-bounds points treated?
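Equations (2.1)-(2.3) can be written directly in Python once the per-segment counts are available. The sketch below deliberately takes $BC_n$, $PC_n$ and $OriginCount$ as inputs: the geometry of the 13 segments is defined only graphically in Figure 2.4, so their computation from the histogram is not reproduced here. The function name and the dictionary-based interface are assumptions made for illustration.

```python
def af_evidence(bc, pc, origin_count):
    """AFEvidence from equations (2.1)-(2.3).

    bc[n]: number of non-empty bins in segment n (n = 1..12)
    pc[n]: number of points in segment n (n = 1..12)
    origin_count: number of points in the origin bin (segment 0)
    """
    irregularity_evidence = sum(bc[n] for n in range(1, 13))

    def excess(n):
        # Points in segment n beyond one per populated bin
        return pc[n] - bc[n]

    pac_evidence = (sum(excess(n) for n in (1, 2, 3, 4))
                    + sum(excess(n) for n in (5, 6, 10))
                    - sum(excess(n) for n in (7, 8, 12)))
    return irregularity_evidence - origin_count - 2 * pac_evidence
```

With every populated bin holding exactly one point the PAC term vanishes, and the metric reduces to the number of populated bins minus the origin count.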

2.3.5. Robust detection of Atrial Fibrillation for a long-term telemonitoring system (Logan et al.)

This AF detector was designed to provide an automatic, robust detection of Atrial Fibrillation. It has been conceived for ambulatory monitoring situations, where arbitrary lead placements, muscle artifact and potentially changing morphology of the signal can represent a challenge for an AF detector. It is simply based on the variance of RR intervals, which is a good indicator for AF: the key words for this method are detection reliability and ease of implementation. Evaluated by the authors on the MIT-BIH AF database, it showed Sensitivity = 96% and Specificity = 89%, which constitute a satisfactory result for AF screening, but these metrics are not confirmed in the comparative study [23].

2.3.5.1. Methods

This method uses an open source QRS detector (Zong et al. [42]) to derive the RR series from the ECG tracing: it is a morphology-independent detector which provides accurate R-wave spike detection regardless of lead placement. This fundamental preliminary phase is followed by these simple steps:

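• RR intervals are normalized according to the equation

$$ RR_{norm} = \frac{RR}{\overline{RR}} \cdot 100 $$

to compensate for different patient resting heart rates, where $\overline{RR}$ is a running mean updated as $\overline{RR} = 0.75\,\overline{RR} + 0.25\,RR$ and $RR$ is the current RR interval.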

• Variance of the RRnorm statistic is computed over 10 s sliding windows. An initial AF detection is computed according to whether the variance over each 10 s window is greater than a settable threshold. Initial classifications are then smoothed using a simple majority voting scheme over 600-beat windows in order to eliminate spurious measurements.

• A simple thresholding is applied to detect AF. A typical threshold is 200 ms.
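The normalization and windowed-variance steps listed above can be sketched in Python as follows. Several choices are assumptions of this sketch and not details taken from Logan et al.: the running mean is seeded with the first RR interval, the 10 s windows are non-overlapping rather than sliding, the threshold default of 200 is a placeholder applied to the variance of the normalized (percentage) values, and the majority voting step is omitted. Function names are illustrative.

```python
import numpy as np

def normalized_rr(rr_ms):
    """Normalize each RR interval by a running mean (0.75 old + 0.25 new), in percent."""
    rr = np.asarray(rr_ms, dtype=float)
    running_mean = rr[0]                  # assumption: seeded with the first interval
    out = np.empty_like(rr)
    for i, value in enumerate(rr):
        running_mean = 0.75 * running_mean + 0.25 * value
        out[i] = value / running_mean * 100.0
    return out

def window_flags(rr_ms, window_s=10.0, threshold=200.0):
    """Flag each non-overlapping window whose normalized-RR variance exceeds threshold."""
    rr = np.asarray(rr_ms, dtype=float)
    norm = normalized_rr(rr)
    beat_times = np.cumsum(rr) / 1000.0   # beat times in seconds
    flags = []
    start = 0.0
    while start + window_s <= beat_times[-1]:
        in_window = (beat_times >= start) & (beat_times < start + window_s)
        if in_window.sum() >= 2:          # variance needs at least two beats
            flags.append(bool(norm[in_window].var() > threshold))
        start += window_s
    return flags
```

A perfectly regular series yields no flagged window, whereas a series alternating between short and long intervals is flagged in every window; in practice the threshold would have to be tuned on annotated data.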

2.3.5.2. Discussion

In the comparative study [23], the detector showed Sensitivity = 87.30%, Positive Predictive Value = 85.72%, Specificity = 90.31% and a total Error = 10.89% with an optimal window size of 120 s on the MIT-BIH AF database. When applying a noise test, by adding different levels of noise to ten clean ECG signals (1 hour in duration each) [23], Sensitivity increased for high levels of noise: this is because baseline fluctuations mimic AF and are detected as AF, thus resulting in an increment in Sensitivity while Specificity decreases markedly.

The algorithm proposed by Logan et al. also requires the initial calculation of the running mean RR used for normalization. In the comparative study [23], an initial input signal (of ∼1000 seconds) was required to adjust this parameter and obtain the expected performance. So it has a start-up delay of more than 10 minutes: even if AF is not immediately life threatening, this aspect, together with unsatisfactory results on the AF database, leads to discarding this detector from the list of potential candidates.



Figure 2.6.: Discrete Wavelet Transform of the RR series at different scales. The amplitude of coefficients highlights onset and offset of high heart rate variability regions. Pattern changes are more evident at the highest scales.

2.3.6. High accuracy of automatic detection of atrial fibrillation using wavelet transform of heart rate intervals (Duverney et al.)

Identification of AF is performed by using a cascade of two different analysis techniques: first, the discrete wavelet transform (DWT) is applied to the RR interval series to find regions characterized by high heart rate variability (HRV). Successively, fractal analysis is applied to discriminate between physiological and abnormal patterns. In fact, even if the ventricular response can be described as irregular, studies [2] demonstrated that it is not exactly a random process, since preferential conduction patterns and various degrees of short-term predictability exist. For these reasons, fractal analysis is likely to have promising detection power.

2.3.6.1. Methods

The DWT of the signal is used to reveal onset and offset of high variability events with high accuracy in time localization (of about two RR intervals) for potential target events. The mother wavelet ψ used for this purpose was a quadratic spline of the third order: at each scale, obtained by different dilations and translations of ψ, an increase in HRV is reflected in the amplitude of the corresponding DWT coefficients.

The identification of high HRV loci is not a sufficient condition for AF, because even NSR can reach high variability levels. A second processing of the data is therefore needed, to quantify the level of randomness within an HRV zone. For this purpose, fractal analysis is used: while NSR has a general trend in 1/f^β, with a single linear downsloping pattern when observed on a log-log plot, the Power Spectral Density of AF is characterized by two different slopes. The first slope, in the high frequency band, is higher (β ≈ 0) than the one obtained during NSR (β ≈ 1). So, the Hurst exponent H, related to β through the equation β = 2H + 1, is computed only for the five highest scales of the DWT and compared to a threshold for AF identification.

Figure 2.7.: Power Spectral Density of the HRV signal on a log-log plot. NSR (on the left) has a single linear downsloping pattern, AF (on the right) results in two different slopes. Computation of the Hurst exponent allows discrimination between the two conditions.

2.3.6.2. Discussion

This algorithm was not evaluated on any standard database present in Physionet, but on a set of 50 subjects comprising Chronic AF, Paroxysmal AF and NSR, with both high Sensitivity and Specificity, which would suggest this method for screening exams. DWT and fractal analysis are also sensitive to other arrhythmias, such as supraventricular extrasystoles or supraventricular tachycardia: this can increase false-positive detection. However, false-negative detection, more problematic during a screening phase, is very infrequent. The evaluation of the parameters used for detection, variability index and Hurst exponent, needs at least 64 consecutive beats: this could be a limitation for the detection of short episodes. Moreover, this algorithm is quite computationally intensive, since it requires the DWT of each segment for five different scales and the PSD to establish the Hurst exponent.
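The relation between the spectral slope and the Hurst exponent used in Section 2.3.6.1 can be illustrated with a short Python sketch. It is not the procedure of Duverney et al.: here β is estimated by a straight-line fit of a plain periodogram over the whole frequency band (an assumption of this sketch), whereas the original method works on selected DWT scales. Function names are illustrative.

```python
import numpy as np

def spectral_exponent(x):
    """Estimate beta in PSD ~ 1/f^beta from a log-log fit of the periodogram."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    psd = np.abs(np.fft.rfft(x)) ** 2
    freq = np.fft.rfftfreq(len(x))
    keep = freq > 0                       # drop the zero-frequency bin before taking logs
    slope, _ = np.polyfit(np.log10(freq[keep]), np.log10(psd[keep]), 1)
    return float(-slope)

def hurst_from_beta(beta):
    """Invert beta = 2H + 1."""
    return (beta - 1.0) / 2.0
```

For a white-noise series the estimated β is close to 0, while for a random walk it approaches 2 (somewhat less with this simple full-band fit), so a threshold on β or on H separates flat from steep spectra.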



2.3.7. Atrial fibrillation detection algorithms for very long term ECG monitoring (Petrucci et al.)

Two algorithms were proposed by Petrucci et al. for accurate assessment of the true success rate of AF treatments in long-term (weeks) ECG monitoring. These methods use meaningful descriptors to discriminate AF, extracted from RR prematurity and ∆RR histograms, which are intuitive representations of the short-term variability of cardiac rhythm. In fact, during an AF episode, the ∆RR distribution is sparser compared to normal sinus rhythm and always unimodal compared to other arrhythmias, such as Iterative Atrial Tachycardia.

2.3.7.1. Methods

Both algorithms require preliminary detection, timing and classification of QRS complexes. In the RR series derived from the ECG tracing, only Normal-to-Normal (NN) intervals are considered for successive processing.

For the first method, the ∆RR histogram is obtained by differentiating successive NN interbeat intervals. The only geometric feature computed to characterize the histogram is the difference between the first empty bins on the right and on the left of the modal value in the distribution (MDW).

For the second method, the Prematurity (P) histogram is built, where P of each NN interval is the percentage variation from the current heart rate

P(i) = \frac{NN(i) - NN_{mean}}{NN_{mean}} \cdot 100

where NN(i) is the actual NN interval and NN_{mean} is a running average of NN over the last n intervals; n is shorter for rhythmic beats (fast update) and longer for premature or delayed beats (slow update).

This histogram is characterized by:

• Number of non empty bins (NEB)

• MDW

• Difference between mean and median

• Geometric test of bimodality (search for a secondary significant modal peak and computation of the distance from the main mode)
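As an illustration only, the prematurity index and two of the descriptors listed above (NEB and MDW) can be sketched in Python; the thesis implementation itself is in Matlab. The fixed averaging length `n`, the bin width `bin_width` and all function names are assumptions of this sketch. The adaptive fast/slow update of the running average is not reproduced, the running average is taken over the `n` intervals preceding the current one, and MDW is read as the distance between the first empty bins on either side of the modal bin.

```python
from collections import Counter

def prematurity(nn, n=8):
    """P(i) = (NN(i) - NNmean) / NNmean * 100.

    NNmean is the mean of the n intervals preceding NN(i); a fixed n is an
    assumption of this sketch (the paper adapts n to the rhythm).
    """
    p = []
    for i in range(n, len(nn)):
        nn_mean = sum(nn[i - n:i]) / n
        p.append((nn[i] - nn_mean) / nn_mean * 100.0)
    return p

def histogram(values, bin_width=2.0):
    """Count values per bin; bins are identified by an integer index."""
    return Counter(int(v // bin_width) for v in values)

def non_empty_bins(hist):
    """NEB: number of bins holding at least one value."""
    return len(hist)

def mdw(hist, bin_width=2.0):
    """Distance between the first empty bins right and left of the mode."""
    mode = max(hist, key=hist.get)
    right = mode
    while right in hist:
        right += 1
    left = mode
    while left in hist:
        left -= 1
    return (right - left) * bin_width

regular = [800.0] * 20
irregular = [800.0] * 8 + [600.0, 1000.0, 700.0, 900.0]
print(non_empty_bins(histogram(prematurity(regular))))
print(non_empty_bins(histogram(prematurity(irregular))))
```

A perfectly regular series falls into a single bin, while the irregular tail spreads over several bins, which is the sparseness the method relies on.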



Figure 2.8.: Prematurity histograms during Normal Sinus Rhythm, Iterative Atrial Tachycardia and Atrial Fibrillation. In these graphs, differences in distributions are evident: in Atrial Fibrillation the histogram is sparser than in Normal Sinus Rhythm and unimodal with respect to Iterative Atrial Tachycardia.

Table 2.3.: AF vs. Not-AF rhythms transitions rules.

The last two parameters are used to distinguish unimodal from bimodal distributions. These indexes are combined to design a more complex rule for detecting AF according to the rhythm transition rules displayed in Table 2.3.

2.3.7.2. Discussion

Both algorithms were evaluated on the MIT-BIH AF database, considering only episodes longer than two minutes. In particular, the second one obtained good results in terms of Sensitivity (91%) and Positive Predictive Value (92%).

Optimal window lengths, obtained by searching the value that minimized the overlapping region of MDW distributions in AF and not-AF rhythms, are 40 s and 60 s respectively.

The final purpose of this work was to design detectors for on-board implementation, requiring reliable, simple and easy-to-implement methods with low memory requirements. For this reason, the first algorithm was designed to be as simple as possible and obtained acceptable performance. For the second method, complexity was increased by including more detection parameters and a more complex decision rule: this approach, justified by a significant performance improvement, caused only a small increase in computational burden.

2.3.8. Accurate estimation of entropy in very short physiological time series: the problem of atrial fibrillation detection in implanted ventricular devices (Lake and Moorman)

This algorithm was conceived for automatic detection of Atrial Fibrillation in very short RR interval series obtained from a single lead, to address the challenge of detecting paroxysmal episodes with unexpected onset. The aim is to provide a rapid, accurate diagnosis of AF burden with minimal calculations. The coefficient of sample entropy (COSEn) is the non-linear index used to distinguish AF and Atrial Flutter (AFL) from sinus rhythm and other arrhythmias. COSEn is an optimized combination of

• Sample entropy (SampEn), able to encode the irregular nature of short RR interval segments during AF.

• Mean heart beat interval (RR), which adds further independent information to the discrimination.

The underlying detection principle is that AF is characterized by rapid heart rate and significant irregularity, hence high entropy: this is the reason why COSEn proves to be a good classifier for this arrhythmia, as shown in Table 2.4.

2.3.8.1. Methods

The feature involved in AF identification is:

COSEn = SampEn - \ln(2r) - \ln(\overline{RR})    (2.4)



Table 2.4.: COSEn as function of age (years). COSEn values show significant changes between NSR and AF. Slight variations are also related to age, due to the physiological decrease of complexity in cardiac activity that occurs with ageing. However, it does not significantly affect AF detection power.

SampEn is the negative natural logarithm of the conditional probability that two short templates of length m matching within an arbitrary tolerance r will continue to match at the next point

SampEn = -\ln(A/B) = \ln(B) - \ln(A)    (2.5)

where A is the total number of matches of length m + 1 and B is the total number of matches of length m.

The matching tolerance r has an initial value of 30 ms but it is allowed to increase until a minimum numerator count (A) is achieved. This is to ensure consistency in SampEn computation.
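Equations (2.4) and (2.5), together with the growing tolerance, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the template length `m=1`, the minimum numerator count `min_count`, the increment `r_step` and the function names are hypothetical choices, and both r and the RR intervals are expressed in milliseconds, so any detection threshold would depend on that choice of units.

```python
import math

def sample_entropy(rr, m=1, r=30.0, min_count=5, r_step=5.0):
    """Return (SampEn, tolerance actually used) for one RR segment.

    A counts template pairs matching for m + 1 points, B those matching
    for m points. The tolerance r grows by r_step until A >= min_count
    (min_count and r_step are assumptions of this sketch).
    """
    n = len(rr) - m  # templates that can be extended to length m + 1
    if n * (n - 1) // 2 < min_count:
        raise ValueError("segment too short for the requested min_count")
    while True:
        a = b = 0
        for i in range(n):
            for j in range(i + 1, n):
                if all(abs(rr[i + k] - rr[j + k]) < r for k in range(m)):
                    b += 1
                    if abs(rr[i + m] - rr[j + m]) < r:
                        a += 1
        if a >= min_count:
            return math.log(b) - math.log(a), r
        r += r_step

def cosen(rr, m=1, r=30.0):
    """COSEn = SampEn - ln(2r) - ln(mean RR), as in Equation (2.4)."""
    samp_en, r_used = sample_entropy(rr, m=m, r=r)
    mean_rr = sum(rr) / len(rr)
    return samp_en - math.log(2.0 * r_used) - math.log(mean_rr)

segment = [600.0, 900.0, 700.0, 1000.0, 650.0, 950.0,
           720.0, 880.0, 610.0, 990.0, 760.0, 840.0]
print(cosen(segment))
```

For a perfectly regular segment every template matches, so A = B and SampEn is zero; irregular segments give A < B and therefore a positive SampEn.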

This non-linear index is computed using short segments of the RR interval series. A simple thresholding is then applied to diagnose AF.

2.3.8.2. Discussion

This method uses a very small window length (only 12 beats per segment).

In the paper [22], when comparing a target segment (true diagnosis for that segment) with the prediction returned by the AF detector, the authors label a target segment as AF even if just one true positive beat is contained therein.



In the evaluation protocol Atrial Flutter is labelled as AF: this choice appears to be quite controversial, since AF and Atrial Flutter are treated differently, and it is likely to worsen the detector's performance, because Atrial Flutter is most of the time characterized by a regular pattern with low entropy.

This method was validated on the MIT-BIH database and obtained Sensitivity=91% and Specificity=94%. It was successively evaluated on Holter monitoring recordings from the University of Virginia (UVa) with lower performance, explained by the decision to include AFL with AF and by the presence of other arrhythmias which challenge AF identification. A high number of false positives was also due to recordings with frequent, complex ventricular ectopy or electronic pacemakers.

2.3.9. Improvements in atrial fibrillation detection for real-time monitoring (Babaeizadeh et al.)

The authors presented an algorithm for automatic real-time detection of AF. It is based on both ventricular response analysis and atrial activity analysis. The purpose was to obtain better results than previously published algorithms, particularly in terms of Specificity and Positive Predictive Value. Minimizing the false positive rate becomes mandatory to evaluate the success of pharmacological and surgical therapies, and thus to avoid over-treatment.

2.3.9.1. Methods

This method comprises a first classifier which is applied to the RR interval series: a Markov approach is used to calculate a score reflecting the likelihood of observing an RR series in AF episodes versus making the same observation outside AF. The score is compared with a fixed threshold. The second classifier uses two parameters, P-wave location (P-R interval variation) and P-wave morphology, to detect AF. During AF episodes, P-wave location cannot be measured because the P-wave is absent. At the same time, the P-wave template matches poorly with the signal.

A statistical approach based on decision trees is then used to classify test data, using the described features.

A post-classification corrector is added to the decision tree to modify the decision logic in special cases: as an example, if a good P-wave is detected, the analyzed portion of the signal should not be recognized as AF, no matter how irregular the RR interval sequence is.

Moreover, in order to reduce the number of short false-positive episodes that cause false alarms, a hysteresis counter is used that begins an episode only if a few consecutive sets of parameters have been classified as AF. The counter is non-symmetric, so as to be less sensitive at beginning an episode than at terminating one.

2.3.9.2. Discussion

In the comparative study, Babaeizadeh et al. showed Sensitivity=87.27%, Specificity=95.14% and Positive Predictive Value=92.29%. It could be argued that the computational complexity of this algorithm is not well reflected by the corresponding results on the MIT-BIH AF database, although Specificity is quite high. Anyway, it would be interesting to use the MIT-BIH Arrhythmia database as test set to assess the false prediction rate in the presence of all kinds of atrial and ventricular arrhythmias.

Use of the hysteresis counter reduces false AF alarms and makes this algorithm user-friendly in the clinical environment.
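The behaviour of such a non-symmetric hysteresis counter can be sketched in Python as below. This is only an illustration of the idea described in the Methods: the counts `start_after` and `stop_after` and the function name are hypothetical, since the actual values used by Babaeizadeh et al. are not reported here.

```python
def hysteresis_episodes(window_is_af, start_after=3, stop_after=1):
    """Convert per-window AF decisions into an episode state.

    An episode begins only after start_after consecutive AF windows and
    ends after stop_after consecutive non-AF windows (both values are
    assumptions of this sketch, chosen so that starting is harder).
    """
    in_episode = False
    run = 0  # consecutive windows contradicting the current state
    states = []
    for is_af in window_is_af:
        if is_af != in_episode:
            run += 1
            needed = stop_after if in_episode else start_after
            if run >= needed:
                in_episode = not in_episode
                run = 0
        else:
            run = 0
        states.append(in_episode)
    return states

decisions = [True, False, True, True, True, True, False, True]
print(hysteresis_episodes(decisions))
```

In this example the isolated positive window at the start never opens an episode, while the run of consecutive positives does, and a single negative window closes it.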

The authors underline some limitations of this method: an AF episode characterized by a regular RR series may be misclassified because of complete atrioventricular block or ventricular pacing. Furthermore, if the patient has an irregularly irregular but non-AF rhythm with significant noise in P-wave regions, it can cause a false positive detection.

However, this method is definitely not the most suitable for screening applications, primarily because of its computational complexity and its rather unsatisfactory sensitivity to AF episodes.

2.3.10. Comparison table and algorithms selection

In Table 2.5, all the proposed algorithms are compared on the basis of their

• Computational efficiency (Ceff): H=High, M=Medium, L=Low.

• Window length (WL): WL is expressed in seconds (s) or beat-segments (bs).

• Performance metrics: Sensitivity (Se = TP/(TP + FN)), Specificity (Sp = TN/(TN + FP)), Positive Predictive Value (PPV = TP/(TP + FP)), Negative Predictive Value (NPV = TN/(TN + FN)), Accuracy (Acc = (TP + TN)/N) and Error (Err = (FP + FN)/N). Performance metrics are displayed using the same number of significant digits reported in the corresponding papers, to be as faithful as possible.
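For reference, the metric definitions above translate directly into a few lines of Python. The function name and the example counts are purely illustrative, and the sketch does not guard against empty classes (zero denominators).

```python
def performance_metrics(tp, tn, fp, fn):
    """Se, Sp, PPV, NPV, Acc and Err from the confusion-matrix counts."""
    n = tp + tn + fp + fn
    return {
        "Se": tp / (tp + fn),
        "Sp": tn / (tn + fp),
        "PPV": tp / (tp + fp),
        "NPV": tn / (tn + fn),
        "Acc": (tp + tn) / n,
        "Err": (fp + fn) / n,
    }

# Illustrative counts only (not taken from any database).
metrics = performance_metrics(tp=90, tn=80, fp=20, fn=10)
for name, value in metrics.items():
    print(f"{name} = {100.0 * value:.2f}%")
```

By construction Acc and Err sum to one, which is a quick consistency check when reading tables of published results.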

As Table 2.5 shows, the test sets used for each detector are often different, and the window length can be expressed in terms of seconds or of the number of RR intervals contained in each analyzed segment (bs). Moreover, not all performance metrics are available for each method.

For all these reasons, the choice of the three algorithms to be implemented is not an easy task.

An important consideration was to end up with three methods that use different approaches in revealing AF: the independence of features, in fact, plays a key role when a machine learning technique is used to enhance the power of classification.

Another important selection criterion was the availability of results on open-source standard databases, such as the MIT-BIH AF database and the MIT-BIH NSR database. This is why Duverney et al. was excluded.

Babaeizadeh et al. is not suitable for a screening application on a phone, because it is based on atrial activity analysis, which is computationally intensive and does not provide sufficient robustness to artifacts and noise.

Logan et al. showed unsatisfactory Sensitivity on the MIT-BIH AF database.

Among the remaining candidates, Sarkar et al. was chosen because of its excellent performance on the MIT-BIH NSR database, the MIT-BIH NSR RR interval database and the MIT-BIH AF database.

Lake and Moorman was preferred to Tateno et al. and Petrucci et al. because, despite a lower sensitivity, it uses a very small window size and a completely different approach in revealing AF compared to the other ones, which are more similar to Sarkar et al., being based on ∆RR distribution analysis as well.

Finally, Linker DT. was selected in place of Cerutti et al., since its Sensitivity is higher and the optimal window size used is much smaller.

Selected algorithms are listed in Table 2.6.

From this point forward, the acronyms CosEn, MAD and Lorenz will be used to refer to the methods respectively described in the papers by Lake and Moorman [22], Linker DT. [24] and Sarkar et al. [36].



Algorithm              Database                        Se (%)   Sp (%)   PPV (%)  NPV (%)  Acc (%)  Err (%)  Ceff  WL
Lake and Moorman       MIT-BIH AF db                   91.0     94.0     -        -        -        -        H     12 bs
Cerutti et al.*        MIT-BIH AF db                   96.10    81.55    75.76    -        -        16.62    H     90 s
Linker DT.*            MIT-BIH AF db                   97.64    85.55    81.81    -        -        9.61     H     10 s
Tateno et al.*         MIT-BIH AF db                   91.20    96.08    90.32    -        -        5.32     M     50 s
                       MIT-BIH Arrhythmia db s200 (1)  88.2     87.6     62.4     -        -        -        -     100 bs
                       MIT-BIH Arrhythmia db s100 (1)  -        99.0     -        -        -        -        -     100 bs
Sarkar et al.          MIT-BIH AF db                   97.5     -        99.0     -        -        -        -     120 s
                       AF db (2)                       99.3     -        99.4     -        -        -        -     120 s
                       All NSR db (3)                  -        99.4     -        -        -        -        -     120 s
Logan et al.*          MIT-BIH AF db                   87.30    90.31    85.72    -        -        10.89    M     120 s
Duverney et al.        SR group (4)                    -        99.9     -        -        -        -        -     64 bs
                       CAF group (4)                   99.2     -        -        -        -        -        -     64 bs
                       PAF group (4)                   96.1     92.6     -        -        94.1     -        -     64 bs
Petrucci et al. (5)    MIT-BIH AF db                   92       -        78       -        -        -        -     60 s
Petrucci et al. (6)    MIT-BIH AF db                   91       -        92       -        -        -        -     40 s
Babaeizadeh et al.*    MIT-BIH AF db                   87.27    95.47    92.75    -        -        7.80     L     40 s

Table 2.5.: AF detectors comparison table.
* Performance metrics on the MIT-BIH AF database are reported in the comparative study by Larburu et al. [23].
(1) s200 and s100 are series 200 and series 100 of the MIT-BIH Arrhythmia database (see Section 3.3 for more details).
(2) AF database (81 patients, 476 h of AF, 76 h of Atrial Tachycardia, 1397 h of NSR), not available online.
(3) MIT-BIH NSR database and MIT-BIH RR interval NSR database, available on Physionet.org.
(4) A validation set of 50 subjects was selected with these characteristics: 19 with chronic AF (CAF), 15 with paroxysmal AF (PAF) and 16 with permanent normal sinus rhythm (NSR).
(5) Algorithm based on RR histogram analysis.
(6) Algorithm based on Prematurity histogram.

Algorithm   Authors
CosEn       Lake and Moorman
MAD         Linker DT.
Lorenz      Sarkar et al.

Table 2.6.: Selected algorithms based on ventricular response analysis.

2.3.11. Lorenz: implementation details

The article describing the Lorenz algorithm [36] lacks some details, as previously said in Section 2.3.4. With respect to Figure 2.4, the missing information concerns segment precedence on the Lorenz plot, treatment of out-of-bounds points and compensation for very high and very low heart rates. These issues, which led to an inevitable drop in performance when comparing my results with the results reported in the paper [36] for the MIT-BIH AF database (see Table 2.5), were solved thanks to the additional material generously provided by the author S. Sarkar, and are detailed below.

           Lorenz1   Lorenz2
Se (%)     76.72     98.27
Sp (%)     99.60     98.40
Acc (%)    91.31     98.35
PPV (%)    99.44     97.81
NPV (%)    83.10     98.73
Err (%)    8.69      1.65

Table 2.7.: Performance metrics on the MIT-BIH AF database before (1) and after (2) applying frequency compensation and the criteria for outlier points.

Segments precedence on the Lorenz plot.

Some of the 13 segments composing the 2D histogram (see Figure 2.4) overlap (they share some bins), so that there are points which simultaneously belong to more than one segment. In such cases, common points must be considered part of the segment with higher precedence: the diagonal segments have higher precedence than the along-axis segments.

Out-of-bounds points.

The ∆RR histogram (see Figure 2.4) spans the range [−600; +600] ms. A point defined by (∆RR_i, ∆RR_{i−1}) is considered an outlier if ∆RR_i or ∆RR_{i−1} is outside the boundary [−1500; +1500] ms. An outlier is ignored (it is not counted in a histogram bin at all). On the other hand, points falling outside the histogram range [−600; +600] ms but within the outlier boundary [−1500; +1500] ms are counted in the appropriate “edge” bin along the outer range.

Frequency compensation.

Before storing a point in the 2D histogram, the coordinate ∆RR_i (∆RR_{i−1}) is multiplied by the constant factor k1 = 0.5 if RR_i > 1000 ms (RR_{i−1} > 1000 ms), or by the constant factor k2 = 2 if RR_i < 500 ms (RR_{i−1} < 500 ms). This is to compensate for very fast HRs, which can generate a relatively dense cloud of points over a small range of bins, and for very low HRs, which have exactly the opposite effect.
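The out-of-bounds and frequency compensation rules can be combined in a short Python sketch. It is an illustration under assumptions rather than the code used for Table 2.7: the function names are hypothetical, and the order of the two steps (compensation first, outlier test afterwards) is an assumption, since the text above does not state it.

```python
def compensate(d_rr, rr, k_slow=0.5, k_fast=2.0):
    """Scale a delta-RR coordinate according to the RR interval (ms)."""
    if rr > 1000.0:   # very low heart rate
        return d_rr * k_slow
    if rr < 500.0:    # very fast heart rate
        return d_rr * k_fast
    return d_rr

def lorenz_point(d_rr_i, d_rr_prev, rr_i, rr_prev,
                 hist_limit=600.0, outlier_limit=1500.0):
    """Return the (x, y) point to store in the 2D histogram, or None.

    Outliers (beyond +/- outlier_limit ms) are dropped; points beyond the
    histogram range (+/- hist_limit ms) are moved to the edge bin.
    Compensating before the outlier test is an assumption of this sketch.
    """
    x = compensate(d_rr_i, rr_i)
    y = compensate(d_rr_prev, rr_prev)
    if abs(x) > outlier_limit or abs(y) > outlier_limit:
        return None

    def clamp(v):
        return max(-hist_limit, min(hist_limit, v))

    return clamp(x), clamp(y)

print(lorenz_point(900, -700, 800, 800))   # clamped to the edge bins
print(lorenz_point(1600, 0, 800, 800))     # outlier, ignored
```

Returning `None` for outliers lets the caller skip the point entirely, which mirrors the rule that an outlier is not counted in any bin.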



Table 2.7 displays performance metrics on the MIT-BIH AF database before and after applying frequency compensation and the criteria for outlier points to the algorithm (my own implementation).


3. Algorithms evaluation

3.1. Overview

Chapter 2 reviewed different ventricular response analysis techniques for Atrial Fibrillation detection, with particular focus on methods capable of meeting the requirements of an AF screening application to be implemented on a phone. Among them, three methods were selected, taking into account the promising results reported in literature.

These three algorithms, CosEn, MAD and Lorenz, were implemented in Matlab for further evaluation and comparison.

In this chapter, the AF detectors are tested and compared by using the same evaluation protocol. Databases used for this purpose are the standard databases of Physionet: the MIT-BIH Atrial Fibrillation database, the MIT-BIH Normal Sinus Rhythm database and the MIT-BIH Arrhythmia database.

Methods are retrained by varying the window length and by exploring different assumptions on target data of the MIT-BIH Atrial Fibrillation database.

They are successively evaluated on the MIT-BIH Normal Sinus Rhythm database and the MIT-BIH Arrhythmia database by using the same underlying conditions.

The aim is to provide a common analysis substrate, thus permitting more objective considerations on their performance.

3.2. General steps for AF detection

Detectors chosen for evaluation and comparison are all based on ventricular response analysis, aimed at capturing the irregular, rapid nature of AF. It is possible to identify the steps leading to the AF diagnosis, starting from a fragment of ECG signal. These steps are shared by all the selected algorithms and are summarized in this flowchart:



Starting from the ECG, the RR interval series is derived. Non-overlapping segments of this series, corresponding to a certain window size, are analyzed. Finally, an AF feature, dependent on the method used, is extracted from each segment and compared with a threshold to return the AF diagnosis.
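The shared steps can be sketched in Python as a small pipeline. Everything named here is an assumption made for illustration: the function names, the toy feature `rr_range` (which is not one of the three selected AF features), the threshold value and the convention that a feature above the threshold means AF. A trailing incomplete segment is simply dropped.

```python
def rr_intervals(r_peak_times_ms):
    """Differences between successive R-peak times give the RR series."""
    return [t1 - t0 for t0, t1 in zip(r_peak_times_ms, r_peak_times_ms[1:])]

def detect_af(rr, window, feature, threshold):
    """Apply feature to consecutive non-overlapping segments of `window`
    beats and compare it with threshold (True means AF in this sketch)."""
    decisions = []
    for start in range(0, len(rr) - window + 1, window):
        segment = rr[start:start + window]
        decisions.append(feature(segment) > threshold)
    return decisions

def rr_range(segment):
    """Toy irregularity feature, used only to make the example runnable."""
    return max(segment) - min(segment)

peaks = [0, 800, 1600, 2400, 3200, 3600, 4800, 5200, 6400]
rr = rr_intervals(peaks)
print(detect_af(rr, window=4, feature=rr_range, threshold=200))
```

Passing the feature as a function keeps the segmentation and thresholding logic identical for every detector, which is the point of a common evaluation protocol.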

3.3. Databases

Data for the evaluation protocol of the AF detectors were obtained from standard datasets in Physionet [11], which offers free web access to large collections of multi-channel ECG signals.

These databases, such as the Massachusetts Institute of Technology Division of Health Science and Technology's MIT-BIH Arrhythmia database, conform to the American National Standards Institute (ANSI) guidelines, developed by the Association for the Advancement of Medical Instrumentation (AAMI).

They are widely used in literature; for this reason, they encourage a comparison between existing AF methods and also represent a reliable starting point to apply improvements to the same methods or develop different solutions.



For the purposes of this work, the MIT-BIH Atrial Fibrillation (AF) database was used as training and validation set. Successively, the MIT-BIH Normal Sinus Rhythm (NSR) database and the MIT-BIH Arrhythmia database were used to test the algorithms.

The underlying idea was to simulate a real scenario: the optimization of the methods is performed on the MIT-BIH AF database, because it contains a sufficiently large number of AF episodes to build up robust classification models. Trained detectors are then evaluated on the NSR database, to investigate their capability to correctly reject a normal ECG tracing with at least some ectopic beats, and on the Arrhythmia database, to verify how performance is affected by the presence of other arrhythmias, in many cases characterized by an irregular ventricular response as well.

All records are provided with human annotations, indicating onset/end for a particular rhythm, e.g.

(AFIB onset of an atrial fibrillation episode

AFIB) end of an atrial fibrillation episode

(N onset of normal sinus rhythm

AFL) end of an atrial flutter episode

A complete list of annotation keys can be found in Appendix A.

Clinician-annotated files obviously constitute the “gold standard”, to which all results in this work refer, and can be easily read by using libraries available in Physionet, even for Matlab¹ [28].

The precise timing of each QRS complex is determined by an automatic detector: the RR interval series was derived using the QRS annotations, already available at this stage, with no need to apply an external peak detector.

An event of interest, such as the onset of an Atrial Fibrillation or a Normal Sinus Rhythm episode, and the corresponding human annotation, can occur at any point within the signal, as displayed in Figure 3.1, where the beat label “·” is not paired with the rhythm label “(N”. For this reason, it was necessary to convert reference rhythm annotations to beat resolution: the functions “rdann” and “rdsamp” of the WFDB toolbox [28] were used for this purpose. A detailed description of these functions can be found at [28].

¹ WFDB Toolbox for Matlab



Figure 3.1.: 10 s ECG for record 08215 of the MIT-BIH Atrial Fibrillation database. The reference rhythm annotation “(N” and the unaudited beat annotations “·” are displayed.

Once each beat of the RR sequence is labelled as 1 if AF, and as 0 if non-AF, results from the AF detectors can then be compared to these target binary data.
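The conversion of rhythm annotations to beat resolution can be sketched in Python as follows, assuming the beat and rhythm annotations have already been read into plain lists (for instance with `rdann`). The function name, the argument layout and the rule that beats preceding the first rhythm annotation are labelled non-AF are assumptions of this sketch; each beat simply inherits the most recent rhythm onset.

```python
from bisect import bisect_right

def beat_labels(beat_samples, rhythm_samples, rhythm_names,
                af_names=("(AFIB",)):
    """Label each beat 1 (AF) or 0 (non-AF).

    beat_samples   : sample index of each beat
    rhythm_samples : sample index of each rhythm onset, sorted ascending
    rhythm_names   : rhythm label at each onset, e.g. "(AFIB" or "(N"
    af_names       : rhythm labels counted as AF (assumed default)
    """
    labels = []
    for s in beat_samples:
        k = bisect_right(rhythm_samples, s) - 1  # last onset at or before s
        is_af = k >= 0 and rhythm_names[k] in af_names
        labels.append(1 if is_af else 0)
    return labels

beats = [100, 300, 500, 700, 900]
onsets = [0, 400, 800]
names = ["(N", "(AFIB", "(N"]
print(beat_labels(beats, onsets, names))
```

Passing `af_names=("(AFIB", "(AFL")` would reproduce a protocol in which Atrial Flutter is counted as AF.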

3.3.0.1. MIT-BIH Atrial Fibrillation database

The MIT-BIH AF database is used in later sections of this chapter as training and validation set: it is a large collection of data, including 25 long-term ECG recordings of human subjects with Atrial Fibrillation (mostly paroxysmal).

Each record consists of two ECG signals, 10 hours in duration, sampled at 250 Hz with 12-bit resolution over a range of ±10 millivolts.

A detailed description for each patient record can be found in Table 3.1.

For two records out of 25 (00735 and 03665), the ECG signal is missing and only the rhythm and the unaudited beat annotation files are available. It was still possible to use these records during the evaluation protocol, since the proposed AF detectors are all based on ventricular response analysis, i.e. they do not require any morphological information.

Record   AF episodes (n)   AF dur. (s)   AFL episodes (n)   AFL dur. (s)   Total dur. (s)
00735    1                 264           -                  -              36000
03665    7                 5959          -                  -              36000
04015    7                 237           -                  -              69223
04043    82                7920          1                  13             36823
04048    7                 361           -                  -              36823
04126    7                 1378          -                  -              36823
04746    5                 19553         -                  -              36823
04908    8                 3082          4                  254            36823
04936    36                26561         1                  3391           36823
05091    8                 87            -                  -              36823
05121    20                23209         -                  -              36823
05261    11                479           -                  -              36823
06426    26                35114         1                  118            36823
06453    6                 371           -                  -              33300
06995    6                 17358         2                  13             36823
07162    1                 36822         -                  -              36823
07859    1                 36823         -                  -              36823
07879    2                 22203         -                  -              36823
07910    5                 5869          1                  486            36823
08215    2                 29674         1                  50             36823
08219    39                7949          -                  -              36823
08378    5                 7698          3                  1553           36823
08405    2                 26590         -                  -              36823
08434    3                 1424          -                  -              36823
08455    2                 25471         -                  -              36823
Total 25 299               342456        14                 5878           947806

Table 3.1.: MIT-BIH Atrial Fibrillation database profile: for each record, the number of Atrial Fibrillation and Atrial Flutter episodes and their duration are reported.

3.3.0.2. MIT-BIH Normal Sinus Rhythm database

The MIT-BIH Normal Sinus Rhythm database is used to test the trained algorithms. It is of particular interest to see if each AF detector is able to correctly classify NSR segments with ectopic beats as non-AF. Although the purpose of this work is to find the most appropriate algorithm to be implemented on a phone for a screening application, accuracy in discriminating AF from NSR is required to be quite high, since these two states have completely different patterns, while a wider tolerance can be allowed for other kinds of arrhythmias with an irregular ventricular response, which are more likely to confound an AF detector.

The NSR database is composed of 18 long-term ECG recordings (384 hours of NSR) of subjects referred to the Arrhythmia Laboratory at Boston's Beth Israel Hospital (BIH). No significant arrhythmias are diagnosed, except the presence of ectopic beats. Subjects include 5 men, aged 26 to 45, and 13 women, aged 20 to 50.

A detailed description of the MIT-BIH NSR database profile is shown in Table 3.2.

Record   NSR beats (n)   Ectopic beats (n)   Total duration (min)
16265    100216          21                  1527
16272    87757           1                   1500
16273    89840           5                   1478
16420    102061          6                   1439
16483    104330          4                   1557
16539    108265          17                  1475
16773    81962           27                  1438
16786    101605          8                   1469
16795    86872           0                   1415
17052    87354           2                   1388
17453    100655          3                   1463
18177    115908          3                   1557
18184    102313          0                   1425
19088    97957           4                   1428
19090    81382           9                   1451
19093    75100           6                   1394
19140    96596           0                   1450
19830    109329          3                   1393
Total 18 1729502         119                 26247

Table 3.2.: MIT-BIH Normal Sinus Rhythm database profile: for each record, the number of normal and ectopic beats and the total duration are reported.

3.3.0.3. MIT-BIH Arrhythmia database

The MIT-BIH Arrhythmia database contains 48 short-term (30 minutes) ECG recordings, chosen from a set of 4000 24-hour ambulatory ECGs from subjects recorded at the BIH Hospital.

Subjects can be divided in two categories: the series 100 (patients labelled 100-124) includes 23 subjects with normal sinus rhythm, paced rhythm, bigeminy, trigeminy and supraventricular tachycardia, but it does not have Atrial Fibrillation.

The series 200 (patients labelled 200-234) includes 8 atrial fibrillation subjects over 25 years old: it also contains atrial and ventricular bigeminy, ventricular trigeminy, atrial flutter, supraventricular tachyarrhythmia, ventricular flutter, ventricular tachycardia and supraventricular tachycardia.

Record  N      SBR    BII    PREX   AB     SVTA   AFL    AFIB   P      NOD    B      T      IVR    VT     VFL
100     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
101     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
102     1:22   -      -      -      -      -      -      -      28:44  -      -      -      -      -      -
103     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
104     3:52   -      -      -      -      -      -      -      26:13  -      -      -      -      -      -
105     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
106     22:36  -      -      -      -      -      -      -      -      -      7:15   0:13   -      0:02   -
107     -      -      -      -      -      -      -      -      30:06  -      -      -      -      -      -
108     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
109     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
111     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
112     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
113     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
114     30:01  -      -      -      -      0:05   -      -      -      -      -      -      -      -      -
115     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
116     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
117     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
118     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
119     22:36  -      -      -      -      -      -      -      -      -      3:55   3:34   -      -      -
121     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
122     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
123     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
124     28:36  -      -      -      -      -      -      -      -      0:30   -      0:22   0:37   -      -
200     15:58  -      -      -      -      -      -      -      -      -      13:52  -      -      0:15   -
201     12:57  -      -      -      -      0:02   -      10:06  -      0:24   -      6:37   -      -      -
202     19:31  -      -      -      -      -      0:48   9:46   -      -      -      -      -      -      -
203     2:43   -      -      -      -      -      5:14   21:32  -      -      -      0:04   -      0:33   -
205     29:43  -      -      -      -      -      -      -      -      -      -      -      -      0:23   -
207     22:20  -      -      -      -      0:52   -      -      -      -      2:38   -      1:49   0:03   2:24
208     24:43  -      -      -      -      -      -      -      -      -      -      5:22   -      -      -
209     28:23  -      -      -      -      1:42   -      -      -      -      -      -      -      -      -
210     -      -      -      -      -      -      -      29:30  -      -      0:23   0:07   -      0:06   -
212     30:06  -      -      -      -      -      -      -      -      -      -      -      -      -      -
213     29:01  -      -      -      -      -      -      -      -      -      1:00   -      -      0:04   -
214     28:53  -      -      -      -      -      -      -      -      -      -      1:08   -      0:05   -
215     30:03  -      -      -      -      -      -      -      -      -      -      -      -      0:02   -
217     -      -      -      -      -      -      -      4:12   25:10  -      0:42   -      -      0:02   -
219     6:01   -      -      -      -      -      -      23:47  -      -      0:08   0:10   -      -      -
220     29:50  -      -      -      -      0:16   -      -      -      -      -      -      -      -      -
221     -      -      -      -      -      -      -      29:17  -      -      0:03   0:42   -      0:04   -
222     15:57  -      -      -      1:28   0:08   7:03   1:44   -      3:45   -      -      -      -      -
223     23:23  -      -      -      -      -      -      -      -      -      4:19   0:38   -      1:46   -
228     24:17  -      -      -      -      -      -      -      -      -      5:48   -      -      -      -
230     17:45  -      -      12:21  -      -      -      -      -      -      -      -      -      -      -
231     18:26  -      11:40  -      -      -      -      -      -      -      -      -      -      -      -
232     -      30:06  -      -      -      -      -      -      -      -      -      -      -      -      -
233     28:03  -      -      -      -      -      -      -      -      -      1:48   0:04   -      0:11   -
234     29:40  -      -      -      -      0:26   -      -      -      -      -      -      -      -      -

Table 3.3.: MIT-BIH Arrhythmia database profile: rhythm type for each half-hour record is reported in its entirety.



Each record, digitized at 360 Hz with 11-bit resolution over a 10 mV range, was annotated independently by two or more cardiologists.

A description of the MIT-BIH Arrhythmia database profile is shown in Table 3.3. However, for more details, one should refer to the MIT-BIH Arrhythmia database directory [26].

3.4. Practical considerations

3.4.1. Window size

As described in the second chapter, each AF detector considers consecutive, non-overlapping segments of a certain window size, which are analyzed to extract a particular AF feature, compared with a threshold.

The window length used is an important parameter to take into consideration when comparing different AF methods.

A small window length allows for faster calculations and is more suitable to address the arduous challenge of paroxysmal AF events, which usually have unpredictable onset and short duration. On the other hand, a larger window length provides a more robust estimation of the RR segment content, since more data are available to resolve the information present. Computational cost and memory requirements consequently rise, as well as the difficulty in revealing brief episodes.

Actually, AF events whose duration is less than 30 seconds are usually not considered to have clinical relevance when determining whether the patient has been free of arrhythmias or not. However, it is still of interest to see how few beats are required for correct AF detection, as clinicians may become more interested in episodes shorter than 30 s in the future.

To meet these requirements, during the evaluation protocol, the window size was varied from 10 up to 300 beats, in order to find out which size provides the best performance for each detector and to assess how performance drops when decreasing the number of beats required to detect AF.

The MAD feature can be computed only for window lengths whose size is a multiple of 3 (see Section 2.3.2): for this algorithm, the space of computable results was necessarily shrunk to the lower resolution range [12 : 3 : 300] beats per segment.



AF detector   AF feature   Threshold range
CosEn         CosEn        [-3 : 0.1 : 0]
MAD           MAD          [0.03 : 0.002 : 0.18]
Lorenz        AFEvidence   [5 : 1 : 75]

Table 3.4.: Threshold range of variation for each AF detector.

Figure 3.2.: MinAF% represents the minimum percentage of AF beats a reference segment must contain to classify the entire segment as True Positive. For percentages of AF beats lower than minAF%, a segment is classified as a True Negative.

3.4.2. Threshold

For each method, a wide range of thresholds was also explored together with the window size variation. Table 3.4 lists the threshold ranges for the AF features of the three algorithms CosEn, MAD and Lorenz.

For Lorenz, the threshold is just a count of points with a specific positioning within the two-dimensional histogram summarizing the ∆RR distribution, and is likely to increase as the number of points populating the Lorenz plot rises.

In MAD, the corresponding AF feature is a slightly more complex version of the heart rate standard deviation applied to adjacent temporal segments: it is expected to rise for segments of larger dimension.

For CosEn, the threshold is supposed to be quite stable around the value proposed by the authors, as shown in the paper [22], since the AF feature is the logarithm of the ratio between non-matching and matching intervals within the same segment, normalized by the mean heart rate.



Table 3.4 reports the threshold ranges used during the training phase.
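The Matlab-style ranges of Table 3.4 and the MAD window range [12 : 3 : 300] can be enumerated in Python as in the sketch below. The helper `inclusive_range` and the dictionary layout are assumptions for illustration; values are rounded to avoid accumulating floating-point error across steps.

```python
def inclusive_range(start, step, stop, digits=3):
    """Values start, start + step, ..., stop (Matlab start:step:stop)."""
    count = int(round((stop - start) / step)) + 1
    return [round(start + i * step, digits) for i in range(count)]

# Threshold grids from Table 3.4 (one list per detector).
THRESHOLDS = {
    "CosEn": inclusive_range(-3.0, 0.1, 0.0),
    "MAD": inclusive_range(0.03, 0.002, 0.18),
    "Lorenz": inclusive_range(5, 1, 75, digits=0),
}

# MAD needs window sizes that are multiples of 3.
MAD_WINDOWS = list(range(12, 301, 3))

for name, grid in THRESHOLDS.items():
    print(name, len(grid), grid[0], grid[-1])
print("MAD windows:", len(MAD_WINDOWS))
```

Each (window size, threshold) pair then defines one candidate operating point to be scored during training.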

3.4.3. Reference data: minimum percentage of AF and minimum number of AF beats

Each AF detector requires a certain number of beats to resolve the information presentwithin: diagnosis returned for every n beat-segment must be compared to the “truth”,i.e. to a segment of the same length which has been previously assigned a “1” or “0” label,depending on the annotations available. But, how to assign the labels?

The Figure 3.2 may help to clarify this problem: if one imagines to have a 10 beat-segment containing two AF beats and eight NSR beats, that segment should be consid-ered as a true positive or a true negative?

The issue of converting a beat-by-beat diagnosis in a window-by-window diagnosisto be used as target is anything but trivial. It also represents one breaking point incomparability of different algorithms and related results published in literature: in thepaper by Sarkar et al., for example, a segment is labelled as a true positive if it containsat least 60% of AF burden, while in CosEn, even a single AF beat is sufficient to producea positive diagnosis, which seems to be quite a strict assumption. For MAD, in thecomparative study [23], no mention is made on the criterion used.

This choice has relevant effects, in particular concerning the minimum duration of AFa detector is able to reveal. If, for example, we are interested in revealing episodes of,at least, 30s, the criterion used by Lorenz, mentioned above, using a window length of 2minutes in duration, could not be the right choice to accomplish our goal. Also becausea 30s episodes could be split in two adjacent analyzed segments.

The use of minimum percentages of AF burden is quite common in the literature and such results can be easily computed. For this reason, this kind of analysis was adopted in this work, at least to allow a comparison with the literature. Percentages equal to 10%, 50% and 60% were considered.

The problem with this approach is that it is not very meaningful and the results are not easy to interpret, since the minimum amount of AF which is actually detectable becomes window-length dependent. Moreover, what really happens in a clinical scenario is a mixture of these percentages.

Instead, if we impose a fixed minimum number of beats (minNbeat) to claim a true positive event, results are more intuitive in terms of the minimum amount of AF we may want to search for; in the second place, larger window sizes will not take advantage of ever increasing percentages, since each window length is called to detect the same quantity of



minAF% (%)      minNbeat (beats)
[10; 50; 60]    [10; 30; 50]

Table 3.5.: Minimum percentage of AF and minimum number of AF beats used to build the reference.

AF burden. Minimum numbers of 10, 30 and 50 beats were used to train and test each detector: window lengths smaller than the minimum number of positive beats required in a beat-segment were not included in the corresponding analysis.
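The two labelling criteria can be illustrated with a minimal Python sketch. It is not the code used in this work: the function name `label_windows`, the use of non-overlapping windows, the discarding of a trailing incomplete window and the inclusive ("at least") comparison are all assumptions made for illustration.

```python
def label_windows(beat_is_af, window_len, min_af_pct=None, min_n_beat=None):
    """Return one 0/1 reference label per non-overlapping window.

    beat_is_af : list of 0/1 beat annotations (1 = AF beat).
    Exactly one of min_af_pct (percentage) or min_n_beat (beat count)
    must be given. Hypothetical helper, for illustration only.
    """
    if (min_af_pct is None) == (min_n_beat is None):
        raise ValueError("specify exactly one of min_af_pct or min_n_beat")
    if min_n_beat is not None and window_len < min_n_beat:
        # Windows shorter than minNbeat can never be positive: excluded.
        raise ValueError("window shorter than min_n_beat")
    labels = []
    # A trailing incomplete window is discarded (assumption).
    for start in range(0, len(beat_is_af) - window_len + 1, window_len):
        n_af = sum(beat_is_af[start:start + window_len])
        if min_af_pct is not None:
            positive = 100 * n_af >= min_af_pct * window_len
        else:
            positive = n_af >= min_n_beat
        labels.append(1 if positive else 0)
    return labels


# The 10 beat-segment discussed above: two AF beats, eight NSR beats.
segment = [1, 1] + [0] * 8
print(label_windows(segment, 10, min_af_pct=10))  # positive under 10%
print(label_windows(segment, 10, min_af_pct=50))  # negative under 50%
print(label_windows(segment, 10, min_n_beat=10))  # negative under 10 beats
```

With a percentage criterion the same two AF beats flip between positive and negative depending on the chosen percentage and window length, whereas a beat-count criterion asks every window length for the same amount of AF.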

3.5. K-Fold Cross Validation

Cross Validation (CV) is a widely used technique for model assessment and selection. The interesting aspect of CV is the possibility of measuring generalization error through the use of held-out data, possibly avoiding the over-fitting problem, while efficiently maximizing the amount of data used for model development.

There are different CV techniques: the general procedure comprises partitioning the data into training and testing sets: training is the process of fitting the model of interest by optimizing some parameters, while testing consists in evaluating the fitted model by measuring the prediction error.

The K-Fold Cross Validation was chosen to train each AF detector on

• window size

• threshold

using different assumptions on target data:

• minAF%

• minNbeat

This choice was motivated by the intrinsic benefits of the general CV procedure in terms of increased consistency and robustness in selecting the optimal parameters, and by the computational efficiency of the K-Fold technique compared to the Leave-One-Out procedure [18].



1. Train

The MIT-BIH AF database, used as training and validation set, was divided into K = 5 partitions of roughly equal size, each termed a fold of the dataset. A 5-fold cross-validation was performed to assess the optimal window size across the range [10; 300] beat-segments. The best window size was selected as the one having the highest area under the receiver operating characteristic curve (AUROC) averaged across the 5 folds, ensuring that the corresponding standard deviation was within reasonable values (±6%). AUROC was preferred at this stage since it allows optimization of the window size across all thresholds (using AUROC avoids prematurely applying a threshold to the features, which would discard much of the information on the classification). Once the window is picked, each algorithm is retrained at that optimal segment length, this time using the entire training set to get the best threshold, as measured by the highest Accuracy.

2. Test

The optimal final model is then applied to the test set, which was not (and should not be) used in model development. The test set consists of the MIT-BIH NSR database and the MIT-BIH Arrhythmia database.
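The training step can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, AUROC is computed with the rank-sum identity, the sample standard deviation is used for the ±6% check, and a feature value above the threshold is taken to mean AF.

```python
from statistics import mean, stdev


def auroc(scores, labels):
    """AUROC via the Mann-Whitney identity; ties count as 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are needed")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def select_window(per_window_folds, max_std=0.06):
    """per_window_folds: {window_len: [(scores, labels) for each fold]}.

    Returns (window_len, mean AUROC, std) for the window with the highest
    mean AUROC whose across-fold std does not exceed max_std, or None.
    """
    best = None
    for window_len, folds in sorted(per_window_folds.items()):
        aucs = [auroc(s, y) for s, y in folds]
        m, sd = mean(aucs), stdev(aucs)
        if sd <= max_std and (best is None or m > best[1]):
            best = (window_len, m, sd)
    return best


def select_threshold(scores, labels, thresholds):
    """Pick the threshold with the highest Accuracy on the whole training set."""
    def accuracy(th):
        hits = sum((s > th) == (y == 1) for s, y in zip(scores, labels))
        return hits / len(labels)
    return max(thresholds, key=accuracy)
```

Separating the two functions mirrors the two stages described above: the window size is chosen without committing to any threshold, and the threshold is fixed only afterwards at the selected window size.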

3.5.1. Folds generation

This section describes how each of the 25 records in the MIT-BIH AF database was assigned to the 5 folds to perform Cross Validation.

This dataset is not strongly unbalanced but, as summarized in Table 3.1, some records contain almost only sinus rhythm, while other records contain few AF episodes of very long duration (e.g., record 07859: 1 episode lasting the whole record) or a large number of episodes of very short duration (e.g., record 04043: 82 episodes, 7960 s total duration). For this reason, in order to avoid completely unbalanced folds, all records were ranked according to the index

\[ \mathrm{AFburden} = \frac{\text{AF episodes}}{\text{Total Duration} - \text{AF Duration}} \tag{3.1} \]

where Total Duration is the total duration of a record in s, AF Duration is the total duration of the AF episodes (s) within a record and AF episodes is the total number of AF episodes within a record. This is to account for records which may have a large number of AF episodes but consist almost entirely of NSR, and vice versa.



As an example, a record will be ranked high if it contains several AF episodes, even of relatively short duration, but also if it contains only a few episodes whose total duration represents a considerable part of the entire record duration.

Ranked records are then grouped into 5 clusters corresponding to decreasing values of AF burden. Records from each cluster are then randomly assigned to the 5 folds, thus ensuring each fold contains one randomly selected record from every cluster.

This procedure was believed to be important to avoid strongly unbalanced folds while maintaining randomness of AF “type” during the assignment of records to the folds, to ensure generalizability of the models developed using these folds.
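The fold generation can be sketched as below. The names `af_burden_index` and `make_folds`, the fixed random seed and the handling of an all-AF record (index set to infinity, since the denominator of eq. 3.1 is zero) are assumptions of this sketch and are not taken from the original implementation.

```python
import random


def af_burden_index(n_episodes, total_duration_s, af_duration_s):
    """Ranking index of eq. 3.1; an all-AF record is ranked first (assumption)."""
    non_af = total_duration_s - af_duration_s
    return float("inf") if non_af <= 0 else n_episodes / non_af


def make_folds(records, k=5, seed=0):
    """records: {name: (n_episodes, total_duration_s, af_duration_s)}.

    Records are ranked by decreasing index, cut into consecutive clusters
    of k records, and each cluster is shuffled and dealt one record per fold.
    """
    rng = random.Random(seed)
    ranked = sorted(records, key=lambda r: af_burden_index(*records[r]),
                    reverse=True)
    folds = [[] for _ in range(k)]
    for start in range(0, len(ranked), k):
        cluster = ranked[start:start + k]
        rng.shuffle(cluster)
        for fold, record in zip(folds, cluster):
            fold.append(record)
    return folds


# Synthetic example with 25 made-up records (not the MIT-BIH AF values).
toy = {f"r{i:02d}": (i + 1, 1000.0, 10.0 * i) for i in range(25)}
for fold in make_folds(toy):
    print(sorted(fold))
```

With 25 records and k = 5, the ranking yields 5 clusters of 5 records, so every fold receives exactly one record from each cluster.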

3.6. Performance metrics

The AF detectors are compared on the basis of the following performance metrics:

• Sensitivity: Se = TP/(TP + FN)

• Specificity: Sp = TN/(TN + FP)

• Accuracy: Acc = (TP + TN)/N

• Positive Predictive Value: PPV = TP/(TP + FP)

• Negative Predictive Value: NPV = TN/(TN + FN)

• Total error: Err = (FP + FN)/N

where TP is the number of True Positives, TN is the number of True Negatives, FP is the number of False Positives, FN is the number of False Negatives and N is the total number of observations.
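As a compact restatement, the six metrics can be computed from the four counts as in the sketch below. The function name and the choice of returning NaN when a denominator is zero (for instance, Sensitivity on a record without AF) are assumptions; the counts in the example are invented.

```python
def performance_metrics(tp, tn, fp, fn):
    """Return Se, Sp, Acc, PPV, NPV and Err from window-level counts."""
    def ratio(num, den):
        # Undefined ratios (zero denominator) are reported as NaN (assumption).
        return num / den if den else float("nan")

    n = tp + tn + fp + fn
    return {
        "Se": ratio(tp, tp + fn),
        "Sp": ratio(tn, tn + fp),
        "Acc": ratio(tp + tn, n),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
        "Err": ratio(fp + fn, n),
    }


# Illustrative counts only.
m = performance_metrics(tp=40, tn=50, fp=5, fn=5)
print({name: round(value, 4) for name, value in m.items()})
```

Accuracy and Total error always sum to 1, which gives a quick consistency check on tabulated results.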

3.7. Results

3.7.1. Performance on training set

Figure 3.3 shows the area under the receiver operating characteristic curve (AUROC) averaged across the 5 folds of the MIT-BIH AF database, obtained by varying minAF%: as we can see, the standard deviation for MAD is significantly higher (3%-4%)



Figure 3.3.: AUROC values across the 5 folds of the MIT-BIH AF database for CosEn (Lake and Moorman), MAD (Linker DT.) and Lorenz (Sarkar et al.): mean AUROC values (solid line) and corresponding standard deviations (shaded regions) across the 5-fold CV.



Figure 3.4.: Number of changes in True Negatives varying minAF% from 10% to 60% as a function of the window length. The colorbar encodes the number of TP segments which turned into TN segments changing minAF% from 10% to 60%.

compared to the other methods. This greater variability within the dataset partitions may be explained by the fact that MAD is quite sensitive, but not that specific: therefore, the attempt to both maximize Sensitivity and minimize the False Positive Rate causes a number of False Positives, linked to the actual amount of NSR within each fold.

It is also evident how, as minAF% increases, AUROC values rise and optimal points move towards larger window sizes, since a larger amount of AF allows for a more robust classification. Figure 3.4 shows the absolute number of changes in True Negatives for each of the 25 records in the MIT-BIH AF database for each segment length, when minAF% varies from 10% to 60%: record 04043 undergoes the greatest increase in True Negatives, since it is composed of several AF episodes of short duration. For the remaining subjects, there are no more than 50 changes (≲ 7%) in target segments from True Positive to True Negative.

The following Sections 3.7.1.1, 3.7.1.2 and 3.7.1.3 show the results obtained by each AF detector individually on the MIT-BIH AF database. The training performances of each method are subsequently discussed and compared in Section 3.7.1.4.



Figure 3.5.: Accuracy on the MIT-BIH AF database for CosEn. Accuracy is plotted as a function of window length and threshold when minNbeat = 10 beats.

3.7.1.1. CosEn (Lake and Moorman)

Figure 3.5 shows the Accuracy of CosEn on the whole MIT-BIH AF database when minNbeat = 10 beats: performance is generally above 90% for each window size when the threshold lies in the optimal sub-range [−2, −1]. Within this sub-range, Accuracy reaches a peak around small-to-medium windows and then slightly decreases.

As previously explained, the best threshold was determined by evaluating the metric on the entire training dataset and selecting the value which optimized the Accuracy. Results are summarized in Table 3.6: even when minAF%=10%, Se=94.92% and Sp=97.46%.

As expected, performance rises as minAF% or minNbeat increases, as well as when the window size increases: transitions are much more gradual when using minNbeat instead of percentages. As an example, a minAF% of 60% requires a minimum number of true positive beats equal to 120 in a 200 beat-segment and equal to 12 in a 20 beat-segment. When using minNbeat, instead, all windows are tested under much more similar conditions.



Table 3.6.: Performance metrics for CosEn on the MIT-BIH AF database.

3.7.1.2. MAD (Linker DT.)

Results from MAD show a very high Sensitivity, 98.95%, when at least half of the beats are required to be in AF.

With minimum percentages of 50% and 60%, Specificity is below 90%, as displayed in Table 3.7, and it undergoes a slight worsening as the strict assumptions on reference data are relaxed (i.e., minAF% increases); in fact, a rise in minAF% produces a relevant additional number of True Negatives in the reference, thus leading to more False Positives.

For record 04043, which has the largest number of changes in reference diagnosis when minAF% goes from 10% to 60% (see Figure 3.4), the number of False Positives goes from 0 (10%) to 26 (60%) over 251 beat-segments, assuming Th = 0.1080.

From the results of the training phase for MAD, the Positive Predictive Value, which accounts for the number of FP, is plotted as a function of threshold and window length in Figure 3.6 for minNbeat=10. The best performance is obtained for the very large window sizes and thresholds.



Table 3.7.: Performance metrics for MAD on the MIT-BIH AF database.

Figure 3.6.: Positive Predictive Value on the MIT-BIH AF database for MAD. PPV is plotted as a function of window length and threshold when minNbeat = 10 beats.



Figure 3.7.: Accuracy on the MIT-BIH AF database for Lorenz. Accuracy is plotted as a function of window length and threshold when minNbeat = 10 beats.

In Table 3.7, AUROC standard deviation is always above 4%.

3.7.1.3. Lorenz (Sarkar et al.)

The method based on the histogram characteristics of AF shows outstanding performance on the MIT-BIH AF database. When minAF%=60%, which is the assumption reported in the paper [36], Se=98.27% and PPV=97.81%, but even with minAF%=10% the total error is 3.32%, with a much smaller optimal window size.

Table 3.8 presents AF detection metrics for the MIT-BIH AF database. When considering a minimum number of beats in place of percentages, all the metrics are quite high and the window length is always below 100 beat-segments.

The optimal AUROC values are above 0.99 and the standard deviation is well below 1% in every case, thus demonstrating great robustness and stability with respect to the variability of records across the 5 folds.

Figure 3.7 depicts the Accuracy of Lorenz by varying window size and threshold; it offers the opportunity for some considerations: the Accuracy trend is not gradual at all, but shows a sudden discontinuity around 30 beat-segments (also depending on the threshold). Lorenz is not suitable for very small window sizes ([10; 20] beat-segments), since its



Table 3.8.: Performance metrics for Sarkar et al. on the MIT-BIH AF database.

discrimination capability requires that a sufficient number of points are positioned in the two-dimensional histogram (see Section 2.3.4). The estimation using such a low number of RR intervals is not that robust, as is evident from the paper [36], which reports on a 2-minute window.

3.7.1.4. Discussion and comparison

From the results of the training phase, MAD showed the highest Sensitivity for different percentages of AF within a segment (Se=98.66%, assuming minAF%=60%). MAD also has a larger variability across the folds (3-4%) compared to Lorenz and CosEn (<1%).

Lorenz always shows the best Accuracy and the minimum number of False Positives and False Negatives.

When varying minNbeat, CosEn is the most sensitive with a minimum number of beats equal to 10 (Se=95.57%), but Lorenz achieves the best Sensitivity as minNbeat increases. For these same assumptions on target data, CosEn always requires the smallest window size.



Figure 3.8.: Sensitivity and Total Error on the MIT-BIH Atrial Fibrillation database by varying minNbeat.

minNbeat    CosEn WL    MAD WL    Lorenz WL
10          41          282       82
30          60          282       76
50          91          282       99

Table 3.9.: Optimal window size by varying minNbeat.

3.7.2. Performance on test sets

Each AF detector was tested on the MIT-BIH NSR database and the MIT-BIH Arrhythmia database.

Results are available for different minAF% and minNbeat values. Window size and threshold are the optimal values established during the training phase separately for every method.

3.7.2.1. MIT-BIH NSR database

Specificity and Total Error results on the NSR database are presented in Table 3.10 by varying minNbeat, and in Table 3.11 by varying minAF%.

Other performance metrics were omitted, since the MIT-BIH NSR database does not contain AF or other significant arrhythmias, except for some ectopic beats.

All detectors show a good capability in rejecting normal ECGs. Lorenz Specificity is 99.39%, assuming WL=125 beat-segment when minNbeat=50 beats. It is followed by MAD (Sp=98.37%), which uses a very large window size (WL=282), and CosEn, which causes a larger number of False Positives, but uses the smallest window length, thus



Table 3.10.: Performance metrics on the MIT-BIH NSR database by varying minNbeat.

providing a faster detection.

Table 3.11.: Performance metrics on the MIT-BIH NSR database by varying minAF%.

For the MIT-BIH NSR database, an increase in performance is less dependent on the percentage of beats required for a window to be classified as AF, since all beats are labelled True Negatives. Performance depends more on the window size: e.g., for Lorenz,



Table 3.12.: Performance on the MIT-BIH Arrhythmia database, assuming minAF%=10%. Performance metrics on the Arrhythmia database are reported for the series 100 (top table) and the series 200 (bottom table).

the optimal WL of 76 beat-segment, when minNbeat is 30 beats, gives slightly worse results compared to the optimal WL of 82 beat-segment when minNbeat=10.

3.7.2.2. MIT-BIH Arrhythmia database

Results for the MIT-BIH Arrhythmia database are reported for minAF%=10% in Table 3.12 and for minNbeat=10 beats in Table 3.13.

The MIT-BIH Arrhythmia database is composed of two series of records: series 100 does not contain Atrial Fibrillation, but does contain NSR and other common arrhythmias; series 200 contains Atrial Fibrillation and both supraventricular and ventricular arrhythmias, some of which are characterized by an irregular ventricular response, such as Atrial Bigeminy, Ventricular Bigeminy and Ventricular Trigeminy.

All methods show a good capability in revealing the presence of True Positive episodes, which is an important requirement from the perspective of an AF screening application: for the series 200 of the Arrhythmia database, CosEn has the highest Sensitivity (98.97%), followed by Lorenz (96.05%) and MAD (92.00%), with the strict minimum AF burden of 10 beats per segment. Lorenz is the most accurate (84.68%) and the most specific (81.76%).



Table 3.13.: Performance on the MIT-BIH Arrhythmia database, assuming minNbeat=10. Performance metrics on the Arrhythmia database are reported for the series 100 (top table) and the series 200 (bottom table).

Table 3.13 makes evident the high number of False Positives in this dataset, as revealed by the PPV metric. The FP problem concerns all methods, with better performance for Lorenz (PPV=57.48%).

Table 3.14 shows in more detail the number of TP, TN, FP and FN on the MIT-BIH Arrhythmia database for CosEn, MAD and Lorenz: there are common difficulties in rejecting True Negative episodes for record 106, which contains a very high number of PVCs (∼26% of all beats) as well as Ventricular bigeminy, Ventricular trigeminy and Ventricular tachycardia, and for records 200, 201, 207, 208, 214, 222 and 228. These records are all characterized by the presence of Ventricular bigeminy and/or Ventricular trigeminy and/or Atrial bigeminy and/or Supraventricular tachycardia.

CosEn also shows a significant number of FP for records 115 and 123, containing sinus arrhythmias, and record 113, with NSR throughout, but characterized by a high heart rate variability, probably caused by a wandering atrial pacemaker. MAD also has difficulties in rejecting non-AF segments for record 233, which presents a considerable number of PVCs, APCs, Ventricular bigeminy, Ventricular trigeminy and Ventricular tachycardia, and for record 232. MAD also misdiagnoses AF for record 119, which has the same characteristics as 106 and 232.



minNbeat=10     CosEn               MAD                 Lorenz
record    TP  TN  FP  FN      TP  TN  FP  FN      TP  TN  FP  FN
100        0  55   0   0       0   8   0   0       0  27   0   0
101        0  45   0   0       0   6   0   0       0  22   0   0
102        0  53   0   0       0   7   0   0       0  26   0   0
103        0  50   0   0       0   7   0   0       0  25   0   0
104        0  53   1   0       0   7   0   0       0  27   0   0
105        0  62   0   0       0   9   0   0       0  31   0   0
106        0  29  20   0       0   2   5   0       0  19   4   0
107        0  52   0   0       0   7   0   0       0  26   0   0
108        0  35   7   0       0   6   0   0       0  21   0   0
109        0  61   0   0       0   8   0   0       0  30   0   0
111        0  51   0   0       0   7   0   0       0  25   0   0
112        0  61   0   0       0   9   0   0       0  30   0   0
113        0  11  32   0       0   6   0   0       0  21   0   0
114        0  45   0   0       0   6   0   0       0  22   0   0
115        0  32  15   0       0   6   0   0       0  22   1   0
116        0  58   0   0       0   8   0   0       0  29   0   0
117        0  37   0   0       0   5   0   0       0  18   0   0
118        0  55   0   0       0   8   0   0       0  27   0   0
119        0  43   5   0       0   3   4   0       0  22   2   0
121        0  45   0   0       0   6   0   0       0  22   0   0
122        0  60   0   0       0   8   0   0       0  30   0   0
123        0  25  12   0       0   5   0   0       0  18   0   0
124        0  39   0   0       0   5   0   0       0  19   0   0
200        0   6  57   0       0   1   8   0       0   2  29   0
201       23   9  15   0       4   0   2   0      12   6   5   0
202       21  27   1   3       3   3   0   1      12  14   0   0
203       59   0  13   0       9   0   1   0      30   0   6   0
205        0  63   1   0       0   9   0   0       0  32   0   0
207        0  45  11   0       0   5   3   0       0  20   8   0
208        0  19  53   0       0   1   9   0       0  19  17   0
209        0  72   1   0       0   8   2   0       0  36   0   0
210       64   0   0   0       9   0   0   0      31   0   0   1
212        0  67   0   0       0   9   0   0       0  33   0   0
213        0  79   0   0       0  11   0   0       0  39   0   0
214        0  19  36   0       0   5   3   0       0  26   1   0
215        0  81   1   0       0  11   0   0       0  41   0   0
217       11  36   6   0       1   3   0   3       5  18   1   2
219       46   5   1   0       7   0   0   0      25   0   1   0
220        0  49   0   0       0   7   0   0       0  24   0   0
221       59   0   0   0       8   0   0   0      25   0   0   3
222        5   8  47   0       5   2   1   0       6   4  19   0
223        0  59   4   0       0   9   0   0       0  29   1   0
228        0  11  39   0       0   2   5   0       0   4  20   0
230        0  55   0   0       0   7   0   0       0  27   0   0
231        0  37   1   0       0   5   0   0       0  19   0   0
232        0  43   0   0       0   0   6   0       0  21   0   0
233        0  58  17   0       0   0  10   0       0  37   0   0
234        0  67   0   0       0   9   0   0       0  33   0   0

Table 3.14.: TP, TN, FP and FN in the MIT-BIH Arrhythmia database. Results refer to minNbeat=10 beats.



3.8. Conclusions

The AF detectors have been retrained and compared by using different values of the minimum percentage (minAF%) and the minimum number of AF beats (minNbeat) required to diagnose an n beat-segment as AF.

When considering a minimum of 10%, MAD showed the best Sensitivity (95.65%) in the training phase compared to CosEn (94.92%) and Lorenz (94.55%), but required the largest window length, 282 beat-segment, versus only 44 beat-segment for CosEn.

An increase in this percentage, as previously discussed, generally causes a rise in performance, since predictions become more robust. For 50% and 60%, Lorenz is always the most accurate in discriminating Atrial Fibrillation from Normal Sinus Rhythm and Atrial Flutter and shows the highest Specificity on the training set. Results are well confirmed on the test sets: on the NSR database, Lorenz Specificity is above 99%, assuming minAF%=10% and WL=125 beat-segment. CosEn, with a much smaller window size, has a worse Specificity (93.34%) on the NSR database.

When looking at the results of the training phase for different values of minNbeat, which are more easily interpretable since all window sizes are called to reveal the same minimum amount of AF burden, MAD becomes the least sensitive (96.42%, assuming minNbeat=50 beats) and Lorenz is always the most accurate and the most specific, but CosEn always requires the smallest window length, thus allowing a faster detection and addressing the problem of paroxysmal AF episodes, which can be of very short duration and unexpected onset.

On the test sets, for different minNbeat values, Lorenz confirms a great capability in rejecting True Negative episodes both on the MIT-BIH NSR database and on the series 100 of the MIT-BIH Arrhythmia database.

For the series 200 of the MIT-BIH Arrhythmia database, CosEn reaches Se=98.97% in revealing AF segments, followed by Lorenz (96.05%) and MAD (92.00%). Lorenz has the highest Specificity (81.76%) and Accuracy (84.68%).

This series of the Arrhythmia database is quite critical, since it contains regularly irregular rhythms, such as Atrial and Ventricular bigeminy and Ventricular trigeminy, which are very likely to confound the AF detectors. This is shown by the low PPV, caused by the number of False Positives, which are concentrated in the records, and only in the records, where these kinds of rhythms occur, as shown in Table 3.3 and Table 3.14.

It is worth remembering that for a screening application the number of False Negatives represents a much more severe problem than the number of False Positives, since the purpose of this study is to find the most appropriate method to address



the problem of undiagnosed AF cases, especially paroxysmal ones.

Moreover, the number of false positive detections is almost entirely limited to records containing other arrhythmias and not normal sinus rhythm.

Ventricular Trigeminy, Ventricular Bigeminy and Atrial Bigeminy are regularly irregular rhythms, with lower incidence and prevalence in the population compared to Atrial Fibrillation. Short episodes can be harmless, but it is still important to assess them and bring them to the clinician's attention, since frequent ventricular ectopic beats can signify a susceptibility towards more threatening arrhythmias and an increased risk of complications.

In general, better performance for stricter assumptions on target data (such as minNbeat=10) and/or smaller window sizes is related to a greater capability in revealing short episodes, but the AF detector is expected to be more easily confounded by abnormalities, such as a significant number of PVCs. In chapter 5, this hypothesis is investigated and different percentages of PVCs are introduced in records with normal sinus rhythm to quantify the resulting drop in performance.


4. Classification of AF: a machine learning approach

4.1. Introduction

In this chapter, the problem of AF detection is approached by using complex mathematical methods for classification which combine the independent predictive power of several AF features, in order to improve on the performance obtained on the standard Physionet databases by each AF detector individually, as illustrated in chapter 3.

First, the structure of a general supervised learning problem is presented and the features involved in the classification are described.

After an explanation of the choices made concerning the assumptions on target data and the selection of the window sizes, a practical description of each classification method is presented: the hypotheses underlying its usage, the internal parameters of the classifier, and the techniques used to tune these parameters and train the classifier on the MIT-BIH AF database.

Results on the training set are reported together with the feature importance analysis.

Finally, results on the test sets are presented for each classifier and compared with the results of each single AF detector.

4.2. Supervised learning setting

In a classification problem, a dataset D is composed of N instances, or observations. Each instance is described by M explanatory variables (X), called features or attributes, and the corresponding target class (outcome), as shown in Figure 4.1.

The goal of a classification method consists in identifying potential relationships among the features that describe an instance as belonging to a class of a categorical attribute (Y).

By using observations whose target class is known a priori (training set), classifiers such as Support Vector Machines, Random Forest and Logistic Regression are trained to generate classification rules: these rules can subsequently be used for class prediction of future observations, whenever the outcome is unknown.


Figure 4.1.: A dataset D for classification problems counts N observations. Each observation is composed of M predictive variables (features) and, when known, the outcome Y. The classification problem is binary when instances belong to two classes only (Y ∈ {0, 1}).

From a mathematical point of view, we can indicate with F a class of functions $f(x) : \mathbb{R}^{M} \mapsto H$, called hypotheses, that describe potential relationships between X and Y. A supervised learning method for classification has to identify $f^{*} \in F$ that optimally captures the predictivity of the attributes for the target class.

In this study, the generic observation corresponds to one ECG segment of a certain window length. For this segment, the following features are available to predict the presence or absence of Atrial Fibrillation:

CosEn Coefficient of sample entropy (see eq. 2.4).

SampEn Sample Entropy (see eq. 2.5).

meanRRinterval Mean RR interval within a n beat-segment.

minRRinterval Minimum RR interval within the n beat-segment.

medianFrequency Median frequency within the n beat-segment.

AFEvidence Atrial Fibrillation evidence (see eq. 2.1).

PACEvidence PAC evidence (see eq. 2.3).

IrregularityEvidence Irregularity evidence (see eq. 2.2).

OriginCount Number of points contained in the origin bin of the Lorenz plot (see Section 2.3.4).



As one can notice, the feature MAD [24] was not included within the set of explanatory variables used to describe instances as belonging to a certain class. There are three main reasons for this choice:

• MAD can be computed only for window lengths whose size is a multiple of 3: this aspect is particularly limiting when we want to compare results obtained by using a single AF detector with results achieved with classification models using the same window length: the space of computable results would necessarily be shrunk to the lower-resolution range [12 : 3 : 300] beat-segment.

• MAD showed the lowest Sensitivity in revealing Atrial Fibrillation on the training set, the MIT-BIH Atrial Fibrillation database, by varying minNbeat. The same results were confirmed on the testing set, the MIT-BIH Arrhythmia database (see results in Table 3.13 of Chapter 3).

• MAD always shows much larger optimal window sizes compared to CosEn and Lorenz (by varying both minAF% and minNbeat), often obtaining even worse results.

Therefore, MAD was excluded to make a comparison possible between machine learning and single AF detectors for the same window length, also considering that MAD does not add any information orthogonal to the other metrics, i.e., it captures RR irregularity, but does so less effectively than the other features (low Sensitivity to AF and large window length are also two negative aspects from the perspective of a screening application). However, appendix B contains SVM results on the MIT-BIH NSR database and on the MIT-BIH Arrhythmia database obtained including MAD for minNbeat=10 beats: the results confirm what was hypothesized above.

4.3. Practical considerations

4.3.1. Reference selection

Training and testing of the classifiers were conducted by using minNbeat = [10, 30, 50] only. This choice was made to avoid showing a huge amount of results without any particular meaning. In fact, as previously explained (see Section 3.4.3), results obtained by using minNbeat are more interpretable in terms of the minimum amount of AF we may want to reveal: for minNbeat, the minimum amount of AF required to label a target segment as a true positive is independent of the window size, unlike for minAF%.



4.3.2. Window size selection

Training and testing were performed by using a subset of window sizes, due to the long computational time required for optimizing the models' parameters with a grid search and training the classifiers. For this purpose, an exponential extraction in the range [10 : 300] was preferred to downsampling with a constant factor D.

The underlying idea of exponential selection is that, for small window sizes, small variations in length have more relevant effects on performance than the same variations in larger windows: e.g., if we consider 12 beat-segments in place of 13 beat-segments, performance is more likely to change than when considering a 299 beat-segment in place of a 300 beat-segment.

The segment lengths which proved optimal during the training phase of each AF detector were added to the exponentially spaced windows. This was done to make a comparison of performance possible when using machine learning instead of a single detector to reveal AF with the same window size.
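One possible way to build such a set of window sizes is sketched below. The number of exponentially spaced points (`n=15`) and the function name are assumptions, since the exact sampling used in this work is not stated here; the extra windows in the example are the CosEn, Lorenz and MAD optima for minNbeat=10 from Table 3.9.

```python
def exponential_windows(lo=10, hi=300, n=15, extra=()):
    """Geometrically spaced integer window sizes in [lo, hi], plus extras.

    n is a hypothetical choice; extra holds window sizes (for example the
    single-detector optima) that must be part of the final set.
    """
    ratio = (hi / lo) ** (1 / (n - 1))
    sizes = {round(lo * ratio ** i) for i in range(n)}
    sizes.update(extra)
    return sorted(sizes)


print(exponential_windows(extra=(41, 82, 282)))
```

Geometric spacing keeps the relative change between consecutive windows roughly constant, so small windows are sampled densely and large windows sparsely.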

4.4. Methods

4.4.1. Logistic Regression

Logistic regression (LR) is a statistical model for classification, which identifies the impact of M independent variables in predicting membership of one of two dependent categories. It can be considered an extension of linear regression, which struggles with dichotomous problems. This difficulty is overcome by applying a mathematical transformation to the output of the classifier, turning it into a value bounded between 0 and 1, more appropriate for binary predictions.

In our case, the response variable Y is a positive (1) or negative (0) diagnosis for AF: the posterior probability P(y|x) is modelled by a logistic function, as follows:

\[ P(Y = 0 \mid X) = \frac{1}{1 + \exp\left(w^{T}X\right)} \tag{4.1} \]

\[ P(Y = 1 \mid X) = \frac{\exp\left(w^{T}X\right)}{1 + \exp\left(w^{T}X\right)} \tag{4.2} \]

where w is the vector of the regression coefficients.

The standard logistic function S(t), commonly known as the sigmoid, is defined by the formula



\[ S(t) = \frac{1}{1 + \exp(-t)} \tag{4.3} \]

Its “S” shape is shown in Figure 4.2.

Figure 4.2.: Logistic curve.

If we now consider the natural logarithm of the ratio of expressions 4.2 and 4.1, i.e. the logarithm of the odds that the dependent variable is 1 (the log-odds, or logit), we establish a linear dependence between conditional probabilities and predictive variables:

\[ \ln\frac{P(Y = 1 \mid X)}{P(Y = 0 \mid X)} = w^{T}X. \tag{4.4} \]

In fact, by setting $w^{T}X = z$, the binary classification problem is transformed into the search for a linear regression model, whose coefficients must be estimated.
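Equations 4.1 to 4.4 can be checked numerically with a few lines of Python. The function name and the example values of w and x are arbitrary choices for illustration.

```python
import math


def posterior(w, x):
    """Return (P(Y=0|x), P(Y=1|x), z) for the logistic model, z = w^T x."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p1 = 1.0 / (1.0 + math.exp(-z))  # eq. 4.2, written as the sigmoid S(z)
    p0 = 1.0 - p1                    # equals eq. 4.1
    return p0, p1, z


# Arbitrary illustrative coefficients and feature values.
p0, p1, z = posterior([0.5, -1.0, 2.0], [1.0, 0.3, 0.8])
print(p0, p1)
print(math.log(p1 / p0), z)  # the log-odds reproduce z, as in eq. 4.4
```

The second printed line shows the log-odds and z agreeing up to floating-point rounding, which is the linear relationship stated by eq. 4.4.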

4.4.1.1. Model selection methods

One aim of logistic regression is to predict a binary target using the most parsimonious model, in order to ensure that the model does not overfit the training data. To accomplish this goal, different techniques can be used to generate the final model:

• Backward stepwise regression

The search begins with a saturated model (all predictors are included), then predictors are eliminated from the model in an iterative process. The deletion of each variable is tested to guarantee that the model still fits the data. The analysis is complete when no more variables can be eliminated according to a model comparison criterion.



Figure 4.3.: The LARS algorithm in the case of M = 2 features. The model starts in µ0 (x1 and x2 are the variables and y is the outcome). The residual (the green line) makes the least angle with x1 (x1 is the most correlated with the residual), so the model starts moving in the x1 direction. At µ1, there is equiangularity/equicorrelation of the residual with x1 and x2: so, we start moving in the direction that preserves this property.

• Forward selection

Initially there are no variables in the model: the addition of each variable is tested using a model comparison criterion. The limit of this method is that it can be too greedy: variables are fully added at each step, so correlated predictors are unlikely to be included in the model.

• Forward stagewise regression

This method remedies the limits of forward selection by adding the features only partially: it finds the variable with the highest predictive power and updates its weight by only epsilon in the correct direction. The problem with forward stagewise regression is that it is quite slow and inefficient, since it requires a very large number of updates.

In this study, the method used to predict the class labels, while avoiding the overfitting problem, is a faster implementation of the forward stagewise approach, i.e. the Least Angle Regression (LARS), detailed in the next Section.
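To make the "many tiny updates" of forward stagewise regression concrete, a minimal sketch follows. It assumes centred columns and a squared-error residual; the function name, the step size `eps` and the toy data are hypothetical and are not taken from the thesis implementation (which was written in Matlab and relies on LARS).

```python
def forward_stagewise(X, y, eps=0.01, n_steps=2000):
    """Incremental forward stagewise regression (illustrative sketch).

    X is a list of rows with centred columns, y a list of targets.
    """
    n, m = len(X), len(X[0])
    w = [0.0] * m
    residual = list(y)
    for _ in range(n_steps):
        # Inner product of each feature with the current residual.
        corr = [sum(X[i][j] * residual[i] for i in range(n)) for j in range(m)]
        j_best = max(range(m), key=lambda j: abs(corr[j]))
        if abs(corr[j_best]) < 1e-9:
            break  # nothing left to explain
        # Move the weight of the most correlated feature by only +/- eps.
        step = eps if corr[j_best] > 0 else -eps
        w[j_best] += step
        for i in range(n):
            residual[i] -= step * X[i][j_best]
    return w

# Toy data with orthogonal, centred columns; y depends on the first feature only.
X = [[-1.0, 1.0], [0.0, -2.0], [1.0, 1.0]]
y = [-2.0, 0.0, 2.0]
w = forward_stagewise(X, y)
```

In this toy case about 200 updates of size 0.01 are needed to bring the first weight close to 2, which illustrates why the method is slow compared with the larger steps taken by LARS.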

4.4.1.2. Training on the MIT-BIH AF database

For the training of the logistic regression model on the MIT-BIH AF database, optimization of the regression coefficients {wi} and feature importance analysis were combined by using the usual 5-fold cross-validation. The Least Angle Regression (LARS) was employed for this purpose.

Compared to the forward stagewise regression, which makes tiny jumps in the direction of one regressor at a time, LARS makes optimally sized steps in optimal directions, chosen to make equal angles (equal correlations) with each of the variables currently in the model. Figure 4.3 illustrates the algorithm steps in the case of only 2 predictors.

The coefficients estimate is defined by

$$ \hat{w} = \arg\min_{w} \sum_{i=1}^{N} \Big( y_i - \sum_{j} w_j x_{ij} \Big)^2 + \lambda \sum_{j} |w_j| \tag{4.5} $$

where λ is a tuning parameter which controls the amount of shrinkage applied to the estimates.

More specifically, increasing values of λ increase the penalty on the features: λ values which are too high end up shrinking the model to contain no coefficients (a model containing no coefficients is called the null model, because it will just predict the mean of the targets every time).

This internal parameter was optimized, for each window length, on the training set using a 5-fold CV.

The search range of λ values was chosen in this way:

1. The largest value of lambda which still contains non-zero coefficients is determined (lambdaMax);

2. lambdaMax is multiplied by 10^−4; the resulting value is lambdaMin;

3. A 100-element vector is created, which spans [lambdaMax, lambdaMin].

By using this approach, there is not a fixed range for λ, as it depends on the dataset. Therefore, ranges of λ slightly differ when using minNbeat equal to 10, 30 and 50 beats.
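The three steps above can be sketched as follows. Two points are assumptions rather than facts from the text: the grid is taken to be logarithmically spaced (the text only states that 100 values span the interval), and lambdaMax is computed from the optimality condition of eq. 4.5 without an intercept, for which all coefficients are zero exactly when λ ≥ 2 max_j |x_j^T y|. Function names and toy data are hypothetical.

```python
def lambda_max(X, y):
    """Boundary value of lambda above which eq. 4.5 gives the null model.

    Assumes no intercept term: w = 0 is optimal when
    lambda >= 2 * max_j |x_j^T y|.
    """
    n, m = len(X), len(X[0])
    return 2.0 * max(abs(sum(X[i][j] * y[i] for i in range(n))) for j in range(m))

def lambda_grid(lam_max, ratio=1e-4, n_values=100):
    """n_values points from lam_max down to lam_max * ratio (log spacing assumed)."""
    step = ratio ** (1.0 / (n_values - 1))
    return [lam_max * step ** k for k in range(n_values)]

# Toy data, for illustration only.
X = [[1.0, 0.5], [-1.0, 0.5], [0.0, -1.0]]
y = [1.0, -1.0, 0.0]
lam_max = lambda_max(X, y)
grid = lambda_grid(lam_max)
```

Because `lam_max` is derived from the data, the resulting grid changes with the dataset, which is why the λ ranges differ across the minNbeat settings.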

The parameter λ was optimized on the MIT-BIH AF database by using a 5-fold cross-validation for each window length (tuning phase) and it was selected by maximizing the AUROC.

Once the optimal λ was found for each window length, the model was retrained using the entire training dataset to obtain the regression coefficients {wj}.
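The selection criterion of the tuning phase can be illustrated with a small sketch. The AUROC is computed here through the Mann-Whitney statistic, and the candidate with the highest mean AUROC across folds is retained; both helper functions and the per-fold scores are hypothetical and only indicate one way the selection might be carried out.

```python
def auroc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney statistic (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def select_best(candidates, fold_aurocs):
    """Return the candidate whose mean AUROC across the folds is highest."""
    means = [sum(a) / len(a) for a in fold_aurocs]
    return candidates[max(range(len(candidates)), key=lambda k: means[k])]

# Illustrative values: two candidate lambdas, two folds each.
example_auc = auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
best_lambda = select_best([0.1, 0.01], [[0.90, 0.92], [0.95, 0.93]])
```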

Table 4.1 shows the optimal coefficients w for each value of minNbeat.

A logistic regression model outputs the probabilities that an observation belongs to a certain class (AF vs. non-AF). For this reason, the last step was to determine the best threshold across the 5 folds of the MIT-BIH AF database to round the probabilities towards 0 (negative diagnosis) or 1 (positive diagnosis).

Feature            10 beats   30 beats   50 beats
CosEn              0.1474     0.1735     0.1887
SampEn             0          0          0.0236
meanRRinterval     0          0          0
minRRinterval      0.1865     0.2990     0.3376
medianFrequency    0.1118     0.1426     0.1589
OriginCount        0          0          0
IrrEv              0.0117     0.0114     0.0085
PACEv              -0.0095    -0.0094    -0.0070
AFEvidence         0          0          0

Table 4.1.: Optimal logistic regression coefficients for each value of minNbeat.

In particular, thresholds across the folds were selected as the ones giving a minimum Sensitivity of 99.00% on each held-out fold of the training set during the cross-validation. The thresholds were then averaged to obtain the threshold value (called Thopt) used on the test sets.
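The threshold selection can be sketched as below. The rule "probability ≥ threshold means AF" and the choice of the largest threshold meeting the sensitivity target are assumptions, since the text does not specify them; function names and the toy folds are hypothetical.

```python
import math

def threshold_for_sensitivity(labels, probs, min_se=0.99):
    """Largest threshold t such that 'prob >= t means AF' gives Se >= min_se."""
    pos = sorted((p for l, p in zip(labels, probs) if l == 1), reverse=True)
    # At least k positives must be detected to reach the target sensitivity.
    k = math.ceil(min_se * len(pos))
    return pos[k - 1]

def average_threshold(folds, min_se=0.99):
    """Average the per-fold thresholds into a single value (Th_opt)."""
    ths = [threshold_for_sensitivity(l, p, min_se) for l, p in folds]
    return sum(ths) / len(ths)

# Two tiny held-out folds (labels, probabilities) with a 75% target, for illustration.
fold_a = ([1, 1, 1, 1, 0, 0], [0.9, 0.8, 0.6, 0.3, 0.4, 0.1])
fold_b = ([1, 1, 0], [0.7, 0.5, 0.2])
th_a = threshold_for_sensitivity(*fold_a, min_se=0.75)
th_opt = average_threshold([fold_a, fold_b], min_se=0.75)
```

Lowering the threshold to guarantee a high Sensitivity generally increases the number of false positives, which is the trade-off discussed later for the test sets.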

4.4.1.3. Feature importance analysis

Feature importance analysis is combined with the training of the classifier.

An easy estimate of feature importance can be obtained by counting the number of “times” (i.e., the number of folds) one feature is used in the final model, as depicted in Figure 4.4. As an example, if one feature is contained in just 1 fold, because its coefficient was strongly shrunk by λ in the others, it gets a score of 1, while it gets a score of 5 if it is always present. So, a higher score means higher importance: the problem is that, if two attributes are highly correlated, e.g. CosEn and SampEn, then the score could be low, just because half of the time the model picks CosEn and the other half it picks SampEn, even though both are important.

For this reason, the log likelihood (ll) was also used to compute the contribution of each feature to the predictive power of the final model:

$$ ll = \sum_{i=1}^{N} \left[ -y_i \ln(\hat{y}_i) - (1 - y_i) \ln(1 - \hat{y}_i) \right] \tag{4.6} $$

where y_i is the target and ŷ_i the predicted probability for observation i. The metric ll measures how likely the predictions (Ŷ) are if we assume targets derive from a binomial distribution.

Likelihood was used as it can be theoretically linked with the complexity (number of features) in the model. It also incorporates both model calibration (how close to the average occurrence rate the model’s predictions are) and discrimination (how well the model separates positive and negative classes), so it also provides better estimates of feature importance.

The feature importance analysis is performed following the steps below. For each of the 5 folds:

1. One feature of the data is randomly reshuffled.

2. The ll of the predictions is evaluated.

3. The ll is divided by the ll of the original model.

4. Previous steps are repeated for each feature.
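The four steps above can be sketched as a permutation test. The sketch reads the argument of the logarithms in eq. 4.6 as the predicted probability (an assumption); `predict`, the toy model and the data are hypothetical. With eq. 4.6 as written, a ratio above 1 means that shuffling the feature made the predictions worse; the direction and scale of the ratio used in the figures depend on the sign convention, which the text leaves open.

```python
import math
import random

def neg_log_likelihood(y, p, eps=1e-12):
    """Eq. 4.6 with predicted probabilities p, clipped to avoid log(0)."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)
        total += -yi * math.log(pi) - (1 - yi) * math.log(1.0 - pi)
    return total

def ll_ratio_per_feature(predict, X, y, seed=0):
    """For each feature: ll after shuffling that column, divided by the original ll."""
    rng = random.Random(seed)
    base = neg_log_likelihood(y, [predict(row) for row in X])
    ratios = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        rng.shuffle(column)  # step 1: reshuffle one feature
        X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
        shuffled = neg_log_likelihood(y, [predict(r) for r in X_perm])  # step 2
        ratios.append(shuffled / base)  # step 3
    return ratios  # step 4: one ratio per feature

# Hypothetical model that uses only the first feature.
predict = lambda row: 0.9 if row[0] > 0.5 else 0.1
X = [[0, 5], [1, 3], [0, 7], [1, 1], [0, 2], [1, 9]]
y = [0, 1, 0, 1, 0, 1]
ratios = ll_ratio_per_feature(predict, X, y)
```

In this toy case the ratio of the unused second feature is exactly 1, while the ratio of the first feature cannot fall below 1.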

The likelihood is averaged across the 5 folds on the held-out test set of the MIT-BIH AF. Results are displayed in Figure 4.5. It can be noticed that reshuffling PACEv, CosEn and IrrEv has a massive effect on the predictive power of the final model: for IrrEv and CosEn the drop involves all window sizes, while PACEv importance is more significant for smaller window sizes. A less evident, but still considerable, drop in the likelihood of predictions is caused by SampEn for small window lengths.

Figure 4.4.: Number of folds containing each feature in the final model for each exponentially separated window size (minNbeat=50).

Figure 4.5.: Average feature importance across the 5 folds on the held-out test set of the MIT-BIH AF, using LR (minNbeat=50).

4.4.2. Support Vector Machines

Support Vector Machines (SVMs) are a powerful technique for data classification.

For linearly separable data, the task is to find the optimal separating hyperplane in the feature space under study (which is not necessarily the original feature space): if we consider the example in Figure 4.6, we can draw an infinite number of lines to correctly separate blue and red circles, but the green line is intuitively the best choice. This is because the “margin”, i.e. the distance between the line and the nearest data samples on each side, is as wide as possible.

SVMs maximize that margin between a selection of data points, known as support vectors, and the decision boundary, as shown in Figure 4.7.

Figure 4.6.: Possible separating hyperplanes for a set of linearly separable data.

Figure 4.7.: Support vectors, decision boundary and margin. The support vectors are the most critical points to be classified in the dataset: they are the closest to the separation plane, thus having direct bearing on its location.

Given a training set of instance-label pairs (xi, yi), i = 1, ..., l, where xi ∈ R^n and y ∈ {1, −1}^l, and a hyperplane (w · x) + b = 0, the optimization problem is solved by finding the saddle point of the Lagrangian

$$ L(w, b, \alpha) = \frac{1}{2} \|w\|^2 - \sum_{i=1}^{l} \alpha_i \left\{ \left[ (x_i \cdot w) + b \right] y_i - 1 \right\} \tag{4.7} $$

where αi are the Lagrange multipliers. The Lagrangian is maximized with respect to αi ≥ 0 and minimized with respect to w and b.

However, real data should not be treated as linearly separable (they could be in a higher dimensional space, but it does not make sense trying to perfectly separate a finite training set, potentially corrupted by noise).

Parameter   Range
γ           {2^(−15:2:3)}
C           {2^(−5:2:15)}

Table 4.2.: Range of values for Gamma (γ) and Capacity (C) to perform a grid search. The choice of exponentially growing values is a practical method to search a large space in a reasonable amount of computation time.

4.4.2.1. Non-linear SVM

For non-linear SVM the concept of “soft margin” is introduced: samples on the wrong side of the decision boundary are allowed, but penalized in proportion to their distance from the boundary. The optimization problem then becomes:

$$ \min_{w, b, \xi} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{l} \xi_i \tag{4.8} $$

subject to

$$ y_i \left( w^{T} \phi(x_i) + b \right) \ge 1 - \xi_i, \qquad \xi_i \ge 0. \tag{4.9} $$

During the training, instances xi are mapped into a higher dimensional space by the function φ.

C > 0 is the penalty parameter of the error term; it controls the cost of misclassification: a high C means we are heavily penalizing samples on the wrong side of the decision boundary. The suitable value of C is not known a priori, since it is problem-dependent, and must be optimized using a grid search over a range of values.

4.4.2.2. The kernel trick

In non-linear SVMs, data are linearly separated by being mapped to a higher dimensional space. Consider Figure 4.8: there is no way the upper dataset can be linearly separated; but, if we embed the data in a higher dimensional space, e.g. x ↦ x², in this new domain a line can be easily drawn to divide red and green points.

The interesting aspect is that the feature mapping in a higher domain is not actually needed: the function φ(·) does not have to be specified. One can define a kernel function (K) which performs the necessary calculations within the original feature space, resulting in significant computational cost savings.

$$ K(x_i, x_j) \equiv \phi(x_i)^{T} \phi(x_j). \tag{4.10} $$
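A small numerical check may help to see why the mapping φ never has to be evaluated. The sketch below uses a degree-2 polynomial kernel on 2-D inputs, chosen only because its feature map is easy to write down; it is not the kernel used in this work, and the sample points are arbitrary.

```python
import math

def phi(x):
    """Explicit degree-2 feature map for a 2-D input (illustrative choice)."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)

def poly2_kernel(x, y):
    """K(x, y) = (x . y)^2, evaluated entirely in the original 2-D space."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

a, b = (1.0, 2.0), (3.0, -1.0)
explicit = sum(u * v for u, v in zip(phi(a), phi(b)))   # phi(a)^T phi(b)
implicit = poly2_kernel(a, b)                           # same value, no mapping
```

Both quantities coincide, as required by eq. 4.10, but the kernel needs only one inner product in the original space.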

Figure 4.8.: Non-linear SVM. The upper dataset cannot be linearly separated; but, if we embed the data in a higher dimensional space, e.g. x ↦ x², in this new domain a line can be easily drawn to divide red and green points.

Several kernel functions are available: for this work, the radial basis function (RBF) was chosen, because it has fewer numerical difficulties and a lower number of hyperparameters compared to the polynomial kernel, because the linear kernel can be regarded as a special case of the RBF (which is itself the Gaussian kernel), and because it can deal with a non-linear relation between attributes and class labels. The RBF kernel is defined as:

$$ K(x_i, x_j) = \exp\left( -\gamma \|x_i - x_j\|^2 \right), \qquad \gamma > 0 \tag{4.11} $$

where γ controls the kernel width.

C and γ must be optimized simultaneously by combining a grid search over a range of values and a 5-fold cross-validation in order to prevent the overfitting problem.

Table 4.2 shows the range of values for γ and C used to perform the grid search on the training set, the MIT-BIH AF database. Both ranges are exponentially growing sequences of values, as this is a practical method to search a large space in a reasonable amount of computation time.
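The grid of Table 4.2 and the kernel of eq. 4.11 can be written out as a short sketch. The exponent ranges are read as MATLAB-style `start:step:stop` sequences with inclusive end points, which is an interpretation of the table notation; scoring each pair by cross-validation is left out.

```python
import math

def rbf_kernel(xi, xj, gamma):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2), gamma > 0 (eq. 4.11)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)

# Exponents -15, -13, ..., 3 for gamma and -5, -3, ..., 15 for C.
gamma_values = [2.0 ** e for e in range(-15, 4, 2)]
c_values = [2.0 ** e for e in range(-5, 16, 2)]

# Every (C, gamma) pair is a candidate to be scored by 5-fold cross-validation.
grid = [(c, g) for c in c_values for g in gamma_values]
```

Under this reading the search covers 10 values of γ and 11 values of C, i.e. 110 candidate pairs per window length.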

Data pre-processing

For training and testing SVM, each data instance was represented by a vector of real numbers (attribute/feature values). Linear scaling to the range [0, 1] was then performed on each attribute.

Scaling data has a double purpose: it simplifies numerical calculation and prevents attributes in larger numerical ranges from dominating attributes in smaller ranges.

Only the training set was used for rescaling: inferred scaling parameters were then applied to the test sets, since scaling all data together would illegitimately make results look better.
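A minimal sketch of this pre-processing step is given below: the scaling parameters are estimated on the training rows only and then reused unchanged on the test rows. Function names and data are hypothetical; handling of constant attributes (mapped to 0) is an assumption.

```python
def fit_min_max(train_rows):
    """Per-attribute minimum and maximum, estimated on the training set only."""
    columns = list(zip(*train_rows))
    return [min(c) for c in columns], [max(c) for c in columns]

def apply_min_max(rows, mins, maxs):
    """Linearly map each attribute so that the training range becomes [0, 1]."""
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, mins, maxs)
        ])
    return scaled

train = [[1.0, 10.0], [3.0, 30.0], [2.0, 20.0]]
test = [[4.0, 15.0]]
mins, maxs = fit_min_max(train)
train_scaled = apply_min_max(train, mins, maxs)
test_scaled = apply_min_max(test, mins, maxs)
```

Note that test values may fall outside [0, 1] (here the first test attribute maps to 1.5), which is expected when the test set is not used to infer the scaling.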


For training SVM, the integrated software LIBSVM was used [5].

The model training was preceded by a grid search (tuning phase) with the aim of optimizing γ and C using a 5-fold CV across the range of values summarized in Table 4.2 for each window length.

The model corresponding to the window length WLopt (i.e., the one having the highest averaged AUROC value across the 5 folds), with γopt(WLopt) and Copt(WLopt) fixed, was retrained on the entire training set.

Results on the MIT-BIH AF database are shown in Table 4.3. SVM is compared to Lorenz, the best-performing AF detector (see Section 4.4).

Metrics are reported for the SVM optimal window length (WLopt) and for the Lorenz WLopt. When minNbeat is only 10 beats, SVM outperforms Lorenz on every metric (e.g. Se 98.12% vs. 95.04%, Err 1.58% vs. 3.28%), and its window size (65 beat-segment) is considerably smaller than that of Lorenz (82 beat-segment). For higher values of minNbeat, SVM performance metrics are even more promising and the SVM WLopt values are larger than the Lorenz ones: however, SVM is still better even for smaller window sizes (i.e., the same as Lorenz).

minNbeat = 10   SVM (WLopt=65)   SVM (WL=82)   Lorenz (WLopt=82)
Se (%)          98.12            98.01         95.04
Sp (%)          98.66            98.59         98.02
Acc (%)         98.42            98.34         96.72
PPV (%)         98.26            98.18         97.40
NPV (%)         98.55            98.46         96.21
Err (%)         1.58             1.66          3.28

(a)



(b)



(c)

Table 4.3.: Results of SVM on the MIT-BIH AF database compared to Lorenz performance. Results are shown for minNbeat equal to 10 beats (4.3a), 30 beats (4.3b) and 50 beats (4.3c). SVM performance metrics are reported for the WLopt and for the WL which proved optimal for Lorenz.

4.4.3. Random Forests

Random Forests (RF) are a mathematical method for classification based on decision trees.

Decision trees are a popular learning technique, thanks to their conceptual simplicity, computational speed and interpretability. The growth of the tree is regulated by a recursive, heuristic procedure, called top-down induction of decision trees: one instance is initially located in the root node and is successively placed in descending nodes, using the splitting rules and checking that no stopping criterion has been met yet [18].

In the original paper by L. Breiman, RF are defined as consisting of a “collection of tree-structured classifiers h(X, Θk), k = 1, ... where the Θk are independent identically distributed random vectors and each tree casts a unit vote for the most popular class at input X” [3].

Essentially, RF build a series of trees. Each tree makes a prediction, good or bad, and each prediction votes to make the RF final predictions: if the majority of the trees classifies one observation as belonging to class 0 (1), then RF classify that observation as 0 (1).

Considering our training set of N observations, composed of M attributes, for each tree:

• N instances are sampled randomly, but with replacement, from the original data.

• A number m ≪ M of features is randomly selected to generate the splitting rules (once m is chosen, it remains fixed during the whole forest growth).

• Pruning criteria are not applied: trees are fully grown, since RF are reported not to overfit as more trees are added [3].
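The randomization steps listed above, together with the voting rule, can be sketched as follows. Growing the individual trees is omitted; the function names, the tie-breaking rule and the example values are hypothetical.

```python
import random

def bootstrap_indices(n, rng):
    """Draw n instance indices uniformly at random, with replacement."""
    return [rng.randrange(n) for _ in range(n)]

def random_feature_subset(n_features, m, rng):
    """Pick m distinct feature indices out of n_features."""
    return sorted(rng.sample(range(n_features), m))

def majority_vote(tree_predictions):
    """Class (0 or 1) chosen by the majority of trees; ties go to 1 (assumed)."""
    ones = sum(tree_predictions)
    return 1 if 2 * ones >= len(tree_predictions) else 0

rng = random.Random(0)
idx = bootstrap_indices(10, rng)            # one bootstrap sample of N = 10
feats = random_feature_subset(9, 3, rng)    # m = 3 out of M = 9 features
vote_a = majority_vote([1, 0, 1])
vote_b = majority_vote([0, 0, 1])
```

Because sampling is done with replacement, some instances typically appear more than once in `idx` while others are left out of a given tree.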

A robust classifier is characterized by a low error rate. In RF, the error rate depends on:

• Correlation of any two trees in the forest. Increasing the correlation increases the forest error rate.

• The strength of each individual tree in the forest. Stronger trees decrease the forest error.

The parameter m has an effect on both the correlation and the strength. Increasing it increases both, causing an inevitable trade-off between them.


RF require the tuning of the parameter m, i.e. the number of features randomly selected for each tree in the forest. This line search was performed across the range [1, 2, 3, 4, 5] with the usual 5-fold CV.

Once m was optimized for each window length, the best model was chosen using AUROC, and retrained using the entire training set.

The final RF model gives the probabilities that an observation belongs to a certain class (AF vs. non-AF).

Figures 4.9 and 4.10 depict how performance metrics vary by changing the threshold. As we can see, for low thresholds, Negative Predictive Value and Sensitivity have the highest values. In fact, the number of TP is maximized, at the cost of an increase in FP, which reduces the Specificity and Positive Predictive Value of the model. As the threshold rises, Sensitivity and Negative Predictive Value are slightly reduced, while the other metrics progressively improve.
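The dependence of the metrics on the threshold can be reproduced on a toy example. The sketch assumes the rule "probability ≥ threshold means AF" and non-empty classes; names and data are hypothetical.

```python
def confusion_counts(labels, probs, threshold):
    """Count TP, FP, TN, FN when prob >= threshold is read as AF (assumed rule)."""
    tp = fp = tn = fn = 0
    for y, p in zip(labels, probs):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity_specificity(tp, fp, tn, fn):
    """Se = TP / (TP + FN), Sp = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

labels = [1, 1, 1, 0, 0, 0]
probs = [0.9, 0.5, 0.3, 0.6, 0.2, 0.1]
se_low, sp_low = sensitivity_specificity(*confusion_counts(labels, probs, 0.2))
se_high, sp_high = sensitivity_specificity(*confusion_counts(labels, probs, 0.8))
```

With the low threshold all AF instances are detected but two of the three non-AF instances are misclassified; with the high threshold the situation is reversed.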

Therefore, the last step was to determine the best threshold across the 5 folds of the MIT-BIH AF database to round the probabilities towards 0 (negative diagnosis) or 1 (positive diagnosis).

In particular, thresholds across the folds were selected as the ones giving a minimum Sensitivity of 99.00% on the training set and were then averaged to obtain the threshold value Thopt used on the test sets.

4.4.3.2. Feature importance analysis

As for LR, the log likelihood was used to perform the feature importance analysis.

Therefore, the “importance” estimate of each attribute was obtained by computing the ratio between the ll, after randomly shuffling the data for that attribute, and the ll of the initial model (the theoretical best of this ratio would be 0).

Figures 4.11, 4.12, 4.13 show the results for minNbeat equal to 10, 30 and 50 beats respectively. In Figure 4.11, CosEn, AFEvidence and IrrEv have the greatest impact on model predictivity: CosEn is more “important” for smaller windows, then its ll ratio gradually increases, while remaining significantly low. Similar considerations apply to AFEvidence (slightly less important than CosEn for small windows and slightly more for larger WLs). When reshuffling IrrEv, instead, the model gets much worse. The SampEn contribution is quite important at very small windows. All the other features contribute to different, but smaller, extents to the predictive power of the model.

If we compare Figure 4.11 with Figure 4.12, we can notice that the effect of reshuffling CosEn, SampEn, AFEvidence and IrrEv is even more emphasized at small windows. The ll ratio for the feature meanRRinterval also undergoes an evident reduction between 174 and 228 beat-segment.

Finally, when minNbeat=50 beats (Figure 4.13), the attribute medianFrequency also contributes appreciably to a correct prediction.

In general, the predictive importance of the features extracted from an ECG segment by using the algorithms CosEn and Lorenz is confirmed. Surprisingly, IrrEv appears to have a huge impact on the correct classification, even more than AFEvidence. These two attributes are related (see eq. 2.1): AFEv also incorporates information from PACEv and OriginCount (see Section 2.3.4), which is important to correctly classify a segment as non-AF, but probably, for the purpose of discriminating between NSR and AF only, IrrEv better encodes the irregular nature of AF, thus resulting in better predictions.

Two other simple features, meanRRinterval and medianFrequency, gave a good, but smaller, contribution to the discrimination, in particular when the amount of AF within a segment is considerable (at least 30 and 50 beats).

Figure 4.9.: Performance metrics as a function of window length and threshold, across the 5 folds on the held-out test set of the MIT-BIH AF (minNbeat=10). Thresholds are specified in the y axis: 0.2 (4.9a) and 0.4 (4.9b).

Figure 4.10.: Performance metrics as a function of window length and threshold, across the 5 folds on the held-out test set of the MIT-BIH AF (minNbeat=10). Thresholds are specified in the y axis: 0.6 (4.10a) and 0.8 (4.10b).

Figure 4.11.: Average feature importance across the 5 folds on the held-out test set of the MIT-BIH AF, using RF (minNbeat=10).




4.5. Comparative results

4.5.1. Sensitivity-weighted results

This section compares the performance obtained on the test sets (MIT-BIH NSR and Arrhythmia databases) by each AF detector and by each classifier (SVM, RF, LR).

Thresholds for SVM, RF and LR, applied to round the probability estimates, were obtained by imposing a minimum Sensitivity of 99.00% on the training set; the threshold values used on the test sets are shown in Table 4.4.

As we can see in Table 4.5, the Sp of LR, RF and SVM is lower than that of Lorenz when minNbeat=10 beats: in fact, the threshold selection criterion mentioned above tends to minimize the number of FN, possibly increasing FP. This could explain why the ability of LR, RF and SVM to correctly identify patients without AF is worsened. However, when increasing minNbeat, RF and SVM show better capability than the other AF detectors in rejecting a non-AF event, but with larger window sizes.

On the MIT-BIH Arrhythmia database, we cannot establish an undisputed winner, as shown in Tables 4.6, 4.7, 4.8. When minNbeat=10 beats, LR shows the highest Se (99.35%) on the series 200, followed by CosEn (98.97%) and RF (98.69%).

On the contrary, Lorenz detects the largest number of positive events when minNbeat=30 beats, with a WL of only 76 beat-segment. Finally, when minNbeat=50 beats, SVM reaches 100% Sensitivity on the series 200 and 100% Specificity on the series 100, using 140 beat-segments.

minNbeat   LR       RF       SVM
10         0.3209   0.0515   0.2381
30         0.4326   0.4131   0.3409
50         0.4697   0.4858   0.4087

Table 4.4.: Thresholds applied to the test sets, to round the probability estimates for LR, RF and SVM.

Table 4.5.: Comparison between LR, RF, SVM and each AF detector: CosEn, MAD, Lorenz. Performance metrics on the MIT-BIH NSR database are reported by varying minNbeat.

Table 4.6.: Comparative results for SVM, RF, LR and each AF detector, on the MIT-BIH Arrhythmia database. Results are distinguished between series 100 and series 200 and are reported for minNbeat=10 beats.



4.5.2. Results with LIBSVM

Table 4.9 shows comparative results for SVM and the other AF detectors on the NSR database, obtained by using the class label predictions returned by the integrated software LIBSVM (Section 5.4.2.3 explains how internal SVM parameters are optimized). When minNbeat is equal to 30 and 50 beats, SVM Specificity is higher than that of Lorenz. On the contrary, when considering a minimum of 10 beats, the Specificity is slightly lower (-0.24%), but with a significantly smaller window length (65 beat-segment).

Still, on the arrhythmia database, SVM improves on the performance obtained by the other AF detectors. Tables 4.10 and 4.11 show performance metrics for minNbeat equal to 10 and 30 respectively: even when the Sensitivity is slightly lower than that of Lorenz or CosEn, the total error is much lower.

Finally, when minNbeat=50 beats (Table 4.12), SVM Se, Sp and Acc are respectively 100%, 82% and 85.45%: SVM can detect all the True Positive events, while maintaining an overall Accuracy which is higher than that obtained by using only the optimal AF detector, i.e. Lorenz.

Table 4.9.: Comparison between SVM and each AF detector (i.e. CosEn, MAD, Lorenz). Performance metrics on the MIT-BIH NSR database are reported by varying minNbeat.

Table 4.10.: Comparative results for SVM, CosEn, MAD, Lorenz on the MIT-BIH Arrhythmia database. Results are distinguished between series 100 and series 200 (see Section 3.2) for minNbeat=10 beats.

5. Ectopy simulation

5.1. Introduction

Ectopic beats are frequently seen in clinical practice. In the absence of underlying pathological conditions, they are usually harmless and no treatment is usually recommended. However, since ectopic beats occur early or late with respect to the timing of a beat generated via the SA node, they introduce changes in the tachogram spectral content, thus challenging the rejection of sinus beats by AF detectors based on ventricular response analysis.

In this chapter a method based on Hidden Markov modelling is used to add different percentages of premature ventricular contractions to the MIT-BIH Normal Sinus Rhythm database.

Each AF detector and each classifier is tested again on this dataset and results are shown by varying minNbeat = [10, 30, 50] (minNbeat is the minimum number of AF beats an RR segment must contain for that segment to be considered a true positive), to see how sporadic abnormalities in a normal ECG tracing affect the capability of each method to reject a non-AF temporal segment.

5.2. Identifying the problem

The heart of a healthy subject typically exhibits normal sinus rhythm, characterized by regular waves of electrical activity which originate in the specialized cardiac cells of the SA node and travel in a normal fashion within the heart. This process repeats itself and could be described as a quasiperiodic function, since consecutive beats are morphologically similar and small variations in RR intervals occur slowly.

An ectopic beat is defined as a beat which occurs abnormally in relation to the prevailing rhythm. It is generated by ectopic foci of the heart and reveals an aberration in the electrical signalling pathway, e.g. a conduction block along the normal route SA node - AV node - bundle of His - Purkinje fibres.

As an example, if the AV node is not activated by the SA junction for any reason (i.e. block or arrest), it will become the actual cardiac pacemaker. Similarly, ventricular foci will take over if transmission through the AV node fails. In such cases, known as escape ectopic automatic rhythms, the resulting QRS will be expanded and shifted because ectopic pacemakers fire at their intrinsic rate, slower than that of the SA node.

Figure 5.1.: A ventricular ectopic beat (4th from the left) for record 109 of the MIT-BIH Arrhythmia database.

Vice versa, it may happen that the normal activity of ectopic foci is enhanced (possible causes: hypoxemia, ischemia, increased sympathetic tone), so that their rate of discharge is faster than expected: in this circumstance, a subsidiary pacemaker is able to overdrive the SA node and the resulting beat will be premature, as shown in Figure 5.1. A typical anticipation interval is 20% of the previous beat [20].

Ectopic beats are classified as Atrial, Atrio-Ventricular junctional and Ventricular. A premature ventricular contraction (PVC), for example, is commonly characterized by a wide QRS complex, absence of P-wave and a compensatory pause, caused by the consequential ventricular refractoriness that does not allow the normal sinus beat transmission. In case of R on T ectopics (PVC and T-wave of the previous beat are overlapped) PVC can trigger ventricular arrhythmias. PVCs can also occur in patterns: one premature complex followed by one (bigeminy) or two (trigeminy) normal sinus impulses (regularly irregular rhythms).

Ectopic beats are commonly seen in clinical practice and have been considered benign for a long time. But research has shown that frequent extrasystoles are markers of unsatisfactory recovery from traumatic events and a sign of underlying pathologic substrate for arrhythmias, such as Atrial Fibrillation (see Section 1.3) [41]. In particular, an increased occurrence of ventricular ectopic beats is associated with an augmented and statistically significant risk of cardiovascular disease [38, 1, 40, 41].

Class   Description
0       No ventricular ectopic beats
1       Occasional, isolated PVC
2       Frequent PVC (> 1/min or > 30/hour)
3       Multiform PVC
4       Repetitive PVC: couplets, salvos
5       Ventricular tachycardia

Table 5.1.: Modified Lown criteria for Holter classification of ventricular extrasystoles [35].

The Lown criteria [25] are prognostic criteria, which aim to classify extrasystoles on the basis of their occurrence rate (Table 5.1). Frequency of ectopic events is thought to predict eventual outcomes. Class 0 and Class 1 are associated with a low risk of degeneration into a pathologic condition; Class 2 is associated with an intermediate risk; Class 3 and onwards have a high risk of degeneration into potentially dangerous dysrhythmias.

5.3. Methods

Ectopic beats are simulated by modifying the records of the MIT-BIH NSR database. For this purpose, it was hypothesized that RR series can be modelled by using a three-symbol Markov chain sequence, as described in Mark and Moody’s article [29].

The Markov model is characterized by a set of states: short beat (S), regular beat (R), long beat (L).

Each interval of an RR series is representative of one of these three states (S-R-L), as it can be classified as Short, Regular or Long beat.

During the Markov process, for each discrete time step t, the system is in exactly one of the possible three states S1 (short beat), S2 (regular beat), S3 (long beat). In particular, the current state determines the probability distribution of the next state. For our special case of a discrete, first order, Markov process, this probability is formally defined as [34]:

$$ P(q_{t+1} = S_j \mid q_t = S_i, q_{t-1} = S_k, \ldots) = P(q_{t+1} = S_j \mid q_t = S_i) \tag{5.1} $$

Figure 5.2.: State sequence and observations for record 16265 of the MIT-BIH NSR database. Top figure shows states’ transitions: short beat (state index 1), regular beat (state index 2), long beat (state index 3). Central figure depicts the original RR intervals series. Bottom figure shows the RR sequence, when ectopy is added, by using the transition probability matrix (eq. 5.5).

Therefore, considering only processes for which the right side of formula 5.1 is time-independent, we can define the set of state transition probabilities aij as [34]:

$$ a_{ij} = P(q_{t+1} = S_j \mid q_t = S_i), \qquad 1 \le i, j \le 3 \tag{5.2} $$

subject to

$$ a_{ij} \ge 0 \tag{5.3} $$

$$ \sum_{j=1}^{3} a_{ij} = 1. \tag{5.4} $$

For the purpose of the ectopy simulation, the prototypic stationary transition matrix, associated with the hypothesis of a non-AF sequence, is the ordered array of the probabilities {aij} of transition between different states as in [9]:

$$ \mathrm{STM}^{SRL}_{nonAF} = \begin{bmatrix} 0.21 & 0.02 & 0.21 \\ 0.21 & 0.96 & 0.48 \\ 0.58 & 0.02 & 0.31 \end{bmatrix} \tag{5.5} $$

where each element aij represents the probability that, given the current state j, the next one is going to be i. Note that this indexing is transposed with respect to eq. 5.2: in eq. 5.5 it is the columns, rather than the rows, that sum to 1.

Therefore, the first column gives the probability of having a next short, regular or long RR, if the current RR is short. The second column gives the same probabilities when the current RR is regular and the last one gives the probabilities when the current RR is long.

As we can see in eq. 5.5, if the current RR is regular, the probability of having a short or a long beat is the same (0.02); but, when the current beat is long, the next one is more likely to be regular (0.48); while in the case of a short beat, the next one is more likely to be long (0.58). This accounts for the compensatory pauses which typically follow an abnormal beat during accelerated ectopic automatic rhythms.

Using the transition matrix shown in eq. 5.5 and defining an initial state, regular (R) for example, we can compute a sequence of states as long as each record of the MIT-BIH NSR database.
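Generating such a state sequence can be sketched as follows. The matrix is the one of eq. 5.5, read column-wise (column = current state, row = next state) as stated in the text; the function name, the seed and the sequence length are hypothetical.

```python
import random

STATES = ("S", "R", "L")

# Matrix of eq. 5.5: column = current state, row = next state.
STM = [
    [0.21, 0.02, 0.21],
    [0.21, 0.96, 0.48],
    [0.58, 0.02, 0.31],
]

def simulate_states(n_beats, stm=STM, start="R", seed=0):
    """Sample a first-order Markov state sequence of length n_beats."""
    rng = random.Random(seed)
    current = STATES.index(start)
    sequence = [STATES[current]]
    for _ in range(n_beats - 1):
        # Probabilities of the next state: the column of the current state.
        next_probs = [stm[i][current] for i in range(3)]
        current = rng.choices(range(3), weights=next_probs)[0]
        sequence.append(STATES[current])
    return sequence

seq = simulate_states(20000)
fraction_non_regular = sum(s != "R" for s in seq) / len(seq)
```

For this matrix the stationary share of non-regular states is a little below 10%, so `fraction_non_regular` should settle around that value for long sequences.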

Moreover, by changing the central elements of the transition probability matrix in eq. 5.5, i.e. a12, a22, a32, while keeping a12 and a32 equal to each other, different percentages of abnormal changes may be introduced in the original RR series.

Mean (%)   Standard deviation (%)   [a12, a22, a32]
0.26       0.038                    [0.0005, 0.9990, 0.0005]
0.54       0.053                    [0.0010, 0.9980, 0.0010]
1.05       0.048                    [0.0020, 0.9960, 0.0020]
9.62       0.16                     [0.0200, 0.9600, 0.0200]
13.91      0.20                     [0.0300, 0.9400, 0.0300]
21.16      0.24                     [0.0500, 0.9000, 0.0500]
34.93      0.26                     [0.1000, 0.8000, 0.1000]

Table 5.2.: Mean and standard deviation of simulated ectopy and corresponding transition probability matrices.

When a non-regular beat occurs (i.e., short or long beat), the corresponding interval of the RR series is multiplied by an adjustment, computed as a random number drawn from the normal distribution N(µ, σ²), where µ is the mean of the distribution and σ² the variance of the distribution.

1. µ = 0.66 and σ = 0.02 when the state is S

2. µ = 1.33 and σ = 0.02 when the state is L

The chosen values for µ and σ allow for reasonable shortening (µ = 0.66) and broadening (µ = 1.33) of the corresponding RR intervals.

An example of a state sequence and the resulting observations, computed by using the previously described adjustments, is shown in Figure 5.2 for the record 16265 of the MIT-BIH NSR database.
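Applying the adjustments to an RR series can be sketched as below, given a state label per beat. The values of µ and σ are those listed above; the function name, the seed and the toy RR series (in seconds) are hypothetical.

```python
import random

# (mu, sigma) of the multiplicative adjustment for each non-regular state.
ADJUSTMENT = {"S": (0.66, 0.02), "L": (1.33, 0.02)}

def add_ectopy(rr, states, seed=0):
    """Scale RR intervals labelled S or L by a draw from N(mu, sigma^2)."""
    rng = random.Random(seed)
    out = []
    for interval, state in zip(rr, states):
        if state in ADJUSTMENT:
            mu, sigma = ADJUSTMENT[state]
            interval *= rng.gauss(mu, sigma)
        out.append(interval)  # regular beats are left unchanged
    return out

rr = [0.8, 0.8, 0.8, 0.8, 0.8]
states = ["R", "S", "L", "R", "R"]
rr_ectopic = add_ectopy(rr, states)
```

A short beat followed by a long one, as in this toy sequence, mimics a premature contraction with its compensatory pause.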

5.4. Results

Different percentages of changes in the original RR intervals series were added bychanging elements a12, a22, a32 of the transition probability matrix 5.5. Mean and stan-dard deviation of RR intervals changes across the MIT-BIH NSR database are reportedin the Table 5.2, coupled with a12, a22, a32 elements of the correspondent transitionprobability matrix used to simulate ectopy.

Figure 5.3.: Cumulative error for different percentages of simulated ectopic beats. The drop in performance of each AF detector is shown for varying minNbeat: 10 beats (left), 30 beats (centre) and 50 beats (right), on the MIT-BIH NSR database.


Figure 5.3 shows the cumulative error, computed as the progressive decrease in Specificity, on the MIT-BIH NSR database for each AF detector and for each value of minNbeat (10, 30, 50). For percentages of non-normal events up to 1.05 ± 0.048%, performance is only very slightly affected and no significant differences exist between the candidate methods.

As the amount of ectopic events increases up to 21.16%, the drop in Specificity becomes substantial: MAD shows the smallest increase in total error, whatever the value of minNbeat. This is to be expected, since a very large window (282-beat segments) is much more robust to abnormal events than a WL of 41 beats (CosEn), which unsurprisingly gives worse results than the Lorenz window of 82 beats when minNbeat is 10 beats.

For an even higher percentage of ectopy (34.93%), the error rises above 35% for each method, with the best performance for Lorenz when minNbeat=50 beats and the largest decrease in performance for MAD when minNbeat=10 beats.

Figure 5.4 shows the cumulative decrease in Specificity on the MIT-BIH NSR database for SVM, LR and RF, for each value of minNbeat (10, 30, 50). For percentages of non-normal events up to 1.05 ± 0.048%, performance is not significantly affected, as for the single AF detectors.

As the amount of ectopic events increases up to 34.93%, the drop in Specificity is considerable: for minNbeat=10 beats, SVM shows the smallest increase in total error among the classifiers. On the contrary, for minNbeat equal to 30 and 50 beats, SVM has the highest cumulative error, even though its absolute Specificity is higher than that of RF and LR up to 1.05% of ectopy, as listed in Table 5.4.

For very high percentages of ectopy (34.93%), it should be noted that the total error for RF is comparatively low when minNbeat is equal to 30 and 50 beats. In particular, it is much lower than the corresponding errors of both the other classifiers and the AF detectors, as confirmed in Tables 5.3 and 5.4.

5.5. Conclusions

Table 5.3 summarizes the results obtained for each AF detector by introducing different amounts of ectopy on the MIT-BIH NSR database, when minNbeat is equal to 50 beats.

Table 5.4 shows the same results, i.e. the drop in Specificity, for the classifiers LR, RF and SVM: for RF and SVM, results improve on those of the single AF detectors and the corresponding cumulative error is lower. When the percentage of ectopy is equal to 34.93%, RF still shows a high Specificity (75.03%), remarkably better than that of the other classifiers and of the AF detectors.

Ectopy (%)    CosEn (Sp %)    MAD (Sp %)    Lorenz (Sp %)
0.00          94.58           98.37         99.39
0.26          94.50           98.32         99.33
0.54          94.42           98.25         99.32
1.05          94.26           98.14         99.20
9.62          91.03           96.26         96.20
13.91         88.41           94.69         93.44
21.16         81.31           88.00         86.34
34.93         52.65           56.72         63.82

Table 5.3.: Specificity of the AF detectors for varying percentages of simulated ectopy, for minNbeat=50 beats.

Ectopy (%)    LR (Sp %)    RF (Sp %)    SVM (Sp %)
0.00          98.31        99.66        99.77
0.26          98.30        99.64        99.74
0.54          98.35        99.63        99.76
1.05          98.27        99.58        99.72
9.62          96.80        98.58        98.17
13.91         94.96        97.56        95.88
21.16         88.65        94.61        86.20
34.93         58.56        75.03        50.92

Table 5.4.: Specificity for LR, RF, SVM for varying percentages of simulated ectopy, for minNbeat=50 beats.
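As a worked illustration, the drop in Specificity can be recomputed directly from the values of Table 5.4. The short Python sketch below assumes one possible reading of the cumulative error, namely the difference in percentage points between the Specificity at 0% ectopy and the Specificity at each ectopy level; the function name `specificity_drop` is introduced here only for illustration.

```python
# Specificity (%) from Table 5.4, minNbeat = 50 beats.
ECTOPY = [0.00, 0.26, 0.54, 1.05, 9.62, 13.91, 21.16, 34.93]
SPECIFICITY = {
    "LR": [98.31, 98.30, 98.35, 98.27, 96.80, 94.96, 88.65, 58.56],
    "RF": [99.66, 99.64, 99.63, 99.58, 98.58, 97.56, 94.61, 75.03],
    "SVM": [99.77, 99.74, 99.76, 99.72, 98.17, 95.88, 86.20, 50.92],
}


def specificity_drop(values):
    """Drop in Specificity (percentage points) relative to the 0% ectopy row."""
    baseline = values[0]
    return [round(baseline - v, 2) for v in values]


if __name__ == "__main__":
    for name, values in SPECIFICITY.items():
        drops = specificity_drop(values)
        for ectopy, drop in zip(ECTOPY, drops):
            print(f"{name:>3}  ectopy {ectopy:5.2f}%  drop {drop:6.2f}")
```

Under this reading, at 34.93% of ectopy the drop is smallest for RF and largest for SVM, in line with the discussion of Figure 5.4.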

Figure 5.4.: Cumulative error for different percentages of simulated ectopic beats. The drop in performance of each classifier (i.e. SVM, LR, RF) is shown for varying minNbeat: 10 beats (left), 30 beats (centre) and 50 beats (right), on the MIT-BIH NSR database.

In general, percentages lower than 1% do not significantly affect the performance of the single detectors or of the classifiers. It should be noted that 1% is a common upper limit of acceptance in Holter analysis: higher amounts of abnormal events are not typically used to test the robustness of a method [6], also because, for larger percentages, abnormalities are no longer considered occasional.

However, it was worthwhile to explore the impact on performance when larger amounts of ectopic beats were introduced, i.e. 9.62%, 21.16%, 34.93%. In such cases, the probability of transitions from abnormal to abnormal beats is quite high. As an example, the transition probability matrix corresponding to a mean change of 9.62% in the original RR sequence leads to the generation of repetitive PVCs (couplets and salvos), as shown in Figure 5.2 for the beat interval 150-200. Percentages of 21.16% and 34.93% are even more likely to mimic multiform PVCs, bigeminy and trigeminy: they certainly introduce, in each 24-hour record, a number of ectopic events enormously higher than the maximum number of events allowed for ectopy complexity to be classified as Grade 0 or Grade 1 according to the Lown criteria.

Ventricular extrasystoles of Class > 2 are associated with a high risk of degeneration into potentially fatal arrhythmias.

Atrial Fibrillation initiated by ectopic focal activity is, moreover, one of the three leading theories (i.e., the rapid ectopic activity theory) explaining AF pathophysiology (see Section 1.3). Therefore, in the perspective of a screening application, false positive detections caused by large percentages of abnormalities should not be considered an issue. In fact, the purpose of this study is to find the most appropriate AF detector to address the problem of undiagnosed AF: non-AF events of this type would eventually bring to the clinician's attention subjects who are at high risk of dysrhythmias anyway and who may require treatment.

6. Summary, considerations and future work

6.1. Summary and considerations

The ECG signal carries a large amount of diagnostic information for the detection of AF. For this reason, signal processing research in this field is very prolific, as shown by the large number of recent publications on AF detectors based on ECG recordings.

Unfortunately, these methods are very rarely comparable, as the claimed results are often obtained with different evaluation protocols and/or different sets of data.

This thesis has investigated several techniques to detect AF, with particular focus on detectors based on ventricular response analysis, which are the methods least confounded by artifacts and noise.

Three algorithms (CosEn, MAD, Lorenz) were selected for their promising performance and implemented in Matlab for retraining and testing on standard databases provided by Physionet. The final aim is to implement the most appropriate one on a smartphone for a screening application.

In chapter 3 the AF detectors are retrained on the MIT-BIH AF database, using a wide range of window lengths. This dataset was preferred because it mainly contains AF episodes, mostly paroxysmal, and NSR. As the final goal is to detect as many true AF events as possible while still maintaining good performance in rejecting normal sinus rhythm, this dataset provided exactly the diversity of heart rhythms required.

Training results are very promising: CosEn shows the best Sensitivity (95.04%) with a minimum AF amount of just 10 beats, while Lorenz always presents the highest Accuracy (97.80%) and Positive Predictive Value (97.58%). When varying the minimum number of AF beats in a target segment, MAD incurs the highest Total Error (7.02%) on the MIT-BIH AF database and has the largest window size. On the contrary, CosEn always presents the smallest window lengths (only 41-beat segments for minNbeat=10 beats). The good performance obtained during training is confirmed on the test sets: the MIT-BIH NSR database, containing only NSR and sporadic ectopic events, and the MIT-BIH Arrhythmia database, containing AF and other kinds of ventricular and supraventricular arrhythmias. Lorenz reaches the highest performance (99.39%) in rejecting normal sinus rhythm segments, followed by MAD and CosEn, which also show very good Specificity. On the MIT-BIH Arrhythmia database, Lorenz reaches a Sensitivity of 99.12% and a Specificity of 81.73%, followed by CosEn (Se=99.19%, Sp=74.64%) and MAD (Se=97.62%, Sp=66.67%). As expected, this database yields a high number of False Positives, mostly caused by regularly irregular rhythms such as Atrial Bigeminy, Ventricular Bigeminy and Ventricular Trigeminy. This is not to be considered an issue in the perspective of a screening application because, even though these arrhythmias are less common than AF, they are also likely to degenerate and cause complications. Therefore, it is still of interest to bring them to the attention of the clinician.
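The role of minNbeat in defining a target AF segment can be illustrated with a short Python sketch. It assumes consecutive, non-overlapping windows and a per-beat boolean AF annotation; both choices, as well as the function name `label_segments`, are assumptions made for illustration and do not describe the Matlab implementation used in this work.

```python
def label_segments(beat_is_af, window_length, min_n_beat):
    """Label consecutive, non-overlapping windows of beats.

    A window is a target AF segment (True) when it contains at least
    min_n_beat beats annotated as AF. Non-overlapping windows are an
    assumption of this sketch.
    """
    labels = []
    last_start = len(beat_is_af) - window_length
    for start in range(0, last_start + 1, window_length):
        window = beat_is_af[start:start + window_length]
        labels.append(sum(window) >= min_n_beat)
    return labels


# Toy annotation: 41 normal beats, 9 AF beats followed by 32 normal
# beats, then 41 AF beats (three windows of 41 beats in total).
beats = [False] * 41 + [True] * 9 + [False] * 32 + [True] * 41
print(label_segments(beats, window_length=41, min_n_beat=10))
# prints [False, False, True]
```

In the toy example the second window contains only 9 AF beats, so with minNbeat=10 it is still treated as a non-AF segment.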

With a view to implementation on a phone, Lorenz has the best overall performance in terms of Sensitivity, Accuracy and Positive Predictive Value. However, CosEn requires smaller window lengths, thus allowing faster detection and addressing the problem of paroxysmal AF episodes, which can be very short and have an unexpected onset.

In chapter 4 the features extracted from the HRV signal by CosEn and Lorenz are combined with other simple AF predictors, in an attempt to improve the predictive power of the single detectors. Three classifiers for supervised learning are adopted for this purpose: Logistic Regression, Random Forests and Support Vector Machines. The models are trained on the MIT-BIH AF database and a feature importance analysis is performed on the held-out data during a five-fold cross-validation: the Lorenz and CosEn features prove to play the most important role in identifying an AF segment.

By using the classifiers, results on the test sets are generally improved, in particular in terms of Sensitivity: SVM and LR reach Se=100% on series200 of the MIT-BIH Arrhythmia database, while the Specificity of RF and SVM on the MIT-BIH NSR database is 99.66% and 99.77% respectively.

In chapter 5 different percentages of ectopic beats are added to the records of the MIT-BIH NSR database, using a Hidden Markov Model approach. The resulting drop in performance is then quantified.

For percentages of ectopy within the common upper limits of Holter acceptance, the Specificity of each detector and each classifier is only very slightly affected: no significant differences exist among them in terms of increased total error. For very high percentages of ectopy (up to 34.93%), the drop in performance is considerable for all the methods, SVM in particular. RF, instead, shows a surprising capacity to reject non-AF ectopic events, with a Specificity (75.03% for minNbeat=50 beats) that is much higher than that of the other methods when the ectopy percentage is 34.93%.

6.2. Future work

Concerning the single algorithms, CosEn presents a very high Sensitivity on both training and test sets and a very small optimal window size. This would address the problem of paroxysmal AF episodes, which can be very short and have an unexpected onset. On the other hand, Lorenz presents excellent Sensitivity, Specificity and Accuracy, but for slightly larger window lengths. Furthermore, MAD and Lorenz have higher Specificity than CosEn on the NSR database.

The classifiers generally achieve a better Sensitivity on the test sets than the component AF detectors. However, the complexity of the supervised classification model should be taken into account when implementing it on a phone. Factors which control computational cost, such as the number of support vectors in an SVM, would need to be incorporated in the model selection process. Future work could also include a feature selection step, based on the feature importance analysis already performed.
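To see why the number of support vectors drives the cost of an SVM at prediction time, consider the kernel decision function f(x) = Σ c_i K(s_i, x) + b, which requires one kernel evaluation per support vector s_i. The following Python sketch assumes an RBF kernel, which is a choice made here only for illustration; the support vectors, coefficients, bias and gamma below are hypothetical toy values, not a trained model.

```python
import math


def rbf_kernel(x, z, gamma):
    """RBF kernel exp(-gamma * ||x - z||^2); the kernel choice is an assumption."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * squared_distance)


def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Evaluate f(x) = sum_i c_i * K(s_i, x) + b.

    The loop performs one kernel evaluation per support vector, so the
    cost of classifying a segment grows linearly with their number.
    """
    total = bias
    for sv, coef in zip(support_vectors, dual_coefs):
        total += coef * rbf_kernel(sv, x, gamma)
    return total


if __name__ == "__main__":
    # Hypothetical toy model with three support vectors and two features.
    svs = [[0.1, 0.9], [0.8, 0.2], [0.5, 0.5]]
    coefs = [1.0, -1.0, 0.5]
    score = svm_decision([0.2, 0.8], svs, coefs, bias=-0.1, gamma=2.0)
    print("AF" if score > 0 else "non-AF", round(score, 4))
```

A model with fewer support vectors, or with fewer features per vector, therefore requires fewer operations per classified segment, which is the quantity that matters on a phone.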

Also, it should be recalled that all the results reported in this thesis were obtained by extracting features from ECG tracings with available QRS annotations and with no significant sources of noise.

In the perspective of a smartphone application, it would be worth testing each algorithm and each classifier again with RR series obtained from a QRS detector instead of the QRS annotations, and performing a noise stress test. Quantifying the sensitivity of each algorithm to ECG noise is an important step prior to the practical application of the algorithms.

Another interesting aspect would be the use of simple signal quality indices. This would allow the impact of signal quality on AF detection performance to be evaluated, while also quantifying the minimum signal quality required for accurate AF detection.

After an appropriate noise stress test using a QRS detector on the ECG tracings, and after quantifying performance as a function of signal quality, the final step would be implementation on a mobile platform and testing on “real” patients.

Bibliography

[1] Kelley P Anderson, J Thomas Bigger, Roger A Freedman, et al. Electrocardiographic predictors in the ESVEM trial: unsustained ventricular tachycardia, heart period variability, and the signal-averaged electrocardiogram. Progress in cardiovascular diseases, 38(6):463–488, 1996.

[2] Andreas Bollmann and Federico Lombardi. Electrocardiology of atrial fibrillation. Engineering in Medicine and Biology Magazine, IEEE, 25(6):15–23, 2006.

[3] Leo Breiman. Random forests. Machine learning, 45(1):5–32, 2001.

[4] Sergio Cerutti, Luca Mainardi, and Leif Sörnmo. Understanding atrial fibrillation: the signal processing contribution. Morgan & Claypool Publishers, 2009.

[5] Chih-Chung Chang and Chih-Jen Lin. LIBSVM: a library for support vector machines. ACM Transactions on Intelligent Systems and Technology (TIST), 2(3):27, 2011.

[6] Valentina DA Corino, Frida Sandberg, Federico Lombardi, Luca T Mainardi, and Leif Sörnmo. Atrioventricular nodal function during atrial fibrillation: Model building and robust estimation. Biomedical Signal Processing and Control, 2012.

[7] David A Fitzmaurice, FD Hobbs, Sue Jowett, Jonathon Mant, Ellen T Murray, Roger Holder, JP Raftery, S Bryan, Michael Davies, Gregory YH Lip, et al. Screening versus routine practice in detection of atrial fibrillation in patients aged 65 or over: cluster randomised controlled trial. BMJ, 335(7616):383, 2007.

[8] Apoor S Gami, Dave O Hodge, Regina M Herges, Eric J Olson, Jiri Nykodym, Tomas Kara, and Virend K Somers. Obstructive sleep apnea, obesity, and the risk of incident atrial fibrillation. Journal of the American College of Cardiology, 49(5):565–571, 2007.

[9] Will Gersch, David M Eddy, and Eugene Dong. Cardiac arrhythmia classification: A heart-beat interval-Markov chain approach. Computers and Biomedical Research, 3(4):385–392, 1970.

[10] Alan S Go, Elaine M Hylek, Kathleen A Phillips, YuChiao Chang, Lori E Henault, Joe V Selby, and Daniel E Singer. Prevalence of diagnosed atrial fibrillation in adults. JAMA: the journal of the American Medical Association, 285(18):2370–2375, 2001.

[11] Ary L Goldberger, Luis AN Amaral, Leon Glass, Jeffrey M Hausdorff, Plamen Ch Ivanov, Roger G Mark, Joseph E Mietus, George B Moody, Chung-Kang Peng, and H Eugene Stanley. PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals. Circulation, 101(23):e215–e220, 2000.

[12] Michel Haissaguerre, Pierre Jaïs, Dipen C Shah, Atsushi Takahashi, Mélèze Hocini, Gilles Quiniou, Stéphane Garrigue, Alain Le Mouroux, Philippe Le Métayer, and Jacques Clémenty. Spontaneous initiation of atrial fibrillation by ectopic beats originating in the pulmonary veins. New England Journal of Medicine, 339(10):659–666, 1998.

[13] Robert G Hart, Oscar Benavente, Ruth McBride, and Lesly A Pearce. Antithrombotic therapy to prevent stroke in patients with atrial fibrillation: a meta-analysis. Annals of internal medicine, 131(7):492–501, 1999.

[14] Jan Heeringa, Deirdre AM van der Kuip, Albert Hofman, Jan A Kors, Gerard van Herpen, Bruno H Ch Stricker, Theo Stijnen, Gregory YH Lip, and Jacqueline CM Witteman. Prevalence, incidence and lifetime risk of atrial fibrillation: the Rotterdam study. European heart journal, 27(8):949–953, 2006.

[15] Kui Hong, David R Piper, Aurora Diaz-Valdecantos, Josep Brugada, Antonio Oliva, Elena Burashnikov, José Santos-de Soto, Josefina Grueso-Montero, Ernesto Diaz-Enfante, Pedro Brugada, et al. De novo KCNQ1 mutation responsible for atrial fibrillation and short QT syndrome in utero. Cardiovascular research, 68(3):433–440, 2005.

[16] M Hughes and GY Lip. Guideline development group, national clinical guideline for management of atrial fibrillation in primary and secondary care. National Institute for Health and Clinical Excellence. Stroke and thromboembolism in atrial fibrillation: a systematic review of stroke risk factors, risk stratification schema and cost effectiveness data. Thromb Haemost, 99(2):295–304, 2008.

[17] Z Ihara, V Jacquemet, J-M Vesin, and A Van Oosterom. Adaption of the standard 12-lead ECG system focusing on atrial electrical activity. In Computers in Cardiology, 2005, pages 203–205. IEEE, 2005.

[18] Business Intelligence. Data mining and optimization for decision making c.

[19] Ziad F Issa, Ziad Issa, John M Miller, and Douglas P Zipes. Clinical Arrhythmology and Electrophysiology: A Companion to Braunwald’s Heart Disease. WB Saunders Company, 2012.

[20] MV Kamath and EL Fallen. Correction of the heart rate variability signal for ectopics and missing beats. Heart rate variability. Armonk: Futura, pages 75–85, 1995.

[21] KT Konings, CJ Kirchhof, JR Smeets, HJ Wellens, Olaf C Penn, and Maurits A Allessie. High-density mapping of electrically induced atrial fibrillation in humans. Circulation, 89(4):1665–1680, 1994.

[22] Douglas E Lake and J Randall Moorman. Accurate estimation of entropy in very short physiological time series: the problem of atrial fibrillation detection in implanted ventricular devices. American Journal of Physiology-Heart and Circulatory Physiology, 300(1):H319–H325, 2011.

[23] N Larburu, T Lopetegi, and I Romero. Comparative study of algorithms for atrial fibrillation detection. In Computing in Cardiology, 2011, pages 265–268. IEEE, 2011.

[24] David Thor Linker. Long-term monitoring for detection of atrial fibrillation, December 8 2009. US Patent 7,630,756.

[25] Bernard Lown and Marshall Wolf. Approaches to sudden death from coronary heart disease. Circulation, 44(1):130–142, 1971.

[26] R Mark and G Moody. MIT-BIH arrhythmia database directory. Cambridge: Massachusetts Institute of Technology, 1988.

[27] Gordon K Moe. On the multiple wavelet hypothesis of atrial fibrillation. Arch Int Pharmacodyn Ther, 140:183, 1962.

[28] George B Moody. WFDB applications guide.

[29] George B Moody and Roger G Mark. A new method for detecting atrial fibrillation using RR intervals. Computers in Cardiology, 10:227–230, 1983.

[30] Patrick S Moran, Martin J Flattery, Conor Teljeur, Mairin Ryan, and Susan M Smith. Effectiveness of systematic screening for the detection of atrial fibrillation. The Cochrane Library, 2012.

[31] Niamh F Murphy, Colin R Simpson, Pardeep S Jhund, Simon Stewart, Michelle Kirkpatrick, Jim Chalmers, Kate MacIntyre, and John JV McMurray. A national survey of the prevalence, incidence, primary care burden and treatment of atrial fibrillation in Scotland. Heart, 93(5):606–612, 2007.

[32] Andrius Petrenas, Vaidotas Marozas, L Sornmo, and A Lukosevicius. An echo state neural network for QRST cancellation during atrial fibrillation. 2012.

[33] Bruce M Psaty, Teri A Manolio, Lewis H Kuller, Richard A Kronmal, Mary Cushman, Linda P Fried, Richard White, Curt D Furberg, and Pentti M Rautaharju. Incidence of and risk factors for atrial fibrillation in older adults. Circulation, 96(7):2455–2461, 1997.

[34] Lawrence R Rabiner. A tutorial on hidden Markov models and selected applications in speech recognition. Proceedings of the IEEE, 77(2):257–286, 1989.

[35] Andrew N Redington, Glen van Arsdell, and Robert H Anderson. Congenital diseases in the right heart. Springer, 2008.

[36] Shantanu Sarkar, David Ritscher, and Rahul Mehra. A detector for a chronic implantable atrial tachyarrhythmia monitor. Biomedical Engineering, IEEE Transactions on, 55(3):1219–1224, 2008.

[37] Lisong Shi, Cong Li, Chuchu Wang, Yunlong Xia, Gang Wu, Fan Wang, Chengqi Xu, Pengyun Wang, Xiuchun Li, Dan Wang, et al. Assessment of association of rs2200733 on chromosome 4q25 with atrial fibrillation and ischemic stroke in a Chinese Han population. Human genetics, 126(6):843–849, 2009.

[38] Michael O Sweeney, Arthur J Moss, and Shirley Eberly. Instantaneous cardiac death in the posthospital period after acute myocardial infarction. The American journal of cardiology, 70(18):1375–1379, 1992.

[39] Thomas J Wang, Martin G Larson, Daniel Levy, Ramachandran S Vasan, Eric P Leip, Philip A Wolf, Ralph B D'Agostino, Joanne M Murabito, William B Kannel, and Emelia J Benjamin. Temporal relations of atrial fibrillation and congestive heart failure and their joint influence on mortality. Circulation, 107(23):2920–2925, 2003.

[40] Peter H Whincup, Goya Wannamethee, Peter W Macfarlane, Mary Walker, and A Gerald Shaper. Resting electrocardiogram and risk of coronary heart disease in middle-aged British men. Journal of cardiovascular risk, 2(6):533–543, 1995.

[41] AC Wilson and JB Kostis. The prognostic significance of very low frequency ventricular ectopic activity in survivors of acute myocardial infarction. BHAT study group. CHEST Journal, 102(3):732–736, 1992.

[42] W Zong, GB Moody, and D Jiang. A robust open-source algorithm to detect onset and duration of QRS complexes. In Computers in Cardiology, 2003, pages 737–740. IEEE, 2003.

A. Appendix

A.1. MIT-BIH database

A.1.1. The annotation definitions

NORMAL     N    Normal beat
LBBB       L    Left bundle branch block beat
RBBB       R    Right bundle branch block beat
BBB        B    Bundle branch block beat (unspecified)
APC        A    Atrial premature beat
ABERR      a    Aberrated atrial premature beat
NPC        J    Nodal (junctional) premature beat
SVPB       S    Supraventricular premature or ectopic beat (atrial or nodal)
PVC        V    Premature ventricular contraction
RONT       r    R-on-T premature ventricular contraction
FUSION     F    Fusion of ventricular and normal beat
AESC       e    Atrial escape beat
NESC       j    Nodal (junctional) escape beat
SVESC      n    Supraventricular escape beat (atrial or nodal)
VESC       E    Ventricular escape beat
PACE       P    Paced beat
PFUS       f    Fusion of paced and normal beat
UNKNOWN    Q    Unclassifiable beat
LEARN      ?    Beat not classified during learning

Table A.1.: Beat annotation keys.

(AB      Atrial bigeminy
(AFIB    Atrial fibrillation
(AFL     Atrial flutter
(B       Ventricular bigeminy
(BII     2° heart block
(IVR     Idioventricular rhythm
(N       Normal sinus rhythm
(NOD     Nodal (A-V junctional) rhythm
(P       Paced rhythm
(PREX    Pre-excitation (WPW)
(SBR     Sinus bradycardia
(SVTA    Supraventricular tachyarrhythmia
(T       Ventricular trigeminy
(VFL     Ventricular flutter
(VT      Ventricular tachycardia

Table A.2.: Rhythm annotation keys.


B. Appendix

B.1. SVM results including MAD

The following results on the MIT-BIH NSR database and on the MIT-BIH Arrhythmia database were obtained by using all the features listed in Section 4.2, including MAD. These results were obtained as explained in Subsection 4.5.2. The optimal WL is 126 beats. The window sizes searched were [12 : 3 : 300].
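For readers less familiar with the notation, the searched window sizes can be enumerated as follows. The sketch assumes that [12 : 3 : 300] is a Matlab-style start:step:stop range with the end point included, which is an interpretation rather than something stated explicitly here.

```python
# Matlab-style 12:3:300, end point included (assumed interpretation).
window_lengths = list(range(12, 301, 3))

print(len(window_lengths))    # number of candidate window sizes
print(window_lengths[:3], window_lengths[-1])
print(126 in window_lengths)  # the reported optimal WL lies on this grid
```

Under this reading the grid contains 97 candidate window lengths, from 12 to 300 beats, and includes the optimal value of 126 beats.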

(a)
             minNbeat=10 beats
Sp (%)       98.91
Err (%)      1.09

(b)
series100    minNbeat=10 beats
Sp (%)       99.41
Err (%)      0.59

(c)
series200    minNbeat=10 beats
Se (%)       95.10
Sp (%)       79.68
Acc (%)      82.95
PPV (%)      55.75
NPV (%)      98.37
Err (%)      17.05

Table B.1.: Results for LIBSVM, including MAD, on the MIT-BIH NSR database (B.1a), on the MIT-BIH Arrhythmia database series100 (B.1b) and series200 (B.1c), using minNbeat=10 beats.
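The figures in Table B.1c are linked by the standard definitions of the performance measures, which allows a quick consistency check. The Python sketch below uses only the reported Se, Sp and Acc: it first derives the fraction p of AF segments implied by Acc = p·Se + (1 − p)·Sp, then recomputes PPV and NPV from it. The function names are introduced here for illustration only.

```python
def prevalence_from_accuracy(se, sp, acc):
    """Fraction of AF segments implied by Acc = p*Se + (1 - p)*Sp."""
    return (acc - sp) / (se - sp)


def predictive_values(se, sp, p):
    """PPV and NPV from Sensitivity, Specificity and AF prevalence p."""
    ppv = p * se / (p * se + (1 - p) * (1 - sp))
    npv = (1 - p) * sp / ((1 - p) * sp + p * (1 - se))
    return ppv, npv


se, sp, acc = 0.9510, 0.7968, 0.8295  # Table B.1c, as fractions
p = prevalence_from_accuracy(se, sp, acc)
ppv, npv = predictive_values(se, sp, p)

print(f"implied fraction of AF segments: {p:.3f}")
print(f"PPV: {100 * ppv:.2f} %   NPV: {100 * npv:.2f} %")
print(f"Err: {100 * (1 - acc):.2f} %")
```

The recomputed PPV and NPV agree with the tabulated 55.75% and 98.37% to within rounding, and Err is simply 100% − Acc. The low PPV despite the high Sensitivity reflects the fact that only about one fifth of the segments in series200 are AF under this calculation.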
