machine learning for healthcare: infectious and ... · 5/29/2019  · introductionexisting...

18
Introduction Existing methods Proposed Datasets Results Conclusion References Machine Learning for Healthcare: Infectious and Cardiovascular Diseases AI for Good Global Summit, Geneva, Switzerland Girmaw Abebe Tadesse Post-doctoral Researcher, University of Oxford 29 May 2019

Upload: others

Post on 22-Jul-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Machine Learning for Healthcare:Infectious and Cardiovascular Diseases

AI for Good Global Summit, Geneva, Switzerland

Girmaw Abebe Tadesse

Post-doctoral Researcher, University of Oxford

29 May 2019

Page 2: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Infectious diseases

Tetanus and hand, foot and mouth (HFM) diseases still poselife-threatening risks in low and middle income countries

The WHO estimated that 34,019 newborns died from neonataltetanus alone in 2015

Autonomic nervous system dysfunction (ANSD) is the main cause ofdeath by infectious diseases

Early detection of ANSD may allow more efficient use of resources

Page 3: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Wearable devices and physiological waveforms

(a) E-patch (b) Pulse Oximeter

(c) ECG waveforms (d) PPG waveforms

Figure: Wearable devices and collected ECG and PPG waveforms.

Page 4: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Methods ...

Main steps: acquisition and preprocessing, feature extraction andclassification

Page 5: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

DatasetsTetanus: collected in Hospital for Tropical Diseases (2a)Hand, foot and mouth disease (HFM): collected in Children HospitalNo. 1 (2b)Both in Ho Chi Minh City, Vietnam

(a) Hospital forTropical Diseases

(b) Children HospitalNo. 1

Page 6: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Mild-Severe Tetanus classification

Table: Mild and Severe level classification of tetanus patients.

SVM (%) - Gaussian Kernel

Data A P R S F1

Baseline-ECG[8] 73.9 ± 0.9 75.48 ± 1.5 77.5 ± 0.4 67.73 ± 3.1 76.48 ± 0.6

IP 65.7 ± 1.3 63.2 ± 1.0 94.7 ± 0.2 27.8 ± 2.8 75.8 ± 0.7PPG 70.2 ± 1.0 70.4 ± 0.8 92.6 ± 0.3 29.5 ± 2.9 80.0 ± 0.5ECG 80.2± 0.7 78.4± 0.9 95.3 ± 0.5 53.4± 2.5 86.0± 0.4ECG+PPG 78.2 ± 1.0 75.3 ± 1.0 98.1± 0.0.3 43.1 ± 3.3 85.2 ± 0.6

Page 7: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Cardiovascular diseases

Cardiovascular diseases (CVDs) cause 17.7 million of deaths annuallyacross the globe, according to the WHO

CVD is especially prevalent in developing countries such as China,accounting for 45% of all deaths annually, with myocardial infarction(MI) as the most common cause

CVDs are commonly diagnosed by cardiologists via inspectingelectrocardiogram (ECG) waveforms

Data-driven approaches can help decision making processes in lessresource-constrained settings

Page 8: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Handcrafted features

Previous automatic approaches involve the extraction ofdomain-specific features on ECG waveforms [2-5]Lack robustness and generalisability

Deep frameworks

Recently become effective for CVDs [6-10]Features learned from data, and avoid feature engineeringGiven enough data, can generalize betterProvides transfer learning capabilityFusion of multiple leads?

Page 9: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Proposed framework

Figure: Overview of the proposed approach

Existing deep netowrk: Inception module (parallel streams)

Page 10: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Stacked spectrogram generation

Figure: Stacking of spectrograms from multiple leads

Page 11: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Spectrogram stacking orders

(a) Order-I (b) Order-II

(c) c©ecgpedia.org (d) Order-III

Figure: G1 = (I , II , III ), G2 = (V 1,V 2,V 3), G3 = (V 4,V 5,V 6) andG4 = (aVL, aVR, aVF ). Order-I = (G1,G2,G3,G4) andOrder-II = (G1,G4,G2,G3). Order-III as Order-II but stacked horizontally

2012 AlexNet

Page 12: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Spectrogram stacking orders

(a) Order-I (b) Order-II

(c) c©ecgpedia.org (d) Order-III

Figure: G1 = (I , II , III ), G2 = (V 1,V 2,V 3), G3 = (V 4,V 5,V 6) andG4 = (aVL, aVR, aVF ). Order-I = (G1,G2,G3,G4) andOrder-II = (G1,G4,G2,G3). Order-III as Order-II but stacked horizontally

2012 AlexNet

Page 13: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Datasets and

GGH-MI

Collected in southern areas of China12-lead (+3) ECG waveforms from 15578 MI and 5663 normal caseswith a duration of 10 s sampled at 500 HzAfter preprocessing: 11,853 MI and 5,528 normal patients

Parameter setup

Spectrogram: Hamming window, STFT, 1 s chunk, 95% overlappingCNN: pool 3:0 of Inception v3 (2048-D), ReLU activated dense layer(10-D), learning rate = 0.001; Adam optimiser, batch size= 128Split: 80%-train and 20%-testEnvironment: Tensorflow (Python 3.5) and MATLAB 2017 (SVM)

Page 14: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Table: Order-III stacking validated with SVM.

Methods Accuracy (%)

SVM (Linear) 80.1 ± 0.0SVM (Gaussian) 82.8 ± 0.1EE (Softmax) 85.8± 0.9EE (HL+Softmax) 85.5 ± 0.4

Figure: Different stacking orders.

Page 15: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Conclusion

Proposed an end-to-end trainable pipeline for the diagnosis of CVDs(from ECG waveforms)

Stacked spectrogram representation enables transfer learning

Experimented the effect of different stacking orders

Limitation: High-dimension of Inception feature

Future work: Auto-encoders, data augmentation, etc.

Page 16: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

Acknowledgements

(a) Prof. David Clifton

(b) Dr Tingting Zhu

(c) Dr LouiseThwaites

(d) Dr Le Nhan

Page 17: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

References (1)

[1] World Health Organization (WHO), Cardiovascular diseases (CVDs),www.who.int/news-room/fact-sheets/detail/cardiovascular-diseases-(cvds), Last accessed on 23 April 2019.[2] Heart Rate Variability, Standards of measurement, physiological interpretation, and clinical use. task force of the europeansociety of cardiology and the north american society of pacing and electrophysiology, Circulation, vol. 93, no. 5, pp. 10431065,1996.[3] Steven A Guidera and Jonathan S Steinberg, The signal-averaged P wave duration: a rapid and noninvasive marker of risk ofatrial fibrillation, Journal of the American College of Cardiology, vol. 21, no. 7, pp. 16451651, 1993.[4] Sarabjeet Mehta, Nitin Lingayat, and Sanjeev Sanghvi, Detection and delineation of P and T waves in 12-leadelectrocardiograms, Expert Systems, vol. 26, no. 1, pp. 125143, 2009.[5] S Dash, KH Chon, S Lu, and EA Raeder, Automatic real time detection of atrial fibrillation, Annals of biomedicalengineering, vol. 37, no. 9, pp. 17011709, 2009.[6] Pranav Rajpurkar, Awni Y Hannun, Masoumeh Haghpanahi, Codie Bourn, and Andrew Y Ng, Cardiologist-level arrhythmiadetection with convolutional neural networks,arXiv preprint arXiv:1707.01836, 2017.[7] Z. Zhao, S. Sarkka, and A. B. Rad, Spectro-temporal ECG analysis for atrial fibrillation detection, in In Proc. of IEEEWorkshop on Machine Learning for Signal Processing (MLSP), 2018, pp. 16.[8] X. Fan, Q. Yao, Y. Cai, F. Miao, F. Sun, and Y. Li, Multiscaled fusion of deep convolutional neural networks for screeningatrial fibrillation from single lead short ecg recordings, IEEE Journal of Biomedical and Health Informatics, vol. 22, no. 6, pp.17441753, 2018.

Page 18: Machine Learning for Healthcare: Infectious and ... · 5/29/2019  · IntroductionExisting methodsProposedDatasetsResultsConclusionReferences Datasets and GGH-MI Collected in southern

Introduction Existing methods Proposed Datasets Results Conclusion References

References 2)

[9] Bahareh Pourbabaee, Mehrsan Javan Roshtkhari, and Khashayar Khorasani, Deep convolutional neural networks and learningECG features for screening paroxysmal atrial fibrillation patients, IEEE Transactions on Systems, Man, and Cybernetics:Systems, , no. 99, pp. 110, 2017.[10] Zhenjie Yao, Zhiyong Zhu, and Yixin Chen, Atrial fibrillation detection by multi-scale convolutional neural networks, inProc. of IEEE International Conference on Information Fusion (Fusion), 2017, pp. 16.[11] Ming-Zher Poh, Yukkee Cheung Poh, Pak-Hei Chan, Chun-Ka Wong, Louise Pun, Wangie Wan-Chiu Leung, Yu-Fai Wong,Michelle Man Ying Wong, Daniel Wai-Sing Chu, and Chung-Wah Siu, Diagnostic assessment of a deep learning system fordetecting atrial fibrillation in pulse waveforms, Heart, 2018.[12] Supreeth Prajwal Shashikumar, Amit J Shah, Qiao Li, Gari D Clifford, and Shamim Nemati, A deep learning approach tomonitoring and detecting atrial fibrillation using wearable technology, in Proc. of IEEE EMBS International Conference onBiomedical & Health Informatics (BHI), 2017, pp. 141144.[13] Jia Deng, Wei Dong, Richard Socher, Li-Jia Li, Kai Li, and Li FeiFei, ImageNet: A large-scale hierarchical image database,in Proc. of IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2009, pp. 248255.[14] Girmaw Abebe and Andrea Cavallaro, Inertial-vision: cross-domain knowledge transfer for wearable sensors, in Proc. ofInternational Conference on Computer Vision Workshops (ICCVW), Venice, Italy, 2017, vol. 7.[15] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, VincentVanhoucke, and Andrew Rabinovich, Going deeper with convolutions, in Proc. of the IEEE conference on computer vision andpattern recognition, 2015, pp. 19.