
A Deep Convolutional Neural Network Approach to Classify Normal and Abnormal Gastric Slow Wave Initiation from the High Resolution Electrogastrogram

Anjulie S. Agrusa, Student Member, IEEE, Armen A. Gharibans, Member, IEEE, Alexis A. Allegra, Student Member, IEEE, David C. Kunkel, and Todd P. Coleman, Senior Member, IEEE

Abstract—Objective: Gastric slow wave abnormalities have been associated with gastric motility disorders. Invasive studies in humans have described normal and abnormal propagation of the slow wave. This study aims to disambiguate the abnormally functioning wave from the normal one using multi-electrode abdominal waveforms of the electrogastrogram (EGG). Methods: Human stomach and abdominal models are extracted from computed tomography scans. Normal and abnormal slow waves are simulated along stomach surfaces. Current dipoles at the stomach's surface are propagated to virtual electrodes on the abdomen with a forward model. We establish a deep convolutional neural network (CNN) framework to classify normal and abnormal slow waves from the multi-electrode waveforms. We investigate the effects of non-idealized measurements on performance, including shifted electrode array positioning, smaller array sizes, high body mass index (BMI), and low signal-to-noise ratio (SNR). We compare the performance of our deep CNN to a linear discriminant classifier using wave propagation spatial features. Results: A deep CNN framework demonstrated robust classification, with accuracy above 90% for all SNR above 0 dB, horizontal shifts within 3 cm, vertical shifts within 6 cm, and abdominal tissue depth within 6 cm. The linear discriminant classifier was much more vulnerable to SNR, electrode placement, and BMI. Conclusion: This is the first study to attempt, and moreover succeed, in using a deep CNN to disambiguate normal and abnormal gastric slow wave patterns from high-resolution EGG data. Significance: These findings suggest that multi-electrode cutaneous abdominal recordings have potential to serve as widely deployable clinical screening tools for gastrointestinal foregut disorders.

Index Terms—Artificial Intelligence, Convolutional Neural Network, Electrogastrogram, Forward Model, Gastroenterology, Gastric Slow Wave, Machine Learning, Neural Network, Video Classification

I. INTRODUCTION

GASTRIC motility and functional disorders are abundantly common, with functional dyspepsia and gastroparesis, the most common of such disorders, affecting 10% and 1.5-3% of the population, respectively [1], [2]. One way in which these disorders are currently classified is based on symptom criteria, an insufficient metric due to patient subjectivity and symptom overlap between varying disease etiologies [3]. The clinical gold standard for diagnosing upper GI disorders is gastric emptying, which typically involves imaging after ingestion of a meal containing a radioactive tracer. Gastroparesis, or delayed gastric emptying, is diagnosed if an insufficient amount of the tracer has emptied out of the stomach, and occurs in a majority of Parkinson's and diabetes patients [4], [5]. However, published findings have had limited success in demonstrating correlation between gastric emptying and symptoms [6] or symptom improvement [7]. As some drugs improve symptoms but not gastric emptying and vice versa [8], [9], [10], the NIH Gastroparesis Consortium has recently recommended that improvement in gastric emptying not be considered a requirement for clinical drug trials in gastroparesis [6].

Submitted for review on 11/26/2018. This work was supported in part by the NSF Center for Science of Information (CSoI) under grant agreement CCF-0939370. A. S. Agrusa, A. A. Gharibans, and T. P. Coleman are with the Department of Bioengineering, University of California, San Diego, La Jolla, CA 92093, USA. A. A. Allegra is with the Department of Electrical and Computer Engineering, University of California, San Diego. D. C. Kunkel is with the Division of Gastroenterology, Department of Medicine, University of California, San Diego. Copyright (c) 2017 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to [email protected].

It has been proposed that gastric motility disorders (including gastroparesis) should be sub-typed to improve treatment efficacy [11], [12]. One such subtype arises from abnormalities in the neuromuscular patterns of the gut. Much like the brain and heart, the gastrointestinal (GI) tract has rhythmic electrical activity, governed by the enteric nervous system. The electrical wave propagating along the serosal (outer) surface of the stomach, the gastric slow wave, oscillates at approximately 3 cycles per minute (0.05 Hz) and coordinates the smooth muscle contractions of peristalsis during digestion [13]. Recent findings from invasive recordings with electrodes placed directly on the stomach surface [14], [15] describe a normally functioning slow wave, with initiation in the pacemaker region and anterograde propagation of equipotential rings, as well as abnormal initiation, with slow waves generated outside the pacemaker region and propagating simultaneously in both anterograde and retrograde directions (Fig. 1).

This archetypal dysrhythmia, as well as others, has been associated with gastric functional and motility disorders, including gastroparesis and functional dyspepsia [16], [17], [18], [19], [20], [21], [22]. Other work, also involving electrodes placed directly on the surface of the stomach, has shown the ability of gastric slow wave abnormalities to predict states of nausea and vomiting [23]. Recent advances in the development of gastric pacemaking devices [24], [25], [26], [27] that can initiate, entrain and/or normalize propagation of the gastric slow wave have added impetus to the identification of patients with gastric arrhythmias who may benefit from an electrical pacing intervention. While monitoring the slow wave with electrodes placed on the stomach during invasive surgery is not clinically scalable as a screening tool, it does provide validation for electrode-based characterization of the slow wave.

Fig. 1. Anterior view of stomach model extracted from human research subject's abdominal CT scan. a) Anatomical regions of the stomach with pacemaker region highlighted in gray. b) Normal gastric slow wave initiation in the pacemaker region and propagation of equipotential rings (dashed) in anterograde fashion toward the intestinal end. c) Abnormal initiation of the gastric slow wave outside of the pacemaker region. Propagation of distinct equipotential rings in the anterograde direction (left ring) and retrograde direction (right ring). These initiation and propagation patterns were characterized from invasive studies [14], [15].

There is an unmet need to develop a widely-deployable screening tool that is i) non-invasive, ii) able to make direct claims on gastric myoelectric dysfunction, and iii) able to guide effective treatments. As with the brain and heart, waves generated at the stomach's surface propagate to the skin via volume conduction. These voltages can be measured with cutaneous electrodes; in gastric electrophysiology, this recording is the electrogastrogram (EGG). Traditionally, the EGG is comprised of a small number of electrodes (typically 3-4) and spectral analysis within the 0.05 Hz frequency range is performed [13]. Whereas some findings based upon these spectral analyses have shown separation between patients with GI symptoms and healthy controls [28], [29], [30], others have found no such differences [31], [32].

Although it is non-invasive, the EGG has not encountered widespread clinical adoption because of these contradictory findings and because of a lack of consistency with invasive clinical gold standards [33], [34]. Numerous findings have shown that spatial dysrhythmias, including the abnormal initiation, typically occur within the normal frequency range of the gastric slow wave (0.05 Hz) [15], [35], [36]. As such, current literature has converged on the notion that spectral-based analyses are unable to capture these spatial abnormalities [37], [38]. This may explain why spectral analyses have been inconsistent in their findings.

Since the most significant features of the slow wave in terms of classifying spatial abnormalities are its initiation location and propagation direction, it is reasonable to hypothesize that a multi-electrode recording system with higher spatial resolution may be beneficial in detecting such abnormalities. Indeed, recently the high-resolution EGG (HR-EGG), a multi-electrode array of 25 or more electrodes, has been shown to capture slow waves with high spatial resolution and extract meaningful spatial (as opposed to spectral) features, including the instantaneous slow wave direction at any given point in space [39]. With the advent of these new techniques, in addition to novel artifact rejection methods [40], spatial features have been shown to correlate with symptom incidence and severity in gastroparesis and functional dyspepsia patients [41].

Going a step beyond symptom correlation, it has yet to be determined whether one can distinguish a normal from an abnormal slow wave with HR-EGG recordings. Since the HR-EGG is a recent development, modern machine learning has seldom been employed to analyze EGG waveforms and attempt ‘normal’ and ‘abnormal’ classification. Training a machine learning algorithm requires that data be labeled with ground-truth metrics. Unlike in the field of cardiology, where electrophysiological abnormalities can be detected by visual inspection of an ECG, there are no current methods to obtain cutaneous EGG recordings labeled with spatial gastric slow wave abnormalities. Symptom reports, for example, cannot be used to label data, as symptoms can arise from a variety of disease etiologies [3], only one of which is dysrhythmia of the enteric nervous system. In addition, a small subset of patients with diagnosed functional and motility disorders lack spatial slow wave abnormalities [15], so clinical diagnoses cannot be used to robustly label the data either. Currently accepted approaches to label these spatial abnormalities involve placing electrodes directly on the serosal surface of the stomach during an open abdominal surgical procedure [14], [15]. Because of this, simultaneous cutaneous and serosal recording has yet to be performed in humans. Thus, a simulation for which ground truth labels are established a priori makes possible the training of a machine learning algorithm with multi-electrode EGG waveforms.

The simulation of electrical conduction patterns on an organ's surface is not unprecedented. Recent studies in cardiology, for example, simulate abnormal cardiac electrophysiology along with normal cardiac function on subject-specific MRI-constructed and CT-constructed heart models to perform subsequent analysis, some of which even simulate volume conduction and propagation of voltages to virtual cutaneous electrodes [42], [43], [44]. With an analogous methodology, in this work we simulate underlying conduction patterns on the stomach and propagate voltages to virtual cutaneous electrodes in a human subject-specific manner. Once labeled waveforms are obtained, machine learning is possible. Machine learning studies in cardiology [45], [46], [47], [48], [49] follow the premise that within the same class, the conduction pattern on the organ is consistent, and across different classes, the conduction pattern is different. This same paradigm can be applied to gastric electrophysiology.

Recent advances in deep learning have enabled breakthrough performance in healthcare applications [50], including the analysis of physiological signals such as the electrocardiogram [45], [46], [47], electroencephalogram [51], [52], [53], and electromyogram [54], [55]. We hypothesize that such tools can also be used to accurately classify normal versus abnormal slow wave propagation from cutaneous HR-EGG recordings, with robustness to high anatomical variability between subjects, including BMI, stomach shape, and stomach position relative to cutaneous landmarks. Specifically, we hypothesize that a three-dimensional convolutional neural network (3D CNN) will succeed in such a classification task, with reasoning as follows.

The state-of-the-art class of neural network architecture used in image-classification tasks (e.g., classifying animal images) is a convolutional neural network (CNN) [56], [57], [58], [59]. Similarly, three-dimensional CNNs are an accepted best practice in video-classification tasks [60], [61], [62]. In a video recognition task, the 3D CNN ‘sees’ the video as an ‘N’ by ‘N’ grid of discrete pixels with varying intensity values over time. The data collected by a square multi-electrode array, as seen in this study, is an ‘N’ by ‘N’ grid of voltage values over time. Here, we make the novel analogy of a multi-electrode recording to a video so that we can subsequently apply these video classification techniques from deep learning.

In this paper, we establish a deep CNN framework to disambiguate normal from abnormal slow wave initiation with cutaneous multi-electrode data. We also investigate the effects of non-idealized measurements on classification accuracy, which include shifted array positioning with respect to the true center of the stomach, smaller array sizes, high BMI, and low signal-to-noise ratio (SNR). Finally, we compare the performance of our deep CNN to a traditional machine learning approach using spatial HR-EGG features of slow wave propagation. These objectives are carried out through a series of simulations of the normal and abnormal slow wave on a variety of computed tomography (CT)-extracted stomachs from human research subjects, propagated to virtual electrodes on the surface of subject-specific abdominal models.

With an ECG, identification of certain cardiac abnormalities can be done via visual inspection of the waveform. This is not the case for the EGG. Therefore, the methodology presented in this study involving machine learning of features was necessary in order to separate normal and spatially abnormal EGG waveforms. This is the first study to attempt, and moreover succeed, in using a deep CNN to disambiguate normal and abnormal gastric slow wave patterns with HR-EGG waveforms. These findings suggest that the use of multi-electrode cutaneous abdominal recordings combined with machine learning algorithms may serve as a promising avenue of further research in developing widely-deployable clinical screening tools for gastrointestinal foregut disorders.

II. METHODS

A. The Forward Model

We simulated two modalities of the gastric slow wave on stomach geometry extracted from CT scans from 40 human research subjects: (i) normal initiation and propagation, and (ii) abnormal initiation with bifurcated retrograde and anterograde propagation (Fig. 1b-c). First, we defined a novel ‘Medial Curve’ that captured the stomach's characteristic shape (Section II-A1). We then sliced the stomach geometry into thin planes, with each plane containing a discrete point on the Medial Curve and oriented organoaxially (such that a normal vector to the plane was parallel to the Medial Curve's tangent vector at the aforementioned point). We solved for the serosal stomach voltage at each point on the Medial Curve, then applied it to all points within the corresponding slice (Section II-A2), creating equipotential rings across the entire stomach surface. Next, we calculated subject-specific abdominal boundaries (Section II-A3) and propagated the serosal voltages to electrodes placed at these abdominal boundaries (Section II-A4). Finally, we perturbed the simulations by varying electrode placement, abdominal tissue depth, signal-to-noise ratio (SNR), and electrode array size, with combinations of these yielding more than 2,400 independent simulations from each stomach model (Section II-A5). Ethical approval for this work was obtained from the institutional review board at the University of California, San Diego.

1) Defining the Medial Curve: The Medial Curve provided the backbone for traveling slow wave propagation. This curve, C : [−1, 1] → ℝ³, is a collection of (x, y, z) points on the surface of the stomach, parameterized from the esophageal to the pyloric end, that traces the stomach's unique characteristic shape (Fig. 2). To find this curve, we manually extracted and voxelized the three-dimensional stomach model from CT images [63], [64]. The voxelized representation of the stomach was iteratively thinned to its ‘geometric skeleton’ [65], [66] (Supplemental Materials Section A), which is a set, B = {p_1, p_2, …, p_M}, of M points where each p_m ∈ ℝ³. The geometric skeleton preserves stomach topography and consists of a central spine and branches, similar to the vasculature of a leaf. The spine of the geometric skeleton best approximates the characteristic shape of the stomach. M was 5374 points, on average. Although, by visual inspection, the spine of the geometric skeleton roughly traces the organoaxis of the stomach, it is unknown which points in B comprise the spine and which comprise the branches. Furthermore, B is not parameterized from the esophageal end to the pyloric end. Considering this, and since gastric slow wave propagation occurs organoaxially, we developed a method using all points in B to construct a continuous and differentiable function C(ζ) that roughly traces the organoaxis of the stomach.

Fig. 2. Medial Curve (black) that captures the characteristic shape of the stomach, solved for with the optimization paradigm defined in (3). Units in mm.

The Medial Curve, C(ζ), was constructed as a linear combination of Legendre polynomials (φ_k : k ≥ 0):

$$\phi_0(\zeta) = 1, \quad \phi_1(\zeta) = \zeta, \quad \phi_2(\zeta) = \tfrac{1}{2}\left(3\zeta^2 - 1\right), \quad \ldots$$

We chose to construct C(ζ) using Legendre polynomials because each is continuous and differentiable, and together they form an orthogonal basis of functions on the [−1, 1] interval. As such, the weighted combination of the Legendre polynomials used to define the Medial Curve is still continuous and differentiable. Continuity allowed for mapping of slow wave propagation along the organoaxis. Differentiability allowed us to build equipotential rings as thin planes comprising a point on the Medial Curve and its associated tangent vector, for full serosal propagation.

A vector-valued coefficient g_k ∈ ℝ³ is associated with the kth Legendre polynomial φ_k. Defining the composite matrix G with these vector-valued coefficients as column vectors, G = [g_0 g_1 … g_{K−1}], we can succinctly describe the curve for any matrix G as:

$$C(\zeta; G) = \sum_{k=0}^{K-1} g_k \phi_k(\zeta). \tag{1}$$
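As an illustration of (1), the following is a minimal sketch of how the curve and its tangent could be evaluated, assuming NumPy's `numpy.polynomial.legendre` module. The function names `medial_curve` and `medial_curve_tangent` and the array layout of `G` (shape 3 × K, one column per coefficient vector) are illustrative choices, not taken from the original implementation.

```python
import numpy as np
from numpy.polynomial import legendre


def medial_curve(zeta, G):
    """Evaluate C(zeta; G) = sum_k g_k * phi_k(zeta) from (1).

    G has shape (3, K); column k is the coefficient vector g_k.
    Returns an array of shape (len(zeta), 3).
    """
    zeta = np.atleast_1d(np.asarray(zeta, dtype=float))
    K = G.shape[1]
    # Rows of Phi are [phi_0(zeta), ..., phi_{K-1}(zeta)].
    Phi = legendre.legvander(zeta, K - 1)
    return Phi @ G.T


def medial_curve_tangent(zeta, G):
    """Evaluate dC/dzeta, i.e. the normal of the slicing plane at zeta."""
    zeta = np.atleast_1d(np.asarray(zeta, dtype=float))
    # Differentiate the Legendre series of each coordinate (row of G).
    dG = legendre.legder(G, axis=1)
    # legval returns shape (3, len(zeta)) for 2-D coefficients.
    return legendre.legval(zeta, dG.T).T
```

Because the tangent is obtained analytically from the same coefficients, no finite-difference approximation along the curve is needed when orienting the equipotential rings.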

There exists a matrix, G∗, for which C(ζ;G∗) best fits the spine of the geometric skeleton. To find G∗, we defined the objective function J_B(G) to be minimized in (2) below. It was necessary to incorporate both (2a) and (2b) into J_B(G) in order to ensure that each point on the geometric spine was near some point on the Medial Curve, (2a), and each point on the Medial Curve was near some point on the geometric spine, (2b). Without the inclusion of (2b), there are an infinite number of curves, C(ζ), for which (2a) holds that extend past the esophageal and pyloric boundaries.

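$$
\begin{aligned}
J_B(G) = {} & \left( \sum_{m=1}^{M} \min_{\zeta \in [-1,1]} \left\| C(\zeta) - p_m \right\|_2^2 \right) && \text{(2a)} \\
& + \left( \int_{-1}^{1} \min_{m \in \{1,\ldots,M\}} \left\| C(\zeta) - p_m \right\|_2^2 \, d\zeta \right) && \text{(2b)}
\end{aligned}
$$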

We solved for G∗ by minimizing (2), using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm:

$$G^* = \arg\min_{G \in \mathbb{R}^{3 \times K}} J_B(G). \tag{3}$$

The optimal Medial Curve was given by C∗(ζ) ≡ C(ζ;G∗).
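The two terms of (2) can be approximated numerically by sampling ζ on a grid. The sketch below is one possible discretization, not the authors' code: the grid size `n_grid`, the trapezoidal rule for (2b), and the function name `medial_curve_objective` are assumptions made for illustration.

```python
import numpy as np
from numpy.polynomial import legendre


def medial_curve_objective(G, skeleton, n_grid=201):
    """Grid-based approximation of J_B(G) in (2).

    G        : (3, K) Legendre coefficient matrix.
    skeleton : (M, 3) points p_m of the geometric skeleton B.
    n_grid   : number of zeta samples on [-1, 1] (assumed value).
    """
    zeta = np.linspace(-1.0, 1.0, n_grid)
    curve = legendre.legvander(zeta, G.shape[1] - 1) @ G.T  # (n_grid, 3)
    # Squared distance between every curve sample and skeleton point.
    diff = curve[:, None, :] - skeleton[None, :, :]
    d2 = (diff ** 2).sum(axis=2)                            # (n_grid, M)
    # (2a): each skeleton point should be near some curve point.
    term_a = d2.min(axis=0).sum()
    # (2b): each curve point should be near some skeleton point;
    # the integral over zeta uses the trapezoidal rule.
    nearest = d2.min(axis=1)
    dz = zeta[1] - zeta[0]
    term_b = dz * (nearest.sum() - 0.5 * (nearest[0] + nearest[-1]))
    return term_a + term_b
```

A function of this form, applied to a flattened `G`, could be handed to a general-purpose BFGS routine such as `scipy.optimize.minimize(..., method="BFGS")`; whether the original work used that routine is not stated in the text.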

2) Serosal Voltage Simulation: We simulated voltage potentials on the full serosal surface of the stomach (Fig. 3). We modeled the normal and abnormal wave initiation and propagation patterns to be consistent with recent findings from invasive human recordings [14], [15]. This was implemented by solving the one-dimensional wave equation (4) at discrete points along the Medial Curve via finite difference analysis with a temporal step size calculated using the Courant-Friedrichs-Lewy condition [39]:

$$\frac{\partial^2 S(\zeta, t)}{\partial t^2} = c^2(\zeta) \, \frac{\partial^2 S(\zeta, t)}{\partial \zeta^2}. \tag{4}$$

In (4), S(ζ, t) is voltage as a function of both time t and position ζ on the Medial Curve. Wave speed, c(ζ), is a function of the Euclidean position C∗(ζ) corresponding to position ζ on the Medial Curve; it is highest in the pacemaker region (6.0 mm/s), second-highest in the antrum (5.9 mm/s), and lowest in the corpus (3.0 mm/s). We also imposed trends in wave amplitude consistent with the current literature [14], [15]; amplitudes in the pacemaker, antrum, and corpus regions were 0.57 mV, 0.52 mV, and 0.25 mV, respectively. At the two boundaries, we employed Mur's boundary condition to prevent waves from reflecting back into the stomach. Finally, we applied each discrete voltage, S(ζ, t), in equipotential rings oriented organoaxially on the stomach associated with points on the Medial Curve (C∗(ζ) : ζ ∈ [−1, 1]). Altogether this defined the potential at all points on the stomach's serosal surface.
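To make the numerical scheme concrete, here is a minimal sketch of a leapfrog finite-difference solver for (4) with a Courant-Friedrichs-Lewy time step and first-order Mur absorbing boundaries. Several details are assumptions not given in the text: the curve is discretized by arc length with uniform spacing `dx` (in mm), the wave is initiated by a sinusoidal hard source at node `src`, and the safety factor `cfl` is 0.9.

```python
import numpy as np


def simulate_slow_wave(speed, dx, t_end, src, amp=0.5, freq=0.05, cfl=0.9):
    """Leapfrog solution of S_tt = c(x)^2 S_xx on a 1-D grid.

    speed : wave speed c at each node (mm/s).
    dx    : node spacing along the curve (mm).
    t_end : simulated duration (s).
    src   : interior node index at which the wave is initiated.
    Returns (times, S) with S of shape (n_steps + 1, n_nodes).
    """
    speed = np.asarray(speed, dtype=float)
    dt = cfl * dx / speed.max()          # CFL-limited time step
    n_steps = int(t_end / dt)
    r2 = (speed * dt / dx) ** 2          # squared Courant number per node
    # First-order Mur coefficients at the two ends of the curve.
    mur_lo = (speed[0] * dt - dx) / (speed[0] * dt + dx)
    mur_hi = (speed[-1] * dt - dx) / (speed[-1] * dt + dx)

    S = np.zeros((n_steps + 1, speed.size))
    prev = np.zeros(speed.size)
    for n in range(n_steps):
        cur, nxt = S[n], S[n + 1]
        lap = cur[2:] - 2.0 * cur[1:-1] + cur[:-2]
        nxt[1:-1] = 2.0 * cur[1:-1] - prev[1:-1] + r2[1:-1] * lap
        # Hard source (assumed): impose a 0.05 Hz sinusoid at the initiation site.
        nxt[src] = amp * np.sin(2.0 * np.pi * freq * (n + 1) * dt)
        # Mur boundaries let outgoing waves leave without reflecting.
        nxt[0] = cur[1] + mur_lo * (nxt[1] - cur[0])
        nxt[-1] = cur[-2] + mur_hi * (nxt[-2] - cur[-1])
        prev = cur
    return dt * np.arange(n_steps + 1), S
```

In this sketch, placing `src` in the pacemaker region versus elsewhere along the curve is what would distinguish the two simulated classes; a point source away from the ends radiates in both directions, which mirrors the bifurcated propagation described for abnormal initiation.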

Fig. 3. Wave propagation at the serosal surface of the stomach with voltage values displayed as heat map. Normal slow wave initiation (top) and abnormal initiation (bottom), with each successive snapshot (left to right) at time t = 0, 4, 8, 12 seconds. The voltage at each point on the serosal surface of the stomach is solved for in Section II-A2 in accordance with the description in the most recent literature [14], [15], [39].

3) Abdominal Boundaries and Electrode Array Placement: To approximate any research subject's cutaneous abdominal boundaries, we constructed a dense cylindrical point cloud where each cross section is an ellipse. The major and minor axes of the ellipse were defined from measurements of the projection of the research subject's abdominal CT scan onto the X,Z plane (Fig. 4a). The point cloud was placed concentrically with the stomach.

Fig. 4. a) Top-down projection of human research subject's abdominal CT scan onto the X,Z plane. Approximated elliptical cross section (red) from which elliptical radii were measured. b) Top-down projection of stomach and elliptic cylindrical point clouds onto the X,Z plane. Abdominal ellipse (red) and inner abdominal boundary (blue), separated by a 1 cm cavity. Abdominal and inner ellipses are translated laterally by z0 via optimization described in Section II-A3. The anterior stomach and abdominal surfaces are as close as they can be to one another while imposing the constraint that no point on the stomach crosses the inner abdominal boundary. This is the 1 cm ‘baseline’ abdominal tissue depth.

We assume the smallest possible thickness of the abdominal wall (skin, adipose tissue, muscle, etc.) to be 1 cm. To create a 1 cm shell between the inner and cutaneous boundaries, we created an inner elliptic cylinder (blue ellipse in Fig. 4b), with elliptical radii 1 cm less than the radii of the cutaneous elliptic cylinder.

In order for the point on the anterior surface of the stomach closest to the abdominal wall to be just within the inner elliptic cylinder, we laterally translated the two elliptic cylinders in the posterior direction by z0, which was minimized subject to the constraint that each point from the stomach lies within the inner ellipse shifted by z0:

$$\frac{(x_i)^2}{(a-1)^2} + \frac{(z_i - z_0)^2}{(b-1)^2} \le 1, \quad i = 1, \ldots, H, \tag{5}$$

where (x_i, z_i) is the ith point from the stomach, a is the major radius and b is the minor radius of the cutaneous ellipse. This constrained convex optimization problem was solved in a subject-specific manner due to subject-to-subject stomach and abdominal variability (Supplemental Materials Section G). H was, on average, 7446 points.
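Because z0 is a scalar, each constraint in (5) restricts it to an interval, and the feasible set is the intersection of those intervals. The sketch below uses that observation to compute the smallest feasible z0 directly. It is an illustrative alternative to a general convex solver, and the function name `minimal_shift` and the sign convention (smaller z0 is preferred, as "minimized" in the text suggests) are assumptions.

```python
import numpy as np


def minimal_shift(points_xz, a, b, shell=1.0):
    """Smallest z0 satisfying (5) for every stomach point.

    points_xz : (H, 2) array of (x_i, z_i) in cm.
    a, b      : major and minor radii of the cutaneous ellipse (cm).
    shell     : abdominal wall thickness (cm), 1 cm in the text.
    """
    x, z = np.asarray(points_xz, dtype=float).T
    a_in, b_in = a - shell, b - shell        # inner-ellipse radii
    frac = 1.0 - (x / a_in) ** 2
    if np.any(frac < 0.0):
        raise ValueError("a point lies outside the inner ellipse along x")
    # (5) is equivalent to |z_i - z0| <= half_i for each point.
    half = b_in * np.sqrt(frac)
    lo, hi = np.max(z - half), np.min(z + half)
    if lo > hi:
        raise ValueError("no shift keeps every point inside the inner ellipse")
    return lo
```

For example, with a = 11 cm, b = 6 cm and stomach points (0, 0) and (0, 2), the intervals are [−5, 5] and [−3, 7], so the smallest feasible shift is z0 = −3 cm, at which the second point lies exactly on the inner ellipse.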

At this point, the cutaneous boundary was centered at (0, z0). To model BMI trends, the cutaneous boundary was subsequently translated in the anterior direction (increasing the separation between the anterior stomach and abdominal boundaries) by u cm, resulting in its center now being (0, z0 + u). We defined the ‘abdominal tissue depth’ to be 1 + u cm for u ≥ 0.

We constructed an electrode array on the anterior face of the cutaneous surface. Electrode ‘anchor points’ were placed in a grid, spaced 2 cm apart both vertically and along the elliptical arc length. Full electrodes were defined as the closest 32 points within the elliptic cylinder point cloud to each anchor point. When computing electrode voltages (Section II-A4), the voltage at each full electrode is an average of voltages at these 32 points. The entire 100-channel electrode array was centered with respect to the stomach's (x, y) center.

4) Propagating Voltage from the Serosal Surface of the Stomach to Cutaneous Electrodes: At each electrode n, we calculated the voltage at time t, V_n(t), using the principle of superposition as a linear combination of D ‘source’ current dipole moments, (I_d(t) : d = 1, …, D). Each current dipole I_d(t) is taken from the serosal simulation in Section II-A2 and corresponds either to the potential along the Medial Curve at time t, S(ζ, t), or to a point on the associated equipotential ring, which lies in the plane normal to the tangent vector of the Medial Curve at point ζ. We also model additive white Gaussian measurement noise N_n(t), altogether giving rise to:

$$V_n(t) = \sum_{d=1}^{D} W_{n,d} \, I_d(t) + N_n(t). \tag{6}$$

Source weights are given by:

$$W_{n,d} = \frac{\cos\theta}{4 \pi \sigma r_{n,d}^2} \tag{7}$$

where θ is the angle between the organoaxial current dipole and the electrode, r_{n,d} is the distance between the source, d, and electrode, n, and σ = 0.125 S/m is the conductivity of the medium between the stomach and electrode interface (i.e., homogenized between abdominal bone, muscle, adipose tissue, and skin) and was chosen to be between that of muscle (0.5 S/m) and fat (0.1 S/m) [67], [68]. Since the EGG signal is at 0.05 Hz, we did not incorporate capacitive effects, as has been done for volume conduction modeling with electrophysiologic signals at higher frequencies [69], [70]. In this work, when we shift the electrode array on the surface of the abdomen and present findings, we term some electrodes as ‘out of range of the stomach's electrical activity.’ The rationale is that source weights are governed by i) the distance between the electrode and voltage source on the stomach and ii) the angle between these two points (7). As such, large electrode array shifts strongly attenuate these weights. Voltage at these electrodes will be governed primarily by noise, therefore justifying the terminology ‘out of range of electrical activity.’
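Equations (6) and (7) amount to a matrix-vector product once the weights are tabulated. The following sketch shows one way this could be written with NumPy. It assumes positions are given in metres (to match σ in S/m), and that θ is measured between each dipole's direction vector and the line from source to electrode; the function names and argument layout are illustrative only.

```python
import numpy as np


def source_weights(electrodes, sources, dipole_dirs, sigma=0.125):
    """W[n, d] = cos(theta) / (4 * pi * sigma * r^2), following (7).

    electrodes  : (N, 3) electrode positions (m, assumed unit).
    sources     : (D, 3) dipole positions on the serosal surface (m).
    dipole_dirs : (D, 3) dipole orientation vectors (organoaxial tangents).
    """
    electrodes = np.asarray(electrodes, dtype=float)
    sources = np.asarray(sources, dtype=float)
    dirs = np.asarray(dipole_dirs, dtype=float)
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    sep = electrodes[:, None, :] - sources[None, :, :]   # (N, D, 3)
    r = np.linalg.norm(sep, axis=2)                      # (N, D)
    cos_theta = np.einsum("ndk,dk->nd", sep, dirs) / r
    return cos_theta / (4.0 * np.pi * sigma * r ** 2)


def electrode_voltages(W, dipoles, noise_std=0.0, rng=None):
    """V[n, t] = sum_d W[n, d] * I[d, t] + N[n, t], following (6)."""
    rng = np.random.default_rng() if rng is None else rng
    V = W @ np.asarray(dipoles, dtype=float)
    return V + noise_std * rng.standard_normal(V.shape)
```

The inverse-square dependence on r and the cosine factor together reproduce the behaviour described above: electrodes shifted far from the stomach, or lying broadside to the dipoles, receive weights near zero and are dominated by the noise term.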

5) Creating HR-EGG Datasets: For each simulation of the slow wave on the serosal surface of the stomach, we generated several independent HR-EGG datasets via manipulation of electrode array placement, abdominal tissue depth, electrode array size, and signal-to-noise ratio (SNR). We shifted the electrode array horizontally such that the center of the array moved along the abdominal elliptical arc from -12 cm to 12 cm in increments of 3 cm. Likewise, we shifted the center of the array vertically from -12 cm to 12 cm in increments of 3 cm. Each stomach-abdomen model began with a minimum spacing of 1 cm between stomach and abdominal boundaries (Section II-A3). From this initial positioning, we moved the electrode array laterally away from the stomach up to 15 cm in increments of 1 cm, to simulate changes in abdominal tissue depth, with the assumption that this suggests trends in BMI. This method of adding tissue depth created a close resemblance to abdominal tissue regions seen from the abdominal CT scans.

Fig. 5. Convolutional neural network architecture schematic, described in detail in Section II-B1.

We added white Gaussian noise to all simulated HR-EGG datasets. It should be noted that this ‘measurement noise’ is in addition to the noise already present in the simulated volume conduction solution (6). In the training datasets, we calculated the noise variance such that the ratio of the median signal power (variance) to the added noise variance yielded an SNR of 20 dB. In the same fashion, we added white Gaussian noise to the test datasets used in the experiments to assess classifier performance over an SNR range of -40 to 40 dB. We added noise in a related but slightly different manner to test datasets used in the experiments to assess classifier performance over horizontal and vertical electrode array shifts, as well as abdominal tissue modulation. In these latter three test datasets, we calculated the noise variance once for all HR-EGG datasets resulting from each distinct stomach model such that the resulting SNR was 10 dB at the centered configuration (i.e., horizontal and vertical shifts were 0 cm, and abdominal tissue depth was at its 1 cm baseline). We then added white Gaussian noise with these calculated variances to all horizontally, vertically, and laterally shifted permutations of the HR-EGG dataset generated from the particular stomach model.
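The noise-scaling step can be sketched as follows. This is one reading of the description, under the assumption that the "median signal power" is the median across electrodes of each channel's variance; the function name `add_noise_at_snr` is hypothetical.

```python
import numpy as np


def add_noise_at_snr(V, snr_db, rng=None):
    """Add white Gaussian noise to V (channels x samples) at a target SNR.

    The noise variance is set from the median per-channel signal
    variance (assumed interpretation of 'median signal power').
    Returns the noisy array and the noise variance that was used.
    """
    rng = np.random.default_rng() if rng is None else rng
    V = np.asarray(V, dtype=float)
    signal_power = np.median(np.var(V, axis=1))
    # SNR_dB = 10 * log10(signal_power / noise_var)
    noise_var = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_var), size=V.shape)
    return V + noise, noise_var
```

Returning `noise_var` makes it straightforward to reuse a variance computed once at the centered configuration for every shifted permutation of the same stomach model, as described for the shift and tissue-depth test sets.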

Previous studies have used configurations with fewer than 100 electrodes. For example, the original HR-EGG recordings utilized 25-electrode arrays [38], [39] and ambulatory systems capable of recording from 9 electrodes have recently been established [40]. As such, we trained and tested smaller square electrode arrays with 25 and 9 channels and added noise for all training and test datasets of the smaller arrays as described above.

B. Machine Learning Classification of HR-EGG Waveforms

We constructed and trained a convolutional neural network (CNN) to classify normal and abnormal HR-EGG electrode data (Section II-B1). For comparison, we computed wave propagation spatial features to train a linear discriminant analysis (LDA) classifier (Section II-B2).

1) Neural Network Architecture: The CNN consisted of four sequentially ordered 3D-convolutional layers, each with 32 filters, followed by two dense (fully connected) layers, with 64 and 2 filters, respectively. A schematic diagram is shown in Fig. 5. Layers 1 through 5 were activated with a rectified linear unit (ReLU) and the ultimate dense layer was activated with a softmax function. We optimized weights via backpropagation with an Adam optimizer, using the following hyperparameters: learning rate = 0.001, β1 = 0.9, and β2 = 0.999. Loss was computed via categorical cross-entropy. The outputs of layers two and four underwent max pooling (dimension 2,2,2) in order to reduce the run time of the algorithm. We added a dropout function after the second, fourth, and fifth layers to prevent over-fitting [71]. We carried out all computations through TensorFlow [72] and programmed nodes using the Keras API [73].

The general form of the neural network architecture was adapted from state-of-the-art published 2D and 3D CNN architectures [56], [57], [60], [61], [62]. Namely, this general form is a sequence of several convolutional layers followed by a few dense layers, in which the ultimate dense layer has ‘n’ filters corresponding to the ‘n’ classes in the data (in this case 2). The number of filters in each layer is typically task- and data-specific, with some studies using 2 filters in a layer and others using 2,000 [56], [57], [58], [59], [60], [61], [62]. As such, there is not a standard number of filters to use in CNN layers. We implemented 32 filters for each convolutional layer, which was sufficient for the CNN to accurately learn features and classify while keeping the computational cost low. State-of-the-art CNNs [56], [57] train on datasets on the order of 1.2 million samples (e.g., ImageNet), which allows a large set of weights arising from a very deep network to be learned. In this study, however, the training set contained only approximately 6,000 samples, which led us to choose the depth of the network to i) be sufficient for accurate feature identification, ii) avoid over-fitting, and iii) be computationally efficient.

We chose a convolution stride of 1 to retain resolution at the convolution stage, since the aforementioned architecture design reduced the computational load enough to make this stride feasible. We used the same kernel size for filters in each layer in accordance with recent findings suggesting that homogeneous kernel sizes throughout the network are optimal when classifying videos through a 3D CNN [61]. The kernel size was chosen based on the input size of the data. State-of-the-art CNNs [56] trained on data with image sizes of 256x256 pixels (e.g., ImageNet) and had kernel sizes ranging from 11x11 in beginning layers to 3x3 in latter layers. Each simulated waveform in our study had an input size of 10x10x300, where the third axis was the temporal component. As such, we chose the first two dimensions of our kernel size, 2x2, to be correspondingly small. The final dimension of our kernel size, 2, was also chosen to be small in order to preserve temporal resolution, overall yielding a convolutional kernel of 2x2x2.
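To see how the 10x10x300 input shrinks through the four convolutional layers and two pooling stages described above, the following sketch traces the tensor shapes in plain Python. The padding mode is not stated in the text, so both common conventions are shown as assumptions: unpadded (‘valid’) and size-preserving (‘same’) convolutions. The function names are illustrative.

```python
def conv_out(shape, kernel=(2, 2, 2), stride=1, padding="valid"):
    """Output size of a 3-D convolution along each axis."""
    if padding == "same":
        # Size-preserving padding: ceil(s / stride).
        return tuple(-(-s // stride) for s in shape)
    return tuple((s - k) // stride + 1 for s, k in zip(shape, kernel))


def pool_out(shape, pool=(2, 2, 2)):
    """Output size of non-overlapping max pooling along each axis."""
    return tuple(s // p for s, p in zip(shape, pool))


def trace_shapes(input_shape=(10, 10, 300), n_conv=4,
                 pool_after=(2, 4), padding="valid"):
    """Shapes after each convolutional layer (pooling included)."""
    shapes = [input_shape]
    shape = input_shape
    for layer in range(1, n_conv + 1):
        shape = conv_out(shape, padding=padding)
        if layer in pool_after:
            shape = pool_out(shape)
        shapes.append(shape)
    return shapes
```

Under the unpadded assumption the 100-channel input reaches the dense layers as a 1x1x73 volume per filter, whereas with size-preserving padding it is 2x2x75. The unpadded variant would collapse to zero width for the 25- and 9-channel arrays, which suggests that some padding or an adapted architecture was used for those cases; the text does not say which.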

Before undergoing any training, the cross validation ‘accuracy’ is 0.5, since CNN weights are randomly initialized. As seen in Fig. 7, the cross validation accuracy increases to above 0.7 after just one epoch. The accuracy continues to rise throughout the course of training, but this first jump in accuracy (>0.25) is the most prominent. Because of this property, we consider the neural network to be converging if this jump in cross validation accuracy is seen after 1 epoch. When constructing the CNN architecture, we performed convergence tests in which we tested perturbations of the architecture (e.g. more or fewer layers, different numbers of filters) for convergence. While there may be a slightly different 3D CNN architecture (e.g. more filters in one of the layers, or a larger kernel size) that slightly improves the performance of the network, we found that convergence is robust across minor perturbations in architecture. Future work could involve tuning hyperparameters and iteratively testing architectural parameters to further improve CNN classification performance.

2) Linear Discriminant Analysis: We represented the position of each electrode ((xn, yn) : n = 1, . . . , N) with respect to the coordinate system of the curved surface on the abdomen (Fig. 6a). An example of the voltage Vn(t), associated with two electrodes at different vertical locations, where a phase shift is present, is given in Fig. 6b.

We extracted HR-EGG spatial features of the slow wave from the cutaneous waveforms by first performing the Hilbert transform on each individual waveform in the array to extract instantaneous amplitude and phase information:

$$V_n(t) + i\,\mathrm{Hb}[V_n(t)] = a_n(t)\,e^{i\phi_n(t)}, \quad n = 1, \ldots, N \tag{8}$$

to construct instantaneous phase information as a function of time and space:

$$\phi(x_n, y_n, t) \equiv \phi_n(t), \quad n = 1, \ldots, N.$$

The spatial gradient of instantaneous phase, ∇φ(x, y, t), was constructed at each cutaneous electrode ((xn, yn) : n = 1, . . . , N).

Since the wave velocity vector v is normal to contours of constant phase, it satisfies:

$$v(x, y, t) \propto -\nabla\phi(x, y, t). \tag{9}$$
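The steps from (8) to (9) can be sketched numerically as follows. This is a minimal illustration rather than the pipeline used in the study: it assumes a regular electrode grid with unit spacing, builds the analytic signal with an FFT, and estimates the spatial gradient from wrapped phase differences between neighbouring electrodes. The function names and the synthetic plane wave are hypothetical.

```python
import numpy as np


def analytic_signal(v):
    """FFT-based analytic signal V + i*Hb[V] along the last (time) axis."""
    n = v.shape[-1]
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(np.fft.fft(v, axis=-1) * gain, axis=-1)


def wrapped_diff(z, axis):
    """Phase change per electrode step along `axis`, wrapped to (-pi, pi]."""
    a = np.moveaxis(z, axis, 0)
    d = np.angle(a[1:] * np.conj(a[:-1]))
    # One-sided estimate at the array edges, two-sided average in the interior.
    full = np.concatenate([d[:1], 0.5 * (d[1:] + d[:-1]), d[-1:]], axis=0)
    return np.moveaxis(full, 0, axis)


def phase_gradient(v, spacing=1.0):
    """Return (dphi/dx, dphi/dy) for voltages of shape (ny, nx, T)."""
    z = analytic_signal(v)  # a_n(t) * exp(i * phi_n(t)), as in (8)
    return wrapped_diff(z, axis=1) / spacing, wrapped_diff(z, axis=0) / spacing


# Synthetic plane wave travelling toward +y on a 10x10 grid, 300 samples.
t = np.arange(300)
y = np.arange(10)[:, None, None]
omega, k = 2 * np.pi * 3 / 300, 0.5
v = np.cos(omega * t - k * y) * np.ones((1, 10, 1))

dphi_dx, dphi_dy = phase_gradient(v)
velocity_direction = (-dphi_dx.mean(), -dphi_dy.mean())  # proportional to v, per (9)
print(velocity_direction)
```

For this synthetic wave the phase is ωt − ky, so the gradient points toward −y and the velocity direction from (9) points toward +y, matching the direction of travel.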

Fig. 6. a) Cartesian coordinates (x,y,z) translated to (x,y) coordinates along the curved abdominal surface. Spatial analyses in Section II-B2 are performed in (x,y) coordinates. b) Conceptual schematic of two out-of-phase oscillatory signals at different vertical spatial locations, y=0 (blue) and y=1 (brown). At time t = 0, the change in the oscillatory signal with respect to a unit change in y is π. c) Phase gradient abstraction with wave velocity vector shown in blue. Both planar and non-planar waves have equal speeds (velocity vector magnitude) but the non-planar wave has non-homogeneous wave directions.

We define features pertaining to the time-averaged direction of the wave velocity at each electrode by exploiting (9) as follows:

$$\Lambda_n = \frac{1}{T}\sum_{t=1}^{T} \tan^{-1}\left(\nabla\phi(x_n, y_n, t)\right), \quad n = 1, \ldots, N. \tag{10}$$

Speed calculated from measurements directly on the stomach [14] has been shown to follow different trends with regard to disease than speed calculated from the aggregated slow wave activity propagated to the cutaneous abdominal surface via volume conduction [39], [41], [74]. As such, the wave speed was not used as a feature to train our classifier. In order to include a feature that varies between anterograde propagation (normal function) and the combination of retrograde and anterograde propagation (abnormal initiation), we define the phase gradient directionality (PGD) as the ratio of the norm of the spatially averaged electrode velocities to the spatial average of the norm of electrode velocities:

$$\mathrm{PGD}(t) = \frac{\left\| \frac{1}{N}\sum_{n=1}^{N} \nabla\phi(x_n, y_n, t) \right\|_2}{\frac{1}{N}\sum_{n=1}^{N} \left\| \nabla\phi(x_n, y_n, t) \right\|_2}, \quad t = 1, \ldots, T \tag{11}$$

where in (11), velocities are replaced with −∇φ by virtue of (9). The PGD is a measure of how aligned the wave velocities at different electrodes are at any point in time, and equals 1 for planar waves, since in such cases the wave velocities at each location are equal. From Jensen’s inequality, the numerator is upper bounded by the denominator and so 0 ≤ PGD(t) ≤ 1. Thus the PGD is a measure of how ‘close’ the activity is to being a plane wave, which is akin to what occurs in normal slow-wave HR-EGG waveforms, exhibiting predominantly anterograde propagation. As waves stray from this planar phenotype, as in the abnormal slow wave seen in HR-EGG waveforms, the PGD decreases (for reference, the PGD of white Gaussian noise exceeds 0.5 only with low probability [39]).
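The bound on (11) can be written out explicitly. The shorthand g_n below is introduced here only for compactness and is not part of the original notation; the inequality is Jensen's inequality applied to the (convex) Euclidean norm, which for a finite average coincides with the triangle inequality.

```latex
\[
\Bigl\| \tfrac{1}{N}\sum_{n=1}^{N} g_n \Bigr\|_2
\;\le\;
\tfrac{1}{N}\sum_{n=1}^{N} \bigl\| g_n \bigr\|_2 ,
\qquad g_n \equiv \nabla\phi(x_n, y_n, t),
\]
\[
\text{so that } 0 \le \mathrm{PGD}(t) \le 1, \text{ with }
\mathrm{PGD}(t) = 1 \text{ when } g_n = g \text{ for all } n
\quad\bigl(\text{numerator } = \text{denominator } = \|g\|_2\bigr).
\]
```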

A conceptual schematic of PGD is shown in Fig. 6c. The denominator of (11) represents the average magnitude of the spatial phase gradient. In Fig. 6c, the magnitude of the spatial phase gradient is the same at all electrodes (all arrows are of equal magnitude). In the planar wave case, the partial derivative of phase with respect to x is the same at all spatial locations, whereas in the non-planar wave, the partial derivative of phase with respect to x at the two leftmost electrodes is equal to the negative value of the partial derivative of phase with respect to x at the rightmost electrodes. The same pattern is seen in the partial derivative of phase with respect to y. Thus (11) is zero for the non-planar wave shown in this figure, as wave velocities are of the same magnitude but have opposing directions.
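The two cases described above can be reproduced with a few lines of NumPy. This is a sketch of (11) only; the array layout (electrodes, time samples, x/y components) and the four-electrode toy gradients are assumptions made for illustration.

```python
import numpy as np


def pgd(grad):
    """Phase gradient directionality per time sample, following (11).

    grad: array of shape (N, T, 2) with the phase gradient (d/dx, d/dy)
    at N electrodes and T time samples. Returns an array of shape (T,).
    """
    numerator = np.linalg.norm(grad.mean(axis=0), axis=-1)
    denominator = np.linalg.norm(grad, axis=-1).mean(axis=0)
    return numerator / denominator


# Four electrodes, one time sample, unit-magnitude gradients.
planar = np.array([[1.0, 1.0]] * 4)[:, None, :] / np.sqrt(2)
non_planar = np.array(
    [[1.0, 1.0], [1.0, 1.0], [-1.0, -1.0], [-1.0, -1.0]]
)[:, None, :] / np.sqrt(2)

pgd_planar = pgd(planar)          # aligned gradients: numerator equals denominator
pgd_non_planar = pgd(non_planar)  # opposing gradients cancel in the numerator
print(pgd_planar, pgd_non_planar)
```

The sketch does not guard against a zero denominator, which would occur only if the phase gradient vanished at every electrode.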

At certain horizontal shifts of the electrode array, the array may be in a range where it records mostly synchronized retrograde activity in the abnormal simulations. This will cause both the normal and abnormal waves to resemble planar waves (though propagating in different directions). The PGD will be indistinguishable between classes in this case but the temporal average of wave directions at each spatial location will be different. As such, we used the time averaged directions at each electrode (Λn : n = 1, . . . , N) as well as the median of (PGD(t) : t = 1, . . . , T) to create N + 1 features for training a linear discriminant analysis (LDA) classifier.
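Assembling the N + 1 features could look like the sketch below. Two interpretive choices are assumptions rather than details confirmed by the text: tan⁻¹(∇φ) in (10) is read as the four-quadrant angle of the gradient vector, and the time average is a plain arithmetic mean of angles, so wrap-around near ±π is not handled. The function name and the toy input are hypothetical.

```python
import numpy as np


def lda_features(grad):
    """Build the N + 1 LDA features from phase gradients of shape (N, T, 2)."""
    # Per-sample angle of the gradient vector at each electrode, shape (N, T).
    angle = np.arctan2(grad[..., 1], grad[..., 0])
    direction = angle.mean(axis=1)  # Lambda_n in (10), shape (N,)
    # PGD(t) from (11), shape (T,).
    pgd_t = (np.linalg.norm(grad.mean(axis=0), axis=-1)
             / np.linalg.norm(grad, axis=-1).mean(axis=0))
    return np.concatenate([direction, [np.median(pgd_t)]])


# Toy input: 100 electrodes, 5 samples, all gradients pointing toward -y.
grad = np.zeros((100, 5, 2))
grad[..., 1] = -1.0
features = lda_features(grad)
print(features.shape, features[-1])
```

The resulting vector could then be passed to any standard LDA implementation; no particular library is implied by the text.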

Fig. 7. Cross validation accuracy (blue) and loss (red) during training. Accuracy is shown on a [0,1] scale.

Fig. 8. Comparison of neural network performance with LDA performance. (a) Neural network and LDA performance as a function of SNR. (b) Neural network and LDA performance as a function of horizontal electrode array shifts along the curved surface of the abdomen. Shift distances correspond to arc lengths along the cylindrical ellipse. (c) Neural network and LDA performance as a function of vertical electrode array shifts. (d) Neural network and LDA performance as a function of abdominal tissue depth. All test datasets (b-d) had a baseline SNR of 10 dB (explained in detail in Section II-A5).

3) Training and Testing Both Models: Prior to training the model, we initialized all neural network weights with a Glorot uniform distribution. We trained the CNN for 15 epochs in batches of 40 training simulations at a time and cross validated on a random 25% of the training set. The cross validation accuracy and loss at each epoch during training are plotted in Fig. 7. All values of each of the horizontal and vertical electrode array shifts, as well as abdominal tissue depths, were represented in the training set. When sampling one of these perturbation types, the other two remained centrally located. For example, when we varied the abdominal tissue depth from 1 to 15 cm, horizontal and vertical electrode array shifts were kept within -3 to 3 cm. Likewise, in the test datasets, we held the types of perturbations not explicitly under investigation at a range close to zero. This ensured that classification performance depended on one independent variable at a time. A full description of all datasets used in training and test experiments can be found in the Supplemental Materials Section B. Test data for all experiments were completely independent from training data. HR-EGG waveforms, including all array perturbations, from 28 randomly selected stomach models were used solely in training the CNN, while waveforms from the remaining 12 stomach models were used for testing. All training and testing regimes were repeated six times (i.e., six unique random splits of 28:12 stomach models).
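Splitting by stomach model, rather than by individual waveform, is what keeps the test data independent of the training data. A minimal sketch of such a grouped split is given below; the bookkeeping array `model_ids`, the five waveforms per model, and the use of seeds 0 to 5 are hypothetical, and the sketch does not itself enforce that the six splits are unique.

```python
import numpy as np


def split_by_stomach(model_ids, n_train_models=28, seed=0):
    """Boolean train/test masks that keep every stomach model on one side."""
    rng = np.random.default_rng(seed)
    unique_models = np.unique(model_ids)
    train_models = rng.choice(unique_models, size=n_train_models, replace=False)
    train_mask = np.isin(model_ids, train_models)
    return train_mask, ~train_mask


# Hypothetical bookkeeping: 40 stomach models, 5 simulated waveforms each.
model_ids = np.repeat(np.arange(40), 5)

# Six repetitions of the 28:12 split, one per seed.
splits = [split_by_stomach(model_ids, seed=s) for s in range(6)]
for train_mask, test_mask in splits:
    print(train_mask.sum(), test_mask.sum())
```

Every waveform derived from a given stomach model, including all of its array perturbations, lands on the same side of the split.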

III. RESULTS

In both the LDA and neural network approaches, the classification accuracy saturates at high and low SNR extrema (Fig. 8a). Within the SNR range of -12 to 16 dB, the CNN outperforms the LDA classifier, and the LDA and neural network accuracy curves rise in a similar manner, with approximately equal slopes. When the LDA accuracy curve is shifted by -17 dB, accuracy as a function of shifted SNR more closely resembles that of the neural network (Supplemental Materials Section C). Thus, the ‘gain’ from use of the neural network methodology over the LDA classifier is 17 dB. However, the LDA accuracy still saturates at 90.97%, which is lower than the CNN’s accuracy from 0 dB onward. Our group has observed in recordings that a high-quality EGG signal is in the vicinity of 10 dB [40]. At 10 dB, neural network accuracy is 95.46% and LDA accuracy is 76.79%, an absolute accuracy improvement of 18.67 percentage points.
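To make the SNR axis concrete, the sketch below adds white Gaussian noise to a clean waveform at a target SNR. It assumes the common power-ratio definition SNR_dB = 10 log10(P_signal / P_noise); the definition actually used in this study is given in Section II-A5 and may differ in detail. The function names and the sinusoidal test signal are hypothetical.

```python
import numpy as np


def add_noise_at_snr(signal, snr_db, rng=None):
    """Add white Gaussian noise so that the expected SNR equals snr_db."""
    rng = np.random.default_rng() if rng is None else rng
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def measured_snr_db(signal, noisy):
    """Empirical SNR in dB of a noisy copy relative to the clean signal."""
    noise = noisy - signal
    return 10.0 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))


t = np.arange(30000)
clean = np.sin(2 * np.pi * 0.05 * t / 5.0)  # arbitrary slow oscillation
noisy = add_noise_at_snr(clean, snr_db=10.0, rng=np.random.default_rng(0))
print(round(measured_snr_db(clean, noisy), 2))
```

Under this definition, a 17 dB shift corresponds to a factor of roughly 50 in the signal-to-noise power ratio (10^1.7 ≈ 50).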

A full analysis of the vertical array translation and abdominal tissue depth perturbations (Fig. 8c-d) can be found in the Supplemental Materials Section D. Below we analyze the LDA and CNN accuracy curves as a function of horizontal shifts of the 100 channel electrode array (Fig. 8b).

Fig. 9. Vulnerability of the LDA classifier to horizontal electrode array position. a) LDA accuracy as a function of horizontal electrode array shifts. b) Front-facing view of electrode array horizontally shifted by -9 cm. The electrode array is to the right of the abnormal initiation, with rightmost electrodes out of range of electrical activity, resulting in electrodes picking up primarily noise. Panels c-d show training (top) and testing (bottom) abstractions of direction features, with: c) annotation 1 and d) annotation 2. All test sets were evaluated with a model trained on the same training set, so abstractions of learned direction features during training are the same for both annotations. Direction features corresponding to anterograde propagation are shown in blue, retrograde propagation in red, and noise-dominated measurements depicted as gray dots. e) Front-facing view of electrode array horizontally shifted by 3 cm. The electrode array is horizontally centered over the abnormal initiation.

Since CNN filters are stepwise convolved around the entire input space, invariant properties of the input can be extracted, irrespective of their location in space. This robustness of CNNs is well exemplified within the context of animal image classification, where the face of a cat can be localized and corresponding features necessary for classification can be extracted from any spatial location in both testing and training datasets. As such, classification of these images is robust, even if the relative location of the cat’s face in the testing dataset differs from its location in the training dataset [56]. Similar to how the cat’s face can be anywhere within the frame of the image, the ‘anterograde activity’ and bifurcated ‘anterograde and retrograde’ activity need not be in the exact same part of the electrode array for every subject, as evidenced by high classification accuracy across a broad range of electrode array placements relative to anatomy (Fig. 8b). In the case of the LDA, the features are specific to the spatial location of each electrode. As seen in Fig. 8b, LDA classification accuracy is sensitive to electrode placement. An explanation (admittedly over-simplified for conceptual transparency) of this vulnerability is as follows.
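The location independence described above is a general property of convolution and can be seen in a toy 2D example. The sketch below is not the network from this study: the 2x2 filter and the local pattern are arbitrary, and the global maximum stands in for whatever pooling or dense stages follow the convolutions.

```python
import numpy as np


def correlate2d_valid(image, kernel):
    """Stride-1, unpadded 2D cross-correlation (the CNN 'convolution')."""
    kh, kw = kernel.shape
    out_h, out_w = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def place(pattern, row, col, size=10):
    """Embed a small pattern in an otherwise empty size-by-size array."""
    image = np.zeros((size, size))
    image[row:row + pattern.shape[0], col:col + pattern.shape[1]] = pattern
    return image


kernel = np.array([[1.0, -1.0], [1.0, -1.0]])  # arbitrary 2x2 filter
pattern = np.array([[2.0, 0.0], [3.0, 1.0]])   # arbitrary local pattern

resp_a = correlate2d_valid(place(pattern, 1, 1), kernel)
resp_b = correlate2d_valid(place(pattern, 5, 6), kernel)

peak_a = np.unravel_index(resp_a.argmax(), resp_a.shape)
peak_b = np.unravel_index(resp_b.argmax(), resp_b.shape)
print(peak_a, peak_b, resp_a.max(), resp_b.max())
```

Moving the pattern moves the location of the filter response by the same amount but leaves its peak value unchanged, whereas a feature tied to a fixed electrode index (as in the LDA) would change.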

In the training set (identical in both the CNN and LDA training paradigms), there is an over-representation of horizontal shifts within -3 to 3 cm, as compared to other horizontal shifts outside of this range. It follows that the LDA model fit in training will associate, at a high level, with direction feature constructs as seen in the ‘Training’ abstractions in Fig. 9c-d.

In this range of shifts, the array is horizontally centered near the location of the abnormal initiation, and so a learned representation can be simply described as follows. For the normal wave propagation, waves are propagating in an anterograde direction near the left side of the array and an anterograde direction near the right side. Since both halves of the array have waves propagating in the same direction, the PGD is high. For the abnormal initiation and propagation, there is anterograde activity near the left side of the array and retrograde activity near the right side. Since one side of the array has waves propagating in an opposite direction from waves in the other side of the array, the PGD is distinctly lower than that of the normal case.

This general logic explains the variability in performance during testing at the two annotations in Fig. 9a. At annotation 2 in Fig. 9, the array is most centered over the abnormal initiation during testing (Fig. 9e and Supplemental Materials Section E). As such, the learned representation of wave directions is very consistent with training (Fig. 9d) and indeed the classification performance is highest. Furthermore, the largest class separation between the normal and abnormal PGD values is seen at annotation 2, and the magnitude of the weight associated with PGD in the LDA classifier is 40 times higher than that of other direction features.

As the electrode array is moved slightly in the negative direction (i.e. horizontal shifts of 0 and -3 cm), the direction vectors are still overall consistent with those learned in training. However, poor classification performance is observed because the array is moving to a region over more synchronized retrograde activity. Thus, the abnormal wave resembles a planar wave, similar to the normal wave but propagating in a different direction. Because the PGD is maximized for planar wave propagation (regardless of direction), the PGD values for the normal and abnormal waves become almost indistinguishable. Since the PGD is the highest-weighted feature in the LDA classifier, the sharp decline in performance is explained.

At annotation 1 in Fig. 9, the center of the electrode array is shifted to the right of the abnormal initiation point (Fig. 9b). When testing the abnormal waveforms, the left side of the array now records retrograde activity and the right side records noise (Fig. 9c). Thus none of the direction features are consistent between testing and training. When testing the normal waveforms, the left side of the array records anterograde activity and the right side records noise. Thus only the left-side direction features are consistent between testing and training. Though some of these direction features are consistent, direction features on the right side of the array are weighted higher in classification than those on the left (Supplemental Materials Section F). Moreover, wave directions are not coherent and the PGD is reduced in comparison to its learned value during training. Altogether, an erosion of classification performance ensues.

Though the CNN approach is robust across horizontal shifts, we see a unimodal trend in classification accuracy with respect to electrode array positioning. We hypothesize that the most significant ‘feature’ used in classifying the abnormal initiation is one extracted by a filter in the shape of retrograde activity. The CNN’s peak performance at a horizontal shift value of -3 cm supports this hypothesis. At this shift configuration, the electrode array is centered at a region in which there is mostly retrograde activity in the abnormal simulation (the array is out of range of the anterograde activity seen to the left of the abnormal initiation; refer to Fig. 9b). Thus, during testing, the CNN will only see retrograde activity for the abnormal simulation and only anterograde activity for the normal simulation, yielding high classification accuracy.

Under our hypothesis that the CNN is heavily selecting for retrograde activity to classify the waveforms as ‘abnormal’, a configuration that is more localized to the abnormal initiation location will ‘confuse’ the CNN with incidence of features resulting from anterograde waves in addition to retrograde waves. This is consistent with the decline in classification performance as a function of positive shifts (Fig. 8b). The gradual decline in CNN classification accuracy as the array is shifted toward -12 cm is due to the array moving out of range of most of the slow wave activity in general and measurements tending toward noise.

As the electrode array is shifted 6 and 9 cm horizontally, the one standard deviation error bars for the CNN and LDA classification performance overlap. This is mainly because the shifts within 6 to 9 cm move the center of the array near the abnormal initiation site, a trend that generally maximizes LDA performance and minimizes neural network performance. In this case, the CNN’s worst classification accuracy is still within one standard deviation of the LDA’s best classification accuracy, exhibiting the CNN’s high performance across horizontal electrode array shifts. In comparison, the LDA performs poorly when the array is not centered around the site of the abnormal initiation (e.g. when it is shifted by -12 cm). Since the abnormal initiation is not anatomy-specific, the corresponding horizontal shift that puts the electrode array in its range will not be known a priori for any individual human subject, even if an abdominal image has been obtained. Altogether, this suggests that the CNN methodology we propose would be significantly less error-prone and more robust than the LDA approach operating on spatial features.

Shifting the electrode array by 6, 9, and 12 cm moves the center of the electrode array toward the pyloric end of the stomach. As the electrode array is shifted in this direction, it moves away from regions of the stomach that exhibit retrograde wave activity in the abnormal case. Under our hypothesis that the CNN is heavily selecting for retrograde activity to provide an abnormal classification, moving away from regions with retrograde activity will cause a decline in performance. There is a high amount of stomach size variability across this axis (6.7 cm, Supplemental Materials Section G). Thus, these shift configurations will move the electrode array completely out of range of retrograde activity (or even out of range of the entire stomach) for smaller stomachs but not for larger ones, and a large increase in accuracy variance is observed (Fig. 8b). This split is seen to the fullest extent at shifts of 9 cm, with shifts of 6 cm having more electrode arrays still recording retrograde activity and shifts of 12 cm placing a higher number of electrode arrays out of range of retrograde activity.

Furthermore, this region of the stomach exhibits the highest amount of stomach shape variability (Supplemental Materials Section G), which, in turn, results in less consistent patterns between simulations at the electrode level. We hypothesize that this is why the variance increase is not seen symmetrically (i.e. in shifts of -6, -9, and -12 cm), as the esophageal side of the stomach is not as anatomically variable. In the LDA classifier with electrode shifts of 6, 9, and 12 cm, the increase in accuracy variance does not mirror that of the CNN. Instead, variability in classification accuracy is seen at shifts of -3 and 0 cm. This is because LDA performance depends on the degree to which the electrode array is centered around the bifurcated anterograde and retrograde activity in the abnormal initiation. Again, due to anatomical size and shape variability, electrode arrays remain centered around the antrum at shifts of -3 and 0 cm for larger stomachs but move away from the wave bifurcation at these shifts for smaller stomachs.

As smaller array sizes are tested, slight differences in performance robustness with respect to the number of channels are observed (Fig. 10). Performance from waveforms of the 100 channel arrays exceeds that of the 25 and 9 channel arrays by more than 1 standard deviation in the SNR range of -8 to 0 dB (Fig. 10a). When the electrode array is shifted horizontally, the most noticeable discrepancy in performance lies in the range of -12 to -6 cm (Fig. 10b). This is likely because the 25 and 9 channel electrode arrays are out of range of electrical activity, whereas the leftmost electrodes of the 100 channel electrode array remain within the range of electrical activity. In a similar fashion, the largest discrepancy in performance when the array is shifted vertically lies at the extrema (Fig. 10c), as smaller electrode arrays move further out of the range of electrical activity than the 100 channel electrode array. All three array sizes yield similar model classification performance trends when abdominal tissue depth is increased, as these perturbations are tested only at vertically and horizontally centered electrode array configurations.

Fig. 10. Comparison of neural network accuracy between 100 channel (black), 25 channel (blue), and 9 channel (red) electrode arrays with respect to: a) SNR, b) horizontal electrode array shifts, c) vertical electrode array shifts, and d) abdominal tissue depth.

IV. DISCUSSION

In this study, machine learning techniques involving a CNN were applied to the slow wave of the stomach to determine if the clinically important marker of abnormal gastric slow wave initiation could be accurately identified. Machine learning was carried out under thousands of possible test conditions and overall it performed very favorably in accurately identifying cases of abnormal slow wave initiation.

This work paves the way for development of a second generation model which incorporates additional abnormalities beyond pacemaker initiation, specifically, a conduction block and/or abnormal conduction velocity [15]. The CNN architecture developed here could be slightly modified to accommodate the classification of more than two hypotheses. Specifically, the ultimate fully connected layer would have four, as opposed to two, filters.

Although we hypothesized that modern machine learning techniques (i.e. CNNs) could be used to disambiguate normal and abnormal gastric myoelectric function, such techniques typically require availability of massive amounts of training data. For instance, in our CNN approach, we had to train close to 175,000 parameters and test the model on several perturbations, which required over 10,000 labeled datasets. Testing this directly in humans first would have been time- and cost-prohibitive. One value of the work presented here is that we were able to rapidly determine that such approaches are worth further exploration, using human anatomy and physiologically plausible models of normal and abnormal gastric myoelectric function. Furthermore, a recent study in cardiology that used a 2D CNN to classify single-lead ECG [49] suggests that certain cardiac arrhythmias are only detectable via a multi-lead ECG. Since the 3D CNN methodology developed here allows for the analysis of a dynamic process recorded by a multi-lead system, our work paves the way for this paradigm to be applied in fields beyond gastroenterology, such as cardiology.

Another value of this work is that it is, to the best of our knowledge, the first study to explore the use of a deep CNN to disambiguate a normal and abnormally initiated gastric slow wave from multi-electrode cutaneous abdominal waveforms. Although simultaneous serosal and cutaneous electrical recordings to test our methodology in humans require abdominal surgeries and have not yet been accomplished, recent findings demonstrate the feasibility of multi-electrode gastric mucosal recordings [75] and low-resolution simultaneous mucosal and cutaneous recordings [76]. As these techniques advance and are used in larger patient cohort sizes, opportunities will arise to validate our methods in humans using multi-electrode mucosal recordings to define ground truth labels.

Furthermore, one recent advancement in machine learning methodologies includes the use of transfer learning, which allows a new classification task to be performed on a pre-trained model. Transfer learning can make it possible to train a neural network and perform a classification task on a dataset containing an otherwise insufficient amount of training data. With this technique, secondary training or fine tuning of previously-learned neural network weights with the new dataset is performed. Secondary training involves freezing low-specificity weights from the first few layers of the neural network while the remaining final layers remain trainable. Data from the new classification task, termed ‘target data’, can be used to fine-tune the high-specificity weights corresponding to specific features. Another method involves fine-tuning the entire neural network with target data. Recent studies have demonstrated efficacy in transfer learning even when the trained and targeted datasets vary significantly [77], [78], [79].
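The freezing step in secondary training amounts to skipping the weight update for selected layers. The framework-free sketch below illustrates only that idea: the layer names, weight shapes, and placeholder gradients are hypothetical stand-ins and do not describe the trained network from this study.

```python
import numpy as np


def sgd_step(params, grads, trainable, lr=0.01):
    """One gradient step that leaves frozen layers untouched."""
    return {
        name: (w - lr * grads[name]) if trainable[name] else w
        for name, w in params.items()
    }


rng = np.random.default_rng(0)
# Hypothetical layers standing in for pre-trained weights.
params = {
    "conv1": rng.normal(size=(2, 2, 2, 1, 32)),
    "conv2": rng.normal(size=(2, 2, 2, 32, 32)),
    "dense": rng.normal(size=(32, 2)),
}
# Placeholder gradients; in practice these come from the target data.
grads = {name: np.ones_like(w) for name, w in params.items()}

# Secondary training: freeze the early layers, fine-tune the final layer.
trainable = {"conv1": False, "conv2": False, "dense": True}
updated = sgd_step(params, grads, trainable)
print({name: bool(np.array_equal(updated[name], params[name])) for name in params})
```

Setting every entry of `trainable` to True corresponds to the alternative mentioned above, in which the entire network is fine-tuned with target data.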

Though many state-of-the-art studies involve transfer learning of 2D models, recent work has shown efficacy of transfer learning with 3D CNNs [80], [81], [82]. It is reasonable to hypothesize that 100-channel EGG data collected from humans could fit within the realm of similarity to the simulation training data, for which transfer learning is effective. We hypothesize that down- or up-sampling the human EGG data to match the sampling rate of the in silico data will preserve the temporal relationship between trained and target data, which is the third axis of the 3D structure. As such, the weights learned from training the CNN in this in silico study could be applied with transfer learning to deploy this framework on multi-electrode human data. This transfer-learned study is likely not trivial, but is a future direction within the realm of possibility.

A common problem in medical data is a class imbalance between patients and healthy controls. However, due to the large prevalence of GI disorders, there is no dearth of patients. In addition, since the EGG is noninvasive, the barrier to enroll healthy controls is low. Thus, there need not be high class imbalances in EGG studies with real data. If there still is a class imbalance, though, one could use the synthetic minority over-sampling technique (SMOTE) and/or generate synthetic data in the minority class. Data generation in the minority class has been used in the field of cardiology [47], as well as radiology [83], via a generative adversarial network (GAN) in the latter.
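The core idea of SMOTE is to create new minority-class samples by interpolating between an existing sample and one of its nearest minority-class neighbours. The sketch below is a simplified illustration of that idea, not a reference implementation; it assumes each sample has been flattened to a feature vector, and the function name, neighbour count, and random toy data are hypothetical.

```python
import numpy as np


def smote_like(minority, n_new, k=3, seed=0):
    """Synthesize n_new samples by interpolating toward near neighbours.

    minority: array of shape (n_samples, n_features), with n_samples > k.
    """
    rng = np.random.default_rng(seed)
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)                # a sample is not its own neighbour
    neighbours = np.argsort(dists, axis=1)[:, :k]  # k nearest minority samples
    base = rng.integers(0, len(minority), size=n_new)
    pick = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))                   # interpolation fraction in [0, 1)
    return minority[base] + gap * (minority[pick] - minority[base])


minority = np.random.default_rng(1).normal(size=(20, 5))  # toy minority class
synthetic = smote_like(minority, n_new=30)
print(synthetic.shape)
```

Because every synthetic sample lies on a segment between two real minority samples, the synthetic data stay within the range spanned by the observed minority class.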

The approximations and simplifications of our in silico study do not, however, capture the following: (i) Slow wave abnormalities may be intermittent [40]. To account for this, one could employ state space models (e.g. hidden Markov models) or their analogues in deep neural networks (e.g. LSTM) to develop probabilistic descriptions of the state across time. (ii) This model does not simulate artifacts arising from movement and/or abdominal muscle contractions that are typically present in human electrical recordings. To address this, our group has recently developed novel artifact rejection methodology that leverages the unique oscillatory nature of the slow wave and preserves the signal even in time windows for which artifact is present [40]. One may employ these methods on the real HR-EGG data recordings to increase their similarity to the simulation datasets. (iii) The stomach geometry may dynamically change, as compared to our assumption of it being static throughout the recording. In practice, the stomach geometry expands and contracts primarily during and after a meal. Notably, this modulation is found mostly in the fundus, a region with lowest electrical activity, and thus not a region of interest in myoelectric efforts. (iv) This model places the elliptical abdomen concentrically with the stomach, which is not true to the patient’s anatomy in most cases. However, our findings pertaining to classification with respect to horizontal and vertical displacement suggest that any trends in classification accuracy are a function of abnormal initiation location, not anatomy-specific locations. (v) The forward model used in our simulation does not model all the ionic conductances underlying the generation of the slow wave. A more realistic forward model that captures normal and abnormal function has recently been established [84] and could be used as further in silico validation of our presented approach.

Though this study and others have shown high classification accuracy and robustness, there remains the challenge of low clinical acceptance of deep learning due to its lack of feature interpretability. To address this, rather than inputting the original spatiotemporal voltage time series, ‘animated flashlight sequences’ [85] may be used as a pre-processing step for which, subsequently, the aforementioned CNN methodology may be applied. This may be particularly attractive for a human-computer interface setting, since the input to the algorithm is analogous to the input that a human would use to identify abnormalities. Even if this slightly sacrifices performance, it could lead to improved interpretability and an increased chance of adoption as a screening tool by clinicians.

In cardiology, identification of patients with sick sinus syndrome, where the cardiac pacemaker cells are not initiating normally, prompts placement of a potentially life-saving pacemaker device. Indeed, a whole field called cardiac electrophysiology has developed around management of cardiac pacemaker cell arrhythmias. While these concepts are still in a nascent phase in the gastrointestinal tract, with initial efforts showing promise [24], [25], [26], [86], the work here with machine learning shows that gastric pacemaker initiation problems can be readily identified with a high accuracy. Advancement towards improved detection of the underlying stomach pathophysiology from which foregut symptoms arise will lead to improved and more etiology-specific treatment.

V. CONCLUSION

To summarize, this in silico study demonstrates the efficacy of using machine learning to classify normal and abnormal slow wave activity from EGG data. This technique is particularly relevant because many foregut GI disorders can masquerade as one another when relying on symptoms alone. A recent finding indicates that with imaging-guided placement of multi-electrode arrays, slow wave spatial electrical patterns become associated with disease and symptom severity [41]. The robustness of the CNN approach shown here provides preliminary evidence that even in the absence of image guidance, and across a wide range of BMIs and SNRs, such associations may still be established. Furthermore, the approach we have developed has an ability to discern, at an individual level and with high accuracy, an abnormally functioning slow wave from one of normalcy. Apart from simply being a proof of concept, the network architecture and learned weights have the potential, through transfer learning, to reduce the amount of training data required for high performance in human studies. Altogether, these findings suggest that multi-electrode cutaneous abdominal recordings, combined with modern machine learning techniques, have the potential to address unmet needs and possibly serve as widely deployable screening tools in gastroenterology.

VI. ACKNOWLEDGMENTS

We would like to acknowledge Kevin King, MD, PhD, for his advisory contributions to the paper as an expert in cardiology.

REFERENCES

[1] H. El-Serag and N. Talley, “The prevalence and clinical course offunctional dyspepsia,” Alimentary pharmacology & therapeutics, vol. 19,no. 6, pp. 643–654, 2004.

[2] H.-K. Jung et al., “The incidence, prevalence, and outcomes of patientswith gastroparesis in olmsted county, minnesota, from 1996 to 2006,”Gastroenterology, vol. 136, no. 4, pp. 1225–1233, 2009.

[3] A. Gikas and J. K. Triantafillidis, “The role of primary care physiciansin early diagnosis and treatment of chronic gastrointestinal diseases,”International Journal of General Medicine, vol. 7, p. 159, 2014.

[4] Z. S. Heetun and E. M. Quigley, “Gastroparesis and Parkinsons disease:a systematic review,” Parkinsonism and Related Disorders, vol. 18, no. 5,pp. 433–440, 2012.

[5] M. Horowitz et al., “Gastroparesis: prevalence, clinical significance and treatment,” Canadian Journal of Gastroenterology and Hepatology, vol. 15, no. 12, pp. 805–813, 2001.

[6] P. J. Pasricha et al., “Characteristics of patients with chronic unexplained nausea and vomiting and normal gastric emptying,” Clinical Gastroenterology and Hepatology, vol. 9, no. 7, pp. 567–576, 2011.

[7] P. Janssen et al., “The relation between symptom improvement and gastric emptying in the treatment of diabetic and idiopathic gastroparesis,” The American Journal of Gastroenterology, vol. 108, no. 9, p. 1382, 2013.

[8] R. Corinaldesi et al., “Effect of chronic administration of cisapride on gastric emptying of a solid meal and on dyspeptic symptoms in patients with idiopathic gastroparesis,” Gut, vol. 28, no. 3, pp. 300–305, 1987.

[9] R. McCallum, O. Cynshi, and I. Team, “Clinical trial: effect of mitemcinal (a motilin agonist) on gastric emptying in patients with gastroparesis–a randomized, multicentre, placebo-controlled study,” Alimentary Pharmacology & Therapeutics, vol. 26, no. 8, pp. 1121–1130, 2007.

[10] M. E. Barton et al., “A randomized, double-blind, placebo-controlled phase II study (MOT114479) to evaluate the safety and efficacy and dose response of 28 days of orally administered camicinal, a motilin receptor agonist, in diabetics with gastroparesis,” Gastroenterology, vol. 146, no. 5, p. S-20, 2014.

[11] J. Langworthy, H. P. Parkman, and R. Schey, “Emerging strategies for the treatment of gastroparesis,” Expert Review of Gastroenterology and Hepatology, vol. 10, no. 7, pp. 817–825, 2016.

[12] E. J. Irvine et al., “Design of treatment trials for functional gastrointestinal disorders,” Gastroenterology, vol. 150, no. 6, pp. 1469–1480, 2016.

[13] K. L. Koch and R. M. Stern, Handbook of Electrogastrography. Oxford University Press, 2004.

[14] G. O’Grady et al., “Origin and propagation of human gastric slow-wave activity defined by high-resolution mapping,” American Journal of Physiology-Gastrointestinal and Liver Physiology, vol. 299, no. 3, pp. G585–G592, 2010.

[15] G. O’Grady et al., “Abnormal initiation and conduction of slow-wave activity in gastroparesis, defined by high-resolution electrical mapping,” Gastroenterology, vol. 143, no. 3, pp. 589–598, 2012.

[16] Z. Lin et al., “Gastric myoelectrical activity and gastric emptying in patients with functional dyspepsia,” The American Journal of Gastroenterology, vol. 94, no. 9, p. 2384, 1999.

[17] J. Chen et al., “Abnormal gastric myoelectrical activity and delayed gastric emptying in patients with symptoms suggestive of gastroparesis,” Digestive Diseases and Sciences, vol. 41, no. 8, pp. 1538–1545, 1996.

[18] M. Bortolotti et al., “Gastric myoelectric activity in patients with chronic idiopathic gastroparesis,” Neurogastroenterology and Motility, vol. 2, no. 2, pp. 104–108, 1990.

[19] W. Sha, P. J. Pasricha, and J. D. Chen, “Rhythmic and spatial abnormalities of gastric slow waves in patients with functional dyspepsia,” Journal of Clinical Gastroenterology, vol. 43, no. 2, pp. 123–129, 2009.

[20] A. Leahy et al., “Abnormalities of the electrogastrogram in functional gastrointestinal disorders,” The American Journal of Gastroenterology, vol. 94, no. 4, p. 1023, 1999.

[21] G. O’Grady et al., “Recent progress in gastric arrhythmia: pathophysiology, clinical significance and future horizons,” Clinical and Experimental Pharmacology and Physiology, vol. 41, no. 10, pp. 854–862, 2014.

[22] C. Ouyang, “Clinical significance of gastric dysrhythmias,” World Journal of Gastroenterology, vol. 2, no. Suppl 1, pp. 5–6, 1996.

[23] A. C. Nanivadekar et al., “Machine learning prediction of emesis and gastrointestinal state in ferrets,” bioRxiv, p. 607242, 2019.

[24] T. L. Abell et al., “Gastric electrical stimulation in intractable symptomatic gastroparesis,” Digestion, vol. 66, no. 4, pp. 204–212, 2002.

[25] G. O’Grady et al., “High-resolution entrainment mapping of gastric pacing: a new analytical tool,” American Journal of Physiology-Gastrointestinal and Liver Physiology, vol. 298, no. 2, pp. G314–G321, 2009.

[26] S. Alighaleh et al., “Real-time evaluation of a novel gastric pacing device with high-resolution mapping,” Gastroenterology, vol. 154, no. 6, p. S-39, 2018.

[27] S. C. Payne, J. B. Furness, and M. J. Stebbing, “Bioelectric neuromodulation for gastrointestinal disorders: effectiveness and mechanisms,” Nature Reviews Gastroenterology & Hepatology, p. 1, 2018.

[28] W. Sha, P. J. Pasricha, and J. D. Chen, “Correlations among electrogastrogram, gastric dysmotility, and duodenal dysmotility in patients with functional dyspepsia,” Journal of Clinical Gastroenterology, vol. 43, no. 8, pp. 716–722, 2009.

[29] H. P. Parkman et al., “Electrogastrography and gastric emptying scintigraphy are complementary for assessment of dyspepsia,” Journal of Clinical Gastroenterology, vol. 24, no. 4, pp. 214–219, 1997.

[30] K. L. Koch et al., “Gastric emptying and gastric myoelectrical activity in patients with diabetic gastroparesis: effect of long-term domperidone treatment,” American Journal of Gastroenterology, vol. 84, no. 9, 1989.

[31] P. Holmvall and G. Lindberg, “Electrogastrography before and after a high-caloric, liquid test meal in healthy volunteers and patients with severe functional dyspepsia,” Scandinavian Journal of Gastroenterology, vol. 37, no. 10, pp. 1144–1148, 2002.

[32] A. Oba-Kuniyoshi et al., “Postprandial symptoms in dysmotility-like functional dyspepsia are not related to disturbances of gastric myoelectrical activity,” Brazilian Journal of Medical and Biological Research, vol. 37, no. 1, pp. 47–53, 2004.

[33] S. Abid and G. Lindberg, “Electrogastrography: poor correlation with antro-duodenal manometry and doubtful clinical usefulness in adults,” World Journal of Gastroenterology, vol. 13, no. 38, p. 5101, 2007.

[34] H. Parkman et al., “Electrogastrography: a document prepared by the gastric section of the American Motility Society Clinical GI Motility Testing Task Force,” Neurogastroenterology and Motility, vol. 15, no. 2, pp. 89–102, 2003.

[35] T. R. Angeli et al., “Loss of interstitial cells of Cajal and patterns of gastric dysrhythmia in patients with chronic unexplained nausea and vomiting,” Gastroenterology, vol. 149, no. 1, pp. 56–66, 2015.

[36] R. Berry et al., “Patterns of abnormal gastric pacemaking after sleeve gastrectomy defined by laparoscopic high-resolution electrical mapping,” Obesity Surgery, vol. 27, no. 8, pp. 1929–1937, 2017.

[37] P. Du et al., “Progress in mathematical modeling of gastrointestinal slow wave abnormalities,” Frontiers in Physiology, vol. 8, p. 1136, 2018.

[38] G. O’Grady et al., “Methods for high-resolution electrical mapping in the gastrointestinal tract,” IEEE Reviews in Biomedical Engineering, vol. 12, pp. 287–302, 2019.

[39] A. A. Gharibans et al., “High-resolution electrogastrogram: a novel, noninvasive method for determining gastric slow-wave direction and speed,” IEEE Transactions on Biomedical Engineering, vol. 64, no. 4, pp. 807–815, 2017.

[40] A. A. Gharibans et al., “Artifact rejection methodology enables continuous, noninvasive measurement of gastric myoelectric activity in ambulatory subjects,” Scientific Reports, vol. 8, no. 1, p. 5019, 2018.

[41] A. A. Gharibans et al., “Spatial patterns from high-resolution electrogastrography correlate with severity of symptoms in patients with functional dyspepsia and gastroparesis,” Clinical Gastroenterology and Hepatology, 2019.

[42] S. Zahid et al., “Patient-derived models link re-entrant driver localization in atrial fibrillation to fibrosis spatial pattern,” Cardiovascular Research, vol. 110, no. 3, pp. 443–454, 2016.

[43] C. T. Villongco et al., “Non-invasive, model-based measures of ventricular electrical dyssynchrony for predicting CRT outcomes,” EP Europace, vol. 18, no. suppl 4, pp. iv104–iv112, 2016.

[44] G. Ho et al., “Rotors exhibit greater surface ECG variation during ventricular fibrillation than focal sources due to wavebreak, secondary rotors, and meander,” Journal of Cardiovascular Electrophysiology, vol. 28, no. 10, pp. 1158–1166, 2017.

[45] U. R. Acharya et al., “Automated detection of arrhythmias using different intervals of tachycardia ECG segments with convolutional neural network,” Information Sciences, vol. 405, pp. 81–90, 2017.

[46] B. Pourbabaee, M. J. Roshtkhari, and K. Khorasani, “Deep convolutional neural networks and learning ECG features for screening paroxysmal atrial fibrillation patients,” IEEE Transactions on Systems, Man, and Cybernetics: Systems, no. 99, pp. 1–10, 2017.

[47] U. R. Acharya et al., “A deep convolutional neural network model to classify heartbeats,” Computers in Biology and Medicine, vol. 89, pp. 389–396, 2017.

[48] S. P. Shashikumar et al., “A deep learning approach to monitoring and detecting atrial fibrillation using wearable technology,” in Biomedical and Health Informatics (BHI), 2017 IEEE EMBS International Conference on. IEEE, 2017, pp. 141–144.

[49] P. Rajpurkar et al., “Cardiologist-level arrhythmia detection with convolutional neural networks,” arXiv preprint arXiv:1707.01836, 2017.

[50] O. Faust et al., “Deep learning for healthcare applications based on physiological signals: a review,” Computer Methods and Programs in Biomedicine, 2018.

[51] U. R. Acharya et al., “Deep convolutional neural network for the automated detection and diagnosis of seizure using EEG signals,” Computers in Biology and Medicine, vol. 100, pp. 270–278, 2018.

[52] Y. Ren and Y. Wu, “Convolutional deep belief networks for feature extraction of EEG signal,” in Neural Networks (IJCNN), 2014 International Joint Conference on. IEEE, 2014, pp. 2850–2853.

[53] P. W. Mirowski et al., “Comparing SVM and convolutional networks for epileptic seizure prediction from intracranial EEG,” in Machine Learning for Signal Processing, 2008. MLSP 2008. IEEE Workshop on. IEEE, 2008, pp. 244–249.

[54] U. C. Allard et al., “A convolutional neural network for robotic arm guidance using SEMG based frequency-features,” in Intelligent Robots and Systems (IROS), 2016 IEEE/RSJ International Conference on. IEEE, 2016, pp. 2464–2470.

[55] P. Xia, J. Hu, and Y. Peng, “EMG-Based estimation of limb movement using deep learning with recurrent convolutional neural networks,” Artificial Organs, vol. 42, no. 5, pp. E67–E77, 2018.

[56] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet classification with deep convolutional neural networks,” in Advances in Neural Information Processing Systems, 2012, pp. 1097–1105.

[57] C. Szegedy et al., “Going deeper with convolutions,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2015.

[58] D. Ciresan, U. Meier, and J. Schmidhuber, “Multi-column deep neural networks for image classification,” arXiv, 2012.

[59] K. Simonyan and A. Zisserman, “Very deep convolutional networks for large-scale image recognition,” arXiv, 2014.

[60] S. Ji et al., “3D convolutional neural networks for human action recognition,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 35, no. 1, pp. 221–231, 2013.

[61] D. Tran et al., “Learning spatiotemporal features with 3D convolutional networks,” in Proceedings of the IEEE International Conference on Computer Vision, 2015, pp. 4489–4497.

[62] D. Maturana and S. Scherer, “VoxNet: A 3D convolutional neural network for real-time object recognition,” in Intelligent Robots and Systems (IROS), 2015 IEEE/RSJ International Conference on. IEEE, 2015, pp. 922–928.

[63] F. S. Nooruddin and G. Turk, “Simplification and repair of polygonal models using volumetric techniques,” IEEE Transactions on Visualization and Computer Graphics, vol. 9, no. 2, pp. 191–205, 2003.

[64] P. Min, “binvox,” http://www.patrickmin.com/binvox.

[65] K. Palagyi and A. Kuba, “Directional 3D thinning using 8 subiterations,” in International Conference on Discrete Geometry for Computer Imagery. Springer, 1999, pp. 325–336.

[66] P. Min, “thinvox,” http://www.patrickmin.com/thinvox.

[67] R. Plonsey, “Bioelectric phenomena,” Wiley Encyclopedia of Electrical and Electronics Engineering, 2001.

[68] K. Roeleveld et al., “Volume conduction models for surface EMG; confrontation with measurements,” Journal of Electromyography and Kinesiology, vol. 7, no. 4, pp. 221–232, 1997.

[69] B. J. Roth and F. L. Gielen, “A comparison of two models for calculating the electrical potential in skeletal muscle,” Annals of Biomedical Engineering, vol. 15, no. 6, pp. 591–602, 1987.

[70] T. Gootzen et al., “A comprehensive model description for motor unit action potentials,” Electroenceph Clin Neurophysiol, vol. 66, p. S40, 1987.

[71] N. Srivastava et al., “Dropout: a simple way to prevent neural networks from overfitting,” The Journal of Machine Learning Research, vol. 15, no. 1, pp. 1929–1958, 2014.

[72] M. Abadi et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015, software available from tensorflow.org. [Online]. Available: http://tensorflow.org/

[73] F. Chollet et al., “Keras,” https://keras.io, 2015.

[74] L. Bradshaw et al., “Diabetic gastroparesis alters the biomagnetic signature of the gastric slow wave,” Neurogastroenterology and Motility, vol. 28, no. 6, pp. 837–848, 2016.

[75] N. Paskaranandavadivel et al., “Multi-day, multi-sensor ambulatory monitoring of gastric electrical activity,” Physiological Measurement, vol. 40, no. 2, p. 025011, 2019.

[76] A. Shine et al., “Electrogastrography at baseline and response to temporary gastric electrical stimulation – a comparison of cutaneous with mucosal recordings,” Digestive Disease Week, 2019.

[77] A. Sharif Razavian et al., “CNN features off-the-shelf: an astounding baseline for recognition,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2014, pp. 806–813.

[78] J. Donahue et al., “DeCAF: A deep convolutional activation feature for generic visual recognition,” in International Conference on Machine Learning, 2014, pp. 647–655.

[79] J. Yosinski et al., “How transferable are features in deep neural networks?” in Advances in Neural Information Processing Systems, 2014, pp. 3320–3328.

[80] S. Hussein et al., “Risk stratification of lung nodules using 3D CNN-based multi-task learning,” in Information Processing in Medical Imaging. Cham: Springer International Publishing, 2017, pp. 249–260.

[81] P. Molchanov et al., “Online detection and classification of dynamic hand gestures with recurrent 3D convolutional neural network,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016, pp. 4207–4215.

[82] L. A. Alexandre, “3D object recognition using convolutional neural networks with transfer learning between input channels,” in Intelligent Autonomous Systems 13, E. Menegatti, N. Michael, K. Berns, and H. Yamaguchi, Eds. Cham: Springer International Publishing, 2016, pp. 889–898.

[83] H. Salehinejad et al., “Generalization of deep neural networks for chest pathology classification in x-rays using generative adversarial networks,” in 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). IEEE, 2018, pp. 990–994.

[84] S. Calder et al., “Torso-tank validation of high-resolution electrogastrography (EGG): Forward modelling, methodology and results,” Annals of Biomedical Engineering, pp. 1–11, 2018.

[85] S. Calder et al., “Wave front tracking and velocity profiling of EGG signatures,” in 2018 International Gastrointestinal Electrophysiology Society Annual Meeting on. iGES, 2018.

[86] Y.-K. Lo et al., “A wireless implant for gastrointestinal motility disorders,” Micromachines, vol. 9, no. 1, p. 17, 2018.
