modelling gambiense human african trypanosomiasis ......may 26, 2021  · disease is gambiense human...

14
Modelling gambiense human African trypanosomiasis infection in villages of the Democratic Republic of Congo using Kolmogorov forward equations Christopher N. Davis 1,2,* , Matt J. Keeling 1,2,3 , Kat S. Rock 1,2 * Corresponding author: [email protected] 1 Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK 2 Zeeman Institute (SBIDER), University of Warwick, Coventry, CV4 7AL, UK 3 School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK Abstract Stochastic methods for modelling disease dynamics enables the direct computation of the probability of elim- ination of transmission (EOT). For the low-prevalence disease of human African trypanosomiasis (gHAT), we develop a new mechanistic model for gHAT infection that determines the full probability distribution of the gHAT infection using Kolmogorov forward equations. The methodology allows the analytical investigation of the probabilities of gHAT elimination in the spatially-connected villages of the Kwamouth and Mosango health zones of the Democratic Republic of Congo, and captures the uncertainty using exact methods. We predict that, if current active and passive screening continue at current levels, local elimination of infection will occur in 2029 for Mosango and after 2040 in Kwamouth, respectively. Our method provides a more realistic approach to scaling the probability of elimination of infection between single villages and much larger regions, and provides results comparable to established models without the requirement of detailed infection structure. The novel flexibility allows the interventions in the model to be implemented specific to each village, and this introduces the framework to consider the possible future strategies of test-and-treat or direct treatment of individuals living in villages where cases have been found, using a new drug. Keywords: Mathematical Model, Kolmogorov Forward Equations, African Trypanosomiasis, African Sleep- ing Sickness Background In mathematical epidemiology, a growing number of models employ the use of stochastic events to describe infection dynamics. These stochastic methodologies, such as the Gillespie or tau-leaping algorithm, typically use a large number of event-driven stochastic simulations to estimate a distribution of possible behaviours for an epidemic [1]. A central benefit of stochastic models is that the random nature of each realisation means a larger range of outcomes is captured than simply the equilibrium dynamics predicted by a deterministic model [2]. These stochastic methods are also integer-based and so the exact number of people infected at a given time is monitored, including when infections reach zero; a deterministic model will never reach zero infections and so a threshold needs to be applied to reach the elimination [3–5]. Thus, a model in a stochastic framework, will have different expected behaviour and be better suited than a deterministic variant when close the the elimination threshold, either due to low infection numbers or small populations [6]. However, event-driven stochastic methods require a large number of realisations of the epidemic process to be generated to be confident that the full distribution of events has been captured; even then this is still only an approximation of the true probability distribution of the potential trajectories for the infection dynamics in time. This is particularly important if there are any rare events of the system — something 1 . CC-BY 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review) The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532 doi: medRxiv preprint NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Upload: others

Post on 18-Aug-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Modelling gambiense human African trypanosomiasis infection in

villages of the Democratic Republic of Congo using Kolmogorov

forward equations

Christopher N. Davis1,2,*, Matt J. Keeling1,2,3, Kat S. Rock1,2

* Corresponding author: [email protected] Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK2 Zeeman Institute (SBIDER), University of Warwick, Coventry, CV4 7AL, UK3 School of Life Sciences, University of Warwick, Coventry, CV4 7AL, UK

Abstract

Stochastic methods for modelling disease dynamics enables the direct computation of the probability of elim-ination of transmission (EOT). For the low-prevalence disease of human African trypanosomiasis (gHAT), wedevelop a new mechanistic model for gHAT infection that determines the full probability distribution of thegHAT infection using Kolmogorov forward equations. The methodology allows the analytical investigationof the probabilities of gHAT elimination in the spatially-connected villages of the Kwamouth and Mosangohealth zones of the Democratic Republic of Congo, and captures the uncertainty using exact methods. Wepredict that, if current active and passive screening continue at current levels, local elimination of infectionwill occur in 2029 for Mosango and after 2040 in Kwamouth, respectively. Our method provides a morerealistic approach to scaling the probability of elimination of infection between single villages and muchlarger regions, and provides results comparable to established models without the requirement of detailedinfection structure. The novel flexibility allows the interventions in the model to be implemented specific toeach village, and this introduces the framework to consider the possible future strategies of test-and-treat ordirect treatment of individuals living in villages where cases have been found, using a new drug.

Keywords: Mathematical Model, Kolmogorov Forward Equations, African Trypanosomiasis, African Sleep-ing Sickness

Background

In mathematical epidemiology, a growing number of models employ the use of stochastic events to describeinfection dynamics. These stochastic methodologies, such as the Gillespie or tau-leaping algorithm, typicallyuse a large number of event-driven stochastic simulations to estimate a distribution of possible behaviours foran epidemic [1]. A central benefit of stochastic models is that the random nature of each realisation meansa larger range of outcomes is captured than simply the equilibrium dynamics predicted by a deterministicmodel [2]. These stochastic methods are also integer-based and so the exact number of people infected ata given time is monitored, including when infections reach zero; a deterministic model will never reach zeroinfections and so a threshold needs to be applied to reach the elimination [3–5]. Thus, a model in a stochasticframework, will have different expected behaviour and be better suited than a deterministic variant whenclose the the elimination threshold, either due to low infection numbers or small populations [6].

However, event-driven stochastic methods require a large number of realisations of the epidemic processto be generated to be confident that the full distribution of events has been captured; even then this isstill only an approximation of the true probability distribution of the potential trajectories for the infectiondynamics in time. This is particularly important if there are any rare events of the system — something

1

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

NOTE: This preprint reports new research that has not been certified by peer review and should not be used to guide clinical practice.

Page 2: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

that requires more realisations to determine the true frequency at which they occur [6]. Alternative to theseevent-driven stochastic simulations, the Kolmogorov forward equations provide a method incorporating thestochastic behaviour in a set of ordinary differential equations (ODEs), which fully determine the probabilitydistribution of the epidemic over all possible infection states for the population [6]. This approach is simpleto formulate, as many systems are linear in terms of the probability of being in each infection state, and socan be written in a matrix formulation; other methods exist for when this is not the case [7]. The system isthus easy to solve using standard methods and provides a complete description of the dynamics. It is alsomuch faster to solve than repeatedly generating realisations of the stochastic process. The solutions are alsonumerically exact and an array of derived quantities can be directly calculated.

Kolmogorov forward equations have been utilised in several epidemiological contexts [8–10] and discussedas a powerful tool, but are not widely used to constraints on computer memory [11]. Every possible infectionstate must be explicitly tracked and so if a population is large (and particularly if each individual can be in anyof a large number of infection states), the number of these states quickly becomes prohibitive to the method,as the number of required computations becomes infeasible. Simulation methods take a lot of computationtime; Kolmogorov forward equation methods take a lot of memory storage capacity [12]. Therefore, aKolmogorov forward equation approach to modelling infections dynamics would be most applicable for adisease with a relatively small number of cases that has the potential to achieve elimination [13]. One suchdisease is gambiense human African trypanosomiasis (gHAT).

Infection with gHAT is a caused by a parasite Trypanosoma brucei gambiense and is transmitted bytsetse across Central and West Africa, with the majority of infection occurring in the Democratic Republicof Congo (DRC). This infection has been traditionally controlled with active and passive screening, withsubsequent treatment of infection individuals and has been targeted for elimination of transmission (EOT) by2030 by the World Health Organization (WHO). There have been a substantial number of recent modellingstudies that use compartmental models for gHAT, with these studies typically using deterministic systems ofordinary differential equations (ODEs) [14–18], with more recent studies also considering stochastic event-driven approaches [19–22].

Here, we have developed new model for gHAT infection in spatial connectivity of villages within healthzones of DRC. We use a lower-dimensional state space for infection than considered in the majority of theliterature, which hence allows for Kolmogorov forward equations to be used to define the dynamics. Thisformulation fully captures the possible behaviour of the infection in a stochastic framework, while exhibitingthe advantages of a providing the full and exact distribution of the infection states. We investigate theprobabilities of gHAT persistence or extinction and compute expected times until elimination of infection,which, despite the lower-dimensional infection structure, provides comparable results to more commonlyused methods.

Furthermore, the Kolmogorov forward equation model provides the necessary formulation to explorethe interactions between a large number of villages. Village-specific simulations can mimic the real-worldinterventions observed at the level at which they occur, with the results then realistically scaled up to largerregions, where elimination of infection can be considered at more meaningful geographic scales [23]. In thecontext of gHAT, the method provides a link between individual village [19] and health zone [18] modelling,while including the stochastic properties required to directly simulate elimination of infection, we can assesspotential strategies and progress towards achieving elimination goals.

Methods

Kolmogorov forward equations

We construct our Kolmogorov forward equation model by adapting the structure of the suite of gHAT modelsfirst presented in Rock et al. [14] and updated in Crump et al. [17]. Unlike these previous models, the modelpresented in this manuscript contains just two infection states for a person (susceptible and infected) andtwo types of people (low- and high-risk of exposure to tsetse bites), and does not explicitly model the numberof infected tsetse, the biological vector of the disease.

We derive the new model equations by replacing the tsetse dynamics of previous models with the quasi-equilibrium solution, in order to limit the number of model compartments. For the case gHAT, this assump-tion is justified by the short life-expectancy of the vector (tsetse) [24] and the long timescales of the infection

2

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 3: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Figure 1: Model diagrams for the two gHAT models. The left panel shows the full higher-dimensional ODEmodel from Crump et al. [17] that includes tsetse dynamics. The right panel shows the lower-dimensionalspatially-connected Kolmogorov forward equation model presented in this manuscript. In both panels thesolid black lines shows the rate of movement between compartments and dashed black lines shows the theresult of active screening and treatment of infected individuals. The dashed grey paths show interactionbetween humans and tsetse or between all villages, respectively in each panel.

in humans [25]. The number of infection compartments for humans are also reduced from five (susceptible,exposed, infected Stage 1, infected Stage 2, and hospitalised) to two, whereby exposed, infected Stage 1 andinfected Stage 2 are now included as a single infected state, IH , and the former susceptible and hospitalisedcompartments are now given as all susceptible SH (Figure 1). The total human population size is constantand denoted as NH , with a small natural mortality rate of people, µH , replaced by new susceptible indi-viduals. We retain a risk structure whereby a small minority of the population is high-risk, with a higherexposure to biting tsetse and failure to attend active screening.

This leaves four possible infection states for any person, and therefore the lower-dimensional modelstructure, but for an ODE framework is given by Equations 1 and 2.

dSHi

dt= (µH + ψH(Y )) IHi − ζi(IH1, IH2)SHi, (1)

dIHi

dt= ζi(IH1, IH2)SHi − (µH + ψH(Y )) IHi, (2)

for i = 1, 2, for the low- and high-risk group respectively. The natural human mortality rate, µH , determinesthe total human birth rate, µHNH , such that a constant population size is maintained. The recoveryrate, ψH(Y ), is dependent on time due to increased detection rates in at later times and is derived froma combination of parameters from Crump et al. [17], detailed in the supplementary material. The force ofinfection is given by ζi(IH1, IH2), which is a function of the risk class and the number of people infected in eachrisk class, further depending on the quasi-equilibrium solution for the tsetse (Figure 1). See supplementaryinformation a full explanation of the derivation for these model parameters.

To translate this re-formulated model (Equations 1 and 2) into the Kolmogorov forward equations, wefirst consider infection state of the population. Since we assume constant population sizes NH1 and NH2 ineach risk group and SHi + IHi = NHi for i = 1, 2, the infection state is fully determined by the number ofinfected low- and high-risk people IH1 and IH2. Thus, we define the probability of being in a given stateat time t as PIH1,IH2

(t). From this population infection state, there are four possible transitions to differentstates: a low-risk human can recover or die, a high-risk human can recover or die, a low-risk human can get

3

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 4: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

infected, or a high-risk human can get infected (since we assume a constant population size, death is followedby immediate replacement in the susceptible class and so we do not consider this separately to recovery).Therefore, the Kolmogorov forward equations are given by:

dPIH1,IH2(t)

dt= −PIH1,IH2

(t)(ζ1 (IH1, IH2, t) (NH1 − IH1) + ζ2 (IH1, IH2, t) (NH2 − IH2)

+ (µH + ψH(Y )) (IH1 + IH2))

+ PIH1−1,IH2(t)ζ1(IH1 − 1, IH2, t) (NH1 − IH1 + 1)

+ PIH1,IH2−1(t)ζ2(IH1, IH2 − 1, t) (NH2 − IH2 + 1)

+ PIH1+1,IH2(t) (µH + ψH(Y )) (IH1 + 1)

+ PIH1,IH2+1(t) (µH + ψH(Y )) (IH2 + 1) , (3)

where IH1 = 0, ...,MH1 and IH2 = 0, ...,MH2. The values MH1 and MH2 are the maximum number ofpeople possible to be infected in each risk group.

In theory, the MH1 = NH1 and MH2 = NH2, however because in practice gHAT is a low-prevalenceinfection [26], we can reduce the state space of the model and impose a lower maximum threshold, whileretaining high model accuracy. We ensure that the probability of exceeding the threshold is very small(less than 1 × 10−8), with the values of MHi dependent on NHi, and explicitly given in the supplementarymaterial.

Thus, the Kolmogorov forward equations (Equation 3) comprise of a system of (MH1 + 1)(MH2 + 1)ODEs (since the low-risk infected population can take any value between 0 and NH1 and similarly for thehigh-risk). The Kolmogorov forward equations are a linear system, and hence we simplify the notation bywriting the equations in matrix form. By defining the probability vector:

p(t) = (P0,0(t), P1,0(t), ..., PNH1,0(t), P0,1(t), ..., PNH1,NH2(t)), (4)

we obtain the Kolmogorov forward equations in matrix form as:

dp(t)

dt= p(t)Q(t), (5)

where Q(t) is the rate matrix of all transition rates at time t. We subsequently give the time t in both theyear Y and number of days into that year d and hence, p(t) = p(Y, y). However, we note that Q(t) = Q(Y )because we assume, as per Crump et al. [17], that the change in the rate matrix is due to an increase in therate of passive detection of infection due to improvements in the passive surveillance system, which occursannually.

Since the equations of Crump et al. [17] consider large populations of roughly 100,000 people, ratherthan much smaller villages, we additionally include a rate of importation of infection into a village, due tomovement of people between villages, for which we use a value derived in Davis et al. [19] and denote byδ(Y ). The event of an external importation increases the number of infected people in either risk groupby one and adds terms to Equation 3 representing a change of state from IH1 low-risk infected and IH2

high-risk by increasing from IH1 − 1 to IH1, increasing from IH2 − 1 to IH2, and increasing from IH1 andIH2 to IH1 + 1 or IH2 + 1 respectively:

+ PIH1−1,IH2(t)δ(Y ) (NH1 − IH1 + 1)

+ PIH1,IH2−1(t)δ(Y ) (NH2 − IH2 + 1)

− PIH1,IH2(t)δ(Y ) ((NH1 − IH1) + (NH2 − IH2)) . (6)

In matrix notation, we include these new terms in matrix QE(Y ), whereby the rate matrix is given byQ(Y ) = (QV (Y ) + QE(Y )), for the village and external terms respectively.

Additionally, active screening, the process whereby a large number people in a village are targeted to bescreened for the disease and then treated if infected, is modelled as a the multiplication of the probabilityvector p(Y,y) by a lower-triangular transition matrix A(Y ). In line with previous modelling studies, weassume that only the low-risk class are affected by this discrete-time event that occurs annually at thebeginning of each year, whereas the high-risk class do not attend active screening events. The active screening

4

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 5: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

matrix A(Y ) is calculated, for a given screening coverage (which can change each year), by the use of ahypergeometric distribution to determine the number of infected people screened, followed by a binomialdistribution to find the number of infections detected due to the imperfect sensitivity of the test (full detailsare in the supplementary material).

We assume that active screening began in 1998 [27], and the system was previously at endemic equilibriumgiven there had been no screening for several years previously. Therefore, we derive the full distribution ofinfection states for a population in day d of year Y as

p (Y, d) = p (1997, 365)

(Y−1∏

i=1998

A (i) exp (−365 (QV (i) + QE (i)))

)A (Y ) exp (− (QV (Y ) + QE(Y )) y) ,

(7)for Y ≥ 1998 and 0 ≤ d ≤ 365. Active screening in year Y is modelled as occurring at the start of the year(d = 0), and hence the probability state just before active screening in year Y is given by p (Y − 1, 365). Wecalculate p (1997, 365) by finding quasi-stationary equilibrium, finding the eigenvector corresponding to thelargest eigenvalue of the rate matrix, (QV (1997) + QE (1997)).

Parameter values

The parameters are taken from Crump et al. [17] and transformed into the new parameters of the Kolmogorovforward equations (see supplementary material). The values of the original parameters are either fixed wherewell-defined in the literature, and specific to DRC, or as the median estimate of the posterior from fittingto screening and incidence data in the DRC health zone of Kwamouth, using a Metropolis–Hastings MCMCalgorithm. The data which span 2000–2016 came from the WHO HAT Atlas [26].

The health zone of Kwamouth is used to parameterise the model since it is a comparatively high-incidenceregion of the DRC, the country with most cases [28] and pre-2018 (the study period with available data)had no large-scale vector control [29], which would affect the infection dynamics. We additionally presentsimilar results for the low-incidence health zone of Mosango when comparing health zone (Figure 5) and inthe supplementary material. We note that the methodology presented here could alternatively be applied toany other health zone or region.

Results

Infection dynamics of a single village

Solving the Kolmogorov forward equations, using the matrix exponential in the form presented in Equation7, we can obtain projections of how the infection dynamics will change in time in full probabilistic form. Asan illustrative example, we calculate the distribution of infection up to the year 2030 for a village of 1,000people in the Kwamouth health zone, assuming that within the village there is an active screening coverageof 50% every year (Figure 2a). We are assuming a small rate of infectious importations from movement ofpeople that decreases with time, δ(Y ) = (3.4 × 10−6) exp(−0.1071(Y − 2000)) days−1, and that the villagestarts at endemic equilibrium conditions in 1998, calculated as the steady state of the Kolmogorov forwardequations.

The results show that the introduction of active screening drives down the expected number of infectedpeople from the steady-state distribution in 1998. By 2030, there is a probability of 0.90 that there areno infected people in this example village. The steady-state is concentrated on negligible infection levelsassuming the active screening is maintained. The full probability distribution does, however, show that forthis individual village there is still some probability the infections will not fall as quickly, or indeed remainconstant or increase.

These results are dependent on the import rate between villages. Without the importation rate, there isan absorbing disease-free population state dynamics, which the population would eventually move towards(a steady-state of no infection). This is because without the importation rate, if infection levels reach zerothere is no person to re-introduce infection the population (albeit through tsetse in practice).

Conversely to an endemic village, we consider a hypothetical village in Kwamouth that was infection-freein 1998 and therefore not targeted for active screening (Figure 2b). We predict some level of resurgence on

5

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 6: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Figure 2: Illustrative example of the distribution of infection in a single example village of population sizeNH = 1, 000 for two scenarios. (a) A gHAT-endemic village with 50% annual active screening coverage anda small rate of infectious importations. (b) A village with local elimination of infection pre-2000 followed bythe introduction of one high-risk infected person in 2000, including a small rate of infectious importationsin the future, but no active screening. For each scenario, the top panel shows the distribution of the totalnumber of infected people in time and the lower panels show the distribution between the risk groups atselected time points. The red line in the top panel shows the mean expected dynamics and the red crossesin the bottom panels shows the expected numbers in each risk class at that time.

6

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 7: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

average in the village if a single high-risk person of the village became infected through travelling to anothervillage. There is a high probability of onward transmission in the village; the expected number of infectedpeople increases initially, before decreasing again. However, infection is unlikely to be maintained in the longterm, even without additional controls — there is a probability of 0.83 of a return to no infection by 2030.

Infection dynamics across multiple villages

The infection dynamics in individual villages informs us about the probability of local elimination of infectionfor an average village, but at a larger scale, such as health zone or country level, elimination of infection willdepend on more than the probability of elimination in individual villages. While the distribution of gHATinfection is heterogeneous across the DRC [30], highly-clustered incidence means that local movement ofpeople will affect the probability of elimination in neighbouring villages. The rate of infectious importationsin a village will be dependent on the total infection level in the area.

Therefore, to consider the dynamics across multiple villages in a region, we modify our rate of infectiousimportations to remove the exponential decrease in time matched to the trend in global infections andreplace this with a term proportional to the total number of expected infections in model predictions acrossall villages of the local study region — in this case the health zone of Kwamouth — such that δ(1998) =(3.4×10−6) days−1. The supplementary material provides a complete description of how the rate of infectiousimportations is formulated.

We consider the expected number of infections and the probability of elimination of infection for fourgroups of 10,000 people, comprised of groups of villages of different sizes (NH = 10, 000, 1, 000, 100 and 10)(Figure 3) to understand the impact of the metapopulation structure [23]. The smaller village populationsizes within total populated area have fewer expected gHAT infections and a higher probability of eliminationof infection. The reduced number of interactions of mixing in smaller villages also results in a lower steady-state from before the active screening begins. There is a much lower probability of elimination of infectionwhen there are fewer larger villages. For 1,000 villages with a population size NH = 10, the probabilityof elimination of infection in 2030 is > 0.99, while for just one village of NH = 10, 000, there is a smallerprobability of 0.77 for elimination of infection by 2030. This is in agreement with results of metapopulationstudies, where more stochastic fade-outs of infection occur in the smaller populations, leading to a greaterprobability of elimination of infection across larger areas when sub-divided into more populations [31]. Theexample here highlights the benefit of modelling the full stochastic dynamics, where the small populationsizes determine the frequency of extinction events; the results from an ODE model with constant populationsize would not vary with the size of the population.

Infection dynamics across a health zone

While we have considered theoretical villages to explore the behaviour of the Kolmogorov forward equationmodel, we now use data from the WHO HAT Atlas [28] to obtain a list of real villages to apply our modelto. Since our model uses parameters matched to the data from the health zone of Kwamouth, we extractthe population sizes of each village in Kwamouth (see supplementary material for details) along with thepast active screening coverage. A plausible screening pattern is obtained taking the mean active screeningcoverage across all screenings and the probability that any particular village listed in the WHO HAT Atlasis screened in a given year. For Kwamouth, these values are a 68.6% coverage occurring at a probability of0.23 each year using active screening data from 2000–2018 (see supplementary material for details). Thisactive screening scheme is incorporated into the model by a new parameter for the probability of an activescreening event in a village. Thus, we model active screening as the linear combination of the probabilityof no screening multiplied by the current distribution and the probability of a screening multiplied by thedistribution after an active screening of given coverage.

We additionally adapt the value of the rate of importations of infection at steady state (previously takenfrom Davis et al. [19]). We calculate the value that is now specific to the health zone of Kwamouth asδ(Y ) = 2.86 × 10−6 days−1, which is determined by matching the steady state of the whole health zone tothe steady state of the system of ODEs in the original model for Kwamouth.

Applying the full Kolmogorov forward equation model to the health zone of Kwamouth, we obtain a fullprobability distribution in time for each village. We present the probability distribution of infection across

7

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 8: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Figure 3: Infection dynamics in a total population of 10,000, partitioned into groups of villages of differentsizes NH = 10, 000, 1, 000, 100 and 10. The annual active screening coverage is 50% and projections arestarted for the equilibrium distribution in 1998. (a) The expected number of infections. The shaded regionshows the 95% prediction intervals around the solid line for the expected value. (b) The probability of localelimination of infection.

the risk groups for a selection of villages of different sizes (NH = 100, 971, 5, 628 and 20, 697) and for thewhole health zone at key time points (Figure 4). Similar results for the lower-endemicity health zone ofMosango are presented in the supplementary material.

The expected number of infections in all villages decreases in time, such that by 2030 most of theprobability is centred around no infection. For the smallest villages, there is a large probability of localgHAT elimination by 2030 (> 0.99 for the village of size NH < 100) as there are initially few or no cases,which are then identified by active screening, or passive screening and treatment or death. However, for thelargest villages, such as the one with a population of NH = 20, 697, there is a high probability of continuedinfection with a probability of just 0.26 that elimination of infection will be met in that village by 2030.Hence, we will frequently see local elimination events, with global persistence across the health zone [31].

Active screening is shown to reduce the infection. This is visually explicit in the column for 2000, wherethere are separate high probability clouds for whether an active screening has occurred and hence the numberof infected people in the low-risk group identified and treated (Figure 4). At the beginning of 2000, activescreenings may have occurred in both 1998 and 1999 and so in the bottom left panel of Figure 4 (villageof population size NH = 20, 697), is is clear that there are four possible outcomes highlighted as separateregions of the probability distribution: an active screening event occurred in both 1998 and 1999, just 1998,just 1999, or neither 1998 or 1999. This behaviour is less obvious in some smaller villages as the highprobability regions overlap, yet is still present. We note that since active screening is identifying only low-risk individuals, infection is being pushed down but proportionally most of the reduction is in the low-riskgroup.

The decline in expected infection in all of individual villages is also evident total infection of the healthzone (Figure 5). This is calculated as the sum of the expected infection in each village. Note that here wedo not present the number of cases, but the underlying infections — case numbers would be substantiallylower as there are typically high levels of under-reporting [32]. By 2030, the expected number of infectedpeople have greatly decreased, yet persist in low number. This is mirrored in the probability of eliminationof infection, calculated as the product of achieving zero infections in all villages of the health zone, which isless than 10−4 by 2030 if this active screening coverage remains constant and only identifies low-risk infectedindividuals, with the mean expected year of elimination after 2040. Using parameter values in the modelmatched to WHO HAT Atlas data for the low-incidence health zone of Mosango, we observe an earlier mean

8

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 9: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Figure 4: The risk distribution of infection in selected villages of Kwamouth at different time points (theyears 2000, 2010, 2020 and 2030); the first possible active screening of each village was in 1998. There are418 villages with populations ranging between 3 and 20,697 of which we present the probability distributionof infected people in four villages (NH = 100, 971, 5, 628 and 20, 697). On each subplot the x-axis representsthe number of high-risk people infected, and the y-axis, the number of low-risk people infected. The riskdistribution of infection for the alternative health zone of Mosango are presented in the supplementarymaterial.

9

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 10: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Figure 5: The total infection in the health zones of Kwamouth and Mosango. We assumed Kwamouth had apopulation of 206,135 people made up 481 distinct villages, while Mosango had a population size of 107,685of 204 villages. (a) The expected number of infections across all villages in time. The shaded region showsthe 95% prediction intervals. (b) The probability of zero infections in time and hence elimination for allvillages in each health zone together.

expected year of elimination of infection in 2029.

Discussion

The Kolmogorov forward equation model facilitates a powerful and efficient way to analyse the dynamics ofa low-prevalence infections such as gHAT. This method presented here, utilising a lower-dimensional modelstructure than commonly used models, is fast to compute (with sufficiently small populations) and yetmaintains a good correspondence with more complex approaches. The nature of the implementation meansthat various interesting properties can easily be explored, with exact methods for calculating extinction timesand expected dynamics [6]. This approach has also allowed the model to be easily extended to consider theinteraction of multiple villages and even to consider the dynamics of persistence at the health zone level bylinking the total number of infected individuals to the rate of infectious imports into the villages.

Using this model, we conclude that based upon the strategy of active screening at the mean level, theexpected year of elimination of infection is after 2040 for Kwamouth and in 2029 for Mosango. This is inline with deterministic predictions using this parameterisation but at a health zone level; Huang et al. [18]predicted that the year of elimination of infection would be after 2040 for Kwamouth and in 2031 for Mosango,using a similar model and the mean coverage of active screening. Likewise, stochastic models at a healthzone level, found that without COVID-19 interruptions to gHAT activities, elimination might be expectedin 2025 in Mosango (with this slightly earlier prediction may be explained by higher assumed screeningfrom 2017 onwards) [21]. This general agreement between either connected village scale and health zonescale models is reassuring and supports continued use of both model frameworks, however by formulatingconnected village-scale methods it is possible to consider spatial dynamics in a more nuanced way.

We note that in considering the populations, we have only used villages that are listed in the WHOHAT Atlas and there are known to be additional villages within these health zones that are not listed, sincethey may not have ever been screened. Hence, the distribution of active screening is not truly as uniformlydistributed across the health zones as presented here. As shown in the supplementary material, we also knowthat screening coverage and frequency is correlated with population size of a settlement, and so adaptingour model to account for this would also likely result in more accurate predictions. The proportion of peoplein each risk class is also constant for all population sizes in the model, whereas we could speculate that

10

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 11: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

the larger populations are more town-like and perhaps have fewer high-risk members. This potential over-estimation of high-risk people in the large populations could explain some of the very high gHAT persistenceprobabilities seen in results for these populations (Figure 4).

The assumption that movement of individuals is proportional to the expected number of people infectedin the health zone, rather than the full distribution, is also a simplification to avoid calculating all possiblecombinations infection states in each village. This is only true for a large number of villages, but hence avalid assumption at the health zone level.

In addition, we consider a probability of active screening every year, despite the fact that continued activescreening is unlikely to be necessary for small villages, where the infection is almost certain to be locallyeliminated in latter years. To improve the plausibility of the model, we could add a cessation criterion,similar other studies [20, 33, 34]. This is less straightforward to implement in this probabilistic frameworkthan in the tau-leaping scheme, as we do not consider specific realisations of the model where the infection iseither detected or not, but have a full probability distribution of all possible infection states. One potentialsolution could be to link the probability of being screened to the probability of observing a case in activescreening. We do show that the assumption of a single screening event each year, as opposed to continuouslythroughout the year, shows negligible differences and so adopt this method for simplicity (see supplementarymaterial).

We have no data on the movement of people between villages, and so the rate of importation was estimatedby matching to the probability of detecting infection on the first active screening in a village [19] or matchingfor the health zone to the expected equilibrium state of an ODE model variant [18]. However, using thesevalues as an approximation for the mixing between villages, we achieve a good match to other model variant(see supplementary material), while retaining the efficiency of the Kolmogorov forward equations and theadditional benefits of calculating the full probability distribution.

In the future, this modelling approach could be very valuable for assessing not only decisions aboutcontinuation or cessation of screening in specific villages (based on village size and previous detections) butalso provides a method through which the impact of village-level mass drug administration on health zonetransmission dynamics could be assessed. Whilst such as drug is not currently licensed for this type ofdelivery, a new compound – acoziborole – which is in phase 3 clinical trials as a single-dose cure, is a possiblecandidate [35]. These types of village-scale strategy decisions would be challenging to analyse through healthzone level approaches.

Conclusions

We have shown that a lower-dimensional model of gHAT that operates at the village level can achieve verysimilar results to a more biologically realistic version for larger spatial scales, while introducing a method ofobtaining numerically exact results for extinction times and expected number of infected individuals. Thepredictions provided are in line with previous deterministic and stochastic results and the model imple-mentation provides a framework to scale between modelling at a health zone or national level and at thevillage level. This is an important development, since the data obtained and the actual interventions (activescreening) conducted are at village level.

The Kolmogorov forward equation model suggests that with mean coverage of active screening andcontinuing passive screening, with no additional interventions such as vector control, the infection is almostcertain to persist for long periods. This indicates that additional or intensified controls are required toachieve elimination of transmission, such as tsetse control, improved passive detection or targeting of high-risk people in active screening. This model structure expands the range of analytical projections it is possibleto generate and demonstrates the results that can be obtained with a Kolmogorov forward equation model.

Acknowledgments

The authors thank Programme National de Lutte contre la Trypanosomiase Humaine Africaine (PNLTHA)of DRC and its director Dr Erick Mwamba Miaka for original data collection and WHO for data access (inthe framework of the WHO HAT Atlas [28]).

11

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 12: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

Funding

This work was supported by the Bill and Melinda Gates Foundation (www.gatesfoundation.org) in partner-ship with the Task Force for Global Health through the NTD Modelling Consortium [OPP1184344] (C.N.D.,K.S.R. and M.J.K.) and the Bill and Melinda Gates Foundation through the Human African TrypanosomiasisModelling and Economic Predictions for Policy (HAT MEPP) project [OPP1177824] (K.S.R., and M.J.K.).The funders had no role in study design, data collection and analysis, decision to publish, or preparation ofthe manuscript.

References

[1] Daniel T Gillespie. A general method for numerically simulating the stochastic time evolution of coupledchemical reactions. Journal of computational physics, 22(4):403–434, 1976.

[2] Andrew J Black and Alan J McKane. Stochastic formulation of ecological models and their applications.Trends in ecology & evolution, 27(6):337–345, 2012.

[3] Maurice S Bartlett. Deterministic and stochastic models for recurrent epidemics. Technical report,University of Manchester, 1956.

[4] Bryan Thomas Grenfell, BM Bolker, and A Kleczkowski. Seasonality and extinction in chaotic metapop-ulations. Proceedings of the Royal Society of London. Series B: Biological Sciences, 259(1354):97–103,1995.

[5] Maryam Aliee, Kat S Rock, and Matt J Keeling. Estimating the distribution of time to extinction ofinfectious diseases in mean-field approaches. Journal of the Royal Society Interface, 17(173):20200540,2020.

[6] Matt J Keeling and Joshua V Ross. On methods for studying stochastic disease dynamics. Journal ofthe Royal Society Interface, 5(19):171–181, 2007.

[7] Garrett Jenkinson and John Goutsias. Numerical integration of the master equation in some models ofstochastic epidemiology. PloS one, 7(5):e36160, 2012.

[8] David Alonso and Alan McKane. Extinction dynamics in mainland–island metapopulations: an n-patchstochastic model. Bulletin of mathematical biology, 64(5):913–958, 2002.

[9] A-F Viet and GF Medley. Stochastic dynamics of immunity in small populations: a general framework.Mathematical biosciences, 200(1):28–43, 2006.

[10] Hans P Duerr and Martin Eichner. Epidemiology and control of onchocerciasis: the threshold bitingrate of savannah onchocerciasis in africa. International journal for parasitology, 40(6):641–650, 2010.

[11] Matthew James Keeling and Joshua V Ross. Efficient methods for studying stochastic disease andpopulation dynamics. Theoretical population biology, 75(2-3):133–141, 2009.

[12] Jaroslav Albert. A hybrid of the chemical master equation and the gillespie algorithm for efficientstochastic simulations of sub-networks. PLOS One, 11(3), 2016.

[13] BT Grenfell. Chance and chaos in measles dynamics. Journal of the Royal Statistical Society: Series B(Methodological), 54(2):383–398, 1992.

[14] Kat S Rock, Steve J Torr, Crispin Lumbala, and Matt J Keeling. Quantitative evaluation of the strategyto eliminate human African trypanosomiasis in the DRC. Parasites & Vectors, 8(1):532, 2015.

[15] Chris M Stone and Nakul Chitnis. Implications of heterogeneous biting exposure and animalhosts on Trypanosomiasis brucei gambiense transmission and control. PLOS Computational Biology,11(10):e1004514, 2015.

12

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 13: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

[16] Abhishek Pandey, Katherine E Atkins, Bruno Bucheton, Mamadou Camara, Serap Aksoy, Alison PGalvani, and Martial L Ndeffo-Mbah. Evaluating long-term effectiveness of sleeping sickness controlmeasures in Guinea. Parasites & Vectors, 8(1):550, 2015.

[17] Ronald E Crump, Ching-I Huang, Edward S Knock, Simon EF Spencer, Paul E Brown, ErickMwamba Miaka, Chansy Shampa, Matt J Keeling, and Kat S Rock. Quantifying epidemiologicaldrivers of gambiense human African trypanosomiasis across the Democratic Republic of Congo. PLOSComputational Biology, 17(1):e1008532, 2021.

[18] Ching-I Huang, Ronald E Crump, Paul Brown, Simon EF Spencer, Erick Mwamba Miaka, ChansyShampa, Matt J Keeling, and Kat S Rock. Shrinking the gHAT map: identifying target regions forenhanced control of gambiense human African trypanosomiasis in the Democratic Republic of Congo.medRxiv, 2020.

[19] Christopher N Davis, Kat S Rock, Erick Mwamba Miaka, and Matt J Keeling. Village-scale persis-tence and elimination of gambiense human African trypanosomiasis. PLOS Neglected Tropical Diseases,13(10), 2019.

[20] M Soledad Castano, Maryam Aliee, Erick Mwamba Miaka, Matt J Keeling, Nakul Chitnis, and Kat SRock. Screening strategies for a sustainable endpoint for gambiense sleeping sickness. The Journal ofInfectious Diseases, 221(Supplement 5):S539–S545, 2020.

[21] Maryam Aliee, M Soledad Castano, Christopher N Davis, Swati Patel, Erick Mwamba Miaka, Simon EFSpencer, Matt J Keeling, Nakul Chitnis, and Kat S Rock. Predicting the impact of covid-19 interruptionson transmission of gambiense human african trypanosomiasis in two health zones of the democraticrepublic of congo. Transactions of The Royal Society of Tropical Medicine and Hygiene, 115(3):245–252, 2021.

[22] Christopher N Davis, M Soledad Castano, Maryam Aliee, Swati Patel, Erick Mwamba Miaka, Matt JKeeling, Simon EF Spencer, Nakul Chitnis, and Kat S Rock. Modelling to quantify the likelihood thatlocal elimination of transmission has occurred using routine gambiense human African trypanosomiasissurveillance data. Clinical infectious diseases, 2021.

[23] Matt J Keeling, Ottar N Bjørnstad, and Bryan T Grenfell. Metapopulation dynamics of infectiousdiseases. In Ecology, genetics and evolution of metapopulations, pages 415–445. Elsevier, 2004.

[24] John Philip Glasgow et al. The distribution and abundance of tsetse. The Distribution and Abundanceof Tsetse., 1963.

[25] Francesco Checchi, Sebastian Funk, Daniel Chandramohan, Daniel T Haydon, and Francois Chappuis.Updated estimate of the duration of the meningo-encephalitic stage in gambiense human african try-panosomiasis. BMC research notes, 8(1):1–3, 2015.

[26] Jose R. Franco, Giuliano Cecchi, Gerardo Priotto, Massimo Paone, Abdoulaye Diarra, Lise Grout,Pere P. Simarro, Weining Zhao, and Daniel Argaw. Monitoring the elimination of human Africantrypanosomiasis: Update to 2016. PLOS Neglected Tropical Diseases, 12(12):e0006890, 2018.

[27] Dietmar Steverding. The history of african trypanosomiasis. Parasites & vectors, 1(1):1–8, 2008.

[28] Jose R Franco, Giuliano Cecchi, Gerardo Priotto, Massimo Paone, Abdoulaye Diarra, Lise Grout,Pere P Simarro, Weining Zhao, and Daniel Argaw. Monitoring the elimination of human Africantrypanosomiasis at continental and country level: Update to 2018. PLOS Neglected Tropical Diseases,14(5):e0008261, 2020.

[29] Inaki Tirados, Andrew Hope, Richard Selby, Fabrice Mpembele, Erick Mwamba Miaka, Marleen Boe-laert, Mike J Lehane, Steve J Torr, and Michelle C Stanton. Impact of tiny targets on glossina fuscipesquanzensis, the primary vector of human african trypanosomiasis in the democratic republic of thecongo. PLoS neglected tropical diseases, 14(10):e0008270, 2020.

13

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint

Page 14: Modelling gambiense human African trypanosomiasis ......May 26, 2021  · disease is gambiense human African trypanosomiasis (gHAT). Infection with gHAT is a caused by a parasite Trypanosoma

[30] Philippe Buscher, Giuliano Cecchi, Vincent Jamonneau, and Gerardo Priotto. Human African try-panosomiasis. The Lancet, 390(10110):2397–2409, 2017.

[31] Bryan Grenfell and John Harwood. (meta) population dynamics of infectious diseases. Trends in ecology& evolution, 12(10):395–399, 1997.

[32] Dieudonne Mumba, Elaine Bohorquez, Jane Messina, Victor Kande, Steven M Taylor, Antoinette KTshefu, Jeremie Muwonga, Melchior M Kashamuka, Michael Emch, Richard Tidwell, et al. Prevalence ofhuman african trypanosomiasis in the democratic republic of the congo. PLoS Negl Trop Dis, 5(8):e1246,2011.

[33] Christopher N Davis, Kat S Rock, Marina Antillon, Erick Mwamba Miaka, and Matt J Keeling. Cost-effectiveness modelling to optimise active screening strategy for gambiense human African trypanoso-miasis in the Democratic Republic of Congo. medRxiv, 2020.

[34] Marina Antillon, Ching-I Huang, Ronald E Crump, Paul E Brown, Rian Snijders, Erick Mwamba Miaka,Matt J Keeling, Kat S Rock, and Fabrizio Tediosi. Economic evaluation of gambiense human africantrypanosomiasis elimination campaigns in five distinct transmission settings in the democratic republicof congo. Ronald E. and Brown, Paul E. and Snijders, Rian and Miaka, Erick Mwamba and Keeling,Matt J. and Rock, Kat S. and Tediosi, Fabrizio, Economic Evaluation of gambiense Human AfricanTrypanosomiasis Elimination Campaigns in Five Distinct Transmission Settings in the Democratic Re-public of Congo, 2020.

[35] Drugs for Neglected Diseases. Prospective study on efficacy and safety of acoziborole (scyx-7158) inpatients infected by human african trypanosomiasis due to t.b. gambiense (oxa002). 2020.

14

. CC-BY 4.0 International licenseIt is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted May 26, 2021. ; https://doi.org/10.1101/2021.05.24.21257532doi: medRxiv preprint