wavelets and excess disease models for analysis of time series data
DESCRIPTION
Wavelets and excess disease models for analysis of time series data. Dan Weinberger Fogarty International Center National Institutes of Health. Analyzing time series data. Wavelets : evaluate timing of peaks and dominant frequency - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/1.jpg)
Wavelets and excess disease models for analysis of
time series data
Dan WeinbergerFogarty International CenterNational Institutes of Health
![Page 2: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/2.jpg)
Analyzing time series data
• Wavelets: evaluate timing of peaks and dominant frequency
• Regression models: estimate seasonal baseline and calculate excess incidence
![Page 3: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/3.jpg)
Part 1: Wavelets
![Page 4: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/4.jpg)
Motivating example: Measles
Figures from Grenfell et al, Nature 2001
Does the frequency of the measles epidemics change after vaccination?
![Page 5: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/5.jpg)
Shift in dominant frequencyTime
![Page 6: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/6.jpg)
Wavelets: a powerful solution
– Identification of the dominant frequencies in a series (ie annual, monthly…) at each specific time
– Determine the “phase” of these cycles and compare series
– Filter to smooth a series (remove high frequency noise)
– And many other applications…
![Page 7: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/7.jpg)
Basic conceptsH
ighe
r fre
quen
cy
-1.5
-1
-0.5
0
0.5
1
1.5
1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37
Sample time series (wave with 10 unit cycle=sin(2*π/10*t)
-Wavelets:little waves of a specific shape-”slide” wavelet along timeseries to determine strength of correlation-repeat, while shrinking and expanding the wavelet -Can use different shapes of wavelets for different situations
![Page 8: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/8.jpg)
Wavelet spectrum of a sine wave10 year cycle dominates
![Page 9: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/9.jpg)
Wavelet spectrum of a sine wave
“Global wavelet”=Average across entire time period
Grey area=Cone of influence:Less confidence in this region
Color=power of spectra:Red=higher amplitude at thatfrequency and time
Significance tested by a permutation test
![Page 10: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/10.jpg)
Multiple frequencies
![Page 11: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/11.jpg)
Wavelet with changing frequencies
Interpretation: Wavelength increases from ~0.25 to ~0.5 (from 4 cycles/year to 2 cycles/year)
![Page 12: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/12.jpg)
Example: epidemic timing
![Page 13: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/13.jpg)
Example: Using wavelets to extract phase (timing) information
1.5-
3 ye
ar c
ompo
nent
![Page 14: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/14.jpg)
Applying wavelets to “real” data
• Step 1: remove any long term trends from the data (calculate baseline using spline of summer months and then divide by baseline)
• Step 2: Square root or log-transform the data• Step 3: Use transformed data in wavelet
transform, evaluate spectra, extract phase data
Note: it is important to have a complete time series without missing data for the wavelets. Need to have relatively long time series since accuracy of wavelets is poor at the beginning and end of the time series
![Page 15: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/15.jpg)
Part 2: Excess Disease models
![Page 16: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/16.jpg)
Serfling Regression
• Step 1: Define influenza, non-influenza period• Step 2: Set a baseline and threshold (95%
confidence interval) for pneumonia during non-influenza period
• Step 3: Calculate excess mortality for each year– Sum of observed mortality subtracted from the
model baseline during “epidemic months” (when flu deaths cross threshold)
![Page 17: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/17.jpg)
USA P&I Deaths per 100,000 (1972-2006), All Ages
![Page 18: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/18.jpg)
USA P&I Deaths per 100,000 (1972-2006), All Ages
![Page 19: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/19.jpg)
USA May-Nov P&I Deaths per 100,000 (1972-2006), All Ages
OBTAIN A BASELINE MODEL FROM THESE DATA
![Page 20: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/20.jpg)
USA May-Nov P&I Deaths per 100,000 (1972-2006), with Model Baseline, All Ages
Full Model: E(Yi) = α + β1cos(2π ti/12) + β2sin(2π ti/12) + β3 ti + β4ti2 + β5ti
3 + β6ti
4 + β7ti5 + εi
Best-fit Model: E(Yi) = α + β1cos(2π ti/12) + β2sin(2π ti/12) + β3ti + β4ti2 +
β5ti3 + β6ti
4 + εi
12-month sine and cosine waveto account for baseline seasonal variations Linear time trend
Polynomial time trendTo account for long-term fluctuations
![Page 21: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/21.jpg)
USA P&I Deaths per 100,000 (1972-2006), Model Baseline and Upper 95% Confidence
Band, All Ages
![Page 22: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/22.jpg)
USA P&I Deaths per 100,000 (1972-2006), Model Baseline and Upper 95% Confidence
Band, All Ages
Epidemic Months Highlighted in Grey
![Page 23: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/23.jpg)
USA P&I Deaths per 100,000 (1972-2006) and Model Baseline 65-89 Year Olds
Epidemic Months (Grey) Defined by All-Ages P&I Model
Excess
![Page 24: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/24.jpg)
Calculating Excess Mortality• Monthly Excess Mortality:
– For epidemic months (months in which the observed P&I mortality exceeds the upper 95% CI of the model baseline for all-ages):• Observed P&I mortality – model baseline predicted
P&I mortality • Seasonal Excess Mortality:
– For each influenza season (defined as Nov.-May in the US): • Σ Monthly excess mortality
![Page 25: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/25.jpg)
Seasonal US Excess Mortality Table
SeasonAge
Group
No. of Epidemic
Months
Excess P&I Deaths
per 100,000
Excess A-C Deaths
per 100,000
1977/1978 65-89 4 30.41 137.05
1978/1979 65-89 1 0.75 7.40
1979/1980 65-89 3 15.68 66.85
1980/1981 65-89 3 28.45 136.71
1981/1982 65-89 3 3.67 27.35
1982/1983 65-89 4 12.68 47.83
1983/1984 65-89 4 9.51 33.90
1984/1985 65-89 4 21.38 122.61
AVERAGES
- -3 (median = 3,
range = 1-5) 41.34 78.37
![Page 26: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/26.jpg)
Pros and Cons of the Serfling Approach
• Very flexible: can be used without virological data—especially useful for data on past pandemics– However, need at least three years of data
• Only works if disease is seasonal– Needs clear periods with no viral activity that can
be used to create the baseline – Cannot be used as is for tropical countries that
have year-round influenza circulation • There are techniques to adapt Serfling models for these
purposes
![Page 27: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/27.jpg)
An alternative/complementary approach
• What proportion of “pneumonia and influenza” hospitalizations can be attributed to influenza?
• Use regression models with terms for seasonal variation, influenza, RSV (can be viral surveillance data, viral-specific hospitalization codes…)
![Page 28: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/28.jpg)
A quick review of regression
• Linear regression:– Y=β1x1+ β2x2 + a – β1: 1 unit increase in x1 results in β1 increase in Y
• Poisson regression– Used when “Y” is a count variable/incidence rate
rather than continuous– Usually has a skewed distribution– Multiplicative Poisson model: Y=e(β1x1+ β2x2 + a)
– If data are not Poisson distributed, use an alternative model, such as negative binomial
![Page 29: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/29.jpg)
Estimation of influenza hospitalization burden
•Outcome = weekly pneumonia and influenza hospitalization rate
•Explanatory variables=influenza-specific and RSV-specific hospitalizations (proxies of viral activity),
•Seasonal estimates of influenza-related hospitalization rates obtained as sum of predicted rates minus baseline rates (influenza covariate set to 0)
![Page 30: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/30.jpg)
Example: Estimation of influenza hospitalization burden in California in seniors
over 65 yrs
0
50
100
150
200
250
300
Date
Hos
pita
lizat
ions
per
100
,000
per
sons
Excess P&I hospitalizations associatedwith InfluenzaExpected P&I Hospitalizations
Observed P&I Hospitalizations
Modeled Baseline
RSVflut
ttttY agestate
654
32
21,
)2.52/2cos(
2.52/2sinexp)(
![Page 31: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/31.jpg)
Comparison between Serfling and Poisson regression
Newall, Viboud, and Wood, 2009 Epidemiolo Infect.
![Page 32: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/32.jpg)
Alternative Models for Estimating Influenza Burden
•Peri-season models•Use months surrounding influenza epidemics as
baseline•ARIMA models
•Estimate seasonal baseline by adjusting for serial autocorrelation.
•Serfling-Poisson combined models•Serfling seasonal excess mortality estimates are
regressed against seasonal virus prevalence. Takes care of random variations in virus prevalence at small time scales
•Iterative Serfling models (for non-seasonal data)
![Page 33: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/33.jpg)
Validity tests for influenza disease burden models
• Regression diagnostics• Checks based on the epidemiology of influenza
– A/H3N2 vs A/H1N1/B dominant seasons (2-3 ratio)– RSV vs influenza (age!)– Higher rates in 65 yrs and over– Multiple years: seasons will little influenza circulation
very precious! • Difficult to estimate disease burden with precision
– Mild seasons– Young children– Middle age groups
![Page 34: Wavelets and excess disease models for analysis of time series data](https://reader035.vdocument.in/reader035/viewer/2022062301/5681631b550346895dd393cf/html5/thumbnails/34.jpg)
Acknowledgement
• Cecile Viboud for providing some R program samples
• Cécile Viboud and Vivek Charu for slides on Serfling regression
• http://www.ecs.syr.edu/faculty/lewalle/ (Jacques Lewalle) for some ideas on presentation