Change-Point Detection Techniques for Piecewise Locally Stationary Time Series
Michael Last, National Institute of Statistical Sciences
Talk for Midyear Anomaly Detection Workshop, 2/3/2006
Stationary Time Series
We call a time series stationary if the joint distribution of (x_i, x_k) depends only on the lag l = i - k
Usually use weakly stationary, where we only look at the first two moments (equivalent in Gaussian case)
Example: Sunspot numbers, Chandler Wobble, rainfall (over decades)
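A minimal sketch of what weak stationarity means in practice, assuming a simulated AR(1) series (the model, the coefficient `phi`, and the helper `sample_autocov` are illustrative choices, not from the talk): the sample autocovariance at a given lag should come out roughly the same whichever stretch of the series it is estimated from.

```python
import numpy as np

def sample_autocov(x, lag):
    """Sample autocovariance of x at a non-negative lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    n = len(xc)
    return float(np.dot(xc[: n - lag], xc[lag:]) / n)

rng = np.random.default_rng(0)
phi = 0.6          # assumed AR(1) coefficient, |phi| < 1 gives a stationary series
n = 20000
eps = rng.standard_normal(n)

x = np.empty(n)
x[0] = eps[0] / np.sqrt(1.0 - phi**2)   # start in the stationary distribution
for t in range(1, n):
    x[t] = phi * x[t - 1] + eps[t]

# Compare autocovariances estimated from the two halves of the series.
first, second = x[: n // 2], x[n // 2 :]
for lag in range(3):
    print(lag, sample_autocov(first, lag), sample_autocov(second, lag))
```

For this model the theoretical autocovariance at lag l is phi^l / (1 - phi^2), so both halves should land near that value for each lag.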
Detecting Changes: Piecewise Stationary Time Series
Many series not stationary
  Earthquakes
  Speech
  Finance
How to model?
  Try stationary between change-points
Problems With This Approach
Adak (1998) proposed computing distance between power spectra computed over small windows: if adjacent windows are “close”, then merge them into a larger window
Finds too many change-points in earthquakes
  E.g. secondary wave tapers off, but change-points will be detected
Time-Varying Power Spectrum
Power spectrum computed over a window about a point
  Window width selection an open question
Does this have features we can use?
  Yes!
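One way to compute a power spectrum over a window about each point is a sliding-window periodogram. The sketch below assumes a raw, untapered periodogram and hypothetical parameters `width` and `step`; the talk does not say which estimator or window settings were used.

```python
import numpy as np

def periodogram(segment):
    """Raw periodogram at the positive Fourier frequencies (frequency zero dropped)."""
    seg = np.asarray(segment, dtype=float)
    seg = seg - seg.mean()
    spec = np.abs(np.fft.rfft(seg)) ** 2 / len(seg)
    return spec[1:]

def time_varying_spectrum(x, width, step):
    """Periodogram of each length-`width` window, moving `step` samples at a time."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - width + 1, step)
    centers = np.array([s + width // 2 for s in starts])
    spectra = np.array([periodogram(x[s : s + width]) for s in starts])
    return centers, spectra

# Toy series: a sinusoid whose frequency jumps halfway through, plus a little noise.
rng = np.random.default_rng(0)
t = np.arange(1024)
signal = np.where(t < 512,
                  np.sin(2 * np.pi * (8 / 128) * t),
                  np.sin(2 * np.pi * (32 / 128) * t))
series = signal + 0.1 * rng.standard_normal(t.size)

centers, spectra = time_varying_spectrum(series, width=128, step=64)
peak_bins = spectra.argmax(axis=1)   # dominant frequency bin in each window
print(list(zip(centers.tolist(), peak_bins.tolist())))
```

Each row of `spectra` is the spectrum around one time point, so the jump in the dominant bin shows up as a visible feature in the time-frequency picture.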
Time-Varying Power Spectrum
Finding Abrupt Changes
What do we mean by abrupt changes?
  Distance between spectrum
  Spectrum as distribution, K-L Information Discrimination
  Requirement of local estimation
Our Distance Function
$$ D = \frac{1}{2n} \sum_{j=1}^{n} \left[ \frac{f_L(\omega_j)}{f_R(\omega_j)} + \frac{f_R(\omega_j)}{f_L(\omega_j)} \right] $$
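A rough sketch of a left/right spectral distance, under the assumption that the distance averages f_L/f_R + f_R/f_L over frequencies and halves the result, where f_L and f_R are smoothed periodograms of the windows just left and just right of a candidate point. The function names, the moving-average smoother, and the toy series are illustrative assumptions, not the speaker's implementation.

```python
import numpy as np

def smoothed_periodogram(segment, bandwidth):
    """Moving-average smoothed periodogram at the positive Fourier frequencies."""
    seg = np.asarray(segment, dtype=float)
    seg = seg - seg.mean()
    raw = (np.abs(np.fft.rfft(seg)) ** 2 / len(seg))[1:]   # drop frequency zero
    kernel = np.ones(bandwidth) / bandwidth
    # mode="same" zero-pads the edges; both windows get identical treatment,
    # so the ratio below still compares like with like.
    return np.convolve(raw, kernel, mode="same")

def spectral_distance(left, right, bandwidth):
    """Average over frequencies of (f_L/f_R + f_R/f_L) / 2; equals 1 when f_L == f_R."""
    ratio = smoothed_periodogram(left, bandwidth) / smoothed_periodogram(right, bandwidth)
    return float(np.mean(ratio + 1.0 / ratio) / 2.0)

def distance_curve(x, width, bandwidth):
    """Distance between the windows left and right of every admissible time point."""
    x = np.asarray(x, dtype=float)
    times = np.arange(width, len(x) - width + 1)
    dist = np.array([spectral_distance(x[t - width : t], x[t : t + width], bandwidth)
                     for t in times])
    return times, dist

def simulate_ar1(n, phi, rng):
    """AR(1) series started in its stationary distribution."""
    x = np.empty(n)
    x[0] = rng.standard_normal() / np.sqrt(1.0 - phi**2)
    for t in range(1, n):
        x[t] = phi * x[t - 1] + rng.standard_normal()
    return x

# Toy series: white noise for 512 samples, then an AR(1) process with phi = 0.9.
rng = np.random.default_rng(1)
white = rng.standard_normal(512)
x = np.concatenate([white, simulate_ar1(512, 0.9, rng)])

times, dist = distance_curve(x, width=128, bandwidth=8)
peak_time = int(times[np.argmax(dist)])
print(peak_time, float(dist.max()))
```

Because r + 1/r >= 2 for any positive ratio r, this distance never drops below 1, and it sits near 1 wherever the two windows share a spectrum.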
Theoretical Performance
Maximum away from change-points converges to 1. Rate of convergence: $n^{-1/2} \log(T)$
Consistently estimated with smoothed periodograms
Asymptotically normal
Finite sample critical values independent of underlying signal
n is length of window, T is length of series
Example Series
Simulation Results
Simulations to determine effectiveness of change-point localization and identification
  Separated tasks
  8 types of series with different features
  Minimal amount of tuning
  Compared with other methods
Results:
  Good localization
  65+% correct identification
Data Performance
Primary Wave
Secondary Wave
Speech Segmentation
Abrupt changes exist at transitions between phonemes
  Can we reliably recover these?
Given segmented speech, can we meaningfully cluster it?
  Can we interpret clusters?
  Can we use the clusters to deduce speaker, accent, or language?
Time-Varying Power-Spectra
Speech
Window Width Considerations
Need a window with enough data to estimate several frequencies in the range where interesting events happen
  Below 10Hz for earthquakes
  At least down to 20Hz for audio
At present, this remains one of the major tuning parameters. In effect, wide windows have low variance but risk higher bias
How to assess a “Significant Change”
Asymptotic Distribution:
  Test statistic sum of variables with an F distribution plus their inverses
  Asymptotic normality
  Problem: Events of interest are in the tail, asymptotic results break down in tails of distributions
Test statistic signal independent
  Simulate on white noise, pick significance from there
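The white-noise calibration can be sketched as a small Monte Carlo experiment. This assumes the statistic is the average over frequencies of (f_L/f_R + f_R/f_L)/2 computed from moving-average smoothed periodograms; the helper names, window width, bandwidth, and number of replications are all hypothetical. It calibrates the statistic at a single time point; thresholding the maximum over a whole series would instead require simulating that maximum.

```python
import numpy as np

def smoothed_periodogram(segment, bandwidth):
    """Moving-average smoothed periodogram at the positive Fourier frequencies."""
    seg = np.asarray(segment, dtype=float)
    seg = seg - seg.mean()
    raw = (np.abs(np.fft.rfft(seg)) ** 2 / len(seg))[1:]
    return np.convolve(raw, np.ones(bandwidth) / bandwidth, mode="same")

def spectral_distance(left, right, bandwidth):
    """Average over frequencies of (f_L/f_R + f_R/f_L) / 2."""
    ratio = smoothed_periodogram(left, bandwidth) / smoothed_periodogram(right, bandwidth)
    return float(np.mean(ratio + 1.0 / ratio) / 2.0)

def simulate_null(width, bandwidth, n_sim, seed):
    """Distances between adjacent windows of pure white noise (no change present)."""
    rng = np.random.default_rng(seed)
    stats = np.empty(n_sim)
    for i in range(n_sim):
        noise = rng.standard_normal(2 * width)
        stats[i] = spectral_distance(noise[:width], noise[width:], bandwidth)
    return stats

null = simulate_null(width=128, bandwidth=8, n_sim=500, seed=0)
threshold = float(np.quantile(null, 0.99))   # empirical 99% critical value
print(threshold)

# The statistic is built from spectral ratios, so a common rescaling cancels.
rng = np.random.default_rng(1)
a, b = rng.standard_normal(128), rng.standard_normal(128)
print(spectral_distance(a, b, 8), spectral_distance(3.0 * a, 3.0 * b, 8))
```

An observed distance above `threshold` would then be flagged as significant at roughly the 1% level for that window width and bandwidth.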
End of Talk
Slides which may address specific questions follow, but unless I’ve talked way too fast, there probably won’t be time to show these. So let’s break for coffee, and if anybody has a burning desire to learn more about what I’ve said, please come and ask me; I’m happy to answer any questions, and may just have a slide lying around to answer with
Finding the Change-Point(s)
Assume correct number of change-points, and find
Issues
How to assess a “significant change?”
Uncertainty in location?
Choosing parameters
  Window width
  Smoothing
  Weights
Choosing Parameters: Window Width
We need a window width much wider or much narrower than the scale interesting changes happen on
  Much wider and the series mixes within a window
  Much narrower and continuity of time-varying power spectrum kicks in
  Same scale and oscillations can be detected as big changes
Smoothing
Makes estimate consistent
Ruins independence in frequency
Another tuning parameter
Bandwidth matters more than shape
Current heuristic is about square-root of number of frequencies, seems to work well
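A small sketch of the square-root bandwidth heuristic, assuming a flat (boxcar) smoothing kernel with reflected edges; the kernel shape, the rounding to an odd width, and the helper names are assumptions made for illustration. On white noise, whose true spectrum is flat, smoothing should cut the scatter of the periodogram sharply.

```python
import numpy as np

def raw_periodogram(x):
    """Raw periodogram at the positive Fourier frequencies (frequency zero dropped)."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return (np.abs(np.fft.rfft(x)) ** 2 / len(x))[1:]

def sqrt_bandwidth(n_freq):
    """About sqrt(number of frequencies), bumped to an odd width so the kernel is centred."""
    b = max(1, int(round(float(np.sqrt(n_freq)))))
    return b if b % 2 == 1 else b + 1

def smooth(spec, bandwidth):
    """Boxcar smoothing with reflected edges; output has the same length as the input."""
    half = bandwidth // 2
    padded = np.pad(spec, half, mode="reflect")
    return np.convolve(padded, np.ones(bandwidth) / bandwidth, mode="valid")

rng = np.random.default_rng(2)
noise = rng.standard_normal(1024)

raw = raw_periodogram(noise)
bandwidth = sqrt_bandwidth(len(raw))
smoothed = smooth(raw, bandwidth)

# White noise has a flat spectrum, so the spread around it measures estimator noise.
print(bandwidth, float(raw.std()), float(smoothed.std()))
```

The price of the lower variance is that neighbouring frequencies now share data, which is the loss of independence in frequency noted on the slide.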
Weights
Method for incorporating prior knowledge
  High weights for frequencies where real changes likely, low for where real changes unlikely
Akin to placing a prior on what frequencies changes will happen on
Equivalent to linear filter of signal
Speech: Unresolved Issues
Frequency domain representation of speech different across speakers, e.g. Jessica speaks at a higher pitch (frequency) than I do
  Can we find a transform to fix this?
After solving this problem, what is the next problem?