

CA-Smooth: Content Adaptive Smoothing of Time Series Leveraging Locally Salient Temporal Features ∗

Rosaria Rossini
Links Foundation, Torino
[email protected]

Silvestro Poccia
Computer Science Dept., Univ. of Torino
[email protected]

K. Selçuk Candan
CIDSE, Arizona State University
[email protected]

Maria Luisa Sapino
Computer Science Dept., Univ. of Torino
[email protected]

ABSTRACT
Imprecision and noise in the time series data may result in series with similar overall behaviors being recognized as being dissimilar because of the accumulation of many small local differences in noisy observations. While smoothing techniques can be used for eliminating such noise, the degree of smoothing that needs to be performed may vary significantly at different parts of the given time series. In this paper, we propose a content-adaptive smoothing technique, CA-Smooth, to reduce the impact of non-informative details and noise in time series by means of a data-driven approach to smoothing. The proposed smoothing process treats different parts of the time series according to local information content. We show the impact of different adaptive smoothing criteria on a number of samples from different datasets, containing series with diverse characteristics.

CCS Concepts
• Information systems → Spatial-temporal systems; Data cleaning;

Keywords
Time series; smoothing; salient features

∗ Research is supported by NSF1909555 “pCAR: Discovering and Leveraging Plausibly Causal (p-causal) Relationships to Understand Complex Dynamic Systems”, NSF1827757 “Data-Driven Services for High Performance and Sustainable Buildings”, NSF1610282 “DataStorm: A Data Enabled System for End-to-End Disaster Planning and Response”, NSF1633381 “BIGDATA: Discovering Context-Sensitive Impact in Complex Systems”, and “FourCmodeling”: EU-H2020 Marie Sklodowska-Curie grant agreement No 690817.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

MEDES ’19, November 12–14, 2019, Limassol, Cyprus
© 2019 ACM. ISBN 978-1-4503-4895-9 . . . $15.00

DOI:

(a) original time-series (b) alternative smoothings

Figure 1: Alternative smoothings of time series

1. INTRODUCTION
Filtering is a function that transforms one time series into another by altering the values of its elements in a way that takes into account their neighborhoods. The filtering operation can smooth, dampen, or accentuate fluctuations contained in the time series data [16]. Smoothing is performed through a convolution function that takes as input a time series and a smoothing filter – the convolution operation slides the filter over the time series and, for each position, the overlapping values of the series and the kernel are multiplied and summed up:

Definition 1.1 (Smoothing). Given a time series T = [t1, t2, . . . , tn] of length n and a smoothing filter Φ = [φ1, . . . , φm], the convolution T ∗ Φ is defined as

    ti ∗ Φ = Σk φk · t(i−k+m/2),

where ∗ denotes the convolution operator and m is the length of the filter.

In this paper, we assume that the smoothing filter, Φ, is defined through a Gaussian filter:

    φx = G(x − m/2, σ) = (1 / √(2πσ²)) · e^(−(x − m/2)² / (2σ²)).
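Definition 1.1 and the Gaussian filter above can be sketched in a few lines of plain Python. This is a hedged illustration, not the paper's implementation: the helper names and the boundary policy (replicating edge values) are our own choices.

```python
import math

def gaussian_filter(sigma):
    """Gaussian kernel of length ~6*sigma (captures ~99.9% of the weight)."""
    m = max(3, int(round(6 * sigma)) | 1)  # force an odd length so there is a center
    half = m // 2
    phi = [math.exp(-((x - half) ** 2) / (2 * sigma ** 2)) for x in range(m)]
    total = sum(phi)
    return [w / total for w in phi]        # normalize: weights sum to 1

def smooth(series, sigma):
    """Convolve `series` with the Gaussian filter, clamping at the boundaries."""
    phi = gaussian_filter(sigma)
    half = len(phi) // 2
    n = len(series)
    out = []
    for i in range(n):
        acc = 0.0
        for k, weight in enumerate(phi):
            j = min(max(i + k - half, 0), n - 1)  # replicate edge values
            acc += weight * series[j]
        out.append(acc)
    return out
```

Smoothing a constant series leaves it unchanged, while an isolated spike is spread over roughly a 6σ window around its position.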

The degree of the smoothing process can be controlled by varying the smoothing weights, determined by the standard deviation, σ, of the Gaussian – note that σ also defines the width of the Gaussian kernel (a kernel of length 6σ captures ~99.9% of the smoothing weights). In Figure 1, we see four different smoothed versions of a given series, differing in the


(a) fixed scale smoothing


(b) content-adaptive smoothing

Figure 2: Fixed scale vs. content-adaptive smoothing

Figure 3: Three sample salient temporal features of a time series: In [4], authors recognized that salient temporal features can represent salient events on a time series; in this paper, we argue that they can also help partition a series into fragments that show different characteristics

value of σ and thus the size of the smoothing neighborhood considered: in general, the bigger the size of the smoothing filter, the more details are lost, as each smoothed value captures a weighted average of a larger fragment of the time series.

1.1 Proposed Approach: Content Adaptive Smoothing of Time Series, CA-Smooth

While smoothing can be used for eliminating noise, in practice, it may be difficult to determine a priori the amount of smoothing that needs to be applied: in fact, the amount of smoothing that needs to be performed may vary depending on what part of the time series is being considered. In particular, we argue that different parts (fragments) of the series potentially carry different amounts of redundancy and, thus, we propose to localize the smoothing process by (a) partitioning the series into fragments, (b) which are then independently smoothed according to their individual local information characteristics. Therefore, in this paper, we propose a content adaptive smoothing technique, CA-Smooth, to reduce the impact of non-informative details and noise in time series by means of a data-driven approach to local filtering. The four-step smoothing process treats the different parts of the time series differently, depending on the local information content:

• Locally-salient feature detection: In the first step, CA-Smooth identifies locally salient features appearing in the time series, at different scales, by means of a salient feature detection algorithm [4] (see Figure 3).

• Fragment detection: In the second step, the algorithm uses these locally salient features to partition the input series into coherent fragments.

• Fragment characterization: In the third step, these fragments are analyzed to identify their key properties relevant for the smoothing operation. More specifically, we consider two properties to characterize time series fragments: entropy and representativeness.

• Content-adaptive filtering: Finally, the identified properties of each fragment are used for associating to each time instant t a locally appropriate smoothing parameter, σ(t), which will help preserve the critical local information, while dropping the non-relevant details around time instant, t.

As visualized in Figure 2, the proposed content-adaptive CA-Smooth algorithm treats different parts of the time series according to the local content.
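To make the four-step flow concrete, the following sketch wires the steps together under deliberately naive stand-ins: fixed-length fragmentation in place of the salient-feature segmentation of [4], and value spread in place of the entropy/representativeness characterization. Only the overall structure (fragment, characterize, assign per-fragment scales averaging to a global target) mirrors the paper.

```python
def ca_smooth_sigmas(series, sigma_g, tau):
    """Per-fragment smoothing scales whose length-weighted mean is sigma_g."""
    n = len(series)
    # Steps 1-2 (stand-in): fixed-length fragments instead of
    # salient-feature-driven segmentation.
    cuts = list(range(0, n, tau)) + [n]
    fragments = [(cuts[i], cuts[i + 1]) for i in range(len(cuts) - 1)
                 if cuts[i] < cuts[i + 1]]
    # Step 3 (stand-in): characterize each fragment by its value spread.
    spreads = [max(series[s:e]) - min(series[s:e]) for s, e in fragments]
    top = max(spreads) or 1.0
    # Step 4: flatter fragments get heavier smoothing; rescale so that the
    # length-weighted average of the scales equals sigma_g.
    raw = [1.0 + (1.0 - sp / top) for sp in spreads]
    mean = sum(r * (e - s) for r, (s, e) in zip(raw, fragments)) / n
    return {frag: sigma_g * r / mean for frag, r in zip(fragments, raw)}
```

Here a flat fragment (zero spread) receives twice the raw weight of the most varied fragment, echoing the idea that low-information regions can tolerate more smoothing.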

1.2 Organization of the Paper
The paper is organized as follows: In the next section we discuss related works. In Section 3, we introduce the details of the proposed content adaptive smoothing strategy, CA-Smooth. In Section 4, we present experimental evaluations, and we conclude the paper in Section 5.

2. RELATED WORKS
Imprecision and noise in the time series data may result in series with similar overall behaviors being recognized as being dissimilar because of the accumulation of many small local differences in noisy observations. This potentially affects the results of (similarity based) clustering and classification algorithms. In general, the phase of noise reduction is a very important preprocessing step in time series analysis; for this reason, many techniques have been developed [5, 15, 14] to reduce and remove the noise from the signal.

Although smoothing has been recognised as a technique that can contribute to noise reduction [2], recent studies including [8] have shown that existing unsupervised smoothing techniques do not significantly contribute to clustering and classification improvement. In [8], the authors show that the automated application of smoothing without domain expertise does not, on average, improve the performance of baseline classifiers. This “negative result” motivates our research: we claim that this lack of improvement is due to the fact that unsupervised smoothing does not take into account the peculiarities of the different portions of the considered time series; we therefore propose content adaptive (data driven) techniques to mediate between unsupervised and data/domain aware smoothing techniques. Time series smoothing techniques have also been proposed to prioritize end users’ attention: [11] proposes to smooth time series visualizations as much as possible to remove noise, while retaining large-scale structure to highlight significant deviations, and develops the ASAP analytical operator that automatically smooths streaming time series by adaptively optimizing the trade-off between noise reduction (i.e., variance) and trend retention (i.e., kurtosis).

Page 3: CA-Smooth: Content Adaptive Smoothing of Time Series … · rosaria.rossini@linksfoundation.com Silvestro Poccia Computer Science Dept. Univ. of Torino poccia@di.unito.it K.Selçuk

An adaptive denoising algorithm presented in [6] consists of two steps: given a target length of 2n + 1, the first step partitions the time series into segments of 2n + 1 points, in such a way that each segment has an overlap of n + 1 points with the next one. Then, the second step fits the segments with the best polynomial of order K. The two free parameters (K and n) are determined by studying the variance of the residual data. As opposed to our content adaptive method, this approach adapts the denoising parameters to the data, but it keeps them constant across the entire analysed time series, while, as will become clearer in the next sections, we adapt the smoothing intensity to the local characteristics of fragments, whose length itself is not common to all the fragments but varies according to local characteristics.

3. CA-SMOOTH: CONTENT-ADAPTIVE TIME SERIES SMOOTHING

As described in the Introduction, in this paper, we argue that different parts of a given time series may carry different amounts of noise and redundancy and, thus, a content-adaptive smoothing process which locally varies the degree of smoothing by (a) partitioning the series into fragments, (b) which are then independently smoothed according to their individual local information characteristics, may lead to significantly better smoothing results than an inflexible strategy that applies the same smoothing filter across the entire time series. In this section, we present a content-adaptive smoothing algorithm, CA-Smooth. In particular, we describe the four key steps of the algorithm: (a) locally-salient feature detection, (b) fragment detection, (c) fragment characterization, and (d) content-adaptive filtering.

3.1 Locally-Salient Feature Detection
A series can be fragmented in different ways [7, 9, 10, 4]. A fixed segmentation strategy would partition the series into fragments of the same size, while an adaptive data-driven segmentation would identify fragment boundaries in a way that represents characteristics and features of the time series. Other techniques, such as CUTs [10], consider the curvature of the series and create segments in such a way that each segment has a minimal distance from the line connecting the first and the last points of the segment.

In CA-Smooth, we use a feature based approach: the temporal features used in the segmentation are identified using a locally-salient robust feature detection algorithm from our prior work [4]. Given a time series, T, of length n, the algorithm returns a set, S, of salient features, where each salient feature, si = ⟨ti, σi, descri⟩ ∈ S, is a triple in which ti is the center of the temporal feature, σi is the feature scale defining the 6σ temporal scope, (ti − 3σi, ti + 3σi), of the feature, and descri is a histogram of 1D gradients describing the local temporal structure. Intuitively, each of these features is significantly different from its neighborhood in the corresponding scale and therefore can be used to characterize the “key” events on the series. The algorithm uses a σmin threshold to control the sizes of the smallest features identified on the series. A second parameter, τ, described in the next subsection, is used to control the lengths of the fragments. Since the number of salient features will impact the number of fragments (and thus their lengths), if the number of features is greater than n/τ, we prune lower-intensity features among those that are temporally co-located.


Figure 4: Fragment aggregation

3.2 Fragment Detection
As discussed above, the scope of a salient temporal feature describes a characteristic region in which an important change in the trend of the series happens. Our approach exploits this property to determine the fragments of a time series. In particular, we argue that a coherent data driven segmentation should preserve the integrity of salient features: i.e., portions of the series which can be associated to a given salient feature should ideally fall within the same fragment. This would ensure that a salient event would be represented in its entirety within the same fragment. Let T be a given time series of length n and S be the set of salient features. The algorithm considers the set S and returns a set of time series fragments, F, such that the features that cover the same region are grouped to create a cluster in a way that preserves their integrity.

It is easy to see that one can order the interval boundaries of the salient features in S to obtain an initial fragmentation of the time series, where each fragment would be represented by the set of overlapping salient features at that region of the time series. This, however, may have two disadvantages: (a) we may end up with many small fragments, and (b) this process may partition the individual salient features, negatively impacting their integrity. To avoid these, given two consecutive interval boundaries, the algorithm checks if they represent a fragment with length less than τ. If this is the case, the smaller interval is merged with the larger of its neighboring intervals to create a new larger fragment. If the new interval satisfies the threshold constraint, it is confirmed as a fragment and the algorithm processes the next pair of interval boundaries. This process leads to a set of fragments, F = {f1, ..., fl}, where each fragment fi has a corresponding scope (ts,i, te,i) such that

1. the length of each fragment is greater than or equal to the length lower bound τ; i.e., te,i − ts,i ≥ τ;

2. the number of fragments is suitably bounded; i.e., l ≤ n/τ; and

3. the first and the last fragment boundaries are the first and the last point of the series, respectively; i.e., ts,1 = 1 and te,l = n.

Figure 4 presents an example: Here, the black circles represent the temporal scopes of the features. As we can see, using these features, we can potentially create up to 13 candidate fragments. Ideally, however, we would seek to avoid overly small fragments and therefore combine nearby intervals to generate 7 larger and more coherent fragments.
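The merging procedure can be sketched as follows. This is a hypothetical helper, not the paper's code: `boundaries` stands in for the cut points derived from the salient-feature scopes, and short intervals are absorbed by the larger neighbor, following the merge rule described above.

```python
def merge_fragments(boundaries, n, tau):
    """Greedily merge candidate intervals shorter than tau into a neighbor."""
    cuts = sorted(set([0] + [b for b in boundaries if 0 < b < n] + [n]))
    frags = [[cuts[i], cuts[i + 1]] for i in range(len(cuts) - 1)]
    i = 0
    while len(frags) > 1 and i < len(frags):
        s, e = frags[i]
        if e - s >= tau:          # long enough: confirm and move on
            i += 1
            continue
        # merge the short interval with the larger adjacent one
        left = frags[i - 1][1] - frags[i - 1][0] if i > 0 else -1
        right = frags[i + 1][1] - frags[i + 1][0] if i < len(frags) - 1 else -1
        if left >= right:
            frags[i - 1][1] = e   # absorb into the left neighbor
            del frags[i]
            i = max(i - 1, 0)     # the merged fragment may still be short
        else:
            frags[i + 1][0] = s   # absorb into the right neighbor
            del frags[i]
    return [tuple(f) for f in frags]
```

The result respects the three conditions above: every fragment is at least τ long (unless the whole series is shorter), the fragments tile the series, and the first/last boundaries coincide with the series endpoints.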

3.3 Fragment Characterization
Let F be the set of fragments of a time series, identified as above. The next step analyzes each fragment to extract characteristics to help compute the amount of smoothing


that will be applied during the final filtering phase. Naturally, the degree of smoothing of a fragment will be a function of its local content and its relationship to the rest of the series. In this section, we consider two criteria:

• entropy: the information content of a fragment, measured using entropy, may indicate how much a fragment should be smoothed;

• representativeness: the similarity of a given fragment to the rest of the series may indicate how representative the fragment is and, therefore, may need to be considered during smoothing.

3.3.1 Entropy
We compute the entropy, E(fi), of a given fragment fi by first quantizing the fragment into a quantization alphabet and counting the number of occurrences of each symbol:

Definition 3.1 (Entropy). Let A be the quantization alphabet and let prij denote the proportion of times the symbol aj ∈ A occurs in fragment fi. The entropy [13] of the fragment fi is defined as

    E(fi) = − Σ(aj∈A) prij log(prij).
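Definition 3.1 is directly implementable once a quantization policy is fixed. The paper does not specify one, so the sketch below assumes uniform-width bins over the fragment's value range (the bin count `alphabet_size` is our own illustrative parameter).

```python
import math
from collections import Counter

def fragment_entropy(fragment, alphabet_size=8):
    """Quantize a fragment into `alphabet_size` symbols, then apply Definition 3.1."""
    lo, hi = min(fragment), max(fragment)
    width = (hi - lo) / alphabet_size or 1.0   # guard against a constant fragment
    symbols = [min(int((v - lo) / width), alphabet_size - 1) for v in fragment]
    counts = Counter(symbols)
    total = len(symbols)
    # E(f) = -sum_j pr_j * log(pr_j), summed over symbols that actually occur
    return -sum((c / total) * math.log(c / total) for c in counts.values())
```

A constant fragment has entropy 0, while a fragment spread uniformly over k symbols approaches log(k), the maximum for that alphabet size.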

3.3.2 Representativeness
In addition to entropy, we also consider the similarity of a given fragment to the rest of the time series as a potential indicator of how representative the fragment is and, therefore, how much smoothing should be applied to the fragment. To compute the representativeness of the fragment fi ∈ F, we first compute pairwise Dynamic Time Warping distances [12], ∆dtw(fi, fj), between all pairs, fi, fj, of fragments in the series. We then create a similarity matrix Π, where the entry Π[i, j] represents the similarity of fragment fi to fragment fj, defined as

    Π[i, j] = 1 − ∆dtw(fi, fj) / max(fh∈F) {∆dtw(fi, fh)}.

We then row-normalize the similarity matrix Π to obtain

    Π′[i, j] = Π[i, j] / Σ(fh∈F) Π[i, h].

Note that the transpose, T = (Π′)ᵀ, of the row-normalized matrix Π′ can be considered as a random walk transition graph, where each column j corresponds to a fragment fj and the entry T[i, j] indicates the probability of a random walk from fragment fj to fragment fi, such that the more similar the fragment fi is to the fragment fj, the higher the probability of the random walk from fj to fi.

Given this transition matrix, T, representing mutual similarities among the fragments, CA-Smooth then applies the well-known PageRank algorithm [3] to associate a degree of representativeness to each fragment in the graph:

Definition 3.2 (Representativeness). Let F be the set of all fragments and let T be the transition graph obtained as described above. Then, for fi ∈ F, we compute its degree of representativeness, R(fi), as

    R(fi) = PR(fi, T),

where PR(fi, T) is the PageRank of fragment fi based on the transition graph T that captures fragment similarities.


Figure 5: Adaptive smoothing of the series in Figure 2 revisited: here the thin green line indicates the degree of smoothing for different fragments of the time series; as we see in the figure, this is subject to abrupt changes – CA-Smooth, instead, adjusts the degree of smoothing on a per-instant basis (thin blue line)

3.3.3 Characteristics of a Fragment
Given the above, we associate to each fragment fi ∈ F a combined characteristic, C(fi), defined as

    C(fi) = ⟨E(fi), R(fi), Ē(fi), R̄(fi)⟩,

where Ē and R̄ are the inverses of E and R, respectively. In this paper, the inverse Ē(fi) is defined as

    Ē(fi) = (maxj(E(fj)) − E(fi)) / (maxj(E(fj)) − minj(E(fj)))

if maxj(E(fj)) ≠ minj(E(fj)); and Ē(fi) = E(fi) otherwise. Similarly for R̄.

3.4 Content-Adaptive Filtering
Given a fragment fi, CA-Smooth associates an overall smoothing scale to that fragment taking into account its characteristics, C(fi). More specifically, the smoothing scale σ(fi) corresponding to the fragment fi is computed as

    ∀fi∈F σ(fi) = Θ(C(fi)),

where CA-Smooth uses the function Θ to adjust the degree of smoothing local to each fragment, taking into account the fragment’s characteristics.¹

3.4.1 Computing Instantaneous Smoothing Degrees
Note that, even though σ(fi) captures the local characteristics of fi, applying this smoothing scale to each and every instant within the fragment would cause a significant problem at the fragment boundaries: since the amount of smoothing might change abruptly at the fragment boundaries, this can potentially introduce undesirable artifacts in the smoothed series. Therefore, CA-Smooth does not directly apply σ(fi) smoothing to each and every instant within the fragment, fi. Instead, the algorithm varies the smoothing scale on a per-time-instant basis rather than on a per-fragment basis: further away from the fragment boundaries, the smoothing scale is governed by C(fi), but closer to the fragment boundaries (i.e., boundaries with fi−1 and fi+1), the smoothing scale would be a combination of the smoothing scales of nearby fragments, eliminating abrupt changes on the smoothing filter (Figure 5). Intuitively, the

¹ We show the impact of different Θ functions in the experimental evaluation section.


smoothing scales themselves are smoothed to avoid smoothing artifacts at the fragment boundaries:

Let F = {f1, . . . , fl} be the set of fragments identified on the time series, T, let fi ∈ F be a fragment, and let th be a time instant in fi. Let also:

• wi = te,i − ts,i denote the length of the fragment fi (below, we write w = wi),

• th⊥ = th − wi/2,

• th> = th + wi/2,

• f(t) denote a function that returns the fragment that contains the time instant t.

Given these, CA-Smooth computes the instantaneous smoothing scale, σ(th), to be applied at time instant th as follows:

• if f(th⊥) = f(th>) = fi, then the time instant th is sufficiently far from the two boundaries of the fragment and we can apply the smoothing function σ(th) = σ(fi);

• if th⊥ < 1 or th> > n, then th is closer to the start or end of the time series than to any neighboring fragment and, thus, we again have σ(th) = σ(fi);

• if, however, f(th⊥) = fi−1, then the point is closer to fragment fi−1 and, therefore, the smoothing scale will be a weighted average of the smoothing scales of the fragments fi−1 and fi:

    σ(th) = ((ts,i − th⊥) / w) · σ(fi−1) + ((th> − ts,i) / w) · σ(fi);

• similarly, if f(th>) = fi+1, then the point is closer to fragment fi+1 and the smoothing scale will be a weighted average of the smoothing scales of the fragments fi and fi+1:

    σ(th) = ((th> − te,i) / w) · σ(fi+1) + ((te,i − th⊥) / w) · σ(fi).

Intuitively, we are fitting a length-w window around each time point th, where w corresponds to the length of the fragment fi that contains the time point th. Given this window, the degree of smoothing at time th is computed based on how much this window overlaps² with the fragments that come before or after the fragment fi.
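The per-instant blending can be sketched as follows, assuming (per the length-w window interpretation above) that th⊥ = th − w/2 and th> = th + w/2, with w the length of the containing fragment. The function and variable names are illustrative.

```python
def instantaneous_sigma(t, fragments, frag_sigma):
    """Blend per-fragment scales near fragment boundaries.

    `fragments` is an ordered list of (start, end) scopes tiling the series;
    `frag_sigma` maps a fragment index to its smoothing scale sigma(f_i)."""
    i = next(k for k, (s, e) in enumerate(fragments) if s <= t < e)
    s, e = fragments[i]
    w = e - s
    lo, hi = t - w / 2.0, t + w / 2.0        # th_bottom, th_top
    if lo < fragments[0][0] or hi > fragments[-1][1]:
        return frag_sigma[i]                 # near the start/end of the series
    if lo >= s and hi <= e:
        return frag_sigma[i]                 # window fully inside the fragment
    if lo < s:                               # window overlaps the previous fragment
        return ((s - lo) / w) * frag_sigma[i - 1] + ((hi - s) / w) * frag_sigma[i]
    # otherwise the window overlaps the next fragment
    return ((hi - e) / w) * frag_sigma[i + 1] + ((e - lo) / w) * frag_sigma[i]
```

In each blended case the two weights sum to 1, so the scale varies linearly from one fragment's value to the next instead of jumping at the boundary.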

3.4.2 Adjusting Instantaneous Smoothing
Note that, so far, we have not considered the degree of smoothing, σg, global for the entire series. Intuitively, given a target degree, σg, for the series, the average of all instantaneous smoothing degrees should be σg. Therefore, in the final step, we adjust the instantaneous smoothing degrees to reflect σg:

    σ′(th) = σg · σ(th) / ( (1/n) Σ(1≤j≤n) σ(tj) );

i.e., each instantaneous degree is rescaled by the ratio of σg to the mean of the instantaneous degrees, so that the adjusted degrees average to σg.
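Read as a rescaling by the mean, the adjustment is a one-liner (a sketch; `sigmas` is assumed to hold σ(th) for every instant of the series):

```python
def adjust_to_global(sigmas, sigma_g):
    """Rescale instantaneous sigmas so that their mean equals sigma_g."""
    mean = sum(sigmas) / len(sigmas)
    return [sigma_g * s / mean for s in sigmas]
```

The relative proportions between instants are preserved; only the overall level is shifted to the global target.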

² Note that the window can overlap with only one of the fragments fi−1 or fi+1 at a time, never both; note also that the only situation in which the window does not overlap with either of the two neighboring fragments is when th is at the center of the fragment fi.

Table 1: Dataset Characteristics

    DataSet            #series
    Coffee2            286
    FaceFour           350
    Gun                150
    ECG200             96
    synthetic control  60
    Lighting2 TEST     638

Table 2: Experiment Configurations

    Parameter         Value
    σg                2% of the series length
    τ                 c × (6 × σg)
    c                 1, 2, 3
    σ0                (c/2) × σg
    # of octaves (O)  2

Once the scale σ(th) is computed for time instance th, we apply a Gaussian filter with standard deviation σ(th), centered around th. As we discussed earlier, this corresponds to a filtering window (th − 3σ(th), th + 3σ(th)). At the very boundaries of the time series, where the start or end of the smoothing interval may fall beyond the boundaries, we suitably scale σ(th) to ensure that the smoothing filter always falls within the time series.

4. EVALUATION
In this section, we present case studies to illustrate the impact of different content-adaptive smoothing criteria. For this purpose, we use six data sets with different temporal characteristics. These data sets, listed in Table 1, are publicly available at [1].

4.1 Configurations
In Table 2, we report the configuration parameters we consider. Here, σg is the global smoothing parameter, which would be used by a fixed smoothing strategy. The adaptive methods considered in this paper adapt the instantaneous degree of smoothing based on the characteristics of the data, in such a way that the overall average smoothing across the entire series is equal to σg. As described in Section 3.2, the parameter τ is the length of the smallest fragment created and is a multiple, c, of the average smoothing window size, (6 × σg). In this paper, we rely on the process described in [4] for extracting locally-salient robust features (Section 3.1), which are then used for identifying the fragment boundaries (Section 3.2); σ0 and O are two parameters used by this algorithm to control the feature sizes. In particular, the size of the smallest feature identified by the algorithm is 6 × σ0, whereas the size of the largest feature is 6 × σ0 × 2^O. As we see in Table 2, the sizes of these features are also a function of the target smoothing rate, σg.

4.2 Results
In Figures 6 and 7, we present the smoothing samples for the six data sets; in particular, figures labeled (a), (c), and (e) show the instantaneous smoothing degrees computed by CA-Smooth under four entropy- and representativeness-based smoothing criteria discussed in Section 3.3.3, whereas figures


[Figure 6, panels (a)–(f): plots not recoverable from this transcription]

Figure 6: Impact of different filtering strategies

labeled (a), (b), and (c) show the resulting content-adapted smoothed series.

Let us consider the Gun series presented in Figures 7(a) and 7(b). As we see in Figure 7(a), the entropy-based criterion E (Ent+ in the figure) is active where there is major change in the series and, as expected, the opposite criterion, E (Ent- in the figure), provides higher instantaneous smoothing weights where the series is relatively flat. The figure also shows that different regions of the series are associated with different degrees of representativeness, as described in Section 3.3.2. In the figure, Rep+ corresponds to the criterion R, which applies high smoothing on parts of the series with high representativeness, whereas Rep- corresponds to the opposite criterion, R, which smooths primarily parts of the series that are not representative. As we see in Figure 7(b), different criteria lead to different degrees of smoothing, especially at the parts of the series with large change – the appropriate criterion to select would be a function of the underlying task.
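The entropy-based weighting discussed above can be sketched as follows. This is a minimal illustration, not the paper's exact formulation: the sliding-window size, the number of histogram bins, and the min-max normalization of the weights are all assumptions made here for concreteness.

```python
import numpy as np

def entropy_weights(series, window=8, bins=10):
    """Illustrative sketch: per-instant Shannon entropy of a sliding
    window, min-max normalized to [0, 1] so it can serve as an
    instantaneous smoothing weight."""
    n = len(series)
    ent = np.zeros(n)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        hist, _ = np.histogram(series[lo:hi], bins=bins)
        p = hist[hist > 0] / hist.sum()
        ent[i] = -(p * np.log2(p)).sum()
    # Ent+ assigns high weight where the series changes (high entropy);
    # the opposite criterion, Ent-, assigns high weight where it is flat.
    ent = (ent - ent.min()) / (np.ptp(ent) + 1e-12)
    return ent, 1.0 - ent
```

Under this sketch, the Ent+ weights would scale the local filter width up in fast-changing regions, while the Ent- weights would do so in the flat regions, matching the behaviors visible in Figure 7(a).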

Finally, Figure 8 shows the impact of different values of c, which controls the lower bound, τ, of fragment sizes, on the instantaneous smoothing. As we see in these figures, for lower values of c we obtain more fragments, and the instantaneous smoothing degrees become more impacted by the very local characteristics of the series; on the other hand, as the value of c increases, the number of fragments is reduced and the degree of instantaneous smoothing becomes impacted by the characteristics of larger fragments.
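The effect of the fragment-size lower bound can be sketched with a toy segmenter. This is not the paper's fragmentation algorithm: the jump detector (one standard deviation above the mean absolute difference) is a stand-in assumption; only the role of the lower bound tau mirrors the behavior described above.

```python
import numpy as np

def fragment_boundaries(series, tau):
    """Illustrative sketch: greedy segmentation at large jumps,
    never emitting a fragment shorter than the lower bound tau.
    The jump threshold is an assumption for illustration only."""
    diffs = np.abs(np.diff(series))
    thresh = diffs.mean() + diffs.std()
    bounds, last = [0], 0
    for i, d in enumerate(diffs, start=1):
        # Accept a boundary only if the resulting fragment is long enough.
        if d > thresh and i - last >= tau:
            bounds.append(i)
            last = i
    bounds.append(len(series))
    return bounds
```

With a small tau the series splits into many short fragments driven by very local jumps; raising tau suppresses nearby boundaries, so each instantaneous smoothing weight ends up reflecting the statistics of a larger fragment.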

Figure 7: Impact of different filtering strategies (cont.)

5. CONCLUSIONS

Arguing that time series smoothing must be performed in a way that is adaptive to the temporally-varying characteristics of the input series, in this paper we proposed a content-adaptive smoothing strategy, CA-Smooth. The proposed technique relies on a locally-robust feature extraction approach to locate robust time series fragments and then identify characteristics, such as entropy and representativeness, of these fragments. CA-Smooth treats different parts of the time series according to the discovered fragment characteristics and associates instantaneous smoothing weights to each time instant in a way that represents both the user's global smoothing target and the local series characteristics.
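The overall idea of applying per-instant smoothing weights can be sketched as a variable-width Gaussian smoother. This is a simplified stand-in for the method, under stated assumptions: the linear mapping from a weight in [0, 1] to the Gaussian width, and the cap sigma_max, are choices made here for illustration, not CA-Smooth's exact scheme.

```python
import numpy as np

def ca_smooth_sketch(series, weights, sigma_max=5.0):
    """Illustrative sketch of content-adaptive smoothing: each time
    instant gets its own Gaussian kernel whose width is scaled by an
    instantaneous smoothing weight in [0, 1]."""
    n = len(series)
    out = np.empty(n)
    idx = np.arange(n)
    for i in range(n):
        # Weight 0 -> (near) no smoothing; weight 1 -> widest kernel.
        sigma = max(1e-6, weights[i] * sigma_max)
        kernel = np.exp(-((idx - i) ** 2) / (2.0 * sigma ** 2))
        out[i] = np.dot(kernel, series) / kernel.sum()
    return out
```

A weight vector derived from local entropy or representativeness would then smooth flat or non-representative regions heavily while leaving sharp, informative transitions largely intact.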

6. REFERENCES

[1] Time series dataset at UC Riverside. https://www.cs.ucr.edu/~eamonn/time_series_data/.

[2] An evaluation of time-series smoothing algorithms for land-cover classifications using MODIS-NDVI multi-temporal data. Remote Sensing of Environment, 174:258–265, 2016.

[3] S. Brin and L. Page. The anatomy of a large-scale hypertextual web search engine. Computer Networks and ISDN Systems, 30(1-7):107–117, 1998.

[4] K. S. Candan, R. Rossini, X. Wang, and M. L. Sapino. sDTW: computing DTW distances using locally relevant constraints based on salient feature alignments. Proceedings of the VLDB Endowment, 5(11):1519–1530, 2012.

[5] X. Chen, V. Mithal, S. R. Vangala, I. Brugere, S. Boriah, and V. Kumar. A study of time series noise reduction techniques in the context of land cover change detection. Department of Computer Science and Engineering Technical Report 11-016, 2011.

[6] J. Gao, H. Sultan, J. Hu, and W.-W. Tung. Denoising nonlinear time series by adaptive filtering and wavelet shrinkage: a comparison. IEEE Signal Processing Letters, 17(3):237–240, 2009.

[7] E. Keogh, S. Chu, D. Hart, and M. Pazzani. Segmenting time series: A survey and novel approach. In Data Mining in Time Series Databases, pages 1–22. World Scientific Publishing Company, 2004.



Figure 8: Impact of different fragment length thresholds

[8] J. Large, P. Southam, and A. Bagnall. Can automated smoothing significantly improve benchmark time series classification algorithms? In H. Pérez García, L. Sánchez González, M. Castejón Limas, H. Quintián Pardo, and E. Corchado Rodríguez, editors, Hybrid Artificial Intelligent Systems, pages 50–60, Cham, 2019. Springer International Publishing.

[9] W.-H. Lee, J. Ortiz, B. Ko, and R. Lee. Time series segmentation through automatic feature learning, 2018.

[10] Y. Qi and K. S. Candan. CUTS: CUrvature-based development pattern analysis and segmentation for blogs and other text streams. In HYPERTEXT 2006, Proceedings of the 17th ACM Conference on Hypertext and Hypermedia, August 22–25, 2006, Odense, Denmark, pages 1–10, 2006.

[11] K. Rong and P. Bailis. ASAP: prioritizing attention via time series smoothing. PVLDB, 10(11):1358–1369, 2017.

[12] H. Sakoe and S. Chiba. Dynamic programming algorithm optimization for spoken word recognition. IEEE Transactions on Acoustics, Speech, and Signal Processing, 26(1):43–49, 1978.

[13] C. E. Shannon. A mathematical theory of communication. Bell System Technical Journal, 27(3):379–423, 1948.

[14] K. Shin, J. Hammond, and P. White. Iterative SVD method for noise reduction of low-dimensional chaotic time series. Mechanical Systems and Signal Processing, 13(1):115–124, 1999.

[15] J. Verbesselt, R. Hyndman, G. Newnham, and D. Culvenor. Detecting trend and seasonal changes in satellite image time series. Remote Sensing of Environment, 114(1):106–115, 2010.

[16] S. Winsberg. Review of Jeffrey S. Simonoff, Smoothing Methods in Statistics. Psychometrika, 62:163–164, 1997.