![Page 1: [width=3.3cm]images/LogoMOA.jpg .5cm Concept Driftabifet/523/ConceptDrift-Slides.pdf · COMP423A/COMP523A Data Stream Mining Outline 1.Introduction 2.Stream Algorithmics 3. Concept](https://reader033.vdocument.in/reader033/viewer/2022060219/5f06e0f67e708231d41a3045/html5/thumbnails/1.jpg)
Concept Drift
Albert Bifet
March 2012
COMP423A/COMP523A Data Stream Mining
Outline
1. Introduction
2. Stream Algorithmics
3. Concept drift
4. Evaluation
5. Classification
6. Ensemble Methods
7. Regression
8. Clustering
9. Frequent Pattern Mining
10. Distributed Streaming
Data Streams
Big Data & Real Time
Data Mining Algorithms with Concept Drift.
[Diagram: two block diagrams. First: input → DM Algorithm (Static Model) → output, with a Change Detect. block attached. Second: input → DM Algorithm → output, together with Estimator1 to Estimator5.]
Introduction.
Problem
Given an input sequence $x_1, x_2, \dots, x_t$, we want to output at instant $t$ an alarm signal if there is a distribution change, and also a prediction $\hat{x}_{t+1}$ minimizing the prediction error:
$|\hat{x}_{t+1} - x_{t+1}|$
Outputs
- an estimation of some important parameters of the input distribution, and
- an alarm signal indicating that a distribution change has recently occurred.
Change Detectors and Predictors
[Diagram: x_t → Estimator → Estimation]
Change Detectors and Predictors
[Diagram: x_t → Estimator → Estimation, plus a Change Detect. block that outputs an Alarm]
Change Detectors and Predictors
[Diagram: x_t → Estimator → Estimation, plus a Change Detect. block that outputs an Alarm, and a Memory block linked to the other components]
Concept Drift Evaluation
Mean Time between False Alarms (MTFA)
Mean Time to Detection (MTD)
Missed Detection Rate (MDR)
Average Run Length (ARL(θ))
The design of a change detector is a compromise between detecting true changes and avoiding false alarms.
Data Stream Algorithmics
- High accuracy in the prediction
- Low mean time to detection (MTD), false alarm rate (FAR) and missed detection rate (MDR)
- Low computational cost: minimum space and time needed
- Theoretical guarantees
- No parameters needed
Main properties of an optimal change detector and predictor system.
The CUSUM Test
- The cumulative sum (CUSUM) algorithm gives an alarm when the mean of the input data is significantly different from zero.
- The CUSUM test is memoryless, and its accuracy depends on the choice of parameters $\upsilon$ and $h$.
$g_0 = 0, \quad g_t = \max(0,\ g_{t-1} + \varepsilon_t - \upsilon)$
if $g_t > h$ then alarm and $g_t = 0$
Cumulative sum algorithm (CUSUM).
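The recurrence above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `cusum` and the convention of returning alarm positions as 0-based indices are assumptions made for the example.

```python
def cusum(values, upsilon, h):
    """Return the 0-based positions at which the CUSUM test raises an alarm.

    values  -- the input sequence (the epsilon_t of the recurrence)
    upsilon -- allowed drift subtracted at every step
    h       -- alarm threshold
    """
    alarms = []
    g = 0.0  # g_0 = 0
    for t, eps in enumerate(values):
        # g_t = max(0, g_{t-1} + eps_t - upsilon)
        g = max(0.0, g + eps - upsilon)
        if g > h:
            alarms.append(t)
            g = 0.0  # restart the statistic after an alarm
    return alarms


# Ten values around zero followed by ten values shifted to 1.
stream = [0.0] * 10 + [1.0] * 10
alarm_positions = cusum(stream, upsilon=0.5, h=2.0)
```

A larger `h` lowers the false alarm rate at the cost of a longer detection delay, which is the compromise the evaluation slide describes.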
Page Hinckley Test
- The CUSUM test
$g_0 = 0, \quad g_t = \max(0,\ g_{t-1} + \varepsilon_t - \upsilon)$
if $g_t > h$ then alarm and $g_t = 0$
- The Page Hinckley Test
$g_0 = 0, \quad g_t = g_{t-1} + (\varepsilon_t - \upsilon)$
$G_t = \min(g_t)$
if $g_t - G_t > h$ then alarm and $g_t = 0$
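A minimal Python sketch of the Page Hinckley recurrence follows. It reads $G_t$ as the minimum of $g$ observed so far. The function name `page_hinckley` is hypothetical, and restarting the running minimum together with $g_t$ after an alarm is an assumption of this sketch; the slide only states that $g_t$ is set to 0.

```python
def page_hinckley(values, upsilon, h):
    """Return the 0-based positions at which the Page Hinckley test alarms."""
    alarms = []
    g = 0.0      # g_0 = 0
    g_min = 0.0  # G_t, the smallest g seen so far
    for t, eps in enumerate(values):
        g += eps - upsilon        # g_t = g_{t-1} + (eps_t - upsilon)
        g_min = min(g_min, g)     # G_t = min(g_t)
        if g - g_min > h:
            alarms.append(t)
            g = 0.0
            g_min = 0.0  # assumption: the minimum is restarted as well
    return alarms


stream = [0.0] * 10 + [1.0] * 10
alarm_positions = page_hinckley(stream, upsilon=0.5, h=2.0)
```

Unlike CUSUM, the statistic is not clipped at zero; the test compares the current sum with its lowest point instead.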
Geometric Moving Average Test
- The CUSUM test
$g_0 = 0, \quad g_t = \max(0,\ g_{t-1} + \varepsilon_t - \upsilon)$
if $g_t > h$ then alarm and $g_t = 0$
- The Geometric Moving Average Test
$g_0 = 0, \quad g_t = \lambda g_{t-1} + (1 - \lambda)\varepsilon_t$
if $g_t > h$ then alarm and $g_t = 0$
The forgetting factor $\lambda$ is used to give more or less weight to the last data arrived.
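The same loop structure gives a sketch of the Geometric Moving Average test. The function name `geometric_moving_average` and the choice of example values are assumptions made for illustration.

```python
def geometric_moving_average(values, lam, h):
    """Return the 0-based positions at which the GMA test raises an alarm.

    lam -- forgetting factor in [0, 1); smaller values weight recent data more
    h   -- alarm threshold
    """
    alarms = []
    g = 0.0  # g_0 = 0
    for t, eps in enumerate(values):
        # g_t = lam * g_{t-1} + (1 - lam) * eps_t
        g = lam * g + (1.0 - lam) * eps
        if g > h:
            alarms.append(t)
            g = 0.0
    return alarms


stream = [0.0] * 5 + [1.0] * 5
alarm_positions = geometric_moving_average(stream, lam=0.5, h=0.8)
```

With `lam` close to 1 the statistic reacts slowly and smooths out noise; with `lam` close to 0 it follows the latest value almost directly.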
Statistical test
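$\hat{\mu}_0 - \hat{\mu}_1 \in N(0,\ \sigma_0^2 + \sigma_1^2)$, under $H_0$
Example: Probability of false alarm of 5%
$\Pr\left( \dfrac{|\hat{\mu}_0 - \hat{\mu}_1|}{\sqrt{\sigma_0^2 + \sigma_1^2}} > h \right) = 0.05$
As $P(X < 1.96) = 0.975$ the test becomes
$\dfrac{(\hat{\mu}_0 - \hat{\mu}_1)^2}{\sigma_0^2 + \sigma_1^2} > 1.96^2$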
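The test above can be written as a short Python function. This is a sketch under the assumption that the two estimates and their variances are already available; the name `means_differ` and the `alpha` parameter are introduced here for illustration. The threshold is taken from the standard normal quantile, which gives roughly 1.96 for a 5% false alarm probability.

```python
from statistics import NormalDist


def means_differ(mu0, var0, mu1, var1, alpha=0.05):
    """Two-sided test of H0 'both estimates share the same mean'.

    var0, var1 -- variances of the two estimates (sigma_0^2 and sigma_1^2)
    alpha      -- accepted probability of a false alarm
    """
    # h such that P(X < h) = 1 - alpha / 2 for a standard normal X
    h = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    statistic = (mu0 - mu1) ** 2 / (var0 + var1)
    return statistic > h ** 2


threshold = NormalDist().inv_cdf(0.975)  # close to 1.96
small_gap = means_differ(0.50, 0.01, 0.52, 0.01)
large_gap = means_differ(0.50, 0.01, 0.90, 0.01)
```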
Concept Drift
6 sigma
Concept Drift
[Figure: error rate (vertical axis, up to 0.8) against number of examples processed (time, 0 to 50000). Annotations: concept drift, p_min + s_min, warning level, drift level, new window.]
Statistical Drift Detection Method (Joao Gama et al. 2004)
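A rough sketch of the drift detection method in the figure, written in Python. The figure only names p_min + s_min and the two levels; the concrete rule used here (warning when p + s exceeds p_min + 2 s_min, drift when it exceeds p_min + 3 s_min, with s = sqrt(p(1 - p)/n) and a minimum of 30 examples) follows the usual description of the method by Gama et al. and should be read as an assumption, as should the class name `DDM` and its parameters.

```python
import math


class DDM:
    """Sketch of a drift detector driven by a classifier's 0/1 errors."""

    def __init__(self, warning_factor=2.0, drift_factor=3.0, min_examples=30):
        self.warning_factor = warning_factor  # assumed warning level multiplier
        self.drift_factor = drift_factor      # assumed drift level multiplier
        self.min_examples = min_examples
        self.reset()

    def reset(self):
        """Start a new window: forget all statistics."""
        self.n = 0
        self.errors = 0
        self.p_min = math.inf
        self.s_min = math.inf

    def update(self, error):
        """error is 1 for a misclassified example, 0 otherwise."""
        self.n += 1
        self.errors += error
        p = self.errors / self.n                 # running error rate
        s = math.sqrt(p * (1.0 - p) / self.n)    # its standard deviation
        if self.n < self.min_examples:
            return "stable"
        if p + s < self.p_min + self.s_min:
            self.p_min, self.s_min = p, s
        if p + s > self.p_min + self.drift_factor * self.s_min:
            self.reset()
            return "drift"
        if p + s > self.p_min + self.warning_factor * self.s_min:
            return "warning"
        return "stable"


# 300 examples with a 10% error rate, then a concept where every prediction fails.
stream = ([0] * 9 + [1]) * 30 + [1] * 100
detector = DDM()
states = [detector.update(e) for e in stream]
first_drift = states.index("drift")
```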
ADWIN: Adaptive Data Stream Sliding Window
Let W = 101010110111111
- Equal & fixed size subwindows: 1010 1011011 1111
- Equal size adjacent subwindows: 1010101 1011 1111
- Total window against subwindow: 10101011011 1111
- ADWIN: All adjacent subwindows:
1 01010110111111
1010 10110111111
1010101 10111111
1010101101 11111
10101011011111 1
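The "all adjacent subwindows" strategy can be sketched by enumerating every cut point of W and comparing the means of the two parts. The helper names `adjacent_splits` and `mean` are hypothetical, and this brute-force version checks every cut, which the exponential histogram shown later avoids.

```python
def adjacent_splits(window):
    """Yield (W0, W1) for every cut point with both parts non-empty."""
    for cut in range(1, len(window)):
        yield window[:cut], window[cut:]


def mean(bits):
    """Fraction of 1s in a string of bits."""
    return sum(int(b) for b in bits) / len(bits)


W = "101010110111111"
splits = list(adjacent_splits(W))
# Absolute difference of means for each split, largest gap first.
gaps = sorted(
    ((abs(mean(w0) - mean(w1)), w0, w1) for w0, w1 in splits),
    reverse=True,
)
```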
Data Stream Sliding Window
101100011110101 0111010
Sliding Window
We can maintain simple statistics over sliding windows, using $O(\frac{1}{\varepsilon} \log^2 N)$ space, where
- $N$ is the length of the sliding window
- $\varepsilon$ is the accuracy parameter
M. Datar, A. Gionis, P. Indyk, and R. Motwani. Maintaining stream statistics over sliding windows. 2002
Exponential Histograms
M = 2
1010101 101 11 1 1 1
Content: 4 2 2 1 1 1
Capacity: 7 3 2 1 1 1

1010101 101 11 11 1
Content: 4 2 2 2 1
Capacity: 7 3 2 2 1

1010101 10111 11 1
Content: 4 4 2 1
Capacity: 7 5 2 1
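The three states above are two successive merges with M = 2. A small Python sketch reproduces them, assuming buckets are stored oldest first as (content, capacity) pairs and that buckets with equal content sit next to each other. The function names `merge_once` and `compress` are hypothetical.

```python
def merge_once(buckets, M):
    """Merge the two oldest buckets of the smallest content held by more than M buckets.

    buckets -- list of (content, capacity) pairs, oldest first
    Returns the new list, or None when no merge is needed.
    """
    counts = {}
    for content, _ in buckets:
        counts[content] = counts.get(content, 0) + 1
    for content in sorted(counts):
        if counts[content] > M:
            # Equal-content buckets are adjacent, so the two oldest are i and i + 1.
            i = next(k for k, (c, _) in enumerate(buckets) if c == content)
            (c0, cap0), (c1, cap1) = buckets[i], buckets[i + 1]
            return buckets[:i] + [(c0 + c1, cap0 + cap1)] + buckets[i + 2:]
    return None


def compress(buckets, M):
    """Apply merges until no content is held by more than M buckets."""
    while True:
        merged = merge_once(buckets, M)
        if merged is None:
            return buckets
        buckets = merged


start = [(4, 7), (2, 3), (2, 2), (1, 1), (1, 1), (1, 1)]
after_one_merge = merge_once(start, M=2)
final = compress(start, M=2)
```

Each merge can push the next content level over the limit, which is why one insertion may trigger a cascade of merges.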
Exponential Histograms
1010101 101 11 1 1
Content: 4 2 2 1 1
Capacity: 7 3 2 1 1

Error < content of the last bucket W/M
ε = 1/(2M) and M = 1/(2ε)
M · log(W/M) buckets to maintain the data stream sliding window
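As a worked example of these relations, the following Python sketch derives M from an accuracy parameter ε and evaluates the bucket count M · log(W/M). The base of the logarithm is not stated on the slide; base 2 is assumed here, and the function name `histogram_size` is hypothetical.

```python
import math


def histogram_size(eps, W):
    """Return (M, bucket_count) for accuracy eps and window length W."""
    M = math.ceil(1.0 / (2.0 * eps))       # M = 1 / (2 eps), rounded up
    bucket_count = M * math.log2(W / M)    # assumption: logarithm in base 2
    return M, bucket_count


# eps = 0.25 gives M = 2, the value used in the example above.
M, bucket_count = histogram_size(eps=0.25, W=1024)
```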
Exponential Histograms
1010101 101 11 1 1
Content: 4 2 2 1 1
Capacity: 7 3 2 1 1

To give answers in O(1) time, it maintains three counters: LAST, TOTAL and VARIANCE.
M · log(W/M) buckets to maintain the data stream sliding window
Algorithm ADaptive Sliding WINdow
ADWIN: ADAPTIVE WINDOWING ALGORITHM
1 Initialize W as an empty list of buckets
2 Initialize WIDTH, VARIANCE and TOTAL
3 for each t > 0
4   do SETINPUT($x_t$, W)
5      output $\hat{\mu}_W$ as TOTAL/WIDTH and ChangeAlarm

SETINPUT(item e, List W)

1 INSERTELEMENT(e, W)
2 repeat DELETEELEMENT(W)
3   until $|\hat{\mu}_{W_0} - \hat{\mu}_{W_1}| < \varepsilon_{cut}$ holds
4     for every split of W into $W = W_0 \cdot W_1$
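The main loop can be sketched in Python with a plain list in place of the bucket structure, so every cut point is checked explicitly. Two things here are assumptions rather than content of the slides: the threshold $\varepsilon_{cut} = \sqrt{\frac{1}{2m} \ln \frac{4}{\delta'}}$ with $m$ the harmonic mean term $1/(1/n_0 + 1/n_1)$ and $\delta' = \delta/n$, taken from the usual Hoeffding-based formulation of ADWIN, and the class name `SimpleADWIN`. The sketch also tests the condition before deleting anything, so an element is dropped only when some split fails the test.

```python
import math
from collections import deque


class SimpleADWIN:
    """Brute-force adaptive window over values in [0, 1]; no bucket compression."""

    def __init__(self, delta=0.01):
        self.delta = delta
        self.window = deque()  # oldest element on the left

    def _eps_cut(self, n0, n1):
        # Assumed Hoeffding-style threshold; not given on the slides.
        m = 1.0 / (1.0 / n0 + 1.0 / n1)
        delta_prime = self.delta / len(self.window)
        return math.sqrt(math.log(4.0 / delta_prime) / (2.0 * m))

    def _some_split_differs(self):
        n = len(self.window)
        total = sum(self.window)
        head = 0.0
        for n0, x in enumerate(self.window, start=1):
            if n0 == n:
                break
            head += x
            n1 = n - n0
            if abs(head / n0 - (total - head) / n1) >= self._eps_cut(n0, n1):
                return True
        return False

    def set_input(self, x):
        """Insert x, shrink the window while any split differs, report change."""
        self.window.append(x)
        change = False
        while len(self.window) > 1 and self._some_split_differs():
            self.window.popleft()  # drop the oldest element
            change = True
        return change

    def estimate(self):
        return sum(self.window) / len(self.window)


adwin = SimpleADWIN(delta=0.01)
changes = [adwin.set_input(x) for x in [0] * 200 + [1] * 200]
```

This version costs time linear in the window length per check; the bucket list in the pseudocode is what brings the number of cut points down to O(log W).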
Algorithm ADaptive Sliding WINdow
INSERTELEMENT(item e, List W)
1 create a new bucket b with content e and capacity 1
2 W ← W ∪ {b} (i.e., add e to the head of W)
3 update WIDTH, VARIANCE and TOTAL
4 COMPRESSBUCKETS(W)

DELETEELEMENT(List W)

1 remove a bucket from tail of List W
2 update WIDTH, VARIANCE and TOTAL
3 ChangeAlarm ← true
Algorithm ADaptive Sliding WINdow
COMPRESSBUCKETS(List W)
1 Traverse the list of buckets in increasing order
2   do If there are more than M buckets of the same capacity
3        do merge buckets
4   COMPRESSBUCKETS(sublist of W not traversed)
Algorithm ADaptive Sliding WINdow
Theorem
At every time step we have:
1. (False positive rate bound). If $\mu_t$ remains constant within W, the probability that ADWIN shrinks the window at this step is at most $\delta$.
2. (False negative rate bound). Suppose that for some partition of W in two parts $W_0 W_1$ (where $W_1$ contains the most recent items) we have $|\mu_{W_0} - \mu_{W_1}| > 2\varepsilon_{cut}$. Then with probability $1 - \delta$ ADWIN shrinks W to $W_1$, or shorter.
ADWIN tunes itself to the data stream at hand, with no need for the user to hardwire or precompute parameters.
Algorithm ADaptive Sliding WINdow
ADWIN using a Data Stream Sliding Window Model,
- can provide the exact counts of 1’s in O(1) time per point.
- tries O(log W) cutpoints
- uses $O(\frac{1}{\varepsilon} \log W)$ memory words
- the processing time per example is O(log W) (amortized and worst-case).
Sliding Window Model
1010101 101 11 1 1
Content: 4 2 2 1 1
Capacity: 7 3 2 1 1