
Page 1: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors Classification for Evolving Data Streams

Maroua Bahri1, Albert Bifet1,2, Silviu Maniu3, Rodrigo F. de Mello4, Nikolaos Tziortziotis5

1LTCI, Télécom Paris, 2University of Waikato, 3LRI, Université Paris-Sud, 4University of São Paulo, 5Tradelab

Page 2: Compressed k-Nearest Neighbors Classification for Evolving


Outline

1 Introduction

2 Compressed k-Nearest Neighbors

3 Experiments

4 Conclusions


Page 3: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

Outline

1 Introduction

2 Compressed k-Nearest Neighbors

3 Experiments

4 Conclusions


Page 4: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

Internet of Things (IoT)

Network of connected devices


Page 5: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

Internet of Things (IoT)

[Chart: number of connected devices, in billions, for the years 2015 to 2025]

Statista predicts around 80 billion IoT devices by 2025


Page 9: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

IoT Stream Mining

Challenges

Memory: use a limited amount of memory

Time: work in a limited amount of time

Dimensionality: handle high-dimensional data

Concept drifts: detect and adapt to changes


Page 10: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

IoT Stream Mining

Incorporate data on the fly

Single pass, one instance at a time

Once processed, it is discarded or archived

Be ready to predict at any instance
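These requirements are typically met with a prequential (test-then-train) loop: each arriving instance is first used to test the current model and then to update it. A minimal sketch in Python (the `predict`/`partial_fit` interface and names are illustrative assumptions, not the paper's code):

```python
def prequential(stream, model):
    """Test-then-train loop: predict on each instance, then learn from it."""
    correct, seen = 0, 0
    for x, y in stream:           # single pass, one instance at a time
        y_hat = model.predict(x)  # be ready to predict at any time
        correct += (y_hat == y)
        model.partial_fit(x, y)   # incorporate the instance on the fly
        seen += 1                 # the raw instance can now be discarded
    return correct / max(seen, 1)
```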


Page 11: Compressed k-Nearest Neighbors Classification for Evolving

Introduction/

Objectives

The available algorithms have significant shortcomings:
• Some are efficient but do not produce accurate models
• Some produce accurate models but are inefficient
• Costly with high-dimensional data

Objectives
• Improve the performance of stream algorithms
• Guarantee good precision
• Trade off resources against accuracy

⇒ Sampling, sketching, dimensionality reduction, ...



Page 14: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

Outline

1 Introduction

2 Compressed k-Nearest Neighbors

3 Experiments

4 Conclusions


Page 15: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

k-NN Algorithm

Uses a sliding window as a search space
Given an unclassified instance Xi from a stream S:

• Determines the k nearest neighbors inside the window
• Predicts the most frequent label

Inefficient at prediction time

Memory consuming

Inefficient with high-dimensional data

⇒ Dimensionality reduction
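For concreteness, a sliding-window kNN can be sketched as follows. This is a simplified Python illustration (class name, window size, and Euclidean distance are assumptions), not the authors' MOA implementation:

```python
from collections import Counter, deque
import numpy as np

class SlidingWindowKNN:
    """kNN over the last `window_size` labelled instances of the stream."""
    def __init__(self, k=5, window_size=1000):
        self.k, self.window = k, deque(maxlen=window_size)

    def predict(self, x):
        if not self.window:
            return None
        dists = [(np.linalg.norm(x - xi), yi) for xi, yi in self.window]
        neighbors = sorted(dists, key=lambda d: d[0])[:self.k]
        return Counter(y for _, y in neighbors).most_common(1)[0][0]

    def partial_fit(self, x, y):
        self.window.append((x, y))  # oldest instance is dropped automatically
```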


Page 18: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

Dimensionality Reduction (DR)

The projection of high-dimensional data into a low-dimensional space by reducing the input features

Objective: given an instance Xi ∈ R^a, we wish to obtain Yi ∈ R^m, where m ≪ a

Principal Component Analysis (PCA)
Random Projection (RP)
Compressed Sensing (CS)
Hashing Trick (HT)


Page 20: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

Compressed Sensing (CS)

A data compression method that transforms and reconstructs data from few samples with high probability
Matrix A is used to transform instances from R^a → R^m, with m ≪ a

• Fourier transform, random matrices (e.g., Bernoulli, Gaussian)

Donoho, Compressed sensing, 2006.
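As an illustration of the compression step, the sensing matrix can be a Gaussian random matrix drawn once and applied to every instance; the dimensions below are arbitrary examples, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(42)
a, m = 1000, 50                                      # original and compressed dimensionality
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, a))   # Gaussian sensing matrix

def compress(x):
    """Map an instance from R^a to R^m: y = A x."""
    return A @ x

x = rng.random(a)
y = compress(x)   # y now has m << a features
```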

Page 21: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

Compressed Sensing (CS)

CS relies on two principles:

Sparsity: expresses the idea that the data are compressible, i.e., can be represented with much fewer non-zero components. X is s-sparse if ‖X‖₀ ≤ s

Restricted Isometry Property (RIP): A satisfies the RIP for every s-sparse instance X ∈ R^a if there exists ε ∈ [0, 1] such that:

(1 − ε) ‖X‖₂² ≤ ‖AX‖₂² ≤ (1 + ε) ‖X‖₂²

A satisfies the RIP with high probability when m is on the order of s log(a)
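This norm preservation can be checked empirically for a single random draw (illustrative only; the RIP statement is probabilistic over the choice of A, so one draw need not attain any particular ε):

```python
import numpy as np

rng = np.random.default_rng(0)
a, m, s = 1000, 200, 20
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, a))

# Build an s-sparse instance: only s non-zero coordinates.
x = np.zeros(a)
support = rng.choice(a, size=s, replace=False)
x[support] = rng.normal(size=s)

ratio = np.linalg.norm(A @ x) ** 2 / np.linalg.norm(x) ** 2
print(f"||Ax||^2 / ||x||^2 = {ratio:.3f}")   # typically close to 1, i.e. small epsilon
```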


Page 23: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

Compressed Sensing kNN (CS-kNN)

[Pipeline: data stream → compressed sensing (compression / feature extraction) → kNN]


Page 24: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

CS-kNN Algorithm

Algorithm Compressed-kNN. Symbols: S = {X1, X2, ...} ∈ R^a: stream; S′ = {Y1, Y2, ...} ∈ R^m: transformed data; C: set of labels; w: window; k: number of neighbors.

1: function CS-kNN(S, k, m)
2:   Init w ← ∅
3:   for all Xi ∈ S do
4:     Yi ← CS(Xi, m)                ▷ apply CS
5:     for all Yj ∈ w do
6:       compute D_Yj(Yi)
7:     c ← Predict_{c ∈ C} D_{w,k}(Yi)
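A hedged, runnable rendering of this pseudocode in Python, combining the Gaussian sensing matrix with a sliding-window kNN (all parameter values are illustrative; the actual CS-kNN is the authors' implementation linked at the end of the deck):

```python
import numpy as np
from collections import Counter, deque

class CSKNN:
    """Compressed-sensing kNN: project each instance with A, then do windowed kNN."""
    def __init__(self, a, m, k=5, window_size=1000, seed=42):
        rng = np.random.default_rng(seed)
        self.A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, a))  # fixed sensing matrix
        self.k, self.window = k, deque(maxlen=window_size)

    def _compress(self, x):
        return self.A @ x   # R^a -> R^m

    def predict(self, x):
        y = self._compress(x)
        if not self.window:
            return None
        dists = sorted(((np.linalg.norm(y - yj), cj) for yj, cj in self.window),
                       key=lambda t: t[0])
        return Counter(c for _, c in dists[:self.k]).most_common(1)[0][0]

    def partial_fit(self, x, c):
        self.window.append((self._compress(x), c))   # only the compressed instance is kept
```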


Page 25: Compressed k-Nearest Neighbors Classification for Evolving

Compressed k-Nearest Neighbors/

CS-kNN: Theoretical Guarantees

The distance between two instances Xi and Xj is defined as:

D_Xj(Xi) = √(‖Xi − Xj‖₂²)

Similarly, the k-nearest-neighbors distance is defined as:

D_w,k(Xi) = min_{(w k), Xj ∈ w} D_Xj(Xi)

where (w k) denotes the subsets of w of size k.

Theorem. Given a stream S = {Xi} and ε ∈ [0, 1], if there exists a transformation matrix A : R^a → R^m having the RIP, such that m = O(s log(a)), where s is the sparsity of the data, then ∀ Xi ∈ w:

(1 − ε) D²_w,k(X) ≤ D²_w,k(AX) ≤ (1 + ε) D²_w,k(X)

→ CS-kNN captures the neighborhood up to some ε-divergence
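The guarantee can be sanity-checked numerically by comparing the k-nearest-neighbor distance of a query before and after projection with a Gaussian matrix (a rough empirical sketch under arbitrary dimensions, not a proof):

```python
import numpy as np

rng = np.random.default_rng(1)
a, m, k, n = 500, 60, 5, 200
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, a))

window = rng.random((n, a))   # window of instances in R^a
query = rng.random(a)

def knn_dist(q, pts, k):
    """Distance from q to its k-th nearest neighbour in pts."""
    d = np.linalg.norm(pts - q, axis=1)
    return np.sort(d)[k - 1]

d_orig = knn_dist(query, window, k)
d_comp = knn_dist(A @ query, window @ A.T, k)
print(f"squared kNN-distance ratio: {(d_comp / d_orig) ** 2:.3f}")  # close to 1 for reasonable m
```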


Page 27: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

Outline

1 Introduction

2 Compressed k-Nearest Neighbors

3 Experiments

4 Conclusions


Page 28: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

MOA

Massive Online Analysis (MOA) is a framework for online learning from data streams

It is written in Java

It includes tools for evaluation and a collection of machine learning algorithms for data streams

It is an easily extendable framework for new streams, algorithms and evaluationmethods

https://moa.cms.waikato.ac.nz/

Page 29: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

Results

Accuracy (%)

Dataset    CS-kNN  HT-kNN  PCA-kNN  kNN
Tweet1     78.82   73.77   80.43    79.80
Tweet2     78.13   73.02   80.06    79.20
Tweet3     76.75   72.40   81.93    78.86
RBF        98.90   19.20   99.00    98.89
CNAE       70.00   65.00   75.83    73.33
Enron      96.02   95.76   94.59    96.18
IMDB       69.86   69.65   70.57    70.94
Spam       85.39   83.82   96.00    81.17
Covt       91.36   77.18   91.55    91.67
Overall ∅  82.80   69.98   85.55    83.34


Page 30: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

Results

Memory (MB)

Dataset    CS-kNN  HT-kNN  PCA-kNN  kNN
Tweet1     2.52    2.52    3.03     34.64
Tweet2     2.52    2.52    5.97     70.97
Tweet3     2.52    2.52    8.84     103.19
RBF        2.52    2.52    8.86     13.18
CNAE       2.52    2.52    3.09     61.37
Enron      2.52    2.52    3.51     70.60
IMDB       2.52    2.52    8.81     70.65
Spam       2.52    2.52    245.22   1476.11
Covt       2.52    2.52    3.02     3.47
Overall ∅  2.52    2.52    32.26    211.57


Page 31: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

Results

Time (sec)

Dataset    CS-kNN  HT-kNN  PCA-kNN   kNN
Tweet1     62.55   93.24   622.65    1198.78
Tweet2     107.48  120.83  705.71    2029.82
Tweet3     126.73  154.22  988.25    2864.55
RBF        59.47   168.31  243.26    284.34
CNAE       0.87    0.95    3.97      32.19
Enron      1.58    1.81    7.21      86.08
IMDB       95.62   125.62  1686.88   7892.96
Spam       159.92  194.07  11329.91  34231.45
Covt       30.94   88.17   161.00    252.69
Overall ∅  71.68   105.25  1749.87   5430.32


Page 32: Compressed k-Nearest Neighbors Classification for Evolving

Experiments/

Ensemble CS-kNN (CSB)

Ensemble-based method

[Diagram: instances D are passed to learners 1…e; their predictions are combined into the final label]

Uses CS-kNN as a base learner under Leveraging Bagging (LB)

Uses several random matrices : one for each ensemble member

Preserves the neighborhood properties of the CS-kNN

⊕ Good accuracy

⊖ Computational resources

Bifet, et al., Leveraging bagging for evolving data streams, 2010.
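A rough sketch of the CSB idea (not Leveraging Bagging itself, which additionally uses Poisson-weighted resampling and drift detection): each ensemble member keeps its own random sensing matrix and its own compressed window, and predictions are combined by majority vote. It reuses the illustrative `CSKNN` class sketched earlier; member count and parameters are assumptions.

```python
from collections import Counter

class CSBEnsemble:
    """Ensemble of CS-kNN members, each with its own random sensing matrix."""
    def __init__(self, a, m, n_members=10, k=5, window_size=1000):
        self.members = [CSKNN(a, m, k=k, window_size=window_size, seed=i)
                        for i in range(n_members)]

    def predict(self, x):
        votes = [mem.predict(x) for mem in self.members]
        votes = [v for v in votes if v is not None]
        return Counter(votes).most_common(1)[0][0] if votes else None

    def partial_fit(self, x, c):
        for mem in self.members:   # Leveraging Bagging would weight each update by Poisson(lambda)
            mem.partial_fit(x, c)
```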


Page 35: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Outline

1 Introduction

2 Compressed k-Nearest Neighbors

3 Experiments

4 Conclusions


Page 36: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Conclusions

CS-kNN algorithm
• Reduces the resource usage
• Preserves the neighborhood

CSB-kNN ensemble method
• Improves accuracy
• Is slow

Open source contribution: https://github.com/marouabahri/CS-kNN


Page 37: Compressed k-Nearest Neighbors Classification for Evolving

Thank you

This work was done in the context of the IoT Analytics DigiCosme project and was funded by Labex DigiCosme.


Page 39: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Overview of the data

Dataset  #Instances  #Attributes  #Classes  Type
Tweets1  1,000,000   500          2         Synthetic
Tweets2  1,000,000   1,000        2         Synthetic
Tweets3  1,000,000   1,500        2         Synthetic
RBF      1,000,000   200          10        Synthetic
CNAE     1,080       856          9         Real
Enron    1,702       1,000        2         Real
IMDB     120,919     1,001        2         Real
Spam     9,324       39,916       2         Real
Covt     581,012     54           7         Real


Page 40: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Results: Standard Deviation

[Plots of standard-deviation estimates vs. number of dimensions (10 to 90) for CS-ARF, LBcs, SRPcs, HATcs, SAMkNNcs, and NBcs: (a) Tweet3, (b) Har]


Page 41: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Results: CS-kNN with the Enron Dataset

[Plots of accuracy, memory (log scale), and time (log scale) vs. number of dimensions (10 to 50) on Enron, for CS-kNN, HT-kNN, PCA-kNN, and kNN]


Page 42: Compressed k-Nearest Neighbors Classification for Evolving

Conclusions/

Classification

1. Process an instance at a time

2. Use a limited amount of memory

3. Work in a limited amount of time

4. Be ready to predict at any instance

Bifet, et al., Data stream mining a practical approach, 2009.
