TACO: Tunable Approximate Computation of Outliers in Wireless Sensor Networks
![Page 1: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/1.jpg)
TACO:
Tunable Approximate Computation of Outliers in
Wireless Sensor Networks
8 July 2010 HDMS 2010, Ayia Napa, Cyprus
Nikos Giatrakos*, Yannis Kotidis†, Antonios Deligiannakis#, Vasilis Vassalos†, Yannis Theodoridis*
* Dept. of Informatics, University of Piraeus, Piraeus, Greece
† Dept. of Informatics, Athens University of Economics and Business, Athens, Greece
# Dept. of Electronic and Computer Engineering, Technical University of Crete, Crete, Greece
![Page 2: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/2.jpg)
ΠΑΟ (the Greek acronym of TACO): Approximate Computation of Outliers in Wireless Sensor Network Environments
![Page 3: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/3.jpg)
Outline
• Introduction
 – Why outlier detection is important
 – Definition of outlier
• The TACO Framework
 – Compression of measurements at the sensor level (LSH)
 – Outlier detection within and amongst clusters
 – Optimizations: Boosting Accuracy & Load Balancing
• Experimental Evaluation
• Related Work
• Conclusions
![Page 4: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/4.jpg)
Introduction
• Wireless Sensor Networks utility
 – Place inexpensive, tiny motes in areas of interest
 – Perform continuous querying operations
 – Periodically obtain reports of quantities under study
 – Support sampling procedures, monitoring/surveillance applications, etc.
• Constraints
 – Limited power supply
 – Low processing capabilities
 – Constrained memory capacity
• Remark: data communication is the main factor of energy drain
![Page 5: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/5.jpg)
Why Outlier Detection is Useful
• Outliers may denote malfunctioning sensors
 – sensor measurements are often unreliable
 – dirty readings affect computations/decisions [Deligiannakis ICDE ’09]
• Outliers may also represent interesting events detected by few sensors
 – e.g., fire detected by a sensor
• Take into consideration
 – the recent history of samples acquired by single motes
 – correlations with measurements of other motes!
[Figure: example sequence of mote readings: 16 19 24 30 32 40 39]
![Page 6: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/6.jpg)
Outlier Definition
• Let u_i denote the latest W measurements obtained by mote S_i
• Given a similarity metric sim: R^W × R^W → [0,1] and a similarity threshold Φ, sensors S_i, S_j are considered similar if:
 sim(u_i, u_j) > Φ
• Minimum Support Requirement
 – a mote is classified as an outlier if its latest W measurements are not found to be similar to those of at least minSup other motes
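The definition above can be sketched directly on uncompressed measurements, before any TACO encoding is involved. This is a minimal illustration, not the authors' code: the names `cosine_sim` and `find_outliers` are hypothetical, cosine similarity is only one possible choice of sim, and the sketch assumes non-zero windows of positive readings so that the cosine falls in [0,1].

```python
import math
from itertools import combinations

def cosine_sim(u, v):
    """Cosine of the angle between two measurement windows (assumes non-zero vectors)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def find_outliers(windows, sim, phi, min_sup):
    """windows: dict mapping a mote id to its latest W measurements.

    A mote is reported when fewer than min_sup other motes are similar to it.
    """
    support = {mote: 0 for mote in windows}
    for a, b in combinations(windows, 2):
        if sim(windows[a], windows[b]) > phi:
            support[a] += 1
            support[b] += 1
    return sorted(mote for mote, s in support.items() if s < min_sup)
```

With three motes reporting nearly identical windows and a fourth reporting an erratic one, only the fourth fails a minSup of 2.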
![Page 7: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/7.jpg)
TACO Framework – General Idea
Network organization into clusters [(Younis et al, INFOCOM ’04), (Qin et al, J. UCS ’07)]
Step 1: Data Encoding and Reduction
• Motes obtain samples and keep the latest W measurements in a tumble
• Encode W in a bitmap of d<<W size
[Figure: clusterheads and regular sensors; a mote's W measurements (8.2, 4.3, …, 5.1) are encoded into a d-bit bitmap (0, 0, …, 1)]
![Page 8: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/8.jpg)
TACO Framework – General Idea
Step 2: Intra-cluster Processing
• Encodings are transmitted to clusterheads
• Clusterheads perform similarity tests based on a given similarity measure and a similarity threshold Φ
• … and calculate support values:
 if sim(u_i, u_j) > Φ { support(S_i)++; support(S_j)++; }
![Page 9: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/9.jpg)
TACO Framework – General Idea
Step 3: Inter-cluster Processing
• An approximate TSP problem is solved
• Lists of potential outliers are exchanged
Additional load-balancing mechanisms and accuracy improvements are also devised
![Page 10: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/10.jpg)
![Page 11: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/11.jpg)
Data Encoding and Reduction
• Desired Properties
 – Dimensionality Reduction: reduced bandwidth consumption
 – Similarity Preservation: allows us to later derive the initial sim(u_i, u_j) during vector comparisons
• Locality Sensitive Hashing (LSH)
 P_{h ∈ F}[h(u_i) = h(u_j)] = sim(u_i, u_j)
• Practically, any similarity measure satisfying a set of criteria [Charikar, STOC ’02] may be incorporated in TACO’s framework
![Page 12: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/12.jpg)
LSH Example: Random Hyperplane Projection [(Goemans & Williamson, J.ACM ’95), (Charikar, STOC ’02)]
• Family of n d-dimensional random vectors (rv_i)
• Generates for each data vector a bitmap of size n as follows:
 - Sets bit_i=1 if the dot product of the data vector with the i-th random vector is positive
 - Sets bit_i=0 otherwise
[Figure: a 2-dimensional sensor data vector and four random vectors rv1–rv4; resulting TACO encoding: 1 0 1 0]
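The encoding rule on this slide can be sketched in a few lines. The function names `make_random_vectors` and `rhp_encode` are hypothetical, and drawing the random vector components from a standard Gaussian is an assumption (the slide does not state the distribution); every mote would need the same family of vectors, which the shared seed stands in for here.

```python
import random

def make_random_vectors(n, dim, seed=0):
    """Draw n random vectors of dimension dim (Gaussian components are an assumption)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]

def rhp_encode(u, random_vectors):
    """Return one bit per random vector: 1 if the dot product with u is positive, else 0."""
    bits = []
    for rv in random_vectors:
        dot = sum(a * b for a, b in zip(u, rv))
        bits.append(1 if dot > 0 else 0)
    return bits
```

For instance, with the hand-picked vectors (1,0), (0,-1), (1,1), (-1,0) and the data vector (2,1), the dot products are 2, -1, 3 and -2, which yields the bitmap 1 0 1 0. Only the sign of each dot product matters, so scaling the data vector leaves the bitmap unchanged.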
![Page 13: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/13.jpg)
Computing Similarity
• Cosine Similarity: cos(θ(u_i, u_j))
• Angle similarity: the angle θ(u_i, u_j) is estimated from the Hamming distance of the n-bit bitmaps
 RHP(u_i) = 1 0 1 1 1 1
 RHP(u_j) = 0 0 0 1 1 1
 θ(RHP(u_i), RHP(u_j)) = 2/6 · π = π/3
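The estimate on this slide is the fraction of differing bits scaled by π. A minimal sketch, with hypothetical helper names, that reproduces the slide's 6-bit example:

```python
import math

def hamming_distance(x, y):
    """Number of positions in which two equal-length bitmaps differ."""
    return sum(a != b for a, b in zip(x, y))

def estimate_angle(x, y):
    """Estimate the angle (in radians) between the original vectors from their bitmaps."""
    return math.pi * hamming_distance(x, y) / len(x)

x_i = [1, 0, 1, 1, 1, 1]
x_j = [0, 0, 0, 1, 1, 1]
theta = estimate_angle(x_i, x_j)      # 2 differing bits out of 6, i.e. pi/3
cosine_estimate = math.cos(theta)     # approximately 0.5
```

Longer bitmaps give a finer-grained estimate, since the angle can only take values that are multiples of π/n.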
![Page 14: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/14.jpg)
Supported Similarity Measures
• Cosine Similarity: cos(θ(u_i, u_j))
• Correlation Coefficient: corr(u_i, u_j) = cov(u_i, u_j) / (σ_ui · σ_uj) = E[(u_i − E[u_i])(u_j − E[u_j])] / (σ_ui · σ_uj) [details in paper]
• Jaccard Coefficient: Jaccard(A, B) = |A ∩ B| / |A ∪ B|, see [Gionis et al, SIGMOD ’01]
• Lp-Norms, see [Datar et al, DIMACS SDAM ’03]
![Page 15: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/15.jpg)
![Page 16: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/16.jpg)
Intra-cluster Processing
• Goal: find potential outliers within each cluster's realm
• Back to our running example, sensor vectors are considered similar when θ(u_i, u_j) < Φθ
• Translate the user-defined similarity threshold Φθ into a Hamming distance threshold: Φh = Φθ · d/π
• For any received pair of bitmaps X_i, X_j, clusterheads can estimate the initial similarity from their Hamming distance Dh(X_i, X_j) by testing:
 Dh(X_i, X_j) < Φh
• At the end of the process, <S_i, X_i, support> lists are extracted for motes that do not satisfy the minSup parameter
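The clusterhead's processing described above can be sketched as follows. The function name `intra_cluster` and its data layout are hypothetical, and the threshold is taken in radians here so that Φh = Φθ · d/π applies literally (the experiments on later slides quote Φθ in degrees).

```python
import math
from itertools import combinations

def intra_cluster(bitmaps, phi_theta, min_sup):
    """bitmaps: dict mapping a mote id to its d-bit encoding; phi_theta in radians.

    Returns (mote, bitmap, support) triples for motes whose support is below min_sup.
    """
    d = len(next(iter(bitmaps.values())))
    phi_h = phi_theta * d / math.pi          # angle threshold as a Hamming distance threshold
    support = {mote: 0 for mote in bitmaps}
    for a, b in combinations(bitmaps, 2):
        dh = sum(x != y for x, y in zip(bitmaps[a], bitmaps[b]))
        if dh < phi_h:
            support[a] += 1
            support[b] += 1
    return [(m, bitmaps[m], support[m]) for m in bitmaps if support[m] < min_sup]
```

With d=8 and Φθ=π/4 the Hamming threshold is 2, so only pairs of bitmaps that differ in at most one bit count towards each other's support.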
![Page 17: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/17.jpg)
Intra-cluster Processing
[Figure: probability of correctly classifying similar motes as such (W=16, θ=5, Φθ=10)]
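The plotted probability is not reproduced in this transcript, but one way to compute such a quantity is sketched below. It assumes the standard property of random hyperplane projection that each bit of two bitmaps differs independently with probability θ/π, so the Hamming distance is binomially distributed; the function name `p_similar` and the use of degrees are choices made for this sketch, not taken from the slides.

```python
import math

def p_similar(d, theta_deg, phi_theta_deg):
    """Probability that two vectors at angle theta pass the test Dh < phi_h with d-bit bitmaps.

    Assumes each bit differs independently with probability theta/180 (angles in degrees).
    Requires Python 3.8+ for math.comb.
    """
    p = theta_deg / 180.0
    phi_h = phi_theta_deg * d / 180.0
    return sum(
        math.comb(d, k) * p**k * (1 - p) ** (d - k)
        for k in range(d + 1)
        if k < phi_h
    )
```

Under this model, a pair with θ=5 tested against Φθ=10 is classified correctly more often as d grows, which is the accuracy versus bandwidth trade-off that the tunable bitmap size controls.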
![Page 18: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/18.jpg)
![Page 19: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/19.jpg)
Boosting TACO Encodings
• Partition each d-bit bitmap into μ groups of n bits (d = n·μ), run one similarity test per group, and obtain the answer provided by the majority of the μ tests
[Figure: bitmaps X_i and X_j split into groups; the per-group test outcomes 1 1 0 1 0 1 give SimBoosting(X_i, X_j)=1]
• Check the quality of the boosting estimation (θ(u_i, u_j) ≤ Φθ):
 - Unpartitioned bitmaps: Pwrong(d) = 1 − Psimilar(d)
 - Boosting: […], Pwrong(d, μ) ≤ […]
• Decide an appropriate μ:
 - Restriction on μ: Psimilar(d/μ) > 0.5
 - Comparison of Pwrong(d, μ) and Pwrong(d)
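The majority vote can be sketched as below. The function name `sim_boosting` is hypothetical, and two details are assumptions not stated on the slide: each group of n bits is tested against the scaled threshold Φθ · n/π (in radians), and a strict majority of the μ tests is required.

```python
import math

def sim_boosting(x, y, mu, phi_theta):
    """Split two d-bit bitmaps into mu groups, test each group, return the majority answer (1 or 0)."""
    d = len(x)
    if d % mu != 0:
        raise ValueError("bitmap length must be a multiple of mu")
    n = d // mu
    phi_h = phi_theta * n / math.pi          # per-group Hamming threshold (assumption)
    votes = 0
    for g in range(mu):
        xs = x[g * n:(g + 1) * n]
        ys = y[g * n:(g + 1) * n]
        dh = sum(a != b for a, b in zip(xs, ys))
        if dh < phi_h:
            votes += 1
    return 1 if votes > mu / 2 else 0
```

In the slide's figure, four of the six group tests succeed, so the boosted answer is 1 even though two groups disagree.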
![Page 20: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/20.jpg)
Comparison Pruning
• Modified cluster election process returns B bucket nodes
• Introducing a 2nd level of hashing based on the Hamming weight of the bitmaps
• Comparison pruning is achieved by hashing highly dissimilar bitmaps to different buckets
[Figure: hash key space 0…d split among four bucket nodes: 0–d/4, d/4–d/2, d/2–3d/4, 3d/4–d; legend: clusterhead/bucket node, regular sensor]
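A sketch of the second-level hashing follows. It relies on the fact that two bitmaps with Hamming distance below Φh have Hamming weights that differ by less than Φh, so sending a bitmap to every bucket overlapping the weight range [Wh − Φh, Wh + Φh] cannot separate a similar pair. That range, the equal-width split, the assignment of boundary weights to the upper bucket, and the name `buckets_for` are assumptions made for this illustration; the transcript does not preserve the exact rule used by TACO.

```python
def hamming_weight(x):
    """Number of set bits in a bitmap."""
    return sum(x)

def buckets_for(x, num_buckets, phi_h):
    """Indices of the buckets a bitmap is sent to, with the key space [0, d] split equally."""
    d = len(x)
    width = d / num_buckets
    w = hamming_weight(x)
    lo = max(0, w - phi_h)
    hi = min(d, w + phi_h)
    first = min(int(lo // width), num_buckets - 1)
    last = min(int(hi // width), num_buckets - 1)
    return list(range(first, last + 1))
```

A bitmap whose weight sits well inside a bucket's range is sent to that bucket only, while one near a boundary is also sent to the neighbouring bucket.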
![Page 21: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/21.jpg)
Load Balancing Among Buckets
Histogram Calculation Phase:
• Buckets construct an equi-width histogram based on the Hamming weight frequency of the received X_i's
Histogram Communication Phase:
• Each bucket communicates to the clusterhead
 - Estimated frequency counts
 - Width parameter c_i
Hash Key Space Reassignment:
• Clusterhead determines a new space partitioning and broadcasts the corresponding information
[Figure: initial key space 0–d/4, d/4–d/2, d/2–3d/4, 3d/4–d for bucket nodes SB1, SB2 (= clusterhead SC), SB3, SB4. Histograms: SB1 sends [f=(0,0,1), c1=d/12]; SB2 holds f=(3,3,2,2), c2=d/16; SB3 sends [f=(3,3,4,6), c3=d/16]; SB4 sends [f=(1,0,0), c4=d/12]. New assignment: SB1 [0, 3d/8], SB2 (3d/8, 9d/16], SB3 (9d/16, 11d/16], SB4 (11d/16, d]]
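The slide does not spell out how the clusterhead chooses the new partitioning. One rule that reproduces the example above is an equi-depth split at histogram bin edges: walk the bins in key order and cut whenever the running count reaches the next multiple of total/B. The sketch below assumes that rule; the function name `reassign_key_space` and the tuple layout are hypothetical, and d=48 is picked only so that d/12 and d/16 are integers.

```python
def reassign_key_space(histograms, num_buckets):
    """histograms: list of (start, bin_width, counts) per bucket, in key-space order.

    Returns the num_buckets - 1 interior boundaries of an equi-depth repartitioning.
    """
    bins = []
    for start, width, counts in histograms:
        for i, count in enumerate(counts):
            bins.append((start + (i + 1) * width, count))   # (right edge, frequency)
    total = sum(count for _, count in bins)
    boundaries = []
    cumulative = 0
    for right_edge, count in bins:
        cumulative += count
        k = len(boundaries) + 1
        # Cut once the running count reaches k/num_buckets of the total load.
        if k < num_buckets and cumulative * num_buckets >= k * total:
            boundaries.append(right_edge)
    return boundaries

d = 48
slide_histograms = [
    (0, d // 12, [0, 0, 1]),               # SB1
    (d // 4, d // 16, [3, 3, 2, 2]),       # SB2 (the clusterhead)
    (d // 2, d // 16, [3, 3, 4, 6]),       # SB3
    (3 * d // 4, d // 12, [1, 0, 0]),      # SB4
]
new_boundaries = reassign_key_space(slide_histograms, 4)
```

For d=48 the result is the boundaries 18, 27 and 33, that is 3d/8, 9d/16 and 11d/16 as in the slide, and each of the four new ranges then holds 7 of the 28 bitmaps.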
![Page 22: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/22.jpg)
Outline
• Introduction
 – Why is Outlier Detection Important and Difficult
• Our Contributions
 – Outlier detection with limited bandwidth
 – Compute measurement similarity over compressed representations of measurements (LSH)
• The TACO Framework
 – Compression of measurements at the sensor level
 – Outlier detection within and amongst clusters
• Optimizations: Load Balancing & Comparison Pruning
• Experimental Evaluation
• Related Work
• Conclusions
![Page 23: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/23.jpg)
Sensitivity Analysis
• Intel Lab Data - Temperature
[Figure: Avg. Precision and Avg. Recall vs. Similarity Angle (10–30) for 1/2, 1/4, 1/8 and 1/16 Reduction; TumbleSize=16, support=4]
![Page 24: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/24.jpg)
Sensitivity Analysis
• Boosting Intel Lab Data - Humidity
[Figure: Avg. Precision and Avg. Recall vs. TumbleSize (16–32) for 1, 4 and 8 Boosting Groups; Reduction=1/8, support=4, Φθ=30]
![Page 25: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/25.jpg)
Performance Evaluation in TOSSIM
• For 1/8 reduction, TACO reduces bandwidth consumption by a factor of 1/12 on average and by a factor of up to 1/15
[Figure: Total Bits Transmitted Per Tumble (Min, Average, Max) for TACO 1/16 Reduction, TACO 1/8 Reduction, TACO 1/4 Reduction, NonTACO and SelectStar]
![Page 26: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/26.jpg)
Performance Evaluation in TOSSIM
• Network Lifetime: the epoch at which the first mote in the network dies
• Average lifetime for motes initialized with 5000 mJ residual energy
• Reduction in power consumption reaches a ratio of 1/2.7
[Figure: network lifetime in epochs for TACO 1/4 Reduction, NonTACO and SelectStar]
![Page 27: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/27.jpg)
TACO vs Hierarchical Outlier Detection Techniques
• Robust [Deligiannakis et al, ICDE ’09] falls short by up to 10% in terms of the F-Measure metric
• TACO ensures less bandwidth consumption, with a ratio varying from 1/2.6 to 1/7.8
[Figure: F-Measure vs. Support (1–4) for Robust, TACO 1/4 Reduction and TACO 1/8 Reduction; TumbleSize=16, Corr_Threshold=cos(30°)≈0.87]
[Figure: Avg. Bits Transmitted Per Tumble vs. Support (1–4) for 1/4, 1/8 and 1/16 Reduction; series: TACO - Intercluster, TACO - Remaining and one unlabeled series ("Series4"); TumbleSize=16, Corr_Threshold=cos(30°)≈0.87]
![Page 28: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/28.jpg)
![Page 29: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/29.jpg)
Related Work - Ours
• Outlier reports on par with aggregate query answer [Kotidis et al, MobiDE ’07]
 – hierarchical organization of motes
 – takes into account temporal & spatial correlations as well
 – reports aggregate, witnesses & outliers
• Outlier-aware routing [Deligiannakis et al, ICDE ’09]
 – route outliers towards motes that can potentially witness them
 – validate detection scheme for different similarity metrics (correlation coefficient, Jaccard index also supported in TACO)
• Snapshot Queries [Kotidis, ICDE ’05]
 – motes maintain local regression models for their neighbors
 – models can be used for outlier detection
• Random Hyperplane Projection using Derived Dimensions [Georgoulas et al, MobiDE ’10]
 – extends LSH scheme for skewed datasets
 – up to 70% improvements in accuracy
![Page 30: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/30.jpg)
Related Work
• Kernel based approach [Subramaniam et al, VLDB ’06]
• Centralized Approaches [Jeffrey et al, Pervasive ’06]
• Localized Voting Protocols [(Chen et al, DIWANS ’06), (Xiao et al, MobiDE ’07)]
• Report of top-K values with the highest deviation [Branch et al, ICDCS ’06]
• Weighted Moving Average techniques [Zhuang et al, ICDCS ’07]
![Page 31: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/31.jpg)
Conclusions
• Our Contributions
 – outlier detection with limited bandwidth
• The TACO/ΠΑΟ Framework
 – LSH compression of measurements at the sensor level
 – outlier detection within and amongst clusters
 – optimizations: Boosting Accuracy & Load Balancing
• Experimental Evaluation
 – accuracy exceeding 80% in most of the experiments
 – reduced bandwidth consumption up to a factor of 1/12 for 1/8 reduced bitmaps
 – prolonged network lifetime up to a factor of 3 for 1/4 reduction ratio
![Page 32: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/32.jpg)
TACO:
Tunable Approximate Computation of Outliers in
Wireless Sensor Networks
Thank you!
Nikos Giatrakos, Yannis Kotidis, Antonios Deligiannakis, Vasilis Vassalos, Yannis Theodoridis
![Page 33: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/33.jpg)
Backup Slides
![Page 34: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/34.jpg)
TACO Framework
Step 1: Data Encoding and Reduction
• Motes obtain samples and keep the latest W measurements in a tumble
• Encode W in a bitmap of d<<W size
Step 2: Intra-cluster Processing
• Encodings are transmitted to clusterheads
• Clusterheads perform similarity tests based on a given similarity measure and a similarity threshold Φ
• … and calculate support values:
 if sim(u_i, u_j) > Φ { support(S_i)++; support(S_j)++; }
Step 3: Inter-cluster Processing
• An approximate TSP problem is solved
• Lists of potential outliers are exchanged
[Figure: clusterheads and regular sensors; a mote's W measurements (8.2, 4.3, …) are encoded into a d-bit bitmap (0, 1, …)]
![Page 35: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/35.jpg)
Leveraging Additional Motes for Outlier Detection
• Comparison pruning is ensured by the fact that highly dissimilar bitmaps are hashed to different buckets and are thus never tested for similarity
Introducing a 2nd level of hashing:
• Besides cluster election, the process continues in each cluster so as to select B bucket nodes with […]
• For […], 0 ≤ Wh(X_i) ≤ d, equally distribute the hash key space amongst them
• Hash each bitmap to the […] bucket
• For bitmaps with Wh(X_i) at the edge of a bucket, transmit X_i to the range […], which is guaranteed to contain at most 2 buckets since […]
[Figure: hash key space 0…d split among four bucket nodes: 0–d/4, d/4–d/2, d/2–3d/4, 3d/4–d; legend: clusterhead/bucket node, regular sensor]
![Page 36: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/36.jpg)
Leveraging Additional Motes for Outlier Detection
Intra-cluster Processing:
• Buckets perform bitmap comparisons as in common intra-cluster processing
• Constraints:
 - If […], the similarity test is performed only in that bucket
 - For encodings that were hashed to the same 2 buckets, similarity is tested only in the bucket with the lowest SB_i
• PotOut formation:
 - S_i ∉ PotOut if it is not reported by all buckets it was hashed to
 - Received support values are added and S_i ∈ PotOut iff Support(S_i) < minSup
[Figure: hash key space 0…d split among four bucket nodes: 0–d/4, d/4–d/2, d/2–3d/4, 3d/4–d; legend: clusterhead/bucket node, regular sensor]
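The PotOut formation rules can be sketched as a merge of per-bucket reports at the clusterhead. The function name `form_pot_out` and the data layout are hypothetical, and the sketch reads the first rule as "a mote that some assigned bucket did not report is not a potential outlier", which is an interpretation of the slide.

```python
def form_pot_out(assigned, reports, min_sup):
    """assigned: mote id -> set of bucket ids the mote's bitmap was hashed to.
    reports: bucket id -> {mote id: support} for the motes that bucket reported.

    Returns the sorted list of potential outliers.
    """
    pot_out = []
    for mote, buckets in assigned.items():
        # Rule 1: the mote must be reported by every bucket it was hashed to.
        if not all(mote in reports.get(b, {}) for b in buckets):
            continue
        # Rule 2: add the received support values and compare with minSup.
        total_support = sum(reports[b][mote] for b in buckets)
        if total_support < min_sup:
            pot_out.append(mote)
    return sorted(pot_out)
```

Adding the supports without double counting relies on the constraint above that a pair hashed to the same two buckets is tested in only one of them.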
![Page 37: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/37.jpg)
Experimental Setup
• Datasets:
 – Intel Lab Data:
  • Temperature and Humidity measurements
  • Network consisting of 48 motes organized into 4 clusters
  • Measurements for a period of 633 and 487 epochs respectively
  • minSup=4
 – Weather Dataset:
  • Temperature, Humidity and Solar Irradiance measurements
  • Network consisting of 100 motes organized into 10 clusters
  • Measurements for a period of 2000 epochs
  • minSup=6
![Page 38: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/38.jpg)
Experimental Setup
• Outlier Injection
 – Intel Lab Data & Weather Temperature, Humidity data:
  • 0.4% probability that a mote obtains a spurious measurement at some epoch
  • 6% probability that a mote fails dirty at some epoch
   – Every mote that fails dirty increases its measurements by 1 degree until it reaches a MAX_VAL parameter, imposing a 15% noise at the values
   – Intel Lab Data MAX_VAL=100
   – Weather Data MAX_VAL=200
 – Weather Solar Irradiance data:
  • Random injection of values obtained at various time periods to the sequence of epoch readings
• Simulators
 – TOSSIM network simulator
 – Custom, lightweight Java simulator
![Page 39: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/39.jpg)
Sensitivity Analysis
• Intel Lab Data - Humidity
[Figure: Avg. Precision and Avg. Recall vs. Similarity Angle (10–30) for 1/2, 1/4, 1/8 and 1/16 Reduction; TumbleSize=16, support=4]
• Weather Data - Humidity
[Figure: Avg. Precision and Avg. Recall vs. Similarity Angle (10–30) for 1/2, 1/4, 1/8 and 1/16 Reduction; TumbleSize=20, support=6]
![Page 40: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/40.jpg)
Sensitivity Analysis
• Weather Data - Solar Irradiance
[Figure: Avg. Precision and Avg. Recall vs. Similarity Angle (10–30) for 1/2, 1/4, 1/8 and 1/16 Reduction; TumbleSize=32, support=6]
• Boosting Intel Lab Data - Humidity
[Figure: Avg. Precision and Avg. Recall vs. TumbleSize (16–32) for 1, 4 and 8 Boosting Groups; Reduction=1/8, support=4, Φθ=30]
![Page 41: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/41.jpg)
Performance Evaluation in TOSSIM
• Transmitted bits categorization per approach
[Figure: Avg. Bits Transmitted Per Tumble by category (ToClusterhead, Retransmissions, Intercluster, ToBS) for TACO 1/16 Reduction, TACO 1/8 Reduction, TACO 1/4 Reduction, NonTACO and SelectStar]
![Page 42: TACO: T unable A pproximate C omputation of O utliers in Wireless Sensor Networks](https://reader031.vdocument.in/reader031/viewer/2022012919/56816689550346895dda422e/html5/thumbnails/42.jpg)
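Bucket Node Introduction

| Cluster Size | #Buckets | #Cmps (Φθ=10) | #Multihash Messages (Φθ=10) | #Bitmaps Per Bucket (Φθ=10) | #Cmps (Φθ=20) | #Multihash Messages (Φθ=20) | #Bitmaps Per Bucket (Φθ=20) |
|---|---|---|---|---|---|---|---|
| 12 | 1 | 66 | 0 | 12 | 66 | 0 | 12 |
| 12 | 2 | 38.08 | 0.90 | 6.45 | 40.92 | 1.36 | 6.68 |
| 12 | 4 | 24.55 | 7.71 | 3.65 | 30.95 | 8.88 | 4.08 |
| 24 | 1 | 276 | 0 | 24 | 276 | 0 | 24 |
| 24 | 2 | 158.06 | 1.62 | 12.81 | 171.80 | 2.76 | 13.38 |
| 24 | 4 | 101.10 | 14.97 | 7.27 | 128.63 | 17.61 | 8.15 |
| 36 | 1 | 630 | 0 | 36 | 630 | 0 | 36 |
| 36 | 2 | 363.64 | 2.66 | 19.33 | 394.97 | 4.30 | 20.15 |
| 36 | 4 | 230.73 | 22.88 | 10.88 | 291.14 | 26.28 | 12.19 |
| 48 | 1 | 1128 | 0 | 48 | 1128 | 0 | 48 |
| 48 | 2 | 640.10 | 3.14 | 25.57 | 710.95 | 5.85 | 26.93 |
| 48 | 4 | 412.76 | 30.17 | 14.49 | 518.57 | 34.64 | 16.21 |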