Data Mining and Fusion Techniques for WSNs as a Source of the Big Data

Upload: mohamed-mostafa

Post on 06-Aug-2015

This project is funded by Structural Funds of the European Union and state budget of the Czech Republic

Data Mining and Fusion Techniques for WSNs as a Source of The Big Data

Mohamed Mostafa Fouad, PhD.

Arab Academy for Science, Technology, and Maritime Transport, Cairo - Egypt

IT4Innovations, VSB-Technical University of Ostrava, Ostrava - Czech Republic.

Member at SRGE Research Group (www.egyptscience.net).

Agenda

• WSN: An Overview
  – WSN Design Challenges
• Big Data Challenges
  – Big Data Classification
  – Big Data Analysis Challenges
• Sensory Data Processing Techniques
  – Data Mining
  – Data Fusion
• Conclusions

WSN: An Overview

WSN Design Challenges

• Resource constraints
• Depletable energy source
• Security and privacy
• Distribution strategy
• Fault tolerance
• Heterogeneity

Big Data Challenges

The Basic Big Data Challenge

• Continuous, high-speed data streaming.

[Diagram label: Big Data Storage]

Big Data Classification

Data classification:

• Structured
• Semi-structured
• Unstructured

Big Data Analysis Challenges

• Input data sources: e.g. cloud computing, WSN, IoT, social networks, search engines, biomedical, mobile, NFC, etc.
• Produced data suffers from:
  – Inconsistent data
  – Redundant data
  – Incomplete data
• Processing stages: Pre-processing → Data analysis → Value generating → Output values
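
The three data-quality problems listed above can be illustrated with a small pre-processing pass. This is a minimal sketch, not a method from the talk: the record layout (`sensor_id`, `timestamp`, `temperature`) and the valid range are assumptions made purely for illustration.

```python
from typing import Optional

# Hypothetical valid range for a temperature sensor, in degrees Celsius.
VALID_RANGE = (-40.0, 85.0)


def preprocess(readings: list[dict]) -> list[dict]:
    """Drop incomplete, inconsistent, and redundant sensor readings."""
    seen: set[tuple] = set()
    cleaned = []
    for r in readings:
        value: Optional[float] = r.get("temperature")
        # Incompleteness: the measurement itself is missing.
        if value is None:
            continue
        # Inconsistency: the value lies outside the assumed valid range.
        if not VALID_RANGE[0] <= value <= VALID_RANGE[1]:
            continue
        # Redundancy: this sensor already reported for this timestamp.
        key = (r["sensor_id"], r["timestamp"])
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(r)
    return cleaned


readings = [
    {"sensor_id": "s1", "timestamp": 0, "temperature": 21.5},
    {"sensor_id": "s1", "timestamp": 0, "temperature": 21.5},   # redundant
    {"sensor_id": "s2", "timestamp": 0, "temperature": None},   # incomplete
    {"sensor_id": "s3", "timestamp": 0, "temperature": 999.0},  # inconsistent
    {"sensor_id": "s2", "timestamp": 1, "temperature": 22.0},
]
cleaned = preprocess(readings)
print(cleaned)
```

Only the first and last records survive the pass. A real deployment would likely repair or flag suspicious records rather than silently dropping them.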

Sensory Data Processing Tech.

The Sensor Data Processing Techniques

• Standard data processing models (such as RDBMS) may not be applicable to WSNs.

• From our perspective, new processing techniques should not only be designed specifically for WSNs but should also benefit high-volume, high-velocity big data. Such techniques include:

  – Data mining, and
  – Data fusion.

Data Mining Over WSNs

• The need to extract knowledge from the sensor data collected from WSNs has become an important issue in real-time decision systems.

• However, traditional data mining techniques are not applicable due to the characteristics of WSNs.

• Data mining algorithms for WSNs can generally be classified into:

  – Centralized data mining.
  – Distributed data mining.

Centralized Data Mining Approaches

• In the centralized approaches, all sensors send their data to a centralized computing resource, usually the sink node, to be processed.

• Examples of centralized data mining approaches:

  – Estimating the sensors’ missing data
  – Mining for WSN-Web based applications
  – Mining for multiple data streams

• The centralized approach usually requires high computational power and unbounded energy sources.

Examples of Centralized Data Mining Approaches

• Mining used to estimate the sensors’ missing data, such as:

  – The Data Stream Association Rule Mining (DSARM) framework.
  – The Adaptive Multiple Regression (AMR) framework (the enhanced version of DSARM).
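
The general idea of filling in a missing reading from a related sensor can be sketched with an ordinary least-squares fit. This only illustrates regression-based estimation in general; it is not the DSARM or AMR framework, and the function names and sample histories are hypothetical.

```python
def fit_line(xs: list[float], ys: list[float]) -> tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def estimate_missing(
    neighbor_history: list[float],
    target_history: list[float],
    neighbor_now: float,
) -> float:
    """Estimate the target sensor's missing value from a neighbor's reading."""
    intercept, slope = fit_line(neighbor_history, target_history)
    return intercept + slope * neighbor_now


# Hypothetical past readings where both sensors reported successfully.
neighbor_history = [20.0, 21.0, 22.0, 23.0]
target_history = [21.0, 22.0, 23.0, 24.0]

# The target sensor is silent now; the neighbor reports 24.0.
estimate = estimate_missing(neighbor_history, target_history, 24.0)
print(estimate)
```

In this sample the target sensor has always read one degree above its neighbor, so the fit carries that offset over to the missing reading.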

• Mining used for sensor-based applications that rely heavily on the World Wide Web (WWW), such as:

  – The Sensor Web, which is considered another layer added to the WWW.
  – XML provides a suitable way to connect the sensors directly to web applications, but the tree structure of the XML document is a problem.
  – Paik et al. have proposed a reformulation of association rules for streamed XML data.
  – The main idea of the solution is to use association rules with the Label Projection Approach to generate frequent XML tree items without any redundancy.

• Mining used for dealing with multiple data streams, such as:

  – The MG-join algorithm, which uses discrete Fourier transforms (DFTs) to reduce the dimensionality of streamed data to a small number of coefficients.
  – It uses an incremental methodology to update the streamed data.
  – The main issue is that increasing the number of coefficients degrades the performance of the algorithm.
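
The DFT-based reduction mentioned above can be sketched as follows. This is not the MG-join algorithm itself, only an illustration of summarizing a window of samples by its first few DFT coefficients; the window contents and the choice of `k` are assumptions, and the naive transform stands in for an incremental one.

```python
import cmath
import math


def dft(window: list[float]) -> list[complex]:
    """Naive O(n^2) discrete Fourier transform of a real-valued window."""
    n = len(window)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(window))
        for k in range(n)
    ]


def compress(window: list[float], k: int) -> list[complex]:
    """Summarize the window by its first k DFT coefficients (k <= n // 2)."""
    return dft(window)[:k]


def reconstruct(coeffs: list[complex], n: int) -> list[float]:
    """Approximate the original n samples from the kept coefficients."""
    samples = []
    for t in range(n):
        total = coeffs[0].real
        for f in range(1, len(coeffs)):
            # For real input, coefficient n - f mirrors coefficient f,
            # so each kept coefficient contributes twice its real part.
            total += 2 * (coeffs[f] * cmath.exp(2j * cmath.pi * f * t / n)).real
        samples.append(total / n)
    return samples


# Hypothetical 8-sample window: a constant level plus one slow oscillation.
window = [10 + 2 * math.cos(2 * math.pi * t / 8) for t in range(8)]
summary = compress(window, k=2)
approximation = reconstruct(summary, len(window))
print(len(window), "samples summarized by", len(summary), "coefficients")
```

This particular window is recovered almost exactly from two coefficients because it contains a single low frequency. Noisier streams need more coefficients, which is where the performance concern noted above comes in.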

Distributed Data Mining Approaches

• Each node uses its limited computing resources to perform the mining process. The main advantage of this approach is that it reduces the raw data streams delivered to the sink node.

• However, it may deplete the network resources in terms of memory footprint and energy consumption.

• Examples of distributed data mining approaches:

  – The on-disk data structure (DSTable)
  – Trained classifiers for capturing semantic features
  – Deep neural network (DNN), a machine learning approach

Example of Distributed Data Mining

• The Stream Mining Application (SMA) for distributed mining in WSN.
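
The benefit of node-side processing, sending summaries instead of raw streams, can be sketched as below. This is not the SMA system; summary statistics are used here as the simplest possible stand-in for a mining step, and the `Summary` type and node names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Summary:
    count: int
    total: float
    minimum: float
    maximum: float

    @property
    def mean(self) -> float:
        return self.total / self.count


def summarize(samples: list[float]) -> Summary:
    """Runs on a sensor node: reduce a raw window to four numbers."""
    return Summary(len(samples), sum(samples), min(samples), max(samples))


def merge(summaries: list[Summary]) -> Summary:
    """Runs on the sink: combine node summaries without the raw data."""
    return Summary(
        count=sum(s.count for s in summaries),
        total=sum(s.total for s in summaries),
        minimum=min(s.minimum for s in summaries),
        maximum=max(s.maximum for s in summaries),
    )


# Hypothetical raw windows held locally by two nodes.
node_windows = {
    "node-1": [20.0, 21.0, 22.0],
    "node-2": [30.0, 31.0],
}
network_view = merge([summarize(w) for w in node_windows.values()])
print(network_view.count, network_view.mean, network_view.minimum, network_view.maximum)
```

Each node transmits a fixed-size summary regardless of how many samples it collected. The saving in radio traffic is paid for with local computation and memory.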

Data Fusion Over WSNs

• Data fusion is an important concept in both big data and WSNs. In the big data context, fusion is performed at the computational platform, while in the WSN context, fusion is performed inside the network (i.e. as an in-network process).

Fusion based on Input Sources Relation

Fusion based on Level of Abstraction

• Low-level fusion: combining a number of raw input data into new and accurate raw data.
• Medium-level fusion: provides an abstraction map of all features and attributes of the entry data.
• High-level fusion: combining symbols/decisions from different sensor sources to establish a single accurate symbol/decision.
• Multilevel fusion: combining more than one fusion approach.
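
Low-level fusion can be illustrated with an inverse-variance weighted average, a standard way to combine redundant raw measurements of the same quantity. The talk does not prescribe a formula, so this choice, the function name, and the sample values are assumptions.

```python
def fuse_raw(measurements: list[tuple[float, float]]) -> tuple[float, float]:
    """Fuse (value, variance) pairs into one value and its variance.

    Sensors with a smaller variance (more trusted) get a larger weight.
    """
    weights = [1.0 / variance for _, variance in measurements]
    weight_sum = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, measurements)) / weight_sum
    fused_variance = 1.0 / weight_sum
    return fused_value, fused_variance


# Hypothetical readings of one temperature from three sensors.
fused_value, fused_variance = fuse_raw([(20.0, 1.0), (22.0, 1.0), (21.0, 0.5)])
print(fused_value, fused_variance)
```

The fused variance is smaller than that of any single sensor, which is the sense in which the combined raw value is more accurate than its inputs.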

Example of the Fusion based on Level of Abstraction

• A multilevel fusion system that combines medium-level and high-level fusion in an automated obstacle detection application.
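
A toy version of chaining a medium-level step and a high-level step might look like this. It is not the cited obstacle detection system. The medium-level step is reduced here to per-sensor feature extraction, the high-level step is a majority vote, and the threshold and sample distances are hypothetical.

```python
from collections import Counter
from statistics import mean

# Hypothetical threshold: closer than this (in meters) counts as an obstacle.
OBSTACLE_DISTANCE_M = 1.0


def extract_feature(raw_distances: list[float]) -> float:
    """Medium level: reduce a raw window to one feature (mean distance)."""
    return mean(raw_distances)


def local_decision(feature: float) -> str:
    """Turn a sensor's feature into a symbolic decision."""
    return "obstacle" if feature < OBSTACLE_DISTANCE_M else "clear"


def fuse_decisions(decisions: list[str]) -> str:
    """High level: majority vote over the per-sensor decisions."""
    return Counter(decisions).most_common(1)[0][0]


# Hypothetical distance windows from three range sensors.
windows = [
    [0.8, 0.7, 0.9],
    [0.6, 0.9, 0.9],
    [2.5, 2.4, 2.6],
]
decisions = [local_decision(extract_feature(w)) for w in windows]
fused = fuse_decisions(decisions)
print(decisions, fused)
```

Using an odd number of sensors avoids ties in the vote. With an even number, a tie-breaking rule would have to be chosen explicitly.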

Conclusions

• The talk has focused on the need to apply pre-processing techniques, such as WSN data mining and data fusion, to the data collected from WSNs (in-network pre-processing operations on sensor data), rather than transmitting large amounts of continuous streaming data to big data storage.

• The advantages and limitations of centralized and distributed data mining techniques for WSNs have been analyzed.

• Moreover, the data fusion techniques, which ensure the accuracy and trustworthiness of the collected data, and their sub-classes (i.e. abstraction-based and input-source-relation-based) have been discussed and analyzed in terms of the energy consumption and the limited resources of WSNs.

• It is therefore concluded that, since WSNs are a main source of big data, it is vital for sensor data to be processed in-network, as this would prolong the WSN lifetime and reduce the volume of big data, thus accelerating the discovery of value from it.

Thank you

Mohamed Mostafa [email protected]

Ostrava, Faculty of Electrical Engineering and Computer Science (20th April 2015)
