Page 1: Mr. Apichon Witayangkurn apichon@iis.u-tokyo.ac.jp ......Enterprise Sensor Network Key Features Large-scale support with cloud Massive data and real-time data processing Flexible data

Mr. Apichon Witayangkurn

apichon@iis.u-tokyo.ac.jp

Department of Civil Engineering

The University of Tokyo

Hive/Hadoop

Messaging Service

Sensor Network

Contents

1. Introduction
2. What & Why Sensor Network
3. Enterprise Sensor Network
4. Conclusion and Future Work

Univ. Tokyo, Apichon W.

Introduction

Background

Sensor technology is now widely known and available at low cost, e.g. weather, CO2, and radiation sensors.

It is widely used in many fields of research and application, such as environment monitoring, pollution monitoring, disaster monitoring, agricultural field monitoring, and traffic monitoring.

Most applications are developed around one specific sensor and purpose, so they are difficult to reuse for other purposes or with different sensors.

Sharing sensor information among systems is difficult due to a lack of standardization.

Sensor Web Enablement (SWE) from OGC exists, but it does not focus on the concrete details of application development.

What & Why Sensor Network

Sensor Network

A group of heterogeneous sensor systems connected through a communication infrastructure to exchange information between sensor stations or sensor nodes. All sensor nodes can link or synchronize data with each other or with a main station, so the whole acts as a network. It is driven by progress in three technologies: sensors, field platforms, and the Internet.

[Figure: Sensor Network, with labels Sensors, Platform, and Internet]

What & Why Sensor Network

What is needed for Sensor Network

What & Why Sensor Network

Sensor Service GRID (SSG) – Sensor Middleware

What & Why Sensor Network

Issues in Sensor Network

How to handle a large sensor network (nodes and data)
How to scale to a larger size with minimal effort
Insufficient processor, I/O, and storage resources for large-scale operation
Heterogeneous and vendor-specific sensors are difficult to connect to a sensor network.
It must be able to operate over any network, even an unstable one.
Real-time and near-real-time operation
It must provide a channel or interface for third-party applications to connect to and use data in the sensor service.
Standardized interfaces for compatibility with other software
Rapid installation and ease of use
Visualization with GIS enabled
Low cost??

Enterprise Sensor Network

Key Features
Large-scale support with cloud
Massive data and real-time data processing
Flexible data communication
Easy integration, installation, and ease of use
High-frequency and multi-dimensional support
Open standards and integration support
Spatial data support

The Goal: “Design and develop a prototype sensor network system that supports various sensors and any network topology, and that can easily scale from small to large with minimal effort and human operation.”

Enterprise Sensor Network

System Overview

Enterprise Sensor Network

Sensor Stations (SOSes)
An SOS is a sensor station installed and deployed at a field site.
It handles feeding data from sensors as well as sending data to the cloud service.
It can be a fixed station or a mobile station with mission support.
It is a combination of the SOS Service and a web server.
It supports both push and pull data feeding.
It is divided into three types based on features:

Rich-node: full functionality with web UI and 2-way control
Dump-node: data feeder only (storage, processing cost)
Virtual-node: shared resources, no hardware, more than one node
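The station types and the push/pull feeding modes above can be pictured with a small sketch. This is an illustration only, not the original SOS code: the class and method names (`SensorStation`, `push`, `pull`, `accepts_commands`) are hypothetical, and the rule that only rich nodes accept commands is an assumption read from the "2-way control" wording.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class NodeType(Enum):
    RICH = "rich-node"        # full functionality, web UI, 2-way control
    DUMP = "dump-node"        # data feeder only
    VIRTUAL = "virtual-node"  # shares resources, no dedicated hardware


@dataclass
class Reading:
    sensor_id: str
    value: float


class SensorStation:
    """Hypothetical station that accepts pushed readings or polls a sensor."""

    def __init__(self, name: str, node_type: NodeType):
        self.name = name
        self.node_type = node_type
        self.outbox: List[Reading] = []  # readings waiting to go to the cloud

    def push(self, reading: Reading) -> None:
        # Push feeding: the sensor (or its driver) calls the station.
        self.outbox.append(reading)

    def pull(self, sensor_id: str, read_fn: Callable[[], float]) -> None:
        # Pull feeding: the station asks the sensor for its current value.
        self.outbox.append(Reading(sensor_id, read_fn()))

    def accepts_commands(self) -> bool:
        # Assumption for this sketch: only rich nodes offer 2-way control.
        return self.node_type is NodeType.RICH
```

Both feeding modes end up in the same outbox, so the code that forwards data to the cloud does not need to know how a reading arrived.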

Enterprise Sensor Network

Sensor Station Design

Enterprise Sensor Network

Messaging Service as communication medium
Enables 2-way control between stations and cloud services
Supports multiple connectors
Supports various types of message storage
Load balancing and cluster support

(Source: ActiveMQ, Apache)
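As a rough picture of why a messaging layer helps with 2-way control and unstable links, here is a minimal in-memory sketch. It is not the ActiveMQ API: `MiniBroker` and the topic names are made up for illustration, and storing messages until a consumer appears is a simplification of real broker persistence.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, List


class MiniBroker:
    """In-memory stand-in for a message broker (not the ActiveMQ API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)
        self._pending: DefaultDict[str, Deque[str]] = defaultdict(deque)

    def subscribe(self, topic: str, handler: Callable[[str], None]) -> None:
        self._subscribers[topic].append(handler)
        # Deliver anything that was stored while nobody was listening.
        while self._pending[topic]:
            handler(self._pending[topic].popleft())

    def publish(self, topic: str, message: str) -> None:
        handlers = self._subscribers[topic]
        if not handlers:
            # No consumer yet: keep the message until one subscribes.
            self._pending[topic].append(message)
            return
        for handler in handlers:
            handler(message)


broker = MiniBroker()
cloud_inbox: List[str] = []
station_inbox: List[str] = []

broker.publish("data.station1", "co2=412")               # station sends while cloud is offline
broker.subscribe("data.station1", cloud_inbox.append)    # cloud connects, receives stored data
broker.subscribe("cmd.station1", station_inbox.append)   # station listens for commands
broker.publish("cmd.station1", "set_interval=60")        # cloud sends a command back
```

Data flows on one topic and commands on another, which is one simple way to get control in both directions over the same medium.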

Enterprise Sensor Network

Network of Brokers

Brokers can be linked together to form a network or cluster of brokers. A network of brokers can use various network topologies, such as hub-and-spoke, daisy chain, or mesh.
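To make the three topologies concrete, the sketch below builds each one as a plain adjacency map and counts how many broker-to-broker hops a message needs. The function names and broker names are hypothetical; real broker networks are configured in the broker itself, not built this way.

```python
from collections import deque
from typing import Dict, List, Optional, Set

Links = Dict[str, Set[str]]


def hub_and_spoke(hub: str, spokes: List[str]) -> Links:
    # Every spoke connects only to the central hub.
    links: Links = {hub: set(spokes)}
    for spoke in spokes:
        links[spoke] = {hub}
    return links


def daisy_chain(brokers: List[str]) -> Links:
    # Each broker connects to its neighbours in a line.
    links: Links = {b: set() for b in brokers}
    for left, right in zip(brokers, brokers[1:]):
        links[left].add(right)
        links[right].add(left)
    return links


def mesh(brokers: List[str]) -> Links:
    # Every broker connects directly to every other broker.
    return {b: set(brokers) - {b} for b in brokers}


def hops(links: Links, src: str, dst: str) -> Optional[int]:
    """Breadth-first search: fewest broker-to-broker hops from src to dst."""
    seen, frontier = {src}, deque([(src, 0)])
    while frontier:
        node, dist = frontier.popleft()
        if node == dst:
            return dist
        for nxt in links[node]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    return None  # dst is not reachable from src
```

With four brokers, the two ends of a daisy chain are three hops apart, two spokes of a hub are two hops apart, and any two mesh members are one hop apart, at the cost of more links to maintain.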

Enterprise Sensor Network

Sensor Cloud Service

It is sensor data middleware that provides users with a platform for receiving data from remote field sensor networks, including data interfaces and virtualization.

Typically characterized by these features:
High performance
Scalability
Reliability
Open architecture

[Figure: Sensor Cloud Service components. Labels: Spatial Database, Arbitrary Processing Services, Spatial Query, Cloud Service (Hadoop/Hive), Web Services, Web Interface, Sensor Virtualization, Synchronization Services, Proprietary API, Open Standard API, Command Services, 3rd App Connectors]

Key Technology

What is Hadoop
An open-source framework, free!
Distributed applications for large data
Parallel processing
Runs on commodity machines
Scalable
Widely used

In 2011, Facebook claimed to have the largest Hadoop cluster in the world, with 30 PB of storage and nearly 10,000 nodes.

Hive is a data warehousing package on Hadoop. It provides a SQL-like language called HiveQL via web GUI and JDBC.

Key Technology

Projects under the Hadoop umbrella
Common: A set of components and interfaces for distributed filesystems and general I/O (serialization, Java RPC, persistent data structures).
MapReduce: A distributed data processing model and execution environment that runs on large clusters of commodity machines.
HDFS: A distributed filesystem that runs on large clusters of commodity machines.
Hive: A distributed data warehouse. Hive manages data stored in HDFS and provides a query language based on SQL (which the runtime engine translates to MapReduce jobs) for querying the data.
Sqoop: A tool for efficiently moving data between relational databases and HDFS.
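The MapReduce model itself can be sketched in a few lines of single-process Python. This is not Hadoop code: it only mimics the map, shuffle, and reduce steps on an in-memory list, and the `(sensor_id, value)` record layout is an assumption chosen to fit the sensor setting. The job shown, an average per sensor, is the kind of grouping a SQL-style `GROUP BY` query would express.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Record = Tuple[str, float]  # (sensor_id, value); hypothetical record layout


def map_phase(records: Iterable[Record]) -> Iterator[Tuple[str, float]]:
    # Map: emit one (key, value) pair per input record.
    for sensor_id, value in records:
        yield sensor_id, value


def shuffle(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    # Shuffle: group all values that share a key.
    groups: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups: Dict[str, List[float]]) -> Dict[str, float]:
    # Reduce: collapse each group to a single result, here the mean.
    return {key: sum(values) / len(values) for key, values in groups.items()}


def average_per_sensor(records: Iterable[Record]) -> Dict[str, float]:
    return reduce_phase(shuffle(map_phase(records)))
```

In a real cluster the map and reduce steps run in parallel on many machines and the shuffle moves data across the network; the structure of the computation is the same.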

Key Technology

Hadoop main components

NameNode is the bookkeeper of HDFS; it keeps track of how your files are broken down into file blocks, which nodes store those blocks, and the overall health of the distributed filesystem.

DataNodes are the workhorses of the filesystem. They store and retrieve blocks when they are told to (by clients or the NameNode), and they report back to the NameNode periodically with lists of blocks that they are storing.

Secondary NameNode (SNN) is an assistant daemon that monitors the state of the cluster's HDFS. It takes snapshots of the NameNode's metadata to help minimize downtime and data loss.

JobTracker is the liaison between your application and Hadoop. Once you submit your code to the cluster, the JobTracker determines the execution plan by deciding which files to process, assigns nodes to different tasks, and monitors all tasks as they run.

TaskTrackers are responsible for executing the individual tasks that the JobTracker assigns, and they manage the execution of those tasks on each slave node.
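The NameNode's bookkeeping role can be illustrated with a toy model that splits a file into blocks and records which DataNodes hold each replica. Everything here is a sketch under stated assumptions: `MiniNameNode` is a made-up class, and the round-robin placement is chosen for simplicity, whereas real HDFS placement is rack-aware and considerably more involved.

```python
from typing import Dict, List, Tuple

Placement = Tuple[int, List[str]]  # (block index, DataNodes holding a replica)


class MiniNameNode:
    """Toy bookkeeping only; not how HDFS actually places blocks."""

    def __init__(self, datanodes: List[str], block_size: int, replication: int):
        if replication > len(datanodes):
            raise ValueError("replication cannot exceed the number of datanodes")
        self.datanodes = datanodes
        self.block_size = block_size
        self.replication = replication
        self.block_map: Dict[str, List[Placement]] = {}  # file name -> placements

    def add_file(self, name: str, size: int) -> None:
        n_blocks = max(1, -(-size // self.block_size))  # ceiling division
        placements: List[Placement] = []
        for i in range(n_blocks):
            # Round-robin placement: an assumption made for this sketch.
            holders = [self.datanodes[(i + r) % len(self.datanodes)]
                       for r in range(self.replication)]
            placements.append((i, holders))
        self.block_map[name] = placements

    def locate(self, name: str) -> List[Placement]:
        # A client asks the NameNode where the blocks are, then reads from DataNodes.
        return self.block_map[name]
```

The NameNode in this picture holds only metadata; the block contents would live on the DataNodes named in each placement.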

Key Technology

Hadoop main components (Source: Lam, 2011)

[Figure: Hadoop main components. Labels: 1 PC; add more node; Keep Metadata & Distribute Job; Store & Process Data]

Key Technology

Hive

Hive is a data warehousing package built on top of Hadoop.

Its target users are data analysts who are comfortable with SQL and who need to do ad hoc queries, summarization, and data analysis on Hadoop-scale data.

You interact with Hive by issuing queries in a SQL-like language called HiveQL via web GUI and JDBC.

Key Technology

HiveQL

(Source: White, 2011)

Key Technology

How Hadoop benefits a Sensor Network
Scalability: Commodity hardware scales easily in many cases. Twenty Hadoop nodes may cost only as much as a single redundant database slave pair.
Operational concerns: Removing as many single points of failure as possible is crucial to the smooth operation of a world-class service.
Data processing speed: Many system-wide calculations were simply not possible to perform with a monolithic system.

Spatial Processing & Custom Functions
Spatial query: find point in polygon
Specific custom functions: interpolation, forecasting, model

Hive with Spatial and Custom Function

Use JTS (Java Topology Suite)
Pure Java library for spatial functions
It can be easily attached to map/reduce tasks because Hadoop is a Java-native platform
Good performance and open source

Use User-Defined Functions (custom development)
UDF (User-Defined Function)
UDAF (User-Defined Aggregate Function)
UDTF (User-Defined Table-Generating Function)
Create a spatial function such as “within” using JTS and register it as a UDF
It can then run on Hive, which automatically generates the map/reduce jobs.

Use Join Method and Lateral View

Spatial Data Processing & Custom Function

Example of Spatial Custom Function
JTS (Java Topology Suite), used through a UDF (User-Defined Function)
Identify the location of a GPS point (lat, lng) by searching shape polygons

[Figure: example lookup of GPS point (139.702777, 35.694152) against Prefecture, City, and Grid polygons. Labels: Tokyo; Shinjuku-ku; Code: 533944151; 300,000++ points/sec]
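The point-in-polygon lookup can be sketched without JTS using the classic ray-casting test. This is a plain-Python stand-in, not the JTS `within` predicate or the UDF from the slides, and the rectangular "areas" in the usage example are invented shapes, not real administrative boundaries.

```python
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]  # (lng, lat)


def within(point: Point, polygon: List[Point]) -> bool:
    """Ray-casting test: True if the point lies inside the polygon."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def locate(point: Point, areas: Dict[str, List[Point]]) -> Optional[str]:
    # Linear scan over candidate polygons; returns the first match.
    for name, polygon in areas.items():
        if within(point, polygon):
            return name
    return None


# Invented rectangles for illustration only.
areas = {
    "area_a": [(139.6, 35.6), (139.8, 35.6), (139.8, 35.8), (139.6, 35.8)],
    "area_b": [(139.8, 35.6), (140.0, 35.6), (140.0, 35.8), (139.8, 35.8)],
}
match = locate((139.702777, 35.694152), areas)
```

Wrapped as a UDF, a test like this runs once per record, which is why it parallelizes well across map tasks; a production version would also want a spatial index rather than a linear scan over polygons.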

Performance Comparisons of Spatial Data Processing Techniques for a Large Scale Mobile Phone Dataset

App vs. RDBMS vs. Hadoop

Remark: 1 day of data = 20 million records

[Figure: processing-time comparison chart. Labeled values: 21 hours; 1 min]

Sensor Network with Cloud

Hive and PostgreSQL (Programming view point)

[Figure: programming view of Hive and PostgreSQL. Labels: PostgreSQL, SQL, Hadoop, Hive (Metastore), Hibernate, Spring, MapReduce, Specific data processing, Java, Servlet, RMI]

Conclusion

We designed the Enterprise Sensor Network to address current issues in sensor network development, such as:

handling a large number of sensor nodes and a large volume of sensor data
real-time data processing
flexible data communication
easy integration and installation

We proposed a messaging service and the Hadoop distributed platform as the main technologies to overcome those issues.

On the sensor station side, we designed the system as services. The web server and the SOS service are separate and communicate with each other via RMI.

The SOS service is a combination of several services that handle specific operations: the SOS interface, Command Service, Scheduler Service, Data Synchronization Service, and Data Feeder Service.

The Data Feeder Service was designed so that custom feeders for vendor-specific sensors can be developed and plugged into the services.

Together, the Sensor Station, Messaging Service, and Sensor Cloud Service enable the sensor network system to achieve real-time operation, scalability, and robustness.
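One way to picture a pluggable Data Feeder Service is a small interface plus a registry of vendor-specific implementations. The slides do not show the actual interface, so everything below is hypothetical: the `DataFeeder` base class, the `register` decorator, the `acme-csv` vendor name, and its comma-separated line format are all invented for illustration.

```python
from abc import ABC, abstractmethod
from typing import Callable, Dict, Type


class DataFeeder(ABC):
    """Hypothetical plug-in interface; the original system's API is not shown."""

    @abstractmethod
    def parse(self, raw: str) -> Dict[str, float]:
        """Convert one vendor-specific line into named measurements."""


FEEDERS: Dict[str, Type[DataFeeder]] = {}  # vendor name -> feeder class


def register(vendor: str) -> Callable[[Type[DataFeeder]], Type[DataFeeder]]:
    # Decorator that plugs a feeder class into the registry.
    def decorator(cls: Type[DataFeeder]) -> Type[DataFeeder]:
        FEEDERS[vendor] = cls
        return cls
    return decorator


@register("acme-csv")  # made-up vendor name and format
class AcmeCsvFeeder(DataFeeder):
    def parse(self, raw: str) -> Dict[str, float]:
        temperature, humidity = raw.split(",")
        return {"temperature": float(temperature), "humidity": float(humidity)}


def feed(vendor: str, raw: str) -> Dict[str, float]:
    # The rest of the station only sees the common dictionary form.
    return FEEDERS[vendor]().parse(raw)
```

Supporting a new sensor then means adding one class with its own `parse` method, without touching the code that schedules, synchronizes, or uploads the data.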

Hive/Hadoop

Messaging Service

Sensor

Network

Mr. Apichon Witayangkurn

Email: apichon@iis.u-tokyo.ac.jp

Department of Civil Engineering