re-engineering iot legacy analytics solutions with …...title powerpoint presentation author phil...

31
Copyright © 2015 KNIME.com AG Re-engineering IoT Legacy Analytics Solutions with Big Data Rosaria Silipo & Bernd Wiswedel KNIME.com [email protected]

Upload: others

Post on 09-Jun-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Re-engineering IoT Legacy Analytics Solutions with Big Data

Rosaria Silipo & Bernd WiswedelKNIME.com

[email protected]

Page 2: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Variety, Volume, Velocity

Variety:• integrating heterogeneous data (and tools)

Volume:• from small files...

• ...to distributed data repositories (Hadoop)

• bring the tools to the data

Velocity:• from distributing computationally heavy

computations...

• ...to real time scoring of millions of records/sec.

2

Page 3: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Every Minute…

3

Page 4: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

IoT

4

Page 5: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG 5

The IoT Legacy

Page 6: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Energy Usage Prediction from Smart Meters Data

• Read Smart Meter Energy Data (176 millions rows)

• Clean Up and Aggregate total Energy Usage by hour, week, day, month, year

• Calculate Behavioral Measures for each Smart Meter

• Cluster Smart Meters with Similar Behavior (k-Means)

• Predict Energy Usage in Clustered Smart Meters (Auto-Regressive Time Series Prediction)

6

Workflow 1

Workflow 2

Workflow 3

Page 7: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Workflow 1: PrepareData

7

~ 2 days

Page 8: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG 8

Big Data Options

Page 9: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Big Data Support

• KNIME Big Data Access Nodes

– preconfigured connectors

– in database processing

• Big Data Platforms

– HDFS, Hive, Impala, HP Vertica, Hortonworks, ParStream, Actian, MapR, any big data platform really!

• Spark MLlib integration (coming soon)

• Streaming Executor (coming soon)

Page 10: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Hadoop Sandboxes

• Hortonworks:

http://hortonworks.com/products/hortonworks-sandbox/

• Cloudera:

http://www.cloudera.com/content/cloudera/en/downloads/quickstart_vms.html

• Virtual Box

https://www.virtualbox.org/

• VMWare Player

http://www.vmware.com/

10

Page 11: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Access Big Data

Select TableIn-DB

ProcessingInto

KNIME

… as easy as 1,2,3,… 4

11

4321

Page 12: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

1. Database Connector

Generic Database Connector

– Can connect to any JDBC source

– Register new JDBC driver via preferences page

12

Access Big Data

Page 13: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

1. Register JDBC Driver

13

Open KNIME and go toFile -> Preferences

Increase connection timeout for long running retrieval operations

Access Big Data

Page 14: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

1. Dedicated Connectors

Dedicated pre-configured connectors

– Bundling necessary JDBC drivers

– Easy to use

– DB specific behavior/capability

Some dedicated connectors are part of the open source KNIME Analytics Platform, some belong to the commercial KNIME Big Data Extension

14

works for most Hadoop HIVE installations, including Hortonworks

free

Access Big Data

Page 15: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

2. Data Table Selection

16

Select Table

Page 16: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

3. In-Database Processing

• Filter rows and columns

• Join tables/queries

• Sort your data

• Write your own query

• Aggregate* your data

17

Similar Settings as GroupBy node

Similar Settings as Joiner node

* Database GroupBy node exposes DB specific aggregation methods

In-DB Processing

Page 17: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

3. Queries for average Measures

18

In-DB Processing

Page 18: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

3. Average Monthly Values

20

In-DB Processing

Page 19: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

4. Import Data from Database

21

< 30 min

1 2

3

4

Into KNIME

Page 20: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

New Big Data Platform?

22

No problem!Just change the connector node!

Page 21: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Other Useful Database Nodes

• Drop table

– missing table handling

– cascade option

• Execute any SQL statement

• Manipulate existing queries

23

Executes severalqueries separatedby ; and new line

Page 22: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

HDFS File Handling

• KNIME & Extensions -> KNIME File Handling Nodes

• HDFS Connection and HDFS File Permission nodes

24

Page 23: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Hive/Impala Loader

25

• Upload a KNIME data table to Hive/Impala

Page 24: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG 26

Next Steps

Page 25: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Following Workflows: k-Means

27

K = 30

Page 26: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Following Workflows: AR Model

28

Page 27: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Model Factory: Concept

Data Preparation on

HadoopK-Means A-R Model

Choosing best k for k-Means minimizing the RMS

prediction error from A-R Model on Spark

Loop, control, and final results remain in KNIME

Page 28: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Model Factory: Workflow

30

Select the best Model or use an Ensemble Model

Page 29: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

References

• Whitepaper “KNIME opens the Doors to Big Data”

http://www.knime.org/files/big_data_in_knime_1.pdf

• Blog Post “Integrating Big data is as Easy as 1,2,3, … 4”

http://www.knime.org/blog/integrating-big-data-is-as-easy-as-1-2-3-4

• The Big Data Extension Product Description http://www.knime.org/knime-big-data-extension

34

Page 30: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Resources

• KNIME (www.knime.org)• BLOG for news, tips and tricks(www.knime.org/blog)

• FORUM for questions and answers (tech.knime.org/forum)

• EXAMPLE SERVER for example workflows

• LEARNING HUB (www.knime.org/learning-hub)

• KNIME TV channel on

• KNIME on @KNIME

• KNIME on

https://www.facebook.com/KNIMEanalytics

35

Page 31: Re-engineering IoT Legacy Analytics Solutions with …...Title PowerPoint Presentation Author Phil Winters Created Date 6/11/2015 3:58:41 PM

Copyright © 2015 KNIME.com AG

Events and Trainings

• KNIME (https://www.knime.org/about/events)• User Training 15-16 June Zurich (https://www.knime.org/knime-

user-training-june-2015)

• Developer Training 17-18 June Zurich(https://www.knime.org/knime-developer-training-june-2015)

• Meetup.com• 16 June, Zurich, Meetup on Big Data

(http://www.meetup.com/Zurich-KNIME-Users/events/221570664/)

36

Thank you!