ingesting drone data into big data platforms

30
1 © Hortonworks Inc. 2011 – 2017. All Rights Reserved Ingesting Drone Data into Big Data Platforms Timothy Spann (@PaasDev) Oracle Code NYC 2017 https://github.com/tspannhw/IngestingDroneData [BRK1238-NYC]

Upload: timothy-spann

Post on 21-Apr-2017

5.906 views

Category:

Data & Analytics


4 download

TRANSCRIPT

Page 1: Ingesting Drone Data into Big Data Platforms

1 ©HortonworksInc.2011– 2017.AllRightsReserved

IngestingDroneDataintoBigDataPlatformsTimothySpann(@PaasDev)OracleCodeNYC2017https://github.com/tspannhw/IngestingDroneData

[BRK1238-NYC]

Page 2: Ingesting Drone Data into Big Data Platforms

2 ©HortonworksInc.2011– 2017.AllRightsReserved

Agenda

• Drones 101• Metadata Insights• Ingestion Formats: MQTT, Files, REST JSON, Images, FTP

and more.• Apache NiFi• Big Data Storage: HDFS, HBase, Phoenix, Hive

Page 3: Ingesting Drone Data into Big Data Platforms

3 ©HortonworksInc.2011– 2017.AllRightsReserved

Code Walk Through

• Processing Images with TensorFlow for Image Recognition • Writing a Java 8 Processor for Sentiment Analysis• Writing a Java 8 Processor for analyzing HTML • Writing a Java 8 Microservice for retrieving Phoenix/Hbase

data • Writing a Java 8 Microservice for retrieving Hive data • Writing a NiFi flow to wire it all together

Page 4: Ingesting Drone Data into Big Data Platforms

4 ©HortonworksInc.2011– 2017.AllRightsReserved

UAV / Drones

A drone is an unmanned aircraft, better known as a unmanned aerial vehicle (UAV) or unmanned aircraft systems (UAS).

For enterprise purposes, we are thinking of serious drones that contain high resolution video/still cameras with onboard sensors, LIDAR and GPS.

For almost all drones, they are regulated by the Federal Aviation Administration (FAA) and there are strict regulations you must register your drone. If you aren’t prepare for the regulations or cost, keep your drone in your house or under half a pound.

Page 5: Ingesting Drone Data into Big Data Platforms

5 ©HortonworksInc.2011– 2017.AllRightsReserved

Page 6: Ingesting Drone Data into Big Data Platforms

6 ©HortonworksInc.2011– 2017.AllRightsReserved

MetaData EXIF Geotagging

Photographs taken from UAVs, the JPEGs will have extra data stored in EXIF.Apache NiFi can extract this information, which is very helpful.The most important information it contains is GPS information for latitude and longitude.

GPS Altitude Ref: Sea levelGPS Altitude: 21 metresGPS Longitude: -73° 4' 50.41GPS Latitude: 40° 45' 16.51

Page 7: Ingesting Drone Data into Big Data Platforms

7 ©HortonworksInc.2011– 2017.AllRightsReserved

hdfs dfs -get/drone/meta/Bebop2_20160920085308-0400.json{"date":"2016-09-20T08:53:08","Compression":"JPEG","ExifVersion":"2.10","ComponentsConfiguration":"YCbCr","file.group":"root","CompressionType":"Baseline","Image Description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","NumberofComponents":"3","Component2":"Cbcomponent:Quantizationtable1,Samplingfactors1horiz/1vert","Focal Length":"1.83mm","Component 1":"Ycomponent:Quantizationtable0,Samplingfactors2horiz/2vert","YCbCr Positioning":"Center ofpixelarray","tiff:ResolutionUnit":"Inch","uuid":"9576a956-ec32-4314-bc8b-bd5bb43af4f8","Date/TimeOriginal":"2016:09:2008:53:08","ShutterSpeedValue":"1/249sec","X Resolution":"72dotsperinch","tiff:Make":"PARROT","path":"/","PhotometricInterpretation":"YCbCr","Component3":"Crcomponent:Quantizationtable1,Samplingfactors1horiz/1vert","Unique ImageID":"[32bytes]","F-Number":"F2.3","modified":"2016-09-20T08:53:08","FocalLength35":"6mm","tiff:BitsPerSample":"8","ExposureProgram":"Program action(high-speedprogram)","GPSVersionID":"2.200","GPSLatitudeRef":"N","meta:creation-date":"2016-09-20T08:53:08","exif:FNumber":"2.2999999166745724","GPSAltitudeRef":"Sea level","Exposure Time":"8597231/2147483647sec","GPS Longitude":"-73° 4'50.41\"","Creation-Date":"2016-09-20T08:53:08","ISOSpeedRatings":"426","Make":"PARROT","Orientation":"Top,leftside(Horizontal/normal)","MeteringMode":"(Other)","tiff:Orientation":"1","GPSLongitudeRef":"W","tiff:Software":"Dragon3.9.0","exif:FocalLength":"1.8300001180412324","filename":"Bebop2_20160920085308-0400.jpg","XMPValueCount":"9","geo:long":"-73.080669","file.owner":"root","Software":"Dragon3.9.0","ExifImageHeight":"1088pixels","tiff:YResolution":"72.0","YResolution":"72dotsperinch","GPSLatitude":"40° 45'16.51\"","dc:description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","geo:lat":"40.754585","FlashPixVersion":"1.00","DataPrecision":"8bits","White Balance":"Flash","tiff:ImageLength":"1088","description":"{\"product_id\":\"090C\",\"uuid\":\"CE112B4276E13339AFBAE10E9ED794E3\",\"run_date\":\"2016-09-20T083537-0400\",\"filename\":\"Bebop_2_2016-09-20T085308-0400_CE112B.jpg\",\"media_date\":\"2016-09-20T085308-0400\"}","dcterms:created":"2016-09-20T08:53:08","dcterms:modified":"2016-09-20T08:53:08","Last-Modified":"2016-09-20T08:53:08","file.permissions":"rwxrwxrwx","exif:ExposureTime":"0.00400339765660623","Last-Save-Date":"2016-09-20T08:53:08","GPSAltitude":"21metres","absolute.path":"/opt/demo/dronedata/","ColorSpace":"Undefined","File Size":"853159bytes","meta:save-date":"2016-09-20T08:53:08","file.creationTime":"2016-09-20T12:53:10+0000","Date/TimeDigitized":"2016:09:2008:53:08","FileName":"apache-tika-941370357006559178.tmp","Content-Type":"image/jpeg","Aperture Value":"F2.3","X-Parsed-By":"org.apache.tika.parser.DefaultParser,org.apache.tika.parser.jpeg.JpegParser","FileModifiedDate":"Tue Sep2023:33:24UTC2016","tiff:XResolution":"72.0","file.lastModifiedTime":"2016-09-20T12:53:10+0000","exif:DateTimeOriginal":"2016-09-20T08:53:08","Date/Time":"2016:09:2008:53:08","ExifImageWidth":"1920pixels","Image Height":"1088pixels","Image Width":"1920pixels","Unknown tag(0xc62f)":"[19bytes]","ResolutionUnit":"Inch","tiff:Model":"Bebop2","exif:IsoSpeedRatings":"426","MaxApertureValue":"F2.3","ExposureMode":"Auto exposure","Model":"Bebop 2","file.lastAccessTime":"2016-09-20T23:33:24+0000","tiff:ImageWidth":"1920","WhiteBalanceMode":"Auto whitebalance"}

Page 8: Ingesting Drone Data into Big Data Platforms

8 ©HortonworksInc.2011– 2017.AllRightsReserved

import paho.mqtt.client as mqttclient = mqtt.Client()client.username_pw_set("username","password")client.connect(“MQTT_Broker", 14162, 60)

Sources and Formats

{ "product_id": "090C", "uuid": "CE112B4276E13339AFBAE10E9ED794E3", "run_date": "2016-09-20T083537-0400", "filename": "Bebop_2_2016-09-20T084230-0400_CE112B.jpg", "media_date": "2016-09-20T084230-0400" }

http://localhost:9999/dronelist

Page 9: Ingesting Drone Data into Big Data Platforms

9 ©HortonworksInc.2011– 2017.AllRightsReserved

ReportingDroneDataIngest

- NiFipullsinBeBop 2Droneimages- NiFiroutesandparsesmetadatafromdroneimagesincludinggeodata

- NiFiusesTensorFlow Inceptionv3torecognizeobjectsinimage

- NiFistoresimages,metadataandenricheddatainHadoop.

- NiFiingestssocialandweatherfeeds- Java8ProcessorRunsSentiment- SpringBootAppDisplaysHiveData- PythonSentimentAnalysisscripts

Page 10: Ingesting Drone Data into Big Data Platforms

10 ©HortonworksInc.2011– 2017.AllRightsReserved

DevelopedbytheNSAoverthelast8years.

"NSA'sinnovatorsworkonsomeofthemostchallengingnationalsecurityproblemsimaginable,""Commercialenterprisescoulduseittoquicklycontrol,manage,andanalyzetheflowofinformationfromgeographicallydispersedsites– creatingcomprehensivesituationalawareness"

-- LindaL.Burger,DirectoroftheNSA

NiFi Developed by the National Security Agency

Page 11: Ingesting Drone Data into Big Data Platforms

11 ©HortonworksInc.2011– 2017.AllRightsReserved

• Foragileandimmediatecreation,configuration,controlofdataflowsVisual CommandandControl

• EnsurestrustofyourdataDataLineage(Provenance)

• Becausenotalldataisofequal importanceDataPrioritization

• Sincenotallsenders/receivers/connectionsworkperfectlyallthetimeDataBuffering/Back-Pressure

• Adapttodifferentsituations withdifferentrequirementsControl LatencyvsThroughput

• Securityofdata,anddataaccessSecure ControlPlane/DataPlane

• ScalabilityScaleoutClustering

• Ecosystem flexibilityandgrowthExtensibility

ApacheNiFi:Designedfor8challengesofglobalenterprisedataflow

Page 12: Ingesting Drone Data into Big Data Platforms

12 ©HortonworksInc.2011– 2017.AllRightsReserved

FlowFile• Unitofdatamovingthroughthesystem• Content+Attributes(key/valuepairs)

Processor• Performsthework,canaccessFlowFiles

Connection• Linksbetweenprocessors• Queuesthatcanbedynamicallyprioritized

Terminology

Page 13: Ingesting Drone Data into Big Data Platforms

13 ©HortonworksInc.2011– 2017.AllRightsReserved

Typical NiFi Cluster Logical View

13

Page 14: Ingesting Drone Data into Big Data Platforms

14 ©HortonworksInc.2011– 2017.AllRightsReserved

ConnectingDataBetweenEcosystemsWithoutCoding:180+Processors

AllApacheprojectlogosaretrademarksoftheASFandtherespectiveprojects.

Hash

Extract

Merge

Duplicate

Scan

GeoEnrich

Replace

ConvertSplit

Translate

RouteContent

RouteContext

RouteText

ControlRate

DistributeLoad

GenerateTableFetch

JoltTransformJSON

PrioritizedDelivery

Encrypt

Tail

Evaluate

Execute

AllApacheprojectlogosaretrademarksoftheASFandtherespectiveprojects.

Fetch

HTTP

Syslog

Email

HTML

Image

HL7

FTP

UDP

XML

SFTP

AMQP

WebSocket

Page 15: Ingesting Drone Data into Big Data Platforms

15 ©HortonworksInc.2011– 2017.AllRightsReserved

HadoopDistributedFileSystem(HDFS)

§ Fault-Tolerance§ Multiplecopiesprovideperformanceboost§ ReplicationLevelisconfigurable§ Fullchecksums§ Rackawareness§ Filessplitintoblocksdistributedonthree*servers§ CommodityHardware§ Nearlimitlesshorizontalscalability§ LookslikeLinuxFileSystem§ WebUI

HadoopScalableStorageandCompute

HiveLLAPHighPerformanceSQLDataMart

Page 16: Ingesting Drone Data into Big Data Platforms

16 ©HortonworksInc.2011– 2017.AllRightsReserved

What Are Apache HBase and Phoenix?

FlexibleSchemaMillisecondLatencySQLandNoSQL InterfacesStoreandProcessPetabytesofDataScaleoutonCommodityServersIntegratedwithYARN100%OpenSourceOnTopofHDFS

YARN:DataOperatingSystem

HBase

RegionServer

1 ° ° ° ° ° ° ° ° ° °

° ° ° ° ° ° ° ° ° ° N

HDFS(PermanentDataStorage)

HBase

RegionServer

HBase

RegionServer

Flexible SchemaExtreme Low Latency

Directly Integrated with HadoopSQL and NoSQL Interfaces

Page 17: Ingesting Drone Data into Big Data Platforms

17 ©HortonworksInc.2011– 2017.AllRightsReserved

WhatAreApachePhoenixandHBase?

Apache HBase is distributed database modeled after Google’s BigTable and designed toprovide real-time access to data in Hadoop. Apache Phoenix provides an ANSI SQLinterface to HBase.

Features:• Real-TimeDataManagementforHadoop• PB+Scale• ANSISQLInterface• SecondaryIndexes• Cross-DCReplication• Fine-GrainedSecurity• DevelopinJDBC,ODBC,.NET,andmore

Page 18: Ingesting Drone Data into Big Data Platforms

18 ©HortonworksInc.2011– 2017.AllRightsReserved

WhatIsApacheHive?

ApacheHive isaSQLdatawarehouseinfrastructurethatdeliversfast,scalableSQLprocessingonHadoopandintheCloud.

Features:• ExtensiveSQL:2011Support• ACIDTransactions• In-MemoryCaching• Cost-BasedOptimizer• User-BasedDynamicSecurity• ReplicationandDisasterRecovery• JDBCandODBCSupport• CompatiblewitheverymajorBITool• Provenat300+PBScale• OnTopofHDFS

Page 19: Ingesting Drone Data into Big Data Platforms

19 ©HortonworksInc.2011– 2017.AllRightsReserved

CODE!!!!

Page 20: Ingesting Drone Data into Big Data Platforms

20 ©HortonworksInc.2011– 2017.AllRightsReserved

pythonclassify_image.py --image_file /opt/demo/dronedata/Bebop2_20160920083655-0400.jpgsolardish,solarcollector,solarfurnace(score=0.98316)windowscreen(score=0.00196)manholecover(score=0.00070)radiator(score=0.00041)doormat,welcomemat(score=0.00041)

bazel-bin/tensorflow/examples/label_image/label_image --image=/opt/demo/dronedata/Bebop2_20160920083655-0400.jpgtensorflow/examples/label_image/main.cc:204]solardish(577):0.983162Itensorflow/examples/label_image/main.cc:204]windowscreen(912):0.00196204Itensorflow/examples/label_image/main.cc:204]manholecover(763):0.000704005Itensorflow/examples/label_image/main.cc:204]radiator(571):0.000408321Itensorflow/examples/label_image/main.cc:204]doormat(972):0.000406186

TensorFlow via Python or C++ Binary (Java Library Is New!)

Page 21: Ingesting Drone Data into Big Data Platforms

21 ©HortonworksInc.2011– 2017.AllRightsReserved

/opt/demo/sentiment/run.shpython/opt/demo/sentiment/sentiment.py "$@”

fromnltk.sentiment.vader importSentimentIntensityAnalyzerimportsyssid =SentimentIntensityAnalyzer()ss =sid.polarity_scores(sys.argv[1])print('Compound{0}Negative{1}Neutral{2}Positive{3}'.format(

ss['compound'],ss['neg'],ss['neu'],ss['pos']))

or

ifss['compound']==0.00: print('Neutral')elif ss['compound']<0.00: print('Negative')else: print('Positive')

Sentiment Analysis via PythonPrettyeasyandIwillshowyou

AnotherexamplewithTextBlob.

YouwillneedPython2.7andPIPinstalled.

Page 22: Ingesting Drone Data into Big Data Platforms

22 ©HortonworksInc.2011– 2017.AllRightsReserved

https://pip.pypa.io/en/latest/installing/http://www.nltk.org/install.html

wget https://bootstrap.pypa.io/get-pip.pypythonget-pip.pysudo pip install -U nltksudo pip install -U numpysudo pipinstall-Utextblobsudo python-mtextblob.download_corpora

Installing Sentiment Analysis Libraries for Python 2.7

Installing TensorFlow for Python 2.7See:https://www.tensorflow.org/install/install_mac

InstallationisgettingeasierforTensorFlow,butyouwillneedbuildtoolsandPythoninstalled.YouneedtofindoutifyouhaveaGPUsupportedbyCUDA.Ifsoyourperformancewillbegreatlyimproved.

Page 23: Ingesting Drone Data into Big Data Platforms

23 ©HortonworksInc.2011– 2017.AllRightsReserved

Page 24: Ingesting Drone Data into Big Data Platforms

24 ©HortonworksInc.2011– 2017.AllRightsReserved

Page 25: Ingesting Drone Data into Big Data Platforms

25 ©HortonworksInc.2011– 2017.AllRightsReserved

Page 26: Ingesting Drone Data into Big Data Platforms

26 ©HortonworksInc.2011– 2017.AllRightsReserved

InstallationDownload the binary from here: http://hortonworks.com/downloads/#dataflow

Or here:https://nifi.apache.org/download.html

Or on Mac:brew install nifi

https://nifi.apache.org/docs/nifi-docs/html/getting-started.html#starting-nifi

bin/nifi.sh start

Page 27: Ingesting Drone Data into Big Data Platforms

27 ©HortonworksInc.2011– 2017.AllRightsReserved

à https://hortonworks.com/hadoop-tutorial/learning-ropes-apache-nifi/

à https://github.com/jfrazee/awesome-nifi

à https://dzone.com/articles/getting-started-with-apache-nifi-and-hdf

à https://nifi.apache.org/docs.html

à https://community.hortonworks.com/articles/4356/getting-started-with-nifi-expression-language-and.html

à https://github.com/tspannhw/rpi-sensehat-mqtt-nifi

à https://github.com/tspannhw/iot-scripts

à https://dzone.com/articles/apache-nifi-10-cheatsheet

LearningMore

Page 28: Ingesting Drone Data into Big Data Platforms

28 ©HortonworksInc.2011– 2017.AllRightsReserved

à https://www.parrot.com/us/drones/parrot-bebop-2#parrot-bebop-2

à https://en.wikipedia.org/wiki/Unmanned_aerial_vehicle

à http://github.com/tensorflow/tensorflow/blob/master/tensorflow/examples/label_image/README.md

à https://www.tensorflow.org/tutorials/image_recognition

à https://github.com/tensorflow/tensorflow/blob/master/tensorflow/java/src/main/java/org/tensorflow/examples/LabelImage.java

à https://community.hortonworks.com/content/kbentry/83100/deep-learning-iot-workflows-with-raspberry-pi-mqtt.html

à https://sites.google.com/site/pud2gpxkmlcsv/

à http://knowbeforeyoufly.org/

à https://www.faa.gov/uas/getting_started/

Resources

Page 29: Ingesting Drone Data into Big Data Platforms

29 ©HortonworksInc.2011– 2017.AllRightsReserved

à KenKranz – DirectorofUASBigData atCognizant.@kenkranz

à JoeWitt- SeniorDirectorofEngineeringatHortonworks.@joewitt26

à ChrisCasano – SolutionEnngineer ManageratHortonworks.

à TomMcCuch – Sr.TechnicalDirectoratHortonworks.@tmccuch

à IngestingDroneDataintoBigDataPlatformshttps://github.com/tspannhw/IngestingDroneData

Thanks

Page 30: Ingesting Drone Data into Big Data Platforms

30 ©HortonworksInc.2011– 2017.AllRightsReserved

Contact:

TimothySpann@PaaSDeV

www.meetup.com/futureofdata-princetonhttps://dzone.com/users/297029/bunkertor.htmlcommunity.hortonworks.com/users/9304/tspann.html