![Page 1: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/1.jpg)
Big Spatial Data
Bruce Sanderson
![Page 2: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/2.jpg)
Z R Y Q P
T R Y T O
R E A D T H I S
IF YOU CAN THEN
YOU ARE A MUCH BETTER PERSON
THAN I CAN EVER HOPE TO BE IN THIS LIFETIME OR THE NEXT
GOODBYE CRUEL WORLD IT WAS NICE KNOWING YOU BUT NOW IT IS TIME TO GO FOR GOOD
![Page 3: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/3.jpg)
BIG DATA
![Page 4: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/4.jpg)
THE PROBLEM
WITH BIG DATA
IS NOT THAT IT’S BIG
![Page 5: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/5.jpg)
It’s that there is LOTS OF IT
![Page 6: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/6.jpg)
![Page 7: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/7.jpg)
![Page 8: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/8.jpg)
IN ORDER TO SEE CLEARLY WE NEED
BETTER TOOLS
THAT BRING THINGS BACK
INTO FOCUS
![Page 9: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/9.jpg)
![Page 10: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/10.jpg)
Agenda
• Our Hadoop implementation
• Big Spatial Data workflows
• What should you be doing?
![Page 11: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/11.jpg)
![Page 12: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/12.jpg)
SPATIAL DATA
MANAGEMENT WINDOW
TOO SHORT
![Page 13: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/13.jpg)
UNSTRUCTURED
DATA
![Page 14: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/14.jpg)
Drivers
• Volume of data
• Need to load/process quicker
• Data that doesn’t fit into RDBMS
![Page 15: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/15.jpg)
If you’re not a Big Data expert…
But have Big Data needs
How do you get started?
![Page 16: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/16.jpg)
Basic Hadoop Stack
Hadoop Distributed File System (HDFS)
MapReduce
Yet Another Resource Negotiator (YARN)
Commodity
Servers
![Page 17: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/17.jpg)
MapReduce
• MapReduce is a programming model and an
associated implementation for processing
and generating large data sets with a
parallel, distributed algorithm on a cluster.
• Yuck!
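The definition above is easier to follow with a toy example. The sketch below runs the map, shuffle, and reduce steps in a single Python process; it is not Hadoop code, and the record layout (`vehicle_id,latitude,longitude`) and sample values are made up for illustration.

```python
from collections import defaultdict

# Hypothetical input: one "vehicle_id,latitude,longitude" record per line.
RECORDS = [
    "truck1,51.05,-114.07",
    "truck2,53.54,-113.49",
    "truck1,51.06,-114.08",
]


def map_phase(record):
    """Emit one (key, value) pair per input record."""
    vehicle_id = record.split(",")[0]
    return (vehicle_id, 1)


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Combine all values for one key into a single result."""
    return (key, sum(values))


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a list of records."""
    mapped = [map_phase(r) for r in records]
    grouped = shuffle(mapped)
    return dict(reduce_phase(k, v) for k, v in grouped.items())


counts = run_job(RECORDS)
print(counts)
```

On a real cluster the map and reduce calls would run in parallel on different nodes; only the shape of the computation is shown here.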
![Page 18: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/18.jpg)
NameNode
DataNode 1 DataNode 2 DataNode 3
HDFS Client
Secondary NameNode
![Page 19: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/19.jpg)
Hive
MapReduce
Hadoop Distributed File System (HDFS)
HBase Impala Spark
Pig Cascading
Sqoop
Flume
![Page 20: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/20.jpg)
What Next?
![Page 21: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/21.jpg)
![Page 22: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/22.jpg)
Load data!
hadoop fs -put N:\07_2014_gps.txt /user/vehicle/data.txt
![Page 23: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/23.jpg)
Sqoop
• Command-line interface for transferring data
between relational databases and HDFS
• Supports joins and WHERE clauses
• Quest Data Connector - Oracle
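The tool described above appears to be Sqoop, which is named on the stack slide. As a rough sketch of how an import with a WHERE clause might be scripted, the helper below only assembles the command line and does not run it. The JDBC URL, table name, filter, and target directory are placeholders, not values from the presentation.

```python
def build_sqoop_import(jdbc_url, table, target_dir, where=None):
    """Assemble the argument list for a `sqoop import` run.

    Only builds the command; pass the result to subprocess.run()
    on a machine where Sqoop is installed to execute it.
    """
    args = [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
    ]
    if where:
        # Restrict the imported rows with a SQL WHERE condition.
        args += ["--where", where]
    return args


# Hypothetical values, for illustration only.
command = build_sqoop_import(
    jdbc_url="jdbc:oracle:thin:@//dbhost:1521/ORCL",
    table="WELLS",
    target_dir="/user/vehicle/wells",
    where="STATUS = 'ACTIVE'",
)
print(" ".join(command))
```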
![Page 24: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/24.jpg)
Hive
• SQL on Hadoop
• External tables
• Schema on read
hive> CREATE EXTERNAL TABLE well(id INT,
api STRING, name STRING) ROW FORMAT
DELIMITED FIELDS TERMINATED BY ',' LINES
TERMINATED BY '\n' STORED AS TEXTFILE
LOCATION '/home/admin/userdata';
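"Schema on read" means the file stays as plain delimited text and the column names and types are applied only when it is queried. A minimal Python sketch of that idea follows, using the same three columns as the table definition above; the sample rows are invented, and this is not how Hive itself is implemented.

```python
import csv
import io

# Raw file contents as they might sit in HDFS: plain text, no types attached.
# These rows are made up for illustration.
RAW = "1,05-123-45678,Well A\n2,05-123-45679,Well B\n"

# Mirrors the columns of the external table: id INT, api STRING, name STRING.
SCHEMA = [("id", int), ("api", str), ("name", str)]


def read_with_schema(raw_text, schema):
    """Apply column names and types to delimited text at read time."""
    rows = []
    for fields in csv.reader(io.StringIO(raw_text)):
        row = {name: cast(value) for (name, cast), value in zip(schema, fields)}
        rows.append(row)
    return rows


wells = read_with_schema(RAW, SCHEMA)
for well in wells:
    print(well)
```

Because the schema lives outside the data, a different schema could be applied to the same file without rewriting it.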
![Page 25: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/25.jpg)
Impala
• SQL on HDFS
• Bypasses MapReduce
• Most operations in-memory
![Page 26: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/26.jpg)
Spark
• Up to 100x faster than Hadoop MapReduce (for in-memory workloads)
• Java, Scala, Python, R
• Access diverse data sources
- HDFS, HBase, Hive, S3
• SparkSQL
![Page 27: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/27.jpg)
ArcMap
ArcPy
HDFS
Snakebite
• Pure Python client
• Allows us to use ArcPy!
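As a rough illustration of what a pure Python HDFS client makes possible inside an ArcPy script, the sketch below lists the files in a directory. To keep it runnable without a cluster it uses a stand-in class instead of `snakebite.client.Client`; the stand-in, the paths, and the assumed dictionary keys (`path`, `file_type`, `length`) are illustrative and should be checked against the Snakebite documentation.

```python
class FakeClient:
    """Stand-in for snakebite.client.Client so the sketch runs offline.

    Assumption: the real client's ls() takes a list of paths and yields
    one dict per entry with 'path', 'file_type', and 'length' keys.
    """

    def ls(self, paths):
        yield {"path": "/user/vehicle/data.txt", "file_type": "f", "length": 1024}
        yield {"path": "/user/vehicle/archive", "file_type": "d", "length": 0}


def list_files(client, directory):
    """Return (path, size) for each regular file directly under a directory."""
    return [
        (entry["path"], entry["length"])
        for entry in client.ls([directory])
        if entry["file_type"] == "f"
    ]


files = list_files(FakeClient(), "/user/vehicle")
print(files)
```

With Snakebite installed, the stand-in would be replaced by something like `Client("namenode-host", 8020)`, where the host name and port are placeholders.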
![Page 28: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/28.jpg)
Big Spatial Data workflows
• Vehicle Tracking Analysis
• Directional Survey Analysis
• Custom GeoAnalytics Interface
![Page 29: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/29.jpg)
Vehicle Tracking Analysis
GPS Data
put
HDFS
ArcPy
ArcMap
![Page 30: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/30.jpg)
![Page 31: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/31.jpg)
Directional Survey Analysis
Directional Surveys
Sqoop
HDFS
GeoJson
FME
SDE
ArcMap
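The diagram shows GeoJson as the hand-off format in this workflow. A minimal sketch of writing one well path as a GeoJSON feature follows. It assumes each survey station has already been reduced to longitude, latitude, and depth; the field name `well_id` and the sample values are hypothetical, and the presentation does not describe the actual conversion.

```python
import json


def survey_to_geojson(well_id, stations):
    """Build a GeoJSON LineString feature from (lon, lat, depth) stations.

    Depth is stored as a negative third coordinate (elevation below surface).
    """
    coordinates = [[lon, lat, -depth] for lon, lat, depth in stations]
    return {
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": coordinates},
        "properties": {"well_id": well_id},
    }


# Hypothetical stations: (longitude, latitude, depth in metres).
stations = [
    (-114.070, 51.050, 10.0),
    (-114.071, 51.051, 500.0),
    (-114.073, 51.052, 1000.0),
]

feature = survey_to_geojson("well-001", stations)
print(json.dumps(feature, indent=2))
```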
![Page 32: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/32.jpg)
GeoAnalytics Interface
Wells Leases Rigs
Sqoop
HDFS
Hive Tables
jdbc
Charts Renderer Tiles
User i/f
![Page 33: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/33.jpg)
What did we learn?
• There is a need for some Big Data tools
• Mostly using Spark directly against HDFS
![Page 34: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/34.jpg)
What does that mean for you?
![Page 35: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/35.jpg)
10.4 Release
• Big Data built into ArcGIS
• Push button deployment
• Any number of nodes
• Spark is part of the framework!
![Page 36: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/36.jpg)
• ‘Real-Time GIS: The Road Ahead’
• Room 14 B, Wed 1:30-2:45pm
![Page 37: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/37.jpg)
![Page 38: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/38.jpg)
![Page 39: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/39.jpg)
![Page 40: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/40.jpg)
So you no longer need to be a Big Data expert to use it in
your geospatial environment.
![Page 41: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/41.jpg)
Hadoop
• Invest in Spark
• Learn Java, Scala, or Python
No Hadoop
• Wait for 10.4!
• Study Spark
• Learn Java, Scala, or Python
![Page 42: Big Spatial Data - Esri · HBase Impala Spark Pig Cascading Sqoop Flume. What Next? Load data! hadoop fs -put N:\07_2014_gps.txt \user\vehicle\data.txt •Command line interface for](https://reader030.vdocument.in/reader030/viewer/2022040613/5f0733127e708231d41bcd6b/html5/thumbnails/42.jpg)
Summary
• Do you need Big Spatial Data Tools?
- Yes – but probably worth waiting for 10.4
• Spark