Cody Benkelman & Randall Rebello ArcGIS Workflows for Optimizing Image Management & Services in the Cloud


Post on 11-Mar-2020


Page 1: ArcGIS Workflows for Optimizing Image Management ... · Running ArcGIS Image Server in the Cloud •Advantages ... -Auxiliary files ... services access different cloud storages with

Cody Benkelman & Randall Rebello

ArcGIS Workflows

for Optimizing Image Management

& Services in the Cloud


Objectives

• Making imagery accessible

• Store massive volumes of imagery on inexpensive cloud storage

• Use elastic compute for ArcGIS image services

• Focus on Imagery

- Large in Volume

- Static

- Basis for visualization and analytics


Running ArcGIS Image Server in the Cloud

• Advantages

- Lower cost resilient storage

- Lower cost enterprise compute

- Simpler install

- Simpler scalability

• Disadvantages

- Data needs to be uploaded

- Different storage types

- Infrastructure changes

- Potential security concerns

- Potential for complex data access policies


Outline – 5 key topics

1. Data storage options

- Cloud storage, Data formats

2. Accessing imagery from cloud storage

3. Cloud-based architectures

- CloudFormation templates

- ArcGIS Enterprise with separate Image Server

4. Data model for massive scalability

- Mosaic Dataset, Imagery Workflows landing page (best practices)

5. Data access security


Data Storage Options


Storage configurations

[Diagram labels: Standalone, Direct Access, Enterprise, NAS, Cloud, Object Store, NFS, Http, Simple, Elasticity]


Amazon Storage Options

• Instance Store (80 GB)

- $0 (incl. with EC2)

• EBS (Elastic Block Store)

- SSD / HDD, up to 16 TB/disk

- ~$25 - $125/TB/month

• S3 (Simple Storage Service)

- ~$10 - $23/TB/month

- 99.999999999% durability, 99.99% reliability/year

- ≈10-50 MB/s

[Diagram access labels: NFS, Http]


Azure Storage Options

• Temporary Disk (80 GB)

• Azure Files

- 5 TB, max 60 MBps/share

- Access: NFS + REST

• Azure Blobs

- 500 TB/container, max 60 MBps/blob

- Access: REST


Optimizations

• Use Inexpensive Scalable Storage

- Minimize Number of requests

- Minimize Size of requests

- Minimize Repeat requests

• Enable Elasticity

• Enable Caching

[Diagram labels: Object Store, Http, Driver, Enable http access, Optimize data structures, Cache to Instance Storage]
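The "minimize repeat requests" and "enable caching" points can be illustrated with a small sketch. It is not how ArcGIS or GDAL implement caching; `CountingStore` and `BlockCache` are hypothetical names, and the in-memory blob stands in for an object store reached over HTTP. The idea shown is only that reads are rounded to fixed-size blocks and repeated reads of the same block never reach the store.

```python
from functools import lru_cache


class CountingStore:
    """Hypothetical stand-in for an object store; counts range requests."""

    def __init__(self, blob: bytes):
        self.blob = blob
        self.requests = 0

    def read_range(self, offset: int, length: int) -> bytes:
        self.requests += 1
        return self.blob[offset:offset + length]


class BlockCache:
    """Serves reads from fixed-size blocks kept in an in-memory LRU cache."""

    def __init__(self, store, block_size=4096, max_blocks=64):
        self.store = store
        self.block_size = block_size
        # Wrap the fetch so each block index is requested at most once
        # while it stays in the cache.
        self._get_block = lru_cache(maxsize=max_blocks)(self._fetch)

    def _fetch(self, index: int) -> bytes:
        return self.store.read_range(index * self.block_size, self.block_size)

    def read(self, offset: int, length: int) -> bytes:
        if length <= 0:
            return b""
        first = offset // self.block_size
        last = (offset + length - 1) // self.block_size
        data = b"".join(self._get_block(i) for i in range(first, last + 1))
        start = offset - first * self.block_size
        return data[start:start + length]
```

A larger block size lowers the number of requests at the cost of larger transfers, which is the same trade-off the slide lists as separate goals.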


Raster Storage Formats & Compression

• Raw/Striped TIF

- Disadvantage : Sequential access

- Compression :

- None

• Tiled GeoTIFF

- Advantage : Tiled Access

- Compression :

- None

- Lossy JPEG (8bit & 12bit)

- Lossless Deflate, LZW

• JPEG2000

- Advantage : Tiled Access, Higher Compression

- Disadvantage : Computationally expensive

- Compression : Lossy or Lossless

• MRF

- Advantage : Tiled Access

- Compression

- Same As GeoTIFF

- PNG – Generic lossless

- LERC – Controlled Lossy – Very Fast – Optimum for higher bit depth & categorical

• COG (Cloud Optimized GeoTIFF)

- Advantage : Tiled Access, Pyramids up front

- Compression

- Same As GeoTIFF
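The difference between sequential (striped) and tiled access can be made concrete with a rough count of the pixels that must be fetched for a small window. This is a simplified sketch: the function names are hypothetical, whole tiles and strips are counted, and compression and clipping at image edges are ignored.

```python
def tiles_for_window(col_off, row_off, width, height, tile_size=512):
    """Return (tile_col, tile_row) pairs for tiles that intersect a pixel window."""
    first_col = col_off // tile_size
    last_col = (col_off + width - 1) // tile_size
    first_row = row_off // tile_size
    last_row = (row_off + height - 1) // tile_size
    return [
        (c, r)
        for r in range(first_row, last_row + 1)
        for c in range(first_col, last_col + 1)
    ]


def pixels_read_tiled(window, tile_size=512):
    """Pixels fetched when every intersecting tile is read in full."""
    return len(tiles_for_window(*window, tile_size=tile_size)) * tile_size ** 2


def pixels_read_striped(window, image_width, rows_per_strip=1):
    """Pixels fetched when every intersecting full-width strip is read."""
    _, row_off, _, height = window
    first = row_off // rows_per_strip
    last = (row_off + height - 1) // rows_per_strip
    return (last - first + 1) * rows_per_strip * image_width
```

For a 256 x 256 window at column 1000, row 2000 of an image 40,000 pixels wide, the tiled layout touches 4 tiles while the striped layout has to read 256 full-width rows.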


File Format Considerations / Optimization

• Tiling of imagery – Enables partial access

• Compression – Reduce storage and transfer – Weigh against additional compute requirements

• Data access complexity – Reduce subsequent requests

• Pyramids

[Diagram: Striped TIF (non-optimum access, rated Slow) compared with Tiled TIF (enables partial access, rated Good); file blocks labeled Metadata, Index, Pyramid.]


File Format Considerations / Optimization

• Tiling of imagery – Enables partial access

• Compression – Reduce storage and transfer – Weigh against additional compute requirements

• Data access complexity – Reduce subsequent requests

• Pyramids

• Enable separate caching of metadata, index & pixels

• Cache Locally

[Diagram: Striped TIF (non-optimum access, rated Slow), Cloud Optimized GeoTIFF (COG) (improved access to pyramids, rated Good), and MRF (rated Optimum); file blocks labeled Metadata, Index, Pyramid.]


MRF (Meta Raster Format)

• NASA designed format

• Split raster into three (or more) files

- MRF – Small XML, with links to Index and DataFile

- Index – Simple index to tiles

- DataFile – Data file

- Auxiliary files – e.g. Aux.xml

• Open source, implemented in GDAL 2.1+

• Multiple compression options

• Supported for read and write in ArcGIS

• Multiple implementation modes

- Metadata, Index and data file can be on different volumes

- MRF file and index are small and can easily be on a local file system

- Data can be on slower storage

[Diagram labels: Metadata, Index, Pyramid, Datafile]

http://esriurl.com/MRF
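Keeping the small index local while the data file sits on slower storage means a tile read becomes one local lookup plus one ranged request. The sketch below assumes that each index record is 16 bytes holding a big-endian 64-bit offset followed by a big-endian 64-bit size; that layout is an assumption here and should be checked against the MRF documentation. The function names are hypothetical.

```python
import struct

# Assumed record layout: unsigned 64-bit offset, unsigned 64-bit size, big-endian.
RECORD = struct.Struct(">QQ")


def tile_byte_range(index_bytes: bytes, tile_number: int):
    """Look up (offset, size) of one tile in the locally held index."""
    return RECORD.unpack_from(index_bytes, tile_number * RECORD.size)


def http_range_header(offset: int, size: int) -> dict:
    """Build the HTTP Range header that fetches exactly one tile from the data file."""
    return {"Range": f"bytes={offset}-{offset + size - 1}"}
```

With the index cached locally, only the final ranged request touches cloud storage, which fits the goal of minimizing the number and size of requests.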


Getting imagery into the Cloud

• AWS command line, CloudBerry, S3 Browser, ….

• Import/Export

- Ship hard drives for large projects

• OptimizeRasters or Transfer Files from your office

- Or Run OptimizeRasters to create proxies for files already in the cloud

• Note: File Names on S3 are case sensitive
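Because S3 key names are case sensitive, a path that worked on a case-insensitive local file system can fail after upload. A minimal pre-upload check, using hypothetical helper names and plain Python lists rather than any cloud API, might look like this:

```python
from collections import defaultdict


def case_collisions(keys):
    """Group keys that differ only by letter case (distinct objects on S3)."""
    groups = defaultdict(set)
    for key in keys:
        groups[key.lower()].add(key)
    return [sorted(group) for group in groups.values() if len(group) > 1]


def case_mismatches(requested, keys):
    """Return keys that match the requested name except for letter case."""
    return sorted(
        key for key in set(keys)
        if key != requested and key.lower() == requested.lower()
    )
```

Running `case_mismatches` on each file name referenced by a mosaic dataset or proxy file is one way to catch a broken reference before it surfaces as a missing image.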


OptimizeRasters

- Python toolbox & command line python tool.

- Convert from any GDAL supported format to MRF or COG and upload in single process.

- Predefined templates with best practices

- Modify parameters to define compression, tile size etc.

- Includes data upload and download from cloud.

- Works with data on S3, Azure, Google Cloud Storage and also from any local storage.

- Can run independent of ArcGIS.

- Can be scheduled to run periodically for new data.

- Logging support.

- Runs in parallel to speed up the processing

- Supports obfuscation

- Supports Lambda (serverless compute) (with some restrictions on data size)

https://github.com/Esri/OptimizeRasters


Cloud Connection Transfer Files (ArcGIS Pro)

New at Pro 2.3/ArcMap 10.7

• Create Cloud Storage Connection

- *.acs file

• Transfer Files

- Use *.acs connection as destination

- Provide Access Key, Secret Access Key, and Bucket Name

• Supports numerous public clouds


Accessing Cloud Storage in ArcGIS

• Raster Proxies

• Cloud Connections


Accessing Cloud Storage in ArcGIS

ArcGIS typically accesses files through a file system; it cannot generically access files in cloud storage. Options:

1. Use “Raster Proxies” to link to imagery on cloud storage

- ArcGIS uses GDAL to read these Raster Proxies to access data on cloud

- Available as of ArcMap 10.5, ArcGIS Pro 2.0

- Requires installing Boto3 for S3

2. Use Cloud Connection

- Create *.acs connection file
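GDAL addresses remote files through virtual file system prefixes: /vsicurl/ for plain HTTP(S) URLs and /vsis3/ for S3 objects read with credentials. As a rough illustration of how a cloud URL maps onto such a path (the helper name and the example bucket are hypothetical, and this is not the code ArcGIS or OptimizeRasters uses):

```python
from urllib.parse import urlparse


def to_gdal_path(url: str) -> str:
    """Map an s3:// or http(s):// URL to a GDAL virtual file system path."""
    parts = urlparse(url)
    if parts.scheme == "s3":
        # /vsis3/<bucket>/<key>: GDAL signs the requests with AWS credentials.
        return f"/vsis3/{parts.netloc}{parts.path}"
    if parts.scheme in ("http", "https"):
        # /vsicurl/<full URL>: unsigned ranged HTTP reads.
        return f"/vsicurl/{url}"
    raise ValueError(f"unsupported scheme: {parts.scheme!r}")
```

The resulting string can be passed to any GDAL-based reader in place of a local file path.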


Raster Proxies

• Work with most GDAL Readable formats

• Reference the source files

• Enables caching of data

- Speed up subsequent reads

• Most optimum when referencing MRF

• Can have any raster extension

• Use like any other Rasters

• Need to consider cache location

• Need to manage cache

• Create using OptimizeRasters

• Read Optimize Rasters Help

[Diagram: two panels. In each, a RasterProxy with a local index and cache reads over VSI Curl / Http. One panel reads from an MRF (Index, Metadata, Datafile); the other reads from most GDAL readable files (XXX File).]


Using ACS cloud connection

• *.ACS file allows ArcGIS to connect directly to cloud storage

• If allowed by cloud permissions, can do listing of files & folders

• Access files directly using local file path

• Intended for cloud server to access cloud storage


ACS cloud connection vs. Proxy files

Advantages of Proxy Files

• More control of the process & configuration – can be more optimized.

• Easier to replicate

• Faster when working with large number of files - the ingest process can preset common attributes (does not require opening each image)

Advantages of *.ACS

• More integrated

• Simpler use of Cloud store as a virtual drive that can be explored with Catalog

• Enables a single server machine to have multiple profiles (i.e., different services access different cloud storages with different credentials)


Cloud Based Architectures

For Image Server


Siloed (single-machine) ArcGIS Server + Image server

• Used where the mosaics are not updated frequently

• Mosaics sit in a File GDB

• Each server site is independent

• Allows seamless recovery: when one machine goes down, another comes up

• Cannot run raster analytics

• Load Balancer distributes the load across the server machines

• Each machine has its own config store

[Diagram: clients consume imagery items in ArcGIS Online (image services shared on AGOL) and image services through an Elastic Load Balancer. Inside a VPC, an Auto Scaling group of EC2 machines runs Image Server and publishes the image services, reading from Imagery S3 Storage. A separate S3 storage holds other files (zip, fgdb).]


Siloed (single-machine) ArcGIS Server + Image server + RDS

• Used where the mosaics are updated frequently

• Mosaics sit in Postgres (RDS)

• Each server site is independent

• Allows seamless recovery: when one machine goes down, another comes up

• Cannot run raster analytics

• Load Balancer distributes the load across the server machines

• Each machine has its own config store

[Diagram: clients consume imagery items in ArcGIS Online (image services shared on AGOL) and image services through an Elastic Load Balancer. Inside a VPC, an Auto Scaling group of EC2 machines runs Image Server and publishes the image services, reading from Imagery S3 Storage and a Postgres RDS instance. A separate S3 storage holds other files (zip, etc).]


Enterprise + ArcGIS Server + Image server

• Used to support Raster Analytics

• Different stack for dynamic image services and Raster Analytics

• Mosaics sit in a File GDB in the config store

• If the mosaics are not going to be updated, they can sit on each machine

• All server machines are in a cluster

• Config store is on a different EC2 machine

• Portal and Image Server are in the same VPC

[Diagram: professional imagery / geospatial analysts work in ArcGIS Pro; clients consume imagery items, hosted image services, and dynamic image services. ArcGIS Portal (EC2, Elastic IP) is federated with ArcGIS Server. Two stacks, Dynamic Image Services and Raster Analytics, each sit behind an Elastic Load Balancer with an Auto Scaling group of EC2 Image Server machines in a VPC; the Config Store runs on its own EC2 machine. Storage: Imagery S3 Storage and a user S3 storage (Raster Store, zip, fgdb).]


Enterprise + ArcGIS Server + Image server + RDS

• Used to support Raster Analytics

• Mosaics sit in Postgres (RDS)

• All server machines are in a cluster

• Config store is on a different EC2 machine

• Different stack for dynamic image services and Raster Analytics

[Diagram: professional imagery / geospatial analysts work in ArcGIS Pro; clients consume imagery items, hosted image services, and dynamic image services. ArcGIS Portal (EC2, Elastic IP) is federated with ArcGIS Server. Two stacks, Dynamic Image Services and Raster Analytics, each sit behind an Elastic Load Balancer with an Auto Scaling group of EC2 Image Server machines in a VPC; the Config Store runs on its own EC2 machine. Storage: Imagery S3 Storage, a user S3 storage (Raster Store, zip, fgdb), and a Postgres RDS instance.]


CloudFormation templates (10.6.1)

• Siloed (single-machine) ArcGIS Server + Image server

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-siloed-server-VPC.template

• Siloed (single-machine) ArcGIS Server + Image server + RDS

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-siloed-server-VPC.template

• Enterprise + ArcGIS Server + Image server

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-allinone-windows.template (Portal)

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-server-windows.template ( AIS)

• Enterprise + ArcGIS Server + Image server + RDS

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-allinone-windows.template (Portal)

https://s3.amazonaws.com/arcgisstore1061/9270/templates/arcgis-server-windows.template ( AIS)

https://s3.amazonaws.com/arcgisstore1061/9270/docs/index.html


Data model for massive scalability

• Mosaic Datasets

• Automation

• Imagery Workflows (Best Practices)


Mosaic Dataset

Optimum Model for Image Data Management

• Catalog all types of raster data

• Image data remains in original files

• Define – in Geodatabase:

- Metadata

- Processing to be applied

- Default viewing rules

• Access – In all ArcGIS applications

- As Image

- Dynamic Mosaic, Processed on the fly

- As Catalog

- Footprints, Detailed metadata


Source Mosaic Datasets

[Diagram: source imagery (2004, 2007, 2010) feeding source mosaic datasets.]


Combine into Derived Mosaic Dataset

• Use TABLE Raster Type

• Advantage: All image data available in a single location

[Diagram: Source Imagery, Source Mosaic Datasets, Derived Mosaic Dataset.]


Share through Image server – with raster functions (processing templates)

• Single image service with multiple server functions

• Multispectral image service functions: RGB, CIR, NDVI, Pan sharpen, …many other functions

[Diagram: Source Imagery, Source Mosaic Datasets, Derived Mosaic Dataset, Multispectral image service.]
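The idea of one service exposing several functions over the same source pixels can be sketched in a few lines. This is an illustration only, not the ArcGIS raster function API: the `FUNCTIONS` table, `render`, and the band names are hypothetical. NDVI is computed with the standard formula (NIR - Red) / (NIR + Red), and CIR maps NIR, Red, Green to the display channels.

```python
def ndvi(pixel):
    """Normalized Difference Vegetation Index for one multispectral pixel."""
    nir, red = pixel["nir"], pixel["red"]
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


# One set of stored pixels, several views computed on request.
FUNCTIONS = {
    "RGB": lambda p: (p["red"], p["green"], p["blue"]),
    "CIR": lambda p: (p["nir"], p["red"], p["green"]),
    "NDVI": ndvi,
}


def render(pixels, function_name):
    """Apply the named function on the fly; the source pixels are never modified."""
    function = FUNCTIONS[function_name]
    return [function(p) for p in pixels]
```

Because each product is computed at request time, adding a new product means adding a function, not storing another copy of the imagery.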


Share through Image server – with raster functions (processing templates)

• When Derived parent is updated, downstream functions are already applied

[Diagram: Source Imagery, Source Mosaic Datasets, Derived Mosaic Dataset, and a single multispectral image service with multiple server functions (RGB, CIR, NDVI, Pan sharpen, …many other functions).]


Finding MDCS and related resources

http://esriurl.com/ImageryWorkflows

• Landing page on Resource Center

• ArcGIS Online (AGOL) group with downloadable samples


Data Access Control

For Rasters


Data Access Control Options

• Case 1 : Public

- Make the data public and give everyone read and list access. Anyone can read the data and list all the files, but not write.

- (https://aws.amazon.com/articles/5050/)

• Case 2 : Obfuscated

- Set the bucket policy to allow reading the data but not listing it. The user must know the path; otherwise the data cannot be accessed.

- (http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-2)

• Case 3 : VPC

- Give permission only to machines within a specific VPC (Virtual Private Cloud). Use the bucket policy to allow both list and read access. All requests coming from machines within this VPC have access permission.

- (http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies-vpc-endpoint.html)

• Case 4 : Restricted IP

- Give permission to a specific IP address; only requests coming from that IP address are allowed.

- (http://docs.aws.amazon.com/AmazonS3/latest/dev/example-bucket-policies.html#example-bucket-policies-use-case-3)

• Case 5 : IAM Role

- Access credentials are updated by the Amazon service; the access key provides the access to S3. These keys have to be stored as environment variables.

- Requires use of /vsis3/ rather than /vsicurl/
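Cases 1 to 4 differ mainly in which actions the bucket policy allows and under which condition. The sketch below builds such policy documents as Python dictionaries. The function name and bucket name are hypothetical; the action names and the `aws:SourceIp` and `aws:sourceVpce` condition keys follow the AWS examples linked above. Treat the output as a starting point to review against the AWS documentation, not as a vetted policy.

```python
import json


def read_only_policy(bucket, allow_list=False, source_ip=None, vpc_endpoint=None):
    """Build an S3 bucket policy dict granting read (and optionally list) access."""
    grants = [("s3:GetObject", f"arn:aws:s3:::{bucket}/*")]
    if allow_list:
        # Listing applies to the bucket itself, not to the objects in it.
        grants.append(("s3:ListBucket", f"arn:aws:s3:::{bucket}"))

    condition = {}
    if source_ip:
        condition["IpAddress"] = {"aws:SourceIp": source_ip}
    if vpc_endpoint:
        condition["StringEquals"] = {"aws:sourceVpce": vpc_endpoint}

    statements = []
    for action, resource in grants:
        statement = {
            "Effect": "Allow",
            "Principal": "*",
            "Action": action,
            "Resource": resource,
        }
        if condition:
            statement["Condition"] = condition
        statements.append(statement)
    return {"Version": "2012-10-17", "Statement": statements}


# Case 2 (obfuscated): read without list. The bucket name is a placeholder.
obfuscated = json.dumps(read_only_policy("example-imagery"), indent=2)
```

Case 1 corresponds to `allow_list=True`, Case 3 to `allow_list=True` with a `vpc_endpoint`, and Case 4 to a `source_ip` such as a single address in CIDR form.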


Use of Obfuscation

• A simple alternative to common access control in the cloud

• Less maintenance of user names and passwords

• Reduced security overhead, which can otherwise slow down access

• The folders in which the files are stored are obfuscated so that it is not possible to guess the contents

• The bucket policy is set up so that users cannot query or get listings of the bucket content. Only users who know the URLs of the files can access them or share them

• The largest limitation of the file obfuscation method is that once a URL has been shared, access cannot easily be revoked for a single user
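One way to produce folder names that cannot be guessed is to prefix each key with a random token. This is a sketch of the general idea only, with a hypothetical function name; it is not necessarily how OptimizeRasters implements its obfuscation option.

```python
import secrets


def obfuscated_key(filename: str, nbytes: int = 16) -> str:
    """Place a file under a random, unguessable folder name."""
    token = secrets.token_urlsafe(nbytes)  # cryptographically strong random text
    return f"{token}/{filename}"
```

The mapping from original names to obfuscated keys has to be stored somewhere private (for example in the mosaic dataset), since the bucket itself cannot be listed to rediscover it.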
