Geospatial Analysis in the Cloud

Post on 16-May-2015

DESCRIPTION

Presented at the Government Cloud Service Oriented Architecture Workshop

TRANSCRIPT

Use of Cloud Computing for scalable geospatial data processing and access

Andrew Turner

CTO, FortiusOne

andrew@fortiusone.com

Partner: U.S. Federal Geographic Data Committee

What is GeoCommons? A Brief History

Vulnerability Identification

[Map labels: Chicago, Denver, Los Angeles, Atlanta; Route 1, Route 2; Fiber Density; Electric Transmission Line Density]

Baseline connectivity of a fiber network provider in NYC. This provider is a good proxy for the network structure of the entire island of Manhattan, since it holds about 80% of the rights-of-way on the island and a large number of egress points off the island. The higher the peak on the map, the more frequently that path is used as a possible routing path.

[Map labels: WTC, Holland Tunnel, Columbus Circle]

Lastly, a scenario is run in which just 10,000 sq ft of damage is done to the Holland Tunnel, and the impact is calculated. The result is an 8.6% loss of network connectivity, 134 times the impact of the WTC simulation. The dramatic impact is visible in the image, both in the loss itself and in the stress put on the GW Bridge route out of the city.
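The slides do not say how "loss of network connectivity" was measured. As a rough illustration only, the sketch below assumes one possible metric: the fraction of node pairs that can no longer reach each other once the damaged links are removed. The node names, the toy network, and the metric itself are assumptions and are not taken from the talk.

```python
from collections import deque

def reachable_pairs(nodes, edges):
    """Count unordered node pairs joined by at least one path."""
    adj = {n: set() for n in nodes}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen = set()
    pairs = 0
    for start in nodes:
        if start in seen:
            continue
        # Breadth-first search collects one connected component.
        size = 0
        queue = deque([start])
        seen.add(start)
        while queue:
            cur = queue.popleft()
            size += 1
            for nxt in adj[cur]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        pairs += size * (size - 1) // 2
    return pairs

def connectivity_loss(nodes, edges, damaged):
    """Fraction of previously connected pairs lost when `damaged` edges fail.

    `damaged` must list edges in the same orientation as `edges`.
    """
    before = reachable_pairs(nodes, edges)
    if before == 0:
        return 0.0
    remaining = [e for e in edges if e not in damaged]
    return (before - reachable_pairs(nodes, remaining)) / before

# Hypothetical four-node network: D hangs off a single link.
nodes = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D")]
loss_single_link = connectivity_loss(nodes, edges, {("C", "D")})
loss_redundant_link = connectivity_loss(nodes, edges, {("A", "B")})
```

In this toy network, cutting the only link to D removes half of the connected pairs, while cutting a link that has a redundant route removes none, which is the same kind of contrast the two scenarios above describe.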

GeoCommons: Version 1

Find interesting data

Map a relevant area

Visualize to find meaning

Layer, Modify, and Analyze

Collaborate with others

Publish and share results

Visualization

Analysis

Applying Lessons Learned

Modularize

Maker

Finder

Core

RESTful Interfaces

Application Programming Interface

Relational Databases Don’t Scale Well

Datasets as Databases

Maker

Finder

Core

KML

Shapefile

CSV (Excel)

GeoRSS

Documents

Upload

Parse & Store

Analyze

Download

Visualize
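A minimal sketch of the "Datasets as Databases" idea, assuming each uploaded file is parsed into its own small SQLite database rather than into one shared relational schema. The function name, the table name, and the sample rows are hypothetical; the slides do not show how parsing and storage were actually implemented.

```python
import csv
import io
import sqlite3

def dataset_to_db(csv_text, table="features"):
    """Parse CSV text and store it in its own in-memory SQLite database."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        raise ValueError("dataset has no rows")
    columns = list(rows[0].keys())
    db = sqlite3.connect(":memory:")
    # Every column is stored as TEXT; a real parser would infer types.
    col_defs = ", ".join(f'"{c}" TEXT' for c in columns)
    db.execute(f'CREATE TABLE "{table}" ({col_defs})')
    placeholders = ", ".join("?" for _ in columns)
    db.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [[row[c] for c in columns] for row in rows],
    )
    db.commit()
    return db

# Upload -> Parse & Store -> Analyze, on made-up sample data.
sample = "name,lat,lon\nDenver,39.74,-104.99\nChicago,41.88,-87.63\n"
db = dataset_to_db(sample)
row_count = db.execute("SELECT COUNT(*) FROM features").fetchone()[0]
north_of_40 = db.execute(
    "SELECT name FROM features WHERE CAST(lat AS REAL) > 40"
).fetchall()
```

Because each dataset lives in its own database, one dataset can be copied, cached, or dropped without touching any other, which is one plausible reading of why the slide pairs this with "Relational Databases Don't Scale Well".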

Geospatial Catalog and Server

Delivery Mechanisms

Appliances

• Sun 4150

• RAID Array

Web Scaled Racks

• 3 Appliances

• Network File Storage

• Load Balancer

• Monitoring and Tunnels

• Production & Staging racks

• Racks in office for development

Limits in Scaling

• People

• Power

• Size

• Cost

• Time

Limits in Development

• Testing on “clean” machines

• Deployment testing of upgrades

• Controlled Environments

Leveraging the Cloud

Photo: http://www.flickr.com/photos/kky/704056791

Amazon Web Services

Management Consoles

Processing via MapReduce

Launching New Instances

Elastic Compute Cloud - EC2

• Virtual Servers

• Machine Images (AMI)

• On-Demand

CentOS AMI: build, bundle, register, instantiate

Elastic Block Store - EBS

Create EBS (100 GB)

attach

snapshot

S3: Diff v1, Diff v2

Create & Attach
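The "Diff v1" and "Diff v2" labels suggest incremental snapshots, where each snapshot stores only the blocks that changed since the previous one. The sketch below is a toy model of that general idea using an in-memory list of blocks. It is not how EBS is implemented, and all names in it are hypothetical.

```python
def take_snapshot(volume, previous_state):
    """Return only the blocks that differ from the previously captured state."""
    return {
        index: block
        for index, block in enumerate(volume)
        if index >= len(previous_state) or previous_state[index] != block
    }

def restore(diffs, size):
    """Rebuild a volume by replaying diffs from oldest to newest."""
    volume = [b""] * size
    for diff in diffs:
        for index, block in diff.items():
            volume[index] = block
    return volume

# A four-block "volume"; the first snapshot has to store every block.
volume = [b"a", b"b", b"c", b"d"]
diff_v1 = take_snapshot(volume, [])

# Change one block, then snapshot again: only that block is stored.
volume[2] = b"C"
diff_v2 = take_snapshot(volume, restore([diff_v1], len(volume)))

# "Create & Attach": a new volume is rebuilt from the stored diffs.
restored = restore([diff_v1, diff_v2], len(volume))
```

The second snapshot is much smaller than the first, which is why frequent snapshots of a large, mostly static volume stay cheap to store.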

Public Datasets

Additional Benefits

• Federation

• Tile generation

• Content-delivery System

• Simple Queue Service (SQS)

tiles/openstreetmap/9/74/97.png

tiles/openstreetmap/9/74/98.png

tiles/bluemarble/9/74/97.png

tiles/bluemarble/9/74/98.png

S3 Storage
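The tile paths above look like storage keys of the form `tiles/<layer>/<zoom>/<x>/<y>.png`. A small sketch of building such keys follows, assuming the three numbers are zoom, column, and row in the common web-map tiling scheme (the slides do not state the order). The function name is hypothetical.

```python
def tile_key(layer, zoom, x, y, ext="png"):
    """Build a storage key for one map tile, assuming zoom/x/y ordering."""
    tiles_per_side = 2 ** zoom  # a zoom level z has 2^z tiles along each axis
    if not (0 <= x < tiles_per_side and 0 <= y < tiles_per_side):
        raise ValueError(f"tile {x},{y} is outside zoom level {zoom}")
    return f"tiles/{layer}/{zoom}/{x}/{y}.{ext}"

# Keys for the same two tiles in two different base layers.
keys = [
    tile_key(layer, 9, 74, row)
    for layer in ("openstreetmap", "bluemarble")
    for row in (97, 98)
]
```

Because the key is fully determined by layer and tile coordinates, pre-rendered tiles can be served straight from object storage or a CDN without a lookup in the application.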

Cloud Architecture

• EC2 image of current system architecture

• EBS image stored to S3 of default database

• Current application release in S3

• Start an EC2, attach data, attach code, startup

[Diagram labels: create instance; attach data; Default Datasets; v1.4.3; Backup; Snapshot; Cache Downloads; S3]
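The "Start an EC2, attach data" steps can be sketched with present-day tooling. The function below assumes a boto3-style EC2 client is passed in; boto3 postdates this talk, so this is an illustration rather than the tooling the presenter used. The instance type, device name, and all identifiers are placeholders, and the "attach code" step (fetching the application release from S3) is left out. A small recording stand-in is included so the example runs offline.

```python
def launch_stack(ec2, image_id, snapshot_id, zone,
                 instance_type="t3.micro", device="/dev/sdf"):
    """Start an instance from a machine image and attach a data volume
    created from a snapshot. `ec2` is expected to behave like
    boto3.client("ec2")."""
    reservation = ec2.run_instances(
        ImageId=image_id,
        InstanceType=instance_type,  # placeholder size, an assumption
        MinCount=1,
        MaxCount=1,
        Placement={"AvailabilityZone": zone},
    )
    instance_id = reservation["Instances"][0]["InstanceId"]

    # The volume must be created in the same zone as the instance.
    volume = ec2.create_volume(SnapshotId=snapshot_id, AvailabilityZone=zone)
    volume_id = volume["VolumeId"]

    # Both resources need to be ready before the attach call succeeds.
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    ec2.get_waiter("volume_available").wait(VolumeIds=[volume_id])
    ec2.attach_volume(VolumeId=volume_id, InstanceId=instance_id, Device=device)
    return instance_id, volume_id


class FakeEC2:
    """Offline stand-in that records calls; swap in boto3.client("ec2")."""

    def __init__(self):
        self.calls = []

    def run_instances(self, **kwargs):
        self.calls.append("run_instances")
        return {"Instances": [{"InstanceId": "i-example"}]}

    def create_volume(self, **kwargs):
        self.calls.append("create_volume")
        return {"VolumeId": "vol-example"}

    def get_waiter(self, name):
        self.calls.append(f"wait:{name}")
        return self

    def wait(self, **kwargs):
        pass

    def attach_volume(self, **kwargs):
        self.calls.append("attach_volume")
        return {}


fake = FakeEC2()
ids = launch_stack(fake, "ami-example", "snap-example", "us-east-1a")
```

Passing the client in as a parameter keeps the sequence testable without credentials, and the same function can be reused for each private instance that needs its own copy of the default datasets.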

Scaling

• RESTful architecture

• Caching for speed, and CDN support

• Amazon Web Services

• CloudWatch

• Elastic Scaling

• Load Balancer

Private Instances

First Users: Meedan, Media

Repeatable

Data Federation

community

Geospatial Federated Search

Geocoding

Geocoding - Scale as Required

TIGER/Line

SQLite

Geocoding Engine

API

Upload CSV

Geocode

Cache Results
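The Upload CSV, Geocode, Cache Results flow can be sketched as follows. This assumes an exact-match lookup against a small SQLite table as a stand-in for a TIGER/Line-derived database; a real engine would interpolate along street address ranges. The table schema, the addresses, and the coordinates are made up for illustration.

```python
import csv
import io
import sqlite3
from functools import lru_cache

# Stand-in reference data; the rows are invented, not real TIGER/Line records.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE addresses (address TEXT PRIMARY KEY, lat REAL, lon REAL)")
db.executemany(
    "INSERT INTO addresses VALUES (?, ?, ?)",
    [("100 example st", 40.0, -75.0), ("200 sample ave", 41.0, -87.0)],
)

lookups = 0  # counts database hits so the effect of the cache is visible

def normalize(address):
    """Lowercase and collapse whitespace so equivalent inputs share a cache entry."""
    return " ".join(address.lower().split())

@lru_cache(maxsize=10_000)
def geocode(address):
    """Return (lat, lon) for a normalized address, or None if there is no match."""
    global lookups
    lookups += 1
    return db.execute(
        "SELECT lat, lon FROM addresses WHERE address = ?", (address,)
    ).fetchone()

def geocode_csv(csv_text, column="address"):
    """Geocode every row of an uploaded CSV, adding lat and lon fields."""
    results = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        match = geocode(normalize(record[column]))
        lat, lon = match if match else (None, None)
        results.append({**record, "lat": lat, "lon": lon})
    return results

uploaded = "address\n100 Example St\n100  EXAMPLE ST\n999 Nowhere Rd\n"
results = geocode_csv(uploaded)
```

The first two rows normalize to the same address, so the second one is answered from the cache, and misses are cached as well. Since the lookup is read-only, additional engine instances can be started behind the API as load requires.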

Best Practices Applied to the Government

• Built using open, established tools

• Full choice - Linux, Windows

• Full Control

• Repeatable processes

• Continual backup

• Scaling dynamic and large datasets

• Synchronous and Asynchronous analysis

Level of Maturity

• Widely adopted

• Broad support and ecosystem

• Full stack support

Perceived Impediments to Adoption

• Single Vendor (open-source alternatives arising)

• Maintenance and Location

• Data Security
