AWS Summit Auckland - Big Data & Analytics - End to End on AWS

© 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved.

Russell Nash, Solutions Architect, Amazon Web Services

Peter McCallum, Head of Data and Insights, Qrious

Big Data & Analytics End to End on AWS

Technical 101

Upload: amazon-web-services

Post on 20-Jan-2017



Session Depth: Business, 101 Technical, 201 Technical, 301 Technical, 401 Technical

Ingest Store Process Analyse

[Diagram: INGEST and STORE stages. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data. Store: Analytics Database.]

Amazon Redshift

MPP SQL Database

Optimised for Analytics

Gigabytes to Petabytes

Fully relational

Amazon Redshift

“When our analysts first started to do queries on Amazon Redshift they thought it was broken because it was working so fast”

- FT CTO John O’Donovan

[Diagram: INGEST and STORE stages, with ETL feeding Amazon Redshift. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

[Diagram: AWS Database Migration Service moving data from a Source Database into Amazon Redshift.]

[Diagram: INGEST and STORE stages. Store: Amazon Redshift, Amazon Elasticsearch (Search). Other labels: ETL, Storage, Application. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

[Diagram: INGEST and STORE stages. Store: Amazon Redshift, Amazon Elasticsearch, Amazon S3 (Storage). Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]
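With Amazon Redshift and Amazon S3 both in the STORE stage, one common way to move files from S3 into Redshift is the SQL COPY command. The sketch below only builds the statement text; the table name, bucket, and IAM role ARN are hypothetical placeholders, and it assumes CSV files with a single header row.

```python
def build_copy_statement(table: str, s3_uri: str, iam_role_arn: str) -> str:
    """Return a Redshift COPY statement that loads CSV files from S3.

    All argument values used below are made-up examples, not from the slides.
    """
    if not s3_uri.startswith("s3://"):
        raise ValueError("s3_uri must start with s3://")
    return (
        f"COPY {table}\n"
        f"FROM '{s3_uri}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV\n"
        "IGNOREHEADER 1;"
    )


# Hypothetical table, bucket and role.
statement = build_copy_statement(
    table="web_logs",
    s3_uri="s3://example-bucket/logs/2016/",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole",
)
print(statement)
```

The resulting text would be run through any SQL client connected to the cluster; executing it is left out so the sketch stays self-contained.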

Amazon S3

Object Storage

Low Cost

Highly Scalable

11 9’s of durability
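To put "11 9's" in perspective, the arithmetic can be worked through directly. This is a rough sketch that assumes the figure is read as 99.999999999% durability per object per year, and uses a hypothetical count of 10 million stored objects.

```python
from fractions import Fraction

# 99.999999999% ("11 9's"), read here as per object, per year (an assumption).
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)
annual_loss_probability = 1 - DURABILITY


def expected_objects_lost_per_year(object_count: int) -> Fraction:
    """Average number of objects lost per year at the stated durability."""
    return object_count * annual_loss_probability


stored_objects = 10_000_000  # hypothetical object count
expected_loss = expected_objects_lost_per_year(stored_objects)
years_per_lost_object = 1 / expected_loss

print(f"Expected objects lost per year: {float(expected_loss)}")
print(f"Average years per single lost object: {int(years_per_lost_object)}")
```

Under these assumptions, 10 million objects work out to an average of one lost object every 10,000 years.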

[Diagrams: Amazon EMR running Impala and Pig, first on HDFS, then on Amazon S3 through EMRFS.]

Instance Types

CPU: c3 family, cc1.4xlarge, cc2.8xlarge

Memory: m2 family, r3 family

Disk/IO: d2 family, i2 family

General: m1 family, m3 family

Workloads: Batch process, Machine learning, Spark and interactive, Large HDFS

Cost & Time

[Figure: two charts of # CPUs against Time. Wall clock time: 1 hour. Wall clock time: 10 hours.]
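One reading of the two charts is that the same amount of work can be spread over more CPUs for less time at the same cost. The sketch below assumes a job that parallelises perfectly, hypothetical cluster sizes of 10 and 100 nodes, and a hypothetical rate of $1 per node-hour; none of these numbers come from the slides.

```python
def cluster_cost(node_count: int, hours: int, rate_per_node_hour: int) -> int:
    """Total cost of running node_count nodes for the given number of hours."""
    return node_count * hours * rate_per_node_hour


RATE = 1  # hypothetical: dollars per node-hour

# Hypothetical cluster sizes; only the 10 hour and 1 hour wall clock
# times are taken from the slide.
small_cluster_cost = cluster_cost(node_count=10, hours=10, rate_per_node_hour=RATE)
large_cluster_cost = cluster_cost(node_count=100, hours=1, rate_per_node_hour=RATE)

print(f"10 nodes for 10 hours: ${small_cluster_cost}")
print(f"100 nodes for 1 hour:  ${large_cluster_cost}")
```

Both runs consume 100 node-hours, so under these assumptions the cost is identical while the answer arrives ten times sooner.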

Spot Price – M3.2XL

On-Demand: $0.75

Spot Price: $0.08
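The size of the discount can be worked out from the two prices on the slide. This sketch assumes the higher figure ($0.75) is the On-Demand price and the lower figure ($0.08) is the Spot price, both for the same instance type over the same billing period.

```python
from decimal import Decimal

# Assumed reading of the slide: On-Demand is the higher of the two prices.
on_demand_price = Decimal("0.75")
spot_price = Decimal("0.08")

savings_percent = round((on_demand_price - spot_price) / on_demand_price * 100, 1)
price_ratio = round(on_demand_price / spot_price, 1)

print(f"Spot saving versus On-Demand: {savings_percent}%")
print(f"On-Demand costs about {price_ratio}x the Spot price")
```

Spot prices vary over time, so the result describes only the snapshot shown on the slide.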

[Diagram: INGEST and STORE stages. Store: Amazon Redshift, Amazon Elasticsearch, Amazon S3, Amazon Kinesis (Stream Processor). Other labels: NoSQL. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

Kinesis

[Diagram: an Amazon Kinesis Stream spanning three Availability Zones. Inputs: Data Sources. Consumers: AWS Lambda, KCL App, EMR. Other labels: S3, DynamoDB, Redshift, Elasticsearch.]
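When data sources write to a Kinesis stream, each record carries a partition key, and Kinesis maps the MD5 hash of that key (a 128-bit integer) onto the hash key range of a shard. The sketch below illustrates that mapping locally; it assumes the shards split the hash space evenly, and the function names and device keys are hypothetical.

```python
import hashlib

HASH_SPACE = 2 ** 128  # MD5 produces a 128-bit value


def hash_key(partition_key: str) -> int:
    """Map a partition key to a 128-bit integer using MD5."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def shard_for(partition_key: str, shard_count: int) -> int:
    """Index of the shard owning the key, assuming evenly split hash ranges."""
    return hash_key(partition_key) * shard_count // HASH_SPACE


# Hypothetical device identifiers used as partition keys.
for key in ["device-1", "device-2", "device-3"]:
    print(key, "-> shard", shard_for(key, shard_count=4))
```

Records sharing a partition key always land on the same shard, which is what keeps per-key ordering intact for the consumers.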

[Diagram: INGEST and STORE stages. Store: Amazon Redshift, Amazon Elasticsearch, Amazon S3, Amazon Kinesis, Amazon DynamoDB (NoSQL). Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

NoSQL

[Figure: scalability against data complexity for key/value, document, graph and RDBMS.]

Amazon DynamoDB

NoSQL Database

Key/Value + Document

Very Low Latency

Fully Managed

Low Latency

DynamoDB
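The "Key/Value + Document" model can be illustrated with the typed attribute format used by DynamoDB's low-level API, where each value is tagged with a type such as S (string), N (number, sent as a string), L (list) or M (map). The converter below is a hand-written sketch for illustration only, and the item contents are made up.

```python
def to_attribute_value(value):
    """Convert a plain Python value to DynamoDB's typed attribute format."""
    if value is None:
        return {"NULL": True}
    if isinstance(value, bool):  # checked before int, since bool is an int subclass
        return {"BOOL": value}
    if isinstance(value, int):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"Unsupported type: {type(value).__name__}")


# Hypothetical item: a key plus a nested document.
item = {
    "device_id": "sensor-17",
    "reading": 21,
    "tags": ["lab", "roof"],
    "location": {"city": "Auckland"},
}
typed_item = {name: to_attribute_value(value) for name, value in item.items()}
print(typed_item)
```

Here `device_id` plays the role of the key, while `tags` and `location` show the nested document side of the model.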

[Diagram: INGEST, STORE and PROCESS stages. Store: Amazon Redshift, Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB. Process: Interactive, Batch, Streaming, with Impala, Amazon Redshift, and Hadoop on Amazon EMR. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

[Diagram: INGEST, STORE and PROCESS stages. Store: Amazon Redshift, Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB. Process: Interactive, Batch, Streaming, with Impala, Amazon Redshift, Pig, and Hadoop on Amazon EMR. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

FAST

RICH

FLEXIBLE

INDUSTRY SUPPORT

[Diagram: INGEST, STORE and PROCESS stages. Store: Amazon Redshift, Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB. Process: Interactive, Batch, Streaming, with Impala, Amazon Redshift, Pig, Hadoop on Amazon EMR, AWS Lambda, and Kinesis Consumers. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]
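AWS Lambda appears here as one of the consumers of a Kinesis stream. When Lambda is invoked from a stream, each record's payload arrives base64-encoded under `Records[].kinesis.data`. The handler below is a minimal sketch that assumes producers send JSON payloads; the sample event is hand-built for a local check and its contents are hypothetical.

```python
import base64
import json


def handler(event, context):
    """Decode the Kinesis records in a Lambda event and return their payloads."""
    payloads = []
    for record in event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        payloads.append(json.loads(raw))  # assumes JSON payloads
    return payloads


# Hand-built event for a local check (hypothetical contents).
sample_event = {
    "Records": [
        {
            "kinesis": {
                "partitionKey": "device-42",
                "data": base64.b64encode(json.dumps({"temp": 21}).encode()).decode(),
            }
        }
    ]
}
result = handler(sample_event, None)
print(result)
```

A real function would typically write the decoded payloads on to a store such as DynamoDB or S3 rather than return them.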

[Diagram: INGEST, STORE, PROCESS and ANALYSE stages. Store: Amazon Redshift, Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB. Process: Interactive, Batch, Streaming, with Impala, Amazon Redshift, Pig, Hadoop on Amazon EMR, AWS Lambda, and Kinesis Consumers. Analyse: Amazon QuickSight. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

[Diagram: INGEST, STORE, PROCESS and ANALYSE stages. Store: Amazon Redshift, Amazon Kinesis, Amazon S3, Amazon Elasticsearch, Amazon DynamoDB. Process: Interactive, Batch, Streaming, with Impala, Amazon Redshift, Pig, Hadoop on Amazon EMR, AWS Lambda, and Kinesis Consumers. Analyse: Amazon QuickSight, Amazon Machine Learning, MLlib. Sources: Databases, Web Servers, App Servers, Devices, Mobile. Data: Database Data, Log Data, High Velocity Data.]

AWS Training & Certification

Intro Videos & Labs: Free videos and labs to help you learn to work with 30+ AWS services – in minutes!

Training Classes: In-person and online courses to build technical skills – taught by accredited AWS instructors

Online Labs: Practice working with AWS services in a live environment – learn how related services work together

AWS Certification: Validate technical skills and expertise – identify qualified IT talent or show you are AWS cloud ready

Learn more: aws.amazon.com/training

Your Training Next Steps:

Visit the AWS Training & Certification pod to discuss your training plan & AWS Summit training offer

Register & attend AWS instructor led training

Get Certified

AWS Certified? Visit the AWS Summit Certification Lounge to pick up your swag

Learn more: aws.amazon.com/training

Thank you!