CIT 668: System Architecture Amazon Web Services

Upload: lamnhan

Post on 07-Mar-2018


Topics

1. AWS Global Infrastructure
2. Foundation Services
   1. Compute
   2. Storage
   3. Database
   4. Network
3. AWS Economics

Amazon Services Architecture

Regions and Availability Zones

AWS resources are either
– Global
– Tied to a region
– Tied to an availability zone

Regions are completely isolated from each other. Availability zones are data centers within a region.

https://aws.amazon.com/about-aws/globalinfrastructure/

Edge Locations

Content delivery network (CDN)
– Goal: serve content with low latency and high availability.
– Solution: cache content in multiple geographically distributed data centers so that a data center is near each user.
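The caching idea can be sketched in a few lines of Python. This is an illustrative model only, not the API of any AWS service: `EdgeCache`, `nearest_edge`, the edge names, and the coordinates are all hypothetical, and straight-line distance in degrees is used as a crude stand-in for network latency.

```python
import math


class EdgeCache:
    """Hypothetical edge location that caches content fetched from an origin."""

    def __init__(self, name, lat, lon, origin):
        self.name = name
        self.lat = lat
        self.lon = lon
        self.origin = origin        # dict standing in for the origin server
        self.cache = {}
        self.origin_fetches = 0

    def get(self, path):
        # Serve from the local cache when possible; otherwise fetch from the origin once.
        if path not in self.cache:
            self.cache[path] = self.origin[path]
            self.origin_fetches += 1
        return self.cache[path]


def nearest_edge(edges, lat, lon):
    """Pick the edge closest to the user (assumption: distance approximates latency)."""
    return min(edges, key=lambda e: math.hypot(e.lat - lat, e.lon - lon))


origin = {"/video.mp4": b"...bytes..."}
edges = [
    EdgeCache("us-east", 39.0, -77.5, origin),
    EdgeCache("eu-west", 53.3, -6.3, origin),
]
edge = nearest_edge(edges, 48.9, 2.4)   # a user near Paris
edge.get("/video.mp4")                  # first request goes to the origin
edge.get("/video.mp4")                  # second request is served from the cache
```

Only the first request for a path reaches the origin; later requests from nearby users are answered by the edge.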

[Figures: Traditional Content Delivery vs. Content Delivery Network; Edge Locations]

Foundation Services

• Compute • Storage • Database • Networking

Compute

Elastic Compute Cloud (EC2)
– Create virtual servers in the cloud in seconds.
– Set up with any OS and software.
– Manage with administrative access.

Auto Scaling
– Create and remove EC2 instances based on triggers.
– Time- and date-based triggers.
– Resource-based triggers.
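A resource-based trigger can be sketched as a simple threshold rule. This is a minimal sketch, not the Auto Scaling API: the function name `desired_capacity`, the 70%/30% CPU thresholds, and the instance limits are hypothetical values chosen for illustration.

```python
def desired_capacity(current, cpu_percent, minimum=1, maximum=10,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return the new instance count after one evaluation of a CPU-based trigger."""
    if cpu_percent > scale_out_above:
        target = current + 1      # add an instance under heavy load
    elif cpu_percent < scale_in_below:
        target = current - 1      # remove an instance when mostly idle
    else:
        target = current          # inside the comfortable band: no change
    # Never go below the minimum or above the maximum group size.
    return max(minimum, min(maximum, target))
```

A time-based trigger would follow the same shape, with the clock rather than CPU load as the input.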

Amazon EC2 is

• A web service that enables you to launch and manage server instances

• Designed to make web-scale computing easier for developers.

• A simple web service interface that provides programmable control of your cloud resources

EC2 Features

• Elastic: instantiate one to thousands of server instances, either manually or automatically.
• Flexible: choice of multiple instance types, operating systems, and software packages.
• Available: SLA commitment of 99.95% availability in each region.
• Pay as you go: pay for resources as you need them; reserved instances offer lower pricing for longer commitments.

Amazon Machine Image

Virtual root disk image
– Contains OS
– Contains most applications

Start a VM by booting an AMI
– This creates an instance

Catalog of pre-built AMIs
– OS: Linux (many distros), OpenSolaris, Windows
– Software: Apache, MySQL, Oracle, WordPress, etc.
– Available at http://aws.amazon.com/amis

Instance

• An instance is a VM running the OS and software on an AMI.

• You can launch many instances of the same AMI.

• Other users can launch instances of that AMI too.

• Each instance is a separate and independent virtual server.

EC2 Instance Types

General Purpose
– Balanced compute, memory, network resources.
– Useful for typical server applications.

Compute Optimized
– Many vCPUs with lowest cost per vCPU.
– High traffic web apps, video encoding, analytics.

GPU Instances
– Provide access to GPUs with hundreds of CUDA cores.
– Gaming and 3D graphics.

Memory Optimized
– High memory with lowest cost per GiB of RAM.
– Databases and distributed caches.

Storage Optimized
– Large storage and high speed storage (SSD) versions.
– Large databases and fileservers.

General Purpose (m1) Instance Types

Instance      RAM       Virtual cores  EC2 Compute Units  Instance storage  Platform
Small         1.7 GiB   1              1                  160 GB            32-bit or 64-bit
Large         7.5 GiB   2              4                  2 x 420 GB        64-bit
Extra Large   15 GiB    4              8                  4 x 420 GB        64-bit

1 EC2 Compute Unit = early-2006 1.7 GHz Xeon CPU

Access Identifiers

AWS uses a set of different access identifiers
– Use public key cryptography
– Public identifier kept on the service or on the instance
  • Can be shared with anyone
– Private identifier kept on your PC
  • Must be kept secret

Elastic Block Store Volume

• An addressable virtual disk
• Can be attached to an instance
– Format
– Mount
– Store files

• Volumes have a lifetime independent of the instance
– Disk storage persists even if the instance is terminated

Block Device Mapping

Map system devices to AWS block storage.
– VM Device Name
– AWS Volume ID
– Status
– Timestamp
– DeleteOnTermination
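The fields listed above can be pictured as one record per attached device. The sketch below is an illustrative data structure, not the format returned by any AWS API; the class name, field names, device names, and volume IDs are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class BlockDeviceMapping:
    device_name: str             # device as seen inside the VM
    volume_id: str               # AWS-side volume identifier
    status: str                  # attachment status
    attach_time: datetime        # when the volume was attached
    delete_on_termination: bool  # whether the volume is deleted with the instance


def volumes_surviving_termination(mappings):
    """Volume IDs that remain after the instance is terminated."""
    return [m.volume_id for m in mappings if not m.delete_on_termination]


now = datetime(2014, 1, 1, tzinfo=timezone.utc)
mappings = [
    BlockDeviceMapping("/dev/sda1", "vol-11111111", "attached", now, True),
    BlockDeviceMapping("/dev/sdf", "vol-22222222", "attached", now, False),
]
```

Here only the second (hypothetical) data volume would outlive the instance, because its DeleteOnTermination flag is false.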

Security Group

• A Security Group defines the set of permitted inbound connections for an instance.
– Each group is a named access control list.
– Entries specify allowed protocols, ports, and IPs.
– Essentially a firewall.

• A single Security Group can be applied to multiple instances.

• Multiple Security Groups can be applied to a single instance.
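The access-control-list behavior can be sketched with Python's standard `ipaddress` module. This is a simplified model, not EC2's implementation: the `Rule` class, `is_allowed` function, group contents, and addresses are hypothetical, and anything not explicitly permitted is denied.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    protocol: str    # e.g. "tcp" or "udp"
    from_port: int
    to_port: int
    cidr: str        # allowed source range


def is_allowed(groups, protocol, port, source_ip):
    """True if any rule in any group applied to the instance permits the connection."""
    addr = ipaddress.ip_address(source_ip)
    for rules in groups:
        for rule in rules:
            if (rule.protocol == protocol
                    and rule.from_port <= port <= rule.to_port
                    and addr in ipaddress.ip_network(rule.cidr)):
                return True
    return False     # no matching entry: inbound connection is refused


web = [Rule("tcp", 80, 80, "0.0.0.0/0"), Rule("tcp", 443, 443, "0.0.0.0/0")]
admin = [Rule("tcp", 22, 22, "203.0.113.0/24")]
```

Passing `[web, admin]` models an instance with two groups applied; passing `[web]` to several calls models one group shared by many instances.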

S3 and EBS Instance Lifecycles

http://shlomoswidler.com/2009/07/ec2-instance-life-cycle.html

Data remains accessible if instance is rebooted or (EBS-only) stopped. Data cannot be recovered after an instance is terminated.
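The rule stated above can be written as a small state model. This is a toy sketch that follows the slide's statement only; the `Instance` class and its methods are hypothetical and ignore details such as separately attached volumes.

```python
class Instance:
    """Toy model of an instance's root-device data across state changes."""

    def __init__(self, backing):
        if backing not in ("s3", "ebs"):
            raise ValueError("backing must be 's3' or 'ebs'")
        self.backing = backing
        self.state = "running"
        self.data = {}

    def _require(self, state):
        if self.state != state:
            raise RuntimeError(f"instance is {self.state}, expected {state}")

    def reboot(self):
        # Data survives a reboot for both backing types.
        self._require("running")

    def stop(self):
        self._require("running")
        if self.backing != "ebs":
            raise RuntimeError("only EBS-backed instances can be stopped")
        self.state = "stopped"    # data is kept while stopped

    def start(self):
        self._require("stopped")
        self.state = "running"

    def terminate(self):
        self.state = "terminated"
        self.data = None          # data is no longer recoverable
```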

[Figures: S3-backed Instance life cycle; EBS-backed Instance life cycle]

S3 and EBS-backed Instance Differences

EC2 Resources

Persistent Resources

• Elastic IP Addresses
• Elastic Block Store Volumes
• Elastic Load Balancers
• Security Groups
• Amazon Machine Images

Ephemeral Resources

• Instances, including
– Instance memory state
– Instance disk state
– Non-elastic IP address
– DNS name

How can you maintain a running system if your servers are transient and unreliable?

AMI Types

• Public: AMIs made available by Amazon and the EC2 community.
• Private: AMIs that you own and create; may be developed from Public AMIs.
• Shared: AMIs built by developers and shared with the EC2 community.
• Paid: AMIs that you purchase or that come with a service contract from a company such as Red Hat.

Security Credentials

Credentials to Administer Instances
– AWS Management Console: Amazon account
– Query and Third Party UIs: Secret access key
– SOAP, EC2 CLI: X.509 certificate and private key

Credentials to Connect to an Instance
– Amazon EC2 key pair
– Windows administrator password

Credentials to Build Instances
– UNIX: X.509 certificate and private key
– Windows: Amazon account

Instance Network Addresses

EC2 instances are assigned 2 IPs at launch
– Private RFC 1918 IP address for internal use
– Public IP address NAT-mapped to private IP

EC2 instances are assigned 2 DNS names at launch
– Internal: resolves only inside EC2
– Public: associated with instance until stopped

Elastic IP addresses
– Static IP addresses you map to an instance
– Can keep and remap elastic IP addresses
– Charged only for allocated but unused elastic IPs

Using Tags

Can tag
– AMIs
– Instances
– EBS Volumes
– EBS Snapshots

but not
– Elastic IPs
– Key pairs
– Security groups

Storage

Elastic Block Store (EBS) provides
– Off-instance storage
– Persistence beyond instance lifetime
– High availability and reliability
– Attach and detach from a running instance
– Exposure as a device within an instance

Simple Storage Service (S3) provides
– Highly available and reliable storage for objects.
– Objects can be up to 5TB in size.
– Objects are accessible simply via a URL.

Amazon Glacier provides
– Cheap, reliable long-term backup with 24-hour turnaround.

Elastic Block Store (EBS)

EBS Volumes are up to 1TB in size
– Attach to any EC2 instance in same AZ
– Create snapshots at any time
– Create new volumes based on snapshots

Reliability
– Annual Failure Rate (AFR) of 0.1-0.5%
– Commodity hard disk AFR is ~4%
– About as reliable as a RAID set
– Use snapshots for backups
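To put the AFR figures in perspective, the arithmetic below compares a fleet of disks at each rate. It is a back-of-the-envelope sketch that assumes failures are independent, which real disks only approximate; the function names are hypothetical.

```python
def expected_failures(n_disks, afr):
    """Expected number of failed disks in one year."""
    return n_disks * afr


def p_any_failure(n_disks, afr):
    """Probability that at least one of n disks fails in a year (independence assumed)."""
    return 1 - (1 - afr) ** n_disks


EBS_AFR_HIGH = 0.005     # upper end of the 0.1-0.5% range quoted above
COMMODITY_AFR = 0.04     # ~4% for a commodity hard disk

ebs_expected = expected_failures(1000, EBS_AFR_HIGH)
commodity_expected = expected_failures(1000, COMMODITY_AFR)
```

Under these assumptions a fleet of 1000 commodity disks expects about eight times as many failures per year as 1000 EBS volumes at the worst quoted AFR, which is why snapshots are still recommended but failures are much rarer.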

Pricing per GB-month

EBS Snapshots

Snapshots saved to S3
– Not visible by S3 API.
– Snapshots are EBS volumes themselves.

Snapshots are fast
– Use Copy on Write (CoW), i.e., only blocks changed since the last snapshot need to be updated.
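The "only changed blocks" idea can be sketched as an incremental, block-level snapshot. This is a simplified illustration and not how EBS is implemented internally; the `Volume` class and its methods are hypothetical.

```python
class Volume:
    """Toy block device whose snapshots store only blocks changed since the last one."""

    def __init__(self, num_blocks):
        self.blocks = [b""] * num_blocks
        self.dirty = set(range(num_blocks))   # before the first snapshot, every block is new
        self.snapshots = []                   # each snapshot maps block index -> data

    def write(self, index, data):
        self.blocks[index] = data
        self.dirty.add(index)                 # remember the block for the next snapshot

    def snapshot(self):
        delta = {i: self.blocks[i] for i in self.dirty}
        self.snapshots.append(delta)
        self.dirty.clear()
        return len(self.snapshots) - 1        # snapshot id

    def restore(self, snap_id):
        # Replay the deltas in order up to and including the requested snapshot.
        blocks = [b""] * len(self.blocks)
        for delta in self.snapshots[: snap_id + 1]:
            for i, data in delta.items():
                blocks[i] = data
        return blocks


vol = Volume(4)
vol.write(0, b"a")
s0 = vol.snapshot()    # full copy: all 4 blocks
vol.write(1, b"b")
s1 = vol.snapshot()    # incremental: only block 1
```

The second snapshot is small because it holds a single block, yet restoring it still yields the full volume contents.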

http://blog.rightscale.com/2008/08/20/amazon-ebs-explained/

S3 Features

• An Internet-scale data storage service
– All data is stored redundantly in multiple AZs
– Data is located in the region you specify

• Stores objects from 1 byte to 5TB in size
• Objects are stored in a bucket and retrieved via a unique, developer-assigned URL
• You can have 100 named buckets
• Each bucket can store an unlimited number of objects in a flat namespace.
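The bucket, key, and URL relationship can be sketched as follows. The URL shown is one common (virtual-hosted) form; the bucket name and keys are hypothetical, and `list_prefix` only imitates how "folders" are simulated on top of a flat namespace.

```python
from urllib.parse import quote


def object_url(bucket, key):
    """Build a virtual-hosted-style URL for an object (one common S3 URL form)."""
    return f"https://{bucket}.s3.amazonaws.com/{quote(key)}"


def list_prefix(keys, prefix):
    """Keys are flat strings; a 'directory' is just a shared prefix."""
    return sorted(k for k in keys if k.startswith(prefix))


keys = ["media/intro video.mp4", "media/theme.mp3", "index.html"]
url = object_url("example-bucket", "media/intro video.mp4")
media = list_prefix(keys, "media/")
```

Slashes in a key carry no special meaning to the store itself; they are ordinary characters that tools choose to display as folders.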

Applications of S3

Fast, scalable, and reliable web file hosting – Especially useful for audio and video files

http://aws.amazon.com/articles/1073

Amazon Glacier

Cloud-based backup and long-term storage
– Durable: data stored on multiple devices at multiple sites.
– Cheap: as low as $0.01 (1¢) per GB-month.
– Slow: retrieval guaranteed within 24 hours; usually requires 3-5 hours.

Organize data in vaults.
– Store archives (up to 40TB) in vaults.
– Can have up to 1000 vaults.
– Jobs notify user of completion using Amazon SNS.

Databases

Amazon Relational Database Service (RDS)
– Managed relational database services.
– Access via standard database protocols and SQL.

Amazon SimpleDB
– Non-relational (NoSQL) flexible database service.
– Access via web service requests.
– Table size limited to 10GB.

Amazon DynamoDB
– Scalable NoSQL database service introduced in 2012.
– No table size limits; automatically partitions and scales.

Relational Database Service (RDS)

Users create their own DB instances
– DB types: MySQL, Oracle, PostgreSQL, MSSQL.
– Instance types with different CPU, RAM, storage.
– Can create replicated DB instances across AZs.

Amazon manages
– Software installation and updates.
– Backups.
– System administration.

SimpleDB

• Cloud-based non-relational (NoSQL) data store
• Data is stored in domains (tables)
– Tables limited to 10GB in size.
– Domains have a set of attributes (columns)
– Attributes can have up to 256 values
– Domains can have up to a billion items (rows)

• SimpleDB can be queried using a simple version of SQL via web service requests.
– Does not support JOIN operations

Attributes can be added Dynamically

[Figures: initial model for the person domain; effect of adding a Middle name attribute]
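The domain model, including multi-valued attributes and attributes added on the fly, can be imitated with nested dictionaries. This is an in-memory sketch rather than the SimpleDB web service API; the `Domain` class, item names, and attribute values are hypothetical.

```python
class Domain:
    """Toy SimpleDB-style domain: items hold attributes, attributes hold sets of values."""

    def __init__(self, name):
        self.name = name
        self.items = {}    # item name -> {attribute name -> set of string values}

    def put(self, item, attribute, value):
        # No schema: a new attribute simply appears on the item that uses it.
        self.items.setdefault(item, {}).setdefault(attribute, set()).add(value)

    def select(self, attribute, value):
        """Names of items whose attribute contains the value."""
        return sorted(name for name, attrs in self.items.items()
                      if value in attrs.get(attribute, ()))


people = Domain("person")
people.put("p1", "First", "Ana")
people.put("p1", "Last", "Lopez")
people.put("p1", "Phone", "555-0100")
people.put("p1", "Phone", "555-0101")   # a second value for the same attribute
people.put("p2", "First", "Bo")
people.put("p2", "Middle", "Lee")       # new attribute, added only where needed
```

Adding the Middle attribute to one item leaves every other item untouched, which is the effect the second figure illustrates.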

DynamoDB

Highly reliable and scalable key/value store.
– Stores associative arrays rather than tables.
– Keys can have multiple values.

Fast
– High throughput (built on SSDs).
– Very low latency (<10ms).
– Users reserve desired throughput.
– DynamoDB reconfigures itself to meet reservations.
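One way to picture "reconfigures itself to meet reservations" is hash partitioning sized to the reserved throughput. The sketch below is an assumption-laden illustration, not DynamoDB's actual algorithm: the per-partition capacity of 1000 units, the function names, and the use of SHA-256 are all hypothetical choices.

```python
import hashlib
import math


def partitions_needed(reserved_units, units_per_partition=1000):
    """Number of partitions required to serve the reserved throughput."""
    return max(1, math.ceil(reserved_units / units_per_partition))


def partition_for(key, num_partitions):
    """Map a key to a partition with a stable hash so load spreads evenly."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_partitions


n = partitions_needed(2500)          # a user reserves 2500 units
p = partition_for("user#42", n)      # partition that stores this key
```

Raising the reservation raises the partition count, and keys are then redistributed across the larger set.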

Networking

Virtual Private Cloud (VPC)
– Logically isolated segment of AWS cloud.
– Complete control over virtual network, including
– Subnets, routing tables, IP address ranges, etc.
– Security and privacy.

AWS Direct Connect
– Dedicated private 1 or 10Gbps connection to AWS.
– Available in about a dozen major data centers.
– Consistent latency and throughput.
– Lower data transfer pricing.
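The VPC notions of IP address ranges and subnets can be explored with Python's standard `ipaddress` module. The sketch below only does address arithmetic and does not call AWS; the 10.0.0.0/16 range and the public/private labels are hypothetical choices.

```python
import ipaddress

# Assumed address range for the whole virtual network.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the range into /24 subnets and pick two of them.
subnets = list(vpc.subnets(new_prefix=24))
public_subnet, private_subnet = subnets[0], subnets[1]


def subnet_of(ip):
    """Return the chosen subnet that contains the address, or None."""
    addr = ipaddress.ip_address(ip)
    for net in (public_subnet, private_subnet):
        if addr in net:
            return net
    return None
```

A routing table would then decide, per subnet, whether traffic may leave the isolated network.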

AWS Economics

AWS prices its resources based on
– Time: An hour of CPU time
– Volume: GB of transferred data
– Count: Number of messages queued
– Time and Space: GB-month of data storage

Billing is done at beginning of month
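The four pricing dimensions combine into a bill by simple multiplication and addition. The rates below are hypothetical placeholders chosen for easy arithmetic, not actual AWS prices.

```python
# Hypothetical unit prices in dollars, for illustration only.
RATES = {
    "instance_hours": 0.10,      # time: per instance-hour
    "gb_transferred": 0.12,      # volume: per GB of data transferred
    "messages": 0.0000005,       # count: per queued message
    "gb_month_stored": 0.05,     # time and space: per GB-month of storage
}


def monthly_bill(usage):
    """Sum rate x quantity over every metered dimension in the usage record."""
    return sum(RATES[name] * quantity for name, quantity in usage.items())


# One instance running all month (720 hours) plus 100 GB stored for the month.
bill = monthly_bill({"instance_hours": 720, "gb_month_stored": 100})
```

With these placeholder rates the example comes to 72 dollars of compute plus 5 dollars of storage.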

EC2 Instance Pricing

[Pricing tables: Linux Instances; Windows Instances]

Instance Pricing Options

Reserved Instances
– Reservations for 1 to 3 years.
– Price discounted by up to 65%.
– Instance type can change within instance class.
– Instances can be moved between AZs.

Spot Instances
– Bid on spare Amazon compute capacity.
– If bid exceeds current SpotPrice, you have an instance.
– Your instance runs until SpotPrice exceeds your bid.
– Useful for large computations whose results are not needed at a specific time.
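The spot bidding rule can be simulated hour by hour. This is a simplified sketch with made-up prices: it assumes the instance runs in every hour where the spot price does not exceed the bid and that each such hour is charged at the spot price rather than the bid; the function name is hypothetical.

```python
def simulate_spot(bid, hourly_spot_prices):
    """Return (hours run, total cost) for a bid against a series of hourly spot prices."""
    hours = 0
    cost = 0.0
    for price in hourly_spot_prices:
        if price <= bid:       # bid covers the current spot price: the instance runs
            hours += 1
            cost += price      # assumption: billed at the spot price, not the bid
        # otherwise the spot price exceeds the bid and the instance does not run
    return hours, cost


hours, cost = simulate_spot(0.05, [0.03, 0.04, 0.06, 0.02])
```

The interruption in the third hour is why spot capacity suits work whose results are not needed at a specific time.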

EC2 Communication Charges

Key Points

• AWS architecture
– Global infrastructure
– Foundation Services
– Application Services
– Management and Administration

• Foundation Services
– Compute: EC2, Auto Scaling
– Storage: EBS, S3
– Database: RDS, SimpleDB, DynamoDB
– Networking: VPC, Direct Connect

Key Points

• AMIs are virtual disk images
– A single AMI may have many instances

• Instances are running VMs
– Run in an AZ located in a region.
– Use keypair to access via ssh.

• On instance termination
– Local storage is lost except EBS volumes.
– DNS name and IP address are lost.
– Use elastic IPs or own DNS for permanent addresses.

• EC2 bills for time, data transfer, and storage.