AWS Certified Developer Exam Guide
January 2017
codebyamir.com
AWS Global Infrastructure: Overview, Regions, Availability Zones
IAM - Identity and Access Management: Overview, Console URL, Limits, Active Directory Federation, Web Identity Federation, Pricing
EC2 - Elastic Compute Cloud: Overview, Pricing Models, Instance Types, Instance Metadata, AMI, IAM Roles for EC2
EBS - Elastic Block Store: Overview, Volume types, Limits
VPC - Virtual Private Cloud: Overview, VPC Resources, Limits, Default VPC vs Custom VPC, NAT - Network Address Translation (NAT Instance, NAT Gateway), Network ACL, Security Groups vs Network ACLs
ELB - Elastic Load Balancing: Overview, Limits
S3 - Simple Storage Service: Overview, Encryption, Consistency Model, Object Structure, Storage Tiers, Static Web Hosting, Versioning, Transfer Acceleration, CORS - Cross Origin Resource Sharing, Snowball, Limits, Pricing
Glacier: Overview
CloudFront: Overview, Concepts
CloudFormation: Overview, Limits, Defaults
SNS - Simple Notification Service: Overview, Basics, Limits
SQS: Overview, Limits, Defaults
SWF - Simple Workflow Service: Overview
DynamoDB: Overview, Basics, Data model, Limits, Scan vs Query, Provisioned Throughput
Elastic Beanstalk: Overview, Limits
RDS - Relational Database Service: Overview, Limits
SDK & Tools: Overview
Other: Overview, HTTP Response Codes
AWS Global Infrastructure
Overview
Regions
A region is a geographic area where AWS resources exist, for example US East (N. Virginia), US West (Oregon), or EU (Ireland). Certain services may not be available in all regions, and pricing may vary between regions.
Availability Zones
An availability zone consists of one or more discrete data centers. Each region consists of two or more availability zones. Latency between availability zones in the same region is very low.
IAM - Identity and Access Management
Overview
IAM is global across all regions and consists of users, groups, and roles. Roles can be applied to both users and AWS services (EC2 instances, Lambda, etc.). As of this writing, a role can only be assigned to an EC2 instance at launch time; however, you can modify the role's permissions at any time. Policies are defined in JSON format and attached to users to determine which resources they may access. Each policy must have a Statement element.
Two types of user credentials:
● Programmatic access - Access/secret keys
● Management console - Password
An IAM user cannot be renamed from AWS console. It must be done from AWS CLI or SDK.
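The policy format mentioned in the overview can be illustrated with a small sketch. The bucket name and the choice of S3 read actions are assumptions made for the example; the only point taken from the text is that a policy is a JSON document with a Statement element.

```python
import json

def make_read_only_policy(bucket_name):
    """Build an illustrative IAM policy granting read access to one S3 bucket."""
    return {
        "Version": "2012-10-17",
        # Every policy needs a Statement element; it holds one or more rules.
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:ListBucket"],
                "Resource": [
                    f"arn:aws:s3:::{bucket_name}",
                    f"arn:aws:s3:::{bucket_name}/*",
                ],
            }
        ],
    }

# "example-bucket" is a hypothetical bucket name.
policy = make_read_only_policy("example-bucket")
print(json.dumps(policy, indent=2))
```

The resulting JSON text is what would be attached to a user, group, or role.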
Console URL
An IAM user cannot use the root account login URL (http://aws.amazon.com/console) to access the AWS management console. The IAM login URL includes the account number or alias:
https://codebyamir.signin.aws.amazon.com/console/
https://1234567890.signin.aws.amazon.com/console/
Limits
● Groups a user can be a member of: 10
● Access keys assigned to a user: 2
● MFA devices in use by a user: 1
● MFA devices in use by the AWS root account: 1
Active Directory Federation
Users can log in to the AWS management console with Active Directory credentials using SAML (Security Assertion Markup Language). The AWS endpoint for SAML is https://signin.aws.amazon.com/saml
1. The user authenticates to the corporate ADFS site.
2. The user receives a SAML token.
3. The browser posts the SAML token to the AWS endpoint.
4. Behind the scenes, the endpoint requests a temporary security token (AssumeRoleWithSAML) and then constructs a sign-in URL.
5. The sign-in URL is sent to the browser and the user is redirected to the console.
Web Identity Federation
Users can also access AWS resources using other web logins such as Facebook, Google, or Amazon. The flow can be tried out through the Web Identity Federation Playground:
1. Authenticate with Identity Provider
a. When you sign in with Facebook, it triggers an event and you receive an object that contains the access token. You can take this access token and proceed with Step 2.
2. Obtain Temporary Security Credentials
a. Now that you have a token, you can obtain temporary security credentials by making an AssumeRoleWithWebIdentity request. You will assume a role that has been created for you.
3. Access AWS resource
a. You can now make calls to AWS resources using your temporary security credentials (Secret Access Key, Access Key ID, and Session Token), with permissions defined by the role's access policy.
Pricing
There is no cost associated with creating IAM users, groups, and roles.
EC2 - Elastic Compute Cloud
Overview
Provides scalable compute capacity in the cloud. Allows provisioning of instances within minutes.
Pricing Models
● On-demand: fixed rate per hour with no commitment
○ Test/Dev environments
○ Supplement reserved instances - Black Friday traffic spikes
● Reserved: discounted hourly charge with 1 or 3 year terms
○ Production environments that are going to be around for a while
● Spot: bid whatever price you want for instance capacity, good for workflows that can withstand interruption
○ Research computing workflows - flexible start and end times
○ Spot instances may be terminated by EC2 when the spot price rises above your bid price
○ If you terminate the instance, you will pay for the partial hour. If EC2 terminates it, you won’t be charged for that partial hour.
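The partial-hour rule for Spot instances can be made concrete with a small calculation. This is only a sketch of the hourly billing rule as stated above; the function name and the minute-based input are assumptions for illustration, not an AWS API.

```python
def billed_hours(minutes_run, terminated_by_ec2):
    """Return the hours billed for a Spot instance under the rule described above."""
    full_hours, remainder = divmod(minutes_run, 60)
    if remainder and not terminated_by_ec2:
        # You ended the instance yourself, so the partial hour is charged.
        return full_hours + 1
    # Either EC2 reclaimed the instance or there was no partial hour.
    return full_hours

print(billed_hours(150, terminated_by_ec2=False))
print(billed_hours(150, terminated_by_ec2=True))
```

For a run of 2 hours 30 minutes, the first call bills the partial third hour and the second does not.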
Instance Types
Instance Metadata
Instance metadata is data about your instance that you can use to configure or manage the running instance. Instance metadata is divided into categories. This can be useful when writing scripts for EC2 instances that are automatically provisioned.
$ curl http://169.254.169.254/latest/meta-data/public-ipv4
52.14.6.51
$ curl http://169.254.169.254/latest/meta-data/instance-id
i-0b554c16f819cb191
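The same lookups can be scripted. The sketch below uses only the Python standard library; the helper names are hypothetical, and the fetch only works when run from inside an EC2 instance, since the metadata address is not reachable elsewhere.

```python
from urllib.request import urlopen

METADATA_BASE = "http://169.254.169.254/latest/meta-data/"

def metadata_url(category):
    """Return the metadata URL for a category such as 'instance-id'."""
    return METADATA_BASE + category.lstrip("/")

def fetch_metadata(category, timeout=2):
    """Fetch one metadata value; expected to fail outside an EC2 instance."""
    with urlopen(metadata_url(category), timeout=timeout) as response:
        return response.read().decode("utf-8")

# Building the URL needs no network access.
print(metadata_url("instance-id"))
```

A provisioning script could call fetch_metadata("public-ipv4") to learn the address of the instance it is running on.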
AMI
AMIs are machine images that can be used to launch instances in the same region. If you need the instance in another region, you need to copy the AMI to that region first.
IAM Roles for EC2
It is best practice not to store access keys on the filesystem of an EC2 instance. Instead, create an IAM role and assign it to the instance at launch time. This way your application won’t need to store credentials in a config file that could be compromised or inadvertently shared via GitHub.
EBS - Elastic Block Store
Overview
EBS volumes are persistent block-based disk storage attached to EC2 instances. You can attach multiple EBS volumes to an EC2 instance, but you cannot attach an EBS volume to multiple EC2 instances. You use the AttachVolume API call to attach a volume to an instance. A root volume can only be detached from a stopped instance. EBS volumes can be snapshotted, and the snapshots can be copied to another region. They can be used in RAID configurations. EBS volumes incur charges even if they are not attached to anything.
Volume types
● General Purpose SSD (GP2): < 10,000 IOPS
● Provisioned IOPS SSD: > 10,000 IOPS
● Magnetic: cheap but slow rotational disk storage
Limits
● Maximum 5000 EBS volumes
● Maximum 10,000 EBS snapshots
● Maximum total volume per disk type (GP2, PIOPS..) 20TiB
● Maximum total provisioned IOPS is 40,000
● Can be attached to only one EC2 instance (no shared volumes)
VPC - Virtual Private Cloud
Overview
A VPC lets you provision a logically isolated section of the cloud so you can launch resources in a virtual network. Security groups and access control lists allow layered security inside a VPC.
● A VPC cannot span regions, but can span availability zones within the same region.
● 1 subnet = 1 availability zone
● Multiple VPCs can be connected using direct peering.
● Largest CIDR block for a VPC is /16 and smallest is /28
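To get a feel for the /16 and /28 bounds, the standard ipaddress module can count the addresses in each block. The 10.0.0.0 range and the helper name are assumptions for illustration; note that AWS reserves a few addresses in every subnet, which this sketch does not subtract.

```python
import ipaddress

def vpc_block_is_valid(cidr):
    """Check that a CIDR block falls within the /16 to /28 size range."""
    network = ipaddress.ip_network(cidr)
    return 16 <= network.prefixlen <= 28

largest = ipaddress.ip_network("10.0.0.0/16")
smallest = ipaddress.ip_network("10.0.0.0/28")
print(largest.num_addresses)   # 65536
print(smallest.num_addresses)  # 16
print(vpc_block_is_valid("10.0.0.0/8"))  # False: larger than /16
```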
VPC Resources
A VPC can consist of an Internet Gateway (1), subnets, route tables, network ACLs, and security groups.
● Security groups are stateful and work at the instance level
● Network access control lists are stateless and work at the subnet level
Limits
● VPCs per region: 5
● Subnets per VPC: 200
● Security groups per VPC: 500
● Internet Gateways per VPC: 1
Default VPC vs Custom VPC
The default VPC is user friendly, allowing you to immediately deploy instances. All subnets in the default VPC have a route out to the Internet. Each EC2 instance launched in the default VPC has a public and a private IP address.
NAT - Network Address Translation
EC2 instances launched in a private subnet won’t have access to the Internet. There are a few solutions to this.
NAT Instance
This is simply an EC2 instance that handles translating addresses for traffic originating from private subnets. Provision it from a community AMI and then disable the Source/Destination Check. The NAT instance must be in a public subnet and have an EIP in order to work.
NAT Gateway
This is a new service that does the same thing as a NAT instance with easier setup. There is no need to patch it. Scales to 10Gbps. Not associated with security groups.
Network ACL
Your VPC automatically comes with a default network ACL, which allows all outbound and inbound traffic. Any custom network ACL denies all traffic until you add rules. You can associate an ACL with multiple subnets. However, a subnet can only be associated with one ACL at a time.
Security Groups vs Network ACLs
● Security groups: Act as a firewall for associated Amazon EC2 instances, controlling both inbound and outbound traffic at the instance level
● Network access control lists (ACLs): Act as a firewall for associated subnets, controlling both inbound and outbound traffic at the subnet level. Rules are evaluated sequentially (stopping at the first match).
Security Group | Network ACL
Operates at the instance level (first layer of defense) | Operates at the subnet level (second layer of defense)
Supports allow rules only | Supports allow rules and deny rules
Is stateful: Return traffic is automatically allowed, regardless of any rules | Is stateless: Return traffic must be explicitly allowed by rules
All rules are evaluated before deciding whether to allow traffic | Rules are processed in number order when deciding whether to allow traffic
Applies to an instance only if someone specifies the security group when launching the instance, or associates the security group with the instance later on | Automatically applies to all instances in the subnets it's associated with (backup layer of defense, so you don't have to rely on someone specifying the security group)
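The "number order, stop at first match" behaviour of network ACLs can be sketched in a few lines. This is a deliberately simplified model: rules match on port only (real rules also carry protocol, CIDR range, and direction), and the Rule type and function name are hypothetical.

```python
from collections import namedtuple

# A toy rule: rule number, port to match ("*" matches any port), and action.
Rule = namedtuple("Rule", ["number", "port", "action"])

def evaluate_acl(rules, port):
    """Return 'allow' or 'deny' for a port using first-match-wins ordering."""
    for rule in sorted(rules, key=lambda r: r.number):
        if rule.port == port or rule.port == "*":
            return rule.action  # lower-numbered rules win; stop here
    return "deny"  # nothing matched: traffic is denied

rules = [
    Rule(200, "*", "deny"),
    Rule(100, 80, "allow"),
    Rule(110, 22, "deny"),
]
print(evaluate_acl(rules, 80))
print(evaluate_acl(rules, 443))
```

Port 80 is allowed by rule 100 before the catch-all deny at 200 is ever reached, while port 443 falls through to that catch-all.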
ELB - Elastic Load Balancing
Overview
Elastic Load Balancing automatically distributes incoming application traffic across multiple Amazon EC2 instances. It enables you to achieve fault tolerance in your applications, seamlessly providing the load balancing capacity needed to route application traffic.
The service offers two types of load balancers:
1. Classic Load Balancer (layer 4)
Works at the network protocol level and does not look inside of the actual network packets, remaining unaware of the specifics of HTTP and HTTPS. In other words, it balances the load without necessarily knowing a whole lot about it.
2. Application Load Balancer (layer 7)
More sophisticated and more powerful. It inspects packets, has access to HTTP and HTTPS headers, and (armed with more information) can do a more intelligent job of spreading the load out to the target.
Load balancers are charged by the hour and per GB of usage.
Listeners must be configured for each load balancer. A listener is a process that checks for connection requests to your load balancer. The listener configuration consists of:
● Front-end protocol and port (client to LB)
● Back-end protocol and port (LB to instance)
Supported protocols are HTTP, HTTPS, TCP, and SSL.
Supported ports are all (1-65535) for EC2-VPC.
Limits
● Load balancers per region: 20
● Listeners per load balancer: 100
S3 - Simple Storage Service
Overview
Object-based storage for static files. Objects are stored in buckets. Buckets live in a universal namespace, which means that a bucket name must be globally unique. The maximum object size is 5TB (using multipart uploads). There is no maximum bucket size. The largest object that can be uploaded in a single PUT operation is 5GB.
Object URL format - https://s3-<region>.amazonaws.com/<bucket>/<object>
Static website hosting URL format - http://<bucket-name>.s3-website-<AWS-region>.amazonaws.com
Encryption
● In transit with SSL/TLS
● At rest using server-side encryption
○ S3 Managed Keys - SSE-S3
○ AWS Key Management Service - SSE-KMS
○ Server-Side Encryption with Customer-Provided Keys - SSE-C
Consistency Model
● Read-after-write consistency
○ PUTs of new objects - When you upload a new file to S3, you can read the object immediately.
● Eventual consistency
○ Overwrite PUTs and DELETEs - When you update or delete a file, the change takes time to propagate.
Object Structure
● Key - object name
● Value - byte data
● Version ID
● Metadata
S3 stores objects in alphabetical order by key, which can affect your design decisions. For example, when you upload log data with names like YYYY-MM-DD-HH-MM-SS.log, consecutive files get nearly identical key prefixes. Adding a random salt to the beginning of the filename spreads the keys out and optimizes storage in S3.
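One way to apply the salting idea is sketched below. Instead of a truly random salt, this sketch assumes a short hash of the filename as the prefix, so the salted key can be recomputed later from the original name; the function name and the prefix length of 4 are arbitrary choices for illustration.

```python
import hashlib

def salted_key(filename, prefix_length=4):
    """Prefix a filename with a few hex characters derived from its hash."""
    digest = hashlib.sha256(filename.encode("utf-8")).hexdigest()
    return f"{digest[:prefix_length]}-{filename}"

# Two consecutive log names end up with unrelated leading characters.
for name in ["2017-01-15-10-30-00.log", "2017-01-15-10-30-01.log"]:
    print(salted_key(name))
```

Because the prefix is derived from the name, the same filename always maps to the same key, which a purely random salt would not give you.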
Storage Tiers
● Standard - 99.99% availability and 99.999999999% (11 9’s) durability
● Infrequently Accessed (IA) - for data that is accessed less frequently, but requires immediate access when needed. Lower storage fee than Standard, but you are charged a retrieval fee.
● Reduced Redundancy Storage (RRS) - provides 99.99% availability and 99.99% durability
Static Web Hosting
A feature that can be enabled to host a static website from an S3 bucket.
Versioning
A feature that keeps backup copies of objects in case of accidental overwrite or deletion. Once enabled, it cannot be disabled, only suspended. You can manage versions by creating a lifecycle policy that deletes previous versions after X days.
Transfer Acceleration
A new feature of S3 that leverages the AWS backbone network to speed up uploads of data to S3. It incurs an additional charge.
CORS - Cross Origin Resource Sharing
If you need to load a resource from bucket 2 on a page hosted on bucket 1, you will need to add a CORS configuration in the permissions of bucket 2.
Snowball
Snowball is a petabyte-scale secure appliance used to transfer large volumes of data into and out of S3. You create a job in the AWS Management Console and a Snowball appliance is automatically shipped to you. Once it arrives, attach the appliance to your local network, download and run the Snowball client to establish a connection, and then use the client to select the file directories that you want to transfer to the appliance. The client then encrypts and transfers the files to the appliance at high speed. Once the transfer is complete, you ship the appliance back to Amazon.
Limits
You can have 100 buckets per account.
Pricing
You are charged for storage, number of requests, and storage management.
Glacier
Overview
Amazon Glacier is an extremely low-cost storage service that provides secure, durable, and flexible storage for data backup and archival. Use Amazon Glacier if low storage cost is paramount, and you can accept retrieval times of a few hours.
CloudFront
Overview
CloudFront is Amazon’s content delivery network (CDN). It is a distributed network of servers that delivers web content to users from the closest geographical location.
Concepts
● Edge location - CDN endpoint for CloudFront. There are over 60 edge locations across the globe.
● Origin - origin of the files that the CDN will distribute
○ S3 bucket
○ EC2 instance
○ Elastic Load Balancer
○ Route53 record
● Distribution - the name given to the resource you create, which consists of a collection of edge locations
○ Web - Used for static assets (CSS, JS, images)
○ RTMP - Used for media streaming
● Objects are cached on an edge location based on a time to live (TTL) value. The default TTL is 86400 seconds (24 hrs). Cache invalidations flush objects from a distribution's cache; however, this incurs a charge.
CloudFormation
Overview
CloudFormation allows you to deploy infrastructure using scripts.
● Free, but resources it creates are billed normally
● Service to create resources based on templates
○ JSON-based text file
● Supports creating, for example:
○ VPCs
○ Subnets
○ Gateways
○ Route Tables
○ Network ACLs
○ Elastic IPs
○ EC2 Instances
○ EC2 Security Groups
○ Auto Scaling Groups
○ Elastic Load Balancers
○ RDS Database Instances
○ RDS Security Groups in VPC
● Can be used to bootstrap Chef and Puppet
● Supports software installation by application bootstrapping scripts
● Differs from Elastic Beanstalk in that CloudFormation creates resources whereas Elastic Beanstalk is
used to create applications
● Deleting stack also deletes stack resources
○ Deletion policy must be used to avoid resource deletion
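As a rough illustration of a JSON-based template with a deletion policy, the sketch below builds a minimal template as a Python dictionary and prints it as JSON. The logical name LogBucket and the description are hypothetical; the bucket is given a DeletionPolicy of Retain so that deleting the stack would leave it in place.

```python
import json

template = {
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Illustrative stack with one retained bucket",
    "Resources": {
        # "LogBucket" is a hypothetical logical name for the resource.
        "LogBucket": {
            "Type": "AWS::S3::Bucket",
            # Keep the bucket even when the stack itself is deleted.
            "DeletionPolicy": "Retain",
        }
    },
    "Outputs": {
        "BucketName": {"Value": {"Ref": "LogBucket"}}
    },
}

print(json.dumps(template, indent=2))
```

The printed JSON is the text file that would be handed to CloudFormation when creating a stack.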
Limits
● No limit on the number of templates
● Maximum number of resources in a stack is 200
● Maximum number of AWS CloudFormation stacks per AWS root account is 200
● 60 parameters and 60 outputs are allowed in a template
Defaults
● By default, a CloudFormation stack is rolled back if stack creation fails
SNS - Simple Notification Service
Overview
Basics
● A system to send messages based on actions and triggers
● Uses a push mechanism
● Supports message delivery over the following protocols
○ HTTP / HTTPS
○ Email / Email-JSON
○ SQS
○ SMS
● Can be used with other services such as EC2, S3 and SQS
● Can be used to "fan out" messages to multiple SQS queues
● An SNS message contains the following fields in JSON format
○ MessageId
○ Timestamp
○ TopicArn
○ Type
○ UnsubscribeURL
○ Message
○ Subject
○ Signature
○ SignatureVersion
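Since the message arrives as JSON, a subscriber typically parses the body and picks out the fields it needs. The sketch below uses a hand-written sample body containing a few of the fields listed above; all values, including the topic ARN, are made up for illustration.

```python
import json

# A hand-written sample body; every value here is a placeholder.
raw = json.dumps({
    "Type": "Notification",
    "TopicArn": "arn:aws:sns:us-east-1:123456789012:example-topic",
    "Subject": "Order shipped",
    "Message": "Order 42 has shipped",
    "Timestamp": "2017-01-15T10:30:00.000Z",
})

def summarize(body):
    """Parse an SNS-style JSON body and return a one-line summary."""
    message = json.loads(body)
    return f'{message["Type"]}: {message["Subject"]} - {message["Message"]}'

print(summarize(raw))
```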
Limits
● Topic name is limited to 256 characters
● Maximum 10 million subscribers per topic
● Maximum 100,000 topics per account
● SNS message can contain maximum 256KB of text data, including XML, JSON and unformatted text
SQS
Overview
Simple Queue Service is a web service that provides a message queue so that applications can queue messages in a temporary repository for consumption by another component. It was the very first service on the AWS platform when it was rolled out in 2004. SQS is a pull-based service.
● FIFO (first in, first out) support was introduced in November 2016, but the exam has not been updated for it
● Messages are delivered one or more times and in any order
● Supports short and long polling
○ In short polling, SQS returns a subset of your messages or nothing
○ In long polling, SQS waits until at least one message is available in the queue
○ In simple terms, short polling checks whether a message exists and comes back immediately, with or without data. Long polling waits until there is some data or the wait time expires.
● The VisibilityTimeout parameter controls how long a message stays invisible once it has been pulled from the queue
● The ChangeMessageVisibility action can be used to prolong the VisibilityTimeout period. The new period is added on top of the time the message has already been hidden
● The ReceiveMessageWaitTimeSeconds parameter controls whether long polling is on or not
● The DelaySeconds parameter can be used to hide a message from all clients when it first enters the queue. This happens before the VisibilityTimeout parameter kicks in
● The GetQueueAttributes API action with "ApproximateNumberOfMessages" returns the number of messages waiting in the queue
● The GetQueueAttributes API action with "ApproximateNumberOfMessagesNotVisible" returns the number of messages in flight
● A Dead Letter Queue is used for messages that can't be processed and need further investigation
● The MessageRetentionPeriod parameter determines how long SQS holds a message in the queue
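The interplay of DelaySeconds and VisibilityTimeout is easier to see in a toy model. The class below is an in-memory stand-in, not the SQS API: the class and method names are hypothetical, time is passed in explicitly as a number of seconds, and each message body is assumed to be unique.

```python
class ToyQueue:
    """In-memory stand-in for a queue with a visibility timeout."""

    def __init__(self, visibility_timeout=30):
        self.visibility_timeout = visibility_timeout
        self.messages = {}  # message body -> time at which it becomes visible

    def send(self, body, now, delay_seconds=0):
        # A delayed message is hidden from all consumers until the delay passes.
        self.messages[body] = now + delay_seconds

    def receive(self, now):
        for body, visible_at in self.messages.items():
            if visible_at <= now:
                # Hide the message from other consumers for the timeout period.
                self.messages[body] = now + self.visibility_timeout
                return body
        return None  # like a short poll that finds nothing

    def delete(self, body):
        # A consumer deletes the message once processing has succeeded.
        self.messages.pop(body, None)

queue = ToyQueue(visibility_timeout=30)
queue.send("job-1", now=0)
first = queue.receive(now=0)    # delivered, then hidden until t=30
hidden = queue.receive(now=10)  # still hidden, nothing is returned
again = queue.receive(now=31)   # never deleted, so it is delivered again
print(first, hidden, again)
```

The third receive shows why a consumer must delete a message after processing it: otherwise the message reappears once the timeout expires.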
Limits
● Minimum message size is 1 byte
● Maximum message size is 256KB
● At max 120,000 messages can be inflight
● Message can only contain XML, JSON and unformatted text
● Message can contain 10 metadata attributes
● Queue name can be up to 80 characters long
● Queue name is case-sensitive
Defaults
● VisibilityTimeout minimum is 0 seconds
● VisibilityTimeout default is 30 seconds
● VisibilityTimeout maximum is 12 hours
● ReceiveMessageWaitTimeSeconds default is 0 seconds (short polling)
● MessageRetentionPeriod minimum is 1 minute
● MessageRetentionPeriod default is 4 days
● MessageRetentionPeriod maximum is 14 days
● Long polling maximum wait time is 20 seconds
SWF - Simple Workflow Service
Overview
The Amazon Simple Workflow Service (SWF) makes it easy to build applications that coordinate work across distributed components. SWF is a task-oriented service built around domains, workers, and deciders.
A domain is a container that isolates workflows, activity types, and executions. The API call to create one is RegisterDomain.
An activity worker is a program that receives activity tasks, performs them, and provides results back.
The coordination logic in a workflow is contained in a software program called a decider. The decider schedules activity tasks, provides input data to the activity workers, processes events that arrive while the workflow is in progress, and ultimately ends (or closes) the workflow when the objective has been completed.
A task is assigned only once and never duplicated. Maximum workflow length is 1 year.
DynamoDB
Overview
Basics
● NoSQL database on SSD storage
● Key-value store
● Managed by Amazon; data is automatically replicated across three (3) availability zones within the selected region
● Supports eventually and strongly consistent reads
○ An eventually consistent read may be served from a copy that has not yet received the latest write
○ A strongly consistent read reflects all writes that succeeded before the read
○ This means that an eventually consistent read can return data that is not the latest
● Has additional features
○ Streams, which is like having transaction logs
○ Triggers, which, as the name says, work like traditional RDBMS triggers but use AWS Lambda for actions
● Not ACID in RDBMS terms (e.g. no Oracle SCN), even though it has a strongly consistent data model
Data model
● Schema-less
● A table is a collection of items, and an item consists of one or several attributes
● A table must have a primary key
● Supports two types of primary keys: (1) hash key and (2) hash key + sort key
● Data is spread across partitions using the hash key attribute of the primary key, so unique key values spread the data evenly across partitions
● Data is stored in partitions
Limits
● Item size must be between 1 byte and 400KB (item key and attributes)
● Default maximum number of tables is 256
Scan vs Query
A query finds items in a table using only the primary key attribute values. You must provide a partition key attribute and a distinct value to search for. Query results are always sorted by the sort key, in ascending order by default. Set the ScanIndexForward parameter to false to sort in descending order.
A scan operation examines every item in the table. By default, a scan returns all of the attributes for every item. You can use the ProjectionExpression parameter to return only the attributes you need.
A query operation is more efficient than a scan operation.
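The difference in behaviour can be mimicked on a plain Python list. This is a toy model, not the DynamoDB API: the attribute names (user as partition key, ts as sort key) and the function names are assumptions, and a real table locates the partition directly rather than filtering a list as this sketch does.

```python
items = [
    {"user": "alice", "ts": 3, "note": "c"},
    {"user": "bob", "ts": 1, "note": "x"},
    {"user": "alice", "ts": 1, "note": "a"},
    {"user": "alice", "ts": 2, "note": "b"},
]

def query(table, partition_value, scan_index_forward=True):
    """Return items for one partition key value, ordered by the sort key."""
    matches = [item for item in table if item["user"] == partition_value]
    return sorted(matches, key=lambda item: item["ts"],
                  reverse=not scan_index_forward)

def scan(table, projection=None):
    """Visit every item, optionally keeping only the projected attributes."""
    if projection is None:
        return [dict(item) for item in table]
    return [{name: item[name] for name in projection} for item in table]

print([item["ts"] for item in query(items, "alice")])
print([item["ts"] for item in query(items, "alice", scan_index_forward=False)])
print(scan(items, projection=["user"]))
```

The query only returns one user's items in sort-key order, while the scan touches all four items even when the projection trims each one down.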
Provisioned Throughput
● Read provisioned throughput
○ Units are in 4KB increments
○ Eventually consistent reads (default) = 2 reads per second per unit
○ Strongly consistent reads = 1 read per second per unit
● Write provisioned throughput
○ Units are in 1KB increments, with 1 write per second per unit
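The unit sizes above translate into a short calculation: round the item size up to the next unit, multiply by the request rate, and halve the result for eventually consistent reads. The function names and sample figures below are assumptions for illustration.

```python
import math

def read_capacity_units(item_size_kb, reads_per_second, strongly_consistent=True):
    """Read units needed: item sizes round up to the next 4KB increment."""
    units_per_read = math.ceil(item_size_kb / 4)
    units = units_per_read * reads_per_second
    if not strongly_consistent:
        # One unit covers two eventually consistent reads per second.
        units = math.ceil(units / 2)
    return units

def write_capacity_units(item_size_kb, writes_per_second):
    """Write units needed: item sizes round up to the next 1KB increment."""
    return math.ceil(item_size_kb) * writes_per_second

print(read_capacity_units(3, 80))         # 80
print(read_capacity_units(3, 80, False))  # 40
print(read_capacity_units(6, 80))         # 160
print(write_capacity_units(1.5, 10))      # 20
```

A 6KB item needs two 4KB read units per read, which is why 80 strongly consistent reads per second come to 160 units.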
Elastic Beanstalk
Overview
● You upload your application and this service handles provisioning of the underlying infrastructure
● Free service, but it creates resources that are billed by usage
● Automates capacity provisioning, load balancing, auto scaling, and application deployment
● Supports
○ Apache Tomcat for Java
○ Apache HTTP for PHP/Python/Node.js
○ Nginx for Node.js
○ Passenger or Puma for Ruby
○ Microsoft IIS 7.5, 8.0, and 8.5 for .NET
○ Java SE
○ Docker
○ Go
● Uses EC2, RDS, ELB, Auto Scaling, S3 and SNS resources to create needed environment
● Currently supports the Amazon Linux AMI and the Windows Server 2012 R2 AMI
● Supports running multiple environments at the same time
● Supports using multiple Availability Zones
● If underlying infrastructure stops responding, auto scaling will automatically launch a new instance
● By default, application is publicly accessible from "myapp.elasticbeanstalk.com"
Limits
● Supports up to 75 applications and 1,000 application versions
● By default, up to 200 environments across all of your applications can be run
RDS - Relational Database Service
Overview
AWS RDS provides a managed DB platform that offers features such as automated backup, patch management, and automated failure detection and recovery. Scaling is not automated; the user needs to plan it and carry it out with a few clicks.
● Managed service (no SSH)
● Supports Multi-AZ configurations
● Supports MySQL, MariaDB, Aurora, PostgreSQL, Oracle, and Microsoft SQL Server
● Parameters of DB instances are handled by parameter groups (think Oracle spfile)
● Offers automated backups
● Backup storage for active databases is free
● Databases can be restored to a point in time within a 5 minute timeframe (transaction logs are taken often)
● Data transfer pricing is based on data moving between RDS and the Internet (transfer inside AWS is free)
Limits
● Storage limited from 5GB to 6TB
● Backup retention period from 0 to 35 days
SDK & Tools
Overview
We can use the AWS SDK in different languages to write code that accesses the services programmatically.
Supported languages - Android, iOS, JavaScript, .NET, Java, Node.js, PHP, Python, Ruby, Go, C++
The SDKs that have default regions are configured to point to us-east-1.
The Amazon Linux AMI comes with the AWS CLI tools pre-installed.
Other
Overview
HTTP Response Codes
● 200 - OK
● 3xx - Redirect
● 4xx - Client error
● 5xx - Server error
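When handling SDK or API responses in code, these ranges usually turn into a small classification step. The sketch below mirrors only the categories listed above; the function name is hypothetical, and any code outside those categories (for example other 2xx codes) is simply reported as "Other".

```python
def classify_status(code):
    """Map an HTTP status code to the categories listed above."""
    if code == 200:
        return "OK"
    if 300 <= code <= 399:
        return "Redirect"
    if 400 <= code <= 499:
        return "Client error"
    if 500 <= code <= 599:
        return "Server error"
    return "Other"  # anything this short list does not cover

for code in (200, 301, 404, 503):
    print(code, classify_status(code))
```

A caller would typically retry on a server error and fix the request on a client error.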