Lassoing the Clouds: Best Practices on AWS
Brian DeShong May 26, 2017
Who am I?
Worked primarily in web development
PHP (17 years!), MySQL, Oracle, Linux, Apache
Highly trafficked, scalable web applications
Frequent speaker at PHP conferences and the Atlanta PHP user group
iOS / Mac development for ~5 years
FloodWatch for iOS
Yahoo!, Half Off Depot
Agenda
• Running web servers
• Serving static content
• Security-related concerns
• Databases
• Logging
Sometimes I’m going to tell you something specific I’d recommend.
Other times, I’ll tell you the pros and cons and let you choose based on your situation.
Some of these concepts are AWS-specific, but others apply to any hosting situation, VPS or not.
Show of hands: who uses AWS in production? Multiple regions vs. AZs?
Regions + Availability Zones
https://aws.amazon.com/about-aws/global-infrastructure/
Region: made up of two or more Availability Zones (AZs)
Each AZ is one or more discrete data centers
AZs are typically separated by many miles
Low-latency links between them
Operating Web Servers
Amazon Machine Images (AMIs)
What is an Amazon Machine Image?
• Provides information required to launch an EC2 instance
• You specify the AMI to use when launching a new instance
• Amazon Linux by default
• CentOS, Ubuntu, Windows, etc.
• Some options on using AMIs…
“Just Enough OS”
• Start up a bare instance, JeOS
• Install what you need at initial boot time
• “User data”: shell script that runs at initial boot
• Avoids AMI creation and maintenance
• Instances take longer to be ready for service
“Just Enough OS”
[Diagram: web01, a bare Amazon Linux instance with only base packages: vim, openssh, openssl, kernel, sudo, ...]
User data script:
#!/bin/bash
yum update -y
yum install -y httpd24 php70
chkconfig httpd on
groupadd www
usermod -a -G www ec2-user
chown -R root:www /var/www
chmod 2775 /var/www
find /var/www -type d -exec chmod 2775 {} +
find /var/www -type f -exec chmod 0664 {} +
# Download and put application code into place
service httpd start
The user data script runs only at initial boot time:
Install Apache, PHP 7
Set up Apache to start at boot
Set up groups, filesystem
Start Apache
This all takes time! The benefit is that if you want to upgrade to PHP 7.1, you just change the user data script and launch new instances.
Virtual machines are by their nature disposable.
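To make the "just change the user data script" point concrete, here is a minimal Python sketch that renders a trimmed-down version of the script from parameters. The function name `render_user_data` is hypothetical, and the package name `php71` is an assumption about what the upgraded package would be called; only `httpd24` and `php70` come from the slide.

```python
def render_user_data(web_package="httpd24", php_package="php70"):
    """Return a user data shell script that installs the given packages at first boot."""
    lines = [
        "#!/bin/bash",
        "yum update -y",
        f"yum install -y {web_package} {php_package}",
        "chkconfig httpd on",
        "service httpd start",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Upgrading PHP is a one-argument change; new instances pick it up at boot.
    # "php71" is an assumed package name, used here only for illustration.
    print(render_user_data(php_package="php71"))
```

Existing instances are not touched by this change; the idea is to launch replacements with the new script and retire the old ones.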
Fully Baked AMI
[Diagram: web01 through web04 launched from the same AMI, each with the Amazon Linux base packages (vim, openssh, openssl, kernel, sudo) plus PHP 7.1, ImageMagick, and nginx already installed]
• Added users, groups
• Filesystem setup
• Nginx starts at boot
• Places code on disk at boot
• Monitoring scripts, packages
• etc…
Benefits:
The machine is ready to go as soon as it’s fully booted
Startup time can be a lot faster, because everything needed was baked into the machine image
Stability of packages, more control
Amazon Linux installs security updates at boot out of the box
Downside is maintenance:
Want to upgrade PHP 7.1? You have to bundle a new AMI
Then you need to replace all of your machines with machines using the new AMI
Think back to 2014: OpenSSL vulnerabilities and upgrading across the board. Not fun!
Considerations
• How quickly do you need a new instance in the event of a failure?
• …or due to an increase in demand?
• Do you have the time and resources to maintain a fully baked AMI?
• My recommendation:
• Start with “JeOS” + configure at boot
• You can always create custom AMIs later, if needed
Bundling an AMI takes time, on the order of an hour or more.
It might not make sense to invest this time if you’re seldom going to need to add new machines.
But there’s a risk: if you do need them quickly, you have to wait for them to start.
Fault Tolerance
No Single Point of Failure!
• Do not have a single point of failure!
• This can be a server, AZ, or even a region
• Always have at least two of everything
• If an EC2 instance dies, the other remains in service
• Holy grail: spread out across multiple regions
Have at least two of everything.
Spread web servers between AZs.
If one AZ loses power, you stay up!
But more resources have a cost associated with them!
Use AZs Effectively
[Diagram: the deployment spread across three AZs. us-east-1a: four web servers, db (write), db (read). us-east-1b: four web servers, db (read), db (standby). us-east-1c: four web servers, db (read).]
There’s a trade-off between redundancy and cost, though!
More hardware in more AZs means more money.
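As a rough illustration of spreading servers between AZs, the sketch below assigns instances to zones round-robin and counts what survives when one zone goes away. The function names are hypothetical, and a real deployment would let Auto Scaling or similar tooling do the placement; this only shows the arithmetic behind the trade-off.

```python
from collections import Counter
from itertools import cycle, islice


def spread_across_azs(instance_count, zones):
    """Assign instances to zones round-robin so no zone holds more than one extra."""
    if not zones:
        raise ValueError("at least one Availability Zone is required")
    return list(islice(cycle(zones), instance_count))


def surviving_capacity(placement, failed_zone):
    """Count the instances still running if one zone goes offline."""
    return sum(1 for zone in placement if zone != failed_zone)


if __name__ == "__main__":
    zones = ["us-east-1a", "us-east-1b", "us-east-1c"]
    placement = spread_across_azs(12, zones)
    print(Counter(placement))
    print(surviving_capacity(placement, "us-east-1a"))
```

With twelve web servers in three zones, losing one zone leaves eight in service. With all of them in a single zone, losing that zone leaves none.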
Leverage Auto Scaling
• Machine resources
  • If CPU > 80% for 5 minutes, scale up
• Threshold of unhealthy instances
  • Want 5 web servers minimum?
  • If one dies, replace it
• Scale down
  • If CPU < 50% for X minutes, scale down
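The rules on this slide can be sketched as a small decision function. This is not how the Auto Scaling service is implemented; it is a toy model under stated assumptions: one CPU sample per minute, a step of one instance per decision, the same window for scaling up and down, and a maximum of 10 instances (the slide gives only the minimum of 5 and the 80% / 50% thresholds).

```python
def desired_capacity(current, cpu_samples, *, minimum=5, maximum=10,
                     high=80.0, low=50.0, window=5):
    """Return the new instance count given per-minute average CPU samples.

    Scale up by one when the last `window` samples are all above `high`,
    scale down by one when they are all below `low`, and always stay
    within the [minimum, maximum] range.
    """
    recent = cpu_samples[-window:]
    target = current
    if len(recent) == window:
        if all(sample > high for sample in recent):
            target = current + 1
        elif all(sample < low for sample in recent):
            target = current - 1
    # Clamping also covers replacement: if instances died and the count
    # dropped below the minimum, the target is pulled back up to it.
    return max(minimum, min(maximum, target))
```

Requiring every sample in the window to cross the threshold keeps a single spike from triggering a launch, which is the point of the "for 5 minutes" condition.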
Load Balancers
Don’t use Sticky Sessions!
• By default, ELB sends each request to the instance with the smallest load
• “Sticky sessions” pin your user to an instance behind ELB
• Uses a cookie to route the same client to a consistent target
• If instance fails, ELB stops routing to that instance; chooses another
• But you want to spread traffic around!
• As your pool of machines grows, the requests are balanced between them
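A small simulation shows why stickiness works against you as the pool grows. It is a simplified model, not ELB's actual algorithm: "load" is just the number of requests an instance has received so far, the cookie is a dictionary, and the `simulate` function and instance names are hypothetical.

```python
from collections import Counter


def simulate(users, old_pool, new_pool, sticky):
    """Return requests per instance after the pool grows from old_pool to new_pool."""
    load = Counter()
    cookies = {}
    # Phase 1: every user makes a first request against the original pool
    # and is remembered (via the "cookie") as belonging to that instance.
    for user in users:
        target = min(old_pool, key=lambda instance: load[instance])
        load[target] += 1
        cookies[user] = target
    # Phase 2: the pool has grown; every user makes one more request.
    second = {instance: 0 for instance in new_pool}
    for user in users:
        if sticky and cookies[user] in new_pool:
            target = cookies[user]
        else:
            target = min(new_pool, key=lambda instance: second[instance])
        second[target] += 1
    return second


if __name__ == "__main__":
    users = [f"user{n}" for n in range(100)]
    before = ["web1", "web2"]
    after = ["web1", "web2", "web3", "web4"]
    print(simulate(users, before, after, sticky=True))
    print(simulate(users, before, after, sticky=False))
```

With stickiness on, the two new servers receive nothing from existing users; with it off, the same 100 requests land evenly on all four.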
Sticky Sessions UI
SSL Termination
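[Diagram: clients reach the load balancer over HTTPS (port 443); the load balancer holds the SSL certificate for www.foo.com from AWS Certificate Manager and forwards requests to six web servers over HTTP (port 80)]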
Terminate SSL on your ELB:
No more mod_ssl, etc. on your web servers!
Single point of SSL termination
No need to maintain dependencies on web servers
No certificate files sitting on the filesystem
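One practical consequence of terminating SSL at the load balancer is that the web servers only ever see plain HTTP, so the application has to ask the load balancer what the client originally used. The sketch below is not from the talk. It assumes an HTTP/HTTPS listener, for which ELB adds an `X-Forwarded-Proto` header to each forwarded request. The function names are hypothetical.

```python
def original_scheme(headers):
    """Return the scheme the client used, as reported by the load balancer."""
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return normalized.get("x-forwarded-proto", "http").lower()


def redirect_target(headers, host, path):
    """Return an HTTPS URL to redirect to, or None if the request was already secure."""
    if original_scheme(headers) == "https":
        return None
    return f"https://{host}{path}"


if __name__ == "__main__":
    print(redirect_target({"X-Forwarded-Proto": "http"}, "www.foo.com", "/cart"))
```

Treating a missing header as plain HTTP is the cautious default here: a request that did not come through the load balancer gets redirected rather than trusted.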
Use AWS Certificate Manager
• AWS Certificate Manager is amazing for managing SSL certificates, and free!
• Uses WHOIS contact information to validate that you control the domain
• Automatically renews your certificate
• A single click, and it’s renewed
• Updated everywhere it’s used
• Can import external certificates, too
• SSL all the things!
ACM certificates can be used on ELBs, API Gateway, and CloudFront.
Serving Static Content
Serve from static storage!
• Never serve static content from your web servers
• JavaScript, CSS, images, fonts, etc…
• Don’t use your computing resources
• Get the content to the end user as quickly as possible
These assets don’t change between releases.
It’s okay for them to be cached in the end user’s browser.
AWS Simple Storage Service (S3)
• AWS’s object storage service
• You pay by storage utilized, number of requests, and bandwidth
• S3 storage is made up of buckets of objects
• Perfect for storing static assets
• Store content at build time
S3 storage classes: Standard, Standard - Infrequent Access (Standard-IA), Reduced Redundancy Storage (RRS)
Standard and RRS are designed for 99.99% availability; Standard-IA for 99.9%.
S3 Standard
• 99.999999999% (eleven nines) durability, with objects stored redundantly across multiple facilities
• 2.3 cents / GB per month
• $0.004 / 10,000 GET requests
• 10 GB of data = 23 cents/month to store it, $4 for 10 million GET requests directly to S3
• Plus data transfer costs
• Point being: it’s CHEAP! You can’t store bytes on your own storage in a data center for this
Standard-IA
• You pay less for storage
• But you pay more for accessing these objects
• For data that is accessed less frequently, but requires rapid access when needed
• Long-term storage, backups, and as a data store for disaster recovery
RRS: what I’d recommend for static content
• 99.99% durability
• On the off chance that AWS loses an object, you can put it back!
• Placing content in S3 should be part of your deployment process
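The S3 Standard arithmetic from the notes can be checked with a few lines of Python. The two rates are the ones quoted in the talk (2017 prices, which may have changed since); the function name is hypothetical, and data transfer is deliberately left out because the notes give no rate for it.

```python
STORAGE_USD_PER_GB_MONTH = 0.023   # "2.3 cents / GB" from the notes
GET_USD_PER_10K_REQUESTS = 0.004   # "$0.004 / 10,000 GET requests" from the notes


def monthly_s3_cost(stored_gb, get_requests):
    """Return (storage_cost, request_cost) in USD, excluding data transfer."""
    storage = stored_gb * STORAGE_USD_PER_GB_MONTH
    requests = get_requests / 10_000 * GET_USD_PER_10K_REQUESTS
    return round(storage, 2), round(requests, 2)


if __name__ == "__main__":
    # 10 GB stored and 10 million GETs served directly from S3.
    print(monthly_s3_cost(10, 10_000_000))
```

For the example in the notes this gives 23 cents for storage and $4 for requests, matching the figures on the slide.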
Use CloudFront
CloudFront operates dozens of POPs (points of presence) around the globe
Place CloudFront in front of your S3 bucket
Global users retrieve content from the nearest POP
Optimized network path from POPs back to Regions
Content can be served over both HTTP and HTTPS
DDoS protection, too
Your content lives at its home, the “origin”
The end user retrieves content from the POP closest to them
If the POP has the content, it returns it
If the POP doesn’t have the content, it retrieves it from the origin and caches it, respecting cache headers
When you push a new build to Production, it’s good to prefix static content paths with a unique value
But leave your previous releases in place, in case emails or other external things refer to assets from old releases
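One way to put a unique value in front of static content is to make the release identifier the first path segment of every asset URL. The sketch below assumes that layout; the host `static.example.com`, the release IDs, and the function name `asset_url` are all hypothetical.

```python
from urllib.parse import quote


def asset_url(cdn_host, release_id, path):
    """Build an asset URL whose first path segment is unique to the release."""
    clean_path = path.lstrip("/")
    # Escape the release ID fully; keep "/" separators inside the asset path.
    return f"https://{cdn_host}/{quote(release_id, safe='')}/{quote(clean_path)}"


if __name__ == "__main__":
    print(asset_url("static.example.com", "build-42", "/css/site.css"))
    print(asset_url("static.example.com", "build-43", "/css/site.css"))
```

Because each release gets its own URLs, a new build never collides with what POPs or browsers have cached, and old releases stay reachable for as long as their objects remain in the bucket.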
Security
Identity and Access Management
• Controls AWS services a user can access
• Which actions they can perform on those services
• Which resources are available
• Concepts of “Users” and “Roles”
How many of you using AWS have access key values in your code?
Example: user Bob can access an S3 bucket named “data-exports” and create objects
Example: determine who can launch EC2 instances, DynamoDB database tables they can access, etc.
Roles are not associated with a user or group
Trusted entities assume roles
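As a rough illustration of the “Bob” example, here is a sketch of an IAM policy document that allows creating objects in one bucket and nothing else. The helper name `put_only_policy` is hypothetical; the JSON layout follows the standard IAM policy format.

```python
import json


def put_only_policy(bucket: str) -> dict:
    """Return an IAM policy document allowing object creation in one bucket."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only the "create objects" action, not list/read/delete.
                "Action": ["s3:PutObject"],
                # Only objects inside this one bucket.
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }


print(json.dumps(put_only_policy("data-exports"), indent=2))
```

The three bullets on the slide map onto the document: the service and action live in `Action`, and the available resources live in `Resource`.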
Use IAM Roles on EC2 Instances
• Assign an IAM Role to an EC2 instance
• Enables you to obtain temporary access keys
• Can be used to access AWS resources
• AWS SDKs make requests with credentials from IAM Role
• No storing keys in your code base
• Much more flexible and maintainable
API requests are signed with access key and secret access key
Example: a back end server running cron jobs may need access to write to an S3 bucket
You can have an IAM role for “cron server,” which grants access to only the necessary resources
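A “cron server” role could be described with two documents: a trust policy saying EC2 may assume the role, and a permissions policy saying what the role can do. The sketch below is an assumption-laden illustration; the role name `cron-server`, the bucket name, and the dictionary keys wrapping the two documents are hypothetical.

```python
import json


def ec2_trust_policy() -> dict:
    """Trust policy: lets the EC2 service assume the role for an instance."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "ec2.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def cron_server_role(bucket: str) -> dict:
    """Bundle the trust policy with write-only access to a single bucket."""
    permissions = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
            }
        ],
    }
    return {
        "role_name": "cron-server",  # hypothetical role name
        "trust_policy": ec2_trust_policy(),
        "permissions_policy": permissions,
    }


print(json.dumps(cron_server_role("nightly-reports"), indent=2))
```

Once a role like this is attached to the instance, an SDK client created without explicit keys picks up the temporary credentials on its own, so nothing secret lives in the code base.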
Security Groups
• Virtual firewall to control inbound and outbound traffic
• Typically attached to EC2 instances, load balancers, RDS instances
• Only allow traffic in on the necessary ports
• Restrict internal tool access to known IP addresses
Controls who can get in, and what can go out
http://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/VPC_Security.html#VPC_Security_Comparison
Example: don’t let public hit web servers directly, but DO allow ELBs to connect to them
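The ELB example could be expressed as two sets of ingress rules, sketched below in the shape that EC2's AuthorizeSecurityGroupIngress API takes for its `IpPermissions` argument. The security group IDs and helper names are hypothetical, and no AWS call is made here.

```python
# Hypothetical security group IDs for illustration only.
ELB_SG = "sg-0123456789abcdef0"
WEB_SG = "sg-0fedcba9876543210"


def elb_ingress() -> list:
    """Public HTTP and HTTPS into the load balancer's security group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": port,
            "ToPort": port,
            "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
        }
        for port in (80, 443)
    ]


def web_ingress(elb_group_id: str) -> list:
    """HTTP into the web servers, allowed only from the load balancer's group."""
    return [
        {
            "IpProtocol": "tcp",
            "FromPort": 80,
            "ToPort": 80,
            "UserIdGroupPairs": [{"GroupId": elb_group_id}],
        }
    ]


def is_publicly_reachable(rules: list) -> bool:
    """True if any rule admits traffic from the whole internet."""
    return any(
        ip_range.get("CidrIp") == "0.0.0.0/0"
        for rule in rules
        for ip_range in rule.get("IpRanges", [])
    )


print(is_publicly_reachable(elb_ingress()))        # load balancer: open to the public
print(is_publicly_reachable(web_ingress(ELB_SG)))  # web servers: not open to the public
```

Referencing the load balancer's group instead of an IP range means the rule keeps working as load balancer nodes come and go.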
Principle of Least Privilege
• Give users access to only the resources they need
• This applies to internal and external users
• Examples:
• Don’t let an IAM role access every single S3 bucket! Specify each
• Don’t allow every port in on a Security Group! Only what needs to be public
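One way to keep yourself honest about the two examples above is a small check that flags wildcards in a policy document. This is a minimal sketch, not a complete policy linter; the function name `find_overbroad` and the specific wildcard patterns it looks for are assumptions.

```python
def _as_list(value):
    """IAM accepts a single string where a list is expected; normalize it."""
    return [value] if isinstance(value, str) else list(value)


def find_overbroad(policy: dict) -> list:
    """Return warnings for Allow statements with wildcard actions or resources."""
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):
        statements = [statements]

    warnings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for action in _as_list(statement.get("Action", [])):
            # "*" is every action; "s3:*" is every action on one service.
            if action == "*" or action.endswith(":*"):
                warnings.append(f"statement {index}: broad action {action}")
        for resource in _as_list(statement.get("Resource", [])):
            # "*" is everything; "arn:aws:s3:::*" is every S3 bucket.
            if resource == "*" or resource.endswith(":::*"):
                warnings.append(f"statement {index}: broad resource {resource}")
    return warnings


too_broad = {"Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}]}
for warning in find_overbroad(too_broad):
    print(warning)
```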
Relational Databases
AWS Relational Database Service
• Removes the usual maintenance associated with running databases
• Eases burden of software patches
• Backups / snapshots are incredibly convenient
• Can scale instances up and down in size
Operate in Multiple AZs
Primary (us-east-1a), synchronous replication to Standby (us-east-1b)
Endpoint: your-name.cluster-abc123.us-east-1.rds.amazonaws.com
Patch: standby first, primary next
Enhanced durability
Synchronous replication to keep standby up-to-date
Automatic failover in the event of a failure
No administrative intervention needed
DNS record for the endpoint hostname is repointed to the standby under the hood
Planned maintenance and backups
For patches, applied first on standby
Then failover, then apply on primary
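Since failover works by changing DNS behind the same hostname, the application side mostly needs to reconnect by hostname and be patient for a moment. A minimal sketch of that idea follows; `connect_with_retry` is a hypothetical helper, and catching `OSError` is an assumption, since real database drivers raise their own exception classes.

```python
import time


def connect_with_retry(connect, attempts=5, delay=1.0, sleep=time.sleep):
    """Call connect() until it succeeds or the attempts run out.

    connect is any zero-argument callable that opens a fresh connection by
    hostname, so each retry resolves the endpoint's DNS name again and can
    land on the newly promoted instance.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except OSError as exc:  # assumption: driver errors derive from OSError
            last_error = exc
            if attempt < attempts:
                sleep(delay * attempt)  # wait a little longer each time
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

A caller would pass something like `lambda: driver.connect(host=ENDPOINT, ...)`, always using the endpoint hostname and never a cached IP address.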
Distribute read operations
Writer (us-east-1a)
Readers (us-east-1b, us-east-1c, us-east-1d, us-east-1e)
Read-only endpoint: your-name.cluster-ro-abc123.us-east-1.rds.amazonaws.com
There’s a cost associated with it, though!
And you should have some sort of a need for it
Read-only hostname, round-robin between read replicas
Don’t have to change application code as you add new reader instances
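In application code, splitting reads from writes can be as small as choosing between two hostnames. The sketch below uses the two placeholder hostnames from the slides; the `endpoint_for` helper and its list of read-only verbs are hypothetical and deliberately naive.

```python
# Placeholder hostnames taken from the slides.
WRITER_ENDPOINT = "your-name.cluster-abc123.us-east-1.rds.amazonaws.com"
READER_ENDPOINT = "your-name.cluster-ro-abc123.us-east-1.rds.amazonaws.com"

# Statements assumed safe to send to a read replica.
READ_VERBS = ("select", "show", "describe", "explain")


def endpoint_for(sql: str, in_transaction: bool = False) -> str:
    """Pick the read-only endpoint for plain reads, the writer otherwise."""
    words = sql.lstrip().split(None, 1)
    first_word = words[0].lower() if words else ""
    if not in_transaction and first_word in READ_VERBS:
        return READER_ENDPOINT
    return WRITER_ENDPOINT


print(endpoint_for("SELECT id FROM photos WHERE processed = 0"))
print(endpoint_for("UPDATE photos SET processed = 1 WHERE id = 42"))
```

Because the reader hostname already spreads connections across the replicas, adding a fifth reader needs no change here. A check this simple has gaps, for example `SELECT ... FOR UPDATE` belongs on the writer, and a read issued right after a write may see slightly stale data on a replica.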
AWS Aurora!
• Completely re-imagined storage of data
• http://bit.ly/atlphpaurora
• Greatly reduces replica lag to single-digit milliseconds
• Read replicas launch in minutes
• Run MySQL or PostgreSQL engines on top
Storage allotment will grow as your data set grows, in 10 GB increments
Each 10 GB chunk replicated 6 ways, across 3 availability zones
Launch a read replica in minutes
Bit of a price premium, depending on your needs
Logging
Centralize Application Logs
• Web servers (PHP error log, Apache logs)
• Cron jobs
• Asynchronous processes
• You need to be able to access these at any time
CloudWatch Logs
• CloudWatch Logs agent runs on EC2 instance
• Polls local log files on disk and copies to CW Logs
• Broken up into Log Groups and Log Streams
• Log Group: Apache access log, error log, PHP error log
• Log Stream: log entries from a specific instance
CW Logs Agent Install
[messages]
file = /var/log/messages
log_group_name = /var/log/messages
log_stream_name = {instance_id}
datetime_format = %b %d %H:%M:%S

$ sudo yum install awslogs
$ sudo service awslogs start
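With more than a couple of log files, stanzas like the one above could be generated rather than hand-written. Here is a rough sketch using Python's standard `configparser`; the `awslogs_stanzas` helper, the `app` section, and its path and timestamp format are hypothetical, and only the `messages` entry comes from the slide.

```python
import configparser
import io


def awslogs_stanzas(log_files: dict) -> str:
    """Render one agent stanza per log file.

    log_files maps a section name to a (path, datetime_format) pair. The
    log group is named after the file and the stream after the instance,
    matching the layout shown on the slide.
    """
    # interpolation=None keeps the % signs in datetime formats literal.
    parser = configparser.ConfigParser(interpolation=None)
    for section, (path, datetime_format) in log_files.items():
        parser[section] = {
            "file": path,
            "log_group_name": path,
            "log_stream_name": "{instance_id}",
            "datetime_format": datetime_format,
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


print(awslogs_stanzas({
    "messages": ("/var/log/messages", "%b %d %H:%M:%S"),
    # Hypothetical application log, for illustration only.
    "app": ("/var/log/app/app.log", "%Y-%m-%d %H:%M:%S"),
}))
```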
CW Logs Console
Make Them Searchable
• Elasticsearch (Amazon ES)
• OSS utilities to “tail” Elasticsearch indexes
• Amazon ES includes Kibana
• Allows you to spot trends over time
• Dig through data for specific entries, time periods, etc.
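Digging for specific entries in a time period might look like the query body sketched below, written in the Elasticsearch query DSL. The field names `message` and `@timestamp` are assumptions about how the log entries were indexed, and `recent_errors_query` is a hypothetical helper; it only builds the request body and sends nothing.

```python
import json


def recent_errors_query(phrase: str, minutes: int = 15, size: int = 50) -> dict:
    """Build a search body: entries containing a phrase, newest first."""
    return {
        "size": size,
        "sort": [{"@timestamp": {"order": "desc"}}],
        "query": {
            "bool": {
                # The log line must contain the phrase...
                "must": [{"match_phrase": {"message": phrase}}],
                # ...and fall inside the recent time window.
                "filter": [
                    {"range": {"@timestamp": {"gte": f"now-{minutes}m"}}}
                ],
            }
        },
    }


print(json.dumps(recent_errors_query("PHP Fatal error"), indent=2))
```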
Kibana
Proactively monitor and alert!
• Logs should really be empty day-to-day
• If they’re not right now, fix that first
• CloudWatch Alarms for log entries over threshold
• Amazon Simple Notification Service: get paged, wake up!
• Develop so as to avoid being woken up by pages
Don’t let your boss or your customers find a problem first!
Examples:
Number of PHP error log entries too high
Too many photos waiting to be processed
# of emails sent in the past 15 minutes is over a certain threshold
Database CPU utilization over 80%
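The logic behind those example alerts is just “count something over a window, compare to a threshold.” In practice CloudWatch does this evaluation for you; the sketch below only illustrates the idea, and the metric names, threshold values, and helper functions are all hypothetical.

```python
from datetime import datetime, timedelta


def count_recent(timestamps, now, window=timedelta(minutes=15)):
    """Count events that happened within the window ending at now."""
    return sum(1 for ts in timestamps if now - window <= ts <= now)


def breaches(metrics: dict, thresholds: dict) -> list:
    """Return the names of metrics that are over their threshold."""
    return sorted(
        name
        for name, value in metrics.items()
        if name in thresholds and value > thresholds[name]
    )


now = datetime(2017, 5, 26, 12, 0, 0)
emails_sent = [now - timedelta(minutes=m) for m in (1, 5, 14, 20, 45)]

metrics = {
    "emails_sent_15m": count_recent(emails_sent, now),
    "db_cpu_percent": 91.0,
    "photos_waiting": 12,
}
# Hypothetical thresholds; pick values that fit your own baseline.
thresholds = {"emails_sent_15m": 1000, "db_cpu_percent": 80.0, "photos_waiting": 500}

print(breaches(metrics, thresholds))
```

Anything this returns is what you would want routed to a notification topic so a person hears about it before a customer does.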
Recap
Operating Servers
• Choose an EC2 AMI strategy that suits your needs
• Don’t have a single point of failure (SPOF)
• Spread resources over AZs and/or Regions
• Keep SSL simple
Static Content
• Don’t serve it from your web servers!
• Utilize S3 for all static content storage
• Leverage CloudFront for better global performance
Security
• Leverage IAM Roles to grant access to types of servers
• Limit Security Groups to only the inbound and outbound traffic that’s needed
• Principle of Least Privilege is a great guide
Databases
• Again, spread across AZs
• Distribute read operations to read replicas
• More sleep: automatic failover is a great asset
Logging
• Use CloudWatch Logs for central logging
• Don’t just write the logs, monitor them!
• Alert on anomalies
• Find your bugs and errors before your users do!
Thanks to our Sponsors!
PHP[TEK] 2017
http://www.deshong.net/
@bdeshong
http://www.shootproof.com
We’re Hiring: http://www.shootproof.com/about/careers