Automate Your Backups at Scale
![Page 1: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/1.jpg)
AWS Government, Education, and Nonprofit Symposium, Washington, DC | June 25-26, 2015
©2015, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Automate Your Backups at Scale
Jeff Gentile, The Common Application :: DevOps Engineer
Aaron Armstrong, The Common Application :: Director of Engineering
![Page 2: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/2.jpg)
About the Organization
• Founded in 1975
• Not-for-profit
• 500+ colleges / universities in US & abroad
• 800,000 Students
• 3.6 million Application Forms
• 14 million Counselor & Teacher Recommendations
![Page 3: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/3.jpg)
AWS at the Common App
• Run all prod and dev assets on AWS (nothing on-premises)
• Use most “core” services: Amazon EC2, Amazon S3, Amazon DynamoDB, Amazon RDS, Amazon VPC
• Some app-specific services: Amazon SWF, Amazon SQS
• Completing 2nd year of production operation in AWS
• Continually improving efficiencies, controls, and costs
Implement Learn Refine
![Page 4: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/4.jpg)
Agenda
• The problem we faced
• DevOps Ecosystem
• Where we wanted to go as an organization / group
• What options were available
• The solution we implemented
• Where we are now
• How to leverage and other benefits
• Questions
![Page 5: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/5.jpg)
The Problem We Faced…
• Long-lived (static) vs. scale-out servers
• Restorability
– Quick return to operation
– Retrieval of lost data
• Use case example: FTP
– Hundreds of GB of data
– Everyday use
– Hundreds of users
– Critical system
![Page 6: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/6.jpg)
DevOps Ecosystem
[Diagram: EC2 Instance → Rundeck Job → Python Snapshot Script → Amazon EBS Snapshots → Rundeck Job → Python Restore Script → Restored EC2 Instance]
![Page 7: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/7.jpg)
Where We Wanted to Go…
• Automated schedule
• Rapid restore to normal operation
• Manual and scripted restore options
• Add / remove servers over time
• Monitoring and oversight of backup process
![Page 8: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/8.jpg)
What Options Were Available?
• Third-Party Software (e.g., Skeddly)
– Quick implementation
• Homegrown Implementation
– Growing expertise in Python and Boto
– Didn’t want another system to manage / learn
– Tailored for specific needs
– No net new costs
![Page 9: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/9.jpg)
The Solution We Implemented…
• Snapshots using the Python SDK, Boto
• Instance and snapshot tagging
• Snapshot removal based on tag
• Push button restore process
– More soon…
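The backup side of the bullets above can be sketched roughly as follows. This is an illustration only, not the presenters' script: it assumes boto3 (the successor to the Boto library named on the slide), and the tag names `Backup`, `InstanceId`, and `DeleteOn`, the function name, and the 7-day retention default are all hypothetical. The `ec2` argument is expected to be a `boto3.client("ec2")`.

```python
from datetime import date, timedelta


def snapshot_tagged_instances(ec2, today=None, retention_days=7):
    """Snapshot every EBS volume attached to instances tagged Backup=true.

    `ec2` is expected to be a boto3 EC2 client. The tag names used here
    ("Backup", "InstanceId", "DeleteOn") are assumptions for this sketch.
    Pagination of describe_instances is omitted for brevity.
    """
    today = today or date.today()
    delete_on = (today + timedelta(days=retention_days)).isoformat()
    response = ec2.describe_instances(
        Filters=[{"Name": "tag:Backup", "Values": ["true"]}]
    )
    created = []
    for reservation in response["Reservations"]:
        for instance in reservation["Instances"]:
            for mapping in instance.get("BlockDeviceMappings", []):
                if "Ebs" not in mapping:
                    continue  # skip instance-store devices
                snapshot = ec2.create_snapshot(
                    VolumeId=mapping["Ebs"]["VolumeId"],
                    Description="Automated backup of " + instance["InstanceId"],
                    TagSpecifications=[{
                        "ResourceType": "snapshot",
                        "Tags": [
                            {"Key": "InstanceId", "Value": instance["InstanceId"]},
                            {"Key": "DeleteOn", "Value": delete_on},
                        ],
                    }],
                )
                created.append(snapshot["SnapshotId"])
    return created
```

Tagging the instance decides what gets backed up, and tagging the snapshot records when it may be removed, so adding or removing a server from the schedule is only a tag change.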
![Page 10: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/10.jpg)
Solution continued… (Backup)
• Instance Tags
• Snapshot Tags
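Tag-based snapshot removal might look something like the sketch below. It assumes each snapshot carries a hypothetical `DeleteOn` tag holding an ISO date; the slides do not say which tag names or retention rules were actually used. The snapshot dictionaries follow the shape of the `Snapshots` list returned by boto3's `describe_snapshots`.

```python
from datetime import date


def expired_snapshot_ids(snapshots, today):
    """Return IDs of snapshots whose DeleteOn tag is on or before `today`.

    Snapshots without a DeleteOn tag (an assumed tag name) are never
    selected, so untagged or manually created snapshots are left alone.
    """
    expired = []
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        delete_on = tags.get("DeleteOn")
        if delete_on and date.fromisoformat(delete_on) <= today:
            expired.append(snap["SnapshotId"])
    return expired


def purge_expired(ec2, today):
    """Delete expired snapshots owned by this account.

    `ec2` is expected to be a boto3 EC2 client; pagination is omitted.
    """
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "tag-key", "Values": ["DeleteOn"]}],
    )
    ids = expired_snapshot_ids(response["Snapshots"], today)
    for snapshot_id in ids:
        ec2.delete_snapshot(SnapshotId=snapshot_id)
    return ids
```

Keeping the selection logic separate from the AWS calls makes the retention rule easy to check without touching real snapshots.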
![Page 11: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/11.jpg)
Solution continued… (Restore)
• Rundeck UI
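The slide shows only the Rundeck UI, so the restore script itself is not visible. One plausible core step, sketched here under assumptions, is to pick the newest completed snapshot of a volume and create a fresh volume from it. The function names are hypothetical, `ec2` is expected to be a `boto3.client("ec2")`, and attaching the new volume to an instance is left out.

```python
def latest_snapshot(snapshots, volume_id):
    """Return the newest completed snapshot of `volume_id`, or None.

    `snapshots` follows the shape of boto3's describe_snapshots output,
    where StartTime is a datetime.
    """
    candidates = [
        s for s in snapshots
        if s["VolumeId"] == volume_id and s["State"] == "completed"
    ]
    return max(candidates, key=lambda s: s["StartTime"], default=None)


def restore_volume(ec2, snapshots, volume_id, availability_zone):
    """Create a new volume from the newest snapshot and return its ID."""
    snapshot = latest_snapshot(snapshots, volume_id)
    if snapshot is None:
        raise LookupError("no completed snapshot for " + volume_id)
    volume = ec2.create_volume(
        SnapshotId=snapshot["SnapshotId"],
        AvailabilityZone=availability_zone,
    )
    return volume["VolumeId"]
```

A job runner such as Rundeck could call a script like this with the volume ID and availability zone as job options, which is one way to get a push button restore.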
![Page 12: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/12.jpg)
Solution continued… (Restore)
• How long will it take to restore?
15:00?
10:00?
5:00?
1:35
![Page 13: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/13.jpg)
Inspect What We Expect…
• Weekly reporting via Rundeck showing snapshot status
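A status report of this kind could be built along the following lines. This is a sketch under assumptions, not the report shown in the talk: snapshots are matched to instances through a hypothetical `InstanceId` tag, and the one-day freshness threshold and status labels are illustrative.

```python
from datetime import timedelta


def snapshot_status(instance_ids, snapshots, now, max_age=timedelta(days=1)):
    """Return one report row per instance: (instance_id, newest_start, status).

    Status is "OK" when the newest snapshot is no older than `max_age`,
    "STALE" when it is older, and "MISSING" when no snapshot exists.
    With real boto3 data StartTime is timezone-aware, so `now` must be
    timezone-aware as well.
    """
    newest = {}
    for snap in snapshots:
        tags = {t["Key"]: t["Value"] for t in snap.get("Tags", [])}
        owner = tags.get("InstanceId")  # assumed tag name
        if owner and (owner not in newest or snap["StartTime"] > newest[owner]):
            newest[owner] = snap["StartTime"]
    rows = []
    for instance_id in instance_ids:
        started = newest.get(instance_id)
        if started is None:
            status = "MISSING"
        elif now - started > max_age:
            status = "STALE"
        else:
            status = "OK"
        rows.append((instance_id, started, status))
    return rows
```

Printing these rows from a scheduled job gives a simple way to confirm that every instance expected to have backups actually has a recent one.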
![Page 14: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/14.jpg)
Where We Are…
• Automated schedule
• Rapid restore to normal operation
• Manual and scripted restore options
• Add / remove servers over time
• Monitoring and oversight of backup process
![Page 15: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/15.jpg)
How to Leverage…
• Figure out what your backup / restore strategy should be
– Live snapshots
– Stop instance, then snapshot
– No snapshot, native application backups
• Determine which instances need regular backups
– Create backup schedules for those instances
• Document the process!
– Have a fresh set of eyes follow the steps for verification
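For the "stop instance, then snapshot" strategy listed above, a minimal sketch might look like this. It is an assumption-laden illustration rather than the presenters' code: the function name is hypothetical and `ec2` is expected to be a `boto3.client("ec2")`.

```python
def snapshot_with_stop(ec2, instance_id, volume_id):
    """Stop an instance, start a snapshot of one volume, then restart it.

    The instance is restarted even if the snapshot request fails.
    create_snapshot returns once the snapshot has been initiated, so the
    instance does not stay down while the snapshot data is copied.
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # Block until the instance has fully stopped so no writes are in flight.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])
    try:
        snapshot = ec2.create_snapshot(
            VolumeId=volume_id,
            Description="Offline backup of " + instance_id,
        )
    finally:
        ec2.start_instances(InstanceIds=[instance_id])
    return snapshot["SnapshotId"]
```

Compared with a live snapshot, this trades a short outage for a volume that is guaranteed to be quiescent when the snapshot is taken.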
![Page 16: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/16.jpg)
Other Benefits
• Copy snapshots / AMIs across regions / accounts
– Load Testing
– Disaster Recovery
• Incremental backup
– First snapshot takes the longest
– Subsequent backups have lower storage overhead (i.e., cost efficiency)
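Copying a snapshot to another region could be sketched as below. The function name and the `purpose` parameter are hypothetical. With boto3, the copy is requested from a client bound to the destination region, for example `boto3.client("ec2", region_name="us-west-2")`.

```python
def copy_snapshot_to_region(dest_ec2, source_region, snapshot_id, purpose):
    """Copy a snapshot into the region that `dest_ec2` is bound to.

    `dest_ec2` is expected to be a boto3 EC2 client created for the
    destination region. Returns the ID of the new snapshot there.
    """
    response = dest_ec2.copy_snapshot(
        SourceRegion=source_region,
        SourceSnapshotId=snapshot_id,
        Description="Copy of %s from %s for %s"
        % (snapshot_id, source_region, purpose),
    )
    return response["SnapshotId"]
```

Sharing a snapshot with another account is a separate step from copying it across regions; boto3 exposes `modify_snapshot_attribute` for that.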
![Page 17: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/17.jpg)
Questions
![Page 18: Automate Your Backups at Scale](https://reader034.vdocument.in/reader034/viewer/2022042716/55cd197cbb61eb507a8b474d/html5/thumbnails/18.jpg)
Thank You.
This presentation will be uploaded to SlideShare the week following the Symposium.
http://www.slideshare.net/AmazonWebServices