AWS for Start-ups - Case Study - PeoplePerHour
DESCRIPTION
Customer Case Study - PeoplePerHour, Simos Kitiris, Co-Founder & CTO
TRANSCRIPT
0
AWS @ PeoplePerHour – Feb 2013
1
A short intro…
Simos Kitiris Co-Founder & CTO PeoplePerHour
Tom Fotherby Tech Lead / Dev Ops PeoplePerHour
2
About PeoplePerHour
A bit about PeoplePerHour…
• Site Demo >>
• PPH History >>
3
Team
• Currently 13 Devs
• 3 locations: Greece, UK, Poland
• Multiple staging environments
Tech
• Site built in PHP/MySQL
• iPhone & Android apps
Tech Overview
4
• 2007: Started with shared server
• 2008: Got our 1st dedicated server (yay!)
• 2009–2011: Scaled up to 4-5 servers in Managed Hosting
• Started using a couple of AWS services to solve some growing pains (e.g. S3)
• 2012: Moved fully to AWS (yay!!!)
Infrastructure Evolution
5
Stellar Pace of Innovation!
• New services/features every week (literally)
• Caters for most infrastructure needs
Flexibility
• Scale up/down on demand
• Pay for what you use – Per Hour ;-)
• Try new approaches!
Reduce time spent on scaling & management
• Allowing us to focus on building our product
Why AWS?
6
AWS Services we’re currently using
7
• DNS: Route53
• Load Balancers: ELB
• App/SOLR Servers: EC2
• Database: Multi-AZ RDS + Slaves
• Cache: ElastiCache
• Assets/Storage: S3
• CDN: CloudFront
• Email sending: SES
• NoSQL: DynamoDB
• Queuing: SQS
• Backups: RDS + S3 + Glacier
• Monitoring: CloudWatch
AWS Services we’re currently using
8
RDS (MySQL)
• Main issues with our own (pre-RDS) installation:
• Replication was becoming a pain to manage
• We never got a Master DB failover solution to work reliably
• Easy, low maintenance replication: 1-click. Time saver!
• Fault-tolerance: Multi-AZ Standby replica with automatic failover
• Automated Backups: point-in-time restore!
• Scaling: up/down sizing without downtime
• Monitoring: (CloudWatch) e.g. Slave Lag
AWS Highlights - RDS
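The "Slave Lag" monitoring bullet can be sketched as below. This is a minimal illustration, not PeoplePerHour's actual setup: it assumes a boto3 CloudWatch client is passed in as `cloudwatch`, reads the `ReplicaLag` metric that RDS publishes for read replicas, and uses a hypothetical 30-second threshold and hypothetical replica names.

```python
from datetime import datetime, timedelta, timezone


def average_replica_lag(cloudwatch, replica_id, minutes=5):
    """Return the mean ReplicaLag (seconds) over the last `minutes`, or None if no data."""
    end = datetime.now(timezone.utc)
    resp = cloudwatch.get_metric_statistics(
        Namespace="AWS/RDS",
        MetricName="ReplicaLag",
        Dimensions=[{"Name": "DBInstanceIdentifier", "Value": replica_id}],
        StartTime=end - timedelta(minutes=minutes),
        EndTime=end,
        Period=60,
        Statistics=["Average"],
    )
    points = resp["Datapoints"]
    if not points:
        return None
    return sum(p["Average"] for p in points) / len(points)


def healthy_replicas(cloudwatch, replica_ids, max_lag_seconds=30):
    """Keep only replicas whose recent lag is known and under the threshold.

    The 30-second default is an assumption for illustration only.
    """
    healthy = []
    for replica_id in replica_ids:
        lag = average_replica_lag(cloudwatch, replica_id)
        if lag is not None and lag <= max_lag_seconds:
            healthy.append(replica_id)
    return healthy
```

With boto3 installed, `healthy_replicas(boto3.client("cloudwatch"), ["pph-slave-1", "pph-slave-2"])` would return the replicas that are safe to read from; the replica names here are made up.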
9
Easy, automated DB dump processing
1. Create live db snapshot
2. Create new RDS instance from snapshot
3. Process/modify data on new RDS instance
4. ‘mysqldump’ from new DB and upload tarball to S3
5. Tidy up – shut down instances and clean old snapshots
RDS – Dump Processing Example
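The five steps above could be scripted roughly as follows. This is a sketch under stated assumptions, not the script PeoplePerHour ran: `rds` and `s3` are assumed to be boto3 clients, MySQL credentials are assumed to come from a local option file such as `~/.my.cnf`, and the identifier scheme, the `db-dumps/` key prefix and the `scrub_sql` statement are all hypothetical.

```python
import os
import subprocess
import tarfile
import tempfile
from datetime import datetime, timezone


def process_db_dump(rds, s3, live_db, bucket, scrub_sql, run=subprocess.run):
    """Snapshot the live DB, restore a copy, scrub it, dump it to S3, tidy up."""
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d-%H%M%S")
    snapshot_id = f"{live_db}-dump-{stamp}"   # hypothetical naming scheme
    temp_db = f"{live_db}-temp-{stamp}"
    key = f"db-dumps/{temp_db}.tar.gz"        # hypothetical S3 key layout

    # 1. Create live db snapshot and wait until it is usable.
    rds.create_db_snapshot(DBSnapshotIdentifier=snapshot_id,
                           DBInstanceIdentifier=live_db)
    rds.get_waiter("db_snapshot_available").wait(DBSnapshotIdentifier=snapshot_id)
    try:
        # 2. Create new RDS instance from snapshot.
        rds.restore_db_instance_from_db_snapshot(
            DBInstanceIdentifier=temp_db, DBSnapshotIdentifier=snapshot_id)
        try:
            rds.get_waiter("db_instance_available").wait(
                DBInstanceIdentifier=temp_db)
            desc = rds.describe_db_instances(DBInstanceIdentifier=temp_db)
            host = desc["DBInstances"][0]["Endpoint"]["Address"]

            # 3. Process/modify data on the new instance only.
            run(["mysql", "-h", host, "-e", scrub_sql], check=True)

            # 4. mysqldump from the new DB and upload a tarball to S3.
            with tempfile.TemporaryDirectory() as workdir:
                dump_path = os.path.join(workdir, "dump.sql")
                run(["mysqldump", "-h", host, "--all-databases",
                     f"--result-file={dump_path}"], check=True)
                tar_path = os.path.join(workdir, "dump.tar.gz")
                with tarfile.open(tar_path, "w:gz") as tar:
                    tar.add(dump_path, arcname="dump.sql")
                s3.upload_file(tar_path, bucket, key)
        finally:
            # 5a. Tidy up: shut down the temporary instance.
            rds.delete_db_instance(DBInstanceIdentifier=temp_db,
                                   SkipFinalSnapshot=True)
    finally:
        # 5b. Tidy up: remove the snapshot taken for this run.
        rds.delete_db_snapshot(DBSnapshotIdentifier=snapshot_id)
    return key
```

Because all modifications happen on the restored copy, the live database is only touched by the snapshot call. The nested `try`/`finally` blocks are a design choice so that the temporary instance and snapshot are removed even if a step in between fails.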
10
S3 – What problems did it solve?
• User uploads (profile images, attachments, etc.) in the millions were becoming difficult and time-consuming to handle
• Did not want to invest time to figure out the best strategy for scaling
• Wanted to offload serving/hosting of site assets from our servers
• Backups were kept in an expensive ‘managed’ solution that was very hard to access
Why do we like it?
• Cost-effective
• Limitless Space with Zero maintenance!
• Lifecycle policies to move older backups to Glacier
• Seamless integration with CloudFront (tip: get your cache headers right!)
• What’s not to like???
AWS Highlights – S3
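Two of the points above, lifecycle policies to Glacier and getting cache headers right, can be illustrated with a short sketch. The rule layout follows the S3 lifecycle configuration format accepted by boto3's `put_bucket_lifecycle_configuration`; the prefixes, day counts and max-age values are assumptions chosen for illustration, not figures from the talk.

```python
import mimetypes


def glacier_lifecycle(prefix, days_until_glacier, days_until_expiry=None):
    """Build a lifecycle configuration that moves objects under `prefix` to Glacier."""
    rule = {
        "ID": f"archive-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        "Transitions": [{"Days": days_until_glacier, "StorageClass": "GLACIER"}],
    }
    if days_until_expiry is not None:
        # Optionally delete very old backups altogether.
        rule["Expiration"] = {"Days": days_until_expiry}
    return {"Rules": [rule]}


def upload_args(key, long_lived_prefix="assets/"):
    """Pick upload metadata so CloudFront and browsers cache objects sensibly.

    Assumes (hypothetically) that keys under `long_lived_prefix` have
    versioned file names and can be cached for a year, while everything
    else gets a short lifetime.
    """
    if key.startswith(long_lived_prefix):
        cache_control = "public, max-age=31536000"
    else:
        cache_control = "public, max-age=300"
    args = {"CacheControl": cache_control}
    content_type, _ = mimetypes.guess_type(key)
    if content_type:
        args["ContentType"] = content_type
    return args
```

With a boto3 S3 client, these would be used as `s3.put_bucket_lifecycle_configuration(Bucket=bucket, LifecycleConfiguration=glacier_lifecycle("backups/", 30))` and `s3.upload_file(path, bucket, key, ExtraArgs=upload_args(key))`.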
11
New site launch – July 2012
• Total site overhaul
• Load/Performance testing stack
• Migration/Launch Rehearsals on parallel stack
• Bigger Instances during launch for quicker migrations
• Increased capacity for PR spikes following the launch
Recent examples: Flexibility
12
BBC coverage – Feb 2013
• PPH featured on BBC Breakfast & BBC News on the same day
• Did not know what traffic levels to anticipate
• Launched extra app servers, db slaves, SOLR slaves + cache
• Launched more servers during broadcast, scaled down a few hours after
• Site held up fine with lots of spare capacity! (we got a bit carried away)
Recent examples: Flexibility
13
Organic Growth + Seasonal Variations
• Scaled down to save cost towards the end of 2012
• Much faster growth than anticipated in January (beyond seasonal)
• Ended January 2013 with twice the traffic of December 2012
• Continuing to grow – nice to know we can easily scale up!
Recent examples: Flexibility
14
Redundancy
• Use AWS to eliminate single points of failure
• Go multi-AZ on all tiers (if possible)
Automation/orchestration
• Automate with Puppet or similar
• Put a flexible code deployment solution in place (+ GitHub)
Cost
• Experiment to find the right instance sizes for your needs
• Use reserved instances to save cost
• Implement auto-scaling (if it makes sense)
A few useful tips & best practices
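The reserved-instance tip comes down to simple break-even arithmetic, sketched below. All prices are hypothetical placeholders, not AWS list prices, and the model assumes a one-off upfront fee plus a discounted hourly rate.

```python
HOURS_PER_YEAR = 24 * 365


def reserved_break_even_hours(on_demand_hourly, reserved_upfront, reserved_hourly):
    """Hours of use after which a reserved instance becomes cheaper.

    Returns None when the reserved hourly rate offers no saving.
    """
    saving_per_hour = on_demand_hourly - reserved_hourly
    if saving_per_hour <= 0:
        return None
    return reserved_upfront / saving_per_hour


def cheaper_option(hours_used, on_demand_hourly, reserved_upfront, reserved_hourly):
    """Compare total cost for the expected hours of use over the term."""
    on_demand_cost = hours_used * on_demand_hourly
    reserved_cost = reserved_upfront + hours_used * reserved_hourly
    return "reserved" if reserved_cost < on_demand_cost else "on-demand"


# Hypothetical prices: 0.50/hour on demand vs 1000 upfront + 0.25/hour reserved.
print(reserved_break_even_hours(0.50, 1000, 0.25))        # 4000.0
print(cheaper_option(HOURS_PER_YEAR, 0.50, 1000, 0.25))   # reserved
print(cheaper_option(2000, 0.50, 1000, 0.25))             # on-demand
```

With these made-up numbers, a server that runs all year favours a reservation, while one launched only for short spikes (such as the extra capacity during a broadcast) is cheaper on demand.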
15
Fire away - there are no stupid questions!
Thank you - do ask questions!