Austin Cloud Users Group - August 23rd, 2011
Post on 20-Aug-2015
copperegg
Austin CUG - August 23rd, 2011
(presented by Eric Anderson)
anderson@copperegg.com
Wednesday, August 24, 11
About Us
CopperEgg
• Founded spring 2010
• Super real-time monitoring and analytics
About me (Eric Anderson)
• SysAdmin - Centaur - 1999-2007
  • 1400 compute nodes, ~50-100 file servers, ~200 misc systems, hundreds of TBs
• Software Engineer - StorSpeed - 2007-2010
  • built distributed file system cache for NAS acceleration product
• Co-Founder/COO - CopperEgg - 2010-Present
Why Cloud?
Important Differences:
• Installs in seconds – copy/paste install
• No configuration required - anyone can do it
All reliable and business-worthy systems need something like this:
Physical
• Physical security
• Redundant power
• Redundant AC
• Redundant & fast network
• Peak hardware
• Spare equipment
• Physical space (storage of spare stuff too)
• People to manage physical infrastructure
• Hardware repairs
Cloud
• Redundant infrastructure
  • Multi-AZ, Regions, storage, etc
• Resilient Applications
  • Designed for failure
• Performance measurement
• Automatic failover/recovery
• Security of your infrastructure
• Monitoring - up/down/status
• Visibility into system as a whole
• Don’t rely on cloud vendor!
  • Delayed, inaccurate
Why Cloud? (for CopperEgg)
Why did we go cloud?
• Needed to get building fast
• We didn’t know what we needed
• Just-in-time scaling
• Keep costs low and still provide awesome service levels
• Easy deployment for developers
  • Test different scenarios, try new setups, etc
• We use it for everything!
  • code repositories, tickets, email, phone, alerting, etc
What we were building
Storage analytics product
• visualize network attached storage in real-time
• massive amounts of data
  • analyzing 10 billion ops/day in beta, in real-time
• super real-time (seconds vs minutes)
Requirements:
• highly available
• super responsive
• gobble large amounts of analytics data in real-time
• historical data for 2 yrs
• great UI
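For a sense of scale, the 10 billion ops/day figure can be turned into a sustained per-second rate. This is only a back-of-the-envelope sketch: it assumes the load is spread evenly over 24 hours, which the slides do not claim, so real peaks would likely be higher. The variable names are illustrative.

```python
# Back-of-the-envelope rate for the beta workload quoted on the slide.
# Assumption (not from the talk): operations are spread evenly over 24 hours.
OPS_PER_DAY = 10_000_000_000      # "10 billion ops/day" from the slide
SECONDS_PER_DAY = 24 * 60 * 60    # 86,400 seconds

avg_ops_per_second = OPS_PER_DAY / SECONDS_PER_DAY
print(f"average rate: {avg_ops_per_second:,.0f} ops/second")
```

Under that even-load assumption the average comes to roughly 116 thousand operations per second.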
Where we started
Bad:
• Outgrew it before we outgrew it
• Slow!
So then what?
+ SimpleDB
Amazon RDS to save the day!
Good:
• Faster than SimpleDB
• Could scale the storage
Bad:
• Realized it still would not handle our dataset
  • Inserts were too slow
So then what?
+ SimpleDB
+ RDS
MySQL on EC2 to save the day!
Good:
• Faster than RDS
• Increased insert performance
  • Using some cheats to get the insert rate up
Bad:
• Still not good enough insert performance..
So then what?
+ SimpleDB
+ RDS
EC2 + MySQL
MySQL on Rackspace Cloud
Good:
• Faster than Amazon (CPU)
• Seemed cheaper
Bad:
• No easy way to scale across different zones or regions
• No way to expand storage per instance (whole instance only - costly!)
• Then we got the bill: they charge for data xfer between instances - OUCH
So then what?
+ SimpleDB
+ RDS
EC2 + MySQL
+ MySQL
Back to Amazon!
Why did we move back?
• Lots of great services: S3, EC2, EBS, Route 53, ELB (we use all of these)
• Even more: SQS, SES, etc
• Multiple regions and availability zones
• Scale-as-you-need: storage, memory, cpu, redundancy
• Documentation
We’re still happy with this.. (9 months and running)
+ SimpleDB
+ RDS
EC2 + MySQL
+ MySQL
EC2, EBS, MongoDB
What’s this NoSQL thing?
Realized maybe MySQL was not the best choice
• How about a NoSQL database?
• So we tested and measured every one we thought was worth looking at:
  • Redis
  • Tokyo Tyrant, Kyoto Cabinet
  • Cassandra
  • MongoDB
  • etc, etc, etc (there are a lot)
MongoDB won
MongoDB won the award - why?
• Redundant
• Scalable
• Persistent data-store
• Handles large amounts of data
• Awesome user community
• Vendor support
• Open source
• Lots of momentum
Where are we now?
Needed a way to monitor our site:
• Requirements:
  • Know right away when problems occur
  • See into the performance of the system
  • See historical trends as we grow the business
  • Super real-time product needs super real-time monitoring
• Not satisfied with existing solutions
  • slow updates (1m or 5m is way too slow - not real-time)
  • not ‘cloud friendly’
  • pain to maintain
  • some are pricey
Not real-time?
Then what *is* real-time?
• Smallest amount of time you can comfortably have poor service before someone notices and changes their behavior.
• Example:
  • Web site can only be slow/unavailable for a few seconds before people leave
  • Email can be slow for tens of seconds before people get grumpy (or less depending on the people!)
  • Twitter - well, we’ll leave that one for you to decide
So, if seconds is the yardstick for measuring poor performance, why do we monitor every 1 or 5 minutes?
CPU Usage: 5min sampling
[Chart: CPU usage, y-axis up to 100, x-axis 5:00 PM to 5:05 PM]
Here’s what a 5 minute sample provides
• Doesn’t look like much is happening
• Users should not be complaining right?
CPU Usage: 1min sampling
[Chart: CPU usage, y-axis 0 to 100, x-axis 5:00 PM to 5:05 PM in 1 minute steps]
Same data - 1 minute sample
• Looks like there was some kind of cpu activity at 5:01pm - 5:02pm
• Still no issue though - right?
CPU Usage: 5 second sampling
[Chart: CPU usage, y-axis 0 to 100, x-axis 5:00 PM to 5:05 PM in 1 minute steps]
Same data - 5s sampling
• Becomes clear there was something happening:
  • between 5:01:10pm - 5:01:25pm
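The effect in the three CPU charts can be reproduced with a small Python sketch. Everything here is made up for illustration: the trace is synthetic (a 5% baseline with a 15 second burst at 100%, placed in the 5:01:10pm - 5:01:25pm window from the slide), and it assumes each monitoring point is the average over its sampling interval. A tool that takes point-in-time readings instead could miss the burst entirely. The `sample` helper is hypothetical, not part of any product.

```python
# Illustrative only: a synthetic per-second CPU trace, not data from the talk.
# Assumptions: 5% baseline load and a 15 second burst at 100%, placed at
# 5:01:10 - 5:01:25 to match the window named on the slide.
BASELINE, SPIKE = 5.0, 100.0
SPIKE_START, SPIKE_END = 70, 85   # seconds after 5:00:00 PM
DURATION = 300                    # 5:00 PM to 5:05 PM, in seconds

trace = [SPIKE if SPIKE_START <= t < SPIKE_END else BASELINE
         for t in range(DURATION)]

def sample(trace, interval):
    """Average the per-second trace over consecutive windows of `interval` seconds."""
    return [sum(trace[i:i + interval]) / interval
            for i in range(0, len(trace), interval)]

for label, interval in [("5 min", 300), ("1 min", 60), ("5 sec", 5)]:
    points = sample(trace, interval)
    print(f"{label:>5} sampling: {len(points):3d} points, peak {max(points):6.2f}")
```

With these made-up numbers the single 5 minute point averages 9.75, the busiest 1 minute point reaches 28.75, and only the 5 second view shows the burst at its full 100.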
So we rolled our own
RevealCloud
• Turns out a lot of people agreed with us
• Highlights:
  • Built on our super real-time analytics engine
  • Updates in seconds vs minutes
  • Easy to install, no config required
  • Great looking and usable interface
  • Works anywhere (public/private cloud, vm, bare metal)
Questions
Demo
Demo Screenshots