ScaleBase Webinar: Scaling MySQL - Sharding Made Easy!
DESCRIPTION
Home-grown sharding is hard - REALLY HARD! ScaleBase scales out MySQL, delivering all the benefits of MySQL sharding with NONE of the sharding headaches. This webinar explains:
• MySQL scale-out without embedding code and re-writing apps
• Successful sharding on Amazon and private clouds
• Single vs. multiple shards per server
• Eliminating data silos
• Creating a redundant, fault-tolerant architecture with no single point of failure
• Re-balancing and splitting shards
TRANSCRIPT
Scaling MySQL – Sharding Made Easy
Agenda
• Scalability Issues
• MySQL 5.6
• Why Do-It-Yourself (DIY) Sharding Sucks
• ScaleBase Data Distribution:
– Successful sharding on Amazon and private clouds
– Single vs. multiple shards per server
– Eliminating data silos
– Creating a redundant, fault-tolerant architecture
– Re-balancing and splitting shards
• Q & A
Doron Levari, Founder & CTO
A technologist and long-time veteran of the
database industry. Prior to founding ScaleBase,
Doron was CEO of Aluna.
What We Do
Simply and cost-effectively scale
MySQL to support an infinite
number of users, transactions and data
with NO disruption to the existing infrastructure
Scalability Issues and MySQL 5.6
MySQL Scalability Challenges
• Too many transactions
• Too many users
• Too much data
• Too many writes
• Capacity
• Throughput
• Performance inconsistencies
Improvements in MySQL 5.6 – Single Box
Partitioning Improvements
– Explicit Partition Selection:
SELECT * FROM employees PARTITION (p0, p2);
– Import / Export for Partitioned Tables: Bring a new data set into a partitioned table, or export a partition to manage it as a regular table
ALTER TABLE e EXCHANGE PARTITION p0 WITH TABLE e2;
http://dev.mysql.com/tech-resources/articles/whats-new-in-mysql-5.6.html
Replication Improvements
– Optimizations to Row-Based Replication
– Multi-Threaded Slaves
– Improvements to Data Integrity
– Crash-Safe Slaves
– Replication Checksums
SCALABILITY issues remain due to the limitations of a single box:
To ensure ACID, you still face limitations with:
- Memory management
- Thread management
- Semaphores
- Locking
- Recovery tasks
No new functionality for sharing workloads across multiple boxes
What Are My Options?
1. More/Bigger Hardware?
– Temporary fix…you will need new hardware again
– More memory…helps mostly with “reads,” but not with “writes”
– Every write operation is at least 4 write operations in database, plus multiple activities in the database engine memory
2. Application re-architecture?
– Steer workload away from the database
– Example: introduce a caching layer
– Force application re-writes; new test & QA cycles
3. Do it Yourself Sharding?
4. Migrate to new database architecture
– Other RDBMS/NewSQL / NoSQL?
– Force application re-writes; new test & QA cycles
– ACID/Durability Issues
Scale Out your Existing MySQL
• Keep your MySQL - keep your InnoDB
• Ecosystem compatibility, preserve skills
• 100% application compatibility
• Smoother migration, no down-time, no forklift
• Your data is safe
• No “in-memory” magic
• No “in-memory” size limit
Don’t throw out the baby with the bath water!
Why Do It Yourself Sharding “Sucks”
What is Sharding?
Wikipedia - Shard (database architecture) http://en.wikipedia.org/wiki/Shard_(database_architecture)
A database shard is a horizontal partition in a database or search engine. Each individual partition
is referred to as a shard.
Horizontal partitioning is a database design principle whereby rows of a database table are held
separately, rather than being split into columns.
Each partition forms part of a shard, which may in turn be located on a separate database server or
physical location.
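The definition above can be illustrated with a minimal sketch of horizontal partitioning, in which whole rows are assigned to shards by a hash of a sharding key. The shard names, the `customer_id` key, and the use of CRC32 are assumptions made for illustration; the slides do not say how ScaleBase or any particular DIY scheme chooses a shard.

```python
import zlib

# Hypothetical shard names; the slides do not name any.
SHARDS = ["shard_0", "shard_1", "shard_2", "shard_3"]


def shard_for(key: str) -> str:
    """Pick a shard from a stable hash of the row's sharding key."""
    # crc32 is stable across runs, unlike Python's built-in hash() for str.
    return SHARDS[zlib.crc32(key.encode("utf-8")) % len(SHARDS)]


def partition_rows(rows, key_column):
    """Group whole rows by shard (a horizontal split: rows, not columns)."""
    placement = {name: [] for name in SHARDS}
    for row in rows:
        placement[shard_for(str(row[key_column]))].append(row)
    return placement


rows = [{"customer_id": i, "name": f"customer-{i}"} for i in range(8)]
placement = partition_rows(rows, "customer_id")
```

Every row keeps all of its columns and lands on exactly one shard, which is what distinguishes this from splitting a table by columns.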
DIY Sharding Challenges
Applications must be modified to support multiple shards
• Maintaining DB ops and IPs in the app
• Non-optimized sharding strategies
– No good way to maintain global tables replicated across all databases
• Sacrifices development agility, additional administrative complexity
• Results in database silos
• Database ecosystem breaks because the application “conceals” sharding strategies internally
• Risks for data inconsistency
• Adding and removing databases is not supported…overprovisioning…
• Jeopardizes high availability, backups & disaster recovery
• Demands custom application code that can fail ACID compliance
DIY Sharding Challenges
Challenges exist because application code changes are required to support multiple
database instances.
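To make these challenges concrete, here is a rough sketch of the kind of routing logic a DIY approach pushes into application code. The host addresses, table names, and modulo rule are all hypothetical; the point is only that the shard map and the global-table handling end up hard-coded in the app.

```python
# Hypothetical hosts hard-coded in the application: the
# "DB ops and IPs in the app" problem from the slide.
SHARD_HOSTS = {0: "10.0.0.11", 1: "10.0.0.12", 2: "10.0.0.13"}

# Hypothetical global (reference) tables that every shard must hold a copy of.
GLOBAL_TABLES = {"countries", "currencies"}


def host_for_customer(customer_id: int) -> str:
    """Route a customer's rows to one host with a fixed modulo rule."""
    return SHARD_HOSTS[customer_id % len(SHARD_HOSTS)]


def plan_write(table: str, customer_id=None):
    """Return the hosts the app itself must write to for one logical write."""
    if table in GLOBAL_TABLES:
        # The app has to repeat the write on every shard; if one write
        # fails, the copies diverge.
        return sorted(SHARD_HOSTS.values())
    if customer_id is None:
        raise ValueError("sharded tables need a customer_id to route on")
    return [host_for_customer(customer_id)]
```

Changing the number of hosts changes the result of the modulo rule for existing keys, which is one reason adding or removing databases is hard in a scheme like this.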
ScaleBase Data Distribution Overview
Data Distribution: Application Experience
Without ScaleBase: App must be customized to support shards
With ScaleBase: App sees ONE database… …and doesn’t require any customization
ScaleBase acts as a proxy between the app and the database, virtualizing the database environment
Manual Sharding versus ScaleBase
Sharding Limitations:
• Major app rewrite, maintaining code
• Maintaining DB ops & IPs in the app
• Administration/3rd party tools are broken
• DB silos/Database ecosystem is blind
– Application “hides” sharding strategies
• Non-optimized data distribution policy
– No good way to maintain global tables, replicated across all databases
• Sacrifices development agility
• Adding/removing DBs is not supported
• Risks for data inconsistency
• Demands custom application code that can fail ACID compliance
• Jeopardizes high availability, backups, and disaster recovery
ScaleBase Benefits:
• No hard-coding application re-writes
• Unlimited scalability
• Improve performance
• Real time elasticity
• ACID compliance
• Verified data consistency
• Real time monitoring, traffic analysis
• Carefully analyze distribution policy
• Enable system upgrades and updates
• Simplified, centralized admin
– Adding users
– Changing schemas
– Maintenance scripts
– Management queries
Typical ScaleBase Data Traffic Manager Deployment
[Diagram: Application Servers, BI, and Management tools connect to Databases A-D, each paired with a replica (Replica A-D). Callouts: Unlimited Scale; ScaleBase Architecture is Fault Tolerant]
ScaleBase Data Distribution – In Detail
ScaleBase Enables MySQL Scale Out without Re-writing Apps
• Data distribution and scale-out is part of the database architecture, not the application
• One IP to connect to, and “see a unified database”
– The application
– Entire ecosystem (ETL, mysqldump, PHPMyAdmin)
– No special sharding wizard developer
– No app re-design, re-dev, re-QA, re-test, re-deploy
– No hard-coded variables lost in the code
– No special documentation
ScaleBase Enables Scale Out on AWS and Private Clouds
• A virtualized DB environment makes it easy to change real infrastructure, because it’s decoupled from the application
• No cloud makes your database elastic
• ScaleBase enables elasticity of MySQL in the cloud (EC2, RDS, etc.)
Scale-up hits AWS’s tiered configuration limits fast
Scale-out is unlimited and gives cloud flexibility
ScaleBase Supports Scale Out on Single & Multiple Machines
Advantages of several shards on one machine:
– Several smaller MySQL instances better utilize cores, memory
– When data grows, each instance can later on migrate to a bigger machine of its own
Advantages of several shards on multiple machines:
– Leverage commodity hardware
– When a machine reaches its limits, ScaleBase enables online data redistribution (resharding) and shard-split
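One way to picture the two layouts is a map from logical shards to physical MySQL instances. The sketch below starts with several instances on one machine and later moves one shard to a machine of its own; host names and ports are hypothetical, and this only models the bookkeeping, not the data copy.

```python
from collections import defaultdict

# Hypothetical layout: three MySQL instances (one per shard) on one machine,
# each listening on its own port.
shard_location = {
    "shard_0": ("db-host-1", 3306),
    "shard_1": ("db-host-1", 3307),
    "shard_2": ("db-host-1", 3308),
}


def shards_by_host(locations):
    """Group shard names by the machine that currently hosts them."""
    grouped = defaultdict(list)
    for shard, (host, _port) in sorted(locations.items()):
        grouped[host].append(shard)
    return dict(grouped)


def move_shard(locations, shard, host, port=3306):
    """Return a new map with one shard relocated to a machine of its own."""
    if shard not in locations:
        raise KeyError(shard)
    updated = dict(locations)
    updated[shard] = (host, port)
    return updated


after_growth = move_shard(shard_location, "shard_2", "db-host-2")
```

Because only the map changes, anything that routes through it does not need to know whether a shard shares a machine or has one to itself.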
ScaleBase Enables Splitting Shards
• ScaleBase also redistributes data across the array to eliminate hot spots, splitting the hot spot into two databases
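As an illustration of what a split does to routing, the sketch below assumes a range-based layout in which each database owns the keys from its lower bound up to the next bound. The slides do not say that ScaleBase uses ranges; the bounds and database names are invented for the example.

```python
import bisect

# Hypothetical range layout: bounds[i] is the lowest key owned by names[i].
bounds = [0, 1000, 2000]
names = ["db_a", "db_b", "db_c"]


def lookup(bounds, names, key):
    """Find the database whose key range contains `key`."""
    return names[bisect.bisect_right(bounds, key) - 1]


def split(bounds, names, hot, split_key, new_name):
    """Split the hot database's range at split_key; the upper half moves to new_name."""
    i = names.index(hot)
    upper = bounds[i + 1] if i + 1 < len(bounds) else None
    if split_key <= bounds[i] or (upper is not None and split_key >= upper):
        raise ValueError("split_key must fall strictly inside the hot range")
    new_bounds = bounds[: i + 1] + [split_key] + bounds[i + 1 :]
    new_names = names[: i + 1] + [new_name] + names[i + 1 :]
    return new_bounds, new_names


# db_b is the hot spot: split its range 1000-1999 at 1500.
new_bounds, new_names = split(bounds, names, "db_b", 1500, "db_b2")
```

Keys outside the hot range keep resolving to the same database; only the upper half of the split range is redirected.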
ScaleBase Re-balances Shards
• Special analysis and alerts about approaching limits
• ScaleBase dynamically redistributes data (resharding) - moving the data across the array from the over-utilized to the under-utilized
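A simple way to reason about rebalancing is a greedy loop that repeatedly moves a chunk of data from the most loaded database to the least loaded one. This is a sketch of the general idea only, not ScaleBase's algorithm; the bucket names and sizes are made up.

```python
def rebalance(load):
    """Greedily move buckets from the most loaded database to the least loaded.

    `load` maps database -> {bucket: size}. A bucket is moved only when its
    size is smaller than the gap between the two databases, so each move
    strictly evens out the pair and the loop terminates.
    """
    load = {db: dict(buckets) for db, buckets in load.items()}
    moves = []
    while True:
        totals = {db: sum(buckets.values()) for db, buckets in load.items()}
        hot = max(totals, key=totals.get)
        cold = min(totals, key=totals.get)
        gap = totals[hot] - totals[cold]
        candidates = [(size, name) for name, size in load[hot].items() if 0 < size < gap]
        if not candidates:
            return load, moves
        _size, name = max(candidates)
        load[cold][name] = load[hot].pop(name)
        moves.append((name, hot, cold))


# Hypothetical sizes: db_a is over-utilized, db_b is under-utilized.
before = {"db_a": {"b1": 40, "b2": 30, "b3": 30}, "db_b": {"b4": 20}}
after, moves = rebalance(before)
```

With these sizes a single move of the 40-unit bucket leaves both databases holding 60 units.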
ScaleBase Provides Optimal Data Distribution Policies
A good data distribution policy ensures that a specific transaction is directed to a specific database
[Diagram labels: 1,000 transactions; 250 transactions (shown four times); 1,000 transactions]
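One reading of the 1,000 and 250 figures is that a transaction carrying the distribution key can be sent to a single database, while one without it has to run everywhere. The sketch below counts the resulting load under that assumption; the `customer_id` key, the modulo rule, and the four database names are hypothetical.

```python
from collections import Counter

DATABASES = ["db_a", "db_b", "db_c", "db_d"]


def targets(txn):
    """Databases that must execute this transaction."""
    key = txn.get("customer_id")
    if key is None:
        # No distribution key: every database has to run it.
        return DATABASES
    return [DATABASES[key % len(DATABASES)]]


def load_per_database(txns):
    """Count how many transactions each database ends up executing."""
    counts = Counter()
    for txn in txns:
        counts.update(targets(txn))
    return counts


keyed = [{"customer_id": i} for i in range(1000)]
unkeyed = [{"customer_id": None} for _ in range(1000)]

keyed_load = load_per_database(keyed)      # 250 per database
unkeyed_load = load_per_database(unkeyed)  # 1,000 per database
```

Under a policy that matches the workload, adding databases divides the work; without one, every database still does all of it.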
ScaleBase Eliminates Data Silos
When a query needs data from several databases, ScaleBase:
– Runs the query in parallel on all databases
– Aggregates results into one meaningful result-set to be returned to the client – the same result-set that would have been returned from a single DB!
– Including cross-db GROUP BY, ORDER BY, aggregate functions
– Including cross-db JOIN operations
– Enables 2-phase commit for transactions spanning multiple databases
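The parallel-query and aggregation steps in the list above follow a general scatter-gather pattern, sketched here with in-memory lists standing in for the databases. This illustrates the pattern only and is not ScaleBase's implementation; the data, column names, and helper functions are invented, and joins and 2-phase commit are not covered.

```python
import heapq
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-ins for three databases holding slices of one table.
SHARDS = {
    "db_a": [{"country": "US", "amount": 30}, {"country": "FR", "amount": 10}],
    "db_b": [{"country": "US", "amount": 20}],
    "db_c": [{"country": "DE", "amount": 50}, {"country": "FR", "amount": 5}],
}


def scatter(per_shard_query):
    """Run the same query function on every shard in parallel."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        return list(pool.map(per_shard_query, SHARDS.values()))


def order_by_amount():
    """Cross-shard ORDER BY: sort on each shard, then merge the sorted streams."""
    partials = scatter(lambda rows: sorted(rows, key=lambda r: r["amount"]))
    return list(heapq.merge(*partials, key=lambda r: r["amount"]))


def partial_sums(rows):
    """Per-shard SUM(amount) grouped by country."""
    sums = Counter()
    for row in rows:
        sums[row["country"]] += row["amount"]
    return sums


def sum_amount_by_country():
    """Cross-shard GROUP BY: combine the per-shard partial sums."""
    total = Counter()
    for partial in scatter(partial_sums):
        total.update(partial)
    return dict(total)
```

Each shard does its own sorting or grouping, so the merge step only touches already-reduced results, and the caller gets one result set as if a single database had answered.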
ScaleBase Provides a Fault Tolerant Architecture
[Diagram: Application Servers, BI, and Management tools connect to Databases A-D, each paired with a replica (Replica A-D). Callouts: Fully Redundant; Resilience to failures; Scheduled maintenance without downtime]
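The database-plus-replica pairing suggests a failover rule along the following lines: try the database first, and fall back to its replica if the connection fails. This is a minimal sketch under that assumption; the endpoint names mirror the diagram labels, while `run_with_failover` and `fake_run` are hypothetical helpers, not ScaleBase APIs.

```python
# Pairing taken from the diagram labels (Database A / Replica A, and so on).
REPLICA_OF = {
    "database_a": "replica_a",
    "database_b": "replica_b",
    "database_c": "replica_c",
    "database_d": "replica_d",
}


def run_with_failover(database, run):
    """Try the database first, then its replica; return (endpoint, result)."""
    last_error = None
    for endpoint in (database, REPLICA_OF[database]):
        try:
            return endpoint, run(endpoint)
        except ConnectionError as exc:
            last_error = exc
    raise ConnectionError(f"no endpoint available for {database}") from last_error


def fake_run(endpoint):
    """Stand-in for executing a statement; pretends database_a is down."""
    if endpoint == "database_a":
        raise ConnectionError("database_a is down")
    return f"ok from {endpoint}"
```

The same rule covers scheduled maintenance: taking a database offline looks like a failure to the caller, and traffic shifts to the replica until it returns.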
Summary
ScaleBase Delivers Scalability
Scale to Unlimited Throughput
No Specialized Hardware
No Re-architecture
No Application Rewrites
Easily Scale your MySQL Database
[Chart: Throughput (TPM), Total DB Size (MB), and # Connections plotted against Number of Databases (1, 2, 4, 6, 8, 10, 14); vertical axis from 0 to 160000. Data labels: 6000, 12000, 24000, 36000, 48000, 60000, 84000 and 500, 500, 1000, 1500, 1500, 2000, 2500]
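A quick arithmetic check on the chart's data labels: if the run 6000 through 84000 is read as one series plotted against 1, 2, 4, 6, 8, 10, and 14 databases (an assumption, since the extracted labels do not say which series they belong to), the value per database is constant, which is what linear scale-out would look like.

```python
# Values as they appear in the chart's data labels; which series they
# belong to is an assumption.
databases = [1, 2, 4, 6, 8, 10, 14]
series = [6000, 12000, 24000, 36000, 48000, 60000, 84000]

# Value per database at each point on the horizontal axis.
per_database = [value / count for value, count in zip(series, databases)]

# True when every point has the same per-database value.
is_linear = len(set(per_database)) == 1
```

Every point works out to 6000 per database, so going from 1 to 14 databases multiplies the total by 14.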
Detailed Scale Out Case Studies
Large Chip Co
• Scalability
• Multiple Apps
• Multiple growing users
• Availability
• MySQL DB
Solar Edge
• Next Gen Monitoring App
• Massive Scale
• Monitors real time data from thousands of distributed systems
Mozilla
• New Product/Next Gen App/AppStore
• Scalability
• Geo-clustering
AppDynamics
• Next gen APM company
• Scalability for the Netflix implementation
ScaleBase Deployment
Environments
– Public Cloud
– AWS, Rackspace, any
– Private cloud
– Hosted / on-premise
Databases Supported
– MySQL 5.1, 5.5, 5.6 (under certification)
– AWS RDS MySQL 5.1, 5.5
– MariaDB 10.0 (under certification)
Path to Scale-Out:
1. Data Distribution Policy Analysis
2. Functional Test
3. Load Test
4. Production Migration (safe, online)
Summary
ScaleBase provides cost-effective Scale-Out solutions
• Scale to an infinite number of users, data and transactions
• Improve performance
• No application rewrites
• Real-time elasticity
• ACID Compliant
• Expert analysis and simple deployment
• Leverage existing MySQL ecosystem/skills
• Improve database visibility with real-time monitoring
• Simplified, centralized administration
Questions (please enter directly into the GTW side panel)
www.ScaleBase.com
617.630.2800
Additional Resources
http://www.scalebase.com/blog/
http://www.scalebase.com/resources/
@scalebase
Thank You