1
2009 Calpont Corporation
Calpont
Open Source Columnar Storage Engine for Scalable MySQL Data Warehousing
April 22, 2009
MySQL User Conference
Santa Clara, CA
Jim Tommaney
Chief Product Architect
[email protected]
2
In a Nutshell
• Calpont built…
  - a columnar storage engine optimized for data warehousing
  - an all-software solution on a solid foundation:
    1. Capable
    2. Scalable
    3. Extendable
    4. Simple
• Because…
  - today’s business asks more questions, by more people, across more data
  - “technology as usual” can’t keep up
  - there has to be a way
3
Why MySQL?
• Because we fundamentally believe in the power of open source software
  - So does MySQL
  - So does this Community
• Because of the opportunity
  - MySQL is uniquely positioned to become a serious player in data warehousing
  - MySQL and its partners have made great strides in beginning to address the market
  - But to be truly successful, MySQL must add the ability to perform:
    1. Transparent, distributed linear scaling
    2. Distributed parallel table scans
    3. Distributed joins
    4. Hash joins
    5. On-line add-column operations
Calpont solves for this…and more.
4
Serious Architecture for a Serious Problem.
Calpont Pillars
Capable: It needs to be right!
  1. Built for analytics
  2. Built for big data
  3. Built for speed
Scalable: It needs to go!
  1. Scalable scan, filter, aggregation and hash join operations
  2. Intra-server parallelism
  3. Inter-server parallelism
Extendable: It needs to grow!
  1. Extend the data
  2. Extend the data model
  3. Extend database functionality via UDFs
Simple: It needs to be easy!
  1. “Load & Go”
  2. Automatic, maintenance-free partitioning
  3. Automatic parallelism
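The slides name hash joins as a core operation but do not show one. As a rough illustration of the general technique only (not Calpont's implementation), a single-node hash join builds a hash table on the smaller input and probes it with the larger one. The table and column names below are hypothetical.

```python
from collections import defaultdict

def hash_join(build_rows, probe_rows, build_key, probe_key):
    """Inner equi-join: hash the smaller input, then stream the larger one past it."""
    # Build phase: index the (ideally smaller) input by its join key.
    table = defaultdict(list)
    for row in build_rows:
        table[row[build_key]].append(row)
    # Probe phase: one hash lookup per row of the larger input.
    joined = []
    for row in probe_rows:
        for match in table.get(row[probe_key], []):
            joined.append({**match, **row})
    return joined

# Hypothetical dimension table (small) and fact table (large).
campaigns = [
    {"campaign_id": 1, "name": "spring"},
    {"campaign_id": 2, "name": "fall"},
]
clicks = [
    {"campaign_id": 1, "clicks": 10},
    {"campaign_id": 1, "clicks": 5},
    {"campaign_id": 3, "clicks": 7},  # no matching campaign, so it is dropped
]
result = hash_join(campaigns, clicks, "campaign_id", "campaign_id")
```

Each probe row costs one hash lookup, so the join scales with the size of the larger input rather than with the product of the two inputs. That property is what makes the operation a natural candidate for splitting the probe side across servers.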
5
Building Blocks
Columns: Module | Process | Functionality | Value

MySQL Director
  Functionality:
    • Hosts MySQL
    • Connection management
    • SQL parsing & optimization
  Value:
    • Familiar DBMS interface
    • Leverages existing partner integrations

ExtentMap
  Functionality:
    • Abstracts physical and logical storage
    • Metadata store
  Value:
    • Enables shared nothing and shared everything storage
    • Enables partition elimination
    • Built-in failover

Controller
  Functionality:
    • Work distribution
    • Final results management and aggregation
  Value:
    • Independent scalability and tunable concurrency
    • Multi-threaded to take advantage of multi-core HW platforms

Worker
  Functionality:
    • Scale-out cache management
    • Distributed scan, filter, join and aggregation operations
    • Resource management
  Value:
    • Independent scalability and tunable performance
    • Multi-threaded to take advantage of multi-core HW platforms

Data
  Functionality:
    • High Speed Bulk Load
    • Transactional DML and DDL
    • Online schema extensions
  Value:
    • Enables concurrent reads and writes, non-blocking read enabled
    • Multi-threaded to take advantage of multi-core HW platforms

Modules: User Module, Performance Module
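The slide credits the ExtentMap with enabling partition elimination but does not say how. One common way a metadata store can do this is to keep a minimum and maximum column value per extent and skip extents whose range cannot satisfy a predicate. The sketch below assumes that scheme; the `Extent` fields and the `extents_to_scan` helper are hypothetical and are not taken from Calpont's code.

```python
from dataclasses import dataclass

@dataclass
class Extent:
    extent_id: int
    min_value: int  # assumed metadata: smallest column value stored in the extent
    max_value: int  # assumed metadata: largest column value stored in the extent

def extents_to_scan(extent_map, low, high):
    """Return ids of extents whose [min, max] range overlaps the predicate range [low, high]."""
    return [e.extent_id for e in extent_map
            if e.max_value >= low and e.min_value <= high]

# Hypothetical date column stored as YYYYMMDD integers, one extent per month.
extent_map = [
    Extent(0, 20090101, 20090131),
    Extent(1, 20090201, 20090228),
    Extent(2, 20090301, 20090331),
]
# WHERE date BETWEEN 20090210 AND 20090305 only needs extents 1 and 2.
selected = extents_to_scan(extent_map, 20090210, 20090305)
```

Under this assumption, eliminated extents are never read, so no manual partitioning scheme has to be declared. That would fit the "automatic, maintenance-free partitioning" claim on the pillars slide.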
6
Scalable Building Blocks
[Diagram: Calpont OAM alongside User Modules 1 to n ("Scale out for User Concurrency") and Performance Modules 1 to n ("Scale out for Performance")]
• Add User Modules to scale user concurrency
• Add Performance Modules to scale performance
• Calpont OAM provides robust administration capabilities and monitors system health
• Each User Module can distribute work across all Performance Modules
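The work distribution described on this slide follows a scatter/gather pattern: a controller splits a query into slices, workers scan, filter and partially aggregate their slices, and the controller merges the partial results. The sketch below is a minimal single-process illustration, with threads standing in for Performance Modules. The function names are hypothetical and nothing here reflects Calpont's actual protocol.

```python
from concurrent.futures import ThreadPoolExecutor

def worker_scan(chunk, threshold):
    """Worker side: scan one slice, filter it, and return a partial aggregate."""
    kept = [v for v in chunk if v >= threshold]
    return sum(kept), len(kept)

def controller_query(values, n_workers, threshold):
    """Controller side: split the work, fan it out, then merge the partial results."""
    # Round-robin split so every worker receives a similar share of the rows.
    chunks = [values[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(lambda c: worker_scan(c, threshold), chunks))
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count

# Equivalent to SELECT SUM(v), COUNT(*) FROM t WHERE v >= 51 over the values 1..100.
total, count = controller_query(list(range(1, 101)), n_workers=4, threshold=51)
```

Because SUM and COUNT can be merged from partial results, the answer does not depend on the number of workers. Only the amount of data each worker has to scan changes as workers are added.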
7
Storage Foundations
Multiple Storage Architectures Enabled by the Calpont Extent Map
“Shared Nothing” Storage Architecture
[Diagram: Performance Modules 1 to n, each attached to its own Storage]
“Shared Everything” Storage Architecture
[Diagram: Performance Modules 1 to n, all attached to a common SAN]
Enables both “rack and stack” and centrally managed storage deployments.
8
Build a Better Warehouse
Whether you need…
• …to start small,
• …or to scale;
• …one instance,
• …or many;
…you can do it with Calpont.
[Diagram: three Single Server deployments, each with User Module 1 and Performance Module 1; one Distributed Servers deployment with User Modules 1 to n ("Scale out for User Concurrency") and Performance Modules 1 to n ("Scale out for Performance")]
9
Let’s Take a Look
1. Simply Capable
2. Simply Scalable
3. Simply Extendable
Demo Infrastructure
[Diagram: User Module 1 and Performance Modules 1 to 4, attached to a SAN]
www.trueffect.com
10
What Did You See?
[Chart: Scan & Aggregation Scalability. X-axis: Performance Modules (1 PM to 4 PMs). Y-axis: Rows Processed Per Second (System), 0 to 500,000,000. Series: Query 1 - 5.1 Billion; Query 2 - 1.3 Billion]
[Chart: Rows Joined Per Second. X-axis: Performance Modules (1 PM to 4 PMs). Y-axis: Rows Per Second, 0 to 140,000,000. Series: Disk; Cache]
11
Calpont Early Adopter Program
• Serious about architecture, serious about support
  - We are completing our support infrastructure
  - We are forming the Calpont Community
• The Program
  - If you’re ready, so are we
  - We want your feedback
  - Register now at www.calpont.com
• Follow us
  - All things Calpont – read our blog on www.calpont.com
  - Follow us at www.twitter.com/calpont
  - Register for updates at www.calpont.com
  - Read me at http://jtommaney.livejournal.com/
12
Get in the race…
Jim Tommaney
Chief Product Architect @ Calpont