Calpont Open Source Columnar Storage Engine for Scalable MySQL Data Warehousing
April 22, 2009

MySQL User Conference, Santa Clara, CA
Jim Tommaney, Chief Product Architect, Calpont Corporation
[email protected]



TRANSCRIPT

Page 1: Calpont Open Source Columnar Storage Engine for Scalable MySQL Data Warehousing April 22, 2009

2009 Calpont Corporation

Calpont Open Source Columnar Storage Engine for Scalable MySQL Data Warehousing

April 22, 2009
MySQL User Conference
Santa Clara, CA

Jim Tommaney
Chief Product Architect
[email protected]

Page 2

In a Nutshell

• Calpont built…
  • a columnar storage engine optimized for data warehousing
  • an all-software solution on a solid foundation

1. Capable
2. Scalable
3. Extendable
4. Simple

• Because…
  • today’s business asks more questions, by more people, across more data
  • “technology as usual” can’t keep up
  • there has to be a way
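
To illustrate why a columnar layout suits data warehousing queries, here is a minimal Python sketch. It is not Calpont code: the sample table, the column names, and the helpers `to_columns` and `sum_where` are all hypothetical. It only shows the general idea that an aggregate over two columns never has to read the others.

```python
from typing import Dict, List

# Hypothetical sample data; the table and column names are illustrative only.
rows = [
    {"order_id": 1, "region": "west", "amount": 120.0},
    {"order_id": 2, "region": "east", "amount": 75.5},
    {"order_id": 3, "region": "west", "amount": 30.0},
]


def to_columns(records: List[dict]) -> Dict[str, list]:
    """Pivot row-oriented records into one list per column."""
    columns: Dict[str, list] = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def sum_where(columns: Dict[str, list], filter_col: str, wanted, sum_col: str) -> float:
    """Sum sum_col over the rows where filter_col equals wanted.

    Only the two named columns are read; every other column is left untouched.
    """
    return sum(
        value
        for key, value in zip(columns[filter_col], columns[sum_col])
        if key == wanted
    )


columns = to_columns(rows)
west_total = sum_where(columns, "region", "west", "amount")
```

In a row store the same query would read every field of every row; here `order_id` is never scanned.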

Page 3

Why MySQL?

• Because we fundamentally believe in the power of open source software
  • So does MySQL
  • So does this Community

• Because of the opportunity
  • MySQL is uniquely positioned to become a serious player in data warehousing
  • MySQL and its partners have made great strides in beginning to address the market
  • But to be truly successful, MySQL must add the ability to perform:
    1. Transparent, distributed linear scaling
    2. Distributed parallel table scans
    3. Distributed joins
    4. Hash joins
    5. On-line add-column operations

Calpont solves for this…and more.
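
For readers unfamiliar with item 4, the following is a minimal sketch of a textbook in-memory hash join in Python. It is not taken from the Calpont engine; the function `hash_join` and the sample tables are hypothetical and only illustrate the build and probe phases.

```python
from collections import defaultdict
from typing import Iterable, Iterator, Tuple


def hash_join(small: Iterable[dict], large: Iterable[dict], key: str) -> Iterator[Tuple[dict, dict]]:
    """Equi-join two row sets on `key` using a build phase and a probe phase."""
    # Build phase: index the smaller input by its join key.
    buckets = defaultdict(list)
    for row in small:
        buckets[row[key]].append(row)

    # Probe phase: stream the larger input once and look up matches.
    for row in large:
        for match in buckets.get(row[key], ()):
            yield match, row


# Hypothetical dimension and fact tables.
regions = [{"region_id": 1, "name": "west"}, {"region_id": 2, "name": "east"}]
sales = [
    {"region_id": 1, "amount": 10},
    {"region_id": 2, "amount": 20},
    {"region_id": 1, "amount": 5},
    {"region_id": 3, "amount": 7},  # no matching region, so it is dropped
]

joined = [(dim["name"], fact["amount"]) for dim, fact in hash_join(regions, sales, "region_id")]
```

The larger input is read exactly once, which is what makes this join style attractive for large fact tables.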

Page 4

Serious Architecture for a Serious Problem.

Calpont Pillars

Scalable: It needs to go!
1. Scalable scan, filter, aggregation and hash join operations
2. Intra-server parallelism
3. Inter-server parallelism

Capable: It needs to be right!
1. Built for analytics
2. Built for big data
3. Built for speed

Extendable: It needs to grow!
1. Extend the data
2. Extend the data model
3. Extend database functionality via UDFs

Simple: It needs to be easy!
1. “Load & Go”
2. Automatic, maintenance-free partitioning
3. Automatic parallelism

Page 5

Building Blocks

Process: MySQL
Functionality:
• Hosts MySQL
• Connection management
• SQL parsing & optimization
Value:
• Familiar DBMS interface
• Leverages existing partner integrations

Process: ExtentMap
Functionality:
• Abstracts physical and logical storage
• Metadata store
Value:
• Enables shared nothing and shared everything storage
• Enables partition elimination
• Built-in failover

Process: Controller
Functionality:
• Work distribution
• Final results management and aggregation
Value:
• Independent scalability and tunable concurrency
• Multi-threaded to take advantage of multi-core HW platforms

Process: Worker
Functionality:
• Scale-out cache management
• Distributed scan, filter, join and aggregation operations
• Resource management
Value:
• Independent scalability and tunable performance
• Multi-threaded to take advantage of multi-core HW platforms

Process: Data
Functionality:
• High Speed Bulk Load
• Transactional DML and DDL
• Online schema extensions
Value:
• Enables concurrent reads and writes, non-blocking read enabled
• Multi-threaded to take advantage of multi-core HW platforms

Modules: Director, User Module, Performance Module
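
The slide says the extent map "enables partition elimination" but does not say how. One common way to get that effect is to keep a minimum and maximum value per extent and skip extents that cannot match a range predicate. The Python sketch below assumes that approach; the `Extent` class, its fields, and `extents_to_scan` are hypothetical and do not describe Calpont's actual metadata format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Extent:
    """Hypothetical extent-map entry: an extent id plus the value range it holds."""
    extent_id: int
    min_value: int
    max_value: int


def extents_to_scan(extent_map: List[Extent], low: int, high: int) -> List[int]:
    """Return ids of extents whose [min, max] range overlaps the query range [low, high]."""
    return [
        extent.extent_id
        for extent in extent_map
        if extent.max_value >= low and extent.min_value <= high
    ]


# Assumed example: one extent per month of a date column stored as YYYYMMDD integers.
extent_map = [
    Extent(0, 20090101, 20090131),
    Extent(1, 20090201, 20090228),
    Extent(2, 20090301, 20090331),
]

# A predicate such as "date BETWEEN 20090210 AND 20090305" never touches extent 0.
needed = extents_to_scan(extent_map, 20090210, 20090305)
```

Because the check uses metadata only, no partitioning scheme has to be declared or maintained by the user, which fits the "automatic, maintenance-free partitioning" pillar.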

Page 6

Scalable Building Blocks

[Diagram: Calpont OAM alongside User Modules 1 to n (scale out for user concurrency) and Performance Modules 1 to n (scale out for performance)]

• Add User Modules to scale user concurrency
• Add Performance Modules to scale performance
• Calpont OAM provides robust administration capabilities and monitors system health
• Each User Module can distribute work across all Performance Modules
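
The division of labor described above (work distribution and final aggregation on one side, distributed scan and filter on the other) can be sketched as a scatter-gather pattern. The Python below is only an illustration under stated assumptions: threads stand in for separate servers, and the functions `performance_module_scan` and `user_module_query` are hypothetical names, not Calpont APIs.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple


def performance_module_scan(partition: List[float], threshold: float) -> Tuple[int, float]:
    """Scan one partition, filter it, and return a partial aggregate (row count, sum)."""
    kept = [value for value in partition if value >= threshold]
    return len(kept), sum(kept)


def user_module_query(partitions: List[List[float]], threshold: float) -> Tuple[int, float]:
    """Distribute the scan across all partitions, then merge the partial results."""
    # Scatter: one worker per partition (threads here stand in for Performance Modules).
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(
            pool.map(lambda part: performance_module_scan(part, threshold), partitions)
        )

    # Gather: final aggregation of the partial counts and sums.
    total_rows = sum(count for count, _ in partials)
    total_sum = sum(subtotal for _, subtotal in partials)
    return total_rows, total_sum


# Hypothetical data spread across three Performance Modules.
partitions = [[5.0, 12.0, 40.0], [8.0, 15.0], [100.0, 1.0, 22.0]]
result = user_module_query(partitions, 10.0)
```

Each worker returns only a small partial aggregate, so adding workers increases scan capacity without increasing the amount of data the coordinating side has to merge.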

Page 7

Storage Foundations

Multiple Storage Architectures Enabled by the Calpont Extent Map

[Diagram: “Shared Nothing” Storage Architecture. Performance Modules 1 to n, each with its own storage]

[Diagram: “Shared Everything” Storage Architecture. Performance Modules 1 to n sharing a SAN]

Enables both “rack and stack” and centrally managed storage deployments.

Page 8

Build a Better Warehouse

Whether you need…
• …to start small,
• …or to scale;
• …one instance,
• …or many;
…you can do it with Calpont.

[Diagram: Single Server deployments (User Module 1 with Performance Module 1, shown three times) next to a Distributed Servers deployment (User Modules 1 to n, scale out for user concurrency; Performance Modules 1 to n, scale out for performance)]

Page 9

Let’s Take a Look

1. Simply Capable
2. Simply Scalable
3. Simply Extendable

Demo Infrastructure

[Diagram: User Module 1 and Performance Modules 1 to 4 attached to a SAN]

www.trueffect.com

Page 10

What Did You See?

[Chart: Scan & Aggregation Scalability. X axis: Performance Modules (1 PM to 4 PMs). Y axis: Rows Processed Per Second (System), 0 to 500,000,000. Series: Query 1 - 5.1 Billion; Query 2 - 1.3 Billion]

[Chart: Rows Joined Per Second. X axis: Performance Modules (1 PM to 4 PMs). Y axis: Rows Per Second, 0 to 140,000,000. Series: Disk; Cache]

Page 11

Calpont Early Adopter Program

• Serious about architecture, serious about support
  • We are completing our support infrastructure
  • We are forming the Calpont Community

• The Program
  • If you’re ready, so are we
  • We want your feedback
  • Register now at www.calpont.com

• Follow us
  • All things Calpont: read our blog on www.calpont.com
  • Follow us at www.twitter.com/calpont
  • Register for updates at www.calpont.com
  • Read me at http://jtommaney.livejournal.com/

Page 12

Get in the race…

Jim Tommaney
Chief Product Architect @ Calpont

[email protected]