Nicolas Modrzyk - http://c-jdbc.objectweb.org/ - [email protected] 1 - 03/02/2004
C-JDBC: a High Performance Database Clustering Middleware
Nicolas Modrzyk
[email protected]
Outline - Motivations
Motivations
Use-Cases
C-JDBC concepts
Performance
Monitoring
Community
Conclusion
Motivations
J2EE performance scalability is bounded by database performance
Database tier must be
–scalable
–fault tolerant (high availability + failover)
–without modifying the client application
–using open source databases
–on commodity hardware
What is C-JDBC?
[Figure: without C-JDBC, each client JVM (Java client program; servlet container such as Tomcat, Jetty, ...; EJB container such as JOnAS, WebLogic, JBoss, WebSphere, ...) uses the database JDBC driver to reach a single database (MySQL, PostgreSQL, Oracle, DB2, InstantDB, ...): no scalability, no fault tolerance, no failover. With C-JDBC, the same clients use the C-JDBC driver to reach a C-JDBC Controller (scalability, fault tolerance, failover, monitoring, caching, logging, ...), which uses the database JDBC drivers to reach several databases.]
Redundant Array of Inexpensive Databases
RAIDb controller
– gives the view of a single database to the client
– balances the load on the database backends
RAIDb levels
– RAIDb-0: full partitioning
– RAIDb-1: full mirroring
– RAIDb-2: partial replication
– composition possible
C-JDBC
Middleware implementing RAIDb
Two components
– generic JDBC 2.0 driver (C-JDBC driver)
– C-JDBC Controller
C-JDBC Controller provides
– performance scalability
– high availability
– failover
– caching, logging, monitoring, …
Supports heterogeneous databases
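Because the C-JDBC driver is a generic JDBC driver, a client is expected to change only its driver class and connection URL. The sketch below illustrates that idea. The driver class name, the URL layout, the port, and the names `myDB`, `user`, and `password` are assumptions made for illustration and should be checked against the C-JDBC user documentation.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Arrays;
import java.util.List;

/** Sketch: moving a JDBC client to C-JDBC changes only the driver class and the URL. */
class CjdbcClientSketch {
    // Assumed driver class name; verify it against the C-JDBC user documentation.
    static final String DRIVER_CLASS = "org.objectweb.cjdbc.driver.Driver";

    /** Builds a URL of the assumed form jdbc:cjdbc://host:port[,host:port]/virtualDatabase. */
    static String url(List<String> controllers, String virtualDatabase) {
        return "jdbc:cjdbc://" + String.join(",", controllers) + "/" + virtualDatabase;
    }

    public static void main(String[] args) {
        // Hypothetical controller address and virtual database name.
        String url = url(Arrays.asList("localhost:25322"), "myDB");
        try {
            // Fails unless the C-JDBC driver jar is on the classpath.
            Class.forName(DRIVER_CLASS);
            try (Connection c = DriverManager.getConnection(url, "user", "password")) {
                System.out.println("Connected through " + c.getMetaData().getDriverName());
            }
        } catch (ClassNotFoundException | SQLException e) {
            System.out.println("Could not connect to " + url + ": " + e);
        }
    }
}
```

The rest of the application keeps using plain `java.sql` calls, which is what "without modifying the client application" amounts to in practice.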
Outline - Use-Cases
What C-JDBC offers
Usually, we do this:
[Figure: the Application reaches the Database through a JDBC driver.]
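What C-JDBC offers
Now we have this:
[Figure: the Application uses the C-JDBC driver to reach a controller hosting a virtual database with a cache and connection pooling; the controller reaches the backend through its JDBC driver.]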
What C-JDBC offers
And, finally we have all this:
[Figure: the same stack with load balancing and backend recovery added: Application, C-JDBC driver, controller with virtual database (cache, pooling, load balancing, backend recovery), JDBC driver, backend.]
Heterogeneity support
application already written for a specific [commercial] database
user-defined rules for on-the-fly query rewriting to execute on heterogeneous backends
[Figure: Java client programs, servlet containers (Tomcat, Jetty, ...) and EJB containers (JOnAS, WebLogic, JBoss, WebSphere, ...) use the C-JDBC driver to reach a C-JDBC Controller running RAIDb-2, which reaches MySQL and Oracle backends through the MySQL and Oracle JDBC drivers.]
Outline - C-JDBC concepts
Controller
[Figure: controller architecture. A Java client program (Servlet, EJB, ...) uses the C-JDBC driver to reach the C-JDBC Controller over sockets. The controller reads an XML configuration file through its XML engine and is configured and administered from an administration console (RMI, JMX). It hosts virtual databases, each with an Authentication Manager, a Request Manager, a Scheduler, a Request Cache, a Load balancer, a Recovery Log, and one Database Backend with its Connection Manager per backend. The MySQL and Oracle backends are reached through their own JDBC drivers.]
Virtual Database
[Figure: a Java client program (Servlet, EJB, ...) uses the C-JDBC driver to reach a virtual database made of an Authentication Manager, a Request Manager, a Scheduler, a Request Cache, a Load balancer, a Recovery Log, and three Database Backends, each with a Connection Manager and a MySQL JDBC driver to a MySQL database.]
gives the view of a single database
establishes the mapping between the database name used by the application and the backend-specific settings
backends can be added and removed dynamically
configured using an XML configuration file
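The mapping described above can be pictured as a small registry keyed by backend name. This is a minimal sketch under stated assumptions: the class `VirtualDatabaseSketch`, its `Backend` fields, and the method names are hypothetical and do not reflect the controller's real classes or its XML format.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a virtual database: one name for clients, several backends behind it. */
class VirtualDatabaseSketch {
    /** Backend-specific settings hidden from the application (illustrative fields). */
    static final class Backend {
        final String name;
        final String driverClass;
        final String url;

        Backend(String name, String driverClass, String url) {
            this.name = name;
            this.driverClass = driverClass;
            this.url = url;
        }
    }

    private final String name;
    private final Map<String, Backend> backends = new LinkedHashMap<>();

    VirtualDatabaseSketch(String name) {
        this.name = name;
    }

    /** The single database name the application connects to. */
    String name() {
        return name;
    }

    /** Backends can be added while the virtual database is in use. */
    synchronized void addBackend(Backend backend) {
        backends.put(backend.name, backend);
    }

    /** Returns true if a backend with that name was present and has been removed. */
    synchronized boolean removeBackend(String backendName) {
        return backends.remove(backendName) != null;
    }

    /** Snapshot of the backend URLs currently mapped to this virtual database. */
    synchronized List<String> backendUrls() {
        List<String> urls = new ArrayList<>();
        for (Backend b : backends.values()) {
            urls.add(b.url);
        }
        return urls;
    }
}
```

The application only ever sees `name()`; the driver classes and URLs of the real databases stay on the controller side.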
Building the initial state
➨ Octopus is an ETL tool
➨ Use Octopus to store a dump of the initial database state
[Figure: an EJB container (JOnAS, WebLogic, JBoss, WebSphere, ...) uses the C-JDBC driver to reach the C-JDBC Controller; the PostgreSQL backend is shown as disabled while Octopus stores the dump for the initial checkpoint next to the Recovery Log.]
Journaling
➨ Backend is enabled
➨ All database updates are logged (SQL statement, user, transaction, …)
[Figure: same setup with the PostgreSQL backend enabled; updates are written to the JDBC Recovery Log, next to the Octopus dump for the initial checkpoint.]
Adding backend on the fly
➨ Add new backends while the system is online
➨ Restore the dump corresponding to the initial checkpoint with Octopus
[Figure: two new PostgreSQL backends, still disabled, are loaded by Octopus from the dump for the initial checkpoint while the first backend stays enabled and the JDBC Recovery Log keeps recording.]
Synchronizing backends
➨ Replay updates from the log
[Figure: the two new PostgreSQL backends, still disabled, receive the updates replayed from the JDBC Recovery Log; the first backend stays enabled.]
Expanded Cluster
➨ Enable backends when done
[Figure: all three PostgreSQL backends are enabled behind the C-JDBC Controller.]
Handling a backend failure
➨ A node fails!
➨ The failed backend is disabled automatically, but an administrator must repair or replace it
[Figure: one PostgreSQL backend is disabled while the others stay enabled; Octopus holds the dumps from the initial to the last checkpoint and the JDBC Recovery Log is enabled.]
Restoring a backend
➨ Restore the latest dump with Octopus
[Figure: Octopus restores the dump for the last checkpoint onto the disabled PostgreSQL backend; the other backends stay enabled.]
Re-synchronization
➨ Replay missing updates from the log
[Figure: the missing updates are replayed from the JDBC Recovery Log onto the disabled PostgreSQL backend; the other backends stay enabled.]
Healed Cluster
➨ Re-enable the backend when done
[Figure: all PostgreSQL backends are enabled again; Octopus holds the dumps from the initial to the last checkpoint.]
Outline - Performance
TPC-W Performance (Amazon.com)
[Figure: throughput in requests per minute (0 to 1600) versus number of nodes (0 to 6) for Single DB, RAIDb-0, RAIDb-1, and RAIDb-2.]
RUBiS - Tomcat without C-JDBC caching
Tomcat: ~50% CPU
1 Database: 100% CPU
Throughput: 3900 pages/min
RUBiS - Tomcat with C-JDBC caching
Tomcat: ~55% CPU
C-JDBC: <10% CPU
1 Database: ~20% CPU
Throughput: 4200 pages/min
Outline - Monitoring
Monitoring/Trace
Trace, save, and retrieve statistics from the different modules
Controller, database, users, backend, cache, load, memory usage ...
SQL Console: Squirrel
Enables you to
– execute a set of atomic SQL requests
– verify the content of the clustered database
– verify cluster schemas
View graphical remote logs
Watch execution
– per backend
– per controller
– per virtual database
Outline - Community
Stats as of February 2004
Downloads
– total: 11260 downloads since May 2003
– 2004: > 1300 downloads
– among the top 5 most downloaded ObjectWeb projects
Mailing lists
– [email protected]: 124 subscribers
Team
– 11 committers
– 1 full-time INRIA engineer
The developer community
Mathieu Peltier (ObjectWeb)
– build scripts, automatic installer, JUnit tests
Julie Marguerite (ObjectWeb)
– JDBCRecoveryLog, automatic schema detection
Christiana Amza (Rice University), Anupam Chanda (Rice University), Sara Bouchenak (EPF Lausanne)
– SQL query caching
Guillaume Bort (INRIA Lorraine)
– JBoss support
Marek Prochazka (INRIA Rhone-Alpes)
– Datasource implementation
Greg Ward (dplanet.ch)
– Sybase support, design, debug
Marc Wick (monte-bre.ch)
– HSQL support, design, debug and ideas
Duncan Smith (mightybot.com)
– IP binding, security concerns, console, JMX, distributed management
Vadim Kassin (Kazakhstan Stock Exchange)
– Autogenerated keys support
Outline - Conclusion
Current status
C-JDBC 1.0 rc1 release
– Generic JDBC 2.0 driver
– Schedulers and load balancers for RAIDb 0, 1 and 2
– Fine-grained query caching and SQL monitoring
– JDBC recovery log
– Logger/request player
– Java installer
– User documentation
– Octopus integration
On-going work and efforts
Listen to the needs of users, quick answers on the mailing list
Horizontal scalability
Fully featured administration console
Graphical configuration and deployment of centralized/distributed backends and controllers (offline/online)
Dynamic reconfiguration
Automated load testing, report page updated by users
RPM packaging (JPackage version 1.0b15 done)
C-ODBC (requested by many users)
Take-home message
Database Clustering Middleware (100% Java)
Based on JDBC Standard
No code modification (application or database)
Open source (LGPL)
Questions & Answers
Thanks to all users and contributors ...
http://c-jdbc.objectweb.org
Prototype
C-JDBC Management Framework
Shared design
Request cache
caches results from SQL requests
improved SQL statement analysis to limit cache invalidations
– table based invalidations
– column based invalidations
– single-row SELECT optimization
request parsing possible in the C-JDBC driver
– offload the controller
– parsing caching in the driver
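Table-based invalidation, the coarsest of the granularities listed above, can be sketched as follows. The class and method names are hypothetical, and the caller is assumed to supply the set of tables each query reads (in C-JDBC that information would come from SQL parsing); this is an illustration of the idea, not the controller's cache.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Sketch of a result cache with table-based invalidation. */
class RequestCacheSketch {
    // SQL text -> cached result
    private final Map<String, Object> results = new HashMap<>();
    // table name -> SQL texts whose results depend on that table
    private final Map<String, Set<String>> queriesByTable = new HashMap<>();

    /** Caches the result of a read together with the tables it depends on. */
    synchronized void put(String sql, Set<String> tablesRead, Object result) {
        results.put(sql, result);
        for (String table : tablesRead) {
            queriesByTable.computeIfAbsent(table, k -> new HashSet<>()).add(sql);
        }
    }

    /** Returns the cached result, or null on a miss. */
    synchronized Object get(String sql) {
        return results.get(sql);
    }

    /** Called for every write: drops all cached results that read the written table. */
    synchronized void invalidateTable(String table) {
        Set<String> stale = queriesByTable.remove(table);
        if (stale != null) {
            for (String sql : stale) {
                results.remove(sql);
            }
        }
    }
}
```

Column-based invalidation would follow the same pattern with a finer key (table plus column), so that a write to one column leaves results that read only other columns in place.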
Load balancer
RAIDb-0
– query directed to the backend having the needed tables
RAIDb-1
– read executed by current thread
– write executed in parallel by a dedicated thread per backend
– result returned if one, majority or all commit
– if one node fails but others succeed, failing node is disabled
RAIDb-2
– same as RAIDb-1 except that writes are sent only to nodes owning the written table
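The routing rules above can be condensed into one sketch in which each backend declares the tables it holds: a backend holding every table behaves like RAIDb-1, disjoint table sets behave like RAIDb-0, and overlapping subsets behave like RAIDb-2. All names are hypothetical, round-robin read selection is an assumption (the slide does not say how reads are balanced), and the per-backend writer threads are left out to keep the example short.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Sketch of read-one / write-all routing over backends that each hold some tables. */
class LoadBalancerSketch {
    static class Backend {
        final String name;
        final Set<String> tables;
        boolean enabled = true;

        Backend(String name, String... tables) {
            this.name = name;
            this.tables = new HashSet<>(Arrays.asList(tables));
        }
    }

    enum WaitFor { FIRST, MAJORITY, ALL }

    private final List<Backend> backends;
    private int next = 0; // round-robin cursor for reads (assumed policy)

    LoadBalancerSketch(List<Backend> backends) {
        this.backends = backends;
    }

    /** A read goes to one enabled backend that holds the table. */
    synchronized Backend chooseForRead(String table) {
        for (int i = 0; i < backends.size(); i++) {
            Backend b = backends.get((next + i) % backends.size());
            if (b.enabled && b.tables.contains(table)) {
                next = (next + i + 1) % backends.size();
                return b;
            }
        }
        throw new IllegalStateException("no enabled backend holds " + table);
    }

    /** A write goes to every enabled backend that holds the written table. */
    synchronized List<Backend> chooseForWrite(String table) {
        List<Backend> targets = new ArrayList<>();
        for (Backend b : backends) {
            if (b.enabled && b.tables.contains(table)) {
                targets.add(b);
            }
        }
        return targets;
    }

    /** If some targets succeeded and others failed, the failing ones are disabled. */
    synchronized void reportWrite(List<Backend> succeeded, List<Backend> failed) {
        if (!succeeded.isEmpty()) {
            for (Backend b : failed) {
                b.enabled = false;
            }
        }
    }

    /** Number of backend commits to wait for before answering the client. */
    static int acksNeeded(WaitFor policy, int targets) {
        switch (policy) {
            case FIRST:
                return Math.min(1, targets);
            case MAJORITY:
                return Math.min(targets, targets / 2 + 1);
            default:
                return targets;
        }
    }
}
```

When every backend fails a write, nothing is disabled here: the statement itself is the likely culprit, so the error would go back to the client instead.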
Connection Manager
Connection pooling for a backend
– Simple: no pooling
– RandomWait: blocking pool
– FailFast: non-blocking pool
– VariablePool: dynamic pool
Connection pools defined on a per login basis
– resource management per login
– dedicated connections for admin
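The difference between a blocking and a non-blocking pool can be shown with a standard `BlockingQueue`. This is a sketch only: the class and method names are hypothetical, the connection type is left generic, and the real connection managers may differ (for example by using timeouts).

```java
import java.util.Collection;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Sketch of a fixed-size pool offering blocking and non-blocking checkout. */
class ConnectionPoolSketch<C> {
    private final BlockingQueue<C> idle;

    ConnectionPoolSketch(Collection<C> connections) {
        // Capacity must be at least 1, even for an empty pool.
        idle = new ArrayBlockingQueue<>(Math.max(1, connections.size()), false, connections);
    }

    /** Blocking style ("RandomWait" on the slide): wait until a connection is free. */
    C acquireBlocking() {
        try {
            return idle.take();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for a connection", e);
        }
    }

    /** Non-blocking style ("FailFast" on the slide): return null at once if none is free. */
    C acquireOrFail() {
        return idle.poll();
    }

    /** Returns a connection to the pool. */
    void release(C connection) {
        idle.offer(connection);
    }
}
```

Keeping one such pool per login, as the slide describes, would let the controller cap resources per user and reserve connections for administration.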
Scheduler
Manages concurrency control
Specific implementations for Single DB, RAIDb 0, 1 and 2
Query-level
Optimistic and pessimistic transaction level
– uses the database schema that is automatically fetched from backends
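One plausible reading of pessimistic transaction-level scheduling is a write lock per table, held until the transaction ends; the table names would come from the fetched schema. The sketch below follows that reading and is an assumption rather than the actual scheduler: names are hypothetical, and a real scheduler would make the second writer wait instead of returning false.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of pessimistic, transaction-level scheduling with one write lock per table. */
class SchedulerSketch {
    // table name -> id of the transaction currently allowed to write it
    private final Map<String, Long> writerByTable = new HashMap<>();

    /** Tries to schedule a write; returns false if another transaction holds the table. */
    synchronized boolean tryScheduleWrite(long txId, String table) {
        Long owner = writerByTable.get(table);
        if (owner == null) {
            writerByTable.put(table, txId);
            return true;
        }
        return owner == txId; // re-entrant for the owning transaction
    }

    /** Commit or rollback: releases every table lock held by the transaction. */
    synchronized void endTransaction(long txId) {
        writerByTable.values().removeIf(owner -> owner == txId);
    }
}
```

An optimistic variant would let conflicting writes proceed and resolve the conflict later, trading fewer waits for possible aborts.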
Recovery Log
Checkpoints are associated with database dumps
Record all updates and transaction markers since a checkpoint
Used to resynchronize a database from a checkpoint
JDBCRecoveryLog
– store information in a database
– can be re-injected in a C-JDBC cluster for fault tolerance
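The relationship between checkpoints and replay can be sketched with an in-memory log. This is an illustration under assumptions: names are hypothetical, only the SQL text is recorded (the slides also mention user and transaction markers), and the real JDBCRecoveryLog stores its entries in a database.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Sketch of a recovery log: ordered updates plus named checkpoints. */
class RecoveryLogSketch {
    // logged write statements, in execution order
    private final List<String> updates = new ArrayList<>();
    // checkpoint name -> position in the log at which the dump was taken
    private final Map<String, Integer> checkpoints = new HashMap<>();

    /** Records one update. */
    synchronized void logUpdate(String sql) {
        updates.add(sql);
    }

    /** Marks a checkpoint at the current end of the log; a dump is taken at this point. */
    synchronized void markCheckpoint(String name) {
        checkpoints.put(name, updates.size());
    }

    /** Returns the updates that a backend restored from the checkpoint's dump must replay. */
    synchronized List<String> updatesSince(String checkpointName) {
        Integer position = checkpoints.get(checkpointName);
        if (position == null) {
            throw new IllegalArgumentException("unknown checkpoint: " + checkpointName);
        }
        return new ArrayList<>(updates.subList(position, updates.size()));
    }
}
```

Restoring a backend then amounts to loading the dump tied to a checkpoint, executing `updatesSince(...)` in order, and enabling the backend once the list is exhausted.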
Making new checkpoints
➨ Disable one backend to have a coherent snapshot
➨ Mark the new checkpoint entry in the log
➨ Use Octopus to store the dump
[Figure: one PostgreSQL backend is disabled and Octopus stores its dump as the new last checkpoint; the other backends stay enabled.]
Making new checkpoints
➨ Replay missing updates from the log
[Figure: the disabled PostgreSQL backend replays from the JDBC Recovery Log the updates it missed while the dump was taken; the other backends stay enabled.]
Making new checkpoints
➨ Re-enable the backend when done
[Figure: all PostgreSQL backends are enabled again; Octopus holds the dumps from the initial to the last checkpoint.]