MySQL Cluster Overview (December 2010)


  • MySQL Cluster for Real Time, HA Services

    Bill Papp ([email protected]), Principal MySQL Sales Consultant, Oracle

  • Agenda
    Overview of MySQL Cluster: Design Goals, Evolution, Workloads, Users
    Architecture and Core Technology
    Deep Dive: New Features & Capabilities in MySQL Cluster 7.1
    MySQL Cluster Manager
    Resources to Get Started

  • MySQL Cluster Goals
    High Performance: Write Scalability & Low Latency

    99.999% Availability

    Low TCO

  • MySQL Cluster - Key Advantages

  • MySQL Cluster Highlights
    Distributed hash table backed by an ACID relational model
    Shared-nothing architecture; scale out on commodity hardware
    Implemented as a pluggable storage engine for the MySQL Server, with additional direct access via embedded APIs
    Automatic or user-configurable data partitioning across nodes
    Synchronous data redundancy

  • MySQL Cluster Highlights (cont.)
    Sub-second fail-over & self-healing recovery
    Geographic replication
    Data stored in main memory or on disk (configurable per column)
    Logging and checkpointing of in-memory data to disk
    Online operations (e.g. add nodes, schema updates, maintenance)

  • MySQL Cluster Users & Applications: HA, Transactional Services for Web & Telecoms
    http://www.mysql.com/customers/cluster/
    Telecoms: Subscriber Databases (HLR/HSS); Service Delivery Platforms; VoIP, IPTV & VoD; Mobile Content Delivery; On-Line App Stores and Portals; IP Management; Payment Gateways
    Web: User Profile Management; Session Stores; eCommerce; On-Line Gaming; Application Servers

  • MySQL Cluster Architecture
    Parallel database with no single point of failure: high read & write performance and 99.999% uptime
    [Diagram: clients connect through MySQL Cluster application nodes to the data nodes, with management nodes alongside]

  • Example Configuration
    MySQL Cluster Manager agent runs on each physical host
    No central process: the Cluster Manager agents co-operate, each one responsible for its local nodes
    Together, the agents are responsible for managing all nodes in the cluster
    Management responsibilities: starting, stopping & restarting nodes; configuration changes; upgrades; host & node status reporting; recovering failed nodes
    [Diagram: four hosts, 192.168.0.10-13, each running an agent; hosts .10/.11 each run a management node (ndb_mgmd) and a MySQL Server (mysqld); hosts .12/.13 each run two data nodes (ndbd)]

  • Creating & Starting a Cluster
    1. Expand the MySQL Cluster tar-ball(s) from mysql.com to a known directory
    2. Define the site:
       mysql> create site --hosts=192.168.0.10,192.168.0.11,
           -> 192.168.0.12,192.168.0.13 mysite;
    3. Define the package(s); note that the basedir should match the directory used in Step 1:
       mysql> add package --basedir=/usr/local/mysql_6_3_26 6.3;
       mysql> add package --basedir=/usr/local/mysql_7_0_7 7.0;
    4. Create the cluster; this is where you define what nodes/processes make up the cluster and where they should run:
       mysql> create cluster --package=6.3
           -> --processhosts=[email protected],[email protected],
           -> [email protected],[email protected],[email protected],
           -> [email protected],[email protected],[email protected]
           -> mycluster;
    5. Start the cluster:
       mysql> start cluster mycluster;

  • Upgrade Cluster
    Upgrade from MySQL Cluster 6.3.26 to 7.0.7:
       mysql> upgrade cluster --package=7.0 mycluster;
    Automatically upgrades each node and restarts each process in the correct order to avoid any loss of service
    Without MySQL Cluster Manager, the administrator must stop each process in turn, start it with the new version, and wait for the node to restart before moving on to the next one

  • Out of the Box Scalability: Data Partitioning
    Data is partitioned across the data nodes
    Rows are divided into partitions, based on a hash of all or part of the primary key
    Each data node holds the primary fragment for one partition
    It also stores a secondary fragment of another partition
    Records larger than 8KB are stored as BLOBs
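    By default, an NDB table is hash-partitioned on its primary key with no extra DDL; a minimal sketch (table and column names are hypothetical):

    CREATE TABLE subscriber (
      msisdn  BIGINT UNSIGNED NOT NULL,
      profile VARBINARY(2048),
      PRIMARY KEY (msisdn)
    ) ENGINE=NDBCLUSTER;  -- rows are hashed on the primary key and spread across the data nodes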

  • Synchronous Replication within a Node Group
    Shared-nothing architecture for high availability

  • Node Failure Detection & Self-Healing Recovery

  • On-Line Scaling & Maintenance
    Can also update schema on-line
    Upgrade hardware & software with no downtime
    Perform back-ups on-line

    On-line add-node sequence: new node group added → data is re-partitioned → redundant data is deleted → distribution is switched to share load with the new node group

  • Geographic Replication
    Synchronous replication within a Cluster node group for HA
    Bi-directional asynchronous replication to a remote Cluster for geographic redundancy
    Asynchronous replication to non-Cluster databases (e.g. MyISAM, InnoDB) for specialised activities such as report generation
    Mix and match replication types
    [Diagram: Cluster 1 replicating asynchronously to Cluster 2 and to MyISAM/InnoDB databases, with synchronous replication inside each Cluster]

  • High Throughput, Low Latency Transactional Performance
    MySQL Cluster delivered:
    250k TPM (125k operations per second)
    Average 3ms response time
    4.3x higher throughput than the previous MySQL Cluster 6.3 release
    http://www.mysql.com/why-mysql/benchmarks/mysql-cluster/

  • MySQL Cluster vs MySQL MEMORY:

    30x higher throughput at 1/3rd the latency on a single node
    Table-level locking inhibits MEMORY scalability beyond a single client connection
    With check-pointing & logging enabled, MySQL Cluster still delivers durability
    Test platform: 4-socket server, 64GB RAM, running Linux

  • Delivering up to 10x higher Java Throughput

    MySQL Cluster Connector for Java: native Java API & OpenJPA plug-in

  • MySQL Cluster CGE 7.1 Key Enhancements: Reducing Cost of Operations
    Simplified management & monitoring: NDBINFO
    MySQL Cluster Manager (part of CGE only)
    Faster restarts

  • Real-Time Metrics w/ ndbinfo
    New database (ndbinfo) presents real-time metric data in the form of tables
    Exposes new information, together with providing a simpler, more consistent way to access existing data
    Examples include:
    Resource usage (memory, buffers)
    Event counters (such as the number of READ operations since the last restart)
    Data node status and connection status

    mysql> use ndbinfo
    mysql> show tables;
    +-------------------+
    | Tables_in_ndbinfo |
    +-------------------+
    | blocks            |
    | config_params     |
    | counters          |
    | logbuffers        |
    | logspaces         |
    | memoryusage       |
    | nodes             |
    | resources         |
    | transporters      |
    +-------------------+

  • Real-Time Metrics w/ ndbinfo (cont.) Example 1: Check memory usage/availability

    mysql> select * from memoryusage;
    +---------+--------------+------+------+
    | node_id | memory_type  | used | max  |
    +---------+--------------+------+------+
    |       3 | DATA_MEMORY  |  594 | 2560 |
    |       4 | DATA_MEMORY  |  594 | 2560 |
    |       3 | INDEX_MEMORY |  124 | 2336 |
    |       4 | INDEX_MEMORY |  124 | 2336 |
    +---------+--------------+------+------+

    Note that there is a DATA_MEMORY and an INDEX_MEMORY row for each data node in the cluster. If the Cluster is nearing the configured limit, increase the DataMemory and/or IndexMemory parameters in config.ini and then perform a rolling restart.
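    For reference, those parameters live in the [ndbd default] section of config.ini; a minimal sketch (the sizes shown are illustrative, not recommendations):

    [ndbd default]
    NoOfReplicas = 2
    # raise these if ndbinfo.memoryusage shows 'used' approaching 'max'
    DataMemory   = 4G
    IndexMemory  = 512M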

  • Real-Time Metrics w/ ndbinfo (cont.) Example 2: Check how many table scans have been performed on each data node since the last restart

    mysql> select node_id as 'data node', val as 'Table Scans'
        -> from counters where counter_name='TABLE_SCANS';
    +-----------+-------------+
    | data node | Table Scans |
    +-----------+-------------+
    |         3 |           3 |
    |         4 |           4 |
    +-----------+-------------+

    You might check this if your database performance is lower than anticipated. If this figure is rising faster than you expected, examine your application to understand why it performs so many table scans.

  • MySQL Cluster 7.1: ndbinfo Example 3: Check whether you are approaching the point at which the undo log completely fills up between local checkpoints (which could result in delayed transactions, or even a database halt, if not addressed):

    mysql> select node_id as 'data node', total as 'configured undo log buffer size',
        -> used as 'used buffer space' from logbuffers where log_type='DD-UNDO';
    +-----------+---------------------------------+-------------------+
    | data node | configured undo log buffer size | used buffer space |
    +-----------+---------------------------------+-------------------+
    |         3 |                         2096128 |                 0 |
    |         4 |                         2096128 |                 0 |
    +-----------+---------------------------------+-------------------+

    If the log buffer is almost full, increase the size of the log buffer.
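    The undo buffer size is fixed when the logfile group is created, so growing it means re-creating the group; a hedged sketch of the relevant disk-data DDL (file names and sizes are illustrative):

    CREATE LOGFILE GROUP lg_1
      ADD UNDOFILE 'undo_1.log'
      INITIAL_SIZE 128M
      UNDO_BUFFER_SIZE 8M   -- the buffer reported in ndbinfo.logbuffers
      ENGINE NDBCLUSTER;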

  • MySQL Cluster Connector for Java
    New Domain Object Model Persistence API (ClusterJ):
    Java API; high performance, low latency; feature rich
    JPA interface built upon this new Java layer:
    Java Persistence API compliant
    Implemented as an OpenJPA plugin
    Uses ClusterJ where possible, reverts to JDBC for some operations
    Higher performance than JDBC
    More natural for most Java designers
    Easier Cluster adoption for web applications

  • ClusterJPA
    Removes ClusterJ limitations:
    Persistent classes
    Relationships
    Joins in queries
    Lazy loading
    Table and index creation from the object model
    Implemented as an OpenJPA plugin
    Better JPA performance for insert, update, delete

  • Performance

  • Beyond 7.1: SPJ Push-Down Joins
    A linked operation is formed by the MySQL Server from the SQL query and sent to the data nodes
    For a linked operation, the first part of the query can be a scan but should result in primary key lookups for the next part
    More complex queries may be sent as multiple linked operations
    Reduces latency and increases throughput for complex joins
    Qualifies MySQL Cluster for new classes of applications
    Also possible directly through the NDB API
    Up to 42x performance gain in PoC!
    The existence, content and timing of future releases described here is included for information only and may be changed at Oracle's discretion.
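    To make the join shape concrete: SPJ targets queries where one table access feeds primary-key lookups into the next, as in this hypothetical example:

    SELECT o.order_id, c.name
      FROM orders AS o
      JOIN customers AS c ON c.customer_id = o.customer_id  -- primary-key lookup per scanned row
     WHERE o.order_date = CURRENT_DATE;  -- the scan of orders drives the linked operation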

  • MySQL Enterprise Monitor 2.3 (pre GA)

  • MySQL Cluster Manager 1.0 Features

    Automated Management: cluster-wide management; process management; on-line operations (upgrades / reconfiguration)

    Monitoring: status monitoring & recovery

    HA Operations: disk persistence; configuration consistency; HA agent operation

  • MySQL Cluster Manager: Current Development Projects
    On-line add-node:
       mysql> add hosts --hosts=192.168.0.35,192.168.0.36 mysite;
       mysql> add package --basedir=/usr/local/mysql_7_0_7 --hosts=192.168.0.35,192.168.0.36 7.0;
       mysql> add process --processhosts=[email protected],[email protected],[email protected],[email protected] mycluster;
       mysql> start process --added mycluster;
    Restart optimizations: fewer nodes restarted on some parameter changes

  • Application: Service Delivery Platform
    Roaming platform to support 7m roaming subscribers per day at the FIFA World Cup 2010
    Database supports AAA, routing, billing, messaging, signalling, payment processing
    MySQL Cluster 7.1 delivered 1k TPS on 1TB of data with carrier-grade availability
    Key business benefits:
    Local carriers able to monetize new subscribers
    Users enjoy local pricing with the full functionality of their home network
    Reduced deployment time by 75%

    "MySQL Cluster 7.1 gave us the perfect combination of extreme levels of transaction throughput, low latency & carrier-grade availability. We also reduced TCO by being able to scale out on commodity server blades and eliminate costly shared storage." - Phani Naik, Head of Technology at Pyro Group

  • Shopatron: eCommerce Platform
    "Since deploying MySQL Cluster as our eCommerce database, we have had continuous uptime with linear scalability, enabling us to exceed our most stringent SLAs." - Sean Collier, CIO & COO, Shopatron Inc
    Applications: eCommerce back-end, user authentication, order data & fulfilment, payment data & inventory tracking; supports several thousand queries per second
    Key business benefits:
    Scale quickly and at low cost to meet demand
    Self-healing architecture, reducing TCO
    Why MySQL? Low-cost scalability; high read and write throughput; extreme availability

  • Resources to Get Started
    MySQL Cluster Quick Start Guides
    http://www.mysql.com/products/database/cluster/get-started.html#quickstart
    MySQL Cluster 7.1, Architecture and New Features
    http://www.mysql.com/why-mysql/white-papers/mysql_wp_cluster7_architecture.php
    MySQL Cluster on the Web
    http://www.mysql.com/products/database/cluster/

  • The preceding is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, and timing of any features or functionality described for Oracle's products remains at the sole discretion of Oracle.

    ***Design goals for MySQL Cluster: everything we do in development of the product is designed to enhance one or more of these core design goals:

    High performance, specifically write scalability: how do we deliver it without app developers having to modify their apps? The other dimension is low latency, to deliver real-time responsiveness.

    Five-nines (99.999%) availability: handle both scheduled maintenance and failures, so planned and unplanned downtime, with less than 5 minutes of downtime per year.

    Low TCO: acquisition and operation of the software, as well as optimising performance and availability on commodity hardware, to keep overall project costs down.

    **This slide shows the key capabilities of MySQL Cluster.

    Very high read/write throughput: data is distributed across multiple nodes, so you have a multi-master database with a parallel architecture that can perform multiple write operations concurrently, with any changes instantly available to all clients accessing the cluster. Recent tests on a 4-node cluster using the DBT2 benchmark achieved 125k operations/sec with an average latency of 3ms.

    Availability: delivers 99.999%. There are no single points of failure; it is a shared-nothing architecture. Within the Cluster, all updates are synchronously replicated, i.e. written to at least 2 nodes before the transaction is committed, so if one node fails there is an up-to-date copy of the data on another node. You can typically fail over in less than 1 second, as there are no shared disks or lock managers to worry about. Self-healing: a node can automatically rejoin and resynchronize with the cluster. Developers don't have to handle any of this in their apps, which is a massive gain.

    Real-time: indexes are held in memory (data can be in memory or on disk) plus real-time extensions; all checkpointing of in-memory data to disk happens in the background, so there is no slow-down for disk I/O. Typically 2-5 ms latency with synchronous replication across multiple data nodes, so you get low response times and, just as importantly, predictable latency.

    Linear scale: scale horizontally and vertically, and also scale on-line with no downtime to the app. The core database is open source and multi-platform, with multiple access methods.

    *So, let's look at how we deliver against those goals:

    - Distributed hash table backed by an ACID relational model. As the name suggests, MySQL Cluster comprises multiple nodes which act as a single system, implemented as a shared-nothing architecture that scales out on commodity hardware.
    - Implemented as a pluggable storage engine for the MySQL Server, like InnoDB or MyISAM, so it gives you the ease-of-use and ubiquity of MySQL, with additional direct access via embedded APIs. You can eliminate SQL transformations completely and manage data directly from your app via C++, LDAP, HTTP and, most recently, Java and OpenJPA. This boosts performance and enables developers to work in their preferred environments, accelerating development cycles.
    - Automatic or user-configurable data partitioning across nodes; MySQL Cluster handles this, so there is no need to partition within the apps.
    - Synchronous data redundancy across nodes, using two-phase commit. It can be turned off, but the default and the recommendation is for it to be on.
    - Because of shared-nothing and synchronous replication, we get sub-second fail-over. The system is also designed for self-healing recovery, so a failed node will automatically rejoin and re-sync with the cluster.
    - Geographic replication, for DR.
    - Data stored in main memory or on disk (configurable per column).
    - Logging and checkpointing of in-memory data to disk, so durability is handled as a background process, eliminating I/O waits.
    - Online operations (i.e. add nodes, schema updates, maintenance, etc.), with no downtime for apps or clients.


    *Now let's look at the types of workload Cluster is deployed into, and its users.

    Cluster technology was originally developed by Ericsson and used purely as an in-memory, carrier-grade database embedded in network equipment, typically switches. It was acquired by MySQL in 2003. We acquired not just the technology but also the engineering team, who have continued to develop the product rapidly in subsequent years: disk-based tables, the SQL interface, automatic node recovery and geo-replication were added, and the product was open-sourced.

    Very strong in telecoms in subscriber databases (HLR/HSS), truly mission-critical apps; also in application servers and value-added services.

    In web workloads, it is used a lot for session stores, eCommerce, and management of user profiles.

    *Here we show the architecture of MySQL Cluster.

    Three core elements:

    Data Nodes handle the actual storage of and access to data. Data is distributed across the data nodes, with automatic partitioning, replication, failover and self-healing. You don't need complex logic in the application; the data nodes handle all of that.

    Application Nodes provide connectors to the data. The most common use case is to connect via the MySQL Server for a standard SQL interface. There is also a series of native interfaces that embed directly into apps for direct access, bypassing SQL for the highest performance and lowest latency: a C++ API, a Java API and an OpenJPA plug-in for object/relational mapping. You can also access the data via LDAP servers, and via HTTP with an Apache module.

    Management Nodes are used at start-up, to add nodes, to reconfigure the cluster, and for arbitration if there is a network failure, avoiding split-brain by determining which side of the cluster assumes ownership of servicing requests.

    So, it's pretty simple: you always have at least 2 copies of your data held in the data nodes, accessed via a series of application nodes.

    MySQL Cluster Manager is implemented as a set of peer-level agents, one running on each host.

    The DBA uses the standard MySQL client to connect to any one of these agents; the chosen agent will then cooperate with all of the other agents to perform the requested operation.
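    Concretely, that connection is just the stock mysql client pointed at an agent; the port and credentials below are the MCM 1.0 defaults as I recall them, so verify against the MCM manual for your release:

    $ mysql -h 192.168.0.10 -P 1862 -u admin -p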

    If an agent fails, just restart the process and it will be brought back up to date.

    Management responsibilities: starting, stopping & restarting nodes; configuration changes; upgrades; host & node status reporting; recovering failed nodes.

    *The example here shows the steps to install, configure and run MySQL Cluster using MCM; don't worry about the exact commands, we have plenty of training and reference material.

    *This is included to show how what used to be a complex operation is made far simpler and less prone to human error. Without MySQL Cluster Manager, there would be dozens of commands to run to upgrade even a simple deployment; make a mistake and you may have a temporary loss of service.

    With MCM this is reduced to a single command, which goes away and restarts each process with the right software in the right sequence.

    Stop by the MySQL Cluster demo-pod to see this in action.

    **Partitioning is one of the keys to scalability. Partitioning happens automatically in Cluster, with no need to implement sharding at the application level, which keeps things very simple.

    Unlike most other MySQL storage engines, the data for MySQL Cluster tables is not stored in the MySQL Server; instead, it is partitioned across a pool of data nodes as shown in the chart. The rows for a table are divided into partitions, with each data node holding the primary fragment for one partition and a secondary (backup) fragment for another. By default, partitioning is based on a hash of the primary key for a table.

    To get the best performance, use a common component of the primary key (e.g. SessionID) across all of the tables that a transaction will access; the whole transaction can then be processed on a single data node, so less messaging is required.
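    A sketch of that idea with two hypothetical session tables: both hash on SessionID, so a transaction touching a single session stays on one data node:

    CREATE TABLE session_state (
      SessionID BIGINT NOT NULL,
      state     VARBINARY(1024),
      PRIMARY KEY (SessionID)
    ) ENGINE=NDBCLUSTER PARTITION BY KEY (SessionID);

    CREATE TABLE session_event (
      SessionID BIGINT NOT NULL,
      event_no  INT NOT NULL,
      detail    VARCHAR(255),
      PRIMARY KEY (SessionID, event_no)
    ) ENGINE=NDBCLUSTER PARTITION BY KEY (SessionID);  -- same partition key as session_state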

    By using this partitioning approach, we can distribute multiple copies of the data across multiple nodes, scale performance very effectively, and provide HA.

    *When a change is made, a 2-phase commit protocol is used to ensure that the change is simultaneously applied to both data nodes in the node group by the time the transaction is committed.

    If a data node fails or is being restarted for maintenance reasons then the remaining data node processes all reads/writes and then updates the other data node when it returns to the Cluster.

    As the replication is synchronous, the latency, throughput and reliability of the network interconnect are important.

    *A heartbeat message is passed around all of the data nodes in the Cluster in a round-robin fashion. If no heartbeats are received for 3x the configurable heartbeat interval, the node's right-hand neighbour initiates a protocol to remove the failed node from the Cluster.
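    The interval itself is a config.ini parameter; a minimal sketch (1500 ms is the documented default as I recall it; check the reference manual for your version):

    [ndbd default]
    HeartbeatIntervalDbDb = 1500   # ms between data node heartbeats; ~3 missed beats trigger removal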

    A more complex scenario would be where a subset of the nodes becomes isolated from the others; part of the protocol is to check that any community of nodes in communication with each other represents the one and only viable subset of the Cluster: at least one data node from each node group, and if there is the possibility of a second such community, refer to the arbitrator to decide whether it should stay up or not.

    When the failed node is brought back into service (manually or automatically), it rejoins the Cluster and is automatically brought back up to date using the set of changes recorded by its peer in the node group.

    **To further support continuous operation, MySQL Cluster has the ability to add nodes on-line to a running cluster by automatically re-partitioning data as new node groups are added, ensuring the cluster maintains continuous operation.

    Can also add new MySQL servers with no loss of service.

    MySQL Cluster also allows on-line updates to a live database schema, in addition to upgrades and maintenance to the underlying hardware & software infrastructure.

    Cluster backups are in-service.

    On-line schema changes: add-column, add-index, drop-index (see the sketch below).
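    In the 7.x releases these were expressed with the ONLINE keyword; a hedged sketch against a hypothetical table (online ADD COLUMN carries restrictions on the new column's attributes, so check the manual for your version):

    ALTER ONLINE TABLE user_session ADD COLUMN last_seen BIGINT NULL;  -- add-column
    CREATE ONLINE INDEX idx_last_seen ON user_session (last_seen);     -- add-index
    DROP ONLINE INDEX idx_last_seen ON user_session;                   -- drop-index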

    **Data nodes within the cluster replicate synchronously with each other, almost always in the same data center.

    You can also have asynchronous replication to one or more remote Clusters. Unlike other MySQL storage engines, Cluster supports active-active replication with conflict detection and resolution (this does require some work from the application).

    You can also replicate to databases that are using different storage engines (for example, to allow complex report generation).

    *When we look at performance:

    We ran the industry-standard DBT2 test on the latest major release of MySQL Cluster (7.0), on 2-4 socket commodity hardware.

    4 data nodes; performance was scaled by adding more MySQL Server instances / connections. Sustained 250k TPM; each transaction = 30 operations, i.e. 125k operations/sec, with an average latency of 3ms with synchronous replication.

    Test platform: data nodes on Sun Fire x4450s; SQL nodes on Sun Fire x4600s & x4450s; OpenSolaris; Gigabit Ethernet.

    We also compared Cluster 7 performance and scalability against the previous release, and saw over 4x higher throughput on the same hardware.

    *We recently benchmarked MySQL Cluster against the MEMORY storage engine, and Cluster performed nearly 30x more transactions per second on a single node running a READ/WRITE Sysbench test suite. Checkpointing and logging were enabled for Cluster, so it was providing data durability, something MEMORY can't do; if we had turned checkpointing off, the performance of Cluster would have been even higher.

    Performance of the MEMORY storage engine for READ/WRITE workloads is severely limited by table locking when processing updates. For the Sysbench workload, WRITE operations comprise 30% of the total mix of queries, so there is almost no scalability at all beyond a single thread when using the MEMORY storage engine.

    When running Sysbench as a purely read-only workload, scalability in the MEMORY storage engine was significantly improved, as table locks are eliminated. When measuring performance up to 128 client connections, MEMORY delivered an average of 2.8x higher throughput than MySQL Cluster. This type of performance will only be seen for workloads which comprise absolutely no UPDATE operations. Also, unlike MEMORY, Cluster can be scaled across multiple nodes to meet performance requirements.

    END
    ++++++++++++++++++++++++++++++++++++++++++++++++++
    Benchmark description: the Sysbench benchmark was run on a single server, configured as follows:
    4-socket server equipped with 6-core 2.8GHz x86 processors
    64GB RAM
    2 x 7,200 RPM hard disk drives configured as RAID 0
    Fedora 12 Linux operating system
    Single MySQL Server instance
    MySQL Cluster was configured with a single data node, with checkpointing and logging enabled.

    The Sysbench transactions consist of a set of queries with a mix of reads, updates, deletes and inserts. The reads comprise access by both primary key and by a number of range scans. Each transaction performs between 14 and 21 queries. For the READ/WRITE benchmark, around 30% of the queries are UPDATE operations.

    **At the UC in April we announced the GA release of Cluster 7.1; three main messages:

    Focus of this release is improving performance and reducing cost of operations

    Performance relates to Java services: we delivered the MySQL Cluster Connector for Java, a feature-rich ORM solution providing a native Java interface embedded directly into apps, plus an OpenJPA plug-in, to access data within the cluster directly from a Java app without going via the conventional JDBC/SQL path. So you get higher performance, lower latency and a more natural environment for Java persistence.

    Cost of operations: we released NDBINFO, which provides monitoring of the cluster, with access to real-time status information as a series of SQL views (the cluster has always been a bit of a black box, with a lot of trawling through logs to see what is happening). It allows admins to monitor & optimize cluster performance & availability.

    Faster restarts, via better caching and better internal communication within the Cluster, delivering up to a 20x speed-up on complex, write-intensive workloads.

    We also released MySQL Cluster Manager; this is part of the commercial edition of MySQL Cluster only. It automates and simplifies cluster management, reducing load and complexity for DBAs. Rather than having to wade through logs or know secret developer tools, you can now get access to what's going on within the data nodes using the familiar SQL interface.

    Examples include index and data memory usage, buffer occupancy, the number of each type of operation performed on each data node, and the status of nodes and inter-node connections.
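    For instance, node status is one more ndbinfo query away (the column set here is as I remember it for 7.1; verify against your release):

    mysql> select node_id, status, uptime from ndbinfo.nodes;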

    As the interface is SQL you can slice and dice all of this information to get exactly the output you need.

    Probably simplest to step through a couple of examples...

    ****Prior to MySQL Cluster 7.1, Java applications had a choice between using JDBC or writing their own JNI wrapper around the C++ NDB API. JDBC is familiar to Java designers, but it is relational rather than dealing in objects. It is also inefficient, as the Java app has to convert to a relational view, JDBC then converts that to SQL for the MySQL Server, and the MySQL Server converts it to NDB API calls: this adds latency & complexity. You can add JPA as a layer on top so that apps don't have to perform the ORM themselves, but that is an extra source of latency. The NDB API gives the best possible performance and lowest latency, but is complex for Java developers to use.

    The 2 new access methods allow application to work purely with objects and also cut out the SQL middle-man to get better performance.

    ClusterJ is our own version of an ORM; can also use ClusterJPA as a plugin for OpenJPA to allow it to use ClusterJ whenever possible for better performance.

    You can of course use Hibernate or TopLink through the JDBC path.

    **The Y-axis is the CPU time consumed for each test; the X-axis shows 5 different access methods for different classes of operations.

    For the best apples-to-apples comparison, compare the yellow and green bars. Green is OpenJPA using the traditional JDBC path; yellow is when it instead goes the ClusterJ route.

    In Cluster today, a join is implemented as a nested-loop join within the MySQL Server. The data is not held there, so the MySQL Server has to keep making trips to the data nodes to read the next piece of data for the join. This works, but there is a clear performance impact which makes Cluster unsuitable for a significant number of applications.

    With SPJ, the MySQL Server can send many of these join operations down to the data nodes, which then calculate the results, removing a huge number of round trips and so reducing latency.

    If the query is more complex then the MySQL Server may break it down into multiple requests.

    This does not help every type of join, but where it does the results can be spectacular: e.g. a 42x improvement has been observed. In some cases you may need to optimise your schema and/or queries to see the benefit.

    *NDBINFO exposes information on what's happening within the data nodes via SQL; MEM can therefore build rules and graphs on top of it, for the first time enabling a degree of data-node monitoring from MEM.

    Alarms generate SNMP traps, exposed via mysqld acting as a proxy. MEM can't see the data nodes themselves, just the data they expose through NDBINFO.

    MySQL Enterprise Monitor 2.3 is aiming to deliver a Cluster advisor (a set of rules that can generate alerts) as well as a set of graphs. There is no reason why the end user couldn't add their own (for example, to graph the number of table scans).

    ***Let's look at the features that make up the MCM 1.0 release:

    They divide into 3 areas of functionality: Management, Monitoring, and HA Operations.

    Most of the design has been focused on management functionality. Due to the complexity of managing clustered database environments, development efforts have been prioritized towards automating and simplifying common management tasks, thereby reducing cost, risk and complexity for our users. We have also focused on ensuring that the operation of MCM complements and enhances the 99.999% availability of Cluster. We expect to add more management, monitoring and analysis capabilities to MySQL Cluster Manager in future releases.

    We'll discuss each of these in turn. MySQL Cluster Manager 1.0 went GA in April, together with Cluster 7.1.

    It is very much focussed on making management operations simpler, faster and safer to perform.

    Without MCM, the DBA is responsible for editing configuration files and for starting, stopping & restarting nodes in the correct order; make a mistake and service is lost.

    With MCM you manage the Cluster through a single command-line-interface. No need to edit and duplicate configuration files across the Cluster. Complex operations are reduced from dozens of commands to a single command.

    MCM also extends the HA of MySQL Cluster in that it can be configured such that if any node fails (MySQL Server, management node or data node) it is automatically restarted; without Cluster Manager this is limited to data nodes.

    Development work continues on MCM; one of the main features targeted for the next release is the automation of on-line add-node.

    This is possible without MCM but involves a lot more steps.

    Again, stop by the demo-pod to see this in action.