Set Up & Operate Open Source Oracle Replication
DESCRIPTION
Oracle's expensive and complex replication makes it difficult to build cost-effective applications that move data in real time to data warehouses (Oracle, Hadoop, Vertica) and popular databases like MySQL. Fortunately, Continuent Tungsten offers a solution. In this virtual course, you will learn how Continuent Tungsten solves problems with Oracle replication at a fraction of the cost of other solutions and with less management overhead too. Think "Oracle GoldenGate without the price tag!"
Course Topics:
- Replicator installation and configuration
- Provisioning existing schema and tables from Oracle into another server
- Replicating between Oracle schemas
- Replicating from Oracle to MySQL and data warehouses
- Tips for solving common problems and improving performance
The course includes live demos to illustrate operation. By the end of the course you should be ready to move ahead with your own deployments.
TRANSCRIPT
©Continuent 2014
Replicate from Oracle to Oracle, Oracle to MySQL and Oracle to Analytics
MC Brown, Senior Information Architect
Linas Virbalas, Senior Software Engineer

Introducing Continuent
• The leading provider of clustering and replication for open source DBMS
• Our Product: Continuent Tungsten
• Clustering - Commercial-grade HA, performance scaling and data management for MySQL
• Replication - Flexible, high-performance data movement

Quick Continuent Facts
• Largest Tungsten installation processes over 700 million transactions daily on 225 terabytes of data
• Tungsten Replicator was named Application of the Year at the 2011 MySQL User Conference
• A wide variety of topologies, including MySQL, Oracle, Vertica, and MongoDB, are in production now
• MySQL to Hadoop deployments are now in progress with multiple customers

Continuent Tungsten Customers

Tungsten Replicator
Tungsten Replicator is a fast, open source database replication engine
• Designed for speed and flexibility
• GPL V2 license
• 100% open source
• Annual support subscription available from Continuent

Tungsten Master/Slave in Action
[Diagram: on the master, the replicator reads the DBMS logs and writes transactions plus metadata to the THL. The slave replicator downloads transactions via the network into its own THL and applies them to the slave using JDBC.]

Master Replication Service
[Diagram: the master pipeline has two stages, each with extract, filter, and apply steps. The first reads the MySQL master's binlog into an in-memory queue; the second writes to the Transaction History Log, which slave replicators read over TCP/IP.]

Slave Replication Service
[Diagram: the slave pipeline has three stages, each with extract, filter, and apply steps. Transactions arrive from the master replicator over TCP/IP, are stored in the Transaction History Log, pass through an in-memory queue, and are applied to the slave DBMS.]
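The stage structure shown on the two pipeline slides can be sketched in a few lines of Python. This is only an illustration of the extract, filter, apply idea under simplified assumptions: the event dictionaries, the `drop_scratch` filter, and the `run_stage` helper are hypothetical names, not part of Tungsten Replicator.

```python
from collections import deque

def run_stage(extract, filters, apply):
    """Run one stage: pull events, pass them through filters, hand them on."""
    for event in extract():
        for f in filters:
            event = f(event)
            if event is None:      # a filter may drop an event
                break
        else:
            apply(event)           # reached only if no filter dropped it

# Hypothetical source events standing in for binlog or CDC entries.
source_log = [
    {"seqno": 0, "schema": "DEMO", "sql": "INSERT ..."},
    {"seqno": 1, "schema": "SCRATCH", "sql": "INSERT ..."},
    {"seqno": 2, "schema": "DEMO", "sql": "UPDATE ..."},
]

queue = deque()   # in-memory queue between the two master stages
thl = []          # stand-in for the Transaction History Log

def drop_scratch(event):
    """Example filter: discard events from a schema we do not replicate."""
    return None if event["schema"] == "SCRATCH" else event

def drain_queue():
    while queue:
        yield queue.popleft()

# Stage 1: source log -> in-memory queue (with one filter).
run_stage(lambda: iter(source_log), [drop_scratch], queue.append)
# Stage 2: in-memory queue -> THL (no filters).
run_stage(drain_queue, [], thl.append)

thl_seqnos = [e["seqno"] for e in thl]
```

A slave pipeline would chain three such stages instead of two, ending with an apply step that writes to the slave DBMS.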

Multiple Services per Replicator
[Diagram: one replicator runs two services, frommysql and fromoracle, each fed by its own upstream replicator, so that data from both sources is aggregated in one place.]

[Diagram: replication topologies: master-slave, fan-in slave, all-masters, star, and heterogeneous combinations of MySQL and Oracle.]

Replicating from Oracle to Oracle

Steps to Homogeneous Replication
1. Restore backup on the slave
2. Set up replication: setupCDC & tpm
3. Continue real-time replication: Tungsten Replicator

Preparing CDC
[Diagram: setupCDC configures Oracle CDC (synchronous or asynchronous HotLog) to capture changes from the tables in the source schema into change tables in the publisher schema. Capture starts at <setupCDC_SCN>.]

CDC in Action
[Diagram: each INSERT, UPDATE, or DELETE on source table X in the source schema is recorded by Oracle CDC (synchronous or asynchronous HotLog) in change table X_CT in the publisher schema, and the SCN advances.]
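The relationship between a source table and its change table can be modelled with a toy class. This sketch is not Oracle CDC: the `ChangeCapture` class and its methods are hypothetical, and it advances the SCN once per change, whereas a real Oracle SCN is database-wide.

```python
class ChangeCapture:
    """Toy model of a change table fed by DML on a source table."""

    def __init__(self, start_scn):
        self.scn = start_scn
        self.source = {}        # source table X: primary key -> row
        self.change_table = []  # change table X_CT: one entry per change

    def _record(self, op, key, row):
        self.scn += 1           # simplification: every change advances the SCN
        self.change_table.append(
            {"scn": self.scn, "op": op, "key": key, "row": row}
        )

    def insert(self, key, row):
        self.source[key] = row
        self._record("INSERT", key, row)

    def update(self, key, row):
        self.source[key] = row
        self._record("UPDATE", key, row)

    def delete(self, key):
        row = self.source.pop(key)
        self._record("DELETE", key, row)

    def changes_since(self, scn):
        """Return the changes captured after the given SCN, in order."""
        return [c for c in self.change_table if c["scn"] > scn]


cdc = ChangeCapture(start_scn=1000)
cdc.insert(1, {"val": "a"})
cdc.update(1, {"val": "b"})
cdc.delete(1)
ops = [c["op"] for c in cdc.changes_since(1000)]
```

An extractor only needs to remember the last SCN it processed and ask the change table for everything after it.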

Installation
./current/tools/tpm configure fromora \
    --enable-heterogenous-service=true \
    --user=oracle \
    --install-directory=/opt/fromora/continuent \
    --members=alpha,beta \
    --master=alpha

./current/tools/tpm configure fromora --hosts=alpha \
    --datasource-type=oracle \
    --user=oracle \
    --datasource-oracle-service=ORCL \
    --replication-user=DEMO_PUB \
    --replication-password=DEMO_PUB \
    --svc-table-engine=CDCASYNC \
    --property=replicator.global.extract.db.user=tungsten \
    --property=replicator.global.extract.db.password=secret \
    --property=replicator.extractor.parallel-extractor.ChunkDefinitionFile=/opt/fromora/chunks.csv

./tools/tpm configure fromora --hosts=beta \
    --datasource-type=oracle \
    --datasource-oracle-service=ORCL \
    --replication-user=DEMO \
    --replication-password=DEMO

Deployment
[Diagram: service fromora runs on both replicators. On the master, Oracle CDC (synchronous or asynchronous HotLog) captures changes from the demo schema into the demo_pub schema, where OracleCDCExtractor reads them. On the slave, OracleApplier writes them to the target Oracle database.]

Heterogeneous Replication

Use Case: Web Content Publishing
[Diagram: real-time publication from the back office to a web-based catalog.]

Steps to Heterogeneous Replication
1. Prepare (translate) schema for the slave DBMS: ddlscan
2. Set up replication: setupCDC & tpm
3. Provision initial data: Parallel Extract & Parallel Apply
4. Continue real-time replication: Tungsten Replicator
(Open Source)

1. Translating Schema for the Slave

Translating Schema
• Beginning - how to convert tables?
• Data types?
• Column lengths?
• Naming conventions?
• Reserved words?
[Diagram: ddlscan sits between the source tables and the still-empty slave.]

ddlscan
• Part of Tungsten Replicator, GPL v2
• Translates schema with replication in mind
• Provides errors and warnings
• Can rename schema/tables/columns

Usage (Oracle to MySQL Example)
$ cd tungsten-replicator/bin
$ ./ddlscan \
    -db DEMO \
    -template ddl-oracle-mysql.vm

Translating Schema
• ddlscan looks into source schema

Translating Schema
• ddlscan translates and renders DDL commands (mysql-ddl.sql)

Result of ddlscan
DROP TABLE IF EXISTS demo.test;
CREATE TABLE demo.test (
  id1 INT /* NUMBER(10, ?) */ NOT NULL,
  id2 INT /* NUMBER(10, ?) */ NOT NULL,
  val TINYINT /* NUMBER(3, ?) */,
  owner VARCHAR(30) /* VARCHAR2(30) */ NOT NULL,
  created DATETIME /* DATE */ NOT NULL,
  PRIMARY KEY (id1, id2) /* WARN: no PK found, using suitable unique index instead: UQ_UK1 */
) ENGINE=InnoDB;

CREATE TABLE talks ...
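The type translation visible in this output can be illustrated with a small lookup. The sketch below covers only the four mappings that appear on the slide (NUMBER(10) to INT, NUMBER(3) to TINYINT, VARCHAR2(n) to VARCHAR(n), DATE to DATETIME). It is not how ddlscan is implemented, which works from templates such as ddl-oracle-mysql.vm; the function names are hypothetical, and any other type is reported as unmapped rather than guessed.

```python
import re

# Only the two NUMBER precisions that appear on the slide.
NUMBER_BY_PRECISION = {3: "TINYINT", 10: "INT"}

def translate_type(oracle_type):
    """Return the MySQL type for an Oracle type, or None if no mapping is shown."""
    t = oracle_type.strip().upper()
    m = re.fullmatch(r"NUMBER\((\d+)\)", t)
    if m:
        return NUMBER_BY_PRECISION.get(int(m.group(1)))
    m = re.fullmatch(r"VARCHAR2\((\d+)\)", t)
    if m:
        return f"VARCHAR({m.group(1)})"
    if t == "DATE":
        return "DATETIME"
    return None

def render_column(name, oracle_type, nullable=True):
    """Render one column definition, keeping the source type as a comment."""
    mysql_type = translate_type(oracle_type)
    if mysql_type is None:
        raise ValueError(f"no mapping for {oracle_type}")
    suffix = "" if nullable else " NOT NULL"
    return f"{name.lower()} {mysql_type} /* {oracle_type} */{suffix}"

line = render_column("CREATED", "DATE", nullable=False)
```

Keeping the original type in a comment, as the generated DDL does, makes it easy to review each translation by eye.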

Translating Schema
• You run the resulting SQL file (mysql-ddl.sql) on MySQL

Translating Schema
• Tables are ready!
[Diagram: translated tables (no rows) on the slave alongside the source tables.]

Preparing CDC
[Diagram: setupCDC configures Oracle CDC (synchronous or asynchronous HotLog) to capture changes from the tables in the source schema into change tables in the publisher schema. Capture starts at <setupCDC_SCN>, and the SCN advances with each change.]

Connecting the Dots
• Preparation completed
[Diagram: source tables and change tables on Oracle; translated tables (no rows) on the slave.]

2. Set Up Replication

Oracle Master Part
./current/tools/tpm configure fromora \
    --enable-heterogenous-service=true \
    --user=oracle \
    --install-directory=/opt/fromora/continuent \
    --members=alpha,beta \
    --master=alpha

./current/tools/tpm configure fromora --hosts=alpha \
    --datasource-type=oracle \
    --user=oracle \
    --datasource-oracle-service=ORCL \
    --replication-user=DEMO_PUB \
    --replication-password=DEMO_PUB \
    --svc-table-engine=CDCASYNC \
    --property=replicator.global.extract.db.user=tungsten \
    --property=replicator.global.extract.db.password=secret \
    --property=replicator.extractor.parallel-extractor.ChunkDefinitionFile=/opt/fromora/chunks.csv

MySQL Slave Part
./current/tools/tpm configure fromora --hosts=beta \
    --user=tungsten \
    --replication-user=tungsten \
    --replication-password=secret \
    --svc-applier-filters=CDC,casetransform,rename \
    --property=replicator.filter.CDC.from=DEMO_PUB.HEARTBEAT \
    --property=replicator.filter.CDC.to=tungsten_fromora.heartbeat \
    --property=replicator.filter.casetransform.to_upper_case=false \
    --property=replicator.filter.rename.definitionsFile=/opt/frommysql/rename.csv

Deployment
[Diagram: service fromoracle runs on both replicators. On the master, Oracle CDC (synchronous or asynchronous HotLog) captures changes from the demo schema into the demo_pub schema, and OracleCDCExtractor reads them with no special filters. On the slave, MySQLApplier applies them after the special filters below.]
Special filters on the slave:
• Map names to lower case
• Ignore extra tables
• Heartbeat table renaming
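What the three slave-side filters do to an event can be sketched as a chain of small functions. The heartbeat names come from the tpm properties in the slave configuration (DEMO_PUB.HEARTBEAT to tungsten_fromora.heartbeat); everything else, including the event layout, the function names, and the ignored table demo.scratch, is a hypothetical stand-in rather than the replicator's actual filter code.

```python
def cdc_rename(event, src="DEMO_PUB.HEARTBEAT", dst="tungsten_fromora.heartbeat"):
    """Redirect the publisher's heartbeat table to the slave's heartbeat table."""
    if f'{event["schema"]}.{event["table"]}' == src:
        schema, table = dst.split(".")
        return {**event, "schema": schema, "table": table}
    return event

def case_transform(event, to_upper_case=False):
    """Map schema and table names to lower case (or upper case if requested)."""
    convert = str.upper if to_upper_case else str.lower
    return {**event, "schema": convert(event["schema"]), "table": convert(event["table"])}

def ignore_tables(event, ignored=frozenset({"demo.scratch"})):
    """Drop events for tables the slave does not need (hypothetical list)."""
    return None if f'{event["schema"]}.{event["table"]}' in ignored else event

def apply_filters(event, filters):
    for f in filters:
        event = f(event)
        if event is None:       # a filter dropped the event
            return None
    return event

FILTERS = [cdc_rename, case_transform, ignore_tables]

heartbeat = apply_filters({"schema": "DEMO_PUB", "table": "HEARTBEAT"}, FILTERS)
data = apply_filters({"schema": "DEMO", "table": "TEST"}, FILTERS)
dropped = apply_filters({"schema": "DEMO", "table": "SCRATCH"}, FILTERS)
```

Filter order matters in this sketch: the heartbeat rename matches the upper-case Oracle name, so it has to run before the case transform.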

3. Provision & Replication

Parallel Extractor
• trepctl online -provision <setupCDC_SCN>
[Diagram: in service fromoracle on the Tungsten master replicator, ParallelExtractor reads the source tables with several threads (Thread 1, Thread 2, Thread 3), each running SELECT … AS OF <setupCDC_SCN>, and writes the rows to the THL.]

Real-time Replication
• Automatically switches to real-time extraction
[Diagram: once provisioning is done, OracleCDCExtractor in service fromoracle reads the change tables populated by Oracle CDC (synchronous or asynchronous HotLog) for changes with SCN >= <setupCDC_SCN> and writes them to the THL.]
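The hand-over from provisioning to real-time extraction can be sketched as two phases that share one SCN. In this simplified model the snapshot rows stand for the result of SELECT … AS OF <setupCDC_SCN>, split into chunks that worker threads load in parallel; afterwards only changes at or after that SCN are applied. The function names, the chunk size, and the data layout are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_chunks(rows, chunk_size):
    return [rows[i:i + chunk_size] for i in range(0, len(rows), chunk_size)]

def extract_chunk(chunk):
    # Stands in for one thread reading its chunk of the snapshot.
    return dict(chunk)

def provision_then_replicate(snapshot_rows, change_table, setup_scn, threads=3):
    target = {}
    # Phase 1: parallel provisioning from the snapshot taken at setup_scn.
    chunks = split_into_chunks(snapshot_rows, chunk_size=2)
    with ThreadPoolExecutor(max_workers=threads) as pool:
        for part in pool.map(extract_chunk, chunks):
            target.update(part)
    # Phase 2: real-time changes with SCN >= setup_scn, in SCN order.
    for change in sorted(change_table, key=lambda c: c["scn"]):
        if change["scn"] < setup_scn:
            continue            # already reflected in the snapshot
        if change["op"] == "DELETE":
            target.pop(change["key"], None)
        else:
            target[change["key"]] = change["value"]
    return target

snapshot = [(1, "a"), (2, "b"), (3, "c")]
changes = [
    {"scn": 99, "op": "UPDATE", "key": 1, "value": "old"},
    {"scn": 101, "op": "UPDATE", "key": 2, "value": "B"},
    {"scn": 102, "op": "DELETE", "key": 3, "value": None},
    {"scn": 103, "op": "INSERT", "key": 4, "value": "d"},
]
final = provision_then_replicate(snapshot, changes, setup_scn=100)
```

Because both phases are anchored to the same SCN, no change is lost between the end of the snapshot and the start of change capture.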

Accelerate the Slave Provision
[Diagram: slave pipeline with parallel apply. Transactions from the remote master are stored in the Transaction History Log, assigned a shard ID, and placed in a parallel queue, from which several extract, filter, apply channels write to the slave DBMS concurrently.]
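The shard ID step is what lets several apply channels run side by side without reordering related transactions. A minimal sketch, assuming that the shard ID is simply the schema name and that channels are chosen by a stable hash (both are assumptions for this example, as are the function names):

```python
import zlib

def shard_id(event):
    # Assumption for this sketch: shard by schema name.
    return event["schema"]

def channel_for(shard, channels):
    # Stable hash, so the same shard always maps to the same channel.
    return zlib.crc32(shard.encode("utf-8")) % channels

def partition(events, channels):
    """Distribute events over channels while keeping per-shard order."""
    queues = [[] for _ in range(channels)]
    for event in events:
        queues[channel_for(shard_id(event), channels)].append(event)
    return queues

events = [
    {"seqno": 0, "schema": "sales"},
    {"seqno": 1, "schema": "hr"},
    {"seqno": 2, "schema": "sales"},
    {"seqno": 3, "schema": "catalog"},
    {"seqno": 4, "schema": "hr"},
]
queues = partition(events, channels=3)
```

Events from one shard always land in the same queue in their original order, so channels can apply independently of one another.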

Replication to Vertica and Hadoop

The Data Warehouse Impedance Mismatch
[Diagram: replication delivers single transactions, while data warehouses load best in batches (dump/load). The replicator bridges the gap by buffering transactions into CSV files.]

Column Store: Real-Time Batches
[Diagram: a Tungsten master replicator on MySQL/Oracle (service ora2vr) feeds a Tungsten slave replicator (service ora2vr), which writes CSV files and loads them in large transaction batches to leverage load parallelization.]
Special filters:
• pkey - Fill in pkey info
• colnames - Fill in names
• replicate - Ignore tables

Batch Loading: The Gory Details
[Diagram: in service ora2vr, the replicator writes transactions from the master to CSV files. The files are either loaded with COPY directly to the base tables, or loaded with COPY to staging tables and then moved to the base tables by a merge script using SELECT.]
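The staging-table path can be tried out with SQLite standing in for the warehouse. The CSV layout here (opcode, seqno, then the table columns, with an update written as a delete followed by an insert) and the merge SQL are assumptions made for this sketch, not the replicator's actual file format or merge script; a real column store would use COPY for the load step.

```python
import csv
import io
import sqlite3

# Hypothetical batch: opcode (I = insert, D = delete), seqno, id, val.
CSV_BATCH = """I,1,1,alpha
I,2,2,beta
D,3,1,alpha
I,3,1,ALPHA
D,4,2,beta
"""

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE base (id INTEGER PRIMARY KEY, val TEXT)")
db.execute("CREATE TABLE stage (opcode TEXT, seqno INTEGER, id INTEGER, val TEXT)")

# Step 1: bulk-load the CSV batch into the staging table.
rows = list(csv.reader(io.StringIO(CSV_BATCH)))
db.executemany("INSERT INTO stage VALUES (?, ?, ?, ?)", rows)

# Step 2: merge. Delete every key touched by the batch, then insert the
# last staged row per key if that row is an insert.
db.execute("DELETE FROM base WHERE id IN (SELECT id FROM stage)")
db.execute("""
    INSERT INTO base (id, val)
    SELECT s.id, s.val FROM stage s
    WHERE s.opcode = 'I'
      AND s.rowid = (SELECT MAX(rowid) FROM stage WHERE id = s.id)
""")
db.execute("DELETE FROM stage")   # staging table is emptied for the next batch
db.commit()

result = db.execute("SELECT id, val FROM base ORDER BY id").fetchall()
```

The whole batch reaches the base table in two set-based statements, which is the access pattern column stores handle well, instead of one statement per source transaction.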

Basic Hadoop Loading
[Diagram: a Tungsten master replicator (service hadoop) extracts from the source MySQL/Oracle DBMS. A Tungsten slave replicator (service hadoop) writes CSV files and loads the raw CSV to HDFS in the Hadoop cluster (e.g., via LOAD DATA to Hive), where the data is accessed via Hive.]
Master-side filtering:
• pkey - Fill in pkey info
• colnames - Fill in names
• replicate - Subset tables to be replicated

Provisioning plus Replication
[Diagram: provisioning from MySQL/Oracle runs through Apache Sqoop/ETL with parallel table dumps, while replication runs from a Tungsten 3.0 master to a Tungsten 3.0 slave (service hadoop) that writes CSV files. Callouts: low impact extraction, fast data filtering, buffered CSV, programmable load scripts, parallel apply.]

Materialized Views
[Diagram: a MapReduce job builds the materialized view. MAP: UNION ALL of the snapshot and the transaction logs. SHUFFLE: sort by key(s), then transaction order. REDUCE: emit the last row per key if it is not a delete. The result is a materialized view including all updates.]
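The map, shuffle, and reduce steps on this slide translate almost directly into plain Python. The sketch below runs in memory rather than on Hadoop, and its row layout (key, seqno, op, value) and function name are assumptions for illustration. Snapshot rows are treated as inserts that precede every logged change.

```python
from itertools import groupby
from operator import itemgetter

def materialize(snapshot, transaction_log):
    # MAP / UNION ALL: snapshot rows become inserts older than any logged change.
    rows = [{"key": k, "seqno": -1, "op": "I", "value": v} for k, v in snapshot]
    rows += transaction_log
    # SHUFFLE: sort by key, then by transaction order.
    rows.sort(key=itemgetter("key", "seqno"))
    # REDUCE: emit the last row per key unless it is a delete.
    view = {}
    for key, group in groupby(rows, key=itemgetter("key")):
        last = list(group)[-1]
        if last["op"] != "D":
            view[key] = last["value"]
    return view

snapshot = [(1, "a"), (2, "b")]
log = [
    {"key": 1, "seqno": 5, "op": "U", "value": "A"},
    {"key": 2, "seqno": 6, "op": "D", "value": None},
    {"key": 3, "seqno": 7, "op": "I", "value": "c"},
    {"key": 3, "seqno": 8, "op": "U", "value": "C"},
]
view = materialize(snapshot, log)
```

Only the newest state of each key survives, so the view matches what the source table would contain after all logged transactions.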

Tungsten Replicator 3.0 & Hadoop
• Extract from MySQL or Oracle
• Base Hadoop plus commercial distributions: Cloudera, HortonWorks, MapR, IBM, Apache
• Provision using Sqoop or parallel extraction
• Automatic replication of incremental changes
• Transformation to preferred HDFS formats
• Schema generation for Hive
• Tools for generating materialized views

Getting Started!
• Tungsten Replicator builds are available on code.google.com: http://code.google.com/p/tungsten-replicator/
• Replicator documentation is available on the Continuent website: http://docs.continuent.com/tungsten-replicator-3.0/deployment-hadoop.html
• Tungsten Hadoop tools are available on GitHub: https://github.com/continuent/continuent-tools-hadoop
• Contact Continuent for support

We’re Hiring!
Continuent Web Page: http://www.continuent.com
Tungsten Replicator: http://code.google.com/p/tungsten-replicator