Managing (Schema) Migrations in Cassandra
TRANSCRIPT
![Page 1: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/1.jpg)
© 2015 GridPoint, Inc. Proprietary and Confidential 1
Managing (Schema) Migrations in Cassandra
Mitch Gitman
senior software engineer
GridPoint, Inc.
![Page 2: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/2.jpg)
05/03/2023
![Page 3: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/3.jpg)
![Page 4: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/4.jpg)
migration
A word with many meanings.
![Page 5: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/5.jpg)
disclaimer…
image © Ana Camamiel
![Page 6: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/6.jpg)
What I mean by migrations
• Live-data migrations
One-off as opposed to ETL
![Page 7: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/7.jpg)
What I mean by migrations
• Source-driven migrations
− Schema migrations
− Reference data migrations
− Test/sample data migrations
• CQL commands as opposed to real data (sstables), generally
source control versioning
artifact versioning
publish
![Page 8: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/8.jpg)
Database refactoring
![Page 9: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/9.jpg)
Source-driven DB refactoring: the benefits
• Integration test & functional test automation (bootstrap-ability)
• CI server pipelines
• Containerization??
• Consistency & repeatability across environments
− Local developer box
− Dev environments
− Integration & QA environments
− Staging
− Production
![Page 10: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/10.jpg)
We need tools!
• Built into web application frameworks
• Standalone
![Page 11: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/11.jpg)
What do (perhaps) all these tools have in common?
They’re relational. They’re for SQL.
![Page 12: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/12.jpg)
NoSQL Distilled
Chapter 12. Schema Migrations
"We have seen that developing and maintaining an application in the brave new world of schemaless databases requires careful attention to be given to schema migration."
either/or:
• RDBMS = strong schema
• NoSQL = no schema
![Page 13: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/13.jpg)
Does Cassandra like weak schemas?
• partition keys & clustering keys
• table-per-query denormalization
• shift from Thrift to CQL
• Thrift: super columns & super column families
• CQL: collection types
THE EXCEPTION
“metadata-driven documents in columnar storage:”
CREATE TABLE entities (
    doc_id int,
    attribute_name text,
    attribute_value text,
    ...
    PRIMARY KEY (doc_id, attribute_name)
);
So how have teams been managing their keyspace & table definitions?
![Page 14: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/14.jpg)
The Cassandra migration tools landscape
• Flyway: First-class Cassandra support.
− Requires JDBC.
− https://github.com/flyway/flyway/issues/823
• Pillar: Scala tool.
• mutagen-cassandra: Java tool, Astyanax driver.
• Trireme: Python tool.
• cql-migrate: Python tool.
• mschematool: Python tool.
![Page 15: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/15.jpg)
What’s the secret behind DB migration tools?
The migrations version tracking table
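A minimal sketch of the idea, assuming versions are plain numbers and the tracking table is reduced to a set of already-applied versions. `PendingMigrations` and its method are hypothetical names for illustration, not the API of Flyway or any other tool: the tool compares the migrations it ships with against the versions recorded in the table and runs only what is missing, in ascending order.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical in-memory stand-in for a migrations version tracking table.
final class PendingMigrations {
    private PendingMigrations() {}

    /**
     * Returns the versions that still need to run, in ascending order.
     *
     * @param available version -> script name, as shipped in the migrations artifact
     * @param applied   versions already recorded in the tracking table
     */
    static List<Long> pending(Map<Long, String> available, Set<Long> applied) {
        List<Long> result = new ArrayList<>();
        // TreeMap sorts the keys so migrations are considered in version order.
        for (Long version : new TreeMap<>(available).keySet()) {
            if (!applied.contains(version)) {
                result.add(version);
            }
        }
        return result;
    }
}
```

After each migration succeeds, a real tool would add a row for that version to the tracking table, so a rerun finds nothing pending.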
![Page 16: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/16.jpg)
Migration tool philosophies
>
SQL
SQL
>
© Martha Stewart Living Omnimedia Inc. © Harpo Print, LLC
![Page 17: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/17.jpg)
Flyway for Cassandra
• First-class Flyway
• Faked-out Flyway
migrations
(in SQL)
CQL
![Page 18: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/18.jpg)
The tradeoff
• Store the migrations tracking table in an RDBMS
+
![Page 19: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/19.jpg)
Programmatically invoke Flyway
![Page 20: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/20.jpg)
![Page 21: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/21.jpg)
CassandraFlywayCallback
implements FlywayCallback
![Page 22: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/22.jpg)
Two-step process
source control
artifact repository
build time
deploy time
MigrationsBuilder
FlywayMigrator
![Page 23: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/23.jpg)
The migrations source
The input to MigrationsBuilder
![Page 24: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/24.jpg)
build time
Run MigrationsBuilder for CQL:
Run MigrationsBuilder for SQL:
![Page 25: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/25.jpg)
The generated migrations
The output from MigrationsBuilder
build time
![Page 26: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/26.jpg)
The generated SQL script
Faking out Flyway
build time
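One way to picture the faking-out step, as a sketch under assumptions: for every versioned CQL migration, the build emits a same-named SQL script whose body is a harmless placeholder, so Flyway records the version in its tracking table while the CQL itself is run by other code. The `V<version>__<description>` file naming is Flyway's convention; `PlaceholderSqlGenerator`, the `.cql` extension, and the `SELECT 1;` body are assumptions, since the generated script shown on the slide is not reproduced in this transcript.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical build-time helper; not the MigrationsBuilder from the talk.
final class PlaceholderSqlGenerator {
    private PlaceholderSqlGenerator() {}

    /** Maps V<version>__<description>.cql to the matching .sql file name. */
    static String sqlFileNameFor(String cqlFileName) {
        if (!cqlFileName.matches("V\\d+([._]\\d+)*__.+\\.cql")) {
            throw new IllegalArgumentException("Not a versioned CQL migration: " + cqlFileName);
        }
        return cqlFileName.substring(0, cqlFileName.length() - ".cql".length()) + ".sql";
    }

    /** Body of the stand-in SQL script; it changes nothing in the RDBMS. */
    static String placeholderBody(String cqlFileName) {
        return "-- Placeholder for " + cqlFileName + "; the real CQL runs elsewhere.\n"
                + "SELECT 1;\n";
    }

    /** Returns SQL file name -> script body, preserving the input order. */
    static Map<String, String> generate(Iterable<String> cqlFileNames) {
        Map<String, String> scripts = new LinkedHashMap<>();
        for (String name : cqlFileNames) {
            scripts.put(sqlFileNameFor(name), placeholderBody(name));
        }
        return scripts;
    }
}
```

For example, `PlaceholderSqlGenerator.sqlFileNameFor("V201505270800__create_entities.cql")` yields `V201505270800__create_entities.sql`, which Flyway would pick up as version 201505270800.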
![Page 27: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/27.jpg)
deploy time
Run FlywayMigrator for CQL:
java -classpath /…/flyway-migrator-cassandra.jar \
    com.gridpoint.tools.migrator.flyway.FlywayMigrator cassandra
Run FlywayMigrator for SQL:
java -classpath /…/flyway-migrator-postgresql.jar \
    com.gridpoint.tools.migrator.flyway.FlywayMigrator postgresql
![Page 28: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/28.jpg)
deploy time
flyway-migrator-postgresql.jar
flyway-migrator-cassandra.jar
![Page 29: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/29.jpg)
The migrations version tracking table
The Cassandra incarnation
![Page 30: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/30.jpg)
Best practices
• Variations on versions
− Version control: f94c7d7f8b130df360a4e9e4f586eafc618ddc50
− Artifact repository: 3.5.1
− Migration tool: 201505270800 or 10 or whatever you want
− Effective contract versions: multiple versions can coexist at runtime
• Consistent deployment across environments
• Failure handling
• Baselining
• Rollbacks?
• Check schema agreement
![Page 31: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/31.jpg)
Schema agreement
https://datastax.github.io/java-driver/2.1.8/features/metadata/
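A hedged sketch of what checking schema agreement between migrations could look like: poll a check until all nodes report the same schema version, and give up after a bounded number of attempts. `SchemaAgreementWaiter` is a hypothetical helper, and the check is passed in as a plain `BooleanSupplier` so the example stays driver-independent. With the DataStax Java driver the supplier would presumably wrap the on-demand check described on the page linked above (`cluster.getMetadata().checkSchemaAgreement()`); treat that wiring as an assumption.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: waits for schema agreement before the next migration runs.
final class SchemaAgreementWaiter {
    private SchemaAgreementWaiter() {}

    /**
     * Polls the check until it reports agreement or the attempts run out.
     *
     * @return true if agreement was observed, false on timeout or interruption
     */
    static boolean await(BooleanSupplier inAgreement, int maxAttempts, long pauseMillis) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (inAgreement.getAsBoolean()) {
                return true;
            }
            if (attempt < maxAttempts) {
                try {
                    Thread.sleep(pauseMillis);
                } catch (InterruptedException e) {
                    // Restore the interrupt flag and report no agreement.
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }
}
```

A migrator could call `SchemaAgreementWaiter.await(check, 10, 500L)` after each schema-changing statement and fail the run when it returns false, rather than applying the next migration against nodes that disagree.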
![Page 32: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/32.jpg)
Cassandra… migrations… limitations
• Limitations of our Flyway-based solution
− You need a relational database
− Not open-sourced
• Limitations of source-driven migrations, in general
![Page 33: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/33.jpg)
Static vs. dynamic tables
![Page 34: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/34.jpg)
Deploy time vs. runtime
Dedicated migration application vs. part of main application
![Page 35: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/35.jpg)
Source-driven, but…
• The orchestration is in source control
• Actual data rather than CQL commands
− Not necessarily live data
− Maybe doesn’t need to be in source control
![Page 36: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/36.jpg)
Embracing polyglot persistence
A unified migrations solution
![Page 37: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/37.jpg)
Takeaways
• challenging
• exciting
• routine
• boring
+
=
=
![Page 38: Managing (Schema) Migrations in Cassandra](https://reader036.vdocument.in/reader036/viewer/2022081414/589aa2141a28abfc1a8b5803/html5/thumbnails/38.jpg)
Thank you!
Mitch Gitman
[email protected]
presence @ LinkedIn