the hardest part of microservices: your data
Twitter: @christianposta
Blog: http://blog.christianposta.com
Email: [email protected]
Christian Posta
Principal Architect – Red Hat
• Author “Microservices for Java Developers”
• Committer/contributor Apache Camel, Apache ActiveMQ, Fabric8.io, Apache Kafka, Debezium.io, et al.
• Worked on microservices at a large, web-scale unicorn company
• Blogger, speaker about DevOps, integration, and microservices
Free download @ http://developers.redhat.com
“Microservices” is about optimizing… for speed.
People try to copy Netflix, but they can only copy what they see. They copy the results, not the process.
Adrian Cockcroft, former Chief Cloud Architect, Netflix
How does your company go fast?
Manage dependencies.
Data is a major dependency.
Wait. What is data?
What is one “thing”?
Book checkout / purchase
Title search
Recommendations
Weekly reporting
Focus on domain models, not data models
• Break things into smaller, understandable models
• Surround a model and its “context” with a boundary
• Implement the model in code or get a new model
• Explicitly map between different contexts
• Model transactional boundaries as aggregates
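To make "explicitly map between different contexts" concrete, here is a minimal sketch using the bookstore use cases from the earlier slide. The class names, fields, and the ISBN value are all hypothetical, chosen only for illustration: checkout and search each keep their own model of a "book", and a small function translates one into the other instead of sharing a single data model.

```python
from dataclasses import dataclass
from decimal import Decimal


# Checkout context: a "book" is something with a price and stock that can be sold.
@dataclass(frozen=True)
class CheckoutBook:
    isbn: str
    title: str
    price: Decimal
    copies_in_stock: int


# Search context: a "book" is a title to match and a flag saying whether it can be bought.
@dataclass(frozen=True)
class SearchListing:
    isbn: str
    title: str
    available: bool


def to_search_listing(book: CheckoutBook) -> SearchListing:
    """Explicit mapping from the checkout model to the search model.

    Price and exact stock counts stay inside the checkout boundary;
    search only learns whether the book is available.
    """
    return SearchListing(
        isbn=book.isbn,
        title=book.title,
        available=book.copies_in_stock > 0,
    )


# Hypothetical example data.
book = CheckoutBook("000-0-00-000000-0", "Microservices for Java Developers", Decimal("0.00"), 3)
listing = to_search_listing(book)
```

The point of the mapping function is that either model can change shape without the other noticing, as long as the translation is kept up to date.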
“A microservice has its own database”
Stick with these conveniences as long as you can. Seriously.
But ...
• Load/size is too great to fit on one box
• Modules/use cases have different read/write characteristics
• Queries/joins are getting too complex
• Security issues
• Lots of conflicting changes to the model/schema
• Need denormalized, optimized indexing engines
• We can live with eventual consistency (whatever that really means)
Kinda looks like a combinatorial mess…
How do we deal with data in this world?
We’re now building a full-fledged distributed system.
Some things to remember…
Plan for failures. Build concepts of time, delay, network, and failures into the design as a first-class citizen.
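One small way to treat delay and failure as first-class is to make every remote call go through a wrapper that expects it to fail. The sketch below is an assumption-laden illustration, not a recommended library: `TransientError` and `call_with_retries` are hypothetical names, and the retry policy (bounded attempts, exponential backoff) is just one common choice.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, dropped connections)."""


def call_with_retries(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Run operation(), retrying transient failures with exponential backoff.

    Re-raises the last TransientError once the attempts are exhausted, so the
    caller still has to decide what a failure means for its own use case.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except TransientError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            sleep(base_delay * (2 ** attempt))  # wait longer after each failure
```

Passing `sleep` as a parameter keeps the delay visible in the design and makes the behaviour easy to test without actually waiting. Retrying is only safe when the operation is idempotent, which is itself a data-design decision.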
https://secure.phabricator.com/book/phabcontrib/article/n_plus_one/
We need “consistency”. But we expect failures. This is starting to sound like CAP theorem…
What is consistency?
The history of past operations we observe as a reader of the data
Consistency models…
https://en.wikipedia.org/wiki/Consistency_model
• Strict consistency (Linearizability)
• Sequential consistency
• Causal consistency
• Processor consistency
• PRAM consistency (FIFO)
• Bounded staleness consistency
• Monotonic read consistency
• Monotonic write consistency
• Read your writes consistency
• Eventual consistency
Linearizable (strict) consistency
Sequential consistency
Monotonic reads consistency
Eventual consistency
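To show how one of these models differs from plain eventual consistency, here is a minimal sketch of monotonic reads. `Replica` and `MonotonicSession` are hypothetical names, and version numbers stand in for whatever ordering a real store uses. An eventually consistent reader could hit a fresh replica and then a stale one and watch the value go backwards; the session below remembers the newest version it has seen and refuses anything older.

```python
class Replica:
    """A copy of the data that may lag behind the primary."""

    def __init__(self):
        self.version = 0
        self.value = None

    def apply(self, version, value):
        # Keep only the newest write this replica has received.
        if version > self.version:
            self.version, self.value = version, value


class MonotonicSession:
    """Remembers the newest version this reader has seen and never goes back."""

    def __init__(self):
        self.last_seen = 0

    def read(self, replicas):
        # Use the first replica that is at least as new as our last read.
        for replica in replicas:
            if replica.version >= self.last_seen:
                self.last_seen = replica.version
                return replica.value
        raise RuntimeError("no replica is fresh enough for this session")


fresh, stale = Replica(), Replica()
fresh.apply(1, "1-0")
fresh.apply(2, "2-0")
stale.apply(1, "1-0")  # this replica has not received the second write yet

session = MonotonicSession()
first = session.read([fresh, stale])   # served by the fresh replica
second = session.read([stale, fresh])  # the stale replica is skipped
```

The cost is visible in the last branch: when only stale replicas are reachable, the session has to fail or wait rather than answer.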
Can we really use relaxed consistency models?
Replicated Data Consistency Explained through Baseball (Doug Terry)
https://www.microsoft.com/en-us/research/publication/replicated-data-consistency-explained-through-baseball/
• What consistency model do you need, depending on what role you’re playing?
• What consistency model are you willing to pay for?
• Official score keeper? (Linearizability or RMW)
• Umpire? (Linearizability)
• Sports writer? (Bounded staleness, Eventual consistency)
• Radio updates? (Monotonic read, Bounded staleness)
• Statistician (Bounded staleness)
• Friends in the pub (Eventual consistency)
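The roles above can be read as different parameters to the same read. The sketch below is my own illustration rather than anything from the paper: `Copy`, `read_score`, and the `max_staleness` convention are hypothetical, and a real store would expose this choice through its own API.

```python
from dataclasses import dataclass


@dataclass
class Copy:
    score: str
    synced_at: float  # last time (in seconds) this copy was in sync with the primary


def read_score(primary, replica, now, max_staleness):
    """Pick a copy based on how much staleness the caller tolerates.

    max_staleness=None -> eventual: whatever the replica has is fine.
    max_staleness=0    -> strong read: always ask the primary.
    max_staleness=N    -> bounded staleness: replica only if synced within N seconds.
    """
    if max_staleness is None:
        return replica.score
    if max_staleness > 0 and now - replica.synced_at <= max_staleness:
        return replica.score
    return primary.score  # strong read requested, or the replica is too stale


primary = Copy("2-5", synced_at=100.0)
replica = Copy("2-4", synced_at=40.0)  # one minute behind

umpire = read_score(primary, replica, now=100.0, max_staleness=0)
pub = read_score(primary, replica, now=100.0, max_staleness=None)
writer = read_score(primary, replica, now=100.0, max_staleness=30)
```

Here the umpire and the sports writer both end up paying for a trip to the primary, while the friends in the pub get a cheap, slightly old answer.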
Tradeoffs to make with read consistency and performance
Maybe we can use a relaxed consistency model for some of those previously mentioned use cases…
Example using sequential consistency…
Internet companies created their own tools for helping with this. (Some are open source!!)
• Yelp – MySQL Streamer: https://github.com/Yelp/mysql_streamer
• LinkedIn – Databus: https://github.com/linkedin/databus
• Zendesk – Maxwell: https://github.com/zendesk/maxwell
Meet debezium.io
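What these change-data-capture tools have in common is that another service can follow a database's changes as a stream of row-level events and build its own view from them. The sketch below is a simplified assumption, not Debezium's API: the event shape is loosely modeled on the `op` / `before` / `after` envelope that Debezium emits, the `id` key and the `apply_change` function are hypothetical, and no Kafka or connector code is involved.

```python
def apply_change(read_model, event):
    """Apply one row-level change event to a dict keyed by primary key.

    Assumed event shape:
        {"op": "c" | "u" | "d", "before": row or None, "after": row or None}
    where each row is a dict with an "id" key.
    """
    op = event["op"]
    if op in ("c", "u"):  # create or update: keep the new row state
        row = event["after"]
        read_model[row["id"]] = row
    elif op == "d":       # delete: drop the row if we have it
        read_model.pop(event["before"]["id"], None)
    else:
        raise ValueError(f"unknown op: {op!r}")
    return read_model


# Events arrive in the order the source database committed them.
events = [
    {"op": "c", "before": None, "after": {"id": 1, "title": "Draft"}},
    {"op": "c", "before": None, "after": {"id": 2, "title": "Other"}},
    {"op": "u", "before": {"id": 1, "title": "Draft"}, "after": {"id": 1, "title": "Final"}},
    {"op": "d", "before": {"id": 2, "title": "Other"}, "after": None},
]

search_view = {}
for event in events:
    apply_change(search_view, event)
```

Because every consumer applies the same events in the same order, each one converges on the same state, only possibly later than the source; that is the relaxed consistency being traded for independence between services.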
Twitter: @christianposta
Blog: http://blog.christianposta.com
Email: [email protected]
Thanks for listening! Time for demo?