Database Evaluation: Selecting Apache Cassandra

Upload: datastax-academy

Post on 16-Jan-2017

Category: Technology


TRANSCRIPT

Page 1: Vertafore: Database Evaluation - Selecting Apache Cassandra

Database Evaluation: Selecting Apache Cassandra

Page 2: Vertafore: Database Evaluation - Selecting Apache Cassandra

Introduction

© 2015. All Rights Reserved.

@ChrisMonosmith [email protected] github.com/cmmonosmith

Page 3: Vertafore: Database Evaluation - Selecting Apache Cassandra

Introduction

Software engineer at Vertafore for more than 6 years.
Gotten my hands in almost every project in East Lansing, enhancing or maintaining.
Attended some talks and the 2014 Cassandra Summit; first time speaking outside of the office.

Page 4: Vertafore: Database Evaluation - Selecting Apache Cassandra

Prelude

First, a little history.

Page 5: Vertafore: Database Evaluation - Selecting Apache Cassandra

Prelude

Monolith farm.
Big Oracle databases with VPDs, PL/SQL.
WebLogic clusters, large web applications.

Page 6: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

Page 7: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

Greenfield development.
Guidelines (next slides).

Page 8: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

[Chart: Up 99.999%, Down 0.001%]

Modern goals for a modern system
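
To make the five-nines goal concrete, here is a minimal sketch (not from the talk) that converts an availability target into a yearly downtime budget. The function name and the 365.25-day year are assumptions made for illustration.

```python
MINUTES_PER_YEAR = 365.25 * 24 * 60  # average year length, leap days included


def downtime_budget_minutes(availability_percent: float) -> float:
    """Minutes per year a system may be down at the given availability target."""
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


if __name__ == "__main__":
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% up allows about {budget:.1f} minutes down per year")
```

At 99.999% the budget works out to a little over five minutes per year, which is what the 0.001% slice represents.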

Page 9: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

Scalable (get it?), but easy and intuitive.

Page 10: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

Maximum security, but not in a way that impedes us or our performance.

Page 11: Vertafore: Database Evaluation - Selecting Apache Cassandra

A New Adventure

“Money is irrelevant to the evaluation.”
Money is always relevant…

Page 12: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

“Where do we start?”
“Find something to evaluate.”

Page 13: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

etc…

Limited exposure to NoSQL or non-relational databases.
Google “nosql”.

Page 14: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

There are so many systems, and they all excel at something.
How do you choose what makes the cut?

Page 15: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

Consider Your Data Model And Goals

Evaluation guidelines: good response times, good throughput.
99th percentiles should also be “good”.

Page 16: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

We care about entities and relationships.

Page 17: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

We also care about the history of these entities.

Page 18: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

Document stores and key-value stores were not on our keep list.
Scary blog posts about data modelling relationships with document stores like MongoDB.

Page 19: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

Page 20: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

Column stores and data abstraction layers looked to be worth our time.

Page 21: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

From here we took a deeper dive into documentation.
We eventually dropped HBase because it seemed to be for a different scale.
We also eventually dropped Datomic because it was very new.

Page 22: Vertafore: Database Evaluation - Selecting Apache Cassandra

Choose Your Party

The Incumbent

In a manner of speaking, everything would be measured against Oracle.

Page 23: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

Page 24: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

Choose An Environment That Is Advantageous

Page 25: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

As powerful a machine as this might be…
Other processes, limited cores and memory.
A cluster built with Cassandra Cluster Manager can’t take advantage of the optimized write path in Cassandra, due to extra disk seeks.

Page 26: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

In-house virtual machines.
May or may not give you more flexibility, depending on who manages them.

Page 27: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

The cloud, e.g. AWS, Microsoft Azure.
We used one called Skytap.
Consider: operating systems, cores, memory, network interfaces, and security.
Cost: pay a little now to save a lot later (migration).

Page 28: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

Something You Can Control

You can’t know all activities up front; you need root to mess with stuff.

Page 29: Vertafore: Database Evaluation - Selecting Apache Cassandra

Level Select

Follow Documented Best Practices

Trust the Experts

Set up virtual machines: OS/kernel, JVM, utilities.
Again, factor in project requirements: do you need a cluster?
There’s a reason the default Cassandra config is what it is.

Page 30: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 31: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Just good life advice.
Whether saving for retirement, saving a game, saving an essay, committing code…
Also this project.

Page 32: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

In OOP, developers gain efficiency through re-use. That doesn’t stop at code.
Game developers re-use assets, like tree models, to make forests.
What we did was use VM snapshots to build our world.

Page 33: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

To do this, we needed help.
Part of building your world is knowing what it should look like.
Knowledge on the business side helps here.
Create projections for:
  Customer base, data volume
  User profile variance (i.e. partition width in C* terms)
Analytics may or may not be obvious…
Keep reporting in the back of your mind.
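
A projection like this can start as plain arithmetic. The sketch below is illustrative only: the `Profile` fields, the one-partition-per-customer layout, and every number in it are hypothetical, not the actual model or figures from this evaluation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Profile:
    """One hypothetical kind of customer; all fields are illustrative."""
    name: str
    customers: int              # projected number of customers of this kind
    entities_per_customer: int  # entities each customer owns
    versions_per_entity: int    # history rows kept per entity


def partition_width(profile: Profile) -> int:
    """Rows per partition, assuming one partition per customer."""
    return profile.entities_per_customer * profile.versions_per_entity


def total_rows(profiles: List[Profile]) -> int:
    """Projected row count across every customer profile."""
    return sum(p.customers * partition_width(p) for p in profiles)


if __name__ == "__main__":
    # Made-up profiles: many small customers, a few very wide ones.
    profiles = [
        Profile("small customer", customers=900, entities_per_customer=2_000, versions_per_entity=5),
        Profile("large customer", customers=100, entities_per_customer=50_000, versions_per_entity=5),
    ]
    for p in profiles:
        print(f"{p.name}: {partition_width(p):,} rows per partition")
    print(f"total rows: {total_rows(profiles):,}")
```

Separating the profiles keeps the variance visible: a few wide partitions can matter more to the data model than the overall row count.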

Page 34: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Different databases will require independent models.
You’ll have to work around unique limitations.
Ask for help! DataStax helped us a lot.
Now let’s talk a little more about snapshots.

Page 35: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Install your database.

Page 36: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 37: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Create your schema

Page 38: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 39: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Build your cluster

Page 40: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 41: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Build your data set. Maybe encrypt it.

Page 42: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 43: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Add application servers

Page 44: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 45: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Add stress test servers

Page 46: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 47: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Add health monitoring services/servers

Page 48: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Page 49: Vertafore: Database Evaluation - Selecting Apache Cassandra

Save Early and Often

Do Anything Yourself At Most Once

Replace manual steps with:
  Snapshots, obviously
  Scripts
  Small services or executables
  Existing tools

Page 50: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

tools

Page 51: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

Check Out The Jepsen Series by Kyle Kingsbury

https://aphyr.com/tags/jepsen

The Jepsen series has lots of info and examples about how different technologies perform in terms of the CAP theorem and data loss in failure/partition scenarios.
It very succinctly points out the difficulties of distributed systems.

Page 52: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

[Diagram: microservices]

We built a representative microservice prototype.
One REST-ish API, many data layers:
  DataStax Java Driver
  Oracle JDBC
HTTP endpoints:
  Creating representative data sets
  Simulated user requests
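
One way to put a single API in front of several data layers is to code the service against a small storage interface and swap implementations per database. This is a sketch under assumptions, written in Python rather than whatever language the prototype used; `EntityStore`, `InMemoryStore`, and `EntityService` are hypothetical names, and the in-memory store merely stands in for real Cassandra and Oracle layers.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class EntityStore(ABC):
    """Storage interface the service depends on (hypothetical)."""

    @abstractmethod
    def save(self, entity_id: str, record: dict) -> None:
        ...

    @abstractmethod
    def load(self, entity_id: str) -> Optional[dict]:
        ...


class InMemoryStore(EntityStore):
    """Stand-in data layer; a Cassandra or Oracle layer would implement the same two methods."""

    def __init__(self) -> None:
        self._rows: Dict[str, dict] = {}

    def save(self, entity_id: str, record: dict) -> None:
        self._rows[entity_id] = dict(record)  # copy so callers cannot mutate stored data

    def load(self, entity_id: str) -> Optional[dict]:
        row = self._rows.get(entity_id)
        return dict(row) if row is not None else None


class EntityService:
    """The API layer: identical no matter which store is plugged in."""

    def __init__(self, store: EntityStore) -> None:
        self._store = store

    def put(self, entity_id: str, record: dict) -> None:
        self._store.save(entity_id, record)

    def get(self, entity_id: str) -> Optional[dict]:
        return self._store.load(entity_id)


if __name__ == "__main__":
    service = EntityService(InMemoryStore())
    service.put("42", {"name": "example"})
    print(service.get("42"))
```

Because the load generator only ever talks to the API layer, the same test suite can be pointed at each candidate database without changes.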

Page 53: Vertafore: Database Evaluation - Selecting Apache Cassandra

Apache JMeter

Bring Your Gear

A sample service needs sample clients.
Used an existing tool: JMeter (introduce JMeter: load testing and performance).
Tools/automation team member built JMeter projects/test suites.
Included a variety of load types: slow, bursty, firehose, read- vs write-heavy.
Important: executable via command line.
Important: wrote all results to disk.
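
Running JMeter from the command line is what makes the suites scriptable. The sketch below builds a non-GUI invocation using JMeter's `-n` (non-GUI), `-t` (test plan), `-l` (results log), and `-J` (property) options; the file names and the `threads` property are hypothetical, and it assumes `jmeter` is on the `PATH`.

```python
import subprocess
from typing import Dict, List, Optional


def jmeter_command(plan: str, results: str, props: Optional[Dict[str, str]] = None) -> List[str]:
    """Build a non-GUI JMeter command that writes every sample to `results`."""
    cmd = ["jmeter", "-n", "-t", plan, "-l", results]
    for name, value in sorted((props or {}).items()):
        cmd.append(f"-J{name}={value}")  # becomes a JMeter property the plan can read
    return cmd


def run_plan(plan: str, results: str, props: Optional[Dict[str, str]] = None) -> int:
    """Run the plan and return JMeter's exit code (requires a JMeter install)."""
    return subprocess.run(jmeter_command(plan, results, props), check=False).returncode


if __name__ == "__main__":
    # Hypothetical plan, results file, and property, for illustration only.
    print(" ".join(jmeter_command("bursty.jmx", "bursty.jtl", {"threads": "50"})))
```

Keeping one results file per run is what lets later tooling compare load types side by side.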

Page 54: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

microservice

OpsCenter

Generate lots of useful data… need more tools.
Consistent format: microservice, JMeter, OpsCenter.
Do some math, write some CSVs.
Find averages, mins, maxes…
Percentiles are great.
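
The math involved is small. As a minimal sketch (not the team's actual tooling), the code below reduces raw latency samples to count, min, mean, percentiles, and max, then writes one CSV row per test run. The nearest-rank percentile method, the column names, and the sample values are assumptions.

```python
import csv
import io
import math
from statistics import mean
from typing import Dict, List

FIELDS = ["run", "count", "min", "mean", "p95", "p99", "max"]


def percentile(samples: List[float], pct: int) -> float:
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(run: str, samples: List[float]) -> Dict[str, object]:
    """Collapse one run's latency samples into a single summary row."""
    return {
        "run": run,
        "count": len(samples),
        "min": min(samples),
        "mean": round(mean(samples), 2),
        "p95": percentile(samples, 95),
        "p99": percentile(samples, 99),
        "max": max(samples),
    }


def to_csv(rows: List[Dict[str, object]]) -> str:
    """Render summary rows as CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


if __name__ == "__main__":
    # Made-up latencies in milliseconds: mostly fast, with a slow tail.
    latencies = [10] * 98 + [500, 900]
    print(to_csv([summarize("example", latencies)]))
```

In this made-up run the mean is 23.8 ms while the 99th percentile is 500 ms, so neither number alone describes what users would see.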

Page 55: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

After a few minutes with Excel…

Page 56: Vertafore: Database Evaluation - Selecting Apache Cassandra

Bring Your Gear

These are some Cassandra numbers.
Average and percentile tell a story together.

Page 57: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

This is the fun part

Page 58: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

Page 59: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

Came in on a Saturday. Doesn’t happen often, outside of releases.
Partly because this was great fun.
Partly because it took so long… couldn’t run every test at once.
Also some anomalies… will get to that later.

Page 60: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

Have to account for bad situations.

Page 61: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

What happens when a party member gets knocked out?
Drop a node during a test.

Page 62: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

What happens when you resurrect the fallen comrade?
Restore the node in the cluster during a test.

Page 63: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

What happens when a new fighter joins your party?
Add an additional node.
Does it help, or just get in the way?

Page 64: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

“Failure Is Always An Option”

- Adam Savage

The goal of every test is to achieve some sort of failure, or you’ve only learned part of the lesson it has to teach.

Page 65: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

When will this thing topple over?
How does that compare to our volume/throughput projections?
Is there room for growth? Can we easily scale?
Every DB works to some extent, even a simple text file used as a database.
Remember: “the goal is always to achieve some sort of failure.”
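
Those questions reduce to finding the last load level that still met the latency goal and comparing it with the projection. A sketch, assuming stepped load tests that each report a 99th-percentile latency; the function names, the 100 ms limit, and all numbers are hypothetical.

```python
from typing import List, Optional, Tuple

Step = Tuple[int, float]  # (offered requests per second, observed p99 latency in ms)


def sustained_load(steps: List[Step], p99_limit_ms: float) -> Optional[int]:
    """Highest load reached before the first step that broke the p99 limit."""
    best = None
    for load, p99 in sorted(steps):
        if p99 > p99_limit_ms:
            break  # first failure: everything above this is past the breaking point
        best = load
    return best


def headroom(sustained: int, projected: int) -> float:
    """How many times the projected load the system sustained."""
    return sustained / projected


if __name__ == "__main__":
    # Made-up step results: latency stays low, then falls off a cliff.
    steps = [(500, 12.0), (1000, 15.0), (2000, 40.0), (4000, 950.0)]
    best = sustained_load(steps, p99_limit_ms=100.0)
    if best is None:
        print("failed at the lowest load tested")
    else:
        print(f"sustained {best} req/s, {headroom(best, projected=800):.1f}x the projection")
```

If no step fails, the test has not found the breaking point yet, so the load should be pushed higher.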

Page 66: Vertafore: Database Evaluation - Selecting Apache Cassandra

Fight Some Dragons

Find Where It Fails Here, Not In Production

This will probably save you time and money and stress, and those Saturdays where you don’t get to just have fun.
When you’ve done this, you’ve achieved a major goal of the evaluation.

Page 67: Vertafore: Database Evaluation - Selecting Apache Cassandra

Collect Your Loot

Every good adventure ends with treasure.

Page 68: Vertafore: Database Evaluation - Selecting Apache Cassandra

Collect Your Loot

Take good notes, on everything.
Design journal: documented our findings incrementally.
Ended up with tremendous volume.

Page 69: Vertafore: Database Evaluation - Selecting Apache Cassandra

Collect Your Loot

It’s like drawing a map.
Lets you retrace your steps, point out pitfalls, help out newcomers.

Page 70: Vertafore: Database Evaluation - Selecting Apache Cassandra

Collect Your Loot

A series of technical meetings and presentations.
Teaching is a great way to learn: taught about Cassandra.
Lots of Q&A with different people, DBAs, etc.
Now you have knowledge, metrics, charts… just line that up with your goals.

Page 71: Vertafore: Database Evaluation - Selecting Apache Cassandra

Collect Your Loot

Those metrics and charts were real “gems”.
We brought our findings to management, and…
we work with very reasonable people.

Page 72: Vertafore: Database Evaluation - Selecting Apache Cassandra

Easter Eggs

During our testing, some of our VMs behaved strangely.
Sporadic poor performance, slow startup, just fine later.
We had the metrics to clearly illustrate this!
Hard to reproduce, but they eventually did.
They pushed out a fix within a couple of days.
It never happened again after their fix.

Page 73: Vertafore: Database Evaluation - Selecting Apache Cassandra

Easter Eggs

teamwork!

Page 74: Vertafore: Database Evaluation - Selecting Apache Cassandra

Replay

Consider Your Data Model And Goals

evaluation guidelines

Page 75: Vertafore: Database Evaluation - Selecting Apache Cassandra

Replay

Choose An Environment That Is Advantageous

Page 76: Vertafore: Database Evaluation - Selecting Apache Cassandra

Replay

Follow Documented Best Practices

Trust the Experts

Page 77: Vertafore: Database Evaluation - Selecting Apache Cassandra

Replay

Do Anything Yourself At Most Once

snapshots! tools!

Page 78: Vertafore: Database Evaluation - Selecting Apache Cassandra

Replay

“Failure Is Always An Option”

- Adam Savage

and find those failures before you deploy anything

Page 79: Vertafore: Database Evaluation - Selecting Apache Cassandra

Questions or Comments

Page 80: Vertafore: Database Evaluation - Selecting Apache Cassandra

Thanks

Go see Robert Johnson!

Robert Johnson, online analytics processing.

Page 81: Vertafore: Database Evaluation - Selecting Apache Cassandra

Thanks

We’re Hiring! tinyurl.com/VertaforeEastLansing

we’re hiring

Page 82: Vertafore: Database Evaluation - Selecting Apache Cassandra

Thanks

@ChrisMonosmith [email protected] github.com/cmmonosmith

me me me

Page 83: Vertafore: Database Evaluation - Selecting Apache Cassandra

Thank you