Elasticsearch Sharding Strategy at Tubular Labs
How we arrived at a sharding strategy
Our Elasticsearch Infrastructure?
• 3 clusters for search/aggregations
• 1 small autocomplete cluster
• 1 medium-sized cluster for internal use
• 1 Elastic Stack cluster
Our Elasticsearch Clusters
© 2016 Tubular Labs
• 2.5 billion documents
• 4TB not including replicas
• Constant indexing load with periodic spikes
• Queries range from simple search requests to heavy terms aggregations
• Not many concurrent queries, but queries can be demanding
• Cluster is very CPU heavy
• Recently migrated from Elasticsearch 1.7 to 2.3
Our Largest Cluster
• We have to reindex anyway
• Our dataset has grown substantially
• Performance wasn’t great
• We don’t want to have to reindex in the near future
Migrating to 2.x is a good time to reconsider sharding
Sharding Strategy
● How many shards should I have per index?
● How large should my shards be?
● How many shards should I have per node?
● What hardware/instance type should I use?
Sharding Questions...
• How large is your dataset?
• How fast will your dataset grow?
• What kinds of queries are you running?
• How fast will usage grow?
• When do you want to reindex next?
• I’m sure there are more...
It Depends...
How do we get answers?
Repeatable Elasticsearch Experiments
What We Want
• Repeatable
  • Others can easily run the same tests and should get about the same results
• Easily modified
• Easy to define and understand
• Easy to run
• Understandable results
Repeatable Elasticsearch Experiments:
• Benchmarking framework for Elasticsearch
• Easily define a set of repeatable tests
  • Tests are defined in JSON
• Compare different configurations
• Sets up a single-node cluster for tests, or targets existing (external) clusters
  • Targeting external clusters is not fully supported, and you’ll get warnings telling you as much
What is Rally?
Terms
• Track - a benchmarking scenario
• Car - system (Elasticsearch) configuration for a benchmark
• Challenge - which benchmarks are run and their configuration
• Race - an actual run of the benchmark
• Tournament - a way to analyze the impact of changes
What is Rally?
Example track config
https://gist.github.com/mdelaney/b710fb3d25fabf7818f471bd4abe70a5
How does Rally work?
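As a rough illustration of what a track definition holds, the sketch below builds a small track as a Python dictionary and serializes it to JSON. The field names follow the track layout used by recent Rally releases, which may differ from the 2016 format in the gist above. The index name, file names, field name, client counts, and iteration counts are hypothetical placeholders, not values from our tracks.

```python
import json


def build_track(index_name, document_count):
    """Build a minimal Rally-style track definition (all values are placeholders)."""
    return {
        "version": 2,
        "description": "Single-shard indexing and query benchmark (illustrative)",
        # index.json would hold the mappings and settings for the index.
        "indices": [{"name": index_name, "body": "index.json"}],
        "corpora": [
            {
                "name": index_name,
                "documents": [
                    {
                        "source-file": "documents.json.bz2",
                        "document-count": document_count,
                    }
                ],
            }
        ],
        "schedule": [
            # Step 1: index the data set in bulk requests.
            {
                "operation": {"operation-type": "bulk", "bulk-size": 5000},
                "clients": 8,
            },
            # Step 2: run a representative query (a terms aggregation here).
            {
                "operation": {
                    "name": "terms-aggregation",
                    "operation-type": "search",
                    "body": {
                        "size": 0,
                        "aggs": {"by_term": {"terms": {"field": "category"}}},
                    },
                },
                "clients": 1,
                "warmup-iterations": 100,
                "iterations": 100,
            },
        ],
    }


track_json = json.dumps(build_track("my-index", 1_000_000), indent=2)
print(track_json)
```

Generating the track from code makes it easy to produce one variant per document count or shard count while keeping the rest of the definition identical.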
Our Experiments and Results
NOTE: The following experiments are written as we would do them next time. Due to time constraints we had to do some of this in parallel. I’ll also mention where we deviated from what is in the next few slides.
• We’re still pretty new at running benchmarks with Elasticsearch, so we’re still learning the best way to do this.
• Running these tests answered a lot of questions (and raised brand new ones)
How we used this at Tubular Labs
How big should my shards be?
Determining a good shard size
The experiment
1. Obtain a realistic data set
2. Write the Rally config to:
  • Index your data (single shard)
  • Run a set of common queries
3. Run benchmark with different document counts
4. Graph the results
Determining a good shard size
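The experiment loop above can be sketched outside of Rally as a small timing harness. This is only an illustration of the shape of the experiment, not what Rally does internally: `make_query` is a hypothetical hook that should return a function running one real query against a single-shard index of the given size, and the stand-in workload below exists only so the sketch runs without a cluster.

```python
import math
import statistics
import time


def measure_latencies(run_query, iterations=20, warmup=5):
    """Call run_query repeatedly and return per-call latencies in milliseconds."""
    for _ in range(warmup):
        run_query()  # warm caches before measuring
    latencies = []
    for _ in range(iterations):
        start = time.perf_counter()
        run_query()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize(latencies):
    """Return median, 90th percentile (nearest rank), and max latency."""
    ordered = sorted(latencies)
    p90_index = max(0, math.ceil(0.9 * len(ordered)) - 1)
    return {
        "median_ms": statistics.median(ordered),
        "p90_ms": ordered[p90_index],
        "max_ms": ordered[-1],
    }


def run_experiment(document_counts, make_query):
    """Benchmark one query at several document counts."""
    return {
        count: summarize(measure_latencies(make_query(count)))
        for count in document_counts
    }


def fake_query_factory(doc_count):
    """Stand-in workload (an assumption): cost grows with the document count."""
    def run():
        sum(range(doc_count // 1000))
    return run


results = run_experiment([1_000_000, 5_000_000, 10_000_000], fake_query_factory)
for count, stats in results.items():
    print(f"{count:>12,} docs  median={stats['median_ms']:.3f} ms  p90={stats['p90_ms']:.3f} ms")
```

The resulting dictionary maps each document count to its latency summary, which is the data you would graph in step 4.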
The queries we used
• Query A and B:
  • Very similar but aggregate on a slightly different set of terms
  • Hits about 10% of our dataset
• Query C and D:
  • Same aggregations as queries A and B
  • Full dataset
Determining a good shard size
Our results
Determining a good shard size
We need to consider
• How fast do you need each query to be?
• How much do you expect your data set to grow before you want to look at reindexing again?
• Your use case likely will have other concerns as well
Determining a good shard size
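The two questions above (a latency target and expected growth) can be combined into a rough shard-count estimate. The sketch below is one possible way to do that, not the method we used: the latency measurements are made-up placeholders rather than our results, and the latency budget and growth factor are assumptions you would replace with your own. Only the 2.5 billion document total comes from the cluster description earlier in the deck.

```python
import math


def max_docs_within_budget(measurements, budget_ms):
    """Return the largest measured docs-per-shard value that meets the budget.

    measurements is a list of (docs_per_shard, latency_ms) pairs.
    Returns None when no measured size is fast enough.
    """
    within = [docs for docs, latency in measurements if latency <= budget_ms]
    return max(within) if within else None


def plan_shards(total_docs, measurements, budget_ms, growth_factor):
    """Estimate a primary shard count that still meets the budget after growth."""
    limit = max_docs_within_budget(measurements, budget_ms)
    if limit is None:
        raise ValueError("no measured shard size meets the latency budget")
    # Start smaller than the limit so shards reach it only after growing.
    docs_per_shard_at_start = limit / growth_factor
    return math.ceil(total_docs / docs_per_shard_at_start)


# Placeholder measurements: (docs per shard, latency in ms). Not real results.
placeholder_measurements = [
    (5_000_000, 120.0),
    (10_000_000, 260.0),
    (20_000_000, 540.0),
    (40_000_000, 1150.0),
]

shards = plan_shards(
    total_docs=2_500_000_000,
    measurements=placeholder_measurements,
    budget_ms=600.0,
    growth_factor=2.0,
)
print(shards)  # 250 with the placeholder numbers above
```

With these placeholder numbers the largest shard within budget holds 20 million documents, so planning for the data to double means starting at 10 million documents per shard.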
How many shards per node?
Determining how many shards per node
The experiment (almost the same as before)
1. Obtain a dataset of realistic data
2. Write the Rally config to:
  • Index your data
  • Run a set of common queries
3. Run benchmark with different shard counts
4. Graph the results
Determining how many shards per node
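To vary only the shard count between runs, it helps to generate the index definition rather than edit it by hand. The sketch below produces one index body per shard count using the standard `number_of_shards` and `number_of_replicas` index settings. The list of shard counts and the zero-replica default are illustrative assumptions, and mappings are left out for brevity.

```python
import json


def index_body(number_of_shards, number_of_replicas=0):
    """Return an index-creation body that differs only in its shard settings."""
    return {
        "settings": {
            "index": {
                "number_of_shards": number_of_shards,
                "number_of_replicas": number_of_replicas,
            }
        }
    }


# Hypothetical shard counts to compare; pick values that fit your data set.
shard_counts = [1, 2, 4, 8]
bodies = {count: index_body(count) for count in shard_counts}

for count, body in bodies.items():
    print(count, json.dumps(body))
```

Each body can then be saved as the index definition for one benchmark run, so the data and queries stay the same and only the shard count changes.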
What we did differently this time (time constraints)
• Used the Apache HTTP Benchmark Tool with a script to run the queries.
• Our production cluster had 26 data nodes with about 200 million documents each
• Wanted to avoid expanding the cluster further if at all possible (c3.8xlarge is pricey!)
  • 10 total shards per node (about 20 million docs/shard)
  • 16 total shards per node (about 12.5 million docs/shard)
  • 32 total shards per node (about 6.25 million docs/shard)
• Tested on 3 node clusters (2 data nodes, 1 client/master)
Determining how many shards per node
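The documents-per-shard figures above follow directly from dividing the roughly 200 million documents on each data node by the number of shards on that node. A short check of that arithmetic (the function name is just for illustration):

```python
DOCS_PER_NODE = 200_000_000  # about 200 million documents per data node


def docs_per_shard(docs_per_node, shards_per_node):
    """Average number of documents in each shard on one node."""
    return docs_per_node / shards_per_node


for shards in (10, 16, 32):
    millions = docs_per_shard(DOCS_PER_NODE, shards) / 1_000_000
    print(f"{shards} shards per node -> about {millions:g} million docs/shard")
```

This prints 20, 12.5, and 6.25 million documents per shard for the three configurations.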
Our Results - Testing Number of Shards per node
Charts: query response by shard count (C 1); query response by shard count (C 3)
Our Results - Testing Number of Shards per node
Charts: query response, production vs. test (C 1); query response, production vs. test (C 3)
Production - 26 data nodes
Test Cluster - 2 data nodes
• Significant performance drop at each level of testing. Why?
• A single shard on a single node performed much better than our multiple-shards-per-node tests
• The fully loaded 3-node cluster performed much better than our full cluster in production
• Impact of moving to a machine with more memory
  • Will the extra file system cache make a large difference?
New Questions Raised
Query load isn’t evenly distributed
Current path of performance investigation
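Shard layout diagram (three groups of ten shard copies; * as marked on the slide):
Group 1: 1, 4, 3*, 2*, 5*, 8*, 10, 13*, 11, 6*
Group 2: 2, 5, 7*, 4*, 10*, 9*, 11*, 12*, 14, 15
Group 3: 3, 6, 1*, 9, 13, 8, 12, 7, 15*, 14*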
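One way to read the shard layout above is to count the marked copies in each of its three groups of ten. The sketch below does that under two assumptions that the transcript does not state: that each group is one data node, and that an asterisk marks the copy of a shard that a query was routed to.

```python
# Shard copies as listed in the layout, one list per group of ten.
# Assumption: each group is one data node and "*" marks the copy a query hit.
groups = {
    "group 1": ["1", "4", "3*", "2*", "5*", "8*", "10", "13*", "11", "6*"],
    "group 2": ["2", "5", "7*", "4*", "10*", "9*", "11*", "12*", "14", "15"],
    "group 3": ["3", "6", "1*", "9", "13", "8", "12", "7", "15*", "14*"],
}


def count_marked(copies):
    """Count the shard copies that carry the asterisk marker."""
    return sum(1 for copy in copies if copy.endswith("*"))


marked_per_group = {name: count_marked(copies) for name, copies in groups.items()}
for name, marked in marked_per_group.items():
    print(f"{name}: {marked} marked of {len(groups[name])} copies")
# group 1: 6, group 2: 6, group 3: 3
```

Under that reading, two groups each handle six of the fifteen shards while the third handles only three, which is the kind of imbalance the slide title describes.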
Problems We Encountered
Rally-related
• With nested documents, the document count in track.json does not match the document count Rally checks at the end of indexing.
• Multi-node support is not yet available
Problems We Encountered?
Not Rally-related
• Performance in reality wasn’t as good as our testing suggested it should be
  • We haven’t found the reason for this yet
  • We’ve noticed a correlation between the number of shards a query hits per node and the time taken to run the query on each shard, but have not yet identified the bottleneck.
  • We were able to mitigate this by adding more data nodes
Problems We Encountered?
Thank You!
Questions?