Elasticsearch cluster deep dive

Post on 22-Jan-2018



NoSQL: Text Search and Document

Elasticsearch cluster

A distributed cluster (see the cluster documentation) with several node roles:

● Client Nodes
● Data Nodes
● Master Nodes
● Ingest Nodes

Today's view of the cluster: Master Nodes and Other Nodes.

What happens when a node starts?

(The slides illustrate this with five nodes, A through E, where C is the current master.)

Starting:

1. Get a list of nodes to ping from the config.
2. Each ping response contains:
   a. cluster name
   b. node details
   c. master node details
   d. cluster state version
3. Only keep master-eligible responses, based on discovery.zen.master_election.ignore_non_master_pings.

● List of master nodes: [C, C]
● List of eligible master nodes: [A, B, C]

Joining (cluster state update):

1. Join the master node (C) by sending internal:discovery/zen/join.
2. The master validates the join by sending internal:discovery/zen/join/validate.
3. The master updates the cluster state with the new node.
4. The master waits for discovery.zen.minimum_master_nodes master-eligible nodes to respond.
5. The change is committed and a confirmation is sent.
6. The new node checks the received state for:
   a. a new master node
   b. no master node in the state
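The ping-filtering step above can be sketched as follows. This is an illustrative model, not the actual Elasticsearch implementation; the names `PingResponse` and `pick_candidates` are hypothetical, and we assume the master's own response is not counted in the active-masters list.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PingResponse:
    cluster_name: str
    node_id: str
    master_eligible: bool
    master_node: Optional[str]      # who this node thinks the master is
    cluster_state_version: int

def pick_candidates(responses, ignore_non_master_pings=False):
    """Mimic discovery.zen.master_election.ignore_non_master_pings:
    optionally drop pings from nodes that are not master-eligible."""
    if ignore_non_master_pings:
        responses = [r for r in responses if r.master_eligible]
    # Nodes that reported an active master, and the master-eligible candidates.
    active_masters = [r.master_node for r in responses if r.master_node]
    eligible = sorted(r.node_id for r in responses if r.master_eligible)
    return active_masters, eligible

# Node E starts and pings A-D; C is the master (C's response names no master
# for itself in this sketch).
pings = [
    PingResponse("prod", "A", True,  "C",  18),
    PingResponse("prod", "B", True,  "C",  18),
    PingResponse("prod", "C", True,  None, 18),
    PingResponse("prod", "D", False, "C",  18),
]
masters, eligible = pick_candidates(pings, ignore_non_master_pings=True)
# masters -> ["C", "C"], eligible -> ["A", "B", "C"], matching the slide.
```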

Master fault detection

● Every discovery.zen.fd.ping_interval, each node pings the master (default 1s)
● Timeout is discovery.zen.fd.ping_timeout (default 30s)
● Retries: discovery.zen.fd.ping_retries (default 3)

Node fault detection

● The same mechanism in the other direction: the master pings every node, using the same discovery.zen.fd.ping_interval, ping_timeout, and ping_retries settings.
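The interval/timeout/retry behaviour can be sketched as a small loop. This is a toy model of one detection round, not Elasticsearch code; `detect_failure` is a hypothetical name, and each `ping()` call stands for one attempt bounded by ping_timeout.

```python
def detect_failure(ping, ping_retries=3):
    """Declare the target failed only after `ping_retries`
    consecutive failed pings (each attempt bounded by ping_timeout)."""
    failures = 0
    while failures < ping_retries:
        if ping():          # target responded: round ends, no failure
            return False
        failures += 1       # one more consecutive failed ping
    return True             # ping_retries consecutive failures

# A target that fails twice, then answers: not declared failed.
attempts = iter([False, False, True])
assert detect_failure(lambda: next(attempts)) is False

# A target that never answers: declared failed after 3 tries.
assert detect_failure(lambda: False) is True
```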

Master election

A minimum number of master-eligible candidates is required for an election.

Network partition (nodes A–F, master C):

● On the side without enough candidates, a master election cannot happen and the existing master steps down.
● On the other side, master fault detection triggers a new master election.

How the new master is chosen:

1. Based on the list of master-eligible nodes, a node chooses, in priority order:
   a. the node with the highest cluster state version (part of the ping response)
   b. a master-eligible node
   c. the first node after sorting the ids of the remaining nodes alphabetically
2. It sends a join to this new master. In the meantime, the candidate accumulates join requests.

If the current node elected itself as master, it waits for the minimum number of join requests (discovery.zen.minimum_master_nodes) before declaring itself master.

In case of master failure detection, each node removes the failed master from the candidates.
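The ordering above can be sketched as a sort key: highest cluster state version first, ties broken by the alphabetically first node id, with no winner unless the quorum is met. `choose_master` is a hypothetical name for illustration, not an Elasticsearch API.

```python
def choose_master(candidates, minimum_master_nodes):
    """candidates: dicts with id, version (cluster state), master_eligible."""
    eligible = [c for c in candidates if c["master_eligible"]]
    if len(eligible) < minimum_master_nodes:
        return None   # not enough candidates: no election possible
    # Highest cluster state version wins; ties broken by lowest node id.
    best = sorted(eligible, key=lambda c: (-c["version"], c["id"]))[0]
    return best["id"]

nodes = [
    {"id": "B", "version": 20, "master_eligible": True},
    {"id": "A", "version": 20, "master_eligible": True},
    {"id": "C", "version": 18, "master_eligible": True},   # stale state
    {"id": "D", "version": 21, "master_eligible": False},  # never a candidate
]
assert choose_master(nodes, minimum_master_nodes=2) == "A"      # v20, first id
assert choose_master(nodes[:1], minimum_master_nodes=2) is None  # no quorum
```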

Latest cluster version

Lost update: partially fixed in 5.0, found by a Jepsen test.

(Diagram sequence with nodes A–F: on one side of a partition the cluster state advances from v18 to v19 and then v20, while the isolated node stays behind on an older version. A node holding a stale cluster state version cannot become the master.)

Shard allocation

Shard assigned to a new node:

1. The master rebalances shard allocation to have:
   a. the same average number of shards per node
   b. the same average number of shards per index per node, avoiding two shards with the same id on the same node
2. It uses deciders to decide which shard goes where, based on:
   a. hot/warm setup (time-based indices)
   b. disk usage allocation (low watermark and high watermark)
   c. throttling (if a node is already recovering, the master might try again later)
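The decider chain above can be sketched as a list of functions returning YES/NO/THROTTLE, where NO vetoes the node and THROTTLE defers the allocation. The decider names and node representation are illustrative, not Elasticsearch's actual classes.

```python
YES, NO, THROTTLE = "YES", "NO", "THROTTLE"

def same_shard_decider(node, shard):
    # Never put two copies of the same shard on one node.
    return NO if shard["id"] in node["shard_ids"] else YES

def disk_watermark_decider(node, shard, high_watermark=0.90):
    # Refuse nodes above the high disk watermark.
    return NO if node["disk_used"] >= high_watermark else YES

def throttle_decider(node, shard, max_recoveries=2):
    # Defer if the node is already busy recovering shards.
    return THROTTLE if node["recovering"] >= max_recoveries else YES

def can_allocate(node, shard, deciders):
    decisions = [d(node, shard) for d in deciders]
    if NO in decisions:
        return NO
    if THROTTLE in decisions:
        return THROTTLE   # master will try again later
    return YES

deciders = [same_shard_decider, disk_watermark_decider, throttle_decider]
node = {"shard_ids": {"logs[0]"}, "disk_used": 0.50, "recovering": 0}
assert can_allocate(node, {"id": "logs[0]"}, deciders) == NO   # copy already here
assert can_allocate(node, {"id": "logs[1]"}, deciders) == YES
```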

Shard initialization (Primary)

1. The master communicates a new shard assignment through the cluster state.
2. The node initializes an empty shard.
3. The node notifies the master.
4. The master marks the shard as started.
5. If this is the first shard with a specific id, it is marked as primary and receives requests.

Shard initialization (Replica)

1. The master communicates a new shard assignment through the cluster state.
2. The node initializes recovery from the primary.
3. The node notifies the master.
4. The master marks the replica as started.
5. The node activates the replica.

Shard recovery

(Diagram: anatomy of a shard. On disk, a shard is a set of Lucene segments (S1, S2, S3) plus a commit point; in memory, it has an in-memory indexing buffer; operations are also recorded in the translog.)
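The anatomy in the diagram can be modelled as a toy class: documents land in the in-memory buffer and the translog; a refresh turns the buffer into a searchable segment; a flush commits segments (new commit point) and clears the translog. The method names follow Elasticsearch terminology, but this is only a sketch of the concepts, not real index machinery.

```python
class Shard:
    def __init__(self):
        self.buffer = []        # in-memory indexing buffer
        self.translog = []      # durable operation log
        self.segments = []      # committed "on-disk" Lucene segments

    def index(self, doc):
        self.buffer.append(doc)     # buffered in memory...
        self.translog.append(doc)   # ...and recorded in the translog

    def refresh(self):
        # Buffer becomes a searchable segment; the translog is kept.
        if self.buffer:
            self.segments.append(list(self.buffer))
            self.buffer.clear()

    def flush(self):
        # Write a new commit point and clear the translog.
        self.refresh()
        self.translog.clear()

s = Shard()
s.index({"id": 1})
s.index({"id": 2})
s.refresh()
assert len(s.segments) == 1 and len(s.translog) == 2  # refreshed, not flushed
s.flush()
assert s.translog == []  # commit point written, translog cleared
```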

Recovery from primary

Between the node with the primary and the node with the replica:

1. The replica node sends a "start recovery" request.
2. The primary node validates the request, prevents the translog from being deleted, and snapshots Lucene.
3. Segments are copied to the replica.
4. Translog operations are sent to the replica.
5. The replica node notifies the master.
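The steps above can be sketched as a single function: copy a stable snapshot of the segment files, then send the retained translog operations, then notify the master. `recover_replica` and the dict shapes are hypothetical, chosen only to make the flow concrete.

```python
def recover_replica(primary, replica, notify_master):
    # 1-2. Validate and snapshot: take a stable view of the segments
    # while translog deletion is prevented on the primary.
    segments_snapshot = list(primary["segments"])
    # 3. Copy segment files to the replica.
    replica["segments"] = segments_snapshot
    # 4. Send the retained translog operations to the replica.
    for op in primary["translog"]:
        replica["ops"].append(op)
    # 5. Tell the master the replica is ready to be started.
    notify_master("shard started")

events = []
primary = {"segments": ["s1", "s2"], "translog": ["op1"]}
replica = {"segments": [], "ops": []}
recover_replica(primary, replica, events.append)
assert replica["segments"] == ["s1", "s2"]
assert replica["ops"] == ["op1"]
assert events == ["shard started"]
```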

Thank you !
