Replace a Dead Node in a Scylla Cluster

The replace dead node operation causes the other nodes in the cluster to stream data to the replacement node. This operation can take some time, depending on the data size and network bandwidth.

Prerequisites

  • Verify the status of the node using the nodetool status command. A node with status DN is down and needs to be replaced.
Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
DN  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1
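
On a larger cluster it can be easier to filter the status output than to read it by eye. The following is a minimal sketch, assuming the column layout shown above (status code in the first column, address in the second). The function name down_nodes is hypothetical and is not part of Scylla or nodetool.

```shell
# Print the address of every node whose status/state code is DN.
# Reads `nodetool status` output on stdin, so it can be tried on saved output too.
down_nodes() {
    awk '$1 == "DN" { print $2 }'
}

# Usage on a live node (requires nodetool):
#   nodetool status | down_nodes
```

With the sample output above, this would print only the address of the node marked DN.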

Log in to one of the nodes in the cluster with UN status and collect the following information from the node:

  • cluster_name - cat /etc/scylla/scylla.yaml | grep cluster_name
  • seeds - cat /etc/scylla/scylla.yaml | grep seeds:
  • endpoint_snitch - cat /etc/scylla/scylla.yaml | grep endpoint_snitch
  • Scylla version - scylla --version
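
The three scylla.yaml lookups above can be gathered in one pass. This is a sketch only: the function name show_cluster_settings is hypothetical, and it assumes the settings appear on uncommented lines of the form `key: value`, as in a typical scylla.yaml.

```shell
# Print the uncommented cluster_name, seeds and endpoint_snitch lines.
# The default path is the one used in this procedure; pass another path to override it.
show_cluster_settings() {
    conf="${1:-/etc/scylla/scylla.yaml}"
    grep -v '^[[:space:]]*#' "$conf" | grep -E '(cluster_name|endpoint_snitch|seeds):'
}

# Usage:
#   show_cluster_settings
#   scylla --version    # the version is reported by the binary, not by scylla.yaml
```

Skipping commented lines avoids picking up example values that are left in the file as comments.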

Procedure

  1. Check whether the dead node is a seed node using cat /etc/scylla/scylla.yaml | grep seeds:. If the dead node's IP address is in the list, it needs to be replaced.
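
The seed check can also be scripted so that it returns a yes/no answer. The sketch below uses a hypothetical helper named is_seed and assumes the seeds are listed on an uncommented line containing `seeds:`. Whole-word matching is used so that 192.168.1.20 does not match 192.168.1.203.

```shell
# Succeed (exit status 0) if the given IP appears on an uncommented "seeds:" line.
is_seed() {
    ip="$1"
    conf="${2:-/etc/scylla/scylla.yaml}"
    grep -v '^[[:space:]]*#' "$conf" | grep 'seeds:' | grep -qFw -- "$ip"
}

# Usage:
#   if is_seed 192.168.1.203; then echo "dead node is listed as a seed"; fi
```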

  2. Install Scylla on a new node; see Getting Started for further instructions. Follow the Scylla install procedure up to the scylla.yaml configuration phase. Make sure that the Scylla version on the new node is identical to that of the other nodes in the cluster.

  3. In the scylla.yaml file, edit the parameters listed below. The file can be found under /etc/scylla/.

  • cluster_name - Set to the cluster_name collected in Prerequisites
  • listen_address - IP address that Scylla uses to connect to other Scylla nodes in the cluster
  • seeds - Set to the seed nodes collected in Prerequisites
  • auto_bootstrap - By default, this parameter is set to true. It allows new nodes to migrate data to themselves automatically
  • endpoint_snitch - Set to the snitch collected in Prerequisites
  • rpc_address - Address for client connection (Thrift, CQL)

  4. Add the replace_address parameter to the scylla.yaml file. This line can be added anywhere in the file and is used for the first start only. After a successful replace, remove it before Scylla’s next restart.

The correct format is:

replace_address: 192.168.1.203
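
Because the line has to be added before the first start and removed again afterwards, it can help to script both edits. This is a minimal sketch with two hypothetical helpers, add_replace_address and remove_replace_address. It assumes replace_address is written as a top-level `key: value` line, and writing to /etc/scylla/scylla.yaml normally requires root privileges.

```shell
# Append replace_address to the config unless such a line already exists.
add_replace_address() {
    ip="$1"
    conf="${2:-/etc/scylla/scylla.yaml}"
    if grep -q '^[[:space:]]*replace_address:' "$conf"; then
        echo "replace_address is already set in $conf" >&2
        return 1
    fi
    # Make sure the file ends with a newline before appending.
    if [ -n "$(tail -c 1 "$conf")" ]; then
        echo >> "$conf"
    fi
    printf 'replace_address: %s\n' "$ip" >> "$conf"
}

# Remove the replace_address line after a successful replace.
remove_replace_address() {
    conf="${1:-/etc/scylla/scylla.yaml}"
    tmp="$(mktemp)" || return 1
    grep -v '^[[:space:]]*replace_address:' "$conf" > "$tmp"
    cat "$tmp" > "$conf"   # overwrite in place so the file keeps its owner and mode
    rm -f "$tmp"
}

# Usage (as root):
#   add_replace_address 192.168.1.203     # before the first start
#   remove_replace_address                # after the replace, before the next restart
```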

  5. Start the Scylla node

sudo systemctl start scylla-server

  6. Verify that the node has been added to the cluster using the nodetool status command

For example:

The nodes in the cluster stream data to the replaced node:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UJ  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1

The nodes in the cluster have successfully completed streaming data to the replaced node:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UN  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1
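
The transition from UJ to UN shown above can be watched from a script instead of rerunning nodetool status by hand. The sketch below assumes the same column layout as the sample output. The helper names node_state and wait_for_normal are hypothetical, and the 30-second polling interval is an arbitrary choice.

```shell
# Print the status/state code (for example UJ or UN) for one address.
# Reads `nodetool status` output on stdin.
node_state() {
    awk -v ip="$1" '$2 == ip { print $1 }'
}

# Poll until the given node reports UN (requires nodetool on the local node).
wait_for_normal() {
    target="$1"
    until [ "$(nodetool status | node_state "$target")" = "UN" ]; do
        sleep 30
    done
}

# Usage:
#   wait_for_normal 192.168.1.203 && echo "streaming finished"
```

Note that wait_for_normal loops indefinitely if the node never reaches UN, so it is best run interactively or wrapped in a timeout.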

  7. Run the nodetool repair command on the node that was replaced to make sure that the data is synced with the other nodes in the cluster
