Adding a New Node Into an Existing Scylla Cluster (Scale Out)

Adding a new node causes the other nodes in the cluster to stream data to it. This operation can take some time, depending on the data size and network bandwidth.

Prerequisites

Before adding the new node, check the status of the nodes in the cluster using the nodetool status command. If one or more of the nodes is down, a new node cannot be added to the cluster.

For Example:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
DN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1

In order to proceed, start the down node or remove it from the cluster.

Log in to one of the nodes in the cluster and collect the following information from it:

  • cluster_name - grep cluster_name /etc/scylla/scylla.yaml
  • seeds - grep seeds: /etc/scylla/scylla.yaml
  • endpoint_snitch - grep endpoint_snitch /etc/scylla/scylla.yaml
  • Scylla version - scylla --version
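The three scylla.yaml settings above can be collected with a single grep. The sketch below runs against a temporary sample file, since /etc/scylla/scylla.yaml only exists on a Scylla node; the values shown are illustrative assumptions, not real cluster settings.

```shell
# Sample of the settings you need; on a real node, grep
# /etc/scylla/scylla.yaml directly instead of this sample file.
cat > /tmp/scylla-sample.yaml <<'EOF'
cluster_name: 'my-cluster'
seeds: "192.168.1.201"
endpoint_snitch: GossipingPropertyFileSnitch
EOF

# Extract the settings to copy into the new node's scylla.yaml.
settings=$(grep -E '^(cluster_name|seeds|endpoint_snitch)' /tmp/scylla-sample.yaml)
echo "$settings"
```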

Procedure

1. Install Scylla on the new node; see Getting Started for further instructions. Follow the Scylla install procedure up to the scylla.yaml configuration phase. Make sure that the Scylla version of the new node is identical to that of the other nodes in the cluster.

If the node starts during this process, follow these instructions.

2. Edit the parameters listed below in the scylla.yaml configuration file, located under /etc/scylla/:

  • cluster_name - Set the selected cluster name
  • listen_address - IP address that Scylla uses to connect to the other Scylla nodes in the cluster
  • seeds - Set the selected seed nodes
  • auto_bootstrap - By default, this parameter is set to true; it allows new nodes to stream data to themselves automatically
  • endpoint_snitch - Set the selected snitch
  • rpc_address - Address for client connections (Thrift, CQL)
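Taken together, the new node's scylla.yaml might contain entries like the following. All values here are illustrative assumptions; copy cluster_name, seeds, and endpoint_snitch from an existing node, and use the new node's own IP for listen_address and rpc_address. Note that in scylla.yaml the seeds setting lives inside the seed_provider block.

```yaml
cluster_name: 'my-cluster'          # must match the existing cluster
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.201"  # seed node(s) collected earlier
listen_address: 192.168.1.203       # this node's IP, reachable by the other nodes
auto_bootstrap: true                # default; new node streams data to itself
endpoint_snitch: GossipingPropertyFileSnitch   # must match the existing cluster
rpc_address: 192.168.1.203          # address for client connections (Thrift, CQL)
```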
3. Start the Scylla node using

sudo systemctl start scylla-server

4. Verify that the node was added to the cluster using the nodetool status command. While the other nodes in the cluster stream data to the new node, the new node will be in Up Joining (UJ) status. It may take some time (depending on the data size and network bandwidth) until the node status becomes Up Normal (UN).

For Example:

The nodes in the cluster stream data to the new node

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UJ  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1

The nodes in the cluster have finished streaming data to the new node

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UN  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1

5. When the new node status is Up Normal (UN), run the nodetool cleanup command on all the nodes in the cluster except the new node that has just been added. This removes keys that no longer belong to those nodes. Run this command on one node at a time. It is possible to postpone this step to low-demand hours.
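The cleanup step above can be sketched as a sequential loop over the pre-existing nodes. The node addresses and passwordless ssh access are assumptions; on each node the actual command is simply nodetool cleanup.

```shell
# Hypothetical addresses of the pre-existing nodes (not the new node).
existing_nodes="192.168.1.201 192.168.1.202"

# Cleanup is I/O-intensive, so run it on one node at a time.
for ip in $existing_nodes; do
  echo "cleanup on $ip"
  # ssh "$ip" nodetool cleanup   # uncomment to run against real nodes
done
```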
