Adding a New Node Into an Existing Scylla Cluster (Scale Out)

Adding a new node causes the other nodes in the cluster to stream data to it. This operation can take some time, depending on the data size and network bandwidth. If you are using a multi-availability-zone deployment, make sure the zones remain balanced after the addition.

Prerequisites

Before adding the new node, check the status of the nodes in the cluster using the nodetool status command. You cannot add new nodes to the cluster if any of the existing nodes are down.

For Example:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
DN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1

In order to proceed, start the node or remove it from the cluster.

Log in to one of the nodes in the cluster and collect the following information from it:

  • cluster_name - cat /etc/scylla/scylla.yaml | grep cluster_name

  • seeds - cat /etc/scylla/scylla.yaml | grep seeds:

  • endpoint_snitch - cat /etc/scylla/scylla.yaml | grep endpoint_snitch

  • Scylla version - scylla --version

  • Authentication status - cat /etc/scylla/scylla.yaml | grep authenticator
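The collection steps above can be sketched as a small shell helper. This is illustrative only; collect_node_info is a hypothetical name, and the grep patterns mirror the commands listed above:

```shell
# Sketch: gather the cluster settings needed for the new node's scylla.yaml.
# collect_node_info is a hypothetical helper; pass it the path to
# scylla.yaml (normally /etc/scylla/scylla.yaml).
collect_node_info() {
    conf="$1"
    grep -E '(cluster_name|seeds|endpoint_snitch|authenticator):' "$conf"
}

# On an existing cluster node you would run:
#   collect_node_info /etc/scylla/scylla.yaml
#   scylla --version
```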

Note

If authenticator is set to PasswordAuthenticator, increase the replication factor of the system_auth keyspace.

For example

ALTER KEYSPACE system_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', 'dc1' : <new_replication_factor>};

It is recommended to set the system_auth replication factor to the number of nodes in each DC, or to 5, whichever is smaller.
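The "smaller of node count and 5" rule can be made concrete with a short sketch; the helper names here are illustrative, not part of Scylla:

```shell
# Sketch: compute the recommended system_auth replication factor
# (the smaller of the DC's node count and 5) and print the ALTER statement.
# recommended_auth_rf and auth_alter_stmt are hypothetical helper names.
recommended_auth_rf() {
    nodes="$1"
    if [ "$nodes" -lt 5 ]; then echo "$nodes"; else echo 5; fi
}

auth_alter_stmt() {
    dc="$1"; nodes="$2"
    rf=$(recommended_auth_rf "$nodes")
    echo "ALTER KEYSPACE system_auth WITH REPLICATION = {'class' : 'NetworkTopologyStrategy', '$dc' : $rf};"
}

# Example: a 3-node DC gets RF 3, an 8-node DC gets RF 5.
```

Note that after changing a keyspace's replication factor, a repair of that keyspace is generally required for the change to take full effect.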

Procedure

1. Install Scylla on the new node; see Getting Started for further instructions. Follow the Scylla installation procedure up to the scylla.yaml configuration phase. Make sure that the Scylla version on the new node is identical to that of the other nodes in the cluster.

If the node starts up during the process, follow these instructions.

Note

Make sure to use the same Scylla patch release on the new/replaced node to match the rest of the cluster. It is not recommended to add a node running a different release to the cluster. For example, use one of the following to install a specific Scylla patch release (substitute your deployed version):

  • Scylla Enterprise - sudo yum install scylla-enterprise-2018.1.9

  • Scylla open source - sudo yum install scylla-3.0.3

Note

It’s important to keep the I/O scheduler configuration in sync on nodes with the same hardware. For that reason, we recommend skipping scylla_io_setup when provisioning a new node with exactly the same hardware setup as the existing nodes in the cluster.

Instead, we recommend copying the following files from an existing node to the new node after running scylla_setup, and then restarting the scylla-server service (if it is already running):
  • /etc/scylla.d/io.conf

  • /etc/scylla.d/io_properties.yaml

Using different I/O scheduler configurations may result in unnecessary bottlenecks.
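The copy-and-restart step can be sketched as follows; stage_io_conf is an illustrative helper, and the remote commands in the comments assume passwordless SSH to the new node:

```shell
# Sketch: stage the two I/O scheduler configuration files for the new node.
# stage_io_conf is an illustrative helper; in practice you would copy the
# files from an existing node to the new node and restart scylla-server:
#   scp /etc/scylla.d/io.conf /etc/scylla.d/io_properties.yaml <new_node>:/etc/scylla.d/
#   ssh <new_node> sudo systemctl restart scylla-server
stage_io_conf() {
    src="$1"; dest="$2"
    cp "$src/io.conf" "$src/io_properties.yaml" "$dest/"
}
```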

2. Edit the parameters listed below in the scylla.yaml file, which can be found under /etc/scylla/:

  • cluster_name - Set the selected cluster_name

  • listen_address - IP address that Scylla uses to connect to the other Scylla nodes in the cluster

  • auto_bootstrap - By default, this parameter is set to true; it allows new nodes to automatically migrate data to themselves.

  • endpoint_snitch - Set the selected snitch

  • rpc_address - Address for client connections (Thrift, CQL)

  • seeds - Set the IP addresses of the current seed nodes. Do not add IPs of any new nodes.

Note

The added node should not be listed as a seed node. Scylla does not support adding new nodes to a cluster without bootstrapping them first.
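Putting these edits together, the relevant part of the new node's /etc/scylla/scylla.yaml might look like the following. All values are illustrative placeholders; use the cluster_name, snitch, and seed IPs collected earlier, and note that the new node's own IP is not in the seeds list:

```yaml
# Illustrative scylla.yaml fragment for a new node at 192.168.1.203
# (placeholder address; substitute your own values).
cluster_name: 'mycluster'
listen_address: 192.168.1.203
rpc_address: 192.168.1.203
auto_bootstrap: true
endpoint_snitch: GossipingPropertyFileSnitch
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "192.168.1.201,192.168.1.202"
```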

3. Start the Scylla node using one of the following, depending on your platform:

sudo systemctl start scylla-server
sudo service scylla-server start
docker exec -it some-scylla supervisorctl start scylla

(for the Docker option, the some-scylla container must already be running)

4. Verify that the node was added to the cluster using the nodetool status command. Since the other nodes in the cluster stream data to the new node, the new node will be in Up Joining (UJ) status. It may take some time (depending on the data size and network bandwidth) until the node’s status changes to Up Normal (UN).

For Example:

The nodes in the cluster stream data to the new node:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UJ  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1

The nodes in the cluster have finished streaming data to the new node:

Datacenter: DC1
Status=Up/Down
State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)                         Host ID         Rack
UN  192.168.1.201  112.82 KB  256     32.7%             8d5ed9f4-7764-4dbd-bad8-43fddce94b7c   B1
UN  192.168.1.202  91.11 KB   256     32.9%             125ed9f4-7777-1dbn-mac8-43fddce9123e   B1
UN  192.168.1.203  124.42 KB  256     32.6%             675ed9f4-6564-6dbd-can8-43fddce952gy   B1
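Waiting for the transition from UJ to UN can be automated with a small sketch. node_is_un is an illustrative helper that inspects captured nodetool status output; the poll loop in the comment uses a placeholder address:

```shell
# Sketch: test whether a node shows as UN in `nodetool status` output.
# node_is_un is an illustrative helper taking the captured output and the
# node's address. A poll loop on an existing node could look like:
#   until node_is_un "$(nodetool status)" 192.168.1.203; do sleep 60; done
node_is_un() {
    status="$1"; addr="$2"
    echo "$status" | grep -q "^UN  *$addr"
}
```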

5. When the new node's status is Up Normal (UN), run the nodetool cleanup command on all the nodes in the cluster except the new node that has just been added. It removes keys that no longer belong to the node. Run this command on one node at a time. It is possible to postpone this step to low-demand hours.

Note

If you are using Scylla Enterprise 2018.1.5 or lower, or Scylla Open Source 2.3 or lower, do not run the nodetool cleanup command before upgrading to the latest release of your branch; see this issue for further information.
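The one-node-at-a-time requirement for cleanup can be sketched as a sequential loop. This assumes passwordless SSH from an admin host; cleanup_all is an illustrative name, and the addresses in the usage comment are placeholders:

```shell
# Sketch: run `nodetool cleanup` on each old node sequentially, assuming
# passwordless SSH from an admin host. cleanup_all is an illustrative helper.
cleanup_all() {
    for host in "$@"; do
        # ssh blocks until cleanup on that node completes, so only one
        # node is being cleaned at any given time.
        ssh "$host" nodetool cleanup || return 1
    done
}

# Usage (old nodes only; never include the newly added node):
#   cleanup_all 192.168.1.201 192.168.1.202
```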

6. Wait until the new node becomes UN (Up Normal) in the output of nodetool status on one of the old nodes. After that, you may edit the scylla.yaml configuration file on the new node and add the new node(s) to the seeds list if you wish.

    Note

    There is no need to restart the Scylla service after modifying the seeds list in scylla.yaml.

7. If you are using Scylla Monitoring, update the monitoring stack to monitor the new node. If you are using Scylla Manager, make sure Manager can connect to the new node.