Create a Scylla Cluster on EC2 (Single or Multi Data Center)
The easiest way to run a Scylla cluster on EC2 is to use the CentOS-based Scylla AMI. However, if you wish to use a different OS or your own AMI (Amazon Machine Image), or to set up a multi-DC Scylla cluster, you will need to configure the Scylla cluster on your own. This page guides you through this process.
A Scylla cluster on EC2 can be deployed as a single-DC or a multi-DC cluster. The table below describes how to configure the scylla.yaml parameters for each node in your cluster in each case.
Best practice is to use each EC2 region as a Scylla DC. In such a case, nodes communicate using internal (private) IPs inside the region and external (public) IPs between regions (data centers).
For further information, see the AWS documentation on instance addressing.
EC2 Configuration Table
| Parameter | Single DC | Multi DC |
|---|---|---|
| seeds | Internal IP address | External IP address |
| listen_address | Internal IP address | Internal IP address |
| rpc_address | Internal IP address | Internal IP address |
| broadcast_address | Internal IP address | External IP address |
| broadcast_rpc_address | Internal IP address | External IP address |
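
The table can be read as a lookup from deployment type to the kind of address each parameter carries. The Python sketch below encodes it that way. The names `ADDRESS_KIND`, `node_addresses`, and `seed_addresses`, as well as the sample IPs, are hypothetical and used only for illustration; nothing here is part of Scylla itself.

```python
# Kind of IP address each scylla.yaml parameter takes, per the table above.
ADDRESS_KIND = {
    "single": {
        "seeds": "internal",
        "listen_address": "internal",
        "rpc_address": "internal",
        "broadcast_address": "internal",
        "broadcast_rpc_address": "internal",
    },
    "multi": {
        "seeds": "external",
        "listen_address": "internal",
        "rpc_address": "internal",
        "broadcast_address": "external",
        "broadcast_rpc_address": "external",
    },
}


def node_addresses(deployment, internal_ip, external_ip):
    """Resolve the per-node address parameters (everything except seeds)."""
    ips = {"internal": internal_ip, "external": external_ip}
    kinds = ADDRESS_KIND[deployment]
    return {p: ips[k] for p, k in kinds.items() if p != "seeds"}


def seed_addresses(deployment, seed_nodes):
    """Pick the right address of each seed node.

    seed_nodes is a list of (internal_ip, external_ip) tuples.
    """
    index = 0 if ADDRESS_KIND[deployment]["seeds"] == "internal" else 1
    return [node[index] for node in seed_nodes]


if __name__ == "__main__":
    # Sample addresses are placeholders, not real hosts.
    print(node_addresses("multi", "10.0.0.5", "203.0.113.5"))
    print(seed_addresses("multi", [("10.0.0.5", "203.0.113.5")]))
```

In a multi-DC deployment, only `listen_address` and `rpc_address` stay on the internal IP; everything a remote region needs in order to reach the node uses the external IP.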
- EC2 instance with ssh access.
- Make sure that all the relevant ports are open in your EC2 Security Group.
- Select a unique name as cluster_name for the cluster (identical for all the nodes in the cluster).
- Decide which nodes will be the seed nodes (It is recommended to define more than one node as a seed node per data-center).
These steps need to be done for each of the nodes in the new cluster.
1. Install Scylla on a node; see Getting Started for further instructions. Follow the Scylla install procedure up to the scylla.yaml configuration phase.
2. In the scylla.yaml file, edit the parameters listed below. The file can be found under /etc/scylla/. Consult the table above for how to configure your cluster.
- cluster_name - Set the selected cluster_name
- seeds - Set the selected seed nodes
- listen_address - IP address that Scylla uses to connect to other Scylla nodes in the cluster
- auto_bootstrap - By default, this parameter is set to true, which allows new nodes to migrate data to themselves automatically. If using the Scylla AMI, add --bootstrap to the user settings when creating a node
- endpoint_snitch - Set the selected snitch
- rpc_address - Address for client connection (Thrift, CQL)
- broadcast_address - The IP address a node tells other nodes in the cluster to contact it by
- broadcast_rpc_address - Default: unset. The RPC address to broadcast to drivers and other Scylla nodes. This cannot be set to 0.0.0.0. If left blank, it is set to the value of rpc_address. If rpc_address is set to 0.0.0.0, broadcast_rpc_address must be configured.
3. After you have installed and configured Scylla and edited the scylla.yaml file on all the nodes, start the seed nodes one at a time, and then start the rest of the nodes in your cluster using:
sudo systemctl start scylla-server
4. Verify that the node was added to the cluster using.
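
Two parts of the procedure above lend themselves to a quick check before starting any node: the rules stated for broadcast_rpc_address, and the start order from step 3 (seed nodes first, then the rest). The following Python sketch illustrates both. The function names are hypothetical, and the code is only a reading of the rules as described on this page, not Scylla's own validation logic.

```python
def effective_broadcast_rpc_address(rpc_address, broadcast_rpc_address=None):
    """Apply the rules described for broadcast_rpc_address.

    - It cannot be 0.0.0.0.
    - If left blank, it takes the value of rpc_address.
    - If rpc_address is 0.0.0.0, it must be set explicitly.
    """
    if broadcast_rpc_address == "0.0.0.0":
        raise ValueError("broadcast_rpc_address cannot be 0.0.0.0")
    if not broadcast_rpc_address:
        if rpc_address == "0.0.0.0":
            raise ValueError(
                "broadcast_rpc_address must be set when rpc_address is 0.0.0.0"
            )
        return rpc_address
    return broadcast_rpc_address


def start_order(nodes, seeds):
    """Return nodes with seed nodes first, preserving their given order."""
    seed_set = set(seeds)
    first = [n for n in nodes if n in seed_set]
    rest = [n for n in nodes if n not in seed_set]
    return first + rest


if __name__ == "__main__":
    # Placeholder addresses for illustration only.
    print(effective_broadcast_rpc_address("10.0.0.5"))
    print(start_order(["10.0.0.7", "10.0.0.5", "10.0.0.6"], ["10.0.0.5"]))
```

Each node in the returned order would then be started in turn with the `sudo systemctl start scylla-server` command from step 3.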
EC2 Snitch's Default DC and Rack Names, and How to Override DC Names
Ec2Snitch and Ec2MultiRegionSnitch give each DC and rack a default name: the region name is used as the data center name, and availability zones are defined as racks within a data center. The rack names cannot be changed.
A node in the
us-east is the data center name and 1 is the rack location.
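
The example above suggests that the default names come from splitting a location string at its last hyphen. A minimal Python sketch of that reading follows. The function name and the input `us-east-1` are assumptions made for illustration, and the snitch's actual parsing may differ.

```python
def default_dc_and_rack(name):
    """Split a location string at its last hyphen into (dc, rack).

    Assumption: the part before the last hyphen is the data center
    name and the part after it is the rack, as in the example above.
    """
    dc, _, rack = name.rpartition("-")
    return dc, rack


if __name__ == "__main__":
    # 'us-east-1' is an assumed input consistent with the example above.
    print(default_dc_and_rack("us-east-1"))  # ('us-east', '1')
```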
To change the DC names, do the following:
Edit the cassandra-rackdc.properties file with the preferred data-center name.
The file can be found under
The dc_suffix defines a suffix added to the data-center name as described below.
dc_suffix=_1_scylla will be
dc_suffix=_1_scylla will be
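
The effect of dc_suffix can be sketched in Python as follows, assuming the suffix is appended directly to the default data-center name, as the description above states. The helper names are hypothetical, and the default name `us-east` is taken from the earlier example; this is an illustration, not the snitch's implementation.

```python
def read_dc_suffix(properties_text):
    """Return the dc_suffix value from properties-file content, or ''."""
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip comments and lines that are not key=value pairs.
        if line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if key.strip() == "dc_suffix":
            return value.strip()
    return ""


def effective_dc_name(default_dc, properties_text):
    """Append the configured suffix (if any) to the default DC name."""
    return default_dc + read_dc_suffix(properties_text)


if __name__ == "__main__":
    content = "# sample cassandra-rackdc.properties content\ndc_suffix=_1_scylla\n"
    print(effective_dc_name("us-east", content))
```

With no dc_suffix line in the file, the default data-center name is returned unchanged.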