Cluster Platform Migration Using Node Cycling¶

This procedure describes how to migrate a ScyllaDB cluster to new instance types using the add-and-replace approach, which is commonly used for:

  • Migrating from one CPU architecture to another (e.g., x86_64 to ARM/Graviton)

  • Upgrading to newer instance types with better performance

  • Changing instance families within the same cloud provider

The add-and-replace approach maintains data replication throughout the migration and ensures zero downtime for client applications.

Note

This procedure does not change the ScyllaDB software version. All nodes (both existing and new) must run the same ScyllaDB version. For software version upgrades, see Upgrade.

Overview¶

The add-and-replace migration follows these steps:

  1. Add new nodes (on target instance type) to the existing cluster

  2. Wait for data to stream to the new nodes

  3. Decommission old nodes (on source instance type)

This approach keeps the cluster operational throughout the migration while maintaining the configured replication factor.

Key characteristics¶

  • Zero downtime: Client applications continue to operate during migration

  • Data safety: Replication factor is maintained throughout the process

  • Flexible: Works with both vnode-based and tablets-enabled clusters

  • Multi-DC support: Can migrate nodes across multiple datacenters

Warning

Ensure your cluster has sufficient capacity during the migration. At the peak of the process, your cluster will temporarily have double the number of nodes.

Prerequisites¶

Check cluster health¶

Before starting the migration, verify that your cluster is healthy:

  1. Check that all nodes are in Up Normal (UN) status:

    nodetool status
    

    All nodes should show UN status. Do not proceed if any nodes are down.

  2. Ensure no streaming or repair operations are in progress:

    nodetool netstats
    nodetool compactionstats
    

Plan the migration¶

Before provisioning new instances, plan the following:

Instance type mapping: Identify the source and target instance types. If your cluster uses vnodes (not tablets), be aware that mismatched shard counts between the source and target instance types can slow down repairs. With tablets enabled, mismatched shard counts are fully supported.
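A quick way to compare shard counts is to check how many CPUs ScyllaDB binds to on each node. A minimal sketch, assuming a standard scylla_setup installation and the default Prometheus metrics port (9180); paths, ports, and metric names may differ on your platform and version:

    # CPUs assigned to ScyllaDB by scylla_setup; the shard count follows from this
    cat /etc/scylla.d/cpuset.conf
    # Alternatively, count the shards reporting the reactor utilization metric
    curl -s http://localhost:9180/metrics | grep -c '^scylla_reactor_utilization{'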

Rack assignment planning: Each new node must be assigned to the same rack as the node it will replace. This maintains rack-aware topology for:

  • Rack-aware replication (NetworkTopologyStrategy)

  • Proper data distribution across failure domains

  • Minimizing data movement during decommission

Example mapping for a 3-node cluster:

Source nodes (to be decommissioned):     Target nodes (to be added):
192.168.1.10 - RACK0                 →   192.168.2.10 - RACK0
192.168.1.11 - RACK1                 →   192.168.2.11 - RACK1
192.168.1.12 - RACK2                 →   192.168.2.12 - RACK2

Create a backup¶

Back up your data before starting the migration, using one of the following methods:

  • ScyllaDB Manager (recommended): Use ScyllaDB Manager to perform a cluster-wide backup (a sketch of the command follows this list). See the ScyllaDB Manager documentation for details.

  • Snapshots: On each node in the cluster, create a snapshot:

    nodetool snapshot -t pre_migration_backup
    nodetool listsnapshots
    

    Note

    Snapshots are local to each node and do not protect against node or disk failure. For full disaster recovery, use ScyllaDB Manager backup.
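As an illustration of the ScyllaDB Manager option, a backup can be triggered with sctool. A minimal sketch, assuming the cluster is registered with Manager as my-cluster and an S3 bucket named my-backup-bucket is configured (both names are illustrative; flags may differ across Manager versions):

    # Trigger a cluster-wide backup through ScyllaDB Manager
    sctool backup -c my-cluster --location s3:my-backup-bucket

The command returns a task ID that you can use to track progress; see the ScyllaDB Manager documentation for the exact syntax in your version.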

Procedure¶

Adding new nodes¶

  1. Provision new instances with the target instance type. Ensure:

    • The same ScyllaDB version as existing nodes

    • Same network configuration and security groups

    • Appropriate storage configuration

  2. On each new node, configure /etc/scylla/scylla.yaml to join the existing cluster:

    • cluster_name: Must match the existing cluster name

    • seeds: IP address of an existing node in the cluster (used to discover cluster topology on join)

    • endpoint_snitch: Must match the existing cluster configuration

    • listen_address: IP address of the new node

    • rpc_address: IP address of the new node

    All other cluster-wide settings (tablets configuration, encryption settings, experimental features, etc.) must match the existing nodes.
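    For example, a minimal fragment of /etc/scylla/scylla.yaml for the first new node in the plan above (the cluster name and IPs are illustrative and must match your environment):

    cluster_name: 'my-cluster'
    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "192.168.1.10"
    endpoint_snitch: GossipingPropertyFileSnitch
    listen_address: 192.168.2.10
    rpc_address: 192.168.2.10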

    Caution

    Make sure that the ScyllaDB version on the new node is identical to the version on the other nodes in the cluster. Running nodes with different versions is not supported.

  3. If using GossipingPropertyFileSnitch, configure /etc/scylla/cassandra-rackdc.properties with the correct datacenter and rack assignment for this node:

    dc = <datacenter-name>
    rack = <rack-name>
    prefer_local = true
    

    Warning

    Each node must have the correct rack assignment. Using the same rack for all new nodes breaks rack-aware replication topology.

  4. Start ScyllaDB on the new node:

    sudo systemctl start scylla-server
    

    For Docker deployments:

    docker exec -it <container-name> supervisorctl start scylla
    
  5. Monitor the bootstrap process from an existing node:

    nodetool status
    

    The new node will appear with UJ (Up, Joining) status while streaming data from existing nodes. Wait until it transitions to UN (Up, Normal).

    Example output during bootstrap:

    Datacenter: dc1
    Status=Up/Down
    State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens  Owns   Host ID                               Rack
    UN  192.168.1.10   500 MB     256     33.3%  8d5ed9f4-7764-4dbd-bad8-43fddce94b7c  RACK0
    UN  192.168.1.11   500 MB     256     33.3%  125ed9f4-7777-1dbd-aac8-43fddce9123e  RACK1
    UN  192.168.1.12   500 MB     256     33.3%  675ed9f4-6564-6dbd-cad8-43fddce952fa  RACK2
    UJ  192.168.2.10   250 MB     256     ?      a1b2c3d4-5678-90ab-cdef-112233445566  RACK0
    

    Example output after bootstrap completes:

    Datacenter: dc1
    Status=Up/Down
    State=Normal/Leaving/Joining/Moving
    --  Address        Load       Tokens  Owns   Host ID                               Rack
    UN  192.168.1.10   400 MB     256     25.0%  8d5ed9f4-7764-4dbd-bad8-43fddce94b7c  RACK0
    UN  192.168.1.11   400 MB     256     25.0%  125ed9f4-7777-1dbd-aac8-43fddce9123e  RACK1
    UN  192.168.1.12   400 MB     256     25.0%  675ed9f4-6564-6dbd-cad8-43fddce952fa  RACK2
    UN  192.168.2.10   400 MB     256     25.0%  a1b2c3d4-5678-90ab-cdef-112233445566  RACK0
    
  6. For tablets-enabled clusters, wait for tablet load balancing to complete. After the node reaches UN status, verify no streaming is in progress:

    nodetool netstats
    

    Wait until output shows “Not sending any streams” and no active receiving streams.

  7. Repeat steps 1-6 for each new node to be added.

Note

You can add multiple nodes in parallel if they are in different datacenters. Within a single datacenter, add nodes one at a time for best results.
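If you are scripting the migration, you can block until a newly added node finishes bootstrapping before adding the next one. A minimal sketch, assuming nodetool runs against the cluster and using an illustrative IP from the plan above:

    # Wait until the new node transitions from UJ (Up, Joining) to UN (Up, Normal)
    NEW_IP=192.168.2.10
    until nodetool status | grep "$NEW_IP" | grep -q '^UN'; do
        echo "waiting for $NEW_IP to finish bootstrapping..."
        sleep 30
    done
    echo "$NEW_IP is UN"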

Updating seed node configuration¶

If any of your original nodes are configured as seed nodes, you must update the seed configuration before decommissioning them.

  1. Check the current seed configuration on any node:

    grep -A 4 "seed_provider" /etc/scylla/scylla.yaml
    
  2. If the seeds include nodes you plan to decommission, update scylla.yaml on all new nodes to use the new node IPs as seeds:

    seed_provider:
      - class_name: org.apache.cassandra.locator.SimpleSeedProvider
        parameters:
          - seeds: "192.168.2.10,192.168.2.11,192.168.2.12"
    

    Note

    Updating seed configuration on the old nodes (that will be decommissioned) is optional. Seeds are only used during node startup to discover the cluster. If you don’t plan to restart the old nodes before decommissioning them, their seed configuration doesn’t matter. However, updating all nodes is recommended for safety in case an old node unexpectedly restarts during the migration.

  3. Restart ScyllaDB on each new node (one at a time) to apply the new seed configuration:

    sudo systemctl restart scylla-server
    

    Wait for the node to fully start before restarting the next node.

  4. After restarting the new nodes, verify the cluster is healthy:

    nodetool status
    nodetool describecluster
    

Warning

Complete this seed list update on all new nodes before decommissioning any old nodes. This ensures the new nodes can re-form the cluster after the old nodes are removed.
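If you prefer to script the rolling restart in steps 3-4, here is a minimal sketch, assuming passwordless SSH to each new node and the illustrative IPs used above:

    # Restart the new nodes one at a time, waiting for each to rejoin as UN
    for ip in 192.168.2.10 192.168.2.11 192.168.2.12; do
        ssh "$ip" sudo systemctl restart scylla-server
        sleep 30  # give the node time to go down before polling
        until nodetool status | grep "$ip" | grep -q '^UN'; do
            sleep 10
        done
    done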

Decommissioning old nodes¶

After all new nodes are added and healthy, decommission the old nodes one at a time.

  1. Verify all nodes are healthy before starting decommission:

    nodetool status
    

    All nodes should show UN status.

  2. On the node to be decommissioned, run:

    nodetool decommission
    

    This command blocks until the decommission is complete. The node will stream its data to the remaining nodes.

  3. Monitor the decommission progress from another node:

    nodetool status
    

    The decommissioning node will transition from UN → UL (Up, Leaving) → removed from the cluster.

    You can also monitor streaming progress:

    nodetool netstats
    
  4. After decommission completes, verify the node is no longer in the cluster:

    nodetool status
    

    The decommissioned node should no longer appear in the output.

  5. Run nodetool cleanup on the remaining nodes to remove data that no longer belongs to them after the topology change:

    nodetool cleanup
    

    Note

    nodetool cleanup can be resource-intensive. Run it on one node at a time during low-traffic periods.

  6. Wait for the cluster to stabilize before decommissioning the next node. Ensure no streaming operations are in progress.

  7. Repeat steps 1-6 for each old node to be decommissioned.
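The decommission loop can also be scripted. A minimal sketch, assuming passwordless SSH and the illustrative IPs from the migration plan; adjust the lists to your environment:

    # Decommission old nodes one at a time, then clean up the remaining nodes
    OLD_NODES="192.168.1.10 192.168.1.11 192.168.1.12"
    NEW_NODES="192.168.2.10 192.168.2.11 192.168.2.12"
    for old in $OLD_NODES; do
        # Blocks until the node has streamed its data and left the cluster
        ssh "$old" nodetool decommission
        # cleanup is resource-intensive; run it serially on each remaining node
        for node in $NEW_NODES; do
            ssh "$node" nodetool cleanup
        done
    done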

Post-migration verification¶

After all old nodes are decommissioned, verify the migration was successful.

Verify cluster topology¶

nodetool status

Confirm:

  • All nodes show UN (Up, Normal) status

  • Only the new instance type nodes are present

  • Nodes are balanced across racks

Verify schema agreement¶

nodetool describecluster

All nodes should report the same schema version.

Verify data connectivity¶

Connect to the cluster and run a test query:

cqlsh <node-ip> -e "SELECT count(*) FROM system_schema.keyspaces;"

Note

If ScyllaDB is configured with listen_interface, you must use the node’s interface IP address (not localhost) for cqlsh connections.

Verify ScyllaDB version¶

Confirm all nodes are running the same ScyllaDB version:

scylla --version
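To check every node in one pass, a minimal sketch assuming passwordless SSH and the illustrative new-node IPs:

    # Print the ScyllaDB version reported by each node
    for ip in 192.168.2.10 192.168.2.11 192.168.2.12; do
        echo -n "$ip: "; ssh "$ip" scylla --version
    done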

Verify data integrity (optional)¶

Run data validation on each keyspace to verify SSTable integrity:

nodetool scrub --mode=VALIDATE <keyspace_name>

Rollback¶

If issues occur during the migration, you can roll back by reversing the procedure.

During add phase¶

If a new node fails to bootstrap:

  1. Stop ScyllaDB on the new node:

    sudo systemctl stop scylla-server
    
  2. From an existing node, remove the failed node:

    nodetool removenode <host-id-of-failed-node>
    

During decommission phase¶

If a decommission operation gets stuck:

  1. If the node is still reachable, try stopping and restarting ScyllaDB

  2. If the node is unresponsive, from another node:

    nodetool removenode <host-id>
    

    See Remove a Node from a ScyllaDB Cluster for more details.

Full rollback¶

To roll back after the migration is complete (all nodes on new instance type), apply the same add-and-replace procedure in reverse:

  1. Add new nodes on the original instance type

  2. Wait for data streaming to complete

  3. Decommission the nodes on the new instance type

Troubleshooting¶

Node stuck in Joining (UJ) state¶

If a new node remains in UJ state for an extended period:

  • Check ScyllaDB logs for streaming errors: journalctl -u scylla-server

  • Verify network connectivity between nodes

  • Ensure sufficient disk space on all nodes

  • Check for any ongoing operations that may be blocking

Decommission taking too long¶

Decommission duration depends on data size. If it appears stuck:

  • Check streaming progress: nodetool netstats

  • Look for errors in ScyllaDB logs

  • Verify network bandwidth between nodes

Schema disagreement¶

If nodes report different schema versions:

  • Wait a few minutes for the schema to propagate

  • If disagreement persists, restart the nodes one by one

  • Run nodetool describecluster to verify agreement

Additional resources¶

  • Adding a New Node Into an Existing ScyllaDB Cluster

  • Remove a Node from a ScyllaDB Cluster

  • Replace a Running Node in a ScyllaDB Cluster

  • Upgrade
