Scylla Repair

During regular operation, a Scylla cluster continues to function and remains ‘always-on’ even in the face of failures such as:

  • A down node

  • A network partition

  • Complete datacenter failure

  • Dropped mutations due to timeouts

  • Process crashes (before a flush)

  • A replica that cannot write due to a lack of resources

As long as the cluster can satisfy the required consistency level (usually quorum), availability and consistency will be met. However, in order to automatically mitigate data inconsistency (entropy), Scylla uses three processes:

Scylla Repair is a process that runs in the background and synchronizes the data between nodes so that eventually, all the replicas hold the same data. Data stored on nodes can become inconsistent with other replicas over time, which is why repairs are a necessary part of database maintenance. Using Scylla repair makes data on the node consistent with the other nodes in the cluster. The best use of Scylla repair is to have Scylla Manager schedule and run the repairs for you.

Note

Run the nodetool repair command regularly. If you delete data frequently, it should be more often than the value of gc_grace_seconds (by default: 10 days), for example, every week. Use the nodetool repair -pr on each node in the cluster, sequentially.

In most cases, the proportion of data that is out of sync is very small, while in a few other cases, for example, if a node was down for a day, the difference might be bigger.

Row-level Repair

New in version 3.1: Scylla Open Source

In previous versions of Scylla (before 3.1), repairs were done on the partition level. Scylla 3.1 introduced row-level repair where the repair only transferred the mismatched rows and not the entire partition.

Row Level Repair improves Scylla in two ways:

  • Minimizes data transfer With Row Level Repair, Scylla calculates the checksum for each row and uses set reconciliation algorithms to find the mismatches between nodes. As a result, only the mismatched rows are exchanged, which eliminates unnecessary data transmission over the network.

  • Minimize disk reads The new implementation manages to:

    • Read the data only once

    • Keep the data in a temporary buffer

    • Use the cached data to calculate the checksum and send it to the replicas.

Repair Base Operation

New in version 4.0: Scylla Open Source (disabled)

There are two mechanisms in place for Scylla to synchronize data between nodes:

  • Streaming, used for cluster topology changes, like adding and removing nodes

  • Row Level Repair, an offline process which compares and syncs data between nodes

With Repair Base Operation, Scylla uses Row Level Repair as the unified underlying mechanism for repair operation and all node operations, e.g., bootstrap, decommission, remove node, replace node, rebuild node.

This safer process makes the node operations resumable, syncing only the inconsistent data. Also, replaced nodes now accept writes, which with the above change, means there is no longer a need to repair after replacing a node.

This feature is disabled by default.

You can enable or disable this feature with a configuration parameter in the scylla.yaml:

enable_repair_based_node_ops: [true|false]

See also