Large partition support in Scylla 1.3

Topic: Large partitions

Learn: How to use large partitions with Scylla 1.3

Audience: Scylla users and DBAs

Scylla 1.3 improves support for large partitions compared to previous versions. This article describes that support and provides guidelines for using large partitions.

What are large partitions?

A partition is the data identified by a single partition key. A large partition is simply a partition with a lot of data. There is no clear boundary between large and non-large partitions, but you can consider 10-100MB the gray area between the two.
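As a rough illustration of these thresholds, the following Python sketch classifies a partition by its size. The function name classify_partition and the use of binary megabytes are assumptions made for this example. Scylla exposes no such function, and the 10MB and 100MB marks are guidelines rather than hard limits.

```python
MB = 1024 * 1024  # assumption: 1MB is treated as 1024 * 1024 bytes


def classify_partition(size_bytes: int) -> str:
    """Classify a partition size using the 10-100MB gray area as a guideline."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes < 10 * MB:
        return "small"
    if size_bytes <= 100 * MB:
        return "gray area"
    return "large"
```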

Scylla relies on partitions being evenly distributed among nodes and among cores (shards). That requires many partitions, so their average size must be small. Try to model your data so that, in the average case, partitions will be well below the 10MB mark; smaller is better.
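One common way to keep partitions small is to add a bucket component, such as a calendar day, to the partition key, so that the data of one logical entity is spread across many partitions. The sketch below assumes a hypothetical time-series workload. The names sensor_id and partition_key_for, and the choice of daily buckets in UTC, are illustrative and do not come from Scylla.

```python
from datetime import datetime, timezone
from typing import Tuple


def partition_key_for(sensor_id: str, ts: datetime) -> Tuple[str, str]:
    """Build a composite partition key of (sensor_id, UTC day bucket).

    Assumption: ts is timezone-aware. Each sensor gets one partition per day,
    which bounds how much data any single partition can accumulate.
    """
    day_bucket = ts.astimezone(timezone.utc).strftime("%Y-%m-%d")
    return (sensor_id, day_bucket)
```

The right bucket width depends on the write rate: a sensor that writes a few kilobytes per day could use monthly buckets, while a high-volume source might need hourly ones.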

Nevertheless, Scylla accommodates data models where a small fraction of the partitions in the system is large (above 100MB). This allows Scylla to handle exceptional cases without the application needing special logic to break apart large partitions.

Limitations in Scylla 1.3

The following limitations apply to large partition support in Scylla 1.3.

  • Partitions above 10MB are not stored in cache; every read accesses the disk.
  • Reads using IN for clustering key columns will be slower; avoid them when possible.
  • Reads that include static columns will be slower; avoid them when possible.

These limitations may be lifted in future Scylla releases.
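One possible way to avoid IN on a clustering key column is to issue one equality query per value, and to name the selected columns explicitly so that static columns can be left out. The Python sketch below only builds the CQL statement text and bind values; it does not talk to a cluster. The function name, table name, and column names are hypothetical, and this article does not state whether separate queries are faster than IN for a given workload, so measure before adopting this approach.

```python
from typing import Any, List, Sequence, Tuple


def split_in_query(
    table: str,
    columns: Sequence[str],
    pk_col: str,
    ck_col: str,
    pk_value: Any,
    ck_values: Sequence[Any],
) -> List[Tuple[str, Tuple[Any, Any]]]:
    """Return one (statement, bind values) pair per clustering key value.

    Identifiers are interpolated directly, so they must come from trusted
    code, never from user input. "?" is the CQL bind marker.
    """
    statement = (
        f"SELECT {', '.join(columns)} FROM {table} "
        f"WHERE {pk_col} = ? AND {ck_col} = ?"
    )
    return [(statement, (pk_value, ck)) for ck in ck_values]
```

Each returned pair can be prepared and executed with whichever CQL driver the application already uses.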

Limitations on clustered rows and individual cells within rows

Clustered rows, and individual cells within rows, should not exceed a few megabytes each.
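An application can guard against oversized cells before writing. The sketch below is a minimal client-side check in Python. The 2MB limit is an assumption standing in for "a few megabytes", and the function name and the representation of a row as a mapping from column name to serialized bytes are hypothetical.

```python
from typing import Dict, List

MB = 1024 * 1024
CELL_LIMIT_BYTES = 2 * MB  # assumption: stand-in for "a few megabytes"


def oversized_cells(row: Dict[str, bytes], limit: int = CELL_LIMIT_BYTES) -> List[str]:
    """Return the names of columns whose serialized value exceeds the limit."""
    return [name for name, value in row.items() if len(value) > limit]
```

A non-empty result suggests splitting the value across several rows or storing it outside the database.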
