Apache Cassandra 4.0 vs. Scylla 4.4: Comparing Performance

TL;DR Scylla Open Source 4.4 vs. Cassandra 4.0 Results

  1. The first is an apples-to-apples comparison of 3-node clusters.
  2. The second is a larger-scale setup where we used node sizes optimal for each database. Scylla can utilize very large nodes, so we compared a setup of 4 i3.metal machines (288 vCPUs in total) with 40 (!) i3.4xlarge Cassandra machines (640 vCPUs in total, about 2.2x Scylla’s vCPU count).
  • Cassandra 4.0 has 100x better P99 latency than Cassandra 3.11!
  • Cassandra 4.0 speeds up admin operations by up to 34% compared to Cassandra 3.11
  • Scylla has 2x-5x better throughput than Cassandra 4.0 on the same 3-node cluster
  • Scylla has 3x-8x better throughput than Cassandra 4.0 on the same 3-node cluster while keeping P99 latency under 10 ms
  • Scylla adds a node 3x faster than Cassandra 4.0
  • Scylla replaces a node 4x faster than Cassandra 4.0
  • Scylla doubles a 3-node cluster capacity 2.5x faster than Cassandra 4.0
  • A 40 TB cluster is 2.5x cheaper with Scylla while providing 42% more throughput under P99 latency of 10 ms
  • Scylla adds 25% capacity to a 40 TB optimized cluster 11x faster than Cassandra 4.0
  • Scylla finishes compaction 32x faster than Cassandra 4.0
  • Cassandra 4.0 can achieve better latency with 40 i3.4xlarge nodes than Scylla with 4 i3.metal nodes when throughput is low and the cluster is underutilized. Explanation below.
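
Several of the results above are stated in terms of P99 latency, for example throughput measured while P99 stays under 10 ms. As a reminder of what that metric means, here is a minimal sketch that computes a nearest-rank 99th percentile from a list of latency samples and checks it against a 10 ms threshold. The function names and the sample values are hypothetical and are not taken from the benchmark; the benchmark tooling may well use a different percentile estimator.

```python
def p99(latencies_ms):
    """Nearest-rank P99: the smallest sample that at least 99% of samples do not exceed."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Integer form of ceil(0.99 * n), giving a 1-based rank without float rounding.
    rank = (99 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_slo(latencies_ms, threshold_ms=10.0):
    """True when the P99 latency is strictly below the threshold."""
    return p99(latencies_ms) < threshold_ms


# Hypothetical samples (not benchmark data): 98 fast requests and 2 slow ones.
samples = [2.0] * 98 + [50.0] * 2
print(p99(samples), meets_slo(samples))
```

With only 1% of requests slow instead of 2%, the same check would pass, which is why a P99 target is sensitive to a small tail of slow requests.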

Limitations of Our Testing

Cluster of Three i3.4xlarge Nodes

3-Node Test Setup

Throughput and Latencies

  1. “Real-life” (Gaussian) distribution, with sensible cache-hit ratios of 30–60%
  2. Uniform distribution, with a close-to-zero cache hit ratio
  3. “In-memory” distribution, expected to yield almost 100% cache hits
  • 100% writes
  • 100% reads
  • 50% writes and 50% reads
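
The three key distributions and three read/write mixes listed above can be combined into a test matrix. The sketch below enumerates the full cross product purely as an illustration; the dictionary keys and function name are assumptions, and it does not claim that every combination was actually run or reported.

```python
from itertools import product

# Key distributions, as listed above.
DISTRIBUTIONS = ["gaussian", "uniform", "in-memory"]

# Workload name -> (write percentage, read percentage), as listed above.
WORKLOADS = {"writes": (100, 0), "reads": (0, 100), "mixed": (50, 50)}


def test_matrix():
    """Return one entry per (distribution, workload) pair."""
    return [
        {"distribution": dist, "workload": name, "write_pct": wp, "read_pct": rp}
        for dist, (name, (wp, rp)) in product(DISTRIBUTIONS, WORKLOADS.items())
    ]


for case in test_matrix():
    print(case)
```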

“Real-life” (Gaussian) Distribution

Mixed Workload — 50% reads and 50% writes

Uniform Distribution (disk-intensive, low cache hit ratio)

Writes Workload — Only Writes

Reads Workload — Only Reads

Mixed Workload — 50% reads and 50% writes

“In-memory” Distribution (memory-intensive, high cache hit ratio)

Writes Workload — Only Writes

Reads Workload — Only Reads

Mixed Workload — 50% reads and 50% writes

Adding Nodes

One New Node

Doubling Cluster Size

Replacing a Node

Major Compaction

“4 vs. 40” Benchmark

4 vs. 40 Node Setup

Throughput and Latencies

Mixed Workload — 50% reads and 50% writes

  • 4-node Scylla cluster (4 x i3.metal, 288 vCPUs in total)
  • 40-node Cassandra cluster (40 x i3.4xlarge, 640 vCPUs in total).
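
To make the resource gap between the two setups concrete, the following sketch recomputes the vCPU totals. The per-node vCPU counts (72 and 16) are derived by dividing the totals quoted above by the node counts; the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cluster:
    name: str
    nodes: int
    vcpus_per_node: int

    @property
    def total_vcpus(self):
        return self.nodes * self.vcpus_per_node


# Per-node vCPU counts derived from the quoted totals: 288 / 4 and 640 / 40.
scylla = Cluster("Scylla (i3.metal)", nodes=4, vcpus_per_node=72)
cassandra = Cluster("Cassandra (i3.4xlarge)", nodes=40, vcpus_per_node=16)

ratio = cassandra.total_vcpus / scylla.total_vcpus
print(f"{cassandra.total_vcpus} vs {scylla.total_vcpus} vCPUs: {ratio:.2f}x")
```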

Scaling the Cluster Up by 25%

  • By adding a single Scylla node to the cluster (from 4 nodes to 5)
  • By adding 10 Cassandra nodes to the cluster (from 40 nodes to 50 nodes)
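
Both expansions add the same relative capacity, which a quick calculation confirms. This is only a sketch of the arithmetic; the function name is hypothetical.

```python
def growth_fraction(nodes_before, nodes_after):
    """Relative increase in node count, e.g. 0.25 for a 25% expansion."""
    return (nodes_after - nodes_before) / nodes_before


print(growth_fraction(4, 5))    # Scylla: 4 nodes to 5
print(growth_fraction(40, 50))  # Cassandra: 40 nodes to 50
```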

Major Compaction

Summary

Supplementary Information

Cassandra 3.11 configuration

Cassandra 4.0 configuration

Cassandra-stress parameters

  • Scylla’s Shard-aware Java driver was used.
  • Background loads were executed in a loop (so duration=5m is not a problem).
  • REPLICATION_FACTOR is 3 (except for major compaction benchmark).
  • COMPACTION_STRATEGY is SizeTieredCompactionStrategy unless stated otherwise.
  • loadgenerator_count is the number of generator machines (3 for “3 vs 3” benchmarks, 15 for “4 vs 40”).
  • BACKGROUND_LOAD_OPS is 1000 in major compaction, 25000 in other benchmarks.
  • DURATION_MINUTES is 10 for in-memory benchmarks, 30 for other benchmarks.
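
The parameter rules in the list above can be read as a small lookup. The sketch below encodes them as a function; the function name, argument names, and benchmark labels are assumptions made for illustration. The replication factor for the major compaction benchmark is not given in the list, so it is left as None rather than guessed.

```python
# Number of load generator machines per setup, as listed above.
LOADGENERATORS = {"3 vs 3": 3, "4 vs 40": 15}


def stress_params(setup, benchmark, distribution, compaction_strategy=None):
    """Resolve the parameter values for one benchmark run.

    setup: "3 vs 3" or "4 vs 40"
    benchmark: free-form label; "major compaction" is treated specially
    distribution: "gaussian", "uniform", or "in-memory"
    """
    if setup not in LOADGENERATORS:
        raise ValueError(f"unknown setup: {setup!r}")
    is_major = benchmark == "major compaction"
    return {
        # RF is 3 except for major compaction, whose value is not stated here.
        "REPLICATION_FACTOR": None if is_major else 3,
        "COMPACTION_STRATEGY": compaction_strategy or "SizeTieredCompactionStrategy",
        "loadgenerator_count": LOADGENERATORS[setup],
        "BACKGROUND_LOAD_OPS": 1000 if is_major else 25000,
        "DURATION_MINUTES": 10 if distribution == "in-memory" else 30,
    }


print(stress_params("4 vs 40", "mixed", "gaussian"))
```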

The monstrously-fast NoSQL database.

ScyllaDB
