Apache Cassandra 4.0 vs. Scylla 4.4: Comparing Performance

TL;DR Scylla Open Source 4.4 vs. Cassandra 4.0 Results

  1. The first is an apples-to-apples comparison of 3-node clusters.
  2. The second is a larger-scale setup where we used node sizes optimal for each database. Scylla can utilize very large nodes, so we compared a setup of 4 i3.metal machines (288 vCPUs in total) vs. 40 (!) i3.4xlarge Cassandra machines (640 vCPUs in total, roughly 2.2x Scylla’s resources).
  • Cassandra 4.0 improves P99 latency over Cassandra 3.11 by 100x!
  • Cassandra 4.0 speeds up admin operations by up to 34% compared to Cassandra 3.11
  • Scylla has 2x-5x better throughput than Cassandra 4.0 on the same 3-node cluster
  • Scylla has 3x-8x better throughput than Cassandra 4.0 on the same 3-node cluster while keeping P99 latency below 10 ms
  • Scylla adds a node 3x faster than Cassandra 4.0
  • Scylla replaces a node 4x faster than Cassandra 4.0
  • Scylla doubles a 3-node cluster capacity 2.5x faster than Cassandra 4.0
  • A 40 TB cluster is 2.5x cheaper with Scylla while providing 42% more throughput under P99 latency of 10 ms
  • Scylla adds 25% capacity to a 40 TB optimized cluster 11x faster than Cassandra 4.0.
  • Scylla finishes compaction 32x faster than Cassandra 4.0
  • Cassandra 4.0 can achieve a better latency with 40 i3.4xlarge nodes than 4 i3.metal Scylla nodes when the throughput is low and the cluster is being underutilized. Explanation below.
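
Several of the results above are stated against a P99 latency limit of 10 ms. As a reminder of what that metric measures, here is a minimal sketch of a nearest-rank percentile check. The function names and the sample latencies are hypothetical, and this is not how the benchmark tooling itself computes percentiles.

```python
import math


def percentile(samples_ms, pct):
    """Nearest-rank percentile: the smallest sample that at least pct% of samples do not exceed."""
    if not samples_ms:
        raise ValueError("no samples")
    ordered = sorted(samples_ms)
    # 1-based rank of the sample that covers pct% of the data.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def meets_latency_limit(samples_ms, pct=99, limit_ms=10.0):
    """True when the chosen percentile stays strictly below the limit."""
    return percentile(samples_ms, pct) < limit_ms


# Hypothetical run: 98 fast requests and 2 slow ones.
latencies = [2.0] * 98 + [50.0] * 2
mean_ms = sum(latencies) / len(latencies)
print("mean:", mean_ms)
print("P50:", percentile(latencies, 50))
print("P99:", percentile(latencies, 99))
print("P99 under 10 ms:", meets_latency_limit(latencies))
```

In this made-up sample the mean and median look healthy while the P99 does not, which is why throughput figures here are qualified by a tail-latency limit rather than by an average.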

Limitations of Our Testing

Cluster of Three i3.4xlarge Nodes

3-Node Test Setup

Throughput and Latencies

  1. “Real-life” (Gaussian) distribution, with sensible cache-hit ratios of 30–60%
  2. Uniform distribution, with a close-to-zero cache hit ratio
  3. “In-memory” distribution, expected to yield almost 100% cache hits
  • 100% writes
  • 100% reads
  • 50% writes and 50% reads
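
To make the difference between the Gaussian and uniform key distributions concrete, the sketch below estimates what share of requests would land on a "hot" slice of the key range. Everything in it is an assumption for illustration: the cache is modeled as holding the central 20% of keys, and the Gaussian standard deviation is taken as one sixth of the key range. Real cache-hit ratios depend on dataset size, memory, and the cache implementation.

```python
from statistics import NormalDist


def gaussian_hot_fraction(key_count, hot_share, stdev_divisor=6.0):
    """Share of Gaussian-distributed accesses that fall in the central
    `hot_share` of the key range (tails outside the range are ignored)."""
    mean = key_count / 2
    stdev = key_count / stdev_divisor  # assumed spread, not taken from the article
    half_width = key_count * hot_share / 2
    dist = NormalDist(mean, stdev)
    return dist.cdf(mean + half_width) - dist.cdf(mean - half_width)


def uniform_hot_fraction(hot_share):
    """With uniform access, any slice of keys receives its proportional share."""
    return hot_share


keys = 1_000_000_000   # hypothetical key count
hot = 0.20             # hypothetical: cache holds the central 20% of keys
print("gaussian:", round(gaussian_hot_fraction(keys, hot), 3))
print("uniform: ", uniform_hot_fraction(hot))
```

Under these assumptions the Gaussian workload sends roughly 45% of requests to the hot 20% of keys, while the uniform workload sends only 20%, so the uniform case leans much harder on disk.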

“Real-life” (Gaussian) Distribution

Mixed Workload — 50% reads and 50% writes

Uniform Distribution (disk-intensive, low cache hit ratio)

Writes Workload — Only Writes

Reads Workload — Only Reads

Mixed Workload — 50% reads and 50% writes

Uniform Distribution (memory-intensive, high cache hit ratio)

Writes Workload — Only Writes

Reads Workload — Only Reads

Mixed Workload — 50% reads and 50% writes

Adding Nodes

One New Node

Doubling Cluster Size

Replacing a Node

Major Compaction

“4 vs. 40” Benchmark

4 vs. 40 Node Setup

Throughput and Latencies

Mixed Workload — 50% reads and 50% writes

  • 4-node Scylla cluster (4 x i3.metal, 288 vCPUs in total)
  • 40-node Cassandra cluster (40 x i3.4xlarge, 640 vCPUs in total).

Scaling the Cluster Up by 25%

  • By adding a single Scylla node to the cluster (from 4 nodes to 5)
  • By adding 10 Cassandra nodes to the cluster (from 40 nodes to 50 nodes)
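
The arithmetic behind the "4 vs. 40" comparison can be checked with a short sketch. The per-node vCPU counts follow from the totals quoted in this article (288 / 4 and 640 / 40). The hourly prices are an assumption (illustrative on-demand list prices) and should be replaced with current figures for your region. The `Fleet` class is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fleet:
    name: str
    nodes: int
    vcpus_per_node: int
    hourly_price_usd: float  # assumed per-node price, illustrative only

    @property
    def total_vcpus(self) -> int:
        return self.nodes * self.vcpus_per_node

    @property
    def hourly_cost_usd(self) -> float:
        return self.nodes * self.hourly_price_usd

    def growth_pct(self, added_nodes: int) -> float:
        """Capacity added, as a percentage of the current node count."""
        return 100 * added_nodes / self.nodes


scylla = Fleet("Scylla (i3.metal)", nodes=4, vcpus_per_node=72, hourly_price_usd=4.992)
cassandra = Fleet("Cassandra (i3.4xlarge)", nodes=40, vcpus_per_node=16, hourly_price_usd=1.248)

vcpu_ratio = cassandra.total_vcpus / scylla.total_vcpus
cost_ratio = cassandra.hourly_cost_usd / scylla.hourly_cost_usd

print(scylla.total_vcpus, cassandra.total_vcpus, round(vcpu_ratio, 2))
print(round(cost_ratio, 2))
print(scylla.growth_pct(1), cassandra.growth_pct(10))
```

With these inputs, one extra Scylla node and ten extra Cassandra nodes are both a 25% expansion, and the Cassandra fleet has about 2.2x the vCPUs of the Scylla fleet.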

Major Compaction

Summary

Supplementary Information

Cassandra 3.11 configuration

Cassandra 4.0 configuration

Cassandra-stress parameters

  • Scylla’s Shard-aware Java driver was used.
  • Background loads were executed in a loop (so duration=5m is not a problem).
  • REPLICATION_FACTOR is 3 (except for major compaction benchmark).
  • COMPACTION_STRATEGY is SizeTieredCompactionStrategy unless stated otherwise.
  • loadgenerator_count is the number of generator machines (3 for “3 vs 3” benchmarks, 15 for “4 vs 40”).
  • BACKGROUND_LOAD_OPS is 1000 in major compaction, 25000 in other benchmarks.
  • DURATION_MINUTES is 10 for in-memory benchmarks, 30 for other benchmarks.
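
The exact command lines are not reproduced in this text, so the following is only a sketch of how the parameters above might be assembled into a cassandra-stress invocation. The helper name, the thread count, the key range, the uniform population, and the QUORUM consistency level are all assumptions. Option spellings such as `throttle=` vary between cassandra-stress versions, so check them against the tool you actually run.

```python
import shlex


def stress_command(
    nodes,
    workload="mixed",
    replication_factor=3,
    compaction_strategy="SizeTieredCompactionStrategy",
    duration_minutes=30,
    ops_per_second=25000,
    threads=100,          # assumed, not from the article
    key_count=1_000_000,  # assumed, not from the article
):
    """Build a cassandra-stress command line as a single shell-safe string."""
    if workload == "mixed":
        operation = ["mixed", "ratio(write=1,read=1)"]  # 50% writes, 50% reads
    elif workload in ("write", "read"):
        operation = [workload]
    else:
        raise ValueError(f"unknown workload: {workload}")

    args = [
        "cassandra-stress", *operation,
        f"duration={duration_minutes}m",
        "cl=QUORUM",  # assumed consistency level
        "-schema",
        f"replication(factor={replication_factor}) "
        f"compaction(strategy={compaction_strategy})",
        "-mode", "native", "cql3",
        "-rate", f"threads={threads}", f"throttle={ops_per_second}/s",
        "-pop", f"dist=UNIFORM(1..{key_count})",
        "-node", ",".join(nodes),
    ]
    return shlex.join(args)


# Hypothetical node addresses.
command = stress_command(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
print(command)
```

For a background load during major compaction, the same helper could be called with `ops_per_second=1000`, matching the BACKGROUND_LOAD_OPS value listed above.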

ScyllaDB
