By Peter Corless
We’ve gotten a lot of attention since we published our benchmark reports on the performance of Apache Cassandra 4.0 vs. Cassandra 3.11, and, as well, how Cassandra 4.0 compares to Scylla 4.4. We had so much interest that we organized a webinar to discuss all of our benchmarking findings. You can watch the entire webinar on-demand now:
You can read the blogs and watch the video to get the full details from our point of view, but what happened live on the webinar was rather unique. The questions kept coming! In fact, though we generally wrap a webinar in an hour, the Q&A session afterwards took an extra half-hour. ScyllaDB engineers Piotr Grabowski and Karol Baryla fielded all the inquiries with aplomb. So let’s now look at just a few of the questions raised by you, our audience.
Q: Was this the DataStax Enterprise (DSE) version of Cassandra?
Karol: No, our tests were conducted against Apache Cassandra open source versions 3.11 and 4.0.
Piotr: We actually started working on those benchmarks even before Cassandra 4 was released. Of course those numbers are from the officially released version of Cassandra.
[DataStax Enterprise is currently most closely comparable to Cassandra 3.11 — see here.]
Q: Did you try to use off-heap memtables in the tests of Cassandra?
Piotr: Yes. First of all, I don’t think it really improved the performance. The second point is that I had some stability issues with off-heap memtables. Maybe it would require more fine-tuning. We did try to fine-tune the Cassandra configuration as much as we could to get the best results. But for all the benchmarks we have shown, we did not use off-heap memtables.
Q: If Java 16 is not officially supported, is it not risky to use it with Cassandra 4.0 in production?
Karol: Correct. Java 16 is not officially supported by Cassandra. We used it in our benchmarks because we wanted to get the best performance possible for Cassandra. But yeah, if you wanted to use Cassandra 4.0 in production, then this is something you should take into consideration: that your performance may not be the same as the performance in our benchmarks. Because if you want to use Java 11, between that and Java 16 the ZGC garbage collector had a lot of performance improvements. Java 11 performance might not be as good.
Q: The performance of ScyllaDB looks so much better. What’s the biggest concern I need to pay attention to if I want to use it to replace current Cassandra deployments?
Piotr: With Scylla, we highly recommend using our shard-aware drivers. Of course, Scylla is compatible with all existing Cassandra drivers. However, we have modified a select portion of them — the Java driver, the Go driver, the C++ driver, [and the Python and Rust drivers] — we have modified them to take advantage of the shard-aware architecture of Scylla. All of the requests that are sent from our shard-aware drivers come into the correct shard that holds the data.,
We did use our own shard-aware drivers in the testing. However, when our shard-aware driver connects to Cassandra it falls back to the old [non-shard aware] implementation, which we didn’t modify. They are backwards compatible with Cassandra.
Q: With latencies so low (in milliseconds), it seems like ScyllaDB takes away the need for building in-memory caches. Is that the right way to look at this?
Piotr: It depends on your workload. For many workloads it might be possible that you don’t need to use an in-memory cache. And if you look at Scylla internals, we have our own row-based cache, which serves as an in-memory cache within the Scylla database.
The best way to tell is to measure it yourself. Check out the difference in our benchmarks between the disk-intensive workloads and the memory intensive workloads.
[You can learn more about when an external cache might not be necessary by checking out this whitepaper.]
Q: In the benchmark graphs, the X scale shows 10k/s to 180k/s but they are called “operations” by the presenter. Is it really operations and not kilobytes/second?
Karol: Correct. Those are operations per second, not kilobytes per second.
Piotr: The payload size was the default for cassandra-stress, which is 300 bytes.
[Thus, for example, if a result was 40k/s ops, that would be 40,000 ops x 300 bytes, or 12 Mbytes/sec throughput.]
Piotr: You can read more of the specific test setup in the blog post.
Q: When adding a new node, can you remind us how much data is distributed across the nodes? E.g. surely it will take much longer to add a new node if there’s 100TB on each node compared with 1TB on each node…
Karol: In the 3-node test, each node has 1TB of data — 3TB data total in the cluster. In the 4 vs. 40 node test, the dataset was 40 TB. So, for Scylla it was 10TB of data for each node, and for Cassandra it was 1TB per node.
Q: What actually happens when you add a new node or double the cluster size? Why does Scylla need to do a compaction when it adds new nodes?
Piotr: So let’s say you have a three node cluster and you add a fourth node. When a new node is added, the database redistributes the data. That’s when streaming happens. Streaming is essentially copying the data from one node to another node. In the case of adding a new node, the data is streamed from all the existing nodes to the new node.
Compaction may be running while adding a new node, but the main reason we mentioned it is because using the Leveled Compaction Strategy (LCS) was supposed to have a greater advantage for Cassandra 4.0, because it has Zero Copy Streaming, which is supposed to work better with LCS strategy. Yet this doesn’t kick in during the streaming, but when we first populated those nodes. We added 1TB to each node, and periodically the database would compact different SSTables, and the LCS tables are better for Zero Copy Streaming in Cassandra.
Q: Did you compare your replace node test with the test published by Cassandra (where they declared 5x times improvement) — why was the difference in results so large?
Piotr: The schema might be different. We also ran background load, which might have introduced some differences, and we wanted to test a real case scenario. So I am not sure that a 5x improvement is an average performance gain.
Q: What main reasons make Scylla’s performance so much better than Cassandra?
Piotr: The big differential between Cassandra 3 and 4 was the JVM algorithms that were used to garbage collect. Scylla is written in C++. We use our own framework, Seastar, which runs close to the metal. We tuned it to run really close to the storage devices and to the network devices, while Cassandra has to deal with the JVM, with the garbage collection mechanism. So the first part is the language difference. Java has to have a runtime. However, as you have seen, Java is getting better and better, especially with the new garbage collection mechanisms.
The second part is that we really have tuned the C++ to be as fast as possible using a shard-per-core architecture. For example, in our sharded architecture, each core is a separate process that doesn’t share that much data between all other cores. So if you have a CPU that has many cores, it might be possible that you have many NUMA nodes. And moving data between those NUMA nodes can be quite expensive. From the first days of Scylla we really optimized the database to not share data between shards. And that’s why we recommend shard-aware drivers.
[Though it must be observed that newer garbage collectors like ZGC are also now NUMA-aware.]
These are just a few of the questions that our engaged audience had. It’s worth listening to the whole webinar in full — especially that last half hour! And if you have any questions of your own, we welcome them either via contacting us privately, or joining our Slack community to ask our engineers and your community peers.