Project Circe May Update

by Dor Laor

Project Circe is ScyllaDB’s year-long initiative to make Scylla, already the best NoSQL database, even better. We’re sharing our updates for the month of May 2021.

Operational Improvements

Better failure detection

SLA per workload

It is now possible to set timeouts per role, using the SERVICE LEVEL infrastructure. These timeouts override the global timeouts in scylla.yaml, and can themselves be overridden on a per-statement basis.
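
The precedence described above (per-statement over role, role over global) can be modeled in a few lines. This is only an illustrative Python sketch, not Scylla's implementation: the function `effective_timeout`, its parameter names, and the millisecond values are all hypothetical.

```python
from typing import Optional


def effective_timeout(global_ms: int,
                      role_ms: Optional[int] = None,
                      statement_ms: Optional[int] = None) -> int:
    """Pick the timeout that applies to one request (hypothetical model).

    Precedence, most specific first: the per-statement override,
    then the timeout of the role's service level, then the global
    value from scylla.yaml.
    """
    if statement_ms is not None:
        return statement_ms
    if role_ms is not None:
        return role_ms
    return global_ms


# Made-up values: 5000 ms global, 500 ms for the role, 50 ms on one statement.
print(effective_timeout(5000))           # only the global timeout is set
print(effective_timeout(5000, 500))      # the role's service level wins
print(effective_timeout(5000, 500, 50))  # the per-statement value wins
```

A role without a service-level timeout simply falls through to the global value, which matches the "override" wording above.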

Virtual table enhancements

Repair news

Off-strategy compaction is also important when Repair Based Node Operations (RBNO) is used, which will soon be the default. RBNO pushes repair everywhere: to streaming, node removal, and node decommission. It’s important to tame compaction accordingly and automatically, so that you as an end user won’t even be aware of it.

Repair is now delayed until hints for that table are replayed. This reduces the amount of work that repair has to do, since hint replay can fill in the gaps in the data set that a node missed while it was down.

Raft news

Better Performance

  • Change Data Capture (CDC) uses a new internal table for maintaining the stream identifiers. The new table works better with large clusters.
  • Authentication had a 15-second delay that worked around dependency problems. The delay has long been unnecessary and is now removed, speeding up node start.
  • Repair allocates working memory for holding table rows, but did not consider memory bloat and could over-allocate memory. It is now more careful.
  • Scylla uses a log-structured memory allocator (LSA) for memtable and cache. Recently, unintentional quadratic behavior in LSA was discovered, so as a workaround the memory reserve size is decreased. Since the quadratic cost is in terms of this reserve size, the bad behavior is eliminated. Note the reserves will automatically grow if the workload really needs them.
  • We fixed a bug in the row cache that could cause large stalls on schemas with no clustering key.
  • The setup scripts will now format the filesystem with 1024-byte blocks if possible. This reduces write amplification for lightweight transaction (LWT) workloads. Yes, the Scylla scripts take care of this for you too!
  • SSTables will now automatically choose a buffer size that is compatible with achieving good latency, based on disk measurements by iotune.
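
To see why a smaller filesystem block helps small writes such as those in LWT workloads, here is a rough arithmetic sketch in Python. It assumes that every write is padded up to a whole number of blocks, and it compares against a 4096-byte block; both the 4096-byte baseline and the 300-byte payload are assumptions chosen for illustration, not figures from Scylla.

```python
def bytes_on_disk(payload_bytes: int, block_size: int) -> int:
    """Bytes written when a payload is padded up to whole blocks."""
    blocks = -(-payload_bytes // block_size)  # ceiling division
    return blocks * block_size


def write_amplification(payload_bytes: int, block_size: int) -> float:
    """Ratio of bytes written to bytes of payload (payload must be > 0)."""
    return bytes_on_disk(payload_bytes, block_size) / payload_bytes


payload = 300  # hypothetical size of one small write, in bytes
for block_size in (4096, 1024):
    print(block_size,
          bytes_on_disk(payload, block_size),
          round(write_amplification(payload, block_size), 2))
```

Under these assumptions the 300-byte write occupies 4096 bytes with 4096-byte blocks but only 1024 bytes with 1024-byte blocks, a fourfold reduction in bytes written.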

Just Cool

  • Another, even competing, method is the CloudFormation-based container client setup that allows our cloud team to reach two million requests per second in a trivial way. Check out the blog post and the load test demo for Scylla Cloud on GitHub.
  • The perf_simple_query benchmark now reports how many instructions were executed by the CPU per query. This is just a unit-benchmark but cool to track!
  • The tarball installer now works correctly when SELinux is enabled.
  • There is now rudimentary support for code-coverage reports in unit tests. (Coverage may not be the coolest kid on the block, but it is not cool not to test properly!)

Scylla Operator for Kubernetes News

Scylla Enterprise News

  • Space Amplification Goal (SAG) for Incremental Compaction Strategy (ICS) allows a user to strike the balance they desire between write amplification (which is more CPU and IO intensive) and space amplification (which is more storage intensive). SAG can push the ICS volume consumption lower, towards the domain of leveled compaction, while you still enjoy size-tiered behavior. ICS is the default and the recommended compaction strategy.
  • New Deployment Options for Scylla Enterprise now include our Scylla Unified Installer and allow you to install anywhere, including in air-gapped environments.
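
The trade-off behind the Space Amplification Goal can be illustrated with a small Python sketch. It takes space amplification to be disk space used divided by the size of the live data, and treats the goal as an upper bound on that ratio. The function names, the sizes, and the goal of 1.5 are hypothetical; this is not how ICS itself decides when to compact.

```python
def space_amplification(disk_bytes: int, live_bytes: int) -> float:
    """Disk space used relative to the size of the live data set."""
    return disk_bytes / live_bytes


def exceeds_goal(disk_bytes: int, live_bytes: int, goal: float) -> bool:
    """True when the table uses more space than the goal allows.

    In this model, exceeding the goal means more compaction is needed,
    which costs extra writes (write amplification) to reclaim disk space.
    """
    return space_amplification(disk_bytes, live_bytes) > goal


GIB = 2 ** 30
live = 100 * GIB  # hypothetical live data
goal = 1.5        # hypothetical space amplification goal

for disk in (120 * GIB, 180 * GIB):
    print(space_amplification(disk, live), exceeds_goal(disk, live, goal))
```

A lower goal keeps disk usage closer to the live data size at the price of more compaction work; a higher goal does the opposite.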

Security News
