Scylla Virtual Workshops

ScyllaDB
Jul 21, 2020

Scylla Virtual Workshops are a way to deepen your understanding of our database and make you more productive with it. We began the series in May and hold workshop sessions twice per month.

The format consists of a demonstration of how to deploy a fully functional cluster of Scylla using Docker, an overview of Scylla’s design principles and NoSQL-based practices from Scylla’s point of view, followed by a Q&A session. We also encourage attendees to fill out a survey based on the common sizing issues they encounter while scaling their workloads, so we can better understand our attendees’ real-world setups and situations.

Who should attend?

Scylla Virtual Workshops are oriented toward those who want to deepen their understanding of NoSQL, and how Scylla might fit their real-world big data use cases. The ideal attendee should have basic Linux/Docker experience as well as a familiarity with NoSQL, but no specific knowledge of Scylla is required.

If you are a database architect, developer, or engineering manager looking to build a new application on a NoSQL database or migrate an existing one (MongoDB, Couch, Redis, Cassandra), this will be a great session for you. If you are using an RDBMS (PostgreSQL, MySQL, Aurora) and your application has grown to the point where the database is causing scalability, distribution, or performance problems, this session will also be of great value to you.

Now that we are entering the third month of the series, we thought we’d share some of the most insightful Q&As from one of our recent sessions, hosted by Eyal Gutkind, our VP of Solutions.

Q: Are Scylla Monitoring Stack and Scylla Manager part of Scylla Open Source?

A: We deliver these tools separately from Scylla Open Source. Scylla Monitoring Stack is an open source offering itself. Scylla Manager is free-to-use for Scylla Open Source users, but is limited to five nodes. (Scylla Manager can manage an unlimited number of nodes for Scylla Enterprise.)

Q: Does Scylla Monitoring Stack support the DynamoDB-compatible API?

A: Yes. In fact, we provide a dashboard interface for Alternator, our DynamoDB-compatible API, in Scylla Monitoring Stack. You can even enable monitoring for only the DynamoDB interface if you are not using the CQL interface (read more about it here).

Q: Do the Cassandra client drivers work seamlessly with ScyllaDB?

A: Yes. Any Cassandra-compatible driver will work with Scylla. However, you will see performance improvements if you use one of the Scylla shard-aware drivers, such as our Go, Java or Python drivers. Refer to our documentation to find the latest available drivers.

Q: Are Scylla’s SSTables compatible with open source Apache Cassandra?

A: Yes. Scylla’s SSTables are compatible with Apache Cassandra SSTables. So if you are using the “la,” “ka,” or “mc” format, it’s possible to move those files from Cassandra to Scylla. (Read more here.)

Q: Is it possible to upgrade from Apache Cassandra or DataStax? How?

A: Yes. It’s possible to migrate from both. The question isn’t whether you can migrate, but which of the various migration strategies is best for your use case. Cold migration? Dual writes? One way is to snapshot an existing Cassandra system and load the files into Scylla via sstableloader; you can read about that in our migration guide. However, DataStax uses a proprietary SSTable format on disk, so for that case we provide the open source, Apache Spark-based Scylla Migrator.

Q: How are you benchmarking your disks/systems? How can we evaluate it on our own?

A: Yes, you can do that. We use a tool called iotune. Behind the scenes, when you run Scylla Setup, it runs iotune for you. It benchmarks the number of IOPS you’ll get for reads and writes, as well as the actual bandwidth of the system. We recommend a software RAID0 configuration. Don’t put a hardware RAID controller in front of the disks; it’s going to be less efficient.

Q: We currently have a cluster of 30 nodes. How can I drop it down to 5 nodes in Scylla?

A: There are some guidelines that we give you. For example, we would like to see a memory-to-disk ratio of 1 to 50. So, for every gigabyte of RAM, we want to see roughly 50 gigabytes of disk space available. CPU? Not that much. Give us enough CPU and you’ll be great. Think about it. If you have 30 nodes and each one has 8 vCPUs, that is 240 vCPUs total, which means you are running on 120 physical cores. For a system like Scylla, that’s way above what we need. If you have 30,000 to 50,000 transactions per second, you can probably handle that with 30 to 40 physical cores, so at most 80 vCPUs. You’ll be shrinking your cluster by 3x at least.

Also consider that Scylla can be used on denser nodes. The AWS I3en series can have up to 96 vCPUs per node, which means you could reduce your deployment to just three nodes (with triple replication). That would shrink your cluster 10x.
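
The arithmetic in this answer can be sketched in a few lines of Python. This is only an illustration of the rules of thumb quoted above, not a sizing tool. The function names are made up for the example, and the factor of 2 vCPUs per physical core is an assumption inferred from the 240 vCPU to 120 core conversion in the answer.

```python
# Rough sizing arithmetic based on the rules of thumb quoted in the answer.
# All function names are hypothetical helpers for this example.

def vcpus_to_physical_cores(vcpus: int, threads_per_core: int = 2) -> int:
    """Convert vCPUs to physical cores (assumes 2 hardware threads per core)."""
    return vcpus // threads_per_core


def ram_gb_for_disk(disk_gb: float, disk_per_ram: int = 50) -> float:
    """RAM suggested by the roughly 1:50 memory-to-disk guideline."""
    return disk_gb / disk_per_ram


# Existing cluster from the question: 30 nodes with 8 vCPUs each.
current_nodes = 30
vcpus_per_node = 8
total_vcpus = current_nodes * vcpus_per_node
total_cores = vcpus_to_physical_cores(total_vcpus)

# The answer estimates 30 to 40 physical cores, so at most 80 vCPUs.
target_vcpus = 40 * 2
cpu_shrink_factor = total_vcpus / target_vcpus

# Denser alternative from the answer: three large nodes.
dense_nodes = 3
node_shrink_factor = current_nodes / dense_nodes

print(f"{total_vcpus} vCPUs ~ {total_cores} physical cores")
print(f"CPU footprint shrinks about {cpu_shrink_factor:.0f}x")
print(f"Node count shrinks about {node_shrink_factor:.0f}x")
print(f"1000 GB of disk suggests about {ram_gb_for_disk(1000):.0f} GB of RAM")
```

Real sizing also depends on data volume, replication factor, and latency targets, which this sketch ignores.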

If you have more questions about sizing, reach out to us.

Q: Can we use shared disk or is it necessary to use directly connected SSDs?

A: We highly recommend directly-connected SSDs. You can use shared disk in an environment such as VMware or OpenStack. However, you need to make sure that it delivers the right throughput, latency, and IOPS.

Q: How do you expose the server’s metrics?

A: We use Prometheus in Scylla Monitoring Stack to expose all the metrics that we have inside the system. By all means play around with it. You can even use Datadog if you want to push the data out.
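
If you want to look at the raw metrics yourself, a minimal Python sketch for reading the Prometheus text format might look like the following. It is an illustration under stated assumptions, not part of Scylla Monitoring Stack: the helper names are invented, the metric names in the sample are hypothetical, and the URL in the usage note assumes the node serves metrics on port 9180, which you should verify against your own configuration.

```python
from typing import Dict
from urllib.request import urlopen


def fetch_metrics_text(url: str, timeout: float = 5.0) -> str:
    """Download a Prometheus text-format page (hypothetical helper)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def parse_prometheus_text(text: str) -> Dict[str, float]:
    """Map each series (name plus labels) to its sample value.

    Comment lines (# HELP, # TYPE) and blank lines are skipped.
    An optional trailing timestamp is ignored.
    """
    samples: Dict[str, float] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        closing_brace = line.rfind("}")
        if closing_brace != -1:
            # Labels may contain spaces, so split after the closing brace.
            series = line[: closing_brace + 1]
            rest = line[closing_brace + 1 :].split()
        else:
            parts = line.split()
            series, rest = parts[0], parts[1:]
        if rest:
            samples[series] = float(rest[0])
    return samples


# Hypothetical sample; real Scylla metric names will differ.
SAMPLE = """\
# HELP example_reads_total Hypothetical read counter.
# TYPE example_reads_total counter
example_reads_total{shard="0"} 1200
example_reads_total{shard="1"} 1350
example_cache_bytes 4096
"""

samples = parse_prometheus_text(SAMPLE)
for series, value in sorted(samples.items()):
    print(series, value)
```

Against a live node you would call something like parse_prometheus_text(fetch_metrics_text("http://127.0.0.1:9180/metrics")), with the address adjusted to your deployment.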

Q: Can you use Scylla Manager in a Docker container?

A: The answer to that is ‘Yes.’

Q: How does Scylla behave if you don’t have TTLs and continually generate tombstones?

A: Tombstones are not a problem if you have properly defined your GC grace period and compactions are run properly. (You can learn more about tombstones and repairs in this Scylla University course.)
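
To make the GC grace period concrete, here is a deliberately simplified Python sketch of the timing rule: a tombstone only becomes a candidate for removal by compaction once the grace period has elapsed since the deletion. The function name is hypothetical, and the sketch leaves out the other conditions a real compaction checks (for example, whether the tombstone still shadows data in SSTables outside the compaction). The default of 864000 seconds (10 days) is the usual gc_grace_seconds table default.

```python
from datetime import datetime, timedelta, timezone

# Usual table default for gc_grace_seconds: 10 days.
DEFAULT_GC_GRACE_SECONDS = 864_000


def tombstone_past_grace(
    deleted_at: datetime,
    now: datetime,
    gc_grace_seconds: int = DEFAULT_GC_GRACE_SECONDS,
) -> bool:
    """Return True once the grace period has elapsed since the deletion.

    Simplified timing check only; hypothetical helper for illustration.
    """
    return now >= deleted_at + timedelta(seconds=gc_grace_seconds)


deleted_at = datetime(2020, 7, 1, tzinfo=timezone.utc)

# Nine days later the tombstone must still be kept.
still_kept = not tombstone_past_grace(deleted_at, deleted_at + timedelta(days=9))

# Ten days later it is past the grace period.
past_grace = tombstone_past_grace(deleted_at, deleted_at + timedelta(days=10))

print(still_kept, past_grace)
```

This is also why repairs need to complete within the grace period: a replica that missed the delete has to learn about it before the tombstone is dropped.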

Q: What about Kubernetes support?

A: We provide an open source Scylla Kubernetes Operator, which is currently in beta. You can learn more about it by reading our documentation, watching this presentation from last year’s Scylla Summit, or taking the course in Scylla University.

Join a Virtual Workshop

Scylla’s next Virtual Workshop will be this Friday, July 24, 2020. Beyond that, always check our Webinars page to find out about our latest virtual events, including live and on-demand webinars. Also consider checking out videos of the Tech Talks from our past live events.
