Capacity Planning with Style

By Avishai Ish Shalom

Scylla Cloud now offers a new Scylla Cloud Calculator to help you estimate your costs based on your database needs. While it looks like a simple tool, anyone steeped in the art of database capacity planning knows that there is often far more to it than meets the eye. In this blog post we’ll show you how to use our handy new tool and then illustrate it with an example from a fictional company.

Considerations for the Cloud

Yet in the case of stateful systems like databases, capacity planning is still vital. Although modern databases are elastic, allowing you to add and remove capacity, this elasticity is limited. For example, a database may need hours to adjust, making it unable to scale for real-time traffic bursts. Scaling a database is relatively slow and sometimes capped by the data model (e.g. partitioning) or data distribution methods (replication).

As well, many databases are made more affordable by signing longer term committed license agreements, such as annual contracts. In such cases, you want to make sure you are signing up for the actual capacity you need, and not overprovisioning (wasting your budget on excess capacity) or underprovisioning (risking hitting upper bounds on performance and storage capacity).

Using the Scylla Cloud Calculator

We have written a separate guide for how to use the Scylla Cloud Calculator, but let’s go through some of the basics here. With our calculator, you can specify the following attributes:

  • Read ops/sec
  • Write ops/sec
  • Average item size (in KB)
  • Data set size (in TB)
  • On demand or reserved (1 year)

See How Scylla Cloud Compares

  • Amazon DynamoDB
  • DataStax Astra
  • Amazon Keyspaces

Note that while many features across these databases may be the same or similar, there will be instances where specific features are not available, or there may be similar features that vary widely in implementation across the different offerings. How Scylla refers to “Reserved” and “On-Demand” resources also may differ from how other vendors use these same terms. You are encouraged to contact us to find out more about your specific needs.

ScyllaDB will strive to maintain current pricing on these competitors in our calculator. Yet if you find any discrepancy between our calculator and what you find on a competitor’s website, please let us know.

Assumptions and Limitations

If you have a need to do more advanced sizing calculation, to model configurations and settings more tailored to your use case, or to conduct a proof-of-concept and your own benchmarking, please reach out and contact our sales and solutions teams directly.

For now, let’s give an example of a typical use case that the calculator can more readily handle.

Example Use Case: BriefBot.ai

Each feed is limited to 1,000 daily items (we really don't expect users to read more in a single day), and each item is about 1 KB. Further, for the sake of data retention, the scrollback in the feed is limited. Historical data will be served from some other lower-priority storage that may have higher latencies than Scylla. A mechanism such as Time To Live (TTL) can evict older records. This also governs how quickly the system will need to grow its storage over time.
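To get a feel for what this retention policy means for storage, here is a rough back-of-the-envelope sketch in Python. The user count and retention window below are illustrative assumptions, not figures from our model:

# Back-of-the-envelope storage estimate for the briefs feed.
# USERS and RETENTION_DAYS are illustrative assumptions, not model outputs.
ITEM_SIZE_KB = 1                 # each brief is about 1 KB
ITEMS_PER_USER_PER_DAY = 1_000   # upper bound on daily feed items per user
RETENTION_DAYS = 30              # assumed TTL / scrollback window
USERS = 1_000_000                # assumed number of active users

per_user_mb = ITEMS_PER_USER_PER_DAY * ITEM_SIZE_KB * RETENTION_DAYS / 1024
total_tb = per_user_mb * USERS / 1024**2
print(f"Per user:  {per_user_mb:.1f} MB retained")
print(f"Data set:  {total_tb:.1f} TB (before replication)")

Even with these modest assumptions the retained data set reaches tens of terabytes, which is why bounding retention with a TTL matters.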

Data Model

CREATE KEYSPACE briefbot WITH REPLICATION = {'class': 'SimpleStrategy', 'replication_factor': 3};

CREATE TABLE briefs (
    user_id UUID,
    brief_id UUID,
    bot_id UUID,
    timestamp TIMESTAMP,
    brief TEXT,
    points INT,
    origin_time TIMESTAMP,
    PRIMARY KEY (user_id, timestamp)
);

The briefs table is naturally partitioned by user_id and has everything we need when loading the feed, so we only need a single query when a user looks at their feed:

SELECT * FROM briefs WHERE user_id = UUID AND timestamp < CURRENT_POSITION PER PARTITION LIMIT 100;

This is a common pattern in NoSQL: denormalizing data so that we can read it with a single query. Elsewhere we will have another database containing briefs accessible by bot id, brief id, etc., but this database is specifically optimized for loading our users' feeds quickly. For example, if we need bot logos we might add another field, "bot_avatar", to our table (for now we'll fetch bot avatars from a CDN, so this isn't important). Also note that our example data model uses SimpleStrategy to keep things simple; we advise choosing NetworkTopologyStrategy for an actual production deployment.

While this table is being read synchronously, writes are done by a processing pipeline asynchronously. Every time a bot generates a new brief, the pipeline writes new brief items to all relevant users. Yes, we are writing the same data multiple times — this is the price of denormalization. Users don’t observe this and writes are controlled by us, so we can delay them if needed. We do need another subscriptions table which tells us which user is subscribed to which bot:

CREATE TABLE subscriptions (
    user_id UUID,
    bot_id UUID,
    PRIMARY KEY (bot_id, user_id)
);

Notice that this table is partitioned by bot_id, so you might wonder how we can handle bots with millions of subscribed users. In a real system this would also cause a large cascade of updates whenever a popular bot publishes a new brief; the common solution is to move the popular bots to a separate auxiliary system that indexes briefs by bot_id and have users pull from it in real time. Instead, we will make bot_id a virtual identifier, with many UUIDs mapped to the same bot, so that every "virtual bot" has only a few thousand subscribers.
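To make the write path more concrete, here is a sketch of the fan-out using the Python driver for Cassandra/Scylla. The virtual-bot mapping, publish_brief and NUM_VIRTUAL_BOTS are hypothetical names for illustration, not part of an actual BriefBot codebase:

# Sketch of the asynchronous fan-out write path (illustrative, not production code).
import uuid
from datetime import datetime, timezone
from cassandra.cluster import Cluster

NUM_VIRTUAL_BOTS = 128  # assumed number of virtual ids per real bot

session = Cluster(["127.0.0.1"]).connect("briefbot")

select_subs = session.prepare("SELECT user_id FROM subscriptions WHERE bot_id = ?")
insert_brief = session.prepare(
    "INSERT INTO briefs (user_id, brief_id, bot_id, timestamp, brief, points, origin_time) "
    "VALUES (?, ?, ?, ?, ?, ?, ?)")

def virtual_bot_ids(bot_id):
    """Map one real bot to its virtual bot ids (the mapping scheme is an assumption)."""
    return [uuid.uuid5(bot_id, str(i)) for i in range(NUM_VIRTUAL_BOTS)]

def publish_brief(bot_id, text, points, origin_time):
    brief_id = uuid.uuid4()
    now = datetime.now(timezone.utc)
    for vbot in virtual_bot_ids(bot_id):
        # Each virtual bot partition holds only a few thousand subscribers.
        for row in session.execute(select_subs, [vbot]):
            session.execute(insert_brief,
                            [row.user_id, brief_id, vbot, now, text, points, origin_time])

In a real pipeline the inserts would be batched per user and throttled, since (as noted above) we control the write rate and can delay it if needed.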

Workload Estimation

Our initial guesses for the number of users are given as ranges based on market research. The model is based on data both from known sources (e.g. the number of briefs per bot which we control) and unknown sources outside of our control (e.g. the number of users and their activity). The assumptions we make need to take into account our uncertainty, and thus larger ranges are given to parameters we are less sure of. We can also use different distributions for different parameters, as some things have known distributions. The output of the model is also given as distributions, and for sizing our system we will take an upper percentile of the load — 90th percentile in our case. For systems that cannot be easily scaled after initial launch we would select a higher percentile to reduce the chances of error — but this will make our system substantially more expensive. In our example, the values for database load are:

As you can see, there is a large difference between the median and the higher percentiles. Such is the cost of uncertainty! But the good news is that we can iteratively improve this model as new information arrives to reduce this expensive margin of error.
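As an illustration of this style of modeling, here is a minimal Monte Carlo sketch in Python. The distributions and ranges below are placeholders, not BriefBot's actual market-research numbers:

# Minimal Monte Carlo sizing sketch (illustrative ranges, not the real model).
import numpy as np

rng = np.random.default_rng(42)
N = 100_000  # number of simulated scenarios

# Parameters we are less sure of get wide ranges or skewed distributions.
daily_active_users = rng.lognormal(mean=np.log(2e6), sigma=0.5, size=N)
feed_loads_per_user = rng.uniform(3, 20, size=N)    # feed loads per user per day
briefs_per_user = rng.uniform(100, 1000, size=N)    # new briefs written per user per day

reads_per_sec = daily_active_users * feed_loads_per_user / 86_400
writes_per_sec = daily_active_users * briefs_per_user / 86_400

for name, dist in [("reads/sec", reads_per_sec), ("writes/sec", writes_per_sec)]:
    p50, p90, p99 = np.percentile(dist, [50, 90, 99])
    print(f"{name:11s} p50={p50:,.0f}  p90={p90:,.0f}  p99={p99:,.0f}")

Sizing against the 90th percentile of such a simulation is exactly the "upper percentile of the load" approach described above.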

Growth over Time

Compaction

Deciding on Operating Margins

The operating margin we choose will account for:

  • 1 node failure
  • Routine repair cycle
  • Peak load increase of 25%
  • Shards which are 20% hotter than average shard load

These margins can either be conservative and stacked on top of one another, or optimistic and overlapping. We choose to be optimistic and assume there is only a small chance of more than one of the above happening simultaneously. We will need to make sure our capacity has a margin of at least one node or 25%, whichever is higher. This margin is applied to the throughput parameters, as storage already has its own margin for compaction, node failures and so on; this is part of the design of Scylla.
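A minimal sketch of this margin rule follows; the p90 throughput figure and node count in the example are purely illustrative:

# Optimistic, overlapping margins: take the largest of the individual margins
# rather than stacking them all (the conservative alternative would multiply them).
def provisioned_ops(p90_ops, nodes, peak_margin=0.25, hot_shard_margin=0.20):
    one_node = 1 / nodes   # share of capacity lost when a single node fails
    margin = max(one_node, peak_margin, hot_shard_margin)
    return p90_ops * (1 + margin)

# Example: 400k ops/sec at p90 on a 15-node cluster (placeholder numbers).
print(f"{provisioned_ops(400_000, nodes=15):,.0f} ops/sec of provisioned capacity")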

Estimating Required Resources Using the Scylla Cloud Calculator

Again, the calculator is just an estimate — the exact performance of a complex system like Scylla depends on the actual data and usage patterns, but it does provide results which are good enough for initial design and cost estimation.

Selecting the Right Cluster

Scylla is a redundant cluster which is resilient to node failures, but that does not mean node failures have zero impact. For instance, in a recent disaster involving Kiwi.com, a fire took out an entire datacenter, resulting in the loss of 10 of the 30 nodes in their distributed cluster. While Scylla kept running, load on the remaining servers increased significantly. As we have previously discussed, we need to account for node failures in our capacity planning.

We should note that using larger nodes may be more efficient from a resource allocation point of view, but it results in a small number of large nodes which means that every failing node takes with it a significant part of cluster capacity. In a large cluster of small nodes each node failure will be less noticeable, but there is a larger chance of node failures — possibly more than one! This has been discussed further in The Math of Reliability.

We have a choice between a small cluster of large machines which emphasizes performance and cost effectiveness in a non-failing state and a large cluster of small machines which is more resilient to failures but also more likely to have some nodes failing. Like all engineering tradeoffs, this depends on the design goals of the system, but as a rule of thumb we recommend a balanced configuration with at least 6–9 nodes in the cluster, even if the cluster can run on 3 very large nodes.
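As a rough illustration of this tradeoff, the following sketch compares the chance of at least one node failing with the capacity lost per failure. The per-node failure probability is an assumption for illustration, not a measured figure:

# Few large nodes vs. many small nodes: failure likelihood vs. blast radius.
P_NODE_FAIL = 0.05  # assumed probability that any given node fails in a period

for nodes in (3, 6, 9, 15, 30):
    p_any_failure = 1 - (1 - P_NODE_FAIL) ** nodes   # chance at least one node fails
    capacity_lost = 1 / nodes                        # capacity lost per failed node
    print(f"{nodes:3d} nodes: P(>=1 failure) = {p_any_failure:5.1%}, "
          f"capacity lost per failure = {capacity_lost:5.1%}")

Larger clusters fail more often but lose less capacity per failure, which is why we recommend the balanced 6–9 node starting point.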

In the case of BriefBot, we will use the calculator recommendation of 15 i3.12xlarge nodes which will give us ample capacity and redundancy for our workload.

Monitoring and Adjusting

In the case of Scylla, the metrics we would like to predict growth from are:

  • Storage used
  • CPU utilization (Scylla, not O/S)

These metrics and many more are available as part of the Scylla Monitoring Stack.
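As a sketch of how such metrics can feed a growth forecast, here is a simple linear projection in Python; the sample data points are made up, and in practice they would come from the monitoring stack rather than a hard-coded list:

# Forecast growth of a monitored metric (e.g. storage used) with a linear trend.
import numpy as np

days = np.array([0, 7, 14, 21, 28])               # observation days
storage_tb = np.array([4.1, 4.6, 5.0, 5.6, 6.1])  # storage used (TB), illustrative

slope, intercept = np.polyfit(days, storage_tb, 1)   # simple linear fit
days_ahead = 90
projected = intercept + slope * (days[-1] + days_ahead)
print(f"Growth rate: {slope * 7:.2f} TB/week")
print(f"Projected storage in {days_ahead} days: {projected:.1f} TB")

When the projection approaches the provisioned capacity minus the operating margin, it is time to plan the next scaling step.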

Summary

CHECK OUT THE SCYLLA CLOUD CALCULATOR
