TL;DR
BigQuery has two compute pricing models: on-demand, which charges per byte processed, and capacity-based, which charges for slot capacity over time. Capacity-based can be cheaper once your monthly compute costs exceed the cost of the minimum commitment. It involves reserving slots and can be tuned with historical data and tools like the Slot Estimator. Commitments for 1 or 3 years lower the price per slot even further.
I recently did a deep dive into BigQuery’s compute pricing models. Here is what I learned.
BigQuery provides two types of compute pricing models:
- On-demand: This is the default option and works well for most situations. You are charged based on the amount of data processed.
- Capacity-based: This model charges for the capacity used to run queries, measured in slots (virtual CPUs) over time.
That last one can actually save you quite a bit of money when used properly. The official Google Cloud documentation spends 10 pages and over 10,000 words on the subject, including various tutorials. Let’s see if we can capture the main idea in far fewer.
Capacity-based pricing can grant you cost benefits if your monthly BigQuery compute costs are higher than the minimum commitment offered. For a 3-year commitment in the EU region, you currently pay $0.0396 per slot hour. There is a minimum of 100 slots for any commitment, so that makes 100 × $0.0396 = $3.96 per hour, for (365/12) × 24 = 730 hours per month on average: $2,890.80 per month.
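As a quick sanity check (prices change, so treat these numbers as a snapshot and check the pricing page for current rates):

```python
slot_hour_price = 0.0396          # USD per slot-hour, 3-year commitment, EU
min_slots = 100                   # minimum commitment size
hours_per_month = 365 / 12 * 24   # = 730 hours on average

print(f"${min_slots * slot_hour_price * hours_per_month:,.2f} per month")
# -> $2,890.80 per month
```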
In other words: capacity-based pricing can grant you cost benefits if your monthly BigQuery compute costs are higher than roughly $2,890.80.
Keep in mind that this is only about compute pricing, not storage pricing. Compute pricing includes the cost to process queries, including SQL queries, user-defined functions, scripts, and certain data manipulation language (DML) and data definition language (DDL) statements. See the Overview of BigQuery Pricing for more details.
So, how does all of this work?
Running Queries
When running a query, BigQuery generates an execution tree that breaks the query into stages. These stages contain steps that can run in parallel. Based on these steps, BigQuery calculates the optimal parallelization factor for the query: if that factor is 10, it will request 10 slots to run the query. A slot is nothing more than an abstracted unit of compute, memory, and I/O resources. See How BigQuery processes queries for more details.
If 10 slots are available, BigQuery will use 10 slots. If there are only 5, it will use 5, which will make the query run slower.
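You can inspect these stages yourself after a query finishes. Here is a minimal sketch using the google-cloud-bigquery Python client, assuming default credentials and using a public sample table:

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes default credentials and project

job = client.query(
    "SELECT word, SUM(word_count) AS n "
    "FROM `bigquery-public-data.samples.shakespeare` "
    "GROUP BY word ORDER BY n DESC LIMIT 10"
)
job.result()  # wait for the query to finish

# Each stage of the execution tree reports its parallelism and slot time.
for stage in job.query_plan:
    print(f"{stage.name}: parallel_inputs={stage.parallel_inputs}, "
          f"slot_ms={stage.slot_ms}")
print(f"Total slot-milliseconds: {job.slot_millis}")
```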
With the on-demand pricing model, Google Cloud grants you access to 2,000 concurrent slots shared among your projects (by default). Within this model, you don’t pay for the slots you use; you only pay for the data you scan. In the capacity-based pricing model, on the other hand, you purchase dedicated (or autoscaling) query processing capacity (slots). In this case, you don’t pay at all for scanned data.
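If you want to know up front how many bytes a query would bill under on-demand pricing, a dry run reports it without actually running the query. A minimal sketch, again with the Python client:

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes default credentials and project

# A dry run validates the query and reports the bytes it would process,
# without executing it or incurring any cost.
job_config = bigquery.QueryJobConfig(dry_run=True, use_query_cache=False)
job = client.query(
    "SELECT name FROM `bigquery-public-data.usa_names.usa_1910_2013` "
    "WHERE state = 'TX'",
    job_config=job_config,
)

print(f"This query would process {job.total_bytes_processed:,} bytes")
# On-demand billing is based on these bytes; under capacity-based
# pricing the same scan costs nothing extra.
```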
Making Reservations
A reservation is a pool of dedicated (or autoscaled) slots, bound to a specific region.
There are three different “editions”: Standard, Enterprise, and Enterprise Plus. Each edition has its own options and pricing. See Understand BigQuery Editions for more info.
You assign reservations to projects, folders, or organizations. All queries run within the assigned resource have access to the reserved slots.
A reservation can have both dedicated and autoscaled slots.
- Dedicated slots: these are always available, which means you always pay for them.
- Autoscaled slots: these scale up (to a defined maximum) and back down as needed. You only pay for what you use, but they are subject to capacity availability. Read Autoscaling best practices before starting with this, as there are some caveats.
To optimize the usage of your reserved (dedicated) slots, queries running in a reservation can use idle slots from other reservations. The requirement is that those reservations live in the same region and administration project and use the same edition. This feature can be turned off per reservation.
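To make this concrete, here is a minimal sketch of creating a reservation and assigning a project to it with the google-cloud-bigquery-reservation Python client. The project IDs, location, names, and sizes are hypothetical; adapt them to your setup:

```python
from google.cloud import bigquery_reservation_v1 as reservation

client = reservation.ReservationServiceClient()
# Reservations live in an administration project and a region.
parent = client.common_location_path("my-admin-project", "EU")

res = client.create_reservation(
    parent=parent,
    reservation_id="prod-reservation",
    reservation=reservation.Reservation(
        slot_capacity=100,  # dedicated (baseline) slots, always billed
        edition=reservation.Edition.ENTERPRISE,
        ignore_idle_slots=False,  # allow borrowing idle slots
        autoscale=reservation.Reservation.Autoscale(max_slots=100),
    ),
)

# Assign a project: its queries now run on this reservation's slots.
client.create_assignment(
    parent=res.name,
    assignment=reservation.Assignment(
        assignee="projects/my-workload-project",
        job_type=reservation.Assignment.JobType.QUERY,
    ),
)
```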
So, how many slots do you need?
There is no easy answer for this. You need to establish a baseline based on historical data and evaluate failure rate, performance, and cost.
Fortunately, BigQuery offers a handy Slot Estimator, which you can use to estimate whether adding or reducing reserved slots would be useful. The Google Cloud documentation includes a screenshot of it showing how average job duration would increase or decrease if you add or remove slots.
See the Estimate slot capacity requirements for a pretty good tutorial on how to estimate the number of slots you need based on your historical data.
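You can also compute a baseline yourself from the jobs metadata in INFORMATION_SCHEMA. A sketch of the usual approach (adjust the `region-eu` qualifier and the time window to your situation):

```python
from google.cloud import bigquery

client = bigquery.Client()  # assumes default credentials and project

# Average slot usage = total slot-milliseconds consumed, divided by the
# number of milliseconds in the window (here: the last 7 days).
sql = """
SELECT
  SUM(total_slot_ms) / (1000 * 60 * 60 * 24 * 7) AS avg_slot_usage
FROM `region-eu`.INFORMATION_SCHEMA.JOBS
WHERE
  creation_time BETWEEN TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
                    AND CURRENT_TIMESTAMP()
  AND job_type = 'QUERY'
"""
row = next(iter(client.query(sql).result()))
print(f"Average slot usage over the last 7 days: {row.avg_slot_usage:.1f}")
```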
Making Commitments
You can reserve slots at a pay-as-you-go price or at a reduced commitment price. A commitment is a 1- or 3-year purchase of slots at the reduced price.
Slots can be purchased in increments of 100 slots, up to your slot quota.
So, if you have experimented with pay-as-you-go reservations and have a good understanding of your continuous slot usage, a commitment might be a great next step to reduce costs further.
As a best practice, set your minimum monthly analytics usage as your BigQuery capacity-based commitment and cover peaks with autoscaled or pay-as-you-go slots.
See the Capacity compute pricing section for up-to-date pricing information.
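A commitment can also be purchased through the reservation API. A hypothetical sketch (IDs and sizes are placeholders; check your slot quota and the current prices first):

```python
from google.cloud import bigquery_reservation_v1 as reservation

client = reservation.ReservationServiceClient()
parent = client.common_location_path("my-admin-project", "EU")

# A 1-year, 100-slot commitment at the reduced committed rate;
# use CommitmentPlan.THREE_YEAR for the 3-year price.
commitment = client.create_capacity_commitment(
    parent=parent,
    capacity_commitment=reservation.CapacityCommitment(
        slot_count=100,  # minimum size; sold in increments of 100
        plan=reservation.CapacityCommitment.CommitmentPlan.ANNUAL,
        edition=reservation.Edition.ENTERPRISE,
    ),
)
print(commitment.name, commitment.state)
```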
Things to Note
Here are some things I found important to know if you consider capacity-based pricing:
- Use a single administration project to manage all commitments and reservations:
  - It simplifies billing management and slot allocation
  - It lets reservations share idle slots with each other
- Set a baseline based on historical data and monitor failure rate, performance, and cost
- Take the following into account when considering the autoscaler (a configuration sketch follows this list):
  - The autoscaler is most efficient for heavy, long-running workloads, such as workloads with multiple concurrent queries
  - Scaling happens in steps of 50 slots
  - Scaled-up slots stay active for a minimum of 1 minute, and you pay for at least that minute
  - Autoscaled slots are subject to capacity availability; only the baseline is always available
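As an example of that last group of points, enabling (or raising) autoscaling on an existing reservation could look roughly like this. The names are hypothetical, and max_slots is the number of autoscaled slots added on top of the baseline:

```python
from google.cloud import bigquery_reservation_v1 as reservation
from google.protobuf import field_mask_pb2

client = reservation.ReservationServiceClient()

# Allow up to 200 autoscaled slots on top of the baseline; keeping this
# a multiple of 50 matches the autoscaler's 50-slot scaling steps.
res = reservation.Reservation(
    name=client.reservation_path("my-admin-project", "EU", "prod-reservation"),
    autoscale=reservation.Reservation.Autoscale(max_slots=200),
)
client.update_reservation(
    reservation=res,
    update_mask=field_mask_pb2.FieldMask(paths=["autoscale.max_slots"]),
)
```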
Conclusion
Leveraging the capacity-based pricing model in BigQuery can lead to significant cost savings, because you pay for compute capacity rather than scanned bytes. BigQuery provides tools to estimate the capacity your workload needs, and committing to a 1- or 3-year plan can reduce expenses further.
For more detailed information, refer to the comprehensive Workload Management section in the Google Cloud BigQuery Guides.
Comments
Suppose you have a dataset that's used for both business-critical queries and less critical background jobs. The problem is that the lowest level at which you can apply your pricing model is the project level.
One option is to have an autoscaling reservation with baseline slots set to meet the requirements of your background jobs. In this case, the reservation can scale up if any business-critical query comes in. However, the issue with this approach is that scaling takes time, and the number of slots you receive depends on availability.
Another approach is to create two projects: one configured with capacity-based pricing and the other with on-demand pricing. The dataset can reside in either project but must be accessible by both. You can then connect through the capacity-based project for background queries and through the on-demand project for business-critical queries.
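A sketch of that second approach with the Python client. The project IDs are hypothetical; what matters is that the billing model follows the project the job runs in, not the project that owns the dataset:

```python
from google.cloud import bigquery

# One project is assigned to a reservation, the other stays on-demand.
capacity_client = bigquery.Client(project="my-capacity-project")
ondemand_client = bigquery.Client(project="my-ondemand-project")

# The dataset lives in a third project, readable by both.
SQL = "SELECT COUNT(*) AS n FROM `my-data-project.analytics.events`"

# Background job: runs on the reservation's (autoscaled) slots.
background = capacity_client.query(SQL).result()

# Business-critical query: billed on-demand on the shared slot pool.
critical = ondemand_client.query(SQL).result()
```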