This is the second story in the “Lessons Learned” series, where we discuss real-world vulnerabilities through the eyes of an application security engineer, with a focus on the underlying root causes and the measures we can take to prevent similar issues in our applications.
In today’s story, we will discuss CVE-2024-22412, which affected ClickHouse, a popular open-source column-oriented database management system typically used for real-time online analytical processing (OLAP). You can find the full write-up of the vulnerability here.
Impact of the vulnerability
Under specific conditions, this vulnerability could allow an authorization bypass, potentially exposing sensitive data stored in the database.
What went wrong?
ClickHouse had previously introduced a feature that allows role-based access control on the rows of a table, based on the value of a column. For example, you can create 2 roles, one that is only allowed to access rows where user_id = 1 and another that is only allowed to access rows where user_id = 2, with the statements below.
CREATE ROLE user_role_1;
GRANT SELECT ON user_data TO user_role_1;
CREATE ROW POLICY user_policy_1 ON user_data
FOR SELECT USING user_id = 1 TO user_role_1;
CREATE ROLE user_role_2;
GRANT SELECT ON user_data TO user_role_2;
CREATE ROW POLICY user_policy_2 ON user_data
FOR SELECT USING user_id = 2 TO user_role_2;
GRANT user_role_1, user_role_2 TO user;
INSERT INTO user_data (id, user_id) VALUES
(1, 1), (2, 2), (3, 1), (4, 2), (5, 1), (6, 1), (7, 1), (8, 1), (9, 2);
Now, when selecting from the table with each of these 2 roles, the results differ: user_role_1 only sees the rows where user_id = 1, and user_role_2 only sees the rows where user_id = 2.
This role-based access control feature worked fine on its own until a new feature, the query cache, was introduced. The goal of the query cache is to improve performance by storing the results of queries and returning them from the cache when the same query is run again.
As you may have already guessed, the issue was in how these 2 features interacted. The query cache did not include the user's role in the key that identifies a query, so the same query run under 2 different roles looked identical to the cache. As a result, if the results for user_role_1 were already cached, they were returned when the same query was run as user_role_2, exposing rows that role is not authorized to access.
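The interaction can be illustrated with a toy model. The sketch below is not ClickHouse code: the names ROWS, POLICIES, execute and run_query are hypothetical, and the only assumption carried over from the story is that the cache key leaves out the active role.

```python
# Toy model only: these names are hypothetical and do not mirror
# ClickHouse internals.
ROWS = [(1, 1), (2, 2), (3, 1), (4, 2), (5, 1), (6, 1), (7, 1), (8, 1), (9, 2)]

# Row policies from the example: each role may only see one user_id.
POLICIES = {"user_role_1": 1, "user_role_2": 2}

cache = {}  # maps cache key -> cached result


def execute(role):
    """Apply the role's row policy, like an uncached SELECT on user_data."""
    allowed_user_id = POLICIES[role]
    return [row for row in ROWS if row[1] == allowed_user_id]


def run_query(query, role):
    """Flawed lookup: the cache key is the query text alone."""
    key = query  # the active role is missing from the key
    if key not in cache:
        cache[key] = execute(role)
    return cache[key]


query = "SELECT * FROM user_data"
first = run_query(query, "user_role_1")   # cache miss, result is stored
second = run_query(query, "user_role_2")  # cache hit, wrong rows returned

print(second == first)                   # True: role 2 got role 1's rows
print(second == execute("user_role_2"))  # False: the row policy was bypassed
```

Each role still gets the correct rows on a cache miss; the leak only appears when a different role has already populated the cache with the same query text.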
The Fix
ClickHouse fixed this with a patch that incorporates the current user and roles into the cache key, so the same query run under 2 different roles now has different cache keys.
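As a rough illustration of the idea, and not the actual patch (which lives in ClickHouse's C++ code base), a cache key that includes the identity of the caller could be built as follows. The function make_cache_key and its parameters are hypothetical.

```python
import hashlib
import json


def make_cache_key(query, user, roles):
    """Hypothetical helper: derive a key from the query text and the caller."""
    # Sorting makes the key independent of the order the roles are listed in.
    # JSON encoding keeps the three parts unambiguous when they are combined.
    payload = json.dumps([query, user, sorted(roles)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


query = "SELECT * FROM user_data"
key_role_1 = make_cache_key(query, "user", ["user_role_1"])
key_role_2 = make_cache_key(query, "user", ["user_role_2"])
key_both_a = make_cache_key(query, "user", ["user_role_1", "user_role_2"])
key_both_b = make_cache_key(query, "user", ["user_role_2", "user_role_1"])

print(key_role_1 != key_role_2)  # True: different roles, different entries
print(key_both_a == key_both_b)  # True: same role set, same entry
```

The trade-off is a lower cache hit rate, since identical queries from different users or roles no longer share an entry, in exchange for the cache never serving rows across authorization boundaries.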
Lessons Learned
Similar to the last story of this series, this is a business logic issue specific to ClickHouse, so no security scanning tool (SAST, DAST, IAST, etc.) could have detected it. For this kind of issue, your best lines of defense are:
- Threat modeling: Spending time during the design phase of any project to work out what could go wrong is a good opportunity to discuss business logic issues and what can be done to avoid them (the mitigations). For complex issues involving multiple features like this one, threat modeling is the activity most likely to catch them before they reach production. For more about threat modeling, you can check my series Threat Modeling Handbook.
- Security tests: Covering the security properties of your features (e.g. the threat model mitigations) with unit or integration tests could also have helped with detecting new security issues being introduced after a feature is launched. That being said, note that issues related to caching are sometimes missed by unit and integration tests as they need a specific sequence of events to be reproducible.
- Pentests and Bug Bounty: If you miss such issues in your threat model and security tests, then having regular pentests and/or a bug bounty program can act as your safety net. In this case, the issue was reported to ClickHouse’s Bug Bounty program, which put them in a much better place than if an actual attacker had discovered the issue and tried to exploit it.
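To make the point about security tests concrete, here is a minimal sketch of a test that reproduces the specific sequence of events a caching issue needs: warm the cache as one role, then run the same query as another. FakeDatabase is a hypothetical stand-in; in a real suite it would be a client connected to a test database with the query cache enabled.

```python
class FakeDatabase:
    """Hypothetical stand-in for the system under test."""

    ROWS = [(1, 1), (2, 2), (3, 1), (4, 2)]
    POLICIES = {"user_role_1": 1, "user_role_2": 2}

    def __init__(self):
        self._cache = {}

    def select_all(self, role):
        # The key includes the role, so this fake behaves like a fixed system.
        key = ("SELECT * FROM user_data", role)
        if key not in self._cache:
            allowed = self.POLICIES[role]
            self._cache[key] = [r for r in self.ROWS if r[1] == allowed]
        return self._cache[key]


def test_cached_rows_do_not_cross_roles():
    db = FakeDatabase()
    db.select_all("user_role_1")         # step 1: warm the cache as role 1
    rows = db.select_all("user_role_2")  # step 2: same query as role 2
    assert rows, "user_role_2 should still see its own rows"
    assert all(user_id == 2 for _, user_id in rows)


test_cached_rows_do_not_cross_roles()
```

The function uses plain assertions, so it can be called directly as above or collected by a test runner such as pytest. The order of the two steps is the important part: a test that only queries as one role, or that starts from an empty cache each time, would pass even against the vulnerable behavior.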
Conclusion
Software is complex, and while 2 features could be working well separately, they could introduce a security vulnerability when combined. Hence, it is always good to consciously consider the security implications of any new feature during the design phase (threat modeling) and to cover the security properties with tests to ensure they don’t get broken by future changes.