Now, before you think I am casually going to write up here how I hacked AWS: I think CodePipeline works as intended. Still, I would like to bring a quirk in the way it's set up to your attention. This quirk can make it run different code than the code that was put into the pipeline.
In short: overwrite the source artifact in S3 in the time between CodePipeline putting it there and follow-up actions downloading it for their business. This is not mentioned in their docs. (I did mention it as feedback to them some months ago, but nothing changed and I haven't heard from them.)
I could leave you with this, but it's of course more fun and insightful to give some more details.
Steps to reproduce
First off, you will need to have a CodePipeline pipeline deployed. If you don't know how to set one up, see the AWS documentation. I will use CodeCommit as repository source and CodeDeploy as execution environment in my example below, like in the provided AWS example. Any other combination of services that uses S3 to temporarily store the source artifact should work fine as well. (I verified that it also works with GitHub source / Lambda execution.)
Normal flow
For the demo, I created a minimal CodeCommit repository called inject that contains only a buildspec.yaml file.
This should make a CodeBuild server echo hello world. (Normally, you would do something more interesting here, such as compiling code or running a Terraform configuration.)
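A buildspec with that effect could look like the one rendered by the sketch below. This is an illustration based on the description above, not necessarily the exact file used in the demo; the function name render_buildspec is made up for this example.

```python
def render_buildspec(message: str = "hello world") -> str:
    """Return the text of a minimal CodeBuild buildspec that echoes `message`.

    Assumption: a single build phase with one command is all the demo needs.
    """
    lines = [
        "version: 0.2",
        "phases:",
        "  build:",
        "    commands:",
        f"      - echo {message}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Print the YAML so it can be saved as buildspec.yaml in the repository.
    print(render_buildspec(), end="")
```

Saving the printed text as buildspec.yaml and committing it gives the pipeline something harmless to run.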
Then I start a pipeline that uses this repository as its source and pass it to CodeBuild:
As expected, this causes the following to happen in CodeBuild:
The string hello world was printed correctly.
Now to the more interesting part.
With injected code
Some general advice before you start: speed is key, so sharpen your AWS GUI navigation skills ;) (You could of course put a manual approval action in the pipeline to give yourself all the time you need, but I wanted to keep this demo as simple as possible.)
Again start the pipeline:
Now quickly go to the S3 bucket where the source artifacts are stored and check the name of the latest source zip file:
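If clicking through the console is too slow, the lookup can also be scripted. The following is a minimal sketch, assuming a client object that behaves like boto3.client("s3"); the helper name latest_artifact_key and the bucket and prefix in the usage comment are hypothetical.

```python
from typing import Any, Optional


def latest_artifact_key(s3_client: Any, bucket: str, prefix: str) -> Optional[str]:
    """Return the key of the most recently modified object under `prefix`.

    `s3_client` is expected to behave like boto3.client("s3"). Only the first
    page of results (at most 1000 keys) is inspected, which is an assumption
    that holds for a small demo bucket but not necessarily for a busy one.
    """
    response = s3_client.list_objects_v2(Bucket=bucket, Prefix=prefix)
    objects = response.get("Contents", [])
    if not objects:
        return None
    newest = max(objects, key=lambda obj: obj["LastModified"])
    return newest["Key"]


# Hypothetical usage (requires boto3 and credentials):
# import boto3
# key = latest_artifact_key(boto3.client("s3"), "my-artifact-bucket", "my-pipeline/")
```

Passing the client in as a parameter keeps the helper easy to try out with a stand-in object before pointing it at a real account.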
Now quickly rename your code package accordingly:
Note the different string to be echoed.
Upload that file to the bucket that holds the source artifacts, overwriting the original source zip (to save time, you could set up this window beforehand; don't forget to use the correct settings for KMS etc.):
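The repackaging and upload can be scripted as well, which takes most of the time pressure away. Below is a sketch under a few assumptions: the client behaves like boto3.client("s3"), the artifact is a plain zip with buildspec.yaml at its root, and the bucket uses SSE-KMS. The function names, the echoed string, and the key id are placeholders for illustration.

```python
import io
import zipfile
from typing import Any, Dict, Optional


def build_source_zip(files: Dict[str, str]) -> bytes:
    """Pack {path: text} into an in-memory zip, shaped like a source artifact."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, text in files.items():
            archive.writestr(path, text)
    return buffer.getvalue()


def overwrite_artifact(
    s3_client: Any,
    bucket: str,
    key: str,
    payload: bytes,
    kms_key_id: Optional[str] = None,
) -> None:
    """Upload `payload` to the exact key of the existing source artifact."""
    params: Dict[str, Any] = {"Bucket": bucket, "Key": key, "Body": payload}
    if kms_key_id is not None:
        # Assumption: the artifact bucket is encrypted with this KMS key.
        params["ServerSideEncryption"] = "aws:kms"
        params["SSEKMSKeyId"] = kms_key_id
    s3_client.put_object(**params)


# Hypothetical replacement buildspec with a different string to echo.
INJECTED_BUILDSPEC = (
    "version: 0.2\n"
    "phases:\n"
    "  build:\n"
    "    commands:\n"
    "      - echo hello injected world\n"
)
```

Calling overwrite_artifact with the key found in the previous step and build_source_zip({"buildspec.yaml": INJECTED_BUILDSPEC}) as payload mirrors the manual upload described above.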
Now watch how CodeBuild executes the code just uploaded instead of what's in the repository:
Closing remarks
Being able to make a pipeline execute different code than what was put into it is of course already undesirable by itself. What I think makes it a more serious problem in this case is that it can be done via a completely different service (S3), which is usually used for all sorts of stuff. Editing the code to be executed in S3 can easily go unnoticed in an account hosting a score of different microservices or one with a lot of users. Also, in such a case, the chance that a single service or user privileged enough to edit S3 gets hacked is way higher than the chance that someone hacks their way up to the privilege to edit a pipeline directly.
Again, I would like to emphasize that this is not really a hack into CodePipeline. Still, I think CodePipeline is trying its best to hide all its shifting around with roles and artifacts from you. That is of course great for simplicity, but it might also lead to a security hole in your architecture when you are not careful or simply overlook it. It would be nice if CodePipeline supported pointing to a specific object version in the future, since those have a stable unique id. For now, all you can do is protect the source artifact bucket at least as well as your pipeline. If you would like some inspiration on protecting your S3 buckets to the most paranoid level, I wrote a blog about that as well: Read-only buckets in shared AWS accounts.
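Besides locking the bucket down, object versions can at least help to notice an overwrite after the fact. The sketch below assumes that versioning is enabled on the artifact bucket and that each source artifact key is normally written only once, so a key with more than one version is worth a look. The client is expected to behave like boto3.client("s3"); the helper name overwritten_keys is made up for this example.

```python
from collections import defaultdict
from typing import Any, Dict, List


def overwritten_keys(s3_client: Any, bucket: str, prefix: str = "") -> Dict[str, List[str]]:
    """Map every key that has more than one stored version to its version ids.

    Assumes bucket versioning is enabled. Only the first page of results is
    read, so a large bucket would need pagination on top of this.
    """
    response = s3_client.list_object_versions(Bucket=bucket, Prefix=prefix)
    versions_by_key: Dict[str, List[str]] = defaultdict(list)
    for version in response.get("Versions", []):
        versions_by_key[version["Key"]].append(version["VersionId"])
    return {key: ids for key, ids in versions_by_key.items() if len(ids) > 1}
```

This only detects the problem; it does not prevent the follow-up action from downloading the overwritten object.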
I wrote this blog and performed the work described within it at Simacan, my current employer. If you are just as passionate about this as I am, have a look at our working@-page or our developer portal and maybe we'll be working together soon :)
Top comments (2)
Yes, exactly. I know most of the article is about how to reproduce this issue, but in the last two paragraphs I do try to point out that permissions are a very important part of the issue I see here. S3 permissions tend to be given out far more, let's say, easily than those for something like CodePipeline. That weakens the stance of the pipeline by itself, and what makes it worse is that you don't even get to see that S3 is being used in the background if you quickly set up a pipeline in the GUI.
Also, in the end, most hacks can be formulated as permissions issues. I mean, even classic SQL injection: you should not have allowed your users to edit the query directly ;)
Well, not completely, because it's not the build artifacts we are modifying, but the pipeline input. It would be more like making an Azure DevOps build server run other code than that which was pushed to the main branch, for example. This would not be possible without hacking the build server.