Developers use git to version control their source code. We all do; in fact, this blog is currently being versioned with git. However, we use git not only for version control, but also to deploy applications. Usually we push the new code to a remote server, where the server takes care of testing the code and then deploying the application. There are other ways of deploying an application, but this is a common one.
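As a rough sketch of that push-to-deploy setup (the paths /srv/site.git and /var/www/site are placeholders, not anything this blog actually uses), a bare repository on the server plus a post-receive hook can check the pushed code out into the web root while keeping the .git data outside the web-served directory:

# on the server: create a bare repository to push to
git init --bare /srv/site.git

# /srv/site.git/hooks/post-receive (make it executable):
#!/bin/sh
# check the pushed revision out into the web root;
# the repository data stays in /srv/site.git, away from the web root
GIT_WORK_TREE=/var/www/site git checkout -f master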
Some developers/sys-admins simply clone the repository onto their server and point their web-server at that directory. Most VCSs keep a hidden directory at the root of the project: git keeps a .git directory at the root of the repository, where all the information about that project is stored, such as logs, versions, tags, configs, previous revisions and so on.
So not only is the repository cloned straight into the web root, but sometimes the source is also edited on the server to fill configuration files with sensitive information (email/password and so on).
If the web-server is pointing at the git repository and has directory listing enabled, we could download the .git directory recursively using wget/curl, then simply check out master and voilà.
For example, say https://seds.nl/.git exists, containing all the objects of this blog; we could simply run wget --mirror -I .git https://seds.nl/.git to download the repository.
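Putting the pieces together, the whole attack could look something like this sketch (assuming directory listing is enabled, and reusing the seds.nl example above):

wget --mirror -I .git https://seds.nl/.git/
cd seds.nl
# rebuild the working tree from the downloaded .git data
git checkout -- .

After the checkout, the full source tree sits on the attacker's disk.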
How to find exposed git repositories
We could simply try every URL we know, appending /.git at the end of the domain.
Just kidding.
An easy way of finding websites which currently expose .git is using Google D0rks. If you are not familiar with Google D0rks, they are search operators Google offers to narrow down your queries; plenty of lists of them exist online. The one we need is the intext: operator.
intext:"Index of /.git"
This query makes use of the intext: operator, which asks Google to return only pages that contain the given text somewhere in the body of the page.
Holy sh1t. That’s ~89,900 results from Google. Can you imagine how much sensitive information there must be?
How to fix this
Well, first of all, find a better way of getting your source code onto a remote server than cloning the repository and pointing your web-server at that directory. If you don't feel like finding a better way, or just want to keep things simple, here is what you need to do.
Nginx
Add the following, telling Nginx to deny any request to a .git directory:
location ~ /\.git/ {
    deny all;
}
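To verify the rule works, request a file inside .git and make sure the server refuses it (using this blog's domain purely as a placeholder):

curl -I https://seds.nl/.git/config
# should now return 403 Forbidden instead of 200 OK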
Apache
Add the following, telling Apache to deny any request to a .git directory:
<DirectoryMatch "^/.*/\.git/">
    Order deny,allow
    Deny from all
</DirectoryMatch>
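Note that Order/Deny is the Apache 2.2 syntax (it still works on 2.4 via mod_access_compat); with the newer access control directives the equivalent would be:

<DirectoryMatch "^/.*/\.git/">
    Require all denied
</DirectoryMatch>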
The only question that remains is: is there any way to extract a .git directory from a web-server that has directory listing disabled? I haven't looked much into it, but I wonder if there is any way we could use git against itself.
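One possible avenue, purely as a sketch: even without directory listing, the files inside .git have well-known names, so they can often be requested one by one and the object graph walked from there (again, the seds.nl URL is just a placeholder):

curl -s https://seds.nl/.git/HEAD               # e.g. "ref: refs/heads/master"
curl -s https://seds.nl/.git/refs/heads/master  # prints a commit hash
# each object lives at .git/objects/<first two hex chars>/<remaining 38 chars>,
# so the commit, its tree and its blobs could be fetched and inflated one by one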
Top comments (5)
Thanks Ben, good advice!
If you are using git, then an archive is a neat way to generate deployable content: superuser.com/questions/81173/git-... You can do this away from the server and push the archive, or on the server (after the git pull!) to the web folder.
Subversion has the export command that operates similarly, generating a tree without the VCS control files.
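For reference, a minimal git archive invocation could look something like this (the archive name and web folder are just examples):

git archive --format=tar.gz -o site.tar.gz HEAD   # tracked files only, no .git
# or unpack straight into the web folder on the server:
git archive HEAD | tar -x -C /var/www/site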
git archive is neat indeed, though I prefer to have a CI on the other end that takes care of deploying the application for me. If you use AWS's Elastic Beanstalk, you can create a zip and deploy with eb deploy (although afaik it already creates a zip out of your git repo by using git archive). I didn't know about the Subversion one; it's a neat feature actually. Does git have something similar without a zip?
Doesn't look like it - one always has to unpack, which is easiest with the tar formatted stream: git-scm.com/docs/git-archive
I thought the deny directive on nginx was the default 😮. Thank you Ben.

Neither deny nor allow is the default, although I think one of them should be.