Developers use git to version control their source code. We all do; in fact, this blog is currently being versioned with git. However, we use git not only for version control, but also to deploy applications. Usually we push the new code to a remote server, where the server takes care of testing the code and then deploying the application. There are other ways of deploying an application, but this is a common one.
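As a rough sketch of that push-to-deploy setup (the paths /srv/site.git and /var/www/site are placeholders, not anything this blog actually uses), a bare repository on the server plus a post-receive hook can check the pushed code out into the web root while keeping the .git data outside the web-served directory:

# on the server: create a bare repository to push to
git init --bare /srv/site.git

# /srv/site.git/hooks/post-receive (make it executable):
#!/bin/sh
# check the pushed revision out into the web root;
# the repository data stays in /srv/site.git, away from the web root
GIT_WORK_TREE=/var/www/site git checkout -f master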
Some developers/sys-admins simply clone the repository onto their server and point their web-server at that directory. Most VCSs keep a hidden directory at the root of the project: git keeps a .git directory at the root of the repository, where all the information about that project is stored, such as logs, versions, tags, configs, previous revisions and so on.
So not only is the repository cloned straight into the web root, but sometimes the source is also edited on the server to fill configuration files with sensitive information (email/password and so on).
If the web-server is pointing at the git repository and has directory listing enabled, we could download the .git directory recursively using wget/curl, then simply check out master and voilà.
For example, say https://seds.nl/.git exists, containing all the objects of this blog; we could simply run wget --mirror -I .git https://seds.nl/.git to download the repository.
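Putting the pieces together, the whole attack could look something like this sketch (assuming directory listing is enabled, and reusing the seds.nl example above):

wget --mirror -I .git https://seds.nl/.git/
cd seds.nl
# rebuild the working tree from the downloaded .git data
git checkout -- .

After the checkout, the full source tree sits on the attacker's disk.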
How to find exposed git repositories
We could simply try every URL we know, appending /.git at the end of the domain.
Just kidding.
An easy way of finding websites which currently expose .git is using Google D0rks. If you are not familiar with Google D0rks, they are search operators Google offers to narrow down your queries; plenty of lists of them exist online. The one we need is the intext: operator.
intext:"Index of /.git"
This query makes use of the intext: operator, which asks Google to return only pages that contain the given text somewhere in the body of the page.
Holy sh1t. That’s ~89,900 results from Google. Can you imagine how much sensitive information there must be?
How to fix this
Well, first of all, find a better way of getting your source code onto a remote server than cloning the repository and pointing your web-server at that directory. If you don't feel like finding a better way, or just want to keep things simple, here is what you need to do.
Nginx
Add the following, telling Nginx to deny any request to a .git directory:
location ~ /\.git/ {
    deny all;
}
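To verify the rule works, request a file inside .git and make sure the server refuses it (using this blog's domain purely as a placeholder):

curl -I https://seds.nl/.git/config
# should now return 403 Forbidden instead of 200 OK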
Apache
Add the following, telling Apache to deny any request to a .git directory:
<DirectoryMatch "^/.*/\.git/">
    Order deny,allow
    Deny from all
</DirectoryMatch>
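Note that Order/Deny is the Apache 2.2 syntax (it still works on 2.4 via mod_access_compat); with the newer access control directives the equivalent would be:

<DirectoryMatch "^/.*/\.git/">
    Require all denied
</DirectoryMatch>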
The only question that remains is: is there any way to extract a .git directory from a web-server that has directory listing disabled? I haven't looked much into it, but I wonder if there is any way we could use git against itself.
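One possible avenue, purely as a sketch: even without directory listing, the files inside .git have well-known names, so they can often be requested one by one and the object graph walked from there (again, the seds.nl URL is just a placeholder):

curl -s https://seds.nl/.git/HEAD               # e.g. "ref: refs/heads/master"
curl -s https://seds.nl/.git/refs/heads/master  # prints a commit hash
# each object lives at .git/objects/<first two hex chars>/<remaining 38 chars>,
# so the commit, its tree and its blobs could be fetched and inflated one by one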
Top comments (5)
Thanks Ben, good advice!
If you are using git, then an archive is a neat way to generate deployable content: superuser.com/questions/81173/git-... You can do this away from the server and push the archive, or on the server (after the git pull!) to the web folder.
Subversion has the export command that operates similarly, generating a tree without the VCS control files.
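For reference, a minimal git archive invocation could look something like this (the archive name and web folder are just examples):

git archive --format=tar.gz -o site.tar.gz HEAD   # tracked files only, no .git
# or unpack straight into the web folder on the server:
git archive HEAD | tar -x -C /var/www/site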
git archive is neat indeed, though I prefer to have a CI on the other end that takes care of deploying the application for me. If you use AWS's Elastic Beanstalk, you can create a zip and deploy with eb deploy (although afaik it already creates a zip out of your git repo by using git archive). I didn't know about the Subversion one; it's a neat feature actually. Does git have something similar without a zip?
Doesn't look like it - one always has to unpack, which is easiest with the tar formatted stream: git-scm.com/docs/git-archive
I thought the deny directive on nginx was the default 😮. Thank you Ben.

Neither deny nor allow is the default, although I think one of them should be.