Hello everyone, I am back with another tool. This time it is a tool that generates wordlists from GitHub repositories. I have named it Repolist.
It is a simple tool written in Python. The code is available on GitHub and the package is available on PyPI.
The story behind Repolist
I was pentesting a website and trying to brute-force its directories and files. The common wordlists from SecLists didn't help much, so I decided to create a custom wordlist.
I knew that the website used an open source e-commerce platform called PrestaShop for its backend, so I decided to build the wordlist from the files and directories of PrestaShop.
I didn't want to manually copy the files and directories. So I thought of creating a tool that would do it for me.
I'm sure there are other tools that do the same thing. But I wanted to create my own tool just for fun. Python is not my primary language for development. So I thought it would be a good opportunity to use Python for this project.
What is Repolist?
Repolist is a tool that generates wordlists based on GitHub repositories. It uses the GitHub API to fetch the files and directories of a repository, then prints them or saves them to a text file when an output file is given.
To install Repolist, run the following command:
pip3 install repolist
To generate a wordlist, run the following command:
repolist -u "https://github.com/username/repository"
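Before any API call can be made, the repository URL has to be split into an owner and a repository name. The following is a minimal sketch of that step, not Repolist's actual code; the function name `parse_repo_url` is hypothetical.

```python
from urllib.parse import urlparse


def parse_repo_url(url: str) -> tuple[str, str]:
    """Split a GitHub repository URL into (owner, repository).

    Hypothetical helper for illustration; Repolist may parse URLs differently.
    """
    parsed = urlparse(url)
    if parsed.netloc.lower() not in ("github.com", "www.github.com"):
        raise ValueError(f"not a GitHub URL: {url!r}")
    # Drop empty segments so a trailing slash does not matter.
    parts = [segment for segment in parsed.path.split("/") if segment]
    if len(parts) < 2:
        raise ValueError(f"expected https://github.com/<owner>/<repo>, got {url!r}")
    owner, repo = parts[0], parts[1]
    # Clone URLs often end in ".git"; strip it so the name matches the API.
    if repo.endswith(".git"):
        repo = repo[: -len(".git")]
    return owner, repo
```

Calling `parse_repo_url("https://github.com/WordPress/WordPress")` yields the pair used to build the API request.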
Options
Arguments:
  -h, --help            show this help message and exit
  -u URL, --url URL     Github repository URL (required)
  -o OUTPUT, --output OUTPUT
                        Output file (optional)
  -b BRANCH, --branch BRANCH
                        Use a specific branch (optional)
  -t TOKEN, --token TOKEN
                        Github token (optional)
  -p PREFIX, --prefix PREFIX
                        Prefix (optional)
  -s SUFFIX, --suffix SUFFIX
                        Suffix (optional)
  -f, --files           Get only files (optional)
  -d, --directories     Get only directories (optional)
  -v, --verbose         Verbose mode (optional)
  --proxy PROXY         Proxy (optional)
Combining Repolist with other tools
Using Repolist with tools like ffuf, httpx and gobuster can be very useful for penetration testing and bug bounty programs.
For example, you can use ffuf to brute-force the files and directories of a website using the wordlist generated by Repolist.
repolist -u "https://github.com/WordPress/WordPress" | ffuf -u "http://example.com/FUZZ" -w -
If you have other tools in mind, please let me know in the comments below.
How I made Repolist
I used Python with Poetry to create Repolist. Poetry is fairly new to me, and it was a great experience: setup and dependency management are easy. With a few commands, I was able to create the project and publish it to PyPI. I will definitely use it for my future projects.
Argparse is used to parse the command-line arguments. Requests is used to make the HTTP requests to the GitHub API.
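To make the Argparse part concrete, here is a sketch of a parser that accepts the flags listed in the Options section. It is reconstructed from the help text only, so the function name `build_parser` and the defaults are assumptions rather than Repolist's real source.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented flags (defaults are assumed)."""
    parser = argparse.ArgumentParser(prog="repolist")
    parser.add_argument("-u", "--url", required=True, help="Github repository URL")
    parser.add_argument("-o", "--output", help="Output file")
    parser.add_argument("-b", "--branch", help="Use a specific branch")
    parser.add_argument("-t", "--token", help="Github token")
    parser.add_argument("-p", "--prefix", default="", help="Prefix")
    parser.add_argument("-s", "--suffix", default="", help="Suffix")
    # Boolean switches: present means True, absent means False.
    parser.add_argument("-f", "--files", action="store_true", help="Get only files")
    parser.add_argument("-d", "--directories", action="store_true", help="Get only directories")
    parser.add_argument("-v", "--verbose", action="store_true", help="Verbose mode")
    parser.add_argument("--proxy", help="Proxy")
    return parser
```

Passing a list to `parse_args`, for example `build_parser().parse_args(["-u", "https://github.com/username/repository", "-f"])`, makes the parser easy to exercise without touching `sys.argv`.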
The code behind Repolist
The code is fairly simple. It calls the Git Trees endpoint of the GitHub API with recursive=1 to list every file and directory in a repository, then collects the path and type of each entry.
Here is a small snippet of how it works:
def _get_files_and_directories(self, username="", repo="", branch="main"):
    """
    Get files and directories from a repository (recursive)
    https://docs.github.com/en/rest/reference/git#trees
    """
    url = "https://api.github.com/repos/{}/{}/git/trees/{}?recursive=1".format(
        username, repo, branch)
    r = self._make_request(url)  # add headers if token is specified
    if r.status_code == 200:
        for file in r.json()["tree"]:
            self.repo_content.append({
                "path": file["path"],
                "type": file["type"]
            })
    else:
        self._log_error(type=r.status_code, msg=r.text)
        exit(1)
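The snippet above is a method on a class that is not shown here. As a rough standalone sketch of the same idea, the version below uses only the standard library (`urllib` swapped in for Requests so it runs without extra packages). The function names and the `User-Agent` value are made up for illustration. It also reports the `truncated` flag that the Git Trees endpoint sets when a tree is too large to return in one response.

```python
import json
import urllib.request


def build_tree_url(owner, repo, branch="main"):
    """Return the Git Trees URL that lists a branch recursively."""
    return f"https://api.github.com/repos/{owner}/{repo}/git/trees/{branch}?recursive=1"


def extract_entries(payload):
    """Reduce a Git Trees response body to path/type pairs."""
    return [{"path": item["path"], "type": item["type"]} for item in payload.get("tree", [])]


def fetch_entries(owner, repo, branch="main", timeout=10):
    """Fetch the tree and return (entries, truncated). Performs a network call."""
    request = urllib.request.Request(
        build_tree_url(owner, repo, branch),
        headers={"Accept": "application/vnd.github+json", "User-Agent": "repolist-sketch"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        payload = json.load(response)
    # "truncated" is True when GitHub could not return the whole tree at once.
    return extract_entries(payload), bool(payload.get("truncated"))
```

Keeping URL building and response parsing separate from the network call means both can be checked offline with a sample payload.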
Using Poetry to publish to PyPI
Poetry makes it very easy to build and publish a package to PyPI. If you are new to Poetry, here is how you can do it:
poetry new repolist
poetry build
poetry install
poetry publish
You can read more about it here.
Rate limit and proxies
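The GitHub API has a rate limit: unauthenticated requests are limited to 60 per hour, while requests made with a token get a much higher quota. So I have added options to specify a token (-t) and a proxy (--proxy). You can also use -b to fetch the files and directories of a specific branch.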
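As a rough illustration of how a token, a proxy and the rate-limit headers fit together, here is a small sketch. The helper names are hypothetical and are not taken from Repolist. `X-RateLimit-Remaining` and `X-RateLimit-Reset` are the response headers GitHub uses to report the remaining quota and the reset time (in epoch seconds).

```python
import time


def build_headers(token=None):
    """Return request headers, adding authentication only when a token is given."""
    headers = {"Accept": "application/vnd.github+json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return headers


def build_proxies(proxy=None):
    """Map one proxy URL onto the dict shape that Requests accepts for `proxies=`."""
    return {"http": proxy, "https": proxy} if proxy else None


def seconds_until_reset(response_headers, now=None):
    """Return how long to wait once the quota is exhausted, or 0 if requests remain.

    Assumes a plain dict with GitHub's header spelling; Requests' own header
    mapping is case-insensitive, a plain dict is not.
    """
    remaining = int(response_headers.get("X-RateLimit-Remaining", 1))
    if remaining > 0:
        return 0
    reset_at = int(response_headers.get("X-RateLimit-Reset", 0))
    current = time.time() if now is None else now
    return max(0, int(reset_at - current))
```

With Requests, the results would be passed along as `requests.get(url, headers=build_headers(token), proxies=build_proxies(proxy))`.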
Conclusion
If you read this far, thank you for reading. I hope you find RepoList useful. If you have any suggestions or feedback, please let me know in the comments below.
RepoList - Generate Wordlists from GitHub Repositories
Repolist is a command-line interface (CLI) tool designed to generate wordlists from GitHub repositories. It simplifies the process of extracting files and directories from GitHub repos, enabling the creation of custom wordlists for penetration testing and bug bounty programs.
You can read more about it in this blog: https://ademkouki.tech/posts/repolist
Features
- Wordlist Generation: Easily create wordlists from GitHub repositories. Choose between generating a wordlist of files, directories, or both.
- Customization: Add custom prefixes and suffixes to the generated wordlists, such as appending .php to each word.
- Support for Private Repositories: Access and generate wordlists from both private and public repositories by providing a GitHub token using the -t option.
- Branch Selection: Specify a different branch using the -b option.
- Proxy Support: Utilize a proxy by using the --proxy option.
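The filtering and customization features can be pictured with a short sketch. It assumes entries shaped like the Git Trees API output, where files have type "blob" and directories have type "tree", and it assumes the prefix and suffix are applied to the whole path. The function `build_wordlist` is hypothetical and may not match how Repolist does it internally.

```python
def build_wordlist(entries, prefix="", suffix="", only_files=False, only_dirs=False):
    """Turn path/type entries into wordlist lines.

    Illustrative only: "blob" marks a file and "tree" a directory, as in the
    Git Trees API; prefix and suffix wrap the full path (an assumption).
    """
    words = []
    for entry in entries:
        if only_files and entry["type"] != "blob":
            continue
        if only_dirs and entry["type"] != "tree":
            continue
        words.append(f"{prefix}{entry['path']}{suffix}")
    return words


entries = [
    {"path": "admin", "type": "tree"},
    {"path": "admin/login", "type": "blob"},
]
# Keep only files and append ".php" to each one.
php_words = build_wordlist(entries, suffix=".php", only_files=True)
```

Here `php_words` holds a single entry, `admin/login.php`, because the directory `admin` is filtered out.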