After hunting for security bugs, I've realized that the clients I work with are not familiar enough (or at all) with basic “hacking” techniques. API keys, passwords, encrypted SSH keys, and certificates are all great protection mechanisms, as long as they are kept secret. Once they’re out in the wild, it doesn’t matter how complex the password is or what algorithm was used to hash it somewhere else. In this post, I’m going to share concepts, methods, and tools that researchers use both for finding secrets and for exploiting them. I'll also list mitigation action items that are simple to implement.
It’s important to mention that the attack & defend “game” is not an even one; an attacker only needs one successful attempt to get in, whereas the defender has to succeed 100% of the time. The hard part is knowing where to look. Once you can list the virtual “gates” through which hackers can find their way in, you can protect them with rather simple mechanisms. I believe their simplicity sometimes overshadows their importance, which is why many teams overlook them.
So here’s a quick and simple, yet not one to overlook TL;DR:
- Enforce MFA everywhere - Google, GitHub, cloud providers, VPNs, anywhere possible. If it's not an option, reconsider the system in use
- Rotate keys and passwords constantly, employ and enforce rotation policies
- Scan your code regularly. Preferably as part of the release process
- Delegate login profiles and access management to one central system that you control and monitor
These are the 20% of actions that deliver 80% of the effect in preventing leaks and access-control holes.
So, what do hackers do and use to find passwords and application secrets?
They find them in your JavaScript files
API keys are exposed all over the internet. This is a fact, and oftentimes there is no good reason for it. Developers leave them lying around:
- For debug purposes
- For local development
- For future maintainers as comments
Blocks such as this one are all over the internet:
// DEBUG ONLY
// TODO: remove -->
API_KEY=t0psecr3tkey00237948
While many hackers actually sit and read through JavaScript files, the vast majority collect them automatically with tools like meg and then scan them for patterns.
How do they do that? After fetching files with a tool like meg, they search their findings for strings that match different templates. Another great tool by the same author that does exactly that is gf, which is just a better grep. In this instance, truffleHog, or the trufflehog option in the gf tool, can find the high-entropy strings that most API keys look like. The same goes for searching for API_KEY as a string, which yields results (too) many times.
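To make the "high-entropy string" idea concrete, here is a minimal sketch of how such a scan can work. It is not how truffleHog or gf are implemented; the token pattern, the minimum length of 20, and the 4.0 bits-per-character threshold are all assumptions picked for illustration.

```python
import math
import re
from collections import Counter

# Candidate tokens: runs of 20+ characters from a base64/hex-like alphabet.
# Both the alphabet and the minimum length are assumptions for this sketch.
TOKEN_RE = re.compile(r"[A-Za-z0-9+/=_\-]{20,}")


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_high_entropy_strings(source: str, threshold: float = 4.0) -> list[str]:
    """Return tokens whose per-character entropy is at or above threshold."""
    return [t for t in TOKEN_RE.findall(source) if shannon_entropy(t) >= threshold]
```

Random-looking keys score high because almost every character is different, while ordinary identifiers and repeated characters score low. A threshold like this will produce false positives and misses, so treat hits as leads to review rather than as verdicts.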
Oftentimes, keys have a good reason to appear where they are, but they're not protected from being used externally. One example is a client I've been working with lately who, like many other platforms, uses maps as a third-party service. In order to fetch maps and manipulate them, they would call an API with a key and use it to get the relevant map back. What they forgot to do is configure their map provider to limit the origins from which requests with that specific key can come. It's not hard to think of a simple attack that will drain their license quota, effectively costing them a lot of money, or "better" yet (in terms of the attack) bring their map-oriented service down.
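Conceptually, an origin restriction is just an allowlist check performed by the provider on every request. The sketch below is only an illustration of that idea, not any provider's actual implementation; the key name, the domains, and the function name are hypothetical.

```python
from typing import Optional
from urllib.parse import urlsplit

# Hypothetical mapping of API keys to the origins allowed to use them.
ALLOWED_ORIGINS = {
    "maps-key-123": {"https://app.example.com", "https://www.example.com"},
}


def is_request_allowed(api_key: str, origin_header: Optional[str]) -> bool:
    """Accept a request only if its Origin matches the key's allowlist."""
    allowed = ALLOWED_ORIGINS.get(api_key)
    if not allowed or not origin_header:
        return False
    parts = urlsplit(origin_header)
    # Normalize to scheme://host[:port] before comparing.
    origin = f"{parts.scheme}://{parts.netloc}".lower()
    return origin in allowed
```

Keep in mind that a non-browser client can forge the Origin header, so a restriction like this mainly stops other websites from reusing your key. It is worth pairing with quotas and usage alerts.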
Hackers don't use JS files only to find secrets. This is your application code, open to any prying eyes. An intelligent hacker might read the code thoroughly to understand naming conventions and API paths, and to find informational comments. These are later turned into a list of words and paths and loaded into automated scanners. This is what's referred to as an intelligent automated scan: one where the attacker combines automated processes with gathered organization-specific information.
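Here is a rough sketch of what "extrapolating to a list of words and paths" can look like in practice. The regular expression is an assumption that only catches quoted strings starting with a slash; real tooling uses much richer heuristics.

```python
import re

# Quoted strings that look like absolute paths, e.g. "/api/v3/users".
# The pattern is an assumption; real tools use richer heuristics.
PATH_RE = re.compile(r"""["'`](/[A-Za-z0-9_\-./]+)["'`]""")


def extract_paths(js_source: str) -> list[str]:
    """Return unique path-like string literals in order of appearance."""
    seen: dict[str, None] = {}
    for match in PATH_RE.finditer(js_source):
        seen.setdefault(match.group(1), None)
    return list(seen)


def build_wordlist(paths: list[str]) -> list[str]:
    """Split paths into a sorted, de-duplicated list of segment words."""
    words = {segment for path in paths for segment in path.split("/") if segment}
    return sorted(words)
```

Running this against your own bundles is a quick way to see which internal paths and names you are publishing to everyone who opens the page source.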
Here is a real comment left on a target's front page, revealing a set of unprotected API endpoints that leaked data:
/* Debug ->
domain.com/api/v3 not yet in production
and therefore not using auth guards yet
use only for debugging purposes until approved */
What should you do then?
- Minify / Uglify - Adds a layer of obfuscation. While usually reversible, it can help you fly under the radar of many automatic scanners, reducing the attack surface
- Keep only the bare minimum keys and permissions - while some are essential, most are not. Leave only keys that have to be part of the code
- Reduce the key permissions to the bare minimum necessary - as with the maps service example, make sure the key can only do what it's intended to do, and only from where it's intended to operate. Make sure you leave no room for exploitation
- Use the same tools attackers would use to automatically scan code on CI builds, especially string pattern matching tools that are quick to run. Utilize simple greps or gf to scan for patterns. Much like tests, these can help ensure developers don't leave holes that can be exploited or used to breach the system
- Practice code review to have another eye on the code - all the scanners in the world cannot scan and detect 100% of the use cases. Another human eye is a great practice, both for quality and security
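The CI scan from the list above can start as small as the sketch below. The two patterns and the file suffixes are illustrative assumptions, not a complete rule set, and the function names are mine.

```python
import re
import sys
from pathlib import Path

# Illustrative patterns only; tune them to the secrets your stack uses.
SECRET_PATTERNS = {
    "generic secret assignment": re.compile(
        r"(?i)\b(api_key|secret|token|password)\b\s*[=:]\s*['\"]?[^\s'\"]{8,}"
    ),
    "private key header": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def scan_text(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every suspicious line."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((number, name))
    return findings


def scan_tree(root: str, suffixes: tuple[str, ...] = (".js", ".ts", ".json", ".yml")) -> int:
    """Scan matching files under root, print findings, return their count."""
    total = 0
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in suffixes:
            for number, name in scan_text(path.read_text(errors="ignore")):
                print(f"{path}:{number}: {name}")
                total += 1
    return total


def main() -> None:
    """Exit non-zero when anything is found so the CI job fails."""
    sys.exit(1 if scan_tree(".") else 0)
```

Calling `main()` from a CI step fails the build on any finding, which is the same feedback loop developers already know from failing tests.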
They take a look back at the Wayback machine
The Internet Archive, also known as the "Wayback Machine", holds periodic snapshots of websites all over the internet going back years. This is a goldmine for hackers with a target. With tools like waybackurls (based on waybackurls.py), one can scan any target for old files. This means that even if you've found and removed a key but did not rotate it, a hacker might still find it in an old version of your website and use it against you.
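To check what the archive remembers about your own domain, a small sketch like the following can list archived URLs. It assumes the Wayback Machine's public CDX endpoint and the `url`, `output`, `fl`, and `collapse` query parameters as I understand them; verify them against the CDX documentation before relying on this.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed public CDX search endpoint of the Wayback Machine.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_url(domain: str) -> str:
    """Build a CDX query listing every archived URL under domain."""
    params = {
        "url": f"{domain}/*",
        "output": "json",
        "fl": "original",
        "collapse": "urlkey",
    }
    return f"{CDX_ENDPOINT}?{urlencode(params)}"


def parse_cdx_rows(rows: list[list[str]]) -> list[str]:
    """Drop the header row of a JSON CDX response and return the URLs."""
    return [row[0] for row in rows[1:]]


def archived_urls(domain: str, timeout: float = 30.0) -> list[str]:
    """Fetch archived URLs for domain (performs a network request)."""
    with urlopen(build_cdx_url(domain), timeout=timeout) as response:
        return parse_cdx_rows(json.load(response))
```

Something like `archived_urls("example.com")` returns the list of old paths; filtering it for `.js`, `.env`, or config files shows what an attacker would look at first.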
Found a key lying around where it's not supposed to be?
- Create a replacement key
- Release a version that uses the new key and removes the clear-text mention
- Delete the old one or deactivate it
The Wayback Machine is not only good for finding keys
Old code reveals all kinds of interesting information to exploiters:
- Secret API paths - Unprotected API endpoints that you thought would never be found. While the ones that are found may be unexploitable, they still help attackers map the API structure and conventions in the system. Once your code is out in the wild, there's no control over it. This is key to remember, and it should sit in the back of every developer's mind
- Web administration panels, much like API endpoints, are left around for different purposes and serve as one of the common attack vectors hackers find and exploit. These are mostly found in large organizations and installed by IT teams. A good idea is to periodically review all administration panels in use and their access management. A recent automotive manufacturer breach happened through such a panel, which was bypassed by removing the s from the https prefix of the address. Yes: 🤦.
They use GitHub
GitHub is a goldmine for hackers. If you know where to look, a simple search can yield interesting results. If your account is not enforcing MFA, each and every user in the organization is a walking security hole. It's not far-fetched to assume that one of the collaborators in the organization is not using a unique password and that his password was once leaked through another system. A hacker that targets the organization can easily automate such a scan or even go through it manually.
The list of employees can be generated with OSINT, like searching for employees on LinkedIn or in the GitHub public users list.
For example, here's a good starting point if you're trying to probe Tesla:
https://api.github.com/orgs/teslamotors/members
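It's worth running the same query against your own organization to see which accounts an outsider can enumerate. Below is a minimal sketch that reads the first page of that endpoint; the function names are mine, and the paging parameters are the standard `per_page` and `page` query options of GitHub's REST API.

```python
import json
from urllib.request import Request, urlopen


def members_url(org: str, page: int = 1, per_page: int = 100) -> str:
    """Build the REST URL for one page of an organization's members."""
    return f"https://api.github.com/orgs/{org}/members?per_page={per_page}&page={page}"


def extract_logins(members: list[dict]) -> list[str]:
    """Pull the login names out of a members API response."""
    return [member["login"] for member in members if "login" in member]


def fetch_member_logins(org: str) -> list[str]:
    """Fetch the first page of members (performs a network request)."""
    request = Request(members_url(org), headers={"Accept": "application/vnd.github+json"})
    with urlopen(request, timeout=30) as response:
        return extract_logins(json.load(response))
```

Without authentication this only returns members who made their membership public, which is exactly the view an attacker starts from.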
Even if the company doesn't use GitHub as its git provider, leaks often end up there anyway. It's enough to have one employee who uses GitHub for personal projects and has a small leak in one of them (or in their git history) to turn it into a breach.
Git's nature is to track the entire history of changes in every project. In a security context, this fact becomes significant. In other words, every line of code ever written (or removed) by any user with current access to any organizational system can jeopardize the company.
Why does it happen?
- Companies don't scan themselves for leaks
- Those that do usually don't consider going through their employees' personal (yet publicly available) accounts
- Those that do scan employees (a guesstimate of less than 1%) often fail by over-relying on automation and skipping the commit history (scanning only the surface, which is the current snapshot of the code, rather than the entire git tree)
- Lastly, companies don't rotate keys or use 2FA often enough. Those two can eliminate most of the holes above
Dorks 101
"Dorks" are search lines that use a search engine's different features, with targeted search strings to pinpoint results. Here's a fun list of Google searches from the Exploit DB.
Before giving the gist of it: if you want to go deep here, and I personally recommend that you do, here's an invaluable lesson from a talented researcher. He discusses how to scan, how to use dorks, and what to look for and where when going through a manual process.
GitHub dorks are less complex than Google's, simply because GitHub lacks the range of features Google offers. Still, searching for the right strings in the right places can do wonders. Just go ahead and search for one of the strings from the following list on GitHub; you're in for a treat:
password
dbpassword
dbuser
access_key
secret_access_key
bucket_password
redis_password
root_password
If you try targeting the search to interesting files like filename:.npmrc _auth or filename:.htpasswd you can filter the type of leak you're looking for. For further reading, see SecurityTrails' great post.
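To audit your own organization with these dorks, you can generate the queries instead of typing them one by one. This sketch assumes GitHub's `org:` search qualifier alongside the `filename:` qualifier used above; qualifier support differs between GitHub's search versions, so treat the generated URLs as a starting point.

```python
from urllib.parse import quote_plus

# A few strings from the list above, plus the file-targeted dorks.
KEYWORDS = ["password", "dbpassword", "access_key", "secret_access_key"]
FILE_DORKS = ["filename:.npmrc _auth", "filename:.htpasswd"]


def build_queries(org: str) -> list[str]:
    """Combine each dork with an org: qualifier to target one organization."""
    return [f"org:{org} {dork}" for dork in KEYWORDS + FILE_DORKS]


def search_url(query: str) -> str:
    """Return a GitHub code-search URL for a query string."""
    return f"https://github.com/search?q={quote_plus(query)}&type=code"
```

Looping over `build_queries("your-org")` and opening each `search_url(...)` gives you the attacker's view of your repositories in a few minutes.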
Mitigation
- Scan for leaks as part of any CI process; Gitrob is a great tool
- Scan employees' accounts; Gitrob does that for you unless disabled with the -no-expand-orgs flag
- Go deep into the history; Gitrob's default is 500 commits, and you can go further with -commit-depth <#number>
- Enforce GitHub two-factor authentication!
- Rotate access keys, secrets, and passwords of each and every system. A good practice would be to use federated access through one system like GSuite or ActiveDirectory and make sure they employ policies of password rotation and complexity
Important post-publication remarks by readers @codemouse92 and @corymcdonald about password complexity, rotation, and assisting physical devices:
Use a unique, complex password for each login that requires one... but understand that complex does not imply esoteric. Long phrases are currently considered the best strategy.
...
I'd add one thing to the topic of password managers: while you should definitely use one, it's best to still use phrase-based passwords that can be entered reasonably by a human.
- Jason C.McDonald
--
In my line of work ... everyone is issued a hardware-based MFA. Everyone gets 2 YubiKeys ...
Additionally, we have 1Password and separate vaults for each team. When an employee leaves the company the support team goes and rotates all the passwords in each vault the person had access to.
Personally I've made the cursed mistake of pushing up AWS secrets to Github. It's recommended everyone add git-secrets to their pre-commit workflow to prevent pushing up anything resembling a secret.
- Cory McDonald
They use Google
Now that we're generally familiar with dorks, taking them to Google reveals an entirely new field of features. Being the powerful search engine it is, Google offers inclusion & exclusion of strings, file format, domains, URL paths, etc.
Consider this search line:
"MySQL_ROOT_PASSWORD:" "docker-compose" ext:yml
This targets a specific file format (yml) and a vulnerable file (docker-compose) where developers tend to store their not-so-unique passwords. Go ahead and run this search line; you'd be surprised to see what comes up.
Other interesting lines may include RSA keys or AWS credentials, here's another example:
"-----BEGIN RSA PRIVATE KEY-----" ext:key
The options are endless and the level of creativity and width of familiarity with different systems will determine the quality of findings. Here's a large list of dorks if you want to play a little.
They get to know your system
When a researcher (or a motivated hacker) gets "involved" with a system, he goes deep. He gets to know it: API endpoints, naming conventions, interactions, and different versions of systems if they're exposed.
A not-very-good approach to securing systems is introducing complexity and randomness to their access paths instead of real security mechanisms. Security researchers trying to come up with vulnerable paths and endpoints use "fuzzing" tools. These tools use lists of words, combining them into system paths and probing them to see if valid answers are returned. These scanners will never find a completely random set of characters, but they are superb at identifying patterns and extracting endpoints you either forgot about or did not know existed.
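The core loop of such a fuzzer is simple enough to sketch in a few lines. This is not how any particular tool is implemented; `fetch_status` is a hypothetical callable you would supply (for example, one that issues a HEAD request and returns the status code), and it should only ever point at systems you are authorized to test.

```python
from itertools import product
from typing import Callable, Iterable


def candidate_urls(base: str, prefixes: Iterable[str], words: Iterable[str]) -> list[str]:
    """Combine known path prefixes with wordlist entries into candidate URLs."""
    base = base.rstrip("/")
    return [f"{base}/{p.strip('/')}/{w}" for p, w in product(prefixes, words)]


def probe(urls: Iterable[str], fetch_status: Callable[[str], int]) -> list[str]:
    """Return URLs whose status suggests the endpoint exists (not 404)."""
    return [url for url in urls if fetch_status(url) != 404]
```

Keeping the network call behind a parameter makes the logic easy to test offline. The quality of the results depends almost entirely on how well the prefixes and words match the target's conventions.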
Remember, security through obscurity is not a good practice (although don't ignore it completely)
That's where the GitHub dorks we've discussed earlier come in; knowing a system's endpoint naming convention, e.g. api.mydomain.com/v1/payments/..., can be very helpful. Searching the company's GitHub repos (and their employees') for the basic API string can many times find those random endpoint names.
However, random strings still have a place when building systems; they are always a better option than incremental IDs for resources like users or orders.
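As a small illustration, Python's standard library can already generate identifiers that cannot be enumerated by counting up. The function names here are mine; the underlying calls are `secrets.token_urlsafe` and `uuid.uuid4`.

```python
import secrets
import uuid


def new_resource_id(num_bytes: int = 16) -> str:
    """Return a URL-safe random identifier built from num_bytes random bytes."""
    return secrets.token_urlsafe(num_bytes)


def new_uuid() -> str:
    """Return a random (version 4) UUID as a string."""
    return str(uuid.uuid4())
```

An attacker who sees /orders/1041 will try /orders/1042; with random IDs there is no next value to guess. This only removes the easy enumeration, so every endpoint still needs a real authorization check.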
Here's an incredible string lists repo called "SecLists". It's being used by almost everyone in the industry. Often with a personal twist and touch in the context of the target, it's a massive source. Another powerful tool to leverage string lists is FFuf, an ultra-fast fuzz tool written in Go.
Wrapping it up
Security is often taken lightly in startups. Developers and managers tend to prioritize speed and delivery times over quality and security. Pushing clear-text secret strings to code repos, using the same keys over and over across systems, and using access keys when other options are available can sometimes seem faster, but they can be detrimental down the road. I've tried to show how those strings you think are protected by being in a private repo can easily find their way to a public gist, or to an employee's unintentional Git clone that was made public. If you lay the groundwork for secure work, such as password sharing tools, a central secret store, password policies, and multi-factor authentication, you'll be able to keep making fast progress without sacrificing security completely.
"Move fast and break things" is not the best mantra in the context of information protection
Knowing how hackers work is usually a very good first step in understanding security and applying it to systems as a protector. Consider the approaches above, and the fact that this is a very limited list of the paths hackers take when penetrating systems. A good thing to do is to keep in mind the security aspects of anything being deployed to a system, regardless of whether it is customer-facing or internal.
Managing security can sometimes be a pain in the ass, but rest assured, the mayhem you're avoiding by just taking care of the very basic elements, will keep you safe and sane.
Thank you for reading this far! I hope that I've helped to open some minds to risks that are out there and we all miss or overlook.
Feel free to reach out with any feedback or questions. Any form or shape of discussion is most welcome!
Top comments (31)
Mostly wise, but I have a couple of concerns.
Security researchers no longer recommend this with passwords. It doesn't work. Use a unique, complex password for each login that requires one...but understand that complex does not imply esoteric. Long phrases are currently considered the best strategy. If you have no reason to suspect a breach or significant possibility of a breach on an account, rotating the password does not decrease the chances of one. In fact, password rotation usually only leads to bad password strategies.
I could argue the same for keys, especially as they're even harder to crack than passwords, and far more likely to result in an unrecoverable account if you lose them.
That said, rotation policies may make more sense from a computer-to-computer perspective, if (and only if) there's any risk of exposure.
While I mostly agree with MFA, it is also not a guarantee of security — see sim card attacks — and it has its own hazards. If the service provides MFA, e.g. a code sent to a specific device, but no secure account recovery method, you may want to think twice. If you lose or destroy your phone (not that anyone has ever done that!) or it gets stolen, you're permanently locked out of any account that lacks recovery means. (PayPal is a prime example. No, they will not help. They'll go out of their way to FRUSTRATE helping, in fact.)
(One more note on that topic: there have been multiple security researchers who have warned that most consumer biometrics are less secure than passwords. Many fingerprint readers and facial recognition only check for partial match, and can be fooled. Consumer-focused biometrics should only be paired with a more secure authentication method, e.g. a key or password. Yes, I just said password.)
Be secure. But also be certain you aren't locking yourself out of Fort Knox permanently. It sucks at least as much as a breach.
I actually agree with 100% of your points and it makes me think whether I should sharpen my message;
I'm discussing mostly software teams and companies (obviously not only but that's the target audience). With that in mind, I address both personal user passwords and authentication keys.
To answer your points directly:
Password complexity - completely right. When I talk about complexity, it's important to stress length rather than complexity. I would argue, however, that everyone is far better off with a personal password manager like 1Password instead of managing their own passwords. That's another key point when talking about rotation.
MFA - again, correct. This is brought up in the context of web login profiles for the 3rd party services teams use daily. It is most certainly not a replacement for a password but an extra layer of security. And again, the context is a password leaking out; MFA usually makes a leaked password useless on its own.
Summing up, yes, of course, everything should be done with reason. In my experience, 99% of the teams need the push towards better security strategy rather than limiting the layers of protection they put on their processes. That being said, it's a great and important discussion which I must agree with. TBH just thinking about it raises some cases I've dealt with before, mainly in large organizations where the authentication processes and policies were so extremely hard that it actually did hurt productivity and progress.
Thank you for taking the time to read and respond!
Thanks for your response! I agree with you as well.
I'd add one thing to the topic of password managers: while you should definitely use one, it's best to still use phrase-based passwords that can be entered reasonably by a human. There are still times that situation occurs in the real world, as much as we like to pretend our password manager will always work perfectly. This is particularly true of central accounts like GitHub and email.
Besides that, you really should keep a copy of your most important passwords and keys on paper in a fire safe, in case of electronic catastrophe, or your own untimely demise.
In other words, the one time you need to enter your password by hand is the one time you're going to regret an esoteric password.
false-overspend-foe-float-stack is going to be a better password for human use than 3FaqtgSr2T9pgVJRwGxauzDmn, and just as secure. (Bonus, you have a realistic chance to spot when the former is wrong or outdated.)
If websites are still demanding their numbers and special characters, you can incorporate a consistent pattern unique to you. Numbers and symbols don't actually reduce the probability of cracking as once thought, so merely adding them to the phrase you would have used is perfectly fine; it's the phrase that's the secret, ultimately.
Again - 100% :)
I remember a really good post explaining what you just mentioned scientifically, in terms of computation complexity and comparing short complex passwords to long sensible strings.
I'd try to find it and maybe add it here.
Thanks again!
I'd be happy to quote some of your responses and incorporate in the post. I think they're extremely valuable to the readers!
With credit of course. Would that be okay with you?
Go for it! Thanks.
A good option is to use YubiKeys: with Google Authenticator, if you lose your phone you're doomed, but a YubiKey stores the codes in hardware, which is really great. Moreover, you can have several backup keys, so even if you lose one you can insert another key into any machine or phone with type C and be happy.
Good password protection can be built on top of pass & xkcdpass utils.
True!
I personally use 1Password as my 2FA store, which makes it a bit more secure behind the gate of a single passphrase or a fingerprint. The downside is having both the password and the 2FA code accessible after successfully authenticating to a single system.
I do agree that physical hardware takes it a step further, but would you say it's a feasible request for every team out there, even the smaller ones?
I store passwords in an encrypted format on my own Git server that only accessible through a specific IP address what's my own VPN + DNS that really don't store logs but SSH port still open, so I can push/pull updates from any machine but web interface only through VPN and again, ssh key stored on Yubikey, so an attacker needs physically to have access to my key and know the PIN. Remote vector of attack I cannot imagine due to my limited knowledge of security/crypto field but should be secure (I guess).
I talk here more about personal security and it's not so attractive for teams, indeed, but it's really secure security versus imposter security :) 1Password/LastPass should be good options for teams.
Got it.
Well about secret storage for teams I usually suggest Hashicorp's Vault. My experience with it is excellent. It's open-source, secure, and really thought through in terms of features.
For personal use - good thinking.. I'll consider it myself :)
Although someone a few comments above you mentioned that they, as a team, were getting personal YubiKeys for everyone, with a Vault-specific namespace which was rotated every time an employee left...
Sounds really great. I heard of Hashicorp's Vault many times but didn't have a chance to learn it more. Will add this to my todo list, thank you.
P.S. Great article.
Thanks mate!
Yeah, Vault is awesome especially when you deal with Terraform. I've just tried this practice on Digitalocean and it's pretty straightforward. digitalocean.com/community/tutoria... "You’ll use Packer to create an immutable snapshot of the system with Vault installed, and orchestrate its deployment using Terraform. In the end, you’ll have an automated system for deploying Vault in place, allowing you to focus on working with Vault itself, and not on the underlying installation and provisioning process."
In my line of work, we've had sim-swapping attacks happen to a few employees. To mitigate this everyone is issued a hardware based MFA. Everyone gets 2 YubiKeys so just in case they lose one they can restore access to their accounts.
Additionally we have 1Password and separate vaults for each team. When an employee leaves the company the support team goes and rotates all the passwords in each vault the person had access to.
Personally I've made the cursed mistake of pushing up AWS secrets to Github. It's recommended everyone add git-secrets to their pre-commit workflow to prevent pushing up anything resembling a secret.
This is fantastic.
Both the security processes you guys use and the pre-commit tools by AWS I did not know.
Thanks for sharing!
Hey,
Yes, you’re getting here into the realm of static code analysis.
I did mention ways of simple code scan to identify leaked strings, but I consider STA to be a field of its own that requires commercial solutions.
I wasn’t aware of the style check on GitHub and would look it up.
Thanks!
Great article. Didn't know that using these ways a hacker can pass through security.
Thank you Sarsa!
Certainly. lots of times I hear about "hacks" and sophisticated methods where the truth lies somewhat between; scans can be sophisticated and thought through, but eventually it's a way to figure out someone's password and use it to log in.
I guess the percentage of real sophistication, research and bypassing complex mechanisms is extremely low. And so when it comes to security we actually do have a lot to do to prevent the vast majority of vectors and leave very little attack surface.
Fantastic article, mate! Thanks for sharing :)
Thank you 💪😁
This is a unique kind of post. I really like these. Well done!
Thanks mate! Much appreciated
Comprehensive and very informative article!
Thanks