When you put your email address on your website for people to click on, you probably do something like this:
<a href="mailto:name@example.com">name@example.com</a>
Right?
This is, however, the perfect recipe to get spam into your email!
This is how email harvesting works
Spammers need a nice long list of email addresses so they can pester people into clicking their suspicious-looking links. To build these lists they use email harvesters, bots that crawl websites looking for addresses that people have left there so they can be contacted.
Some people think this is easy to solve by masking the email, like name[AT]example[DOT]com. This, however, doesn't solve anything, for two reasons:
- The mailto link still contains the actual email address as you can't replace it with the one above. Since email harvesters look into the source code of your website, they'd still be able to get your email.
- Most email harvesters are advanced enough to detect common patterns like [AT] and (AT), so masking the visible text won't do much.
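To make the two points above concrete, here is a minimal sketch of what a simple harvester might do. The page source, the regular expressions, and the `harvest` function are all assumptions made up for illustration; real harvesters vary and are likely more thorough.

```python
import re

# Hypothetical page source, reusing the examples from this article.
PAGE = '''
<a href="mailto:name@example.com">name[AT]example[DOT]com</a>
<p>Or write to other(at)example(dot)org</p>
'''

# Plain addresses, including the one sitting inside the mailto: href.
PLAIN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Common masked forms such as [AT]/(AT) and [DOT]/(DOT), case-insensitive.
MASKED = re.compile(
    r"([\w.+-]+)\s*[\[(]\s*at\s*[\])]\s*([\w-]+)\s*[\[(]\s*dot\s*[\])]\s*(\w+)",
    re.IGNORECASE,
)

def harvest(source: str) -> set[str]:
    """Collect plain addresses and rebuild masked ones from raw page source."""
    found = set(PLAIN.findall(source))
    for user, domain, tld in MASKED.findall(source):
        found.add(f"{user}@{domain}.{tld}")
    return found

print(harvest(PAGE))
```

Both the address in the mailto link and the masked ones in the visible text end up in the result.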
So, what now?
Encode your email address
Fortunately, there's a way to make your email address unreadable for email harvesters!
You may have seen codes like &amp;amp; and &amp;gt; in HTML before. These are called HTML entities: symbols that have been encoded so they won't be mistaken for HTML markup.
However, what not many people know is that you can encode every single character as an HTML entity. Even better, the browser converts entities in your hrefs back into regular text, so nothing changes for normal visitors who use your website rather than read its source code. It's perfect for this situation!
HTML entities for regular letters are built from each character's hexadecimal code. They look like &#xHEXCODE; (there is also a decimal form, &#DECIMALCODE;).
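If you'd rather see what such a conversion does, it can be sketched in a few lines of Python. The function name `encode_entities` is made up for this example, and it assumes the hexadecimal form described above.

```python
def encode_entities(text: str) -> str:
    """Write every character of text as a hexadecimal HTML entity."""
    return "".join(f"&#x{ord(ch):x};" for ch in text)

# Encode the whole href value, including the mailto: prefix.
href = encode_entities("mailto:name@example.com")
link = f'<a href="{href}">My email</a>'
print(link)
```

For instance, `encode_entities("a@b")` gives `&#x61;&#x40;&#x62;`.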
Let's do it!
Use this handy tool to convert! Make sure to convert the entire href value (including the mailto: part), not just your email address!
After that, copy-paste that string into your href and you're done! Here's what it should look like:
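<a href="&#x6d;&#x61;&#x69;&#x6c;&#x74;&#x6f;&#x3a;&#x6e;&#x61;&#x6d;&#x65;&#x40;&#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;&#x2e;&#x63;&#x6f;&#x6d;">My email</a>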
This makes the whole thing a lot harder to decipher for most email harvesters, while still keeping the link clickable for visitors! On top of that, using inspect element to check the HTML shows you the decoded email, even though the source code has it encoded!
This means that it still ends up being readable for humans.
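The decoding step a browser performs can be imitated with Python's standard `html.unescape`. The sketch below is only an illustration under assumptions: `ENCODED_LINK` is the hex-encoded form of the sample link, and the `EMAIL` pattern is a made-up stand-in for what a simple harvester might search for.

```python
import html
import re

# The sample link with its href written as hexadecimal HTML entities.
ENCODED_LINK = (
    '<a href="&#x6d;&#x61;&#x69;&#x6c;&#x74;&#x6f;&#x3a;'
    '&#x6e;&#x61;&#x6d;&#x65;&#x40;'
    '&#x65;&#x78;&#x61;&#x6d;&#x70;&#x6c;&#x65;&#x2e;&#x63;&#x6f;&#x6d;">My email</a>'
)

# Hypothetical pattern a simple harvester might run over page source.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")

# Searching the raw source finds nothing, because there is no literal "@".
raw_hits = EMAIL.findall(ENCODED_LINK)

# Decoding the entities first, as a browser does, brings the address back.
decoded_hits = EMAIL.findall(html.unescape(ENCODED_LINK))

print(raw_hits, decoded_hits)
```

So the encoding only stops harvesters that search the raw source without decoding entities first.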
It's even better if you use some non-traditional way of masking your email address in the visible text, or just don't show your email there at all (like in the sample above).
We're all good now! No more disappointment when you think you just got a client and it turns out to be spam!
Top comments (12)
This is an example of Security through obscurity: en.wikipedia.org/wiki/Security_thr...
"Most email harvesters are advanced enough to detect common patterns"
The obfuscation technique used here is actually easier for bots to decode than adding things like [at] instead of @ in the text.
Common web scraping languages, like PHP, have built-in functions to decode HTML-encoded entities, and the bots use these.
As noted in my article, this solution is meant for mailto links themselves, where you can't obscure your email with things like [AT]. That's when the next best solution is to encode your email, if you insist on using a mailto link at all.
Solutions like this one are never going to solve everything due to security through obscurity, but it at least gets rid of the scraping bots that can't decode these entities.
Of course, a more effective solution would be not to use mailto links at all and obscure your email effectively as you said. However, the article was about when you do have a mailto link. :p
or you could hide the email link behind a captcha check which would do a better job at fighting bots
HTML encoding is a system aimed at letting programs decode those patterns into characters. Trying to hide readable characters behind a system aimed at being more readable by programs than by humans is utterly pointless.
Thx for this article, very interesting and useful, bookmarked :)
This was witty made me smirk 😄
Haha that's awesome! No problem 😉
Cool good info shared here.
Awesome article! short and to point. If you'll excuse me, I have some updates to do on my website....
Succinct! 👏👌
Would this work with a WordPress site? I'm using a theme which has fields to fill in with this info. Do I have to use the encoding for that?
WordPress has a page about this: codex.wordpress.org/Protection_Fro...