A couple of months ago, I posted what is probably my second-most-popular social post ever: a repost of what I found to be an insightful blog post by Jeremy Keith. The message was simple, but it hit me deep.
"We can rescind our invitation to Google."
-- Jeremy Keith
Every website go-live checklist I ever completed included integrating Google Analytics, or at the very least ensuring that the site was being crawled as expected. I never actually questioned it.
When I started building this site, I also looked at moving my analytics interface elsewhere. Getting another website to collect, host and display my data felt less unsavoury than Google doing it, especially because I figured Google did other things with it, in ways and at a scale that no other website could. The more I thought about it, the more intrusive it felt.
Why
I never considered saying "actually, don't crawl me ever" until I read Jeremy's post. I don't need my personal site to show up in Google's results at all.
- I'm not monetising it or selling any ad space
- I don't directly benefit from increased traffic
- I don't want to compete for high positions in results lists with paid professionals who benefit from the first two points
- I don't care to teach Google stuff
Upon further consideration I decided I'd rather
- Get organic traffic from sources and circles I'm actively involved in (and post to)
- Leverage the SEO Pros of platforms that are incentivised to help me show up in search results
- Be selective about (or at least have control of) the content I share to the first two
And so, I block bots.
How
- my commit for blocking Google's bots, apparently.
- And in case it's changed, my robots.txt.
As many people have pointed out, Google could just as easily ignore this, there are probably more thorough ways to do this, and even (slightly) malicious ways to protest. I haven't gotten that far and am content with this for now. If you've got any other things I should be adding to my robots file, please reach out.
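For anyone who wants to sanity-check a similar setup, here is a minimal sketch in Python. The robots.txt content in it is an illustrative assumption, not a copy of the actual file linked above, and `is_allowed` and the example.com URL are hypothetical names made up for the example. It only shows what a well-behaved crawler would conclude from the rules; it does nothing to stop a crawler that ignores them.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents for illustration only;
# this is NOT the actual file from the site.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this user agent may fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "SomeOtherBot" is a made-up agent name that falls through to the * rule.
    for agent in ("Googlebot", "SomeOtherBot"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.com/blog/")
        print(f"{agent}: allowed={verdict}")
```

Swapping in the contents of a real robots.txt file makes this a quick way to confirm the rules say what you meant before deploying them.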
This was originally posted on my personal website at miko.ademagic.com/blog/i-block-bots
Top comments (3)
Depending on how much you don't care to be tracked, I'd be careful of crawlers who don't respect the robots.txt.
Might be worth logging user agents and then creating a black list for them.
yeah I don't care enough. People have pointed out that robots.txt is followed more as a courtesy, really.
Most of my reason for doing this was that I didn't like that it was my default to let/beg Google to crawl me without ever questioning that process. In this case I didn't need them to, so I didn't want them to. There are other ways to exist on the internet.
hope that's a takeaway others have too :)
Haha yea that's fair. I think I've done the same thing in the past, and set up web admin without thinking.
Good call out to think about what you need before going and doing something like indexing.