DEV Community

Mladen Ružičić

How I made an Instagram scraper on Firebase Console

For the last couple of days I have been reading about Instagram's new Graph API and thinking about creative ways of using it. To my surprise, it does not support anything related to user registration, not even checking whether an account username is available. After some research, I found one "tool" doing exactly that - checking if a provided text is a valid string and an available Instagram username.
But now, I don't care about the Graph API - I want to figure out how they made it, without an API!

Research

Of course, the first thing I tried was to inspect the tool's source code and check its network requests. All I could see was that it's something hosted on Heroku. No help. After that, I went to Instagram's official sign-up page and inspected its code. The form talks to instagram.com/accounts/web_create_ajax/ - I googled it - turns out it's not publicly available.

Ok, now I want to create the tool myself. Why? Because I was the one googling "check Instagram username availability" a few days ago, so I hope I am not the only one who refuses to go to the official site to do that. Expectations? Coding all day and learning about new technologies.

Idea

After investigating the behavior of their form validation, my first idea was a Node.js script running Puppeteer, populating Instagram's official sign-up form and waiting for a success or error element to show up on the screen. (I wrote some e2e tests at work last week, so I'm totally into this at the moment.)
The plan for puppeteer is:

  • Navigate to the Instagram signup page
  • Click username input and fill in some text
  • Click body (to trigger blur event validation check)
  • Observe DOM and return whether the field is valid

This is what I came up with, and - it was working!

Puppeteer
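
A minimal sketch of the four steps above, not necessarily the original script. The sign-up URL, both CSS selectors and the 3-second timeout are assumptions for illustration; Instagram's real markup differs and changes often. `checkUsername` accepts any object that exposes Puppeteer's `Page` methods, so the logic can be exercised without launching a browser.

```javascript
// Assumed values: Instagram's real URL and markup may differ.
const SIGNUP_URL = 'https://www.instagram.com/accounts/emailsignup/';
const USERNAME_INPUT = 'input[name="username"]';
const ERROR_ICON = '.coreSpriteInputError'; // hypothetical selector for the error element

// `page` is a Puppeteer Page (or anything exposing the same methods).
async function checkUsername(page, username, timeout = 3000) {
  await page.goto(SIGNUP_URL, { waitUntil: 'networkidle2' });
  await page.type(USERNAME_INPUT, username); // focuses the input and types into it
  await page.click('body');                  // blur triggers the validation check
  try {
    await page.waitForSelector(ERROR_ICON, { timeout });
    return false; // the error element appeared: invalid or taken
  } catch (err) {
    return true;  // waitForSelector rejected (normally a timeout): assume valid
  }
}

// Launches headless Chromium; requires `npm install puppeteer`.
async function isUsernameAvailable(username) {
  const puppeteer = require('puppeteer');
  const browser = await puppeteer.launch({ args: ['--no-sandbox'] });
  try {
    const page = await browser.newPage();
    return await checkUsername(page, username);
  } finally {
    await browser.close(); // always release the browser process
  }
}
```

Calling `isUsernameAvailable('awesome_username_99')` resolves to `true` or `false`. Treating every rejection as "valid" keeps the sketch short, but it also hides real failures such as navigation errors.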

Note: Puppeteer's waitForSelector method throws if the element does not appear in the DOM before the timeout. When that happens for the error element, I assume the error does not exist and treat the username input field as valid.

Now that I know the script works, let's deploy it!

Adjustments for Firebase Functions

For quite some time, I wanted to make practical use of Firebase Functions. This was a perfect moment to try it. I had some experience using Firebase before, so I went to Firebase Console, created a new project and initialized it locally.

Firebase Functions can be triggered in several ways. Two of them matter here:

  1. The ones you call from a Firebase app (an app hosted on Firebase, or an app authenticated with Firebase)
  2. The ones you want anybody to access - through HTTP requests.

I wanted both. One for my web app, and the other for everyone else.

First things first. Create a new project on Firebase Console, go to Functions tab and click "Get Started" with functions.
Second thing, install Firebase CLI locally:

npm install -g firebase-tools

Authenticate to your Firebase account

firebase login

And initialize a new project, answering the on-screen questions (defaults are just fine)

firebase init

It is important to configure your app to use the Firebase project you created a minute ago. Use firebase use --add.

Function - onRequest

Okay. Now let's make this function available through an HTTP request. We must use onRequest. Arguments are the same as for Express.js - request and response, meaning the Request object gives you access to the properties of the HTTP request sent by the client, and the Response object gives you a way to send a response back to the client.
You can easily create a new Express app and export it as a Firebase Function, but that would be overkill for this use case.

onRequest
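
A minimal sketch of such an HTTP function, not necessarily the original code. It assumes the `firebase-functions` and `cors` packages are installed and that some async `isAvailable(username)` function (for example the Puppeteer check) is passed in. The helper names `createCheckHandler` and `registerHttpFunction` are hypothetical; the export name `check` matches the URL used in the test request further down. The `firebase-functions/v1` import path is used so that the classic API applies; on older SDK versions a plain `require('firebase-functions')` should be the equivalent.

```javascript
// Builds an Express-style handler around any async `isAvailable(username)`.
function createCheckHandler(isAvailable) {
  return async (req, res) => {
    const username = req.query.username;
    if (typeof username !== 'string' || username.length === 0) {
      res.status(400).json({ error: 'Missing "username" query parameter' });
      return;
    }
    try {
      res.status(200).json({ available: await isAvailable(username) });
    } catch (err) {
      res.status(500).json({ error: 'Check failed' });
    }
  };
}

// Call as `registerHttpFunction(exports, isAvailable)` from functions/index.js.
function registerHttpFunction(target, isAvailable) {
  const functions = require('firebase-functions/v1');
  const cors = require('cors')({ origin: true }); // reflect the caller's origin
  const handler = createCheckHandler(isAvailable);
  // The CORS middleware runs first, then hands over to the handler.
  target.check = functions.https.onRequest((req, res) =>
    cors(req, res, () => handler(req, res))
  );
}
```

Keeping the handler separate from the Firebase wiring means it can be tested with plain fake `req` and `res` objects.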

Notice the CORS wrapper - required for accessing this function from outside our Firebase app domain.

We can test it right now using Postman

GET https://us-central1-your-project-name.cloudfunctions.net/check?username=awesome_username_99

and confirm it's working - Status 200 OK

{
    "available": true
}
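
The same request can be made from code instead of Postman. This is a sketch assuming a runtime with a global `fetch` (Node 18+ or a browser); the helper names are hypothetical, and `your-project-name` stays a placeholder for the real project ID.

```javascript
// Builds the function URL; the `us-central1` region mirrors the example URL above.
function buildCheckUrl(projectId, username) {
  const url = new URL(`https://us-central1-${projectId}.cloudfunctions.net/check`);
  url.searchParams.set('username', username); // takes care of URL encoding
  return url.toString();
}

// Resolves to the `available` flag from the JSON response.
async function fetchAvailability(projectId, username) {
  const response = await fetch(buildCheckUrl(projectId, username));
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`);
  }
  const { available } = await response.json();
  return available;
}
```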

Function - onCall

Now, the easier part - export a callable function to use from within a Firebase app.

onCall
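
A minimal sketch of a callable function, not necessarily the original code. It assumes the classic `(data, context)` callable signature from `firebase-functions/v1` and an async `isAvailable(username)` function passed in. The export name `checkUsername` and the helper names are hypothetical.

```javascript
// Builds the callable's body. `HttpsError` is injected so that the logic
// has no hard dependency on firebase-functions.
function createCheckCallable(isAvailable, HttpsError) {
  return async (data) => {
    const username = data && data.username;
    if (typeof username !== 'string' || username.length === 0) {
      // The client SDK surfaces this code and message to the caller.
      throw new HttpsError('invalid-argument', 'Expected a non-empty "username" string');
    }
    return { available: await isAvailable(username) };
  };
}

// Call as `registerCallable(exports, isAvailable)` from functions/index.js.
function registerCallable(target, isAvailable) {
  const functions = require('firebase-functions/v1');
  target.checkUsername = functions.https.onCall(
    createCheckCallable(isAvailable, functions.https.HttpsError)
  );
}
```

No CORS wrapper is needed here, because the Firebase client SDK handles the request details for callable functions.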

Web App - httpsCallable

What about testing this one? I want a simple webpage. Actually, validating input in vanilla JavaScript took me a while, but I liked it. If you spend most of your time using Angular/Vue/React/whatever, you have probably forgotten what it takes to check and set validity on form elements; at least I had. I'll skip the boring part (a link to the source code is at the end of the text).

httpsCallable
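
A minimal sketch of the client side, not necessarily the original code. It assumes the namespaced Firebase web SDK (`firebase.functions()`), a callable exported under the hypothetical name `checkUsername`, and a username `<input>` element. The helper name and the error message are also assumptions.

```javascript
// `firebaseApp` is the initialized namespaced SDK object (`firebase`);
// `input` is the username <input> element.
async function validateUsernameInput(firebaseApp, input) {
  const check = firebaseApp.functions().httpsCallable('checkUsername');
  const result = await check({ username: input.value });
  // An empty string marks the field as valid; any other string marks it invalid.
  input.setCustomValidity(result.data.available ? '' : 'Username is not available');
  return input.checkValidity();
}

// Browser wiring, assuming the Firebase scripts are loaded and initialized:
//   const input = document.querySelector('#username');
//   input.addEventListener('blur', () => validateUsernameInput(firebase, input));
```

The value the callable returns arrives in `result.data`, which is why the sketch reads `result.data.available` and not `result.available`.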

Once I was happy with how my form looked, I deployed it to Firebase Hosting and tested it.

firebase deploy 

I took it a step further - bought an SEO-friendly domain name and set up Google Tag Manager and Google Analytics Goals to track how many visitors (if I get any) hit an available username.

P.S. I actually spent more time getting familiar with Google Tag Manager and writing this blog post, than I did coding. xD

You can see this app live at https://instagram-username.firebaseapp.com/ and the source code at GitHub.

Update February 27.

Instagram asked me to stop using a domain containing their name (two days after I registered instagram-username.com), so I did. The demo is still available on the Firebase subdomain.

Top comments (5)

Panagiotis Grasis

Thank you for posting this, it was really helpful and informative as I'm about to go on the adventure of uploading puppeteer to GCF. I've got my scraper working locally but I'm super anxious about what may come up in the process of actually hosting it in a function. This post helped my anxiety calm down after seeing it is possible and doesn't require extreme amounts of work!

PS. I'm sorry about that domain, lol. :P

Mladen Ružičić

Nice to hear it helped! Happy coding! :)

Panagiotis Grasis

Were you by any chance on the Blaze Plan when doing this tut? Cause I tried it with the Free Tier but kept on bumping into a "net:ERR_NAME_RESOLUTION_FAILED" error which from what I figure has to do with outbound connection block.

Mladen Ružičić

I switched to the "Pay as you go" plan.

XTEALER

Amazing work! I'm currently going through some issues making a scraper for posts of public profiles. The profilePage data object seems to be missing from the axios response.
