Last week I wrote a post about how to set up a serverless function with Zeit Now. I want to continue with serverless functions and create an API that extracts data from a website, transforms it, and sends it back to the user.
As an example, I picked the showrss.info website, which does not have an API but responds with an HTML document.
Selecting Tools
Tools we will use:
- Now CLI
- axios
- cheerio
First of all, we need an HTTP client for node.js to request the website. I am very fond of axios, which works in almost all browsers as well as in node.js.
cheerio is a library that implements core jQuery features on the server: it parses markup and provides an API for traversing and manipulating the resulting data structure. We will use it to scrape the HTML.
Also, if you want to follow the steps, install Now CLI globally with npm i -g now.
Setup Project
Let's start from scratch and initialize the project in the terminal:
mkdir showrss && cd showrss
npm init --yes
npm install axios cheerio
If you wonder what npm init --yes means: it skips the interactive questions and accepts the default answer for each of them.
Great, we have a new project with just package.json and node_modules. The next step is to create an API endpoint that users can send requests to.
For Zeit Now to create a serverless function with an endpoint, the project needs a folder named api in the root directory, with a file inside whose name becomes the endpoint. Back in the terminal, run:
mkdir api && cd api
touch shows.js
Create a Serverless Function
The file shows.js should export a default function that receives two arguments, request and response. These are the standard HTTP request and response objects, enhanced with some helpers by Zeit Now.
// api/shows.js
module.exports = (request, response) => {
  // send() accepts a string, an object, or a buffer
  // json() accepts only a JSON-serializable object
  response.send('Hello there!');
};
To test the endpoint, run the function in dev mode with now dev from the project root directory (go back with cd .. if you are still inside api).
If you open http://localhost:3000/api/shows in the browser, you should get a greeting message.
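The handler does not have to answer with plain text. As a minimal sketch of the helpers mentioned above, the variant below sets the status code explicitly and responds with JSON. The shape of the JSON body is only an illustration, and the name handler is introduced here for readability.

```javascript
// api/shows.js (variant): respond with JSON instead of plain text
const handler = (request, response) => {
  // status() sets the HTTP status code and returns the response object,
  // so the call can be chained with json()
  response.status(200).json({ message: 'Hello there!' });
};

module.exports = handler;
```

Requesting the endpoint again should then return the object as a JSON body instead of a string.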
Extract Data from Third-Party Service
Inside the serverless function, we can request the third-party service showrss.info with axios.
A couple of points to be aware of:
- The HTTP call takes some time, so we need to declare the default function as async and await the result.
- axios returns an object with a data field, where the actual response body is stored.
const axios = require('axios');

module.exports = async (request, response) => {
  // Wait for the page; the HTML document is in the data field
  const showsResponse = await axios.get('https://showrss.info/browse');
  const htmlData = showsResponse.data;
  response.send(htmlData);
};
By now you should see a rendered HTML response.
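Since the function depends on a third-party website, the request can fail or time out. One possible way to handle that is sketched below. The names createShowsHandler and fetchPage are hypothetical: fetchPage stands in for axios.get so that the sketch can run without a network call, and the 502 status and error message are my own choices, not something Zeit Now requires.

```javascript
// Sketch: the same handler with basic error handling.
// createShowsHandler and fetchPage are hypothetical names, not part of axios or Now.
const createShowsHandler = (fetchPage) => async (request, response) => {
  try {
    const showsResponse = await fetchPage('https://showrss.info/browse');
    response.send(showsResponse.data);
  } catch (error) {
    // The upstream site failed or timed out: answer with a gateway error
    response.status(502).json({ error: 'Could not load showrss.info' });
  }
};

// In api/shows.js this could be wired up as:
// const axios = require('axios');
// module.exports = createShowsHandler((url) => axios.get(url));
```

Passing the HTTP call in as a parameter also makes the handler easy to try out with a fake response before pointing it at the real website.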
Transform HTML into JS Object
The website responds with an HTML document. To traverse it and extract each show's information (id, title, and the individual RSS feed link), load the document into cheerio with the load() method. After that, the data is ready for extraction and accessible the same way as with jQuery.
const cheerio = require("cheerio");
...
const $ = cheerio.load(htmlData);
const options = $("#showselector option");
// toArray() returns only the matched elements; Object.keys() on a cheerio
// object would also include non-element properties such as length
const showList = options.toArray().map(element => {
  const show = $(element);
  return {
    id: show.attr("value"),
    title: show.text(),
    rss: `http://showrss.info/show/${show.attr("value")}.rss`
  };
});
response.send(showList);
...
Try to access the endpoint now, and you should receive an array with all the TV shows, each with its individual RSS feed link.
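To make the shape of that array concrete, here is the mapping step in isolation, as a small sketch that runs without cheerio. The helper toShowList and the sample data are made up for illustration, and skipping options with an empty value is an assumption about how a placeholder entry in the select would look.

```javascript
// Hypothetical helper: turns plain { value, text } pairs into show entries.
// Assumption: a placeholder <option> has an empty value, so it is skipped.
const toShowList = (options) =>
  options
    .filter((option) => option.value)
    .map((option) => ({
      id: option.value,
      title: option.text.trim(),
      rss: `http://showrss.info/show/${option.value}.rss`
    }));

// Made-up sample data, not taken from showrss.info
const sample = [
  { value: '', text: 'Select a show' },
  { value: '42', text: ' Example Show ' }
];

console.log(toShowList(sample));
// [ { id: '42', title: 'Example Show', rss: 'http://showrss.info/show/42.rss' } ]
```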
Deploy and Enjoy
The last step is to deploy the serverless function by running now in the terminal. After a successful deployment, you'll get an access link, and the client can fetch the already transformed list from it.
I challenge you to make a new serverless function that receives a show id as an argument and responds with additional data about that show.
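If you want a starting point, here is one possible skeleton, not a full solution. It assumes the id arrives as a query parameter (for example /api/show?id=42), that show ids are numeric, and that the function lives in a hypothetical file api/show.js. It only validates the input and echoes the feed link, so fetching and parsing the additional data is still up to you.

```javascript
// api/show.js (hypothetical file name): expects a request like /api/show?id=42
const showHandler = (request, response) => {
  // request.query is one of the helpers Zeit Now adds to the request object
  const { id } = request.query;

  // Assumption: show ids are numeric, so reject anything else early
  if (!/^\d+$/.test(String(id))) {
    response.status(400).json({ error: 'Query parameter "id" must be a number' });
    return;
  }

  // Placeholder response: a real version would request and parse the show's data here
  response.json({ id, rss: `http://showrss.info/show/${id}.rss` });
};

module.exports = showHandler;
```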