Moving my articles from Medium to Dev.to
This article was originally posted here:
The code for the article can be found here.
Nodejs is good for IO-intensive tasks but bad for CPU-intensive tasks. The reason is that user code in Nodejs runs on the event loop, which runs on a single thread.
The event loop is responsible for everything that runs in the user-land of Nodejs. When its thread is blocked, all other tasks have to wait for it to be unblocked before they can be executed.
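As a rough illustration of the point above, the following sketch blocks the thread with a busy loop and measures how late a 0 ms timer fires. The helper names busyWait and measureTimerDelay are made up for this example; they are not part of Nodejs.

```javascript
// Spin on the CPU for roughly `ms` milliseconds without yielding to the event loop.
function busyWait(ms) {
  const end = Date.now() + ms;
  while (Date.now() < end) {
    // intentionally empty: this is the "CPU intensive task"
  }
}

// Schedule a 0 ms timer, then block the thread for `blockMs`.
// `done` receives how long the timer actually had to wait.
function measureTimerDelay(blockMs, done) {
  const scheduledAt = Date.now();
  setTimeout(() => done(Date.now() - scheduledAt), 0);
  busyWait(blockMs);
}

measureTimerDelay(200, (delay) => {
  console.log(`a 0 ms timer fired after ${delay} ms`);
});
```

The timer cannot fire until the busy loop returns, which is the same thing that happens to an incoming request while a long computation is running.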
I am not an expert on this issue; I am only sharing one way in which I dealt with it. If anyone has something to add or a correction to make, I’m open to advice.
Running Fibonacci
In this article, I will use Fibonacci as my CPU-intensive task (with the naive recursive function, it takes a long time to get the Fibonacci number of anything above 45). I am going to create a server that serves a simple response for any URL that does not match /fibo, and when the URL matches /fibo I will serve a Fibonacci result.
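To give a feel for why this function is expensive, here is a small sketch that counts how many calls the naive recursion makes. The name fiboWithCount and the counter object are assumptions for this example; the recursion itself matches the fibo function used in the rest of the article.

```javascript
// Same recursion as the article's fibo(), plus a counter of how many calls were made.
function fiboWithCount(n, counter) {
  counter.calls += 1;
  if (n < 2) return 1;
  return fiboWithCount(n - 2, counter) + fiboWithCount(n - 1, counter);
}

for (const n of [10, 20, 30]) {
  const counter = { calls: 0 };
  const start = process.hrtime.bigint();
  const value = fiboWithCount(n, counter);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  console.log(`fibo(${n}) = ${value}, calls = ${counter.calls}, ${elapsedMs.toFixed(1)} ms`);
}
```

The number of calls grows about as fast as the result itself, so every step up in n multiplies the work, and all of it runs synchronously on the event loop.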
I will not use any npm module in this article, only core node modules.
The Server
The server for this article only returns two types of response:
- A Fibonacci number for the req.headers.fibo value when the URL route is equal to /fibo
- A hello world string for any URL route that does not equal /fibo
Let's run the fibo normally
First, to show how Fibonacci blocks the event loop, I will create a server that computes Fibonacci in the same process that serves the simple hello world response.
Create a file called fibo_in_server.js. This file returns the Fibonacci number of the number passed in req.headers.fibo when the URL route is equal to /fibo, and returns hello world for any other URL.
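const http = require("http");
// naive recursive Fibonacci (fibo(0) = fibo(1) = 1), deliberately slow for large n
function fibo(n) {
    if (n < 2)
        return 1;
    else return fibo(n - 2) + fibo(n - 1);
}
const server = http.createServer((req, res) => {
    "use strict";
    if (req.url === '/fibo') {
        let num = parseInt(req.headers.fibo, 10); // the number comes from the "fibo" request header
        console.log(num);
        res.end(`${fibo(num)}`); // runs on the event loop and blocks it until it is done
    } else {
        res.end('hello world');
    }
});
server.listen(8000, () => console.log("running on port 8000"));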
We can run the above code and check the response. When req.url is not /fibo, the response is hello world; when req.url is equal to /fibo, the response is the Fibonacci number of the number passed in the fibo header field.
I’m using the Postman Chrome extension for requesting the server.
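For readers without Postman, the same two requests can be sketched with the core http module alone. This is only an illustration: it starts its own copy of the server on a random free port instead of port 8000, and the helpers createFiboServer and get are names invented for this example.

```javascript
const http = require("http");

function fibo(n) {
  return n < 2 ? 1 : fibo(n - 2) + fibo(n - 1);
}

// Same routing as fibo_in_server.js, returned without listening yet.
function createFiboServer() {
  return http.createServer((req, res) => {
    if (req.url === "/fibo") {
      res.end(`${fibo(parseInt(req.headers.fibo, 10))}`);
    } else {
      res.end("hello world");
    }
  });
}

// GET `path` on localhost, optionally sending a "fibo" header, and collect the body.
function get(port, path, fiboHeader, cb) {
  const headers = fiboHeader === undefined ? {} : { fibo: String(fiboHeader) };
  http
    .get({ host: "127.0.0.1", port, path, headers, agent: false }, (res) => {
      let body = "";
      res.setEncoding("utf8");
      res.on("data", (chunk) => (body += chunk));
      res.on("end", () => cb(null, body));
    })
    .on("error", cb);
}

const server = createFiboServer().listen(0, "127.0.0.1", () => {
  const { port } = server.address();
  get(port, "/fibo", 10, (err, fiboBody) => {
    if (err) throw err;
    console.log(`/fibo with header fibo: 10 -> ${fiboBody}`);
    get(port, "/", undefined, (err2, helloBody) => {
      if (err2) throw err2;
      console.log(`/ -> ${helloBody}`);
      server.close();
    });
  });
});
```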
If we send a number like 45 to the server, the request blocks the event loop until the Fibonacci number has been computed. Any request for the hello world string has to wait until the long-running Fibonacci is done.
This is not good for users who want to get only a simple response, because they have to wait for the Fibonacci response to be completed.
In this article, what I am going to do is look at some ways to fix this problem. I am not a Pro Super NodeJs Guru User, but I can give some methods of dealing with this problem.
Methods of dealing with this problem
- running Fibonacci in another Nodejs process
- using method 1 with a batch queue to process the Fibonacci
- using method 2 with a pool to manage the processes
Method 1: Running in another process
What we can do is run the Fibonacci function in another Nodejs process. This would prevent the event loop from getting blocked by the Fibonacci function.
To create another process we use the child_process module. I am going to create a file, fibonacci_runner.js, that runs as the child process, and another file called server_method1.js, the parent process.
server_method1.js serves the response to the client. When a request to /fibo is made, the server gives the work to its child process, fibonacci_runner.js, to handle. This prevents the event loop on the server from getting blocked, so smaller requests can still be handled promptly.
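Here is the code for fibonacci_runner.js:
// runs as the child process: compute the requested number and send it back
// together with the event name it arrived with, so the parent can match the reply
process.on("message", (msg) => {
    "use strict";
    process.send({value: fibo(parseInt(msg.num, 10)), event: msg.event});
});
function fibo(n) {
    if (n < 2)
        return 1;
    else return fibo(n - 2) + fibo(n - 1);
}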
And here is the code for server_method1.js:
const http = require("http");
const {fork} = require('child_process');
const child = fork(`${__dirname}/fibonacci_runner.js`); //start the child process
let {EventEmitter} = require('events');
let event = new EventEmitter(); //routes each result back to the request that asked for it
const server = http.createServer(function (req, res) {
    if (req.url === '/fibo') {
        let rand = Math.random() * 100; //generate a random number to identify this request
        child.send({num: req.headers.fibo, event: rand}); //send the number to fibonacci_runner
        event.once(rand, (value) => { //called when the child replies for this request
            res.end(`${value}`);
        });
    } else {
        res.end('hello world');
    }
});
child.on("message", (msg) => event.emit(msg.event, msg.value)); //emit the event named in the reply
server.listen(8000, () => console.log("running on port 8000"));
Now if we visit the URL route /fibo with a value >= 45 in the req.headers.fibo value, it won’t block the request for hello world. Better than what we had before.
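The way server_method1.js matches a reply from the child to the request that caused it can also be sketched on its own. The version below is an assumption-laden variant, not the code above: it swaps the random event name for an incrementing counter, and createCorrelator, request and handleReply are hypothetical names. The send function is injected so the sketch runs without a child process.

```javascript
// Matches replies to the requests that caused them.
// `send` is whatever delivers a message to the worker (child.send in server_method1.js).
function createCorrelator(send) {
  const pending = new Map(); // id -> callback waiting for that reply
  let nextId = 0;
  return {
    request(num, cb) {
      const id = nextId++; // a counter never repeats within this process
      pending.set(id, cb);
      send({ num, event: id });
    },
    handleReply(msg) {
      const cb = pending.get(msg.event);
      if (!cb) return; // unknown id, or a reply that was already delivered
      pending.delete(msg.event);
      cb(msg.value);
    },
  };
}

// Usage with a fake worker that doubles the number (stands in for the child process).
const correlator = createCorrelator((msg) => {
  setImmediate(() => correlator.handleReply({ event: msg.event, value: msg.num * 2 }));
});
correlator.request(21, (value) => console.log(value)); // prints 42
```

With a real child process, send would be child.send and handleReply would be called from child.on("message", ...).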
The next step is to reduce the amount of computation fibonacci_runner does. One way of doing this is to use a batch queue and/or a cache (note: there are other methods of doing this too).
In this article, I am going to discuss the batch queue alone.
You can check out these articles to learn more about caching:
https://community.risingstack.com/redis-node-js-introduction-to-caching/amp/
https://goenning.net/2016/02/10/simple-server-side-cache-for-expressjs/
Method 2: Batching queue
When dealing with asynchronous operations, the most basic level of caching can be achieved by batching together a set of invocations to the same API. The idea is very simple: if I am invoking an asynchronous function while there is still another one pending, we can attach the callback to the already running operation, instead of creating a brand new request. (From “Nodejs Design Patterns”)
From the definition above, we want to batch requests with the same req.headers.fibo value together, instead of starting a new Fibonacci call while one with the same req.headers.fibo value is still pending.
I am still going to use fibonacci_runner.js to run the Fibonacci operation, but I’m going to create a new file, server_method2.js, that has an asyncBatching function that sits between fibonacci_runner.js and the call to process the req.headers.fibo value.
Here is the code for server_method2.js:
const http = require("http");
const {fork} = require('child_process');
const child = fork(`${__dirname}/fibonacci_runner.js`);
let Queue = {}; //maps each pending number to the callbacks waiting for its result
function asyncBatching(num, cb) {
    if (Queue[num]) {
        Queue[num].push(cb); //already being computed: just wait for that result
    } else {
        Queue[num] = [cb]; //first request for this number: start a queue
        child.send({num: num, event: num}); //and ask the child to compute it
    }
}
const server = http.createServer(function (req, res) {
    if (req.url === '/fibo') {
        const num = parseInt(req.headers.fibo, 10);
        asyncBatching(num, (value) => res.end(`${value}`));
    } else {
        res.end('hello world');
    }
});
child.on("message", (msg) => {
    "use strict";
    let queue = [...Queue[msg.event]];
    Queue[msg.event] = null; //empty the Queue
    queue.forEach(cb => cb(msg.value)); //answer every request that was waiting
    console.log(`done with ${msg.event}`);
});
server.listen(8000, () => console.log("running on port 8000"));
I will use the Apache benchmark tool (ab) to run this test:
$ ab -n 10 -c 10 -H 'fibo: 39' http://localhost:8000/fibo
It takes 3.196 seconds on my machine for method2 and 32.161 seconds for method1. This means method2 responds about n times faster than method1, where n is the number of concurrent users sending the same req.headers.fibo value (10 in this test).
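The reason for that speed-up is that ten identical requests end up triggering a single computation. The sketch below shows this in isolation, under stated assumptions: createBatcher is a hypothetical helper, and slowSquare is a fake slow worker standing in for fibonacci_runner.js so the example runs without a child process.

```javascript
// Wraps an async `compute(key, done)` so that concurrent calls with the same key
// share one computation.
function createBatcher(compute) {
  const queues = new Map(); // key -> callbacks waiting for that key
  return function batched(key, cb) {
    if (queues.has(key)) {
      queues.get(key).push(cb); // a computation is already pending, just wait for it
      return;
    }
    queues.set(key, [cb]);
    compute(key, (value) => {
      const callbacks = queues.get(key);
      queues.delete(key); // later calls start a fresh computation
      callbacks.forEach((fn) => fn(value));
    });
  };
}

// Fake slow worker that counts how often it is really invoked.
let computations = 0;
const slowSquare = (n, done) => {
  computations += 1;
  setTimeout(() => done(n * n), 100);
};

const batchedSquare = createBatcher(slowSquare);
let answered = 0;
for (let i = 0; i < 10; i++) {
  batchedSquare(7, (value) => {
    answered += 1;
    if (answered === 10) {
      console.log(`value ${value}, ${answered} callers, ${computations} computation(s)`);
    }
  });
}
```

Nothing is remembered once a batch completes, so this is batching and not caching: a request that arrives after the result was delivered starts a new computation.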
To improve method2 further we can use a cache to save the value of the Fibonacci, but I am not going to touch caching in this article :(.
What I am going to do here is improve on method2 by increasing the number of child processes. I am going to use a pool that manages the distribution of work among the child processes.
Method 3: Pooling and managing multiple processes
Creating multiple child processes to handle the Fibonacci operation lets the server answer several different requests at the same time. Keep in mind that every process uses system resources. Creating too many processes is bad; just create enough.
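What "enough" means depends on the machine. One common starting point, offered here only as a rule of thumb and not as a recommendation from the original text, is to size the pool from the number of CPU cores reported by the core os module. The function choosePoolSize and its reserved parameter are hypothetical.

```javascript
const os = require("os");

// Rule of thumb (an assumption, tune it for your workload): one worker per CPU core,
// keeping `reserved` cores free for the server process, but never fewer than one worker.
function choosePoolSize(cpuCount, reserved = 1) {
  return Math.max(1, cpuCount - reserved);
}

console.log(`suggested pool size: ${choosePoolSize(os.cpus().length)}`);
```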
The Pool is responsible for handling child processes. First, let’s create a Pool file, Pool.js, that exports a Pool class.
Code for the Pool.js file:
const child = require('child_process');
class Pool {
    constructor(file, maxPool, messageCb) {
        this.pool = []; //idle workers
        this.active = []; //workers that are currently busy
        this.waiting = []; //messages waiting for a free worker
        this.maxPool = maxPool;
        let releaseWorker = (function (worker) {
            //move the worker back to the pool array
            this.active = this.active.filter(w => worker !== w);
            this.pool.push(worker);
            //if there is work to be done, assign it
            if (this.waiting.length > 0) {
                this.assignWork(this.waiting.shift());
            }
        }).bind(this);
        for (let i = 0; i < maxPool; i++) {
            let worker = child.fork(file);
            worker.on("message", (...param) => {
                messageCb(...param);
                releaseWorker(worker);
            });
            this.pool.push(worker);
        }
    }
    assignWork(msg) {
        if (this.pool.length > 0) {
            //an idle worker is available, give it the work
            let worker = this.pool.pop();
            worker.send(msg);
            this.active.push(worker);
        } else {
            //every worker is busy, so queue the work until one is released
            this.waiting.push(msg);
        }
    }
}
module.exports = Pool;
The Pool class
As said before, the Pool is responsible for handling the child processes. It has only one method, the assignWork method. The assignWork method assigns work to a worker (child process) to handle. If all the workers are busy, the work is done as soon as one is free.
The Pool object takes three arguments on creation:
- the file to run as the child process
- the number of processes to create
- the function to call when the workers send a message back
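The queueing behaviour described above can be tried out without forking anything. The following is a simplified model and not Pool.js itself: TinyPool and makeWorker are invented names, each "worker" is assumed to be a plain function run(job, done) instead of a child process, and results are delivered per job through a callback.

```javascript
// Simplified model of a worker pool with a waiting list.
class TinyPool {
  constructor(workers) {
    this.idle = [...workers]; // workers ready to take a job
    this.waiting = [];        // jobs that arrived while every worker was busy
  }

  assignWork(job, onResult) {
    if (this.idle.length === 0) {
      this.waiting.push([job, onResult]);
      return;
    }
    const worker = this.idle.pop();
    worker(job, (result) => {
      this.idle.push(worker); // release the worker first
      if (this.waiting.length > 0) {
        this.assignWork(...this.waiting.shift()); // then hand it the next job
      }
      onResult(result);
    });
  }
}

// Two fake workers that each take 50 ms per job.
const makeWorker = (name) => (job, done) =>
  setTimeout(() => done(`${name} finished job ${job}`), 50);

const pool = new TinyPool([makeWorker("w1"), makeWorker("w2")]);
[1, 2, 3].forEach((job) => pool.assignWork(job, (result) => console.log(result)));
```

With two workers and three jobs, the third job only starts once one of the first two has finished.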
Now let’s create the server_method3.js file, which makes use of the Pool object.
The code for server_method3.js:
const http = require("http");
let Queue = {};
const Pool = require("./Pool");
let Pooler = new Pool(`${__dirname}/fibonacci_runner.js`, 2, (msg) => {
    "use strict";
    let queue = [...Queue[msg.event]];
    Queue[msg.event] = null; //empty the Queue
    queue.forEach(cb => cb(msg.value));
    console.log(`done with ${msg.event}`);
});
//responsible for batching
function asyncBatching(num, cb) {
    if (Queue[num]) {
        Queue[num].push(cb);
    } else {
        Queue[num] = [cb];
        Pooler.assignWork({num: num, event: num});
    }
}
const server = http.createServer(function (req, res) {
    if (req.url === '/fibo') {
        const num = parseInt(req.headers.fibo, 10);
        asyncBatching(num, (value) => res.end(`${value}`));
    } else {
        res.end('hello world');
    }
});
server.listen(8000, () => console.log("running on port 8000"));
server_method3.js runs more than one child process, so we can run multiple Fibonacci operations at the same time, instead of waiting for one to finish.
The number of Fibonacci operations we can run at the same time depends on the number passed as the second parameter to the Pool constructor.
Note: limit the number of processes you spawn.
Conclusion
Running heavy tasks on the node event loop is a bad idea. Pass the task to another process to handle, be it Nodejs or not (you can start a C++ program to handle very heavy operations).
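Handing work to a program that is not a forked Nodejs script can be sketched with execFile from the core child_process module. Two assumptions are baked into this example: fiboInSeparateProcess is a hypothetical helper, and the node binary itself (process.execPath) stands in for an external executable, such as a compiled C++ program, that prints its result to stdout.

```javascript
const { execFile } = require("child_process");

// Runs the computation in a separate program and reads the result from its stdout.
function fiboInSeparateProcess(n, cb) {
  if (!Number.isInteger(n) || n < 0) {
    return cb(new TypeError("n must be a non-negative integer"));
  }
  // Stand-in for the external program: a one-line script run by a second node process.
  const script =
    `const f = (k) => (k < 2 ? 1 : f(k - 2) + f(k - 1)); console.log(f(${n}));`;
  execFile(process.execPath, ["-e", script], (err, stdout) => {
    if (err) return cb(err);
    cb(null, Number(stdout.trim()));
  });
}

fiboInSeparateProcess(30, (err, value) => {
  if (err) throw err;
  console.log(`fibo(30) = ${value}`);
});
```

Starting a new process per request is costly, so in practice this would be combined with the batching and pooling ideas from the earlier methods.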
Remember to always keep the event loop from getting blocked by any operation.
Read this article for more about the Event Loop.
Badewa Kayode, peace out :).