Quite often we hear debates about which is better to use: Node.js or Python.
But what if one day you come up with a great idea that combines the two, for instance using some Python libraries in your application, and you have no idea how to integrate them with your Node.js code? Of course, you can always build an API on top of a Python backend (Flask, etc.), but then you have to build, host, and manage one more application when all you really need is to run a single Python script. That's why I want to give you step-by-step instructions on how to achieve this.
To begin with, let's briefly define both languages.
Node.js is a server-side platform built on Google Chrome's V8 JavaScript engine. It is well suited to developing data-intensive, real-time web applications in JavaScript, and it runs on various operating systems such as Windows, Linux, and macOS.
Python is an object-oriented programming language used to create dynamic web applications. Its clean syntax and dynamic typing make it an ideal language for scripting.
Why you might need to run Python scripts from Node.js
Node.js struggles with heavy CPU-bound work because it runs JavaScript on a single thread. While a CPU-bound request is being processed, the event loop cannot pick up other requests, so every client behind it has to wait, which can result in significant delays.
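To make this concrete, here is a minimal illustration (not part of the example app built later in this article) of how a single CPU-bound handler stalls every other request in a Node.js process:

const express = require('express');
const app = express();

app.get('/slow', (req, res) => {
  let sum = 0;
  // Synchronous, CPU-heavy loop: roughly a second or two of pure CPU work
  // (varies by machine). No other request is served until it finishes.
  for (let i = 0; i < 1e9; i += 1) sum += i;
  res.send(`done: ${sum}`);
});

// While /slow is busy, even this trivial route cannot respond.
app.get('/fast', (req, res) => res.send('instant'));

app.listen(3000);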
Python is much better suited to back-end workloads that involve numerical computation, big data, and machine learning.
Python also has one of the largest ecosystems of any programming community and a huge number of open-source libraries, including well-known machine-learning libraries such as Pylearn2 and TensorFlow for neural networks, and scikit-learn for data analysis.
Call your Python script from Node.js using the child_process module
You will find several libraries that can help with this; personally, I prefer the built-in child_process module. The node:child_process module provides the ability to spawn subprocesses, primarily through the child_process.spawn() function and its synchronous counterpart child_process.spawnSync().
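Before wiring it into a server, here is the smallest possible sketch of spawnSync calling Python and reading its output (the inline -c one-liner is just a placeholder):

const { spawnSync } = require('child_process');

// Run a one-line Python program and capture its stdout.
const proc = spawnSync('python3', ['-c', 'print(21 * 2)'], { encoding: 'utf-8' });

if (proc.error) {
  // e.g. python3 is not installed or not on PATH
  console.error('Failed to start Python:', proc.error);
} else {
  console.log('stdout:', proc.stdout.trim()); // "42"
  console.log('exit code:', proc.status);     // 0 on success
}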
First, we need to create a server.js file as the server for our app, where we assume that X and y are large arrays.
const express = require('express');
const { spawnSync } = require('child_process');
const { readFile, writeFile } = require('fs/promises');
const { join } = require('path');

const app = express();

app.get('/', async (req, res) => {
  const X = [1, 2, 5];                // large array in a real app
  const y = [[1, 2], [2, 3], [1, 2]]; // large array in a real app

  // Write the arguments to a file so the Python script can read them.
  await writeFile(join('scripts', 'args.json'), JSON.stringify({ X, y }), {
    encoding: 'utf-8',
  });

  // spawnSync is synchronous, so no await is needed: it blocks until the
  // Python script has finished and then returns its result object.
  const pythonProcess = spawnSync('python3', [
    '/usr/src/app/scripts/python-script.py',
    'first_function',
    '/usr/src/app/scripts/args.json',
    '/usr/src/app/scripts/results.json',
  ]);

  const result = pythonProcess.stdout?.toString()?.trim();
  const error = pythonProcess.stderr?.toString()?.trim();
  const status = result === 'OK';

  if (status) {
    // The Python script wrote its output to results.json; read it back.
    const buffer = await readFile('/usr/src/app/scripts/results.json');
    const resultParsed = JSON.parse(buffer.toString());
    res.send(resultParsed.toString());
  } else {
    console.log(error);
    res.status(500).send(JSON.stringify({ status: 500, message: 'Server error' }));
  }
});

// Creates the server on port 8000; it can be accessed through localhost:8000
const port = 8000;
app.listen(port, () => console.log(`Server is listening on PORT ${port}`));
The first parameter passed to spawnSync is command <string> (the command to run; in this case python3, the Python interpreter required by your libraries), and the second is args <string[]> (a list of string arguments). You can also pass a third parameter, options; more details can be found in the Node.js documentation.
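For instance, a few commonly used options look like this (the particular values below are only an illustration, not something the example app requires):

// A few commonly used spawnSync options.
const pythonProcess = spawnSync('python3', ['scripts/python-script.py'], {
  cwd: '/usr/src/app',     // working directory for the child process
  timeout: 60_000,         // kill the script if it runs longer than 60 s
  encoding: 'utf-8',       // return stdout/stderr as strings instead of Buffers
  env: { ...process.env }, // environment variables visible to the script
});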
Let's look in more detail at the list I am passing as the second argument:
- '/usr/src/app/scripts/python-script.py' is the path of the script I need to execute.
- 'first_function' is the name of the function I want to run inside that file.
- '/usr/src/app/scripts/args.json' is the path to the file with the arguments for this function. You could pass those arguments directly on the command line using JSON.stringify() in Node.js (a sketch of that follows this list), but I find it better to write them to a file and read the file from the Python script: if the arguments are large arrays of data, passing them inline can cause errors.
- '/usr/src/app/scripts/results.json' is the path where I want the Python script to write its results when it finishes, so they can be read back in Node.js.
The passed arguments can be found in the sys.argv list in the Python script. sys.argv is a list containing the command-line arguments of a Python program: whenever we run a Python script from the command line, the interpreter stores the script name and all of the arguments in sys.argv. Let's see this in more detail with a simple example script.
import json
import sys


def first_function():
    # sys.argv[2] is the path to args.json written by the Node.js server
    with open(sys.argv[2]) as json_file:
        data = json.load(json_file)

    X = data["X"]
    y = data["y"]

    # Do your calculations based on X and y and put the result into
    # calculated_results; this list is just example data to return.
    calculated_results = [1, 2, 4, 3]

    # sys.argv[3] is the path to results.json that Node.js will read afterwards
    with open(sys.argv[3], "w") as outfile:
        outfile.write(json.dumps(calculated_results, indent=4))

    # "OK" on stdout tells the Node.js side that everything went well
    print("OK")


# sys.argv[1] selects which function of this script to run
if sys.argv[1] == 'first_function':
    first_function()

sys.stdout.flush()
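A quick note on error handling: the server above treats the literal OK on stdout as the success signal. An alternative, shown here only as a sketch rather than as what this article's app does, is to rely on the process exit code; on the Python side you would call sys.exit(1) or let an exception propagate on failure:

// Alternative success check: rely on the exit code instead of parsing stdout.
const proc = spawnSync('python3', [
  '/usr/src/app/scripts/python-script.py',
  'first_function',
  '/usr/src/app/scripts/args.json',
  '/usr/src/app/scripts/results.json',
]);

if (proc.status === 0) {
  // the script finished normally; results.json is ready to be read
} else {
  console.error(proc.stderr?.toString());
}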
In short: the Node.js script writes all required arguments to a file and runs spawnSync with the list of arguments; the Python script then reads those arguments from the file, performs all the calculations, and writes the results to another file. At that point the results can be read in Node.js from that file.
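Keep in mind that spawnSync blocks the Node.js event loop for the whole duration of the Python script, which brings back exactly the problem described at the start. If the script can take a while, the asynchronous spawn() from the same node:child_process module is usually a better fit. A minimal sketch, assuming the same file layout as above:

const { spawn } = require('child_process');

function runPythonScript() {
  return new Promise((resolve, reject) => {
    const proc = spawn('python3', [
      '/usr/src/app/scripts/python-script.py',
      'first_function',
      '/usr/src/app/scripts/args.json',
      '/usr/src/app/scripts/results.json',
    ]);

    let stdout = '';
    let stderr = '';
    proc.stdout.on('data', (chunk) => (stdout += chunk));
    proc.stderr.on('data', (chunk) => (stderr += chunk));

    proc.on('error', reject); // python3 could not be started at all
    proc.on('close', (code) => {
      if (code === 0 && stdout.trim() === 'OK') resolve();
      else reject(new Error(stderr || `Python exited with code ${code}`));
    });
  });
}

// Usage inside the Express handler: await runPythonScript(), then read results.json.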
Run a Python script that uses specific Python libraries
One of the main goals is to be able to use AI or other third-party libraries. Installing dependencies and libraries globally on a system is usually a bad idea: it can lead to version conflicts, deployment difficulties, and problems when multiple developers work on the same project. That's why the cleanest way is to use Docker. You can build the image yourself, but the easiest solution is to use an existing one that ships both Node.js and Python.
# Development stage: install dependencies and build the app
FROM nikolaik/python-nodejs:latest as development
WORKDIR /usr/src/app

# Python libraries needed by the script
RUN pip install scikit-learn numpy pandas simplejson

COPY ./package.json ./yarn.lock ./
RUN yarn install

COPY ./ ./
RUN yarn build

ENV PORT 4000
EXPOSE $PORT

# Production stage: copy the built app from the development stage
FROM nikolaik/python-nodejs:latest as production
RUN pip install scikit-learn numpy pandas simplejson

ARG NODE_ENV=production
ENV NODE_ENV=${NODE_ENV}

WORKDIR /usr/src/app
COPY --from=development /usr/src/app ./

# Adjust PORT/EXPOSE to match the port your server actually listens on
ENV PORT 4000
EXPOSE $PORT

CMD ["node", "dist/main"]
Now the scikit-learn, numpy, pandas, and simplejson libraries are available to the Python script, as they are installed in the created Docker container.
By following the steps above, you can spawn a new process and execute a Python script from your Node.js application. This is useful when you need to integrate Python code into a Node.js application, or when certain tasks are simply better suited to Python. The simple app described here can be downloaded from GitHub, where you will also find instructions on how to run it.
Happy coding, and have a nice day!