Hey,
in most cases it's OK to read a file into memory,
but as files get bigger, the memory usage of the process grows with them.
Assume we are working with a server and we receive 50 requests simultaneously, each one 20 MB in size:
the memory usage jumps to 20 * 50 = 1000 MB.
At that point the process runs out of memory and the application crashes.
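To make that concrete, here is a rough sketch of the memory-heavy approach (the file name and port are made up for the example): every concurrent request buffers the whole file in memory before responding.

const http = require("http")
const fs = require("fs")

// hypothetical server: each request reads the entire ~20 MB file into memory
http.createServer((req, res) => {
  const file = fs.readFileSync("bigfile.txt") // whole file buffered per request
  res.end(file)
}).listen(3000)

// 50 simultaneous requests -> roughly 50 copies of the file in memory at once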
In such cases we should work with streams.
A stream lets us consume the file in pieces (chunks),
just like iterating through an array.
Streams are based on events,
and we have the following events:
// 'data' hands us the current piece of the file
source.on('data', function (chunk) {
});
// 'end' fires after the whole file has passed through the 'data' event
source.on('end', function () {
});
// 'error' fires in case of an error
source.on('error', function (err) {
});
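As a side note, readable streams are also async iterable (Node 10+), so the "iterate through an array" analogy can be taken literally; a minimal sketch, assuming the same bigfile.txt:

const fs = require("fs")

async function readInChunks() {
  const source = fs.createReadStream("bigfile.txt", "utf8")
  // each iteration hands us the next chunk, just like looping over an array
  for await (const chunk of source) {
    console.log(chunk.length)
  }
  console.log("end")
}

readInChunks().catch((err) => console.log("error " + err))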
To set up a stream we use the familiar fs module:
const fs = require("fs")
const read = fs.createReadStream("bigfile.txt")
const write = fs.createWriteStream("bigfile2.txt")
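Both factory functions also accept an options object instead of a plain encoding string; a small sketch, assuming you want to control the chunk size via highWaterMark (the values here are arbitrary):

const fs = require("fs")

// encoding makes 'data' hand you strings instead of Buffers,
// highWaterMark sets the chunk size in bytes (default is 64 KB for file streams)
const read = fs.createReadStream("bigfile.txt", { encoding: "utf8", highWaterMark: 64 * 1024 })
const write = fs.createWriteStream("bigfile2.txt")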
Here is an example of the whole process.
For this example, the size of bigfile.txt is 50 MB.
// this loads the whole 50 MB into memory at once
const file = fs.readFileSync("bigfile.txt", "utf8")
// this reads only one piece of the file at any given time
const source = fs.createReadStream("bigfile.txt", "utf8")
Now that the stream variable is set, we can start receiving the file data:
source.on('data', function (chunk) {
console.log(chunk)
});
source.on('end', function () {
console.log("end");
});
source.on('error', function (err) {
console.log("error" + err);//cant find file or something like that
});
With this approach you can read a file even if the request / file size is 5 GB,
and memory usage barely grows, because only one chunk is held in memory at a time.
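If you want to see the difference yourself, here is a rough sketch using process.memoryUsage() (run each variant separately for a fair comparison; the numbers will vary):

const fs = require("fs")

const toMb = (bytes) => (bytes / 1024 / 1024).toFixed(1) + " MB"
console.log("baseline rss:", toMb(process.memoryUsage().rss))

// variant 1: whole-file read (uncomment this and comment out the stream below to compare)
// const file = fs.readFileSync("bigfile.txt", "utf8")
// console.log("after readFileSync:", toMb(process.memoryUsage().rss))

// variant 2: streaming read, memory should stay close to the baseline
const source = fs.createReadStream("bigfile.txt", "utf8")
source.on("data", () => { /* consume chunks without keeping them */ })
source.on("end", () => console.log("after streaming:", toMb(process.memoryUsage().rss)))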
Writing into a file is pretty much the same:
const destination = fs.createWriteStream("bigfile2.txt")
// write one piece of data
destination.write(chunk)
// and at the end we close the stream
destination.end()
// a write stream has 'finish' and 'error' events, just like the example above
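Put together as a runnable sketch (the chunks here are just hypothetical strings; in a real program they would come from some source):

const fs = require("fs")

const destination = fs.createWriteStream("bigfile2.txt")

destination.on("finish", () => console.log("all data flushed to bigfile2.txt"))
destination.on("error", (err) => console.log("error " + err))

destination.write("first piece of data\n")
destination.write("second piece of data\n")
destination.end() // no more writes; 'finish' fires once everything is flushed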
Now let's combine the read and the write:
const source = fs.createReadStream("bigfile.txt", "utf8")
const destination = fs.createWriteStream("bigfile2.txt")
source.on('data', function (chunk) {
//write into the new file one piece at a time
destination.write(chunk)
});
source.on('end', function () {
//after we have read the whole file piece by piece, we close the write stream
destination.end()
});
destination.on("finish", () => {
//destination.end() triggers the 'finish' event once all the data has been flushed
console.log("done writing into bigfile2.txt")
})
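As a side note, a readable stream also has a pipe() method that wires up the same data/end forwarding for you (and handles backpressure, i.e. it pauses the source when the destination can't keep up); a minimal sketch:

const fs = require("fs")

const source = fs.createReadStream("bigfile.txt", "utf8")
const destination = fs.createWriteStream("bigfile2.txt")

// pipe() forwards every chunk and ends the destination when the source ends
source.pipe(destination)
destination.on("finish", () => console.log("done writing into bigfile2.txt"))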
Now that we know how to work with files,
we can apply streams to other operations.
Let's say we want to read a file, compress the data, and write the compressed data into a new file.
For that we use the zlib library together with pipeline from the stream module.
pipeline takes a readable stream on one side,
passes the data through one or more transform streams (a kind of middleware) in the middle, and pipes their output into the destination stream.
So in this example we will read a file,
compress it, and write it into a new file:
const fs = require('fs');
const { pipeline } = require('stream');
const { createGzip } = require('zlib');
const gzip = createGzip();
const source = fs.createReadStream("bigfile.txt")
// gzip produces a .gz file, not a .zip archive
const destination = fs.createWriteStream("bigfile3.txt.gz")
pipeline(source, gzip, destination, (err) => {
if (err) {
console.error('An error occurred:', err);
}
});
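And if you ever need the original file back, the same pipeline works in reverse with createGunzip (the restored file name is just an example):

const fs = require('fs');
const { pipeline } = require('stream');
const { createGunzip } = require('zlib');

// decompress bigfile3.txt.gz back into a plain text file
pipeline(
  fs.createReadStream("bigfile3.txt.gz"),
  createGunzip(),
  fs.createWriteStream("bigfile-restored.txt"),
  (err) => {
    if (err) console.error('An error occurred:', err);
    else console.log('done decompressing');
  }
);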
That's it!