
Mario

Originally published at mariokandut.com

How to handle binary data in Node.js?

Handling binary data is an essential part of server-side programming and a must-have skill for every developer working with Node.js. In Node.js, binary data is handled with the Buffer constructor. Let's have a look at the anatomy of a Buffer instance.

The Buffer instance in Node.js

The Buffer constructor is a global, so no import is needed to use it.

Type node -p "Buffer" in your terminal and have a look at the output.

[Function: Buffer] {
  poolSize: 8192,
  from: [Function: from],
  of: [Function: of],
  alloc: [Function: alloc],
  allocUnsafe: [Function: allocUnsafe],
  allocUnsafeSlow: [Function: allocUnsafeSlow],
  isBuffer: [Function: isBuffer],
  compare: [Function: compare],
  isEncoding: [Function: isEncoding],
  concat: [Function: concat],
  byteLength: [Function: byteLength],
  [Symbol(kIsEncodingSymbol)]: [Function: isEncoding]
}

The Buffer constructor was introduced into Node.js when JavaScript did not have a native binary type. JavaScript has since evolved: ArrayBuffer was added to the language, together with typed arrays that provide different views of a buffer.

For instance, an ArrayBuffer instance can be accessed through a Float64Array, where each set of 8 bytes is interpreted as a 64-bit floating-point number. Have a look at the MDN article JavaScript Typed Arrays. When these new data structures were added, the Buffer constructor internals were refactored on top of the Uint8Array typed array. This means a buffer object is both an instance of Buffer and an instance of Uint8Array.

Let's open the REPL and double-check this.

# enter REPL
node

// Allocates a new zero-filled Buffer of 10 bytes.
const buffer = Buffer.alloc(10)

buffer instanceof Buffer
// returns true

buffer instanceof Uint8Array
// returns true

Important: The method Buffer.prototype.slice overrides the Uint8Array.prototype.slice method. The Uint8Array method returns a copy of the selected bytes, whereas the Buffer method returns a Buffer instance that references the same binary data, so a change made through one is visible in the other.
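The difference between the two slice methods is easiest to see by writing to the result. The following is a minimal sketch; the variable names are only illustrative.

```javascript
// A plain Uint8Array: slice() returns an independent copy.
const bytes = new Uint8Array([1, 2, 3, 4]);
const copied = bytes.slice(0, 2);
copied[0] = 255;
console.log(bytes[0]); // 1 (the original is untouched)

// A Buffer: slice() returns a view onto the same memory.
const buf = Buffer.from([1, 2, 3, 4]);
const shared = buf.slice(0, 2);
shared[0] = 255;
console.log(buf[0]); // 255 (the original changed as well)
console.log(buf);    // <Buffer ff 02 03 04>
```

In recent Node.js versions buf.slice is marked as deprecated in the documentation in favor of buf.subarray, which returns the same kind of shared view on both Buffer and Uint8Array.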

Allocating Buffers

Usually a constructor is called with the new keyword, but with the Buffer constructor this is deprecated. The correct and safe way to allocate a buffer of a given number of bytes is to use Buffer.alloc, like:

const buffer = Buffer.alloc(10);

The Buffer.alloc function produces a zero-filled buffer by default. Let's evaluate it with node -p to see the output directly.

node -p "Buffer.alloc(10)"
## the output should be <Buffer 00 00 00 00 00 00 00 00 00 00>

When a buffer is printed to the terminal, the values shown in <Buffer ...> are the bytes in hexadecimal. For example, a single-byte buffer with a decimal value of 100 is 1100100 in binary and 64 in hexadecimal, so the output would be <Buffer 64>.
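To check the example with the value 100, a short sketch like this can be run as a file or pasted into the REPL (the variable name is just for illustration):

```javascript
// A single-byte buffer holding the decimal value 100.
const singleByte = Buffer.from([100]);

console.log(singleByte);         // <Buffer 64>
console.log((100).toString(2));  // 1100100
console.log((100).toString(16)); // 64
console.log(singleByte[0]);      // 100 (indexing returns the decimal value)
```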

There is also an unsafe way to allocate buffers.

const buffer = Buffer.allocUnsafe(10);

Any time a buffer is created, it is allocated (or assigned) from unallocated (or unassigned) memory. Unassigned memory is only unlinked; it is never wiped. This means that unless the buffer is overwritten (for example, zero-filled), it can contain fragments of previously deleted data, which poses a security risk. The allocUnsafe method is meant only for advanced use cases, like performance optimization. If you have to create a buffer, use the safe method Buffer.alloc.
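If allocUnsafe is used anyway, every byte should be overwritten before the buffer is read or handed to other code. Here is a minimal sketch, assuming a zero-fill is an acceptable way to do that; the size and variable names are illustrative.

```javascript
const size = 10;

// The contents of this buffer are arbitrary until they are overwritten.
const unsafeBuffer = Buffer.allocUnsafe(size);

// Overwrite every byte with zero before using the buffer.
unsafeBuffer.fill(0);

// After the fill, the buffer matches what Buffer.alloc would have produced.
const safeBuffer = Buffer.alloc(size);
console.log(unsafeBuffer.equals(safeBuffer)); // true
```

Filling the buffer manually gives up most of the speed advantage, so this only pays off when the code is about to overwrite the whole buffer with real data anyway.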

Converting Strings to Buffers

The String primitive in JavaScript is a frequently used data structure.

A buffer can also be created from a string by using Buffer.from. The characters of the string are converted to byte values.

const buffer = Buffer.from('Hello World');

Let's dynamically evaluate this.

node -p "Buffer.from('Hello World')"

The output is <Buffer 48 65 6c 6c 6f 20 57 6f 72 6c 64>.

To convert a string to a binary representation, an encoding has to be defined. The default encoding for Buffer.from is UTF-8. UTF-8 uses up to four bytes per character, so the string length will not always match the size of the converted buffer, especially when dealing with emojis.

node -p "'🔥'.length"
## will return 2

node -p "Buffer.from('🔥').length"
## will return 4

When the first argument passed to Buffer.from is a string, a second argument can be passed to set the encoding. Two types of encodings are available in this context: character encodings and binary-to-text encodings. UTF8 and UTF16LE are character encodings; hex and base64 are binary-to-text encodings. Different encodings result in different buffer sizes.
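A small sketch of the second argument in action, using the encoding names 'utf16le', 'base64' and 'hex' as Node.js accepts them (the sample strings are only for illustration):

```javascript
const text = 'Hello';

// Character encodings: the same string produces buffers of different sizes.
const utf8 = Buffer.from(text); // 'utf8' is the default
const utf16 = Buffer.from(text, 'utf16le');
console.log(utf8.length);  // 5
console.log(utf16.length); // 10

// Binary-to-text encodings: the string is decoded back into the raw bytes.
const fromBase64 = Buffer.from('SGVsbG8=', 'base64');
const fromHex = Buffer.from('48656c6c6f', 'hex');
console.log(fromBase64.toString()); // Hello
console.log(fromHex.toString());    // Hello
```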

Converting Buffers to Strings

To convert a buffer to a string, call the toString method on a Buffer instance. Let's try it out in the REPL, or create a file and run it with node.

const buffer = Buffer.from('hello world');
console.log(buffer); // prints <Buffer 68 65 6c 6c 6f 20 77 6f 72 6c 64>
console.log(buffer.toString()); // prints hello world

The toString method also accepts an encoding argument.

const buffer = Buffer.from('mario');
console.log(buffer); // prints <Buffer 6d 61 72 69 6f>
console.log(buffer.toString('hex')); // prints 6d6172696f
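The encoding argument also makes it possible to turn a buffer into a text-safe representation and back again. A minimal sketch of such a round trip, with illustrative variable names:

```javascript
const name = Buffer.from('mario');

// Encode the bytes as text.
const asBase64 = name.toString('base64');
const asHex = name.toString('hex');
console.log(asBase64); // bWFyaW8=
console.log(asHex);    // 6d6172696f

// Decode the text back into bytes, then into the original string.
const roundTrip = Buffer.from(asBase64, 'base64').toString('utf8');
console.log(roundTrip); // mario
```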

JSON Serializing and Deserializing Buffers

A very common format for serialization is JSON, especially when working with JavaScript-based applications. When JSON.stringify encounters an object, it attempts to call the toJSON method on that object, if it exists. Buffer instances have a toJSON method, which returns a plain JavaScript object.

node -p "Buffer.from('hello world').toJSON()"

Calling toJSON on the above Buffer instance returns an object that is equivalent to the following JSON.

{
  "type": "Buffer",
  "data": [104, 101, 108, 108, 111, 32, 119, 111, 114, 108, 100]
}

Buffer instances are represented in JSON by an object that has a type property with the string value Buffer and a data property with an array of numbers representing the value of each byte.
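Deserializing works in the opposite direction: the data array can be passed to Buffer.from to rebuild the buffer. The following is a minimal sketch of a full round trip, with variable names chosen only for illustration.

```javascript
const source = Buffer.from('hello world');

// JSON.stringify calls source.toJSON() internally.
const json = JSON.stringify(source);
console.log(json);
// {"type":"Buffer","data":[104,101,108,108,111,32,119,111,114,108,100]}

// JSON.parse returns a plain object, not a Buffer.
const parsed = JSON.parse(json);
console.log(Buffer.isBuffer(parsed)); // false

// Rebuild the Buffer from the array of byte values.
const restored = Buffer.from(parsed.data);
console.log(restored.toString()); // hello world
```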

TL;DR

  • If you have to create a buffer, use the safe method Buffer.alloc.
  • The Buffer.alloc function produces a zero-filled buffer by default.
  • Calling the Buffer constructor with the new keyword is deprecated; use Buffer.alloc or Buffer.from instead.
  • There is an unsafe way to allocate a buffer, Buffer.allocUnsafe. It poses a security risk, though there are some advanced use cases for it.

Thanks for reading! If you have any questions, use the comment function or send me a message @mariokandut.

If you want to know more about Node, have a look at these Node Tutorials.

References (and big thanks):

JSNAD - Using Buffers, Node.js - Buffer, MDN - JavaScript Typed Arrays
