DEV Community

Michael Monerau


Compute MD5 checksum hash for a File in TypeScript

When implementing a file uploader component in your webapp, you may need to compute the MD5 checksum of a file.

It is typically useful when your frontend uploads a file to some cloud storage and needs to make your backend aware of the file that was just uploaded. Armed with the MD5 hash of the file, the backend can then validate the integrity of the file when accessing it later on.

At least, that's the way it works in Ruby on Rails & Active Storage.
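
For context, here is a minimal sketch of the information the frontend might hand to the backend in that flow. The field names mirror the blob attributes used by Active Storage's direct uploads (`filename`, `content_type`, `byte_size`, `checksum`); the helper name `buildDirectUploadPayload` and the sample values are made up for illustration, and your backend may expect something different.

```typescript
// Blob attributes sent to the backend so it can later verify the upload.
interface DirectUploadBlob {
  filename: string;
  content_type: string;
  byte_size: number;
  checksum: string; // base64-encoded MD5 digest of the file
}

// Hypothetical helper: it only reads name, type and size, so a browser
// `File` object satisfies the parameter type.
function buildDirectUploadPayload(
  file: { name: string; type: string; size: number },
  checksum: string
): { blob: DirectUploadBlob } {
  return {
    blob: {
      filename: file.name,
      // Fall back to a generic type when the browser reports none
      content_type: file.type || 'application/octet-stream',
      byte_size: file.size,
      checksum,
    },
  };
}

// Example with a stand-in for a real File object and a sample checksum
const payload = buildDirectUploadPayload(
  { name: 'report.pdf', type: 'application/pdf', size: 5242880 },
  'fPUwM1uFR5RfGkiIC8Qhsg=='
);
console.log(JSON.stringify(payload));
```

The checksum is the only field that is not readily available on the File object, which is what the rest of this post is about.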

Quite surprisingly though, there is no straightforward way to get the MD5 checksum of a File object in TypeScript / JavaScript.

Building on this Stack Overflow post, the great Spark-MD5 library and its test examples, here is a simple solution.

The spark-md5 package needs to be installed in your project:

yarn add spark-md5
# or npm install --save spark-md5

The following function then does the computation, returning a Promise that resolves to the MD5 hash as a base64-encoded string. It reads the file in chunks so that the whole file is never loaded into memory at once, which matters for large files.

import * as SparkMD5 from 'spark-md5';

// ...

function computeChecksumMd5(file: File): Promise<string> {
  return new Promise((resolve, reject) => {
    const chunkSize = 2097152; // Read in chunks of 2MB
    const spark = new SparkMD5.ArrayBuffer();
    const fileReader = new FileReader();

    let cursor = 0; // current cursor in file

    fileReader.onerror = function(): void {
      reject(new Error('MD5 computation failed - error reading the file'));
    };

    // read the chunk starting at `chunk_start` into memory
    function processChunk(chunk_start: number): void {
      const chunk_end = Math.min(file.size, chunk_start + chunkSize);
      fileReader.readAsArrayBuffer(file.slice(chunk_start, chunk_end));
    }

    // when it's available in memory, process it
    // Reading `fileReader.result` directly avoids typing the event as `any`.
    // The cast is safe here because we only ever call `readAsArrayBuffer`.
    // See https://github.com/Microsoft/TypeScript/issues/25510
    fileReader.onload = function(): void {
      spark.append(fileReader.result as ArrayBuffer); // Accumulate chunk to md5 computation
      cursor += chunkSize; // Move past this chunk

      if (cursor < file.size) {
        // Enqueue next chunk to be accumulated
        processChunk(cursor);
      } else {
        // Computation ended, last chunk has been processed. Return as Promise value.
        // This returns the base64 encoded md5 hash, which is what
        // Rails ActiveStorage or cloud services expect
        resolve(btoa(spark.end(true)));

        // If you prefer the hexdigest form (looking like
        // '7cf530335b8547945f1a48880bc421b2'), replace the above line with:
        // resolve(spark.end());
      }
    };

    processChunk(0);
  });
}
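
To see where the chunk boundaries fall, here is a small sketch that lists the byte ranges such a chunked read would slice. `chunkRanges` is a hypothetical helper written for illustration only; it is not part of spark-md5 or of the function above.

```typescript
// Hypothetical helper: list the [start, end) byte ranges read for a file
// of `fileSize` bytes when slicing in chunks of `chunkSize` bytes.
function chunkRanges(fileSize: number, chunkSize: number): Array<[number, number]> {
  const ranges: Array<[number, number]> = [];
  for (let start = 0; start < fileSize; start += chunkSize) {
    // The last chunk is clamped to the end of the file
    ranges.push([start, Math.min(fileSize, start + chunkSize)]);
  }
  return ranges;
}

// A 5 MB file read in 2 MB chunks: two full chunks and a 1 MB remainder,
// i.e. [0, 2097152), [2097152, 4194304), [4194304, 5242880)
console.log(chunkRanges(5 * 1024 * 1024, 2097152));
```

One difference worth noting: for an empty file this helper returns no ranges, whereas the function above still performs a single read of an empty slice before resolving.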

Now, profit:

// your_file_object: File
// ...
computeChecksumMd5(your_file_object).then(
  md5 => console.log(`The MD5 hash is: ${md5}`)
);
// Output: The MD5 hash is: fPUwM1uFR5RfGkiIC8Qhsg==
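
The base64 and hexdigest forms encode the same 16 bytes, so one can be converted into the other. The following sketch shows the conversion in both directions; `hexToBase64` and `base64ToHex` are hypothetical helper names, not spark-md5 functions, and the code assumes an environment where `btoa` and `atob` are available (as in browsers).

```typescript
// Convert a hex digest (the form returned by `spark.end()`) to base64.
function hexToBase64(hex: string): string {
  let binary = '';
  for (let i = 0; i < hex.length; i += 2) {
    // Each pair of hex digits is one byte of the digest
    binary += String.fromCharCode(parseInt(hex.slice(i, i + 2), 16));
  }
  return btoa(binary);
}

// Convert a base64 digest back to its lowercase hex form.
function base64ToHex(b64: string): string {
  const binary = atob(b64);
  let hex = '';
  for (let i = 0; i < binary.length; i++) {
    // Left-pad each byte to two hex digits
    hex += ('0' + binary.charCodeAt(i).toString(16)).slice(-2);
  }
  return hex;
}

console.log(hexToBase64('7cf530335b8547945f1a48880bc421b2'));
// fPUwM1uFR5RfGkiIC8Qhsg==
console.log(base64ToHex('fPUwM1uFR5RfGkiIC8Qhsg=='));
// 7cf530335b8547945f1a48880bc421b2
```

This can be handy when one service reports the hexdigest while another expects base64.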

Top comments (1)

Kevin Mendez

Thanks for sharing.
You saved my life with this.