Edit 2: Lots of insightful comments at the bottom, do give them a read, too, before going with any alternatives!
Edit 1: Added a new take on 'Optim...
First off, efficiency in data formats like these only matters when you're manipulating a lot of data. If you're using a gigabyte of JSON then you're probably doing something wrong.
As far as your optimisations are concerned, I disagree on a few points.
This example:
isn't a way of making JSON more efficient, it's a way of changing your schema. If you want to store relationships, or have a collection of `order` objects, then you use the "inefficient" hierarchy; if you need to have keys for the items (not really relevant in this example) then you use a keyed object; otherwise you use an array. My point is that these are things you will change depending on your schema, and they have little bearing on efficiency and none on JSON in particular. Rather, use keys which are consistent with the rest of your code.
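To illustrate the two shapes being discussed, here's a small sketch (the order data and helper are hypothetical) showing that the keyed object and the array carry the same information, just under a different schema:

```javascript
// Keyed object: the key doubles as an identifier for each order.
const keyed = {
  "order-1001": { item: "keyboard", qty: 1 },
  "order-1002": { item: "mouse", qty: 2 }
};

// Array: each entry now needs its own explicit id field.
const asArray = [
  { id: "order-1001", item: "keyboard", qty: 1 },
  { id: "order-1002", item: "mouse", qty: 2 }
];

// Converting between the two is mechanical, which is why this is a
// schema choice rather than a JSON "optimization".
const toArray = (obj) =>
  Object.entries(obj).map(([id, rest]) => ({ id, ...rest }));

console.log(JSON.stringify(toArray(keyed)) === JSON.stringify(asArray)); // true
```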
Don't do this. When you're loading data, unless you're going through another parsing stage, these abbreviations are going to directly correspond with objects with confusing names. If you wouldn't write it as a variable name in your clear, self-documenting code, don't use it as a key in a data structure.
You talk about JSON being language agnostic, and efficiency in things like parsing numerical data, but really we know we're talking about the performance over the web. Your native application storing its configuration in JSON isn't going to notice any of these performance changes. This post is more like, "improving performance of sending human-readable data structures over the wire".
True Ben, your point makes sense. I was considering that for some use-cases, where data may need to be serialized or deserialized frequently, changing the schema may reduce verbosity.
But, yeah, your points are valid overall.
Thanks, you've spared me some minutes writing the exact same comment.
Indeed. This hurts readability and seems unnecessary most of the time
Remember that premature optimization is the root of all evil
I think that's unfair - there's nothing particularly wrong with the article if it's pitched as saving bandwidth, and the author has clearly made an effort to make it readable and interesting.
saving bandwidth in 2023 =P is not 100% shitty tho
"Language Agnostic"!? It literally has JavaScript in its acronym... I think you mean it's supported by several languages. Almost all languages have a JSON converter. But it's not agnostic... let's take numbers for example: numbers in JS are just numbers, regardless of whether or not they have decimals... however, it's the converter's job to transform them into int/float/decimal or whatever numeric type the other language works with.
Numbers in JSON are symbolic representations of numeric data, whose type is not imposed by the format. It is up to the implementation to interpret, for example, numbers without a decimal separator as integers, others as double-precision 64-bit numbers just like JS, or as single-precision 32-bit numbers if the mantissa is short enough.
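A small JavaScript sketch of that point: the same JSON number text can be interpreted differently by each implementation, and JS's double-precision interpretation visibly loses integer precision past 2^53:

```javascript
// JSON carries numbers as decimal text; the parser decides the type.
// In JavaScript every JSON number becomes a 64-bit float, so integers
// beyond Number.MAX_SAFE_INTEGER (2^53 - 1) silently lose precision.
const parsed = JSON.parse('{"id": 9007199254740993}');
console.log(parsed.id); // 9007199254740992 — off by one

// Other consumers can choose differently: a Java or Go decoder may map
// the same text to a 64-bit integer and keep the exact value, and JS
// itself can keep it exact by routing the digits through BigInt.
const exact = BigInt("9007199254740993");
console.log(exact === 9007199254740993n); // true
```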
"'Language Agnostic' !? It literally has JavaScript in its acronym..."
You can use JSON in a Golang program, yeah? So it's language agnostic.
This is an interesting collection of notes and options. Thanks for the article!
Can you provide links, especially for the "Real-World Optimizations" section? I would appreciate being able to learn more about the experiences of these different companies and situations.
In the "Optimizing JSON Performance" section the example suggests using compression within Javascript for performance improvement. This should generally be avoided in favor of HTTP compression at the connection level. HTTP supports the same zlib, gzip, and brotli compression options but with potentially much more efficient implementations.
While protocol buffers and other binary options undoubtedly provide performance and capabilities that JSON doesn't, I think it undersells how much HTTP compression and HTTP/2 matter.
I did some small work optimizing JSON structures a decade ago when working in eCommerce to offset transfer size and traversal costs. While there are still some benefits to using columnar data (object of arrays) over the usual "Collection" (array of objects), a number of the concerns identified, like verbose keys, are essentially eliminated by compression if they are used in repetition.
HTTP/2 also cuts down overhead costs for requests, making it more efficient to request JSON – or any format – in smaller pieces and accumulate them on the client for improved responsiveness.
There are some minor formatting issues, and it is lacking in sources, but it provides a great base of information and suggestions.
You're right about the HTTP compression, Samuel. I've added your perspective in that section. As for the resources referenced in the examples, I've added links in those sections.
On point 4.a (avoid repetitive data), you might actually need to use the "inefficient" way, as you will most likely need an "id" type of field.
In the "inefficient" way, the property name can serve as an "id", but in the "efficient" way, you would need a new "id" property.
Most of these tips seem to be micro-optimizations, as the majority of the slowness comes down to the frontend framework you use, as well as the many third-party packages.
Being able to improve performance is nice,
but as a dev I care the most about the developer experience:
And missing for me is real examples, with open source code on
All the features listed for each serialization format sound completely identical to me.
Since most of these tools come from server environments it's even questionable if you can
Sending JSON compressed in Brotli or gzip format is a very good solution. One thing, though: if you run an application in production, the safest setup is to run it behind a proxy server (Nginx, Apache, Cloudflare, ...) configured to serve all text responses (JSON, HTML, XML, CSS, JS, ...) in compressed form (brotli, gzip). It's better to let the proxy compress the responses rather than your application; otherwise you'll use up important application resources that it may need for other processes.
On the other hand, although it may seem like a closed way of thinking, I have never been inclined towards JSON Schema. It has its reason for being, adding validation of values, but it has a serious disadvantage: the data type (among other things) used to validate each value has to accompany every key, producing an extremely heavy JSON. In essence, it should only be used when a form's response is sent to an API, so that the server can also validate what it receives. A schema would be terribly inefficient for a list response (for example, a list of products from a catalog) because of the burden of unnecessary extra data.
Speaking of catalog products, a while ago I was in charge of an e-commerce site that, instead of saving the shopping cart in the application, temporarily saved it in the browser's localStorage. When the buyer adds a product to the cart, the SKU and the requested quantity (["SKU89038", 5]) are accumulated in an array which is then written to localStorage.
When placing the purchase order, what is sent from the customer to the application is a list of SKUs + quantities. No further information is needed at that point, making the process quite efficient. And in the other direction, when displaying a list of products, only exactly the data that is needed is sent and no more.
While designing how I would return responses for product lists, I explored the idea of returning a JSON containing a list of arrays with only the values. The first member of that list would be all the keys of each column (for the benefit of the human programming the front end), in order to avoid adding extra bytes to the response. In the end it was an over-optimization and we ruled it out, returning the JSON with its traditional key-value pairs.
Thanks for the article. I learnt some things :)
Auth0 wasn't as impressed by Protocol Buffers as I was expecting them to be.
I'm kind of shocked that CBOR and the excellent github.com/fxamacker/cbor library aren't covered in this. You get JSON interop without the wackiness of protobuf serialization, and can be equally compact to protobufs with the `keyasint` and `toarray` struct tags when needed. On top of that, you can still utilize JSON Schema fairly easily; for example, huma.rocks/ supports JSON/CBOR (via client-driven content negotiation) out of the box and uses OpenAPI 3.1 & JSON Schema for validation.
Hey THANKS for the inspiration. I always wanted to try a Mongodb Atlas Cluster but I never considered the notion that BSON is much faster than JSON. So I gave it a whirl and found that manipulating data in Mongo is 90% faster according to my testing.
When inserting a record into a JSON array there is no append method, so we need to fetch the entire array, add the record, and then re-save the array, which overwrites the original structure. In my app this takes 3000ms to add a record. Over in Mongo we have the insertOne() method, which only takes 200ms. My collection has 14k records, so inserting via JSON is not practical. But for much smaller use cases, like dashboards, using JavaScript array methods on JSON can be practical.
I use AWS Lambda. In ~7 lines I can insert a record into Mongo with the following:
import { MongoClient } from "mongodb"; //a very small library
const client = new MongoClient(process.env.MONGODB_CONNECTION_STRING);
//connection string saved as key/pair in env file
export const handler = async (event) => {
const db = client.db("test"); //client.db() is synchronous, no await needed
const collection = db.collection("tracker3");
const myobj = {
"xyz": "value",
"other key": "123"
};
//const body = await collection.find().toArray(); //gets entire collection
const result = await collection.insertOne(myobj); //this is the insert command
return result;
//return body; //toArray puts the collection in a JSON array..which can then be exported to csv
};
To do the same with a JSON array I have a bigger process, also with a much bigger import:
const AWS = require('aws-sdk'); //a very large library
const fetch = require('node-fetch');
const s3 = new AWS.S3();
exports.handler = async (event) => {
const res = await fetch('xxxxx.s3.us-east-2.amazonaws.com/t...');
const json = await res.json();
// add a new record using array method push
json.push({
country: event.country2,
session: event.ses,
page_name: event.date2,
hit: event.hit2,
ip: event.ip2,
time_in: event.time2,
time_out: event.time3,
event_name: event.city2
});
const params = {
Bucket: 'xxxxxx',
Key: 'tracker2.json',
Body: JSON.stringify(json), //pass fetch result into body with update from push
ContentType: 'application/json',
};
const s3Response = await s3.upload(params).promise();
return s3Response;
};
That's pretty time-saving, imo. Did you try any other more efficient options?
how about gRPC?
An efficient option for getting smaller payloads. Have you tried this one? Also, this is something I wrote some time back about gRPC: dev.to/nikl/building-production-gr...
Super interesting article, certainly something to think about!
This may also be of interest: dev.to/samchon/i-made-express-fast...
You have good details in here, but my biggest problem is that the title is clickbait. You're talking about transfer protocols and schema requirements. JSON is a display format; obviously the fastest way to transfer data is not as an unmodified string. Your alternatives, while maybe not exactly JSON, even look like JSON or use JSON to create the encodings. I think you're going to confuse new developers with the title. The last thing I'd want people to do is over-engineer simple problems and stop using JSON by simply internalizing "JSON is slow". Instead of sounding like an attack on JSON, I would have highlighted the reasons people augment JSON or use different tools beyond just speed.
"JSON is a display format". Hmm. Well, it can certainly be displayed, and it was designed to be human-readable, but... at its origin, JSON was meant to be a quick and effective way of passing around objects in JavaScript — in those very remote times when JavaScript was still thought of as merely a "cool" way to do some visual manipulation of HTML in the browser, and we only had interpreted JavaScript in any case...
I concede that it's a "data interchange format" and not merely for display. But my point is that this article doesn't seem to acknowledge that JSON is an essential part of building a web application and is not a bottleneck for 99% of applications. A bad database query will take far longer than the time to send and receive JSON. While that may seem obvious to you, it can cause unnecessary confusion for new developers. Honestly, when I first read this article I was only thinking about web development. But after reading it again, it doesn't mention web development specifically. If you're sending data from a native mobile app to a rust backend, then using protocol buffers is a good choice. However, if you're using JavaScript (the most popular programming language in the world) you can't even instantiate a protocol buffer without using JSON. I just wanted the title to be changed from something like "JSON is Slower" to "Sending JSON over a network is too slow for large scale applications". Clickbait titles and headings could proliferate protocol buffers into every app making the world worse with unneeded complexity. If you have a JS front end and a JS backend, prematurely optimizing network calls to remove JSON would be insanity. If the thought of JSON bothers you, don't even touch web development, your soul will be crushed. The web is horribly slow in so many ways, and it's certainly not from kilobytes of JSON, trust me. New developers need this nuance. I rest my case.
Very helpful notes... thanks ✌️
Appreciate the insights - lots to think about in here
When transferring large JSON data between server and client, I implement it like this (the example is not in a particular language; consider it an algorithm):
On the server side:
^ `queryResult` is not a scalar data type; its datatype is language specific.
^ `getFields()` returns the names of the fields in the query's result as an array, like the following:
fields = ['id', 'name', 'email_address']
^ `getRecordsOnlyInArray()` returns the rows as plain arrays, instead of objects, without field names. It will be an array of arrays, like the following:
data = [ [1, 'Nikunj', 'nikunj@example.com'],
[2, 'John', 'john@example.com'],
[3, 'Martin', 'martin@example.com'] ]
A JSON object is sent, after converting it to a string, with two properties: fields and data.
On the client side:
The string sent from the server is received, converted to objects/arrays according to JSON, and assigned to the variable `fieldsAndRows`.
As the variable name `fieldsAndRows` is longer than usual, and contains two separate properties for us to access, assign each property to two separate variables. (Here it is hoped that the new variables are just references to the two related properties of `fieldsAndRows`. If the language does not support such referencing and instead creates duplicate arrays for fields and rows, then use a shorter name, like `fr = stringToJson(getData())`, instead of `fieldsAndRows`.)
Now create an object having the fields' names as its property names, and assign the index number of each field to its respective property using a loop:
Now these can be used as below to access the value of a particular field of a particular row:
BTW, to reduce load on the server, most of the time it is better to convert/manipulate data on the client side.
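The scheme described above can be sketched in JavaScript (the payload shape and the sample data follow the comment; variable names are illustrative):

```javascript
// What the server would send, already parsed from the JSON string:
const fieldsAndRows = {
  fields: ["id", "name", "email_address"],
  data: [
    [1, "Nikunj", "nikunj@example.com"],
    [2, "John", "john@example.com"],
    [3, "Martin", "martin@example.com"]
  ]
};

// Short references to the two properties (arrays are referenced, not copied).
const fields = fieldsAndRows.fields;
const rows = fieldsAndRows.data;

// Build the field-name -> column-index map with a loop:
const col = {};
for (let i = 0; i < fields.length; i++) {
  col[fields[i]] = i;
}

// Access the value of a particular field of a particular row:
console.log(rows[1][col.name]); // "John"
console.log(rows[2][col.email_address]); // "martin@example.com"
```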
Informative article with great instructive comments. I will try Avro; right now I'm trying this: github.com/matteobertozzi/yajbe-da...
https://www.youtube.com/watch?v=ilbsLXa7uT8&t=110s
Hi Nik,
You mentioned 'JSON Schema', but the last time I looked there was no standardised approach to this. Has that changed? If so, could you please provide a reference?
Regards, Tracy
Hmm... I'm not sure about this. You're not making things more efficient as much as you're changing the schema and losing the relationships of the key/value pairs.
If you're experiencing inefficiencies due to JSON, you're probably in need of a database — or you're just not organizing your JSON files efficiently.
This is one of the best posts I have ever come across on this platform since I joined. The quality and the topic are both top notch. Excellent. I don't have the words to emphasize enough what an ideal post this is.
I use the following combo: GraphQL (or similar) + protobuf (or similar) + server-side cache (when possible) + client-side cache (with optimistic updates) + HTTP compression.
Any feedback about this stack?
There is no good support for protobuf in client-side JS. Compilers and libraries have a lot of bugs and interpret protobuf-to-JSON conversion in their own way, and if you want to switch libraries or switch back to JSON you have to change a lot of stuff all over the app 😔
It's also not compatible with most S3 implementations.
what is "S3" ?
Hi, can I repost your article on my website (uhtred.dev)!?
You will be mentioned as the author with link references.
Sure. Just use the canonical.
Wonderful 🤩!
I'll need some information.
The information I need is just to display in your author profile on my website. See my profile for an example (uhtred.dev/authors/@uhtred.dev).
I will use your public information here on dev.to. But you can reply here with the information or send me in away that is comfortable for you.
Will share over email. What's your email?
social@uhtred.dev
An alternative to JSON is HTML - htmx.org/
Amazing comparison, we also recently made a similar benchmark, but also compared http and grpc - packagemain.tech/p/protobuf-grpc-v...
Insightful.
Here is a web based all in one tool for developers, coderkit.dev.
I'm curious whether anyone has experience with the alternatives on the client side, I mean in the browser.