Abstract
This article presents a practical guide to using HTTP streaming for efficient data visualization in web applications.
I was inspired to write it by my experience working on AI projects that leverage the streaming support provided by the OpenAI API, and I want to share my findings in the hope that they will be useful.
What is HTTP streaming?
HTTP streaming is a way of sending data over HTTP using a mechanism called chunked transfer encoding. This means that the server can send the response in multiple chunks, each with its own size and content. The client can start processing the response as soon as it receives the first chunk, without waiting for the whole response to be complete. This can reduce the latency and the memory usage of both the server and the client.
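To make the framing concrete, here is a small illustrative sketch of how each chunk appears on the wire: its size in hexadecimal, a CRLF, the data, and another CRLF, terminated by a zero-length chunk. Note that Node's http module performs this framing for you; you never write it by hand.

```javascript
// Illustrative only: how chunked transfer encoding frames each piece of
// data on the wire. Real servers (including Node's http module) do this
// framing automatically when no Content-Length header is set.
function frameChunk(data) {
  return `${Buffer.byteLength(data).toString(16)}\r\n${data}\r\n`;
}

console.log(JSON.stringify(frameChunk('Hello')));        // "5\r\nHello\r\n"
console.log(JSON.stringify(frameChunk(', streaming!'))); // "c\r\n, streaming!\r\n"
// The response body ends with the terminal zero-length chunk: "0\r\n\r\n"
```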
Compatibility
HTTP streaming is supported by most modern browsers and HTTP clients, and it works well with plain HTTP and HTTPS.
HTTP/1.1 vs HTTP/2
You can also use HTTP/2 for streaming, which has some advantages over HTTP/1.1, such as multiplexing and header compression. However, HTTP/2 requires HTTPS and some additional configuration on the server side.
For simplicity, this article will focus on HTTP/1.1.
Why use HTTP streaming?
HTTP streaming can be useful for web applications that need to visualize a large amount of data (charts, graphs, maps, tables) or a time-consuming response such as a complex AI answer, mainly to offer the user a more engaging, interactive experience.
Proof of Concept
I chose to use plain JavaScript, Node.js, and the standard Fetch API to implement the examples as a proof of concept, avoiding third-party frameworks so that we are not sidetracked by technology details and can focus on the streaming architecture.
IMPORTANT: The Async Generator
To implement chunked transfer encoding over HTTP, we need to split our overall computation into smaller tasks, each returning a partial (and consistent) result. For this we will explore async generators, an incredibly useful built-in JavaScript feature that is ideal for the job.
Anatomy of Async Generator function
- An async generator function is a special function that returns an AsyncGenerator object, which conforms to both the async iterable and the async iterator protocols.
- An async generator function allows yielding an intermediate result during an iterative process, suspending the current execution and letting code that is waiting for that result make use of it.
Ultimately, an async generator function combines the features of async functions and generator functions: you can use both the await and yield keywords within the function body, so you can handle asynchronous tasks ergonomically with await while leveraging the lazy nature of generator functions.
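As a minimal sketch of these two protocols in action, the generator below can be driven manually via next() or consumed idiomatically with for await:

```javascript
// Minimal async generator: await handles the asynchronous part,
// yield suspends execution and hands out an intermediate result.
async function* countTo(n) {
  for (let i = 1; i <= n; i++) {
    await new Promise(resolve => setTimeout(resolve, 10)); // simulate async work
    yield i;
  }
}

// 1) Manual iteration: next() returns a Promise of { value, done }
const it = countTo(2);
it.next().then(r => console.log(r)); // { value: 1, done: false }

// 2) Idiomatic consumption with for await
(async () => {
  for await (const value of countTo(3)) {
    console.log(value); // 1, then 2, then 3
  }
})();
```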
Server Side Implementation
The assumption is that we have an async generator function that produces data in chunks. As a proof of concept, here is a simple function that emits data chunks with a delay between each one:
async function* generateData() {
  for (let i = 0; i < 1000; i++) {
    // Simulate delay
    await new Promise(resolve => setTimeout(resolve, 1000));
    // Yield data chunk
    yield `data chunk ${i}\n`;
  }
}
In a real case we would have more complex logic generating the data chunks from external data sources, but for the purpose of this example we will keep it simple.
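For instance, a real generator might wrap a paginated data source. The following sketch assumes a hypothetical fetchPage(offset, limit) helper (here backed by an in-memory array) standing in for a database or remote API:

```javascript
// Hypothetical paginated data source; in a real application this would
// query a database or a remote API. Here it just slices an in-memory array.
const rows = Array.from({ length: 10 }, (_, i) => ({ id: i }));
async function fetchPage(offset, limit) {
  return rows.slice(offset, offset + limit);
}

// The generator hides the pagination: consumers just see a stream of chunks.
async function* generateFromSource(pageSize = 4) {
  let offset = 0;
  while (true) {
    const page = await fetchPage(offset, pageSize);
    if (page.length === 0) break; // source exhausted
    yield JSON.stringify(page) + '\n'; // one serialized chunk per page
    offset += page.length;
  }
}
```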
Now we can write a simple server that streams the data over HTTP to the client. Note that, to iterate over the data chunks, we can use the for await statement, which creates a loop over async iterable objects.
import http from 'http';

const server = http.createServer(async (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/plain',
    'Transfer-Encoding': 'chunked'
  });
  // Asynchronously iterate over the data chunks
  for await (const chunk of generateData()) {
    res.write(chunk);
    console.log(`Sent: ${chunk}`);
  }
  res.end();
});

const PORT = 3000;
server.listen(PORT, () =>
  console.log(`Server running at http://localhost:${PORT}/`));
As you can see from the code above, implementing chunked transfer encoding over HTTP in Node.js is pretty straightforward: we iterate over the data chunks asynchronously and write them to the HTTP response; that's all.
Client Side Implementation
On the client side we use the Fetch API to handle the streaming response from the server. We can attach a Reader to the response's body using getReader(), which locks the stream and waits for each chunk of data sent by the server.
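As a minimal sketch, here is a helper that drains a response body with getReader(); the helper itself is illustrative, and on a real fetch you would pass it the awaited response:

```javascript
// Drain a response body with getReader(), collecting the raw Uint8Array chunks.
async function readAllChunks(response) {
  const reader = response.body.getReader(); // locks the stream
  const chunks = [];
  while (true) {
    const { done, value } = await reader.read(); // value is a Uint8Array
    if (done) break; // stream finished (the server called res.end())
    chunks.push(value);
  }
  return chunks;
}

// On a real fetch: readAllChunks(await fetch('http://localhost:3000/'))
```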
Decoding the data chunks
Since each data chunk arrives as encoded bytes, we need to decode it before we can use it.
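One subtlety worth knowing: a multi-byte UTF-8 character can be split across two chunks. Passing { stream: true } to TextDecoder.decode() tells the decoder to buffer the trailing incomplete bytes until the next call:

```javascript
// A single decoder instance with { stream: true } correctly handles
// multi-byte characters that arrive split across chunk boundaries.
const decoder = new TextDecoder();

const euro = new TextEncoder().encode('€'); // 3 bytes: e2 82 ac
const part1 = euro.slice(0, 2); // chunk boundary falls mid-character
const part2 = euro.slice(2);

// Naive per-chunk decoding mangles the split character:
console.log(new TextDecoder().decode(part1)); // replacement character(s)

// Streaming decode reassembles it correctly:
const text = decoder.decode(part1, { stream: true }) + decoder.decode(part2);
console.log(text); // '€'
```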
Bonus: Wrap the Reader in an asynchronous generator function
We can wrap the Reader in an asynchronous generator function that fetches data in a streaming fashion, yielding each chunk of data as soon as it is available.
/**
 * Generator function to stream responses from fetch calls.
 *
 * @param {Function} fetchcall - The fetch call to make. Should return a response with a readable body stream.
 * @returns {AsyncGenerator<string>} An async generator that yields strings from the response stream.
 */
async function* streamingFetch(fetchcall) {
  const response = await fetchcall();
  // Attach Reader
  const reader = response.body.getReader();
  // Reuse a single decoder so multi-byte characters split across chunks decode correctly
  const decoder = new TextDecoder();
  while (true) {
    // Wait for the next encoded chunk
    const { done, value } = await reader.read();
    // Check if the stream is done
    if (done) break;
    // Decode the data chunk and yield it
    yield decoder.decode(value, { stream: true });
  }
}
Now we can use the streamingFetch generator function to stream the data chunks coming from the server, as shown below.
(async () => {
  for await (const chunk of streamingFetch(() => fetch('http://localhost:3000/'))) {
    console.log(chunk);
  }
})();
It's done! Now you can see data chunks arriving from the server as soon as they are available.
Take a note: we have an async generator on the server to produce chunks of data and another on the client to consume them.
Advantages
- Snappy User Experience: You can start showing data as soon as it's available.
- Scalable API: No memory usage spikes from accumulating results in memory.
- Uses plain HTTP and a standard JavaScript API. There are no connections to manage or complicated frameworks that might become obsolete in a few years.
Disadvantages
- Implementation is slightly more involved than using regular API calls.
- Error handling becomes more difficult, because the HTTP status code 200 is sent as soon as streaming starts (see cover image).
- What do we do when something goes wrong in the middle of the stream?
- The application must be able to determine if the stream has not completed and behave accordingly.
- Requires formatting assumptions as part of the contract, or the use of an unconventional format.
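One possible mitigation, sketched below under the assumption that server and client agree on an explicit end marker (a '[DONE]' line is a convention chosen here, not part of HTTP), is to treat any stream that ends without the marker as truncated:

```javascript
// Sketch: detect a truncated stream via an explicit end marker.
// The '[DONE]' terminator is a convention assumed for this example,
// not part of HTTP itself.
async function consumeStream(chunks) {
  let completed = false;
  try {
    for await (const chunk of chunks) {
      if (chunk.trim() === '[DONE]') { completed = true; break; }
      // ...render the chunk...
    }
  } catch (err) {
    // Network error mid-stream: fall through with completed === false
  }
  if (!completed) {
    // The HTTP status was already 200, so signal failure at application level
    console.warn('stream ended before the [DONE] marker');
  }
  return completed;
}
```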
Bonus: Streaming response from the OpenAI Chat API
As I said at the beginning, I delved into HTTP streaming to be able to leverage the streaming support provided by the OpenAI API. Essentially, the streaming version of the API, instead of returning the whole answer, returns an async iterable object.
So we can use the same approach as before to stream the data chunks coming from the OpenAI server over HTTP, as shown below:
import http from 'http';
import OpenAI from "openai";

const openai = new OpenAI();

const server = http.createServer(async (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/plain',
    'Transfer-Encoding': 'chunked'
  });
  // Create a fresh stream per request: an async iterable can only be consumed once
  const stream = await openai.chat.completions.create({
    model: "gpt-4",
    messages: [{ role: "user", content: "Say this is a test" }],
    stream: true,
  });
  // Asynchronously iterate over the data chunks
  for await (const chunk of stream) {
    const content = chunk.choices[0]?.delta?.content || "";
    res.write(content);
    console.log(`Sent: ${content}`);
  }
  res.end();
});

const PORT = 3000;
server.listen(PORT, () =>
  console.log(`Server running at http://localhost:${PORT}/`));
Conclusion
In this article we have seen a practical guide to using HTTP streaming for efficient data visualization in web applications. We have explored the use of chunked transfer encoding over HTTP, along with its advantages and disadvantages. We have also delved into the power of async generator functions and their use in implementing HTTP streaming. Finally, we have seen a real use case of streaming data over HTTP using Node.js, the Fetch API, and the OpenAI streaming API.
Hope that this knowledge will be helpful. In the meanwhile, enjoy coding!
References
Originally published at https://bsorrentino.github.io on February 10, 2024.
Top comments (7)
I found this article to be very well written.
I'm excited to do some generator benchmarking against my byte-barometer. I haven't done a write-up yet, but feel free to give it a try.
chahla.net/static/byte-barometer/
Great article, I found it very interesting.
I would love to see a comparison, to think about the pros and cons of using HTTP streaming against Server-Sent Events or WebSocket communication.
Interesting topic
I think that SSE and/or WebSocket fit different (and more complex) scenarios where user interactions are designed in a purely asynchronous fashion, while streaming is mostly an optimization that improves the overall user experience by minimizing the negative impact of waiting for data.
Great, can I work with JSON as the Content-Type?
Absolutely yes
You simply have to change the Content-Type accordingly and parse the JSON after decoding on the client.
Take into account that you are responsible for producing valid JSON chunks of data from the server.
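A common convention for this (an assumption here, not mandated by HTTP) is NDJSON: one JSON document per line. The client buffers decoded text until a newline and only then calls JSON.parse:

```javascript
// NDJSON client-side parsing sketch: decoded text chunks may split a JSON
// line anywhere, so buffer until a newline before calling JSON.parse.
function makeLineParser(onObject) {
  let buffer = '';
  return (textChunk) => {
    buffer += textChunk;
    let index;
    while ((index = buffer.indexOf('\n')) >= 0) {
      const line = buffer.slice(0, index);
      buffer = buffer.slice(index + 1);
      if (line.trim()) onObject(JSON.parse(line));
    }
  };
}

// Example: a chunk boundary may fall in the middle of an object
const objects = [];
const feed = makeLineParser(obj => objects.push(obj));
feed('{"a":1}\n{"b"'); // first object complete, second still buffered
feed(':2}\n');         // second object completes
// objects is now [{ a: 1 }, { b: 2 }]
```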
Isn't using await counterintuitive for streaming?
Await means: wait for the API call to finish, then proceed.
Await is needed because we wait until a chunk of data is ready; the asynchronous iterator is used exactly for this purpose.
In the end you have multiple awaits, and for each of them the chunk of data produced is returned to the client.