
Roberto B.


Consuming HTTP Streams in PHP with Symfony HTTP Client and Ollama API

An HTTP stream enables clients to process data incrementally, allowing them to handle large or continuous datasets without waiting for the full response. This approach is highly efficient when dealing with real-time updates or responses from AI models.

As more services, especially AI platforms, deliver large datasets or extensive text content over HTTP, adopting streaming enables faster response times and improved resource management. This article shows you how to consume HTTP streams in PHP using the Symfony HTTP Client, with Ollama's API as a practical example.

Requirements

To follow along, you'll need:

  • PHP (I strongly suggest the latest version, PHP 8.3).
  • Symfony HTTP Client installed via Composer (composer require symfony/http-client).
  • Ollama running locally (accessible at http://localhost:11434); see the sanity-check sketch after this list.
  • A basic understanding of handling HTTP requests and responses in PHP.
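
Before diving in, you can confirm that Ollama is reachable using the same Symfony HTTP Client. This is a minimal sanity-check sketch, assuming Ollama's default port and its /api/tags endpoint (which lists the models available locally):

<?php

require './vendor/autoload.php';

use Symfony\Component\HttpClient\HttpClient;
use Symfony\Contracts\HttpClient\Exception\TransportExceptionInterface;

$client = HttpClient::create();

try {
    // A 200 from /api/tags means the Ollama server is up and responding.
    $status = $client->request('GET', 'http://localhost:11434/api/tags')
        ->getStatusCode();
    echo $status === 200 ? 'Ollama is up.' : 'Unexpected status: '.$status;
} catch (TransportExceptionInterface $e) {
    echo 'Ollama is not reachable: '.$e->getMessage();
}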

Step-by-Step Implementation

Here's a step-by-step breakdown of a script that consumes a stream in PHP, using the Symfony HTTP Client and Ollama's API to generate responses.

Install Symfony HTTP Client

First, ensure you have installed Symfony's HTTP Client package:

composer require symfony/http-client

Creating/initializing the HTTP Client

In this script, we create an instance of the Symfony HTTP Client. You'll use this instance both to send requests and to consume the response streams.

$client = HttpClient::create();

Defining the API endpoint

Then, you'll need to set the endpoint to which you will send your request. For this example, Ollama is running locally, and the API endpoint is:

$url = 'http://localhost:11434/api/generate';

Sending the request with JSON payload

Next, you'll create a POST request, passing a JSON payload that includes the model to use (llama3.2) and a prompt to generate content (e.g., "What is PHP?"):

$response = $client->request('POST', $url, [
    'json' => [
        'model' => 'llama3.2',
        'prompt' => 'A short and effective text about What is PHP?',
    ],
]);

Checking the response status

Before processing the response, it's crucial to verify the HTTP status code to ensure the request was successful. If the status code isn't 200, an error is displayed:

if ($response->getStatusCode() !== 200) {
    echo 'Error! ' . $response->getStatusCode();
}

This is just a basic example; in a real application you should handle HTTP errors more thoroughly, but the idea is to detect the error before starting to parse the response body.
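
For instance, a slightly more defensive version (just a sketch; it treats any status outside the 2xx range as fatal and stops before the body is parsed) could look like this:

$statusCode = $response->getStatusCode();

// Fail fast on anything outside the 2xx range, before the
// streaming loop starts parsing the body.
if ($statusCode < 200 || $statusCode >= 300) {
    echo 'Error! HTTP status: '.$statusCode.PHP_EOL;
    exit(1);
}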

Consuming and processing the stream response

Symfony allows you to consume content from the stream in chunks to handle the response efficiently. This is especially useful when working with large data or when responses are generated incrementally, as with AI models like Ollama.

foreach ($client->stream($response) as $chunk) {
    $json = $chunk->getContent();
    $data = json_decode($json);
    // The first (headers-only) and last chunks have no content,
    // so skip anything that doesn't decode to a JSON object.
    if (null === $data) {
        continue;
    }
    echo $data->response;
    if ($data->done) {
        break;
    }
}

The stream() method lets you process each chunk of data as soon as it's available, which is particularly useful for generating responses from models in real time. In the case of Ollama, each chunk contains a full JSON response, which is decoded and processed. The JSON of each chunk contains:

  • model: The name of the AI model used for generation.
  • created_at: The timestamp of the response creation.
  • response: The text generated by the model.
  • done: A boolean indicating if the stream is complete.
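
For illustration, a single streamed chunk is one JSON object per line, along these lines (the values here are made up; the final chunk has done set to true and also carries some extra timing statistics):

{"model":"llama3.2","created_at":"2024-09-29T10:15:00.000000Z","response":"PHP is","done":false}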

Full Example Script

Here's the complete script for clarity:

<?php

require './vendor/autoload.php';

// 001 Using the Symfony HTTP Client
use Symfony\Component\HttpClient\HttpClient;



// 002 Creating the HttpClient instance
$client = HttpClient::create();

// 003 Setting the localhost URL of the Ollama generate endpoint
$url = 'http://localhost:11434/api/generate';

// 004 Making a POST request with a JSON payload.
// The payload defines the model and the prompt
$response = $client->request('POST', $url,
    [
        'json' => [
            'model' => 'llama3.2',
        'prompt' => 'A short and effective text about What is PHP?',
        ],
    ]);

// 005 Checking the HTTP status code of the response for errors
if ($response->getStatusCode() !== 200) {
    echo 'Error! '.$response->getStatusCode();
}

// 006 Get the response content in chunks
foreach ($client->stream($response) as $chunk) {
    $json = $chunk->getContent();
    // 007 Each chunk is a JSON structure with:
    // - model
    // - created_at
    // - response (the text)
    // - done: a boolean; when true, the stream is complete
    $data = json_decode($json);
    // The first (headers-only) and last chunks have no content,
    // so skip anything that doesn't decode to a JSON object.
    if (null === $data) {
        continue;
    }
    echo $data->response;
    if ($data->done) {
        break;
    }
}



Conclusion

This example shows how easy it is to handle HTTP streams in PHP using Symfony’s HTTP Client. By processing data in chunks, you can efficiently handle large or dynamically generated responses, such as those from AI models like Ollama. This approach improves performance and responsiveness, especially when dealing with streaming APIs.

With a simple setup, you can integrate this solution into your applications to consume API streams in a non-blocking, efficient manner.


Top comments (2)

grant horwood

like and thumbs up applied, but would the response be 201?

Roberto B.

Thank you @gbhorwood.
As mentioned in the example:

This is just a basic example; in a real application you should handle HTTP errors more thoroughly, but the idea is to detect the error before starting to parse the response body.

You can also manage the 201 there, but you have to "extend" the logic to handle 4xx or 5xx responses (maybe showing a helpful error message and stopping the execution).

In general, my suggestion is to review the documentation of the API you need to integrate. Specifically, for the Ollama API and the generate endpoint, I see that a successful response should be delivered with a 200 status code.