Serhii Korol

Posted on Jul 16, 2023

The main pitfalls in generating images with DALL-E API.

#csharp #openai #aspnet #ai

I highly recommend reading this article to the end before you start using the DALL-E API. I want to share my experience and the problems I faced when using DALL-E.

Entry

What is DALL-E? It's a product from OpenAI, the company that created ChatGPT. If you already use ChatGPT, you might receive a trial period for DALL-E. You can check your status at this link. In this article, we'll look at three APIs for generating images.

Preparations

You should create a simple ASP.NET MVC project and do a small bit of setup in your OpenAI profile. Go to https://platform.openai.com/account/api-keys and create a secret key.

After you've done that, save the key somewhere you can find it later.

Models.

Let's create several models for the input and result data. I created a Dalle record in the Models folder. It is the root model for exchanging data between the View and the Controller.

public record Dalle
{
    public GenerateInput? GenerateInput { get; set; }
    public List<string>? Links { get; set; }
}

And create a model for input data.

public record GenerateInput
{
    [JsonPropertyName("prompt")] public string? Prompt { get; set; }
    [JsonPropertyName("n")] public short? N { get; set; }
    [JsonPropertyName("size")] public string? Size { get; set; }
}
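To see what the JsonPropertyName attributes do, here is a minimal sketch of how this model turns into a request body. The GenerateRequestJson class is a hypothetical helper, not part of the project, and the option that skips null values is an assumption about what you want for unset fields.

```csharp
#nullable enable
using System.Text.Json;
using System.Text.Json.Serialization;

// Same record as above, repeated so that the snippet compiles on its own
public record GenerateInput
{
    [JsonPropertyName("prompt")] public string? Prompt { get; set; }
    [JsonPropertyName("n")] public short? N { get; set; }
    [JsonPropertyName("size")] public string? Size { get; set; }
}

// Hypothetical helper, for illustration only
public static class GenerateRequestJson
{
    // Skip null properties so that unset optional fields are left out of the body
    private static readonly JsonSerializerOptions Options = new()
    {
        DefaultIgnoreCondition = JsonIgnoreCondition.WhenWritingNull
    };

    public static string Build(GenerateInput input) =>
        JsonSerializer.Serialize(input, Options);
}
```

Calling GenerateRequestJson.Build with a prompt, N = 2 and Size = "512x512" should give a compact JSON object with the lowercase keys prompt, n and size, which are the names the attributes declare.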

You should also create a couple of models for the results.

public record ResponseModel
{
    [JsonPropertyName("created")]
    public long Created { get; set; }
    [JsonPropertyName("data")]
    public List<Link>? Data { get; set; }
}

public record Link
{
    [JsonPropertyName("url")]
    public string? Url { get; set; }
}

First API. Generating images from text.

Now that we have all the models we need, we can start creating endpoints and input forms. Open Views => Home => Index.cshtml and paste this markup:

@model TextToPicture.Models.Dalle
@{
    ViewData["Title"] = "Home Page";
}
<h1 class="display-4">Hi, my name is DALL-E</h1>

<div class="container-fluid">
    <div class="row">
        <div class="col-4">
            @using (Html.BeginForm("GenerateImage", "Home", FormMethod.Post, new { @class = "form-horizontal" }))
            {
                @Html.AntiForgeryToken()
                <div class="mb-3">
                    @Html.LabelFor(m => m.GenerateInput!.Prompt, "Prompt: ", new {@class = "form-label"})
                    @Html.TextAreaFor(m => m.GenerateInput!.Prompt, new {@class = "form-control", required = "required"})
                    @Html.ValidationMessageFor(m => m.GenerateInput!.Prompt)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.GenerateInput!.N, "Number of Images: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.GenerateInput!.N, new {@class = "form-control", type = "number", min = "1", max = "10", required = "required"})
                    @Html.ValidationMessageFor(m => m.GenerateInput!.N)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.GenerateInput!.Size, "Image Size: ", new {@class = "form-label"})
                    @Html.DropDownListFor(m => m.GenerateInput!.Size, new SelectList(new List<string> { "256x256", "512x512", "1024x1024" }), new { @class = "form-control" })
                    @Html.ValidationMessageFor(m => m.GenerateInput!.Size)
                    <span>(e.g., 1024x1024)</span>
                </div>
                <div class="btn-group" role="group" aria-label="Generate">
                    <button type="submit" class="btn btn-success">Generate</button>
                </div>
            }
        </div>
    </div>
</div>
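The min, max and required attributes in the form are only checked in the browser, so the same limits are worth repeating on the server. Here is a minimal sketch, assuming the limits shown in the markup (1 to 10 images and the three listed sizes). ImageRequestValidator is a hypothetical name and isn't part of the project.

```csharp
#nullable enable
using System.Collections.Generic;

// Hypothetical helper, for illustration only
public static class ImageRequestValidator
{
    // The sizes offered by the drop-down list in the form
    private static readonly HashSet<string> AllowedSizes =
        new() { "256x256", "512x512", "1024x1024" };

    // Returns an error message, or null when all values are acceptable
    public static string? Validate(string? prompt, short? n, string? size)
    {
        if (string.IsNullOrWhiteSpace(prompt))
            return "Prompt is required.";
        if (n is null || n < 1 || n > 10)
            return "Number of images must be between 1 and 10.";
        if (size is null || !AllowedSizes.Contains(size))
            return "Size must be 256x256, 512x512 or 1024x1024.";
        return null;
    }
}
```

In an action, you could call Validate before building the request and return the message to the view when it isn't null.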

The form sends the prompt text, the number of images, and the picture size. Go to HomeController.cs and start creating a new action.

    private readonly HttpClient _httpClient;
    private readonly IWebHostEnvironment _hostingEnvironment;

    public HomeController(IWebHostEnvironment hostingEnvironment)
    {
        _hostingEnvironment = hostingEnvironment;
        _httpClient = new HttpClient();
        _httpClient.BaseAddress = new Uri("https://api.openai.com/");
    }

    [HttpPost]
    public async Task<ActionResult> GenerateImage(GenerateInput generateInput)
    {
        try
        {

        }
        catch (Exception ex)
        {

        }
    }

I'll build this action step by step and explain how it works. First, let's create a base request that carries the authorization header.

//Request creating
var request = CreateBaseRequest(HttpMethod.Post, "v1/images/generations");
var jsonRequest = JsonSerializer.Serialize(generateInput);
request.Content = new StringContent(jsonRequest);
request.Content!.Headers.ContentType = new MediaTypeHeaderValue("application/json");

Also add this method and set the secret key that you saved earlier.

private HttpRequestMessage CreateBaseRequest(HttpMethod method, string uri)
{
        var httpRequestMessage = new HttpRequestMessage(method, uri);
        var apiKey = "your-secret-key";
        httpRequestMessage.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        return httpRequestMessage;
}
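A key that is hard-coded like this is easy to commit to a repository by accident. As an alternative, here is a minimal sketch that reads it from an environment variable. The variable name OPENAI_API_KEY and the OpenAiRequestFactory class are assumptions for illustration; the original project keeps the key in the code.

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;

// Hypothetical helper, for illustration only
public static class OpenAiRequestFactory
{
    public static HttpRequestMessage Create(HttpMethod method, string uri)
    {
        // OPENAI_API_KEY is an assumed variable name; use the one your environment defines
        var apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY");
        if (string.IsNullOrWhiteSpace(apiKey))
            throw new InvalidOperationException("The OPENAI_API_KEY environment variable is not set.");

        var request = new HttpRequestMessage(method, uri);
        request.Headers.Authorization = new AuthenticationHeaderValue("Bearer", apiKey);
        return request;
    }
}
```

The method has the same shape as CreateBaseRequest, so it could replace it without changes to the actions.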

Pay attention: you must set the content type explicitly, or you'll get a Bad Request response. When you pass only a string, StringContent sends the body as text/plain.

request.Content!.Headers.ContentType = new MediaTypeHeaderValue("application/json");
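The same header can also be set when the content is created. This is a minimal sketch using the three-argument StringContent constructor; JsonContentFactory is a hypothetical name used only for illustration.

```csharp
using System.Net.Http;
using System.Text;

// Hypothetical helper, for illustration only
public static class JsonContentFactory
{
    // The media type argument sets the Content-Type header together with the body
    public static StringContent Create(string json) =>
        new StringContent(json, Encoding.UTF8, "application/json");
}
```

With this constructor, the resulting header is application/json with a utf-8 charset, so the separate ContentType assignment is no longer needed.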

In the next step, we need to execute the request. It's an ordinary call.

//Result 
var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
response.EnsureSuccessStatusCode();

In the last step, you need to handle the response and save the received images.

// Pass the image URL to the view
var model = new Dalle { Links = await GetUrls(response)};
return RedirectToAction("Index", model);
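The GetUrls method is used here but isn't listed in the article. Below is a minimal sketch of the parsing part, assuming the ResponseModel and Link records from the Models section. The ResponseParser class and the ExtractUrls name are hypothetical and may differ from the actual code in the repository.

```csharp
#nullable enable
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;
using System.Text.Json.Serialization;

// Same records as in the Models section, repeated so that the snippet compiles on its own
public record ResponseModel
{
    [JsonPropertyName("created")] public long Created { get; set; }
    [JsonPropertyName("data")] public List<Link>? Data { get; set; }
}

public record Link
{
    [JsonPropertyName("url")] public string? Url { get; set; }
}

// Hypothetical helper, for illustration only
public static class ResponseParser
{
    // Extracts the non-empty image URLs from the JSON body of a response
    public static List<string> ExtractUrls(string json)
    {
        var result = JsonSerializer.Deserialize<ResponseModel>(json);
        return result?.Data?
            .Select(link => link.Url)
            .Where(url => !string.IsNullOrEmpty(url))
            .Select(url => url!)
            .ToList() ?? new List<string>();
    }
}
```

Inside the controller, the body could be read with response.Content.ReadAsStringAsync() and passed to ExtractUrls. Each returned URL could then be handed to a method that saves the picture.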

The GetUrls method reads the URLs from the response and returns them. However, there is one detail: the API returns URLs with a limited access time. After a while, you can open the link only with a secret key. For this reason, I decided to save every picture. Add this method, which saves each file to the Uploads folder under a name based on the current time.

public async Task SaveFileByLink(string link)
{
        // Build a file name from the current time in ticks
        var date = DateTime.Now.Ticks.ToString();
        var fileName = $"file{date}.png";
        var uploadsFolder = Path.Combine(_hostingEnvironment.WebRootPath, "Uploads");
        // Create the Uploads folder if it doesn't exist yet
        Directory.CreateDirectory(uploadsFolder);
        var filePath = Path.Combine(uploadsFolder, fileName);
        await using var fileStream = new FileStream(filePath, FileMode.Create);
        await fileStream.WriteAsync(await DownloadImage(link));
}

Add this method for getting image bytes.

public async Task<byte[]> DownloadImage(string url)
{
    return await _httpClient.GetByteArrayAsync(url);
}

In the catch block, add a simple error handler.

// Handle any error that occurred during the API request
ViewBag.Error = ex.Message;
return View("Index");
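Note that the exception thrown by EnsureSuccessStatusCode only reports the status code, not the reason for the failure. Here is a minimal sketch that pulls a readable message out of the response body, assuming the API returns errors as a JSON object with an error.message field. That shape is an assumption, and ApiErrorParser is a hypothetical helper.

```csharp
#nullable enable
using System.Text.Json;

// Hypothetical helper, for illustration only
public static class ApiErrorParser
{
    // Returns error.message from a JSON error body, or null if the body has another shape
    public static string? TryGetMessage(string body)
    {
        try
        {
            using var document = JsonDocument.Parse(body);
            var root = document.RootElement;
            if (root.ValueKind == JsonValueKind.Object &&
                root.TryGetProperty("error", out var error) &&
                error.ValueKind == JsonValueKind.Object &&
                error.TryGetProperty("message", out var message) &&
                message.ValueKind == JsonValueKind.String)
            {
                return message.GetString();
            }
        }
        catch (JsonException)
        {
            // The body is not JSON (for example, an HTML error page)
        }

        return null;
    }
}
```

To use it, read the body with response.Content.ReadAsStringAsync() when response.IsSuccessStatusCode is false, before calling EnsureSuccessStatusCode, and show the parsed message instead of ex.Message when it isn't null.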

And, of course, pass the data to the Index action.

public IActionResult Index(Dalle model)
{
    return View(model);
}

And also let's add markup for showing images:

<hr style="height:2px;border-width:0;color:gray;background-color:gray">
<div class="container-fluid">
    <div class="row">
        @if (Model?.Links != null)
        {
            <div class="img-fluid">
                <h4>Generated Images:</h4>
                @foreach (var link in Model.Links)
                {
                    <img src="@link" alt="Generated Image" />
                }
            </div>
        }
    </div>
</div>

And now let's check this out.

Fill in the form.

Don't try to write prompts about famous people: DALL-E is censored and returns a Bad Request if you mention Elon Musk's name or something like that.

You should get results with four pictures:

Second API. Editing pictures.

This API has issues and pitfalls. I'll tell you what I could find about it.

First, let's add a new input model.

public record EditInput
{
    [JsonPropertyName("prompt")] public string Prompt { get; set; }
    [JsonPropertyName("n")] public short N { get; set; }
    [JsonPropertyName("size")] public string Size { get; set; }
    [JsonPropertyName("image")] public IFormFile Image { get; set; }
    [JsonPropertyName("mask")] public IFormFile Mask { get; set; }
}

And don't forget to add it to the root model.

public record Dalle
{
    public GenerateInput? GenerateInput { get; set; }
    public EditInput? EditInput { get; set; }
    public List<string>? Links { get; set; }
}

Go to the markup and paste this code:

<div class="col-4">
            @using (Html.BeginForm("EditImage", "Home", FormMethod.Post, new { @class = "form-horizontal", enctype = "multipart/form-data" }))
            {
                @Html.AntiForgeryToken()
                <div>
                    @Html.LabelFor(m => m.EditInput!.Image, "Image: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.EditInput!.Image, new { type = "file", required = "required" })
                    @Html.ValidationMessageFor(m => m.EditInput!.Image)
                </div>
                <div>
                    @Html.LabelFor(m => m.EditInput!.Mask, "Mask: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.EditInput!.Mask, new { type = "file", required = "required" })
                    @Html.ValidationMessageFor(m => m.EditInput!.Mask)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.EditInput!.Prompt, "Prompt: ", new {@class = "form-label"})
                    @Html.TextAreaFor(m => m.EditInput!.Prompt, new {@class = "form-control", required = "required"})
                    @Html.ValidationMessageFor(m => m.EditInput!.Prompt)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.EditInput!.N, "Number of Images: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.EditInput!.N, new {@class = "form-control", type = "number", min = "1", max = "10", required = "required"})
                    @Html.ValidationMessageFor(m => m.EditInput!.N)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.EditInput!.Size, "Image Size: ", new {@class = "form-label"})
                    @Html.DropDownListFor(m => m.EditInput!.Size, new SelectList(new List<string> { "256x256", "512x512", "1024x1024" }), new { @class = "form-control" })
                    @Html.ValidationMessageFor(m => m.EditInput!.Size)
                    <span>(e.g., 1024x1024)</span>
                </div>
                <div class="btn-group" role="group" aria-label="Edit">
                    <button type="submit" class="btn btn-success">Edit</button>
                </div>
            }
        </div>

Now return to HomeController and add a new action:

[HttpPost]
    public async Task<IActionResult> EditImage(EditInput editInput)
    {
        try
        {
            // Add the form data
            var formData = new MultipartFormDataContent();
            formData.Add(new StringContent(editInput.Prompt), "prompt");
            formData.Add(new StringContent(editInput.N.ToString()), "n");
            formData.Add(new StringContent(editInput.Size), "size");

            // Add the image file
            await AddFormDataFile(formData, editInput.Image, "image");

            //Add the mask file
            await AddFormDataFile(formData, editInput.Mask, "mask");

            // Prepare the form data
            var request = CreateBaseRequest(HttpMethod.Post, "v1/images/edits");
            request.Content = formData;

            // Make the API request
            var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
            response.EnsureSuccessStatusCode();

            // Pass the image URL to the view
            var model = new Dalle { Links = await GetUrls(response)};
            return RedirectToAction("Index", model);
        }
        catch (Exception ex)
        {
            // Handle any error that occurred during the API request
            ViewBag.Error = ex.Message;
            return View("Index");
        }
    }

This request is different: it is built with MultipartFormDataContent. To attach the files, I created a separate method:

private async Task AddFormDataFile(MultipartFormDataContent formData, IFormFile file, string name)
    {
        using var memoryStream = new MemoryStream();
        await using (var fileStream = file.OpenReadStream())
        {
            await fileStream.CopyToAsync(memoryStream);
        }

        var imageData = ConvertRgb24ToRgba32(memoryStream.ToArray());
        var imageContent = new ByteArrayContent(imageData);
        imageContent.Headers.ContentType = new MediaTypeHeaderValue("image/png");
        formData.Add(imageContent, name, file.FileName);
    }

Here I want to tell you about the pitfalls. You won't find this in the documentation. All files must be sent as byte arrays. The documentation says that you can upload pictures in PNG format, but there is one nuance: the file must be in RGBA format, while the first API generates pictures in RGB format. Keep this nuance in mind. For this reason, I created a converter from RGB to RGBA.

public byte[] ConvertRgb24ToRgba32(byte[] inputImage)
    {
        using var inputStream = new MemoryStream(inputImage);
        using var outputStream = new MemoryStream();

        // Load the input image using ImageSharp as RGBA32.
        // RGB input gets an opaque alpha channel, while input that already
        // has alpha (such as a mask with transparent areas) keeps it.
        using var image = Image.Load<Rgba32>(inputStream);

        // Save as a PNG with an explicit RGBA color type
        image.Save(outputStream, new PngEncoder { ColorType = PngColorType.RgbWithAlpha });

        // Return the converted image as a byte array
        return outputStream.ToArray();
    }

To use this, you need the SixLabors.ImageSharp package.
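If you want to check whether a file is already RGBA before converting it, the PNG header is enough. Here is a minimal sketch that needs no extra packages: it reads the IHDR chunk, which follows the 8-byte PNG signature, where color type 6 means RGB with alpha and color type 2 means plain RGB. The PngInfo and PngHeader names are hypothetical and aren't part of the project.

```csharp
#nullable enable
using System;
using System.Buffers.Binary;

// Hypothetical types, for illustration only
public record PngInfo(int Width, int Height, byte BitDepth, byte ColorType)
{
    // Color type 6 is truecolor with alpha (RGBA); color type 2 is plain RGB
    public bool IsRgba => ColorType == 6;
    public bool IsSquare => Width == Height;
}

public static class PngHeader
{
    private static readonly byte[] Signature = { 0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A };
    private static readonly byte[] IhdrType = { 0x49, 0x48, 0x44, 0x52 }; // "IHDR"

    // Returns null if the data doesn't look like a PNG file
    public static PngInfo? TryRead(ReadOnlySpan<byte> data)
    {
        // 8 bytes signature + 4 length + 4 type + 4 width + 4 height + bit depth + color type
        if (data.Length < 26) return null;
        if (!data.Slice(0, 8).SequenceEqual(Signature)) return null;
        if (!data.Slice(12, 4).SequenceEqual(IhdrType)) return null;

        // Width and height are stored as big-endian 32-bit integers
        int width = BinaryPrimitives.ReadInt32BigEndian(data.Slice(16, 4));
        int height = BinaryPrimitives.ReadInt32BigEndian(data.Slice(20, 4));
        return new PngInfo(width, height, data[24], data[25]);
    }
}
```

For example, PngHeader.TryRead(memoryStream.ToArray()) inside AddFormDataFile would tell you whether the uploaded file needs conversion at all. The IsSquare property is there because the documentation asks for square images.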
But that's not all. In my tests, this API didn't work. Let me show you. You should see this form.

You need to upload the source image and the mask image.

Then fill in the form.

The result comes back without changes.

I thought that it was an issue with C# or that I had made a mistake. However, I sent the same request with cURL and got the same result.

curl https://api.openai.com/v1/images/edits \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -F image="@sunlit_lounge_rgba.png" \
  -F mask="@mask_rgba.png" \
  -F prompt="A sunlit indoor lounge area with a pool containing a flamingo" \
  -F n=1 \
  -F size="1024x1024" > output.json

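This snippet is taken from the documentation. In my experience, this API doesn't work and is useless. One thing worth checking if you try it yourself: according to the documentation, only the fully transparent areas of the mask are edited.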

Third API. Generating different variations of picture.

Let's add a new model. As before, don't forget to add a VariationInput property to the root Dalle model.

public record VariationInput
{
    [JsonPropertyName("n")] public short N { get; set; }
    [JsonPropertyName("size")] public string Size { get; set; }
    [JsonPropertyName("image")] public IFormFile Image { get; set; }
}

Add new markup.

<div class="col-4">
            @using (Html.BeginForm("VariationImage", "Home", FormMethod.Post, new { @class = "form-horizontal", enctype = "multipart/form-data" }))
            {
                @Html.AntiForgeryToken()
                <div>
                    @Html.LabelFor(m => m.VariationInput!.Image, "Image: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.VariationInput!.Image, new { type = "file", required = "required" })
                    @Html.ValidationMessageFor(m => m.VariationInput!.Image)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.VariationInput!.N, "Number of Images: ", new {@class = "form-label"})
                    @Html.TextBoxFor(m => m.VariationInput!.N, new {@class = "form-control", type = "number", min = "1", max = "10", required = "required"})
                    @Html.ValidationMessageFor(m => m.VariationInput!.N)
                </div>
                <div class="mb-3">
                    @Html.LabelFor(m => m.VariationInput!.Size, "Image Size: ", new {@class = "form-label"})
                    @Html.DropDownListFor(m => m.VariationInput!.Size, new SelectList(new List<string> { "256x256", "512x512", "1024x1024" }), new { @class = "form-control" })
                    @Html.ValidationMessageFor(m => m.VariationInput!.Size)
                    <span>(e.g., 1024x1024)</span>
                </div>
                <div class="btn-group" role="group" aria-label="Variation">
                    <button type="submit" class="btn btn-success">Variation</button>
                </div>
            }
        </div>

The action is similar to the previous one, so I won't dwell on it.

[HttpPost]
    public async Task<IActionResult> VariationImage(VariationInput variationInput)
    {
        try
        {
            // Add the form data
            var formData = new MultipartFormDataContent();
            formData.Add(new StringContent(variationInput.N.ToString()), "n");
            formData.Add(new StringContent(variationInput.Size), "size");

            // Add the image file
            await AddFormDataFile(formData, variationInput.Image, "image");


            // Prepare the form data
            var request = CreateBaseRequest(HttpMethod.Post, "v1/images/variations");
            request.Content = formData;

            // Make the API request
            var response = await _httpClient.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
            response.EnsureSuccessStatusCode();

            // Pass the image URL to the view
            var model = new Dalle { Links = await GetUrls(response)};
            return RedirectToAction("Index", model);
        }
        catch (Exception ex)
        {
            // Handle any error that occurred during the API request
            ViewBag.Error = ex.Message;
            return View("Index");
        }
    }

This form has no text field. You only need to upload a picture.

I uploaded a picture and got four similar pictures.

This API works fine.

Last words.

Generating pictures can be useful, but the technology is still very raw. In my opinion, Midjourney is much better. I wasn't on a trial subscription, so I was billed. I spent 1 dollar, but I don't know how it can be useful.

You can get the code at this link.

That's all. Happy coding!
