I recently updated my little image generation game, Teleprompt, to use DALL-E 3 instead of Stable Diffusion as the image engine.
If you're curious about the reasoning, overall product and architecture, check out my previous, high-level article with those details - as well as some cool pictures of dragons!
In this article, I will give you the few pieces of code you need to create your own DALL-E 3 integration with Elixir.
...
Step 1 - Set up OpenAI
DALL-E is part of the OpenAI product suite, under Image Generation.
Create an account with OpenAI and on the left nav bar you'll see an option for API Keys.
From there you can select "Create new secret key" and copy the value, which will look like sk-...
I like to add this to a .env file that I source to make the key available to my project:
export OPENAI_API_KEY=sk-...u8pj
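Once the variable is exported, the application still has to read it. Here is a minimal sketch, assuming the key is looked up at runtime rather than through the @openai_key module attribute used in the handler below. The module name KeySketch and the error atom are hypothetical, not part of Teleprompt:

```elixir
defmodule KeySketch do
  # Hypothetical helper, not part of Teleprompt.
  @env_var "OPENAI_API_KEY"

  # Returns {:ok, key} when the variable is set and non-empty,
  # and {:error, :missing_api_key} otherwise.
  def fetch_api_key do
    case System.get_env(@env_var) do
      nil -> {:error, :missing_api_key}
      "" -> {:error, :missing_api_key}
      key -> {:ok, key}
    end
  end
end
```

Returning a tagged tuple instead of raising lets the caller decide whether a missing key should stop the app at boot or only fail the one request.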
Step 2 - API request
In Teleprompt, I set up a GenServer to call the OpenAI API asynchronously. Depending on your use case you might not need this complexity, but I felt it was important for a web app to be more event-driven and to use PubSub to inform listeners when the image generation is complete.
defmodule Teleprompt.GenerationHandler do
  use GenServer
  ...
  def start_generating(prompt, listener_code) do
    # hand the request to the GenServer so the caller is not blocked
    GenServer.cast(__MODULE__, {:generate, prompt, listener_code})
  end

  @impl true
  def handle_cast({:generate, prompt, listener_code}, _state) do
    endpoint = "https://api.openai.com/v1/images/generations"
    openai_api_key = @openai_key
    {model, size} = {"dall-e-3", "1024x1024"}
    # {model, size} = {"dall-e-2", "512x512"}

    # JSON-encoded request body
    data =
      %{
        "model" => model,
        "size" => size,
        "quality" => "standard",
        "n" => 1,
        "prompt" => prompt
      }
      |> Jason.encode!()

    # image generation can be slow, so allow up to 30 seconds
    opts = [async: true, recv_timeout: 30_000, timeout: 30_000]

    response =
      HTTPoison.post!(
        endpoint,
        data,
        [
          {"Content-Type", "application/json"},
          {"Authorization", "Bearer #{openai_api_key}"}
        ],
        opts
      )

    response_body =
      response
      |> Map.get(:body)
      |> Jason.decode!()

    # url in body: body["data"][0]["url"]
    url = response_body |> get_in(["data", Access.at(0), "url"])
    ...
Here we are making the post with mostly default parameters. I did try out DALL-E 2 as well, which allows for different parameters and different options (like sizes).
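To make the difference between the two models concrete, here is a rough sketch of a parameter builder. The module ImageParamsSketch and its error tuples are hypothetical. The size lists and the dall-e-3-only "quality" option reflect my reading of the OpenAI reference at the time of writing, so treat them as assumptions and check the current docs:

```elixir
defmodule ImageParamsSketch do
  # Hypothetical helper. The sizes below are assumptions taken from the
  # OpenAI image API reference; verify them before relying on this.
  @sizes %{
    "dall-e-3" => ["1024x1024", "1792x1024", "1024x1792"],
    "dall-e-2" => ["256x256", "512x512", "1024x1024"]
  }

  # Returns {:ok, params} for a known model/size pair, or an error tuple.
  def build(model, size, prompt) do
    case Map.fetch(@sizes, model) do
      {:ok, sizes} ->
        if size in sizes do
          params = %{"model" => model, "size" => size, "n" => 1, "prompt" => prompt}
          {:ok, add_model_options(params, model)}
        else
          {:error, {:unsupported_size, size}}
        end

      :error ->
        {:error, {:unknown_model, model}}
    end
  end

  # "quality" is documented as a dall-e-3 option, so only add it there.
  defp add_model_options(params, "dall-e-3"), do: Map.put(params, "quality", "standard")
  defp add_model_options(params, _model), do: params
end
```

The resulting map can be passed to Jason.encode!() exactly like the hand-written one in the handler.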
One parameter I was a bit surprised to see missing is a seed, so I don't know if DALL-E allows you to create repeatable results.
You can see the OpenAI API reference here.
I also used the default response type, which is a URL to a public image; however, I could have specified a different response_format and received the file contents as base64-encoded JSON.
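For comparison, here is a sketch of what handling the base64 variant might look like, assuming the request was sent with "response_format" => "b64_json" and the response body has already been decoded with Jason. The module name B64Sketch and the error atoms are hypothetical:

```elixir
defmodule B64Sketch do
  # Hypothetical helper. decoded_body is the JSON response already decoded
  # into a map; the image is expected at data[0].b64_json.
  def image_binary(%{"data" => [%{"b64_json" => b64} | _]}) when is_binary(b64) do
    case Base.decode64(b64) do
      {:ok, binary} -> {:ok, binary}
      :error -> {:error, :invalid_base64}
    end
  end

  # Anything else (an error body, an empty data list) ends up here.
  def image_binary(_decoded_body), do: {:error, :no_image_data}
end
```

The binary that comes back can be written to disk or uploaded directly, which avoids a second HTTP request to fetch the image.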
Step 3 - Use the file
In my case I wanted to download the file and manipulate it, so I immediately take the URL and do some processing on it. Perhaps the b64_json response would have made more sense, but I was already set up to handle URLs so I left that code in place.
One question I have is how long the images would last on the OpenAI CDN if you wanted to directly use the url they give you in your app.
I didn't trust that the image would last forever, so I took the URL, downloaded the file, and uploaded it to AWS to serve it myself.
Here's how the end of my generate handler looks, after I get the generated image url:
    ...
    url = response_body |> get_in(["data", Access.at(0), "url"])
    {file_name, file_path} = download_and_resize_image(url)

    # upload to s3
    {:ok, file_binary} = File.read(file_path)
    write_file_to_s3(file_name, file_binary, "image/png")

    Teleprompt.Messaging.received_image(listener_code)
    {:noreply, nil}
  end
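If the cast-then-notify flow is hard to picture from the fragments, here is a stripped-down, self-contained sketch of the same pattern. It is not Teleprompt's code: the module NotifySketch is hypothetical, the listener is a plain pid instead of a PubSub topic, and the "generation" is a function handed in by the caller:

```elixir
defmodule NotifySketch do
  use GenServer

  # Hypothetical stand-in for the handler. generate_fun plays the role of
  # the OpenAI call and is kept as the server state.
  def start_link(generate_fun) do
    GenServer.start_link(__MODULE__, generate_fun)
  end

  # Returns :ok immediately; the work happens inside the server process.
  def start_generating(server, prompt, listener) do
    GenServer.cast(server, {:generate, prompt, listener})
  end

  @impl true
  def init(generate_fun), do: {:ok, generate_fun}

  @impl true
  def handle_cast({:generate, prompt, listener}, generate_fun) do
    result = generate_fun.(prompt)
    # A plain send/2 stands in for the PubSub broadcast.
    send(listener, {:image_ready, result})
    {:noreply, generate_fun}
  end
end
```

A caller would start the server, call NotifySketch.start_generating(server, "a dragon", self()), and then wait for an {:image_ready, result} message in a receive block or a handle_info callback.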
I'll skip the small details around resizing, but the download code is simply a get and a File.write:
    ...
    file_path = "/tmp/#{file_name}"
    {:ok, response} = HTTPoison.get(image_url)
    File.write!(file_path, response.body)
    ...
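The snippet above assumes the download always succeeds. A slightly more defensive sketch of the save step could look like the following. It assumes the response has the shape of an HTTPoison response (a :status_code and a :body). The module DownloadSketch and its error tuples are hypothetical:

```elixir
defmodule DownloadSketch do
  # Hypothetical helper. response is any map or struct with :status_code
  # and :body, which is the shape of an HTTPoison response.
  def save(%{status_code: 200, body: body}, file_name) when is_binary(body) do
    # Path.basename/1 drops any directory components from the name, and
    # System.tmp_dir!/0 avoids hardcoding "/tmp".
    file_path = Path.join(System.tmp_dir!(), Path.basename(file_name))

    case File.write(file_path, body) do
      :ok -> {:ok, file_path}
      {:error, reason} -> {:error, {:write_failed, reason}}
    end
  end

  # Any non-200 response is reported instead of being written to disk.
  def save(%{status_code: status}, _file_name) do
    {:error, {:unexpected_status, status}}
  end
end
```

This way an error page from the CDN never ends up on disk and in S3 as if it were a PNG.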
That's really all there is to it. Now go make some images!