Struggles of Running Image Generators: Limited Use and Frustration
AI image generators are quickly gaining popularity. With tools like Midjourney and DALL-E 3, people can generate striking images from nothing more than a text description.
One thing all of these tools have in common is pricing. If you plan on generating images for free, you will be limited by either a free trial or a credit system.
For example, if you are a designer creating image assets, mockups, or design ideas, you will go through a lot of iterations, which gets expensive.
Sooner or later your credits will run out, and you will no longer be able to generate images.
What Technology Do Image Generators Use?
The technology behind image generators like these is the diffusion model. Stable Diffusion is an openly released diffusion model that you can run yourself, and it is the one this article focuses on.
Diffusion models need a lot of computing resources, which makes them expensive to offer to everyone for free.
So I thought: why not run Stable Diffusion on my own resources? That way, I won't need to deal with payments, and I can create as many images as I want.
By the end of the article, I will demonstrate how I was able to run an AI image generator on my own, completely free.
Here you can see that I have given a simple prompt, and I got the image generated in high quality.
Let's see how Stable Diffusion works, so we can start running it on our own.
How the Tech Behind AI Image Generators Works
To learn about how AI image generation works, we need to know about Stable Diffusion.
We can compare it to the physical diffusion we learned about in school.
Picture a clear beaker of water. We add a few drops of dye, and the dye spreads through the liquid until it reaches a state of equilibrium.
Now let's apply the same concept to Stable Diffusion.
To train a Stable Diffusion model, we start with a process called forward diffusion.
In forward diffusion, we take an image and add noise to it.
If noise is unfamiliar, think of the static you see when a TV loses its signal.
The type of noise we add here is called Gaussian noise. The process is repeated many times, so the noise builds up in layers until the image is no longer recognizable.
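To make forward diffusion concrete, here is a minimal sketch in plain Python. It is a toy, not how Stable Diffusion is actually implemented: the "image" is just a short list of pixel values, and the function name `forward_diffusion`, the step count, and the noise strength `beta` are all illustrative assumptions.

```python
import math
import random


def forward_diffusion(pixels, num_steps=10, beta=0.1, seed=0):
    """Return a list of progressively noisier copies of `pixels`.

    Assumption: a fixed noise strength `beta` at every step. Real models
    use a schedule in which the strength changes from step to step.
    """
    rng = random.Random(seed)
    current = list(pixels)
    history = [list(current)]  # history[0] is the clean image
    for _ in range(num_steps):
        # Shrink the signal slightly, then add a layer of Gaussian noise.
        current = [
            math.sqrt(1.0 - beta) * value + math.sqrt(beta) * rng.gauss(0.0, 1.0)
            for value in current
        ]
        history.append(current)
    return history


# A toy "image" with five grayscale pixel values (hypothetical data).
image = [0.0, 0.25, 0.5, 0.75, 1.0]
steps = forward_diffusion(image)

for index in (0, 5, 10):
    print(index, [round(value, 2) for value in steps[index]])
```

Printing a few steps shows the original values gradually drowning in static, which is what happens to a real image during forward diffusion.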
After forward diffusion comes reverse diffusion. This involves removing the Gaussian noise step by step until we get back the original image.
In doing so, the model gradually learns how to predict images from noise.
Forward and reverse diffusion are repeated on millions of images to properly train the model.
Once training is done, we can start from random noise, and the model will predict an image from it.
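The reverse step can be sketched the same way. The toy example below assumes the common setup in which the model is trained to predict the noise that was added. There is no real neural network here: the "untrained" and "perfect" predictions are stand-ins, and the helper names (`add_noise`, `mse`, `remove_noise`) and the `alpha_bar` value are hypothetical.

```python
import math
import random


def add_noise(pixels, alpha_bar, rng):
    """Mix the image with Gaussian noise; alpha_bar is the share of signal kept."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def mse(predicted, target):
    """Mean squared error: the training signal for the noise predictor."""
    return sum((p - t) ** 2 for p, t in zip(predicted, target)) / len(target)


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Undo add_noise using the model's guess of the noise."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(noisy, predicted_noise)
    ]


rng = random.Random(0)
image = [0.0, 0.25, 0.5, 0.75, 1.0]  # toy grayscale pixels (hypothetical data)
noisy, true_noise = add_noise(image, alpha_bar=0.5, rng=rng)

# An untrained model might guess "no noise at all"; training lowers this loss.
untrained_guess = [0.0] * len(image)
loss_untrained = mse(untrained_guess, true_noise)

# A perfect prediction has zero loss and recovers the original image.
loss_perfect = mse(true_noise, true_noise)
recovered = remove_noise(noisy, true_noise, alpha_bar=0.5)
```

The closer the model's guess gets to the true noise, the lower the loss and the closer `recovered` gets to the original image. A real model never predicts the noise perfectly, so it removes it over many small steps instead of one.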
You may wonder: how is the model able to generate images from text prompts?
Images used for training have alt text associated with them, which describes what the image is about.
This way, each image is linked to a piece of text, and the model gradually learns the relationship between the text and the images.
That is, in simple terms, how Stable Diffusion models work. Now let's get Stable Diffusion running on your own machine.
Let's Get Stable Diffusion Running
Continue reading the rest of the article to see how to get Stable Diffusion running on your machine!
Top comments (2)
Pretty solid solution :P
This article would be better titled "how does AI image generation work" since there's no how-to anywhere beyond a blog link. C'mon.