Stable Diffusion

Stable Diffusion was made possible thanks to a collaboration with Stability AI and Runway and builds upon our previous work:

High-Resolution Image Synthesis with Latent Diffusion Models
Robin Rombach*, Andreas Blattmann*, Dominik Lorenz, Patrick Esser, Björn Ommer
CVPR '22 Oral | GitHub | arXiv | Project page

Stable Diffusion is a latent text-to-image diffusion model. Thanks to a generous compute donation from Stability AI and support from LAION, we were able to train a Latent Diffusion Model on 512x512 images from a subset of the LAION-5B database. Similar to Google's Imagen, this model uses a frozen CLIP ViT-L/14 text encoder to condition the model on text prompts. With its 860M UNet and 123M text encoder, the model is relatively lightweight and runs on a GPU with at least 10GB VRAM. See this section below and the model card.
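To put "relatively lightweight" in numbers, here is a rough back-of-the-envelope sketch of how much memory the weights alone would occupy. The parameter counts come from the paragraph above; the bytes-per-parameter values (4 for float32, 2 for float16) are standard sizes, and the helper name `weight_memory_gib` is made up for this illustration.

```python
# Rough estimate of weight storage only. Parameter counts are taken from the
# README text above; everything else here is an illustrative assumption.
UNET_PARAMS = 860_000_000
TEXT_ENCODER_PARAMS = 123_000_000


def weight_memory_gib(num_params: int, bytes_per_param: int) -> float:
    """Return the memory needed to hold num_params weights, in GiB."""
    return num_params * bytes_per_param / 2**30


total = UNET_PARAMS + TEXT_ENCODER_PARAMS

for label, nbytes in (("float32", 4), ("float16", 2)):
    print(f"{label}: {weight_memory_gib(total, nbytes):.2f} GiB for {total:,} parameters")
```

This counts only the UNet and the text encoder. The autoencoder and the activations produced during sampling are not included, so the estimate is a lower bound and not a substitute for the 10GB figure quoted above.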

Requirements

A suitable…

stable_diffusion.openvino

Implementation of Text-To-Image generation using Stable Diffusion on Intel CPU or GPU.

Requirements

Linux, Windows, macOS
Python <= 3.9.0
CPU or GPU compatible with OpenVINO.

Install requirements

Set up pip and upgrade it to the latest version
Install the OpenVINO™ Development Tools 2022.3.0 release from PyPI
Install the remaining requirements

python -m pip install --upgrade pip
pip install "openvino-dev[onnx,pytorch]==2022.3.0"
pip install -r requirements.txt
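After running the commands above, a quick sanity check can confirm that the interpreter and the install match the stated requirements. This is a minimal sketch, not part of the repository: the helper names `python_ok` and `installed_version` are hypothetical, and it assumes the pip distribution is registered under the name `openvino-dev`, as in the install command.

```python
import sys
from importlib import metadata

# Upper bound copied from the requirements list ("Python <= 3.9.0").
MAX_PYTHON = (3, 9, 0)


def python_ok(version_info=sys.version_info, max_version=MAX_PYTHON) -> bool:
    """Return True if the (major, minor, micro) version is within the bound."""
    return tuple(version_info[:3]) <= max_version


def installed_version(dist_name: str):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    print(f"Python {current} within bound: {python_ok()}")
    # "openvino-dev" is assumed to be the distribution name pip registers.
    print(f"openvino-dev version: {installed_version('openvino-dev')}")
```

Read literally, the bound excludes patch releases such as 3.9.1. If the intent is "any 3.9.x or older", comparing only the first two components against `(3, 9)` would be the looser check.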

Generate image from text description

usage: demo.py [-h] [--model MODEL] [--device DEVICE] [--seed SEED] [--beta-start BETA_START] [--beta-end BETA_END] [--beta-schedule BETA_SCHEDULE]
               [--num-inference-steps NUM_INFERENCE_STEPS] [--guidance-scale GUIDANCE_SCALE] [--eta ETA] [--tokenizer TOKENIZER] [--prompt PROMPT] [--params-from PARAMS_FROM]
               [--init-image INIT_IMAGE] [--strength STRENGTH] [--mask MASK] [--output OUTPUT]

optional arguments:
  -h, --help            show this help message and exit
  --model MODEL         model name
  --device DEVICE       inference device [CPU, GPU]
  --seed SEED           random seed for generating consistent images per prompt
  --beta-start BETA_START
                        LMSDiscreteScheduler::beta_start
  --beta-end BETA_END   LMSDiscreteScheduler::beta_end
  --beta-schedule BETA_SCHEDULE
                        LMSDiscreteScheduler::beta_schedule
  --num-inference-steps NUM_INFERENCE_STEPS
                        num inference steps
  --guidance-scale GUIDANCE_SCALE
                        guidance scale
  --eta ETA

…
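The flags listed in the usage text can also be assembled programmatically, which helps when generating several images with different seeds or prompts. The sketch below only builds the argument list. The helper `build_demo_command` is hypothetical, the flag names are the ones shown in the usage output above, and the example prompt and option values are arbitrary placeholders.

```python
import sys
from typing import Dict, List, Optional, Union

OptionValue = Union[str, int, float]


def build_demo_command(
    prompt: str,
    options: Optional[Dict[str, OptionValue]] = None,
    script: str = "demo.py",
) -> List[str]:
    """Build an argument list for demo.py from a prompt and keyword-style options.

    Option names use underscores and are converted to the dashed flag form,
    e.g. num_inference_steps -> --num-inference-steps.
    """
    cmd = [sys.executable, script, "--prompt", prompt]
    for name, value in (options or {}).items():
        cmd.extend(["--" + name.replace("_", "-"), str(value)])
    return cmd


if __name__ == "__main__":
    # Placeholder values chosen for illustration only.
    cmd = build_demo_command(
        "a photo of a red bicycle",
        {"seed": 42, "num_inference_steps": 32, "output": "bicycle.png"},
    )
    print(cmd)
```

From inside the cloned repository, the resulting list could be passed to `subprocess.run(cmd, check=True)`. Passing a list instead of a single shell string avoids quoting problems with prompts that contain spaces.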

Top comments (2)

Mike Ritchie • Sep 2 '22

Just set this up yesterday in fact!

On my Mac it takes about 3 minutes to generate an image. Brand new feature released now allows you to supply a starter image, and it now comes with a web interface to make things easier. Only 512px x 512px at the moment, but arguments for height and width are on his roadmap.

Considering how new this project is, he’s done an amazing job!

0xkoji • Sep 2 '22

Yeah agree!

