Generating images using artificial intelligence has become increasingly popular in recent years, thanks to the advances in deep learning techniques and the availability of large datasets. You may have seen stunning images generated by AI on the internet with Midjourney or OpenAI’s Dall-E where you can enter a text description and the AI will generate an image based on that description. Stable Diffusion by Stability AI is an open source alternative that still manages to generate some pretty amazing images.

Using the model directly can be a bit challenging, and that’s where the Automatic-1111 Web UI comes into play. It’s a web interface that lets you easily generate images using Stable Diffusion, and it comes with some other cool features like the ability to upscale images.

Why not use Dall-E or Midjourney?

While Midjourney and Dall-E are great, they impose some limitations on how you can use the generated images. Both also require a paid plan if you need to use them frequently. Stable Diffusion, on the other hand, can run on your own hardware and is completely free to use.

Hardware requirements

You need to meet at least the following hardware requirements to run the Automatic-1111 Web UI:

  • 16GB RAM
  • GPU with at least 2GB VRAM (NVIDIA GPUs are recommended)
  • Linux, Windows 10+, Mac M1 or M2
  • 10GB free disk space

The better your hardware, the faster the images will be generated. I tested on an NVIDIA GTX 1070 and a Mac M1 Max, and the results were pretty similar (around 10 seconds to generate an image).
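
If you'd rather check the disk and VRAM numbers above from a script, here is a minimal Python sketch. It assumes an NVIDIA GPU with the nvidia-smi tool on your PATH; the function names are mine and are not part of the Web UI.

```python
import shutil
import subprocess


def free_disk_gb(path="."):
    """Return the free disk space at `path` in gigabytes (GiB)."""
    return shutil.disk_usage(path).free / 1024**3


def parse_vram_mib(nvidia_smi_output):
    """Parse one total-memory value (MiB) per GPU from nvidia-smi CSV output."""
    values = []
    for line in nvidia_smi_output.splitlines():
        text = line.strip()
        if text.isdigit():  # skip blank lines and values such as "[N/A]"
            values.append(int(text))
    return values


def query_vram_mib():
    """Ask nvidia-smi for total VRAM per GPU; return [] if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total", "--format=csv,noheader,nounits"],
        capture_output=True,
        text=True,
        check=False,
    )
    return parse_vram_mib(result.stdout) if result.returncode == 0 else []
```

For example, `free_disk_gb(".") >= 10` checks the disk requirement, and `any(v >= 2048 for v in query_vram_mib())` checks for a GPU with at least 2GB of VRAM. An empty list from `query_vram_mib()` only means nvidia-smi was not found, which is expected on a Mac.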

Software requirements

You may have these on your machine already, but if not please make sure you install the following:

Windows

  • Git
  • Python 3.10 (make sure you add the python executable to your PATH)

macOS

Assuming you have Homebrew installed, open a terminal and run the following command:

brew install cmake protobuf rust python@3.10 git wget
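
Before moving on, it can be worth confirming that the tools are actually reachable from your terminal. The sketch below is one way to do that on either platform; the helper names are hypothetical, and the 3.10 check simply mirrors the version mentioned above.

```python
import shutil
import sys


def missing_tools(tools):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def is_python_310(version_info=None):
    """Return True if the given (or the running) interpreter is Python 3.10."""
    major, minor = (version_info or sys.version_info)[:2]
    return (major, minor) == (3, 10)
```

Running `missing_tools(["git", "python"])` should return an empty list once everything is installed, and `is_python_310()` tells you whether the interpreter running the script is a 3.10 release.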

Installation

To install the Automatic-1111 Web UI, you need to open a terminal and run the following commands:

git clone https://github.com/AUTOMATIC1111/stable-diffusion-webui

macOS

cd stable-diffusion-webui
./webui.sh

Windows

If you have a GPU with low VRAM (less than 8GB), you’ll have to open the webui-user.bat file inside the stable-diffusion-webui folder with a text editor (right-click the file and select “Edit”), then look for COMMANDLINE_ARGS= and set it as follows:

set COMMANDLINE_ARGS=--lowvram

You can also try --medvram if you have a GPU with 4GB VRAM.
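
To make the choice between the two flags concrete, here is a small Python sketch. The exact cut-offs are my own reading of the guidance above (below 4GB use --lowvram, from 4GB up to 8GB try --medvram, otherwise no flag), so treat them as an assumption rather than an official rule.

```python
def vram_flag(vram_gb):
    """Pick a memory flag for the given amount of VRAM (assumed thresholds)."""
    if vram_gb < 4:
        return "--lowvram"
    if vram_gb < 8:
        return "--medvram"
    return ""  # 8GB or more: no memory flag needed


def commandline_args_line(vram_gb):
    """Build the line to place in webui-user.bat for this GPU."""
    return f"set COMMANDLINE_ARGS={vram_flag(vram_gb)}"
```

For a 2GB card, `commandline_args_line(2)` produces the same `set COMMANDLINE_ARGS=--lowvram` line shown above.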

Once you’re done, save the file and double-click it to run the Automatic-1111 Web UI.

Downloading a model

There are many models available on Hugging Face, including the original Stable Diffusion models. You may be tempted to pick the latest model (2.1 as of this writing), but it’s not recommended because v1.5 produces better results. Here are some links to the models:

If you choose to download a model other than Stable Diffusion 1.5, all you need to do is copy the model file into the models/Stable-diffusion folder.
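
Copying the file by hand works fine, but if you download models often, a tiny helper can save some clicks. This is only a sketch: the function name is mine and the paths in the usage note are hypothetical.

```python
import shutil
from pathlib import Path


def install_model(model_file, webui_dir):
    """Copy a downloaded model file into <webui_dir>/models/Stable-diffusion."""
    target_dir = Path(webui_dir) / "models" / "Stable-diffusion"
    target_dir.mkdir(parents=True, exist_ok=True)  # the folder normally exists already
    return Path(shutil.copy2(model_file, target_dir))
```

For example, `install_model("Downloads/my-model.safetensors", "stable-diffusion-webui")` copies that (hypothetical) file into place and returns its new path.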

Usage

Once you have the Automatic-1111 Web UI running, you can open your browser and go to http://127.0.0.1:7860 to access the web interface. The Web UI address may be a bit different, so the best way to find it is to look at the terminal where you ran the Automatic-1111 Web UI. The interface should look like:

Automatic-1111 Web UI
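
If the terminal output is long and you can't spot the address, a short Python sketch like the one below can pull it out of the copied text. It simply looks for the first http(s) URL that includes a port; the sample line in the usage note is an assumption about what the output looks like, not a guaranteed format.

```python
import re

# Matches URLs such as http://127.0.0.1:7860 (host followed by a port).
URL_PATTERN = re.compile(r"https?://[\w.\-]+:\d+")


def find_local_url(terminal_output):
    """Return the first URL with a port found in the text, or None."""
    match = URL_PATTERN.search(terminal_output)
    return match.group(0) if match else None
```

Passing in a line such as `"Running on local URL:  http://127.0.0.1:7860"` gives back just the address to paste into your browser.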

Now you can try generating some images by entering a text description and clicking on the “Generate” button. Here’s an example:

Inside a medieval hobbit home, ornate, beautiful, atmosphere, vibe, mist, smoke, chimney, rain, 
wet, pristine, puddles, waterfall, melting, dripping, snow, creek, lush, ice, bridge, soup, loaves, 
green, stained glass, forest, roses, flowers, color page, 4 k, tone mapping, trending on artstation

Automatic-1111 Web UI generated image

Inspiration

Coming up with a good text description is the hardest part in my opinion. Luckily Lexica can help with just that. It’s driven by community contributions and it’s a great source of nice prompts that you can drop into the Automatic-1111 Web UI. Examples also come with parameters to help you get results as close as possible to the original image.

Upscaling

The generated images are usually 512x512 pixels, sometimes a little bit larger, and that’s for a good reason: smaller images are generated faster. However, if you want to use the generated images for other purposes, you may want to upscale them to a higher resolution. The Automatic-1111 Web UI comes with a built-in upscaler that can help with that.

To upscale an image, click the Send to extras button, which opens the Extras panel. Next, select ESRGAN_4x from the Upscaler 1 dropdown, then click the Generate button. The generated image will be upscaled to 2048x2048 pixels and you can download it by right-clicking on it and selecting Save image as....
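
The 2048x2048 figure is just the original size multiplied by the upscaler's factor of 4. A quick sketch of that arithmetic, with a helper name of my own choosing:

```python
def upscaled_size(width, height, factor=4):
    """Return the (width, height) after upscaling by `factor`."""
    return width * factor, height * factor
```

So `upscaled_size(512, 512)` gives `(2048, 2048)`, and a slightly larger 512x768 image would come out at 2048x3072.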

There are other upscalers available, but ESRGAN has worked for me just fine. You can also use the Upscaler 2 dropdown to apply a second upscaler, but I haven’t found it to be very useful.

Conclusion

Stable Diffusion is a great tool for generating images and the Automatic-1111 Web UI makes it even easier to use. If you have any questions or suggestions, please feel free to leave a comment below.