Run Stable Diffusion Using AMD GPU On Windows

Last Updated on October 6, 2022 by Jay

This tutorial will walk through how to run the Stable Diffusion AI software using an AMD GPU on the Windows 10 operating system.

Hardware

AMD Radeon RX 580 with 8GB of video RAM.

CPU and RAM matter much less for this setup; any modern computer should be fine.

0. Download & Install Python & Git

Stable Diffusion is written in Python, so we’ll need to install Python first. We need a 64-bit installation of Python 3.7 or newer. You can download the Python installer from either of the following sources:

Official Python website: https://www.python.org/downloads/

Anaconda Distribution: https://www.anaconda.com/

Python installation is fairly straightforward so we won’t cover that here.

Git is also required; without it, the program will not install properly.

Download Git: https://git-scm.com/downloads

Installing Git is also straightforward. The installer wizard asks a lot of questions; just go with the default for everything.

1. Download Files

1.1 Download Code – Modified Diffusers Library

Download this copy of the diffusers library: https://github.com/harishanand95/diffusers/tree/dml

Note it should be the “dml” branch, not the main branch.

Once downloaded, unzip the content to a folder of your choice.

1.2 Download Onnx Nightly Build Wheel File

Download the wheel file from the link below. A wheel file is a prebuilt Python package that pip can install directly.

https://aiinfra.visualstudio.com/PublicPackages/_artifacts/feed/ORT-Nightly/PyPI/ort-nightly-directml/overview/1.13.0.dev20220908001

Choose the file that matches the Python version on your computer. For example, if you have Python 3.8, download the file with cp38 in its name. If you have Python 3.9, download the cp39 file, and so on.
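
If you are not sure which tag applies, a quick check like the one below prints it for the Python interpreter you run it with. This is only a minimal sketch: the helper name wheel_tag is made up for illustration, and it assumes the usual CPython naming (cp followed by the major and minor version).

```python
import struct
import sys


def wheel_tag(major, minor):
    """Build the CPython tag used in wheel file names, e.g. (3, 9) -> 'cp39'."""
    return f"cp{major}{minor}"


# Tag for the interpreter that runs this script.
print("Wheel tag:", wheel_tag(sys.version_info.major, sys.version_info.minor))

# The win_amd64 wheels need a 64-bit Python; pointers are 8 bytes on 64-bit builds.
print("64-bit Python:", struct.calcsize("P") * 8 == 64)
```

Run it with the same python command you plan to use for the rest of this tutorial, since different installations on one machine can report different tags.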

2. Build Environment

Open up Command Prompt with admin privileges.

Then navigate to the folder where we store the modified diffusers library, which I saved here: C:\Users\jay\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml

The Command Prompt should now be inside this folder.

2.1. Create A Python Virtual Environment

When trying something new, we should always use a Python virtual environment. This helps prevent messing up our current Python environments. To create a virtual environment, type the following:

REM create the virtual environment
python -m venv amd_venv

REM activate the virtual environment (the prompt stays in the current folder)
amd_venv\Scripts\activate
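
To double-check that the virtual environment is really active, you can ask Python itself. The snippet below is a small sketch that relies on the standard behaviour of venv (sys.prefix differs from sys.base_prefix inside a virtual environment); in_virtualenv is a hypothetical helper name.

```python
import sys


def in_virtualenv(prefix, base_prefix):
    """Inside a venv, sys.prefix points at the venv folder, not the base install."""
    return prefix != base_prefix


print("Virtual environment active:", in_virtualenv(sys.prefix, sys.base_prefix))
# Shows which python.exe is running; it should live under amd_venv\Scripts.
print("Interpreter:", sys.executable)
```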

2.2. Install The Modified Diffusers Library

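Make sure the Command Prompt is in the diffusers-dml folder (the one that contains setup.py), then pip install the current directory (i.e. the modified diffusers library) with the -e . argument. This installs the folder as a Python package in editable mode, so Python imports the code directly from this folder.

(amd_venv) C:\Users\jay\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml>pip install -e .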

2.3. Install The Onnx Nightly Build Wheel File

The easiest way is to use the Command Prompt to navigate to the same folder that stores the wheel file. Then type pip install followed by the file name:

pip install ort_nightly_directml-1.13.0.dev20220908001-cp39-cp39-win_amd64.whl
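
To confirm that the DirectML build is the one that got installed, you can ask ONNX Runtime which execution providers it sees. This is a minimal sketch, assuming the wheel installs under the usual onnxruntime module name; has_directml is a hypothetical helper, not part of the library.

```python
def has_directml(providers):
    """Return True if the DirectML execution provider is in the given list."""
    return "DmlExecutionProvider" in providers


if __name__ == "__main__":
    try:
        import onnxruntime as ort
    except ImportError:
        print("onnxruntime is not installed in this environment")
    else:
        providers = ort.get_available_providers()
        print("Available providers:", providers)
        print("DirectML available:", has_directml(providers))
```

If DmlExecutionProvider is missing from the list, a different onnxruntime package may have been installed in this environment instead of the nightly DirectML wheel.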

2.4. Install Other Libraries

We need to install a few more libraries using pip:

pip install transformers ftfy scipy
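
Before moving on, it can save time to verify that everything installed so far is importable from the virtual environment. The sketch below uses only the standard library; missing_modules is a hypothetical helper, and the list of module names is my assumption of what the steps above install.

```python
import importlib.util


def missing_modules(names):
    """Return the module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages installed in steps 2.2 to 2.4.
    required = ["diffusers", "onnxruntime", "transformers", "ftfy", "scipy"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules were found.")
```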

3. Connect to Hugging Face (For Downloading Stable Diffusion Model Weights)

When we installed the diffusers library, another library called huggingface-hub was installed along with it. huggingface-hub provides utility programs that help with downloading the Stable Diffusion models.

Inside Command Prompt, type:

huggingface-cli login

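If this is your first time logging in, you’ll need to provide an access token from Hugging Face. Generate a token on the Hugging Face website, copy it, and paste it into the Command Prompt. The pasted token is not displayed in the window, so just press Enter after pasting. You should then see a login successful message.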

4. Run save_onnx.py

Navigate to the examples\inference folder, where there should be a file named save_onnx.py. This Python script converts the Stable Diffusion model into ONNX files. This step will take a few minutes depending on your CPU speed.

python save_onnx.py
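
Once the script finishes, you can check that the converted files exist. The sketch below assumes the files end up in an onnx subfolder of examples\inference (a reader's traceback further down shows dml_onnx.py loading onnx/unet.onnx); the exact set of files is not something I have verified, and list_onnx_files is a hypothetical helper.

```python
from pathlib import Path


def list_onnx_files(folder):
    """Return the sorted names of all .onnx files directly inside the folder."""
    path = Path(folder)
    if not path.is_dir():
        return []
    return sorted(p.name for p in path.glob("*.onnx"))


if __name__ == "__main__":
    # Assumed output folder, relative to examples\inference.
    files = list_onnx_files("onnx")
    if files:
        print("Converted model files:", ", ".join(files))
    else:
        print("No .onnx files found; save_onnx.py may not have completed.")
```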

This concludes the environment setup for running Stable Diffusion on an AMD GPU on Windows.

5. Run Stable Diffusion using AMD GPU on Windows

Inside the same examples\inference folder, we’ll find another file named dml_onnx.py. This is the script for running Stable Diffusion; run it with python dml_onnx.py.

Inside this file, we can modify inputs such as prompt, image size, inference steps, etc.
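
A word of caution on image size: Stable Diffusion works on a latent image that is 8 times smaller than the output in each dimension, and the converted ONNX model appears to expect the size it was exported with (a reader error further down reports "Got: 96 Expected: 64" after switching to 768×1024). The sketch below only does that arithmetic; latent_size is a hypothetical helper, and the factor of 8 is an assumption based on the v1 model.

```python
def latent_size(height, width, factor=8):
    """Return the (height, width) of the latent tensor for a given image size."""
    if height % factor or width % factor:
        raise ValueError(f"height and width must be multiples of {factor}")
    return height // factor, width // factor


print(latent_size(512, 512))   # default size -> (64, 64)
print(latent_size(768, 1024))  # the size from the reader report -> (96, 128)
```

If the numbers for your chosen size differ from (64, 64), the model most likely needs to be converted again with save_onnx.py using the same height and width before dml_onnx.py will accept them.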

I used all default settings with this prompt (copied from Reddit):

background dark, block houses, eastern Europe, city highly detailed oil painting, unreal 5 render,
rhads, bruce pennington, studio ghibli, tim hildebrandt, digital art, octane render, beautiful composition,
trending on artstation, award-winning photograph, masterpiece

And this is the image it generated: [image: output for the prompt above]

Final Thoughts

This experiment was done using an AMD RX 580 GPU with 8GB of VRAM. This GPU is supposed to be on par with the Nvidia GTX 1070/1080. However, I noticed that Stable Diffusion runs significantly slower on AMD. I was getting around 7-8 seconds per iteration, so with the default 50 inference steps (50 × 7-8 seconds = 350-400 seconds), generating one 512×512 image takes roughly 6-7 minutes. A comparable image would take

Additional Resources

How to Run Stable Diffusion on Windows

How to Run Stable Diffusion Without A Graphic Card (GPU)

37 comments

  1. It would be great if someone found a way for AMD users to run the web UIs of other versions as well, since they are so nice and easy to handle, but not many have tried it yet. I tried to run it via Docker or an Ubuntu VM, where you first have to set up GPU passthrough, which failed miserably in the end. This is kind of disheartening.

    1. The problem with AMD is that there’s no support from either PyTorch or TensorFlow, which are the two main neural network Python libraries. The original SD model/code is written with the PyTorch framework, so you can’t run it directly on AMD and Windows. ONNX is a workaround, but it is very slow. On Linux we can use AMD and ROCm, but it’s just too much work to install a new operating system just for this software…

      There are only two GPU chipmakers in the world, we definitely would like to see support for AMD in the AI world.

      1. Using the current build (August 2023) of Automatic1111 for both, I was able to get SD running on the GPU on the first try under Windows 11. Under Ubuntu it absolutely does not see the GPU at all. It is an RX6600 and is only slightly faster at this than the RX580 as described in the article. The system has a Ryzen 5 3600 and 48GB RAM, and under Linux, running on the CPU, it only takes maybe 50% longer than the GPU to do each image. Getting this to work did require --lowvram, --precision full --no-half, and --no-half-vae. This is an 8GB card but every time I try to generate a single 512×512 image, VRAM usage goes all the way to the top and stays there until it’s done. Nothing bigger than 512×512 works, nor batches of greater than 1.

        I have heard that people have done things like compile a new custom Linux kernel version with a special driver version that has ROCm support baked in, and that people compile a custom version of pytorch to get the ROCm support. In this specific instance it does not seem to make nearly enough difference to make all the research and all the work worthwhile.

        I hear people say that it can be made to work somewhat more easily under Arch, but given the extreme complexity and difficulty of getting an Arch install to work at all, the 50%, maybe, if that, performance delta doesn’t justify dealing with the unending screaming flailing white-knuckle frustration nightmare that is Arch. (Yes, I know. “Skill issue.” I got the computer to use to do stuff I want to do. “Stuff I want to do” does not include compiling my own custom kernel, developing my own custom libraries, or writing my own custom drivers. Time spent learning to do all that is time not spent using SD to take scenes from Lord of the Rings and replace all the actors with Danny DeVito)

        Under Windows I have also tried Shark (freezes and fails with “resource allocation” errors during installation) and SDnext (absolutely cannot see GPU, runs on CPU only, only about a third slower than Automatic1111 on the GPU).

        At this point I am getting really frustrated and am giving serious consideration to getting a used RTX card on Craigslist, or on eBay. Maybe even an old Quadro, though I am not 100% sure which ones are supported and which ones aren’t.

  2. I am trying to run this, but no matter what I do, when I run the save_onnx file I get a lot of Connection aborted errors, and I am stuck. Any help would be appreciated, thanks!

  3. Hi. I have a question. I am a person who does not understand code, so I am asking for help with such a problem:
    Traceback (most recent call last):
    File “C:\Users\yevgeniy\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml\src\diffusers\configuration_utils.py”, line 197, in get_config_dict
    config_dict = cls._dict_from_json_file(config_file)
    File “C:\Users\yevgeniy\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml\src\diffusers\configuration_utils.py”, line 236, in _dict_from_json_file
    return json.loads(text)
    File “C:\Users\yevgeniy\AppData\Local\Programs\Python\Python310\lib\json\__init__.py”, line 346, in loads
    return _default_decoder.decode(s)
    File “C:\Users\yevgeniy\AppData\Local\Programs\Python\Python310\lib\json\decoder.py”, line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
    File “C:\Users\yevgeniy\AppData\Local\Programs\Python\Python310\lib\json\decoder.py”, line 355, in raw_decode
    raise JSONDecodeError(“Expecting value”, s, err.value) from None
    json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

    During handling of the above exception, another exception occurred:

    Traceback (most recent call last):
    File “C:\Users\yevgeniy\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml\examples\inference\save_onnx.py”, line 16, in
    pipe = StableDiffusionPipeline.from_pretrained(“CompVis/stable-diffusion-v1-4”, scheduler=lms, use_auth_token=True)
    File “C:\Users\yevgeniy\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml\src\diffusers\pipeline_utils.py”, line 167, in from_pretrained
    config_dict = cls.get_config_dict(cached_folder)
    File “C:\Users\yevgeniy\Desktop\PythonInOffice\stable_diffusion_amd\diffusers-dml\src\diffusers\configuration_utils.py”, line 199, in get_config_dict
    raise EnvironmentError(f"It looks like the config file at '{config_file}' is not a valid JSON file.")
    OSError: It looks like the config file at 'C:\Users\yevgeniy/.cache\huggingface\diffusers\models--CompVis--stable-diffusion-v1-4\snapshots\52b46db8e14744892bb7ee014fc1cbb8c408643f\model_index.json' is not a valid JSON file.

    What should I do? Is there any chance that it will not work on my PC anyway? I am using a Radeon RX 5600 XT, the latest version of Python and the onnx cp310 wheel (other versions don’t work).

  4. Thank you for this! It’s my first time messing around with python and your guide was super helpful!

    I didn’t know that giving the same prompts would get you the same results.
    Even changing it slightly (“eastern europe” to “western europe”) gives the same image, but with a different type of door and slightly different background. Does it depend on the other parameters in the code or is it all in the prompts?

    1. Yes, there’s a line “torch.manual_seed(42)” that sets a seed for the random generator. If you remove that line, it should start generating different images even with the same prompt.

  5. I’m getting stuck at the Hugging Face token – I generate a token (read), copy it, and when I try to paste it into the command prompt, nothing happens. Pressing enter results in this error:

    Traceback (most recent call last):
    File “C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\runpy.py”, line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
    File “C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.10_3.10.2288.0_x64__qbz5n2kfra8p0\lib\runpy.py”, line 86, in _run_code
    exec(code, run_globals)
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\Scripts\huggingface-cli.exe\__main__.py”, line 7, in
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\lib\site-packages\huggingface_hub\commands\huggingface_cli.py”, line 45, in main
    service.run()
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\lib\site-packages\huggingface_hub\commands\user.py”, line 149, in run
    _login(self._api, token=token)
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\lib\site-packages\huggingface_hub\commands\user.py”, line 319, in _login
    raise ValueError(“Invalid token passed!”)
    ValueError: Invalid token passed!

    Please advise! Thanks for trying to help people get this set up.

      1. If you use the copy token button on the Hugging Face website once you’ve generated a new token, you can get it to work. After clicking the copy button, go to your command prompt, click in the window to the right of the Token prompt, and choose Edit, Paste from the upper left-hand corner menu of the command box, then hit the Enter button. It does not look like anything was pasted, but it finally worked for me when I did it this way.

  6. I have gotten this to work, but I can’t get it to work if I change the dimensions in the save_onnx and dml_onnx files.

    For instance, when I tried to make a 768h x 1024w image, I changed line 66 in save_onnx to:
    convert_to_onnx(pipe.unet, pipe.vae.post_quant_conv, pipe.vae.decoder, text_encoder, height=768, width=1024)

    and the prompt (line 210) in dml_onnx to:
    image = pipe(prompt, height=768, width=1024, num_inference_steps=50, guidance_scale=7.5, eta=0.0, execution_provider="DmlExecutionProvider")["sample"][0]

    and when I ran dml_onnx from Command Prompt, it gave me the following error:

    Traceback (most recent call last):
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\examples\inference\dml_onnx.py”, line 210, in
    image = pipe(prompt, height=768, width=1024, num_inference_steps=50, guidance_scale=7.5, eta=0.0, execution_provider=”DmlExecutionProvider”)[“sample”][0]
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\lib\site-packages\torch\autograd\grad_mode.py”, line 27, in decorate_context
    return func(*args, **kwargs)
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\examples\inference\dml_onnx.py”, line 167, in __call__
    noise_pred = unet_sess.run(None, inp)[0]
    File “C:\Users\danmo\OneDrive\Desktop\stable_diffusion_amd\diffusers-dml\amd_venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py”, line 200, in run
    return self._sess.run(output_names, input_feed, run_options)
    onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Got invalid dimensions for input: latent_model_input for the following indices
    index: 2 Got: 96 Expected: 64
    index: 3 Got: 128 Expected: 64
    Please fix either the inputs or the model.

  7. I did as the article says, but I got this error output:
    (amd_venv) D:\novel AI\diffusers-dml\examples\inference>python dml_onnx.py
    Fetching 19 files: 100%|██████████████████████████████████████████████████████████████| 19/19 [00:00<00:00, 826.00it/s]
    Traceback (most recent call last):
    File "D:\novel AI\diffusers-dml\examples\inference\dml_onnx.py", line 210, in
    image = pipe(prompt, height=512, width=512, num_inference_steps=50, guidance_scale=7.5, eta=0.0, execution_provider=”DmlExecutionProvider”)[“sample”][0]
    File “D:\novel AI\diffusers-dml\amd_venv\lib\site-packages\torch\autograd\grad_mode.py”, line 27, in decorate_context
    return func(*args, **kwargs)
    File “D:\novel AI\diffusers-dml\examples\inference\dml_onnx.py”, line 73, in __call__
    unet_sess = ort.InferenceSession(“onnx/unet.onnx”, so, providers=[ep])
    File “D:\novel AI\diffusers-dml\amd_venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py”, line 335, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
    File “D:\novel AI\diffusers-dml\amd_venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py”, line 381, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
    onnxruntime.capi.onnxruntime_pybind11_state.Fail
    Can anyone help me?

  8. Hello, I’m sorry to bother you. After the environment was fully prepared, I tried running dml_onnx.py, and it reported this error:
    “onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException”
    I will put the complete error information at the end. This sentence seems to be the core information.
    I don’t quite understand what should be done. After searching the Internet, I found that the error information seems not comprehensive. Maybe you have a solution to this problem. I would be grateful.

    All error messages:
    2022-11-10 19:25:24.0274182 [E:onnxruntime:, inference_session.cc:1484 onnxruntime::InferenceSession::Initialize::::operator ()] Exception during initialization: D:\a\_work\1\s\onnxruntime\core\providers\dml\DmlExecutionProvider\src\BucketizedBufferAllocator.cpp(122)\onnxruntime_pybind11_state.pyd!00007FFF8F771218: (caller: 00007FFF8FDC9F76) Exception(1) tid(1320) 887A0005 GPU ?
    Traceback (most recent call last):
    File “E:\AI\diffusers-dml\diffusers-dml\examples\inference\dml_onnx.py”, line 210, in
    image = pipe(prompt, height=512, width=512, num_inference_steps=50, guidance_scale=7.5, eta=0.0, execution_provider=”DmlExecutionProvider”)[“sample”][0]
    File “E:\AI\diffusers-dml\diffusers-dml\amd_venv\lib\site-packages\torch\autograd\grad_mode.py”, line 27, in decorate_context
    return func(*args, **kwargs)
    File “E:\AI\diffusers-dml\diffusers-dml\examples\inference\dml_onnx.py”, line 73, in __call__
    unet_sess = ort.InferenceSession(“onnx/unet.onnx”, so, providers=[ep])
    File “E:\AI\diffusers-dml\diffusers-dml\amd_venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py”, line 347, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
    File “E:\AI\diffusers-dml\diffusers-dml\amd_venv\lib\site-packages\onnxruntime\capi\onnxruntime_inference_collection.py”, line 395, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
    onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException

  9. When I run save_onnx.py, I get output like this:
    Traceback (most recent call last):
    File “C:\novel AI\AMD\diffusers\examples\inference\save_onnx.py”, line 16, in
    pipe = StableDiffusionPipeline.from_pretrained(“CompVis/stable-diffusion-v1-4”, scheduler=lms, use_auth_token=True)
    File “C:\novel AI\AMD\diffusers\src\diffusers\pipeline_utils.py”, line 240, in from_pretrained
    load_method = getattr(class_obj, load_method_name)
    TypeError: getattr(): attribute name must be string

    I tried to run it again, but it didn’t work.
    Did I download the wrong file?
    I can’t do anything because I don’t know Python. Can anyone help me?

    1. Run this command
      pip install diffusers==0.8.0 transformers scipy ftfy
      Then retry the last command you ran.
      You probably have version 0.3.0 or earlier.

      1. RuntimeError: Encountering a dict at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module’s inputs. Consider using a constant container instead (e.g. for `list`, use a `tuple` instead. for `dict`, use a `NamedTuple` instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.

  10. \diffusers-dml\examples\inference\dml_onnx.py”, line 28, in __init__
    scheduler = scheduler.set_format(format)
    AttributeError: ‘LMSDiscreteScheduler’ object has no attribute ‘set_format’

    how to fix?

  11. Great, but it’s changed a bit. To enter your token you now have to modify ..\amd_venv\Lib\site-packages\huggingface_hub\_login.py

  12. So….Absolutely ZERO support here?!? There is a huge desire to use this pipeline but there is no support to fix the errors that are occurring. HELP!!!

    Traceback (most recent call last):
    File “E:\AI\diffusers-dml\examples\inference\save_onnx.py”, line 16, in
    pipe = StableDiffusionPipeline.from_pretrained(“CompVis/stable-diffusion-v1-4”, scheduler=lms, use_auth_token=True)
    File “E:\AI\diffusers-dml\src\diffusers\pipeline_utils.py”, line 240, in from_pretrained
    load_method = getattr(class_obj, load_method_name)
    TypeError: getattr(): attribute name must be string

  13. Hello,

    I got the base model working but now I would like to try and load some custom models to this setup to experiment with.

    Can anyone point me towards a good tutorial that explains how to do it?

    Thank you!

    Best,

    Ono

  14. Hi,

    File “E:\AI\diffusers-dml\examples\inference\save_onnx.py”, line 16, in
    pipe = StableDiffusionPipeline.from_pretrained(“CompVis/stable-diffusion-v1-4”, scheduler=lms, use_auth_token=True)
    File “E:\AI\diffusers-dml\src\diffusers\pipeline_utils.py”, line 240, in from_pretrained
    load_method = getattr(class_obj, load_method_name)
    TypeError: getattr(): attribute name must be string

    tried downloading process but always got this error, please help…. 🙂
