
SDXL Lightning ComfyUI: Speed and Quality Combined

Intro

SDXL-Lightning is one of the latest text-to-image generation models, known for its lightning-fast speed and relatively high-quality results. It uses a technique called Progressive Adversarial Diffusion Distillation to generate high-resolution (1024px) images in just a few steps. The model has been open-sourced as part of the accompanying research, and it offers various configurations and checkpoints for different numbers of inference steps.

The models were released by ByteDance, the parent company of TikTok, so they don't come directly from Stability AI! SDXL Lightning seems to give better results than SDXL Turbo and LCM models, especially since it has been optimized for 1024x1024px images. In this article, I will show you how to use SDXL Lightning in ComfyUI in a few of its variants.

Overview of SDXL-Lightning

SDXL-Lightning offers models distilled from the stabilityai/stable-diffusion-xl-base-1.0 repository. These include 1-step, 2-step, 4-step, and 8-step distilled models, each providing different levels of generation quality. This means that you can generate high-quality 1024px images with a reduced number of steps, making the generation much faster.

Key Features

  • Fast Generation: Capable of generating high-quality images swiftly.
  • Progressive Adversarial Diffusion Distillation: Utilizes advanced techniques for efficient text-to-image generation.
  • Open Source: The model and checkpoints are available for public use and experimentation.

SDXL Lightning in ComfyUI

You can find all the model weights ready to be downloaded on ByteDance/SDXL-Lightning, where the model card also gives a comprehensive overview of each file. There are several options: LoRAs, UNets, and full base-model checkpoints, usable with 1, 2, 4, or 8 steps. I will focus on the 4-step base model and the 2-step LoRA in the next sections. Let's see how good these models are!
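If you prefer scripting things, the huggingface_hub library can enumerate everything in the repository. A quick sketch (the repo id is the one linked above):

```python
from huggingface_hub import list_repo_files

# Print every checkpoint file available in the SDXL-Lightning repo
for f in list_repo_files("ByteDance/SDXL-Lightning"):
    if f.endswith(".safetensors"):
        print(f)
```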

Lightning 4 steps

For my first experiment, I downloaded the sdxl_lightning_4step.safetensors file into the ComfyUI/models/checkpoints/ folder and then used the default ComfyUI workflow to get started.
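If you'd rather not download the file by hand, hf_hub_download can place it directly into the checkpoints folder. This is a sketch; the local_dir path is an assumption and should point at your own ComfyUI install:

```python
from huggingface_hub import hf_hub_download

# Download the 4-step full checkpoint straight into ComfyUI's checkpoints folder
hf_hub_download(
    repo_id="ByteDance/SDXL-Lightning",
    filename="sdxl_lightning_4step.safetensors",
    local_dir="ComfyUI/models/checkpoints",  # assumption: adjust to your install path
)
```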

[Image: the 4-step SDXL Lightning checkpoint in the default ComfyUI workflow]

Then, we need to set the CFG scale to 1.0 and the scheduler to sgm_uniform (the model card also recommends the euler sampler). Generation is fast and the quality is better than SDXL Turbo: with 12GB of VRAM, generations are indeed very fast, and this is with the 4-step version.
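For readers outside ComfyUI, here is a minimal diffusers sketch of the same setup, adapted from the usage examples in the model card. The prompt and output filename are placeholders; guidance_scale=0 corresponds to CFG 1.0, and timestep_spacing="trailing" on the Euler scheduler is the rough counterpart of sgm_uniform:

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler

# Load the full 4-step checkpoint downloaded earlier
pipe = StableDiffusionXLPipeline.from_single_file(
    "ComfyUI/models/checkpoints/sdxl_lightning_4step.safetensors",
    torch_dtype=torch.float16,
).to("cuda")

# Lightning checkpoints expect trailing timestep spacing and no CFG
pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

image = pipe(
    "a cinematic photo of a lighthouse at dusk",  # placeholder prompt
    num_inference_steps=4,
    guidance_scale=0,
).images[0]
image.save("lightning_4step.png")
```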

Using IPAdapters

IPAdapters are a type of prompt modifier that uses an input image to transfer some of its features into the final image, combining the input image with the usual text prompt. You can read more in this IPAdapter tutorial.

I decided to try IPAdapters with these Lightning models and uploaded the workflow to flowt.ai. I noticed that by default the strength is really high in this case, almost reproducing the original image as output. So I tweaked the Apply IPAdapter node, increasing start_at to 0.0001, and the output became more balanced, taking into account not only the IPAdapter input image but also my text prompt.

Using an input image of mushrooms and typing “New York City” in the text prompt gave me a nice combination of the two elements. Similarly, using an image of a cathedral for the input image and “a floating jellyfish” as a text prompt resulted in a satisfying outcome.
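As a point of comparison, diffusers ships a similar IP-Adapter mechanism. The sketch below is my own rough equivalent, not the ComfyUI workflow itself; the reference image path is a placeholder, and the scale value loosely plays the role of the node's strength:

```python
from diffusers.utils import load_image

# Reuse the Lightning pipeline from the previous snippet and attach an IP-Adapter
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
)
pipe.set_ip_adapter_scale(0.6)  # lower values give the text prompt more influence

reference = load_image("mushrooms.png")  # placeholder reference image
image = pipe(
    "New York City",
    ip_adapter_image=reference,
    num_inference_steps=4,
    guidance_scale=0,
).images[0]
```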

In this case, the quality of the images seems to be much better, with great detail and sharpness, all achieved with only 4 steps.

Lightning LORAs

We can also use these Lightning models as LoRAs, turning any SDXL base model into a 2-step model that generates images very quickly with many of the existing checkpoints already available. I downloaded sdxl_lightning_2step_lora.safetensors, which should allow us to generate images in only 2 steps.

[Image: SDXL Lightning ComfyUI workflow with the 2-step LoRA]

In this workflow, I'm using a base SDXL model named realisticStockPhoto_v10.safetensors and applying the 2-step LoRA with the corresponding LoRA loader node. Unfortunately, I think the quality is somewhat diminished compared to the outputs we would get from the original base model alone. However, considering how fast the generation is, the trade-off seems acceptable.
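In diffusers terms, the model card suggests an equivalent that looks roughly like this sketch. The official SDXL base repo stands in for whatever fine-tuned checkpoint you prefer, and the prompt is a placeholder:

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerDiscreteScheduler
from huggingface_hub import hf_hub_download

# Any SDXL base model works here; the official base acts as a stand-in
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

# Load and fuse the 2-step Lightning LoRA
pipe.load_lora_weights(
    hf_hub_download("ByteDance/SDXL-Lightning", "sdxl_lightning_2step_lora.safetensors")
)
pipe.fuse_lora()

pipe.scheduler = EulerDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)

image = pipe(
    "a realistic stock photo of a mountain lake",  # placeholder prompt
    num_inference_steps=2,
    guidance_scale=0,
).images[0]
```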

Conclusion

Lately, numerous image generation models have been published, making it challenging to keep up. SDXL-Lightning, Stable Cascade, and Stable Diffusion 3 (though not publicly available yet) were all revealed within a short span of time. This rapid progress in open-source and uncensored AI image generation poses intriguing challenges to closed-source alternatives like DALL·E and Midjourney.
