
Discover AnimateLCM for Faster AI Animations

A model named AnimateLCM can now generate animations much faster, with as few as 5 steps on a consumer GPU. Creating videos with Stable Diffusion is a fun pastime, except for one thing: generating good animations requires far more computational power than single AI images. Let’s see how to speed up AI animations with AnimateLCM and how it works in ComfyUI.

AnimateLCM is Fast

We have been using Latent Consistency Models (LCMs) to generate Stable Diffusion images with very few steps – in some cases almost in real time – but this was not yet possible for animations and videos made with AnimateDiff, which remain time-consuming on a less powerful GPU. AnimateDiff is a diffusion model capable of generating short videos from a text prompt or from an input video used as a reference.

AnimateLCM introduces a novel method to generate high-quality videos in a similar manner, but faster. Going a bit technical (you can read the paper here), it improves efficiency and quality by separating the training process into two distinct phases: one for image generation and the other for motion. This decoupled consistency learning approach significantly improves training efficiency and yields good video generation with a reduced number of inference steps.
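For a bit of intuition about what “consistency learning” means here: a consistency model trains a network f_θ to map any point on the same probability-flow ODE trajectory back to the same clean sample, so that very few evaluations are enough at inference time. A simplified sketch of the generic consistency distillation objective – taken from the consistency-model literature, not the exact AnimateLCM formulation, which applies it separately to the image and motion components – looks like this:

$$f_\theta(x_t, t) \approx f_\theta(x_{t'}, t') \quad \text{for all } t, t' \text{ on the same PF-ODE trajectory}$$

$$\mathcal{L}_{\mathrm{CD}} = \mathbb{E}\big[\, d\big(f_\theta(x_{t_{n+1}}, t_{n+1}),\; f_{\theta^{-}}(\hat{x}_{t_n}, t_n)\big) \big]$$

where $\hat{x}_{t_n}$ is one ODE-solver step of $x_{t_{n+1}}$ computed with the frozen teacher diffusion model, $\theta^{-}$ is an EMA copy of the student weights, and $d(\cdot,\cdot)$ is a distance metric.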

Moreover, as with AnimateDiff, additional components such as ControlNets or IPAdapters can be easily integrated into the model to control various aspects of the generated videos. This ensures compatibility with existing tools and workflows without compromising video generation speed.

In short, AnimateLCM lets you generate videos with Stable Diffusion using a low number of steps (5-10), compared to standard AnimateDiff (20+). So I decided to try it out as soon as the code was made available. You can use AnimateLCM both in ComfyUI and in the Stable-Diffusion-WebUI; I will be using the former for this tutorial.

ComfyUI Workflow

To get started, you can find a possible workflow here on flowt.ai, which relies on a set of custom nodes that you need to install through the ComfyUI-Manager. I rearranged and recolored the nodes to make the components clearer, but the workflow should look something like the picture below:

[Image: ComfyUI workflow for AnimateLCM]

The yellow nodes are for AnimateDiff, while the light blue ones are for the advanced LCM nodes. Let’s go a bit more into the details.

Custom Nodes

The most important custom nodes in this workflow are the AnimateDiff-Evolved nodes and the advanced LCM sampler nodes mentioned above.

If you have already installed the AnimateDiff-Evolved nodes, be sure to update them using the ComfyUI-Manager. Once all the red nodes are gone, the workflow is ready, and we can download the weights and place them in the proper folders.

AnimateLCM Models

The weights for AnimateLCM can be found here on HuggingFace: you will notice two files, and you should download both. One is the main checkpoint (sd15_t2v_beta.ckpt, which goes in models/checkpoints/), and the other is used as a LoRA (sd15_lora_beta.safetensors), which should be placed in the models/loras folder. As usual, you will also need a base Stable Diffusion 1.5 model and a VAE.
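If you prefer fetching the files from a script rather than the browser, a minimal sketch with huggingface_hub could look like the one below. The repository id (wangfuyun/AnimateLCM), the exact filenames and the ./ComfyUI path are assumptions – check the model page for the current names and adjust the paths to your installation:

```python
# Sketch: download the AnimateLCM weights into the ComfyUI model folders.
# Repo id, filenames and the ./ComfyUI path are assumptions -- adjust them to your setup.
from huggingface_hub import hf_hub_download

REPO_ID = "wangfuyun/AnimateLCM"

# Main checkpoint -> models/checkpoints/
hf_hub_download(
    repo_id=REPO_ID,
    filename="sd15_t2v_beta.ckpt",
    local_dir="ComfyUI/models/checkpoints",
)

# LCM LoRA -> models/loras/
hf_hub_download(
    repo_id=REPO_ID,
    filename="sd15_lora_beta.safetensors",
    local_dir="ComfyUI/models/loras",
)
```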

Nodes Highlights

Be sure to select the AnimateLCM LoRA you just downloaded in the LoRA loader. As the base model, you can choose anything that works with SD1.5.

The empty latent image node is used to set the size of the image and, more interestingly in the case of animations, the batch_size, which corresponds to the number of frames that will make up the final video. The higher the batch size, the smoother the animation will be, thanks to the additional frames.

We can set a batch size of 120 to achieve a good animation in a reasonable amount of time (approximately 280 seconds with 12GB of VRAM). For testing purposes, you can also reduce the batch size, but I noticed that using 30-50 frames makes the animation much more static, with a less smooth transition between frames.

The LCMScheduler is where the number of steps is set, and it’s where the LCM models come into play: set it as low as 4 to achieve decent animations, or 8-10 for better results in just a few seconds! Do not forget to keep the CFG Scale low in the SamplerCustom node: for LCMs, a value around 2.0 is enough.
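If you want to reproduce the same low-step, low-CFG recipe outside ComfyUI, here is a minimal sketch using the diffusers AnimateDiff pipeline with an LCM scheduler. The model ids (the wangfuyun/AnimateLCM repo and emilianJR/epiCRealism as an SD1.5 base), the LoRA filename and the adapter strength are assumptions to adapt to whatever you actually downloaded:

```python
# Sketch: AnimateLCM via diffusers (an alternative to the ComfyUI workflow, not part of it).
# Model ids and the LoRA filename are assumptions -- swap in what you actually downloaded.
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter

# AnimateLCM motion module + an SD1.5 base model
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM", torch_dtype=torch.float16)
pipe = AnimateDiffPipeline.from_pretrained(
    "emilianJR/epiCRealism",             # any SD1.5 checkpoint works here
    motion_adapter=adapter,
    torch_dtype=torch.float16,
)

# LCM scheduler: this is what makes the 4-10 step range usable
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")

# The AnimateLCM LoRA file mentioned above
pipe.load_lora_weights(
    "wangfuyun/AnimateLCM",
    weight_name="sd15_lora_beta.safetensors",
    adapter_name="lcm-lora",
)
pipe.set_adapters(["lcm-lora"], [0.8])

pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()

result = pipe(
    prompt="art photo, intricate, an old tree with leaves, beautiful underwater scene",
    num_frames=16,                       # plays the role of ComfyUI's batch_size
    num_inference_steps=8,               # the LCM sweet spot discussed below
    guidance_scale=2.0,                  # keep CFG low for LCM
    generator=torch.Generator("cpu").manual_seed(42),
)
frames = result.frames[0]                # list of PIL images
```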

Finally, use the output node to view your animation and adjust parameters such as the frame rate, compression (CRF), and format (MP4 or GIF, for example).
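As a scripted counterpart to the output node, and continuing from the diffusers sketch above, the generated frames can be saved with the helper functions in diffusers.utils (note that, unlike the ComfyUI node, these helpers only expose a frame rate, not a CRF setting):

```python
# Continuing from the previous sketch: `frames` is the list of PIL images it produced.
from diffusers.utils import export_to_gif, export_to_video

export_to_gif(frames, "animatelcm_tree.gif", fps=8)    # GIF output
export_to_video(frames, "animatelcm_tree.mp4", fps=8)  # MP4 output
```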

Comparison by Steps

Let’s run a quick experiment: keep the seed and the prompt the same, but vary the number of steps to find the minimum number required to generate a decent animation.

The prompt used for these animations is:

art photo, intricate, (a old tree with leaves), ultra-detailed, sharp details, cg 8k wallpaper, lighting, beautiful underwater scene.

Steps   Time
2       27.16s
4       46s
8       94.5s
12      143.07s

Using a 3060 Nvidia GPU – 12GB of VRAM (the generated video for each step count is omitted here).

The sweet spot appears to be between 8 and 12 steps, which is a significant improvement compared to using 20 steps or more, especially when considering animations. It’s important to note that resolution can also affect quality: in this instance, the animations are 512×512 pixels, but with a larger size, you might achieve better results with fewer steps.
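If you want to rerun this kind of comparison yourself, the only requirement is keeping the seed and prompt fixed while sweeping the step count – in ComfyUI that simply means changing the steps value in the scheduler node between runs. As a rough scripted equivalent, continuing from the diffusers sketch above (`pipe` and `export_to_gif` come from there), a sweep could look like:

```python
# Rough sketch: same prompt and seed, different step counts, reusing `pipe` from above.
import time

import torch

prompt = ("art photo, intricate, (a old tree with leaves), ultra-detailed, "
          "sharp details, cg 8k wallpaper, lighting, beautiful underwater scene")

for steps in (2, 4, 8, 12):
    start = time.time()
    result = pipe(
        prompt=prompt,
        num_frames=16,
        num_inference_steps=steps,
        guidance_scale=2.0,                                 # keep CFG low for LCM
        generator=torch.Generator("cpu").manual_seed(42),   # fixed seed across runs
    )
    export_to_gif(result.frames[0], f"tree_{steps}_steps.gif", fps=8)
    print(f"{steps} steps: {time.time() - start:.1f}s")
```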

The original AnimateLCM repository is here if you wish to have a look at their project page or demo online.


Leave a comment if you need help with this workflow!
