
AnimateDiff to Create Amazing Animations With ComfyUI: A Full Guide

Here, I am sharing a new tutorial, this time on generating animations using Stable Diffusion, AnimateDiff (V1, V2, and V3), and ComfyUI. I will start with the most basic process and then gradually introduce additional functionalities that offer better control over the generated animations, including prompt traveling and ControlNet.


Introduction

To follow along, you’ll need to install ComfyUI, a node-based interface used to run Stable Diffusion models, and the ComfyUI Manager (optional but recommended). In this guide, I will demonstrate the basics of AnimateDiff and the most common techniques to generate various types of animations.

Firstly, download an AnimateDiff motion model from this Huggingface repository. For these examples, I will use the latest V3 version. Look for the files named ‘v3_sd15_mm.ckpt’ (V3) and ‘mm_sd_v15_v2.ckpt’ (V2) (mm stands for Motion Module), as well as ‘temporaldiff-v1-animatediff’, which is used later in this guide.
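If you prefer to script the downloads, here is a minimal sketch using the huggingface_hub Python package. The repository id guoyww/animatediff and the destination folder are assumptions on my part; point them at the repository linked above and at your own ComfyUI install.

```python
# Minimal sketch: download AnimateDiff motion modules with huggingface_hub.
# Assumptions: the files live in the "guoyww/animatediff" repo and your
# AnimateDiff-Evolved models folder is at the path below -- adjust both if needed.
from huggingface_hub import hf_hub_download

MODELS_DIR = r"ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\models"

for filename in ["v3_sd15_mm.ckpt", "mm_sd_v15_v2.ckpt"]:
    path = hf_hub_download(
        repo_id="guoyww/animatediff",  # assumed repository id
        filename=filename,
        local_dir=MODELS_DIR,
    )
    print("Saved to", path)
```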

Finally, you will need several custom nodes, with the most crucial one being the AnimateDiff-Evolved node. You can install this node using the ComfyUI Manager.

Basic Example

Starting from the default workflow provided by ComfyUI, we can load and use the AnimateDiff motion modules with only one additional node: just right-click > Add Node > AnimateDiff EV > AnimateDiff Loader. Otherwise, load a simple workflow ready to be used, like this one; if you see any red boxes, don’t forget to install the missing custom nodes, again using the ComfyUI Manager.

Now that the nodes are all installed, double-check that the motion modules for AnimateDiff are in the following folder: ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\models. You can switch between the V2 and V3 motion modules directly from the node; select mm_sd_v15_v2.ckpt for V2.

beta_schedule should be set to sqrt_linear (AnimateDiff), and apply_v2_models_properly should be set to true whenever you use a V2 motion module.

motion_scale affects the amount of motion generated by the motion model: less than 1 means less motion; greater than 1 means more motion.

Moreover, check the batch_size in the Empty Latent Image node to get an animation; I set it to 16 in this example. This is the number of frames in the final animation, so the higher the batch size, the longer the animation will be (but it will also take longer to generate!).
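To get a feel for the resulting clip length, here is a quick back-of-the-envelope calculation; the 8 fps output rate is just an assumption (a common default when saving the animation), so substitute whatever frame rate your save node uses.

```python
# batch_size equals the number of generated frames, so the clip duration is
# simply frames / fps. The 8 fps value is an assumed output frame rate.
batch_size = 16
fps = 8
print(f"{batch_size} frames at {fps} fps ≈ {batch_size / fps:.1f} seconds of animation")
```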

The rest of the workflow is a standard Stable Diffusion workflow: loading a checkpoint, adding a prompt and a negative prompt, and using a KSampler to generate the image. Choose a prompt and, depending on the number of frames, a simple animation should be generated after a while.

prompt: 1man, windy and sunny, looking at the camera

Motion Loras

Motion LoRAs give you control over camera movements, like zooming in and out, panning left and right, and so on. There is a LoRA for each of the most basic camera movements. These LoRAs work well with AnimateDiff V2.

You can find more camera controls in the GitHub repository of AnimateDiff or on other sites like CivitAI and HuggingFace.

motion loras list models download

Download any LoRA that you would like to try and put it in the following folder: ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\motion_lora.

By adding a couple of nodes, we can update the workflow like this (it can be downloaded from OpenArt if you didn’t already do so in the previous step):

In particular, there is an additional AnimateDiff LoRA Loader node where we can choose the camera movement to apply.

Keep the context_length in the Uniform Context Options node at 16; it can also be set to 24 or 32 depending on the loaded motion model.

I tried out 4 movements: ZoomOut, ZoomIn, TiltUp and PanRight.

The quality here is not great, but I wanted to show how the LoRAs give us good control over the camera movements.

Prompt Traveling

Prompt Traveling is a technique for generating smooth, continuous animations that transition between different scenes. It allows us to introduce unique prompts at different points in the animation. This technique provides granular control, enabling variations in color, size, or trajectory to make the final animation more dynamic according to our instructions.

In other words, we can specify for each frame a different prompt, meaning that we can create smooth transitions between different concepts. A useful ComfyUI node is available, named Batch Prompt Schedule. The workflow for Prompt Traveling can be found here. If you see a red node, again use the ComfyUI Manager to install it. Here is how this powerful node works:

prompt traveling node comfyui

The first cell of the node is where the prompt is written, following a specific structure:

"number":"prompt", where the number corresponds to the position of the frame. Referring to the prompt in the previous image, it means that the frame number 0 will start with “a tree during spring,” then from frame number 8, it will have “a tree during spring, sunny,” until the last frame, number 24. The final animation should blend between these four concepts.

To avoid repeating the parts of the prompt that we want to keep in every frame, we can use the pre_text and app_text fields, which respectively add text before or after each frame prompt. So, we could use “a tree during” as pre_text and only add the modifiers in the list of frame prompts defined earlier.
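Putting it together, a schedule in the spirit of the screenshot could look like the snippet below; the exact keyframes and wording beyond frames 0 and 8 are made up for illustration, and pre_text carries the shared part of the prompt.

```
pre_text: a tree during

"0"  : "spring",
"8"  : "spring, sunny",
"16" : "summer, green leaves",
"24" : "winter, snow"
```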

The full workflow looks like this:

Prompt traveling comfyui workflow

To get the best results, prompt traveling works better with a higher number of frames, minimum 24; otherwise the transition will be difficult to spot. Moreover, the AnimateDiff motion module might give different results, so I downloaded mm_sd_v14.ckpt to try out this technique, as I find it more dynamic.

prompt travelling frames

Let’s run this example in ComfyUI. I will use 24 frames for this animation. You can see how there is a transition from the first prompt, with a blue background, to the last one, where I emphasise the concepts of fire and a red background.

comfyui prompt travel

We can see that we have more control over the overall animation, at least in terms of styles and elements, but everything is still a bit messy and not fully under our control. Trying many different prompts might still give you good results.

Upscale

These results so far have been useful to understand how AnimateDiff works, but the output quality is not great yet. However, we can try to improve these results, for example by increasing the resolution of the latent image or by using upscaling techniques, similar to what is possible with a single image.

Increase the resolution of the base empty image

Let’s focus on upscaling. There are various methods to achieve this in ComfyUI, but the concept is to integrate an upscaling node or workflow immediately after the image output. For simplicity, I will utilize a single node called CR Upscale Image. You can find it by using the ComfyUI Manager.

comfyui upscale animatediff

In this way, an upscaled animation is generated. You can also try the Ultimate SD Upscale node or the Upscale Latent By node.

AnimateDiff and Controlnet

ControlNet and AnimateDiff go hand in hand to add consistency to the movements in the final animation.

Using OpenPose, we can detect the movements of people in a video to get much more consistency. You can use any type of ControlNet: openpose, scribble, depth, lineart, etc. In this example, I will use lineart and openpose.

We need to add some nodes to our workflow in order to use ControlNet.

For this, I made a few modifications to the “txt2img w/ Initial ControlNet input (using OpenPose images) + latent upscale w/ full denoise” workflow that can be found at the end of the AnimateDiff repository. In my workflow, I added a variant of ControlNet OpenPose, which is used to detect the pose from a video; you can find it on OpenArt.

animatediff openpose comfyui workflow

This is the ControlNet part, where I choose a video, set the max frames that I want to load, and detect the pose using ControlNet.

Control Net Openpose comfyui

As an example, I chose a short vertical video as input; then, using the OpenPose ControlNet, we can detect the basic movements and create a new animation that has much more consistency. The example also includes another step to upscale the animation.
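For intuition, the pose-extraction step can be sketched outside ComfyUI with plain Python; the snippet below uses OpenCV to read the video frames and the controlnet_aux package to turn each frame into an OpenPose skeleton image. The file names are placeholders, and this is only an illustration of what the OpenPose preprocessing does, not part of the actual workflow.

```python
# Illustrative only: extract frames from a video and run OpenPose detection,
# mimicking what the OpenPose preprocessor does inside the ComfyUI workflow.
# "input.mp4" and the output file names are placeholders.
import cv2
from PIL import Image
from controlnet_aux import OpenposeDetector

detector = OpenposeDetector.from_pretrained("lllyasviel/Annotators")

cap = cv2.VideoCapture("input.mp4")
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)   # OpenCV loads BGR; the detector expects RGB
    pose = detector(Image.fromarray(rgb))          # PIL image of the detected skeleton
    pose.save(f"pose_{frame_idx:04d}.png")         # these images drive the ControlNet
    frame_idx += 1
cap.release()
```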

For this workflow I used temporaldiff-v1-animatediff.safetensors, an improved model based on AnimateDiff, but you can experiment with any other variant. You can also combine it with Prompt Travel by simply adding the Batch Prompt Schedule node instead of the simple prompt node.

Using IPAdapters

AnimateDiff also works well with IPAdapters, a set of models that allow us to easily influence the output by providing an input image. The style, elements, and overall feel of the source image will be present in the final animation. Download this workflow from OpenArt.

For this example, I will be using the V3 motion module, as well as the LoRA v3_sd15_adapter.ckpt, which you can download from the same AnimateDiff repository and put in the LoRA folder (ComfyUI\models\loras).

ipadapter animatediff workflow

This is the basic IPAdapter flow, with the input image acting as an instant LoRA to affect the animation. The most important parameters here are in the Apply IPAdapter node, in particular weight and noise. Increase or reduce the weight to give the image more or less influence on the result.
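As a side note, the same idea (a reference image steering the animation through IP-Adapter, with a weight controlling its influence) can be sketched outside ComfyUI with the diffusers library. This is not the article’s workflow, just a rough equivalent, and the model and repository ids below are assumptions that you may need to adjust.

```python
# Rough diffusers equivalent of AnimateDiff + IP-Adapter (not the ComfyUI workflow).
# Model/repo ids and file paths below are assumptions; adjust them as needed.
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import load_image, export_to_gif

adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2")
pipe = AnimateDiffPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")

pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipe.set_ip_adapter_scale(0.6)  # roughly the "weight": higher = reference image matters more

reference = load_image("reference.png")  # placeholder path for the reference/style image
result = pipe(
    prompt="1man, windy and sunny, looking at the camera",
    ip_adapter_image=reference,
    num_frames=16,
    guidance_scale=7.5,
)
export_to_gif(result.frames[0], "animation.gif")
```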

You don’t necessarily need to use prompt travel; just replace the green Batch Prompt Schedule node with a simple CLIP Text Encode to use a single prompt.

Note how smooth the animation looks, albeit rather short, even without using any ControlNet.

Final Workflow

Finally, I will share a workflow that combines all the components we have seen so far: AnimateDiff, Prompt Traveling, ControlNet (with OpenPose and Depth), IPAdapters, and Upscaling. I have attempted to keep the workflow as simple as possible, clearly indicating the positions of the ControlNet and IPAdapter nodes.

Download the complete workflow, as usual, on my OpenArt.

Check your input video parameters

Always check the “Load Video (Upload)” node to set the proper number of frames for your input video: frame_load_cap sets the maximum number of frames to extract, skip_first_frames is self-explanatory, and select_every_nth reduces the number of frames (for example, a value of 2 keeps every other frame of the source video).

Use Empty Latent Image to adjust the height and width of your animation, depending also on your input video and the upscale factors.

ControlNet OpenPose and Depth ComfyUI
Two ControlNets

In this workflow, two types of ControlNets are used together: OpenPose, which we saw earlier, and Depth, to better capture depth and shadows from the frames.

Finally, do not forget to update your prompts and adjust them depending on the number of frames that you would like to generate.

Animation with ControlNet and IPAdapter

One more thing: AnimateDiff can be improved also with Face Detailer, a technique used to restore or improve faces. If you’d like to know how, check the article!

Conclusion

AnimateDiff is a powerful technique to generate even complex animations with the right workflows. It’s a bit challenging to get consistent animations, but with the help of ControlNet things get more stable. AnimateDiff can also be used with Stable Diffusion XL and LCM models, with the right motion modules.

More improvements could be made to increase consistency and reduce artifacts, such as face and hand adjustments or adding more ControlNets.


AnimateDiff repository.
