
Image Interpolation With Stable Diffusion: Steerable Motion Walkthrough

Recently, I stumbled upon a great ComfyUI workflow on OpenArt that allows you to create animations from multiple input images using the Steerable Motion custom node. In this article, I will walk you through the workflow so you can understand how it operates and use it to interpolate images and generate animations.

Frames from image interpolation

Introduction

Image interpolation is neither new nor specific to Stable Diffusion: it is a technique for generating new pixels in an image by estimating their values from the surrounding pixels. It is most often used when resizing or resampling an image, where the number of pixels changes, but it can also be applied between frames, creating a smooth transition by generating the in-between pixels.
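To make the idea concrete, here is a minimal sketch of both uses in Python. It is my own illustration, not part of the workflow: it upscales an image with bilinear interpolation using Pillow, then produces naive in-between frames by cross-fading two same-sized frames (the filenames are placeholders).

```python
# Minimal illustration of pixel interpolation (not from the workflow).
from PIL import Image
import numpy as np

# 1) Resizing: new pixel values are estimated from neighbouring pixels (bilinear here).
small = Image.open("input.png")
upscaled = small.resize((small.width * 2, small.height * 2), Image.BILINEAR)
upscaled.save("upscaled.png")

# 2) Naive frame "interpolation": cross-fade two frames of the same size by
#    weighting their pixels. Real frame interpolation also estimates motion.
a = np.asarray(Image.open("frame_a.png"), dtype=np.float32)
b = np.asarray(Image.open("frame_b.png"), dtype=np.float32)
for i, t in enumerate((0.25, 0.5, 0.75)):
    mid = (1.0 - t) * a + t * b
    Image.fromarray(mid.astype(np.uint8)).save(f"between_{i}.png")
```

Steerable Motion goes far beyond this kind of pixel blending, but the goal is the same: produce plausible in-between content from what surrounds it.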

Steerable Motion, a ComfyUI custom node, can be considered an application of the popular AnimateDiff (a motion model used to create animations from text or input videos) combined with ControlNet and IPAdapter. Specifically, it creatively interpolates between input frames, even when they are very different from each other, to create dramatic transitions.

Walkthrough

Let’s explore how to use this technique in ComfyUI. If you’re interested in getting started with this UI or with Stable Diffusion in general, refer to this basic tutorial.

Workflow Download


First, download the Steerable Motion Workflow from OpenArt (credits to the author!) and load it in ComfyUI. You might encounter some red nodes, which means their custom nodes are missing. The solution is simple: use the ComfyUI Manager to install the missing custom nodes.
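If you prefer to skip the Manager, a manual install usually works as well: clone the custom node repository into ComfyUI’s custom_nodes folder and restart ComfyUI. Here is a small sketch of that step; the ComfyUI path and the repository URL (taken from the GitHub project mentioned below) are assumptions you should adapt to your setup.

```python
# Hypothetical manual install of the Steerable Motion custom node.
import subprocess
from pathlib import Path

comfyui_root = Path("ComfyUI")  # adjust to wherever your ComfyUI lives
repo_url = "https://github.com/banodoco/steerable-motion"  # the repo mentioned below

subprocess.run(
    ["git", "clone", repo_url],
    cwd=comfyui_root / "custom_nodes",
    check=True,
)
# Restart ComfyUI afterwards so the new node gets loaded.
```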

The key custom node for this workflow is Steerable Motion by banodoco on GitHub. You can find additional information in the README, along with another workflow example.

Models Download

Once all the nodes are set, you will need to download some models and put them in the right folders. If you hit “Queue Prompt” immediately, ComfyUI will show you a bunch of errors about models not being found. This is actually useful: close the error and check the nodes marked with a red border; these are the ones that failed, and where you need to select the correct models.

Make sure you have a Stable Diffusion v1.5 checkpoint (have a look on civit.ai) and the CLIP Vision encoder in the ComfyUI/models/clip_vision/SD1.5/ folder. Note that the CLIP Vision model might throw an error on the Batch Creative Interpolation node; if it doesn’t work, you might need a different one (I am using the .safetensors format).

Then you will need to download ip-adapter-plus_sd15.safetensors (or ip-adapter-plus_face depending on your input images) for the IPAdapter: put it in the ComfyUI\custom_nodes\ComfyUI_IPAdapter_plus\models\ folder.

ipadapter image interpolation

ControlNet: also download control_v11f1e_sd15_tile_fp16.safetensors and place it in the ComfyUI\models\controlnet\ folder; this model is needed for the Batch Creative Interpolation node. The original workflow uses control_v11e_sd15_ip2p.pth instead, so feel free to experiment.

batch creative interpolation node

Regarding the AnimateDiff component, download the base motion module v3_sd15_mm.ckpt into ComfyUI\custom_nodes\ComfyUI-AnimateDiff-Evolved\models and v3_sparsectrl_rgb.ckpt into ComfyUI\models\controlnet\.

You can also add any LoRA you prefer; I used the AnimateDiff LoRA v3_sd15_adapter.ckpt.


If you want to improve the transitions between the frames of the final animation, you can use the STMFNet VFI node, but it requires an additional model, stmfnet.pth, placed in ComfyUI\custom_nodes\ComfyUI-Frame-Interpolation\ckpts\stmfnet. If you want to skip this, just bypass the node and link the image output from KSampler ADV to Video Combine.
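Since missing or misplaced files only show up as red-bordered nodes when you queue the prompt, it can help to check everything in advance. The script below is just a helper of my own: the folder layout and filenames follow the steps above (the controlnet folder for the tile model is my assumption), so adjust them to your install and to the exact checkpoints you downloaded.

```python
# Pre-flight check (my own helper, not part of the workflow): verify that the
# models from this section are where ComfyUI expects them.
from pathlib import Path

comfyui = Path("ComfyUI")  # root of your ComfyUI install

# Folders that just need to contain something you downloaded.
folders = [
    comfyui / "models/checkpoints",        # any Stable Diffusion v1.5 checkpoint
    comfyui / "models/clip_vision/SD1.5",  # CLIP Vision encoder
]

# Specific files named in this walkthrough (the last one is only needed for STMFNet VFI).
files = [
    comfyui / "custom_nodes/ComfyUI_IPAdapter_plus/models/ip-adapter-plus_sd15.safetensors",
    comfyui / "models/controlnet/control_v11f1e_sd15_tile_fp16.safetensors",  # assumed location
    comfyui / "custom_nodes/ComfyUI-AnimateDiff-Evolved/models/v3_sd15_mm.ckpt",
    comfyui / "models/controlnet/v3_sparsectrl_rgb.ckpt",
    comfyui / "custom_nodes/ComfyUI-Frame-Interpolation/ckpts/stmfnet/stmfnet.pth",
]

for folder in folders:
    ok = folder.is_dir() and any(folder.iterdir())
    print(f"{'OK     ' if ok else 'MISSING'} {folder} (should contain at least one model)")

for file in files:
    print(f"{'OK     ' if file.is_file() else 'MISSING'} {file}")
```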

Input images

After all the nodes are set correctly and the checkpoints are in the right folders, the last step is to choose your input images. You can use two or more images; simply add more Input Images nodes by copy-pasting them. Generating the animation will take some time, depending on your specs: on my RTX 3060 with 12 GB of VRAM, it took 193 seconds with two images (528x528px) and 11 minutes with four.

Note that all the images must have the same height and width; otherwise, the workflow will give you an error. If yours don’t, you can normalize them with a small script like the one below.
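This is a hypothetical helper of my own (not part of the workflow) that center-crops and resizes every image in a folder to 528x528, the resolution I used; the folder names are placeholders.

```python
# Hypothetical pre-processing helper: make all input images the same size.
from pathlib import Path
from PIL import Image, ImageOps

TARGET = (528, 528)                    # width, height every image should share
src = Path("input_images")             # your original images
dst = Path("input_images_resized")     # normalized copies go here
dst.mkdir(exist_ok=True)

for img_path in sorted(src.glob("*.png")) + sorted(src.glob("*.jpg")):
    img = Image.open(img_path).convert("RGB")
    # ImageOps.fit crops to the target aspect ratio, then resizes.
    ImageOps.fit(img, TARGET, method=Image.LANCZOS).save(dst / img_path.name)
```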

four images for interpolation

Feel free to experiment with the content of the images: more coherent or similar images will produce a smoother animation, which helps if you are trying to tell a story, but very different images can still create interesting transitions.

Start Interpolation

Once all the models are in the right place and the input images are selected, you are ready to generate your animation. You can also add a prompt and a negative prompt if you want to steer the details. After that, hit “Queue Prompt” and wait; remember, depending on your GPU, it might take a while, especially with four images.

You might have noticed a graph appearing in the workflow after the generation starts. You can read more about it in the author’s notes, but in short, these graphs help you predict how the animation will behave even before the images are generated.

They show the intensity of each transition and how much weight each image has over time. You will need to experiment to understand what works best with your original images, adjusting the strength of the IPAdapter and ControlNet.


An interesting parameter to tweak is linear_key_frame_influence_value. During my tests, I noticed that a smaller value, around 0.50-0.80, makes the interpolation between each image longer, adding a sort of blending effect to the transition. On the other hand, a value above 1 adds a stronger overlap between the input images, blending more quickly and apparently with fewer artifacts. You will see how the lines in the graph change accordingly.

linear_key_frame_influence_value explanation
Images blend differently with different values
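To build some intuition for that parameter, here is a toy sketch of my own. It is not the node’s actual code: it simply assumes that each key frame contributes a triangular weight curve over the output frames and that linear_key_frame_influence_value scales how far that curve extends, so you can print the per-frame weights for a low and a high value and see how much neighbouring images overlap.

```python
# Toy intuition sketch only -- NOT the Steerable Motion implementation.
import numpy as np

def keyframe_weights(num_frames: int, num_keys: int, influence: float) -> np.ndarray:
    """Return a (num_keys, num_frames) matrix of normalized per-frame weights."""
    centers = np.linspace(0, num_frames - 1, num_keys)   # where each key frame "peaks"
    spacing = (num_frames - 1) / max(num_keys - 1, 1)
    half_width = spacing * influence                     # wider curve => more overlap
    t = np.arange(num_frames)[None, :]
    w = np.clip(1.0 - np.abs(t - centers[:, None]) / half_width, 0.0, None)
    return w / (w.sum(axis=0, keepdims=True) + 1e-8)     # normalize each output frame

# Compare a low and a high value, much like the graph in the workflow does.
for influence in (0.6, 1.2):
    print(f"influence = {influence}")
    print(keyframe_weights(num_frames=12, num_keys=3, influence=influence).round(2))
```

Playing with the two values shows how each image’s weight curve stretches or shrinks; the graph drawn by the node gives you the same kind of preview for its real curves.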

Going back to the workflow, in the node towards the end you will be able to see the generated frames of your animation before they are merged into a video:

Results

At the end of the process, you should see a video that interpolates your initial images. The type of transition will differ depending on the parameters you choose in the Batch Creative Interpolation node.

steerable motion graph and output

Modifying the parameters in the Batch Creative Interpolation node will give you different results. You can make the transitions between frames longer or shorter, or give more weight to certain images during the interpolation. The author provides many useful notes inside the workflow itself, which make it easier to experiment with the parameters.

steerable motion animation example

If that workflow does not work for you, you can also try this one that I uploaded on OpenArt, which is very similar to the creative_interpolation_example.json from the Steerable Motion GitHub, except for the way you add images: you just paste the path of a folder, and the workflow will pick up all the images in there. The parameters are also different, so you might get different results, which is useful for comparisons.

This is a nice application of the latest AI models based on Stable Diffusion, combined with well-established techniques like image interpolation, producing great results with a simple workflow and the Steerable Motion custom node.


If you find any issues with the workflow, please let me know in the comments!

2 thoughts on “Image Interpolation With Stable Diffusion: Steerable Motion Walkthrough”

  1. Good stuff, but I wish you would explain the intuition of how each node works so that we can build or iterate on the workflow, all this stuff is common sense if you play around with it.

  2. Thank you for the tutorial! I’m running the workflow on comfy through pinokio on a Mac M1 and I get this error message, any idea what I’m doing wrong?
    Error occurred when executing BatchCreativeInterpolation:

    Currently, AutocastCPU only support Bfloat16 as the autocast_cpu_dtype

    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/execution.py”, line 152, in recursive_execute
    output_data, output_ui = get_output_data(obj, input_data_all)
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/execution.py”, line 82, in get_output_data
    return_values = map_node_over_list(obj, input_data_all, obj.FUNCTION, allow_interrupt=True)
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/execution.py”, line 75, in map_node_over_list
    results.append(getattr(obj, func)(**slice_dict(input_data_all, i)))
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/custom_nodes/steerable-motion/SteerableMotion.py”, line 427, in combined_function
    embed, = ipadapter_encoder.preprocess(clip_vision, prepped_image, True, 0.0, 1.0)
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/custom_nodes/steerable-motion/imports/IPAdapterPlus.py”, line 718, in preprocess
    clip_embed_zeroed = zeroed_hidden_states(clip_vision, image.shape[0])
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/custom_nodes/steerable-motion/imports/IPAdapterPlus.py”, line 169, in zeroed_hidden_states
    with precision_scope(comfy.model_management.get_autocast_device(clip_vision.load_device), torch.float32):
    File “/Users/ruschmeyer/pinokio/api/comfyui.pinokio.git/ComfyUI/env/lib/python3.10/site-packages/torch/amp/autocast_mode.py”, line 329, in enter
    torch.set_autocast_cpu_dtype(self.fast_dtype) # type: ignore[arg-type]
