Stable Diffusion VS DALLE – A Comparison with Prompts and Examples

Hey everyone! This article compares two big players in the world of AI image generation: Stable Diffusion, the community-powered hero backed by Stability AI, and DALLE, OpenAI’s powerful, closed-source model available through ChatGPT.

Let me break it down for you. Stable Diffusion? It’s like the ultimate DIY project. It’s open-source, which means it’s out there for anyone and everyone. Tinker, tweak, and transform – the sky’s the limit. It’s like having an art studio where the walls are made of glass, and everyone’s invited to peek and play.

Now, flip the coin, and you’ve got DALLE, available through ChatGPT. This one’s the James Bond of image generators: powerful, sophisticated, and a bit secretive. Closed-source means it keeps its cards close to its chest, powered by OpenAI’s top-secret sauce. But boy, does it deliver some stunning visuals!

And here’s the real kicker: how well do these two interpret your prompt instructions? That’s what we’re about to find out. It’s not just about pumping out pretty pictures. It’s about understanding your prompt, reading between the lines, and turning those words into visuals exactly as you want them.

Next, we’re lining up Stable Diffusion and DALLE side by side to see how they handle the same prompts.

A bit of Context

I’m using ChatGPT-4 and Stable Diffusion XL (SDXL) to create some awesome images for a fun comparison. To make things easier, I’m running SDXL through Fooocus, a very easy-to-use tool that helps get the best out of it. Fooocus makes a few tweaks here and there to smooth things out, but I think that’s fair game, since DALLE gets a boost from ChatGPT’s well-crafted instructions.

I’m also using Image Prompt Crafter, a Custom GPT that I made to come up with quick and creative prompts. It’s been a huge help for generating ideas for Stable Diffusion, Midjourney, and DALLE images. So, the cool images you’re about to see? That’s where the inspiration came from.

Prompt Challenge

No more talking now, let’s start the challenge: left picture is DALLE, right one is Stable Diffusion XL.

Wizard dog

Realistic close-up portrait of a dog wearing a wizard hat, high detail, photo-realism, 8K resolution, natural lighting, focus on facial expressions, dynamic texture of fur and fabric, magical ambiance.

DALLE
SDXL

Cherry trees

A melancholic landscape on a grassy hill in Japan, featuring cherry trees in full bloom, soft lighting, overcast sky, delicate pink cherry blossoms gently falling, traditional Japanese elements subtly integrated, serene and reflective mood, in the style of a detailed digital painting, high resolution

DALLE
SDXL

ChatGPT definitely went strong with the melancholic vibes, maybe a bit too dramatic.


Pixel Art

a pixel art style image featuring palm trees on a beach during sunset. vibrant hues of the setting sun reflecting on the water, shadows of palm trees across the beach. pixelated aesthetic, with clear, blocky shapes and a limited color palette typical of retro video games. serene and nostalgic, classic 8-bit graphics. sharp pixel edges

A pixel art image generated by ChatGPT, palm trees
DALLE
A pixel art image generated by Stable Diffusion
SDXL

Fantasy castle

digital fantasy art illustration of a majestic castle, soaring towers, glowing windows, surrounded by mystical forests and a shimmering moat, ethereal sky, moonlit night, vibrant colors, detailed architecture, intricate stonework

A fantastic castle generated by DALLE
DALLE
Impressive castle at night by SDXL
SDXL

Cyberpunk superhero

superhero in cyberpunk futuristic city, neon-lit skyscrapers, reflective wet streets, advanced tech armor, dynamic pose on rooftop, vivid holographic displays, bustling urban landscape, night scene, high-tech gadgets, vivid contrasts

Cyberpunk digital art image by DALLE-3
DALLE
SDXL

Realistic portrait

ultra-realistic portrait of a man, direct gaze at the camera, sharp facial features, natural skin tones, blurred bokeh background, high-definition, professional photography, soft natural lighting, lifelike texture, detailed expression

DALLE
A Realistic portrait of a man by SDXL
SDXL

Well, I think ChatGPT’s DALLE missed the point with this one: realism is completely missing.


Fisherman comics

comic anime style, young guy fishing, cozy lake in a valley, vibrant colors, exaggerated expressions, clear blue water, lush green hills, peaceful setting, warm sunlight, detailed fishing gear, playful art style, dynamic composition

An AI-generated comic image of a fisherman
DALLE
SDXL

Carnivorous plant

massive carnivorous plant towering over a city, about to devour a skyscraper, enormous tendrils and sharp teeth, ominous clouds, dramatic stormy sky, sense of impending doom, intense colors, surreal scale, detailed urban landscape

DALLE
SDXL

Here, on the other hand, SDXL missed the mark. When the prompt includes specific elements and actions, DALLE tends to be more accurate and follows the instructions better.


Travel poster

vintage travel poster for Rome, iconic Colosseum and ancient ruins, vibrant sunset, Vespa scooters, quaint streets, rich colors, stylish font, 'Visit Rome - The Eternal City', artistic, nostalgic feel, famous landmarks, inviting atmosphere

A travel poster of Rome generated by ChatGPT
DALLE
SDXL

Here, DALLE shows what it can do by generating an image with perfectly written text on it. SDXL usually cannot generate coherent or readable text (at least not out of the box), but it still produced an image with a nice aesthetic.


Conclusion and Resources

That’s it! Now you have a better idea of the capabilities of Stable Diffusion XL and DALLE-3 (via ChatGPT-4). These examples are not cherry-picked, so there is room for improvement, for example by adjusting the prompts, providing example images, or using additional models (for SDXL only).

You can run Stable Diffusion locally with several tools, such as ComfyUI, stable-diffusion-webui from AUTOMATIC1111, and Fooocus, the UI that I mentioned at the beginning. If your laptop is not powerful enough, there are free and paid hosted options, for example directly on the Stability AI website. To experiment with DALLE-3, you can try Bing Image Creator for free, albeit with some limitations. To get the most out of it, I used a ChatGPT Plus subscription. If you have one too, you can use the Image Prompt Crafter Custom GPT to find inspiration for your prompts!
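
If you prefer scripting over a UI, the same side-by-side test can be roughly sketched in Python. To be clear, this is not how the images in this article were made (those came from Fooocus and ChatGPT). The sketch assumes the Hugging Face `diffusers` library with a CUDA GPU for SDXL, and the official `openai` package with an `OPENAI_API_KEY` environment variable for DALLE-3. The `output_path` helper and its file-naming scheme are made up for illustration.

```python
import re

# Two prompts copied from the challenge above; add the others as needed.
PROMPTS = {
    "Wizard dog": (
        "Realistic close-up portrait of a dog wearing a wizard hat, high detail, "
        "photo-realism, 8K resolution, natural lighting, focus on facial expressions, "
        "dynamic texture of fur and fabric, magical ambiance."
    ),
    "Fantasy castle": (
        "digital fantasy art illustration of a majestic castle, soaring towers, "
        "glowing windows, surrounded by mystical forests and a shimmering moat, "
        "ethereal sky, moonlit night, vibrant colors, detailed architecture, "
        "intricate stonework"
    ),
}


def output_path(title: str, model: str) -> str:
    """Build a file name like 'wizard_dog_sdxl.png' (hypothetical naming scheme)."""
    slug = re.sub(r"[^a-z0-9]+", "_", title.lower()).strip("_")
    return f"{slug}_{model}.png"


def generate_sdxl(prompt: str, path: str) -> None:
    """Render one image locally with SDXL and save it (assumes a CUDA GPU)."""
    # Heavy imports live inside the function so the helpers above work without them.
    import torch
    from diffusers import StableDiffusionXLPipeline

    # Loading the pipeline on every call is slow; acceptable for a one-off sketch.
    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    pipe(prompt=prompt).images[0].save(path)


def generate_dalle3(prompt: str) -> str:
    """Request one DALLE-3 image from the OpenAI API and return its URL."""
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.images.generate(
        model="dall-e-3", prompt=prompt, size="1024x1024", n=1
    )
    return response.data[0].url
```

Calling `generate_sdxl(PROMPTS["Wizard dog"], output_path("Wizard dog", "sdxl"))` would write `wizard_dog_sdxl.png`, while `generate_dalle3` returns a link to the generated image. This is raw SDXL without any of the tweaks Fooocus applies, so expect results that differ from the pictures above.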

One thought on “Stable Diffusion VS DALLE – A Comparison with Prompts and Examples”

  1. Were the SD images generated using Fooocus’ “Prompt Expansion” or with raw prompts? That can make a huge difference in the output. Supposedly Prompt Expansion attempts to mimic the under-the-hood prompt alterations of DALL-E and MidJourney.
