Is playing around with Stable Diffusion and talking about it making my blog titles more unhinged than usual? Possibly.

A warning: this post is going to have a lot of large images and pseudo-bizarre pictures. There’s also going to be quite a bit of scantily clad and risque imagery. That’s just the nature of the story I’m yarning.

So one of the earliest pages on Stable Diffusion I read after installing it was a Reddit post covering a bunch of fantasy illustrations of Emma Watson. They’re gorgeous. The images are risque enough that the page’s NSFW status is debatable. And sure, we can wax poetic about ethics issues regarding likeness rights, especially of celebrities. I wouldn’t disagree, we sure could, but this post won’t be that conversation. This, instead, is going to be about a rabbit hole of ever-changing image themes that led me to my current favorite generated image.

All About That Base

Something I’ve been thinking of is the idea of base images. This is the idea of using image-to-image, with the intention of making unguided variations – or even something completely unrecognizable from the input. But on the flip side I’ve also been thinking about how to guide the process and get more control, even if that means manipulating the image in Photoshop. Is that cheating? Like using scissors in origami? Meh…

At the start of this process, I took one of those Reddit pictures and tried to see what would happen if I told Stable Diffusion to keep it as fantasy-styled art of Emma Watson. Basically, tell it to generate what’s already in the input image.

emma watson, sexy, heroic movie shot, cinematic, realistic, photo-realistic

Also, as a warning, using the term ‘sexy’ in the prompt is a very dangerous dice roll for getting back NSFW content. Not specifically for this image set, but with Stable Diffusion in general. Often you don’t get sexy-sexy, but more repulsive-sexy.

So we have a wicked dragon ring background, and a version with a really thin stomach. All with neat shimmering and glowing gold backgrounds. And throughout all of them, not a decent left hand.

The Inner Macabre In Me

I started adding some themes to the prompt, specifically some “stuff-of-nightmares” phrases. Another way to phrase it would be, “I went full-metal.” And things fell apart. I got back some stuff-of-nightmare results, but more from discombobulation and haphazard composition.

Oh! Did I forget to mention things were falling apart?

The prompt was changed a bit more, and the Emma Watson experiment was over, but the image still seemed like a viable base. I took her name out and modified some things – such as adding horns, demons, and ornate gilded armor. For the sake of time, image bandwidth, and to spare you a lot of the crazy body horror it generated (a lot involving insanely bizarre-looking demon nips), we’ll skip to an image that had a face I was drawn to and decided to keep by masking it so that changing it was off-limits to the algorithm.
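For the curious, here is a toy sketch of what "masking" boils down to. It is a simplification and not how Stable Diffusion does it internally: the helper name is made up, and the "images" are just tiny grids of numbers. The idea is only that protected pixels come from the original, and everything else comes from whatever the model generated.

```python
# Toy sketch of mask-protected compositing.
# Hypothetical helper for illustration, not a real Stable Diffusion API.
# Images are plain 2D lists of grayscale values.

def composite_with_mask(original, generated, keep_mask):
    """Keep `original` wherever keep_mask is 1, take `generated` elsewhere."""
    result = []
    for orig_row, gen_row, mask_row in zip(original, generated, keep_mask):
        result.append([
            o if keep == 1 else g
            for o, g, keep in zip(orig_row, gen_row, mask_row)
        ])
    return result


original = [[10, 20], [30, 40]]    # the image with the face I liked
generated = [[99, 98], [97, 96]]   # whatever the model came up with
keep_mask = [[1, 0], [0, 0]]       # 1 = hands off, 0 = free to change

print(composite_with_mask(original, generated, keep_mask))
```

Only the top-left "pixel" survives from the original here, which is the face in this analogy.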

On the left is perhaps the original image where the face came from. In playing the numbers game, I have so many variations that I’m not 100% sure where this series started anymore. I decided to go into Photoshop and remove the haphazard horns and add a more orderly pattern of two on each side – going outwards and upwards. Sometimes Stable Diffusion would play around with the horns by changing their number and geometry, and sometimes it would do a good job of keeping them similar but improving the aesthetic.

And then out of left field, I get this burly fire demon, covered in fire and haze, with horns for hair. And this is it, folks – I love it! ❤️ My only gripe is the right shoulder. The deltoid-biceps transition seems lumpy.

For the shoulder issue, could I go into Photoshop and liquify it to adjust it? No! That’s not how we do things around here!

Odd-Pocket

I wanted to search more in this pocket of the AI model. Sometimes to do this, I’ll run an image without a prompt – and the images will often be pretty similar, especially if the CFG parameter is low or moderate. But I got some weird stuff doing that for the previous image. For example, the very next image had the head locked down (because of the mask), but the rest of the image changed.
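A quick aside on that CFG parameter: CFG stands for classifier-free guidance, and it is usually described as blending two predictions at each denoising step, one made with the prompt and one made without. Below is a minimal sketch of that blend, with short lists standing in for the model's predictions. The function name and the numbers are made up for illustration.

```python
# Minimal sketch of the classifier-free guidance blend.
# `apply_cfg` and the sample values are illustrative assumptions,
# not code taken from Stable Diffusion.

def apply_cfg(uncond, cond, scale):
    """Start at the no-prompt prediction and move `scale` times the
    difference toward (or past) the prompted prediction."""
    return [u + scale * (c - u) for u, c in zip(uncond, cond)]


uncond = [0.0, 1.0]   # stand-in for the prediction without a prompt
cond = [1.0, 3.0]     # stand-in for the prediction with a prompt

for scale in (0.0, 1.0, 2.0):
    print(scale, apply_cfg(uncond, cond, scale))
```

The higher the scale, the further the result gets pushed away from the no-prompt prediction in the direction of the prompt.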

In so many ways, I want to ask, “what is happening!?”
So I will, “what is happening!?”
The image would be a pretty cool piece if I cropped it and kept the bottom. Up until now, through this entire journey, I’ve gotten a lot of disjointed parts, body horror, and whatnot when least expected. But I’ve never had such a drastic change to something so completely different using the no-prompt method. While this can happen with a high CFG, I was expecting the masked head to lock the generated images to three-quarter-length portraits.

Roses For Hair

But we’re not done yet! I went back to that Reddit post, superimposed my demon’s face onto another image, and did another round of image generation. If you go to the Reddit post, you may be able to guess which image I used.

Some of these curls and swirls in the hair reminded me of roses, so I threw that in the prompt. As always, there are glaring body issues, especially with the hands – but every one of these had absolutely gorgeous elements in them.

In the top row, I had the old (stuff-of-nightmares) prompt, but the horns started feeling off in this set. So I removed them, and the roses took over.