AI-generated images thread

Noodles

The sequel will probably be better.
Joined
Sep 20, 2018
Messages
5,751
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
People always talk about how Stable Diffusion can't handle hands, but it's almost as bad with belly buttons. It so often produces two or three of them, or weird long holes.
 

Jopsy Pendragon

Make Authoritarianism Go Away
Joined
Sep 20, 2018
Messages
2,996
Location
San Diego CA
SL Rez
2004
Joined SLU
2007
SLU Posts
11308
So I've been playing with using the output of stable diffusion as input. After 50 iterations of that, making slight little tweaks and fixes along the way, it turned this:


into this "Jopsy Pendragon the Far-Too-Serious Elven Space Ranger!":

 

Noodles

The sequel will probably be better.
Joined
Sep 20, 2018
Messages
5,751
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
So I've been playing with using the output of stable diffusion as input. After 50 iterations of that, making slight little tweaks and fixes along the way, it turned this:
Which one are you using? I probably just have not looked hard enough, but I have not been able to figure out how to do things like turn an existing image into a different style (like real to cartoonish, or SL to realistic).
I tried img2img, but it never gives me anything remotely like the input unless I start with an SD-generated image.

I even tried training an embedding on some SL screenshots, which just produced weird results. That was odd, because I've trained it on screenshots of people and gotten ok-ish results.
 

Argent Stonecutter

Emergency Mustelid Hologram
Joined
Sep 20, 2018
Messages
7,321
Location
Coonspiracy Central, Noonkkot
SL Rez
2005
Joined SLU
Sep 2009
SLU Posts
20780
John Donne, "The Sun Rising".



Princes do but play us; compared to this,
All honor's mimic, all wealth alchemy.
Thou, sun, art half as happy as we,
In that the world's contracted thus.
Thine age asks ease, and since thy duties be
To warm the world, that's done in warming us.
Shine here to us, and thou art everywhere;
This bed thy center is, these walls, thy sphere.
 

Jopsy Pendragon

Make Authoritarianism Go Away
Joined
Sep 20, 2018
Messages
2,996
Location
San Diego CA
SL Rez
2004
Joined SLU
2007
SLU Posts
11308
Which one are you using? I probably just have not looked hard enough, but I have not been able to figure out how to do things like turn an existing image into a different style (like real to cartoonish, or SL to realistic).
I tried img2img, but it never gives me anything remotely like the input unless I start with an SD-generated image.

I even tried training an embedding on some SL screenshots, which just produced weird results. That was odd, because I've trained it on screenshots of people and gotten ok-ish results.
I'm currently using a vanilla install of 'Easy Diffusion' (what I found first when looking to download Stable Diffusion). No idea if it's more or less legit than the others; I did very little research before grabbing one. :)

The three things I experiment with most are:

"Inference Steps:" quick and dirty (low values like 25) to ultra-processed (500). You may get better hands and faces with higher values but lose other bits that were interesting at lower values.

"Guidance Scale:" low values take your prompt text as a -very- loose suggestion, high values try to follow it more exactly. Lower than 7 often produces chaos, higher than 20 always ends up horribly oversaturated for me.

"Prompt Strength" (only available with an initial image for img2img). The smaller the value the smaller the impact the prompt will have on the image.

When I'm using an initial image(img2img) a new slider becomes available: 'Prompt Strength' low values retain more of the original image, high values overwrite more of it.

Finding a good balance of Guidance Scale and Prompt Strength seems to take a lot of trial and error, depending on the source image and how well the engine can handle the prompt. If it's struggling with one or the other, you'll get some heinous distortion immediately.
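For anyone curious what those two sliders actually do under the hood, here's a rough plain-Python sketch, illustrative only: this is not Easy Diffusion's code, and the function names are made up. The general idea is that guidance scale exaggerates the prompt-conditioned denoising prediction (classifier-free guidance), and prompt strength controls how much of the denoising schedule the original image is put through.

```python
# Illustrative sketch only -- not Easy Diffusion's actual code.
# Function names are hypothetical.

def guided_noise(uncond_pred, cond_pred, guidance_scale):
    """Classifier-free guidance: push the denoising prediction away from
    the unconditional result and toward the prompt-conditioned one.
    Higher guidance_scale = follow the prompt harder (and, past a point,
    oversaturate)."""
    return uncond_pred + guidance_scale * (cond_pred - uncond_pred)

def img2img_start_step(num_inference_steps, prompt_strength):
    """img2img skips the early denoising steps: at strength 0.5 with
    150 steps, only the last ~75 steps run, so much of the original
    image's structure survives. At strength 1.0 nothing is skipped."""
    return num_inference_steps - int(num_inference_steps * prompt_strength)

print(guided_noise(0.0, 1.0, 7.5))   # 7.5
print(img2img_start_step(150, 0.5))  # 75
```

At guidance scale 1 the second term vanishes against the conditional prediction, which is why very low values feel like the prompt is barely being consulted.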

While iterating through my SL snapshot... it turned the background water into spooky trees, then spider webs, then a waterfall, and finally a snowstorm, before it gave up trying to make it resemble anything other than just 'gray'. =D

Anyway, with Inference Steps: 150, Guidance Scale: 50, Prompt Strength: 0.5, and the prompt "a rough sketch, Japanese anime drawing of an elf warrior, manga, hentai style, black and white, printed page, paper grain, crosshatched, flat, heavy cell shading, block shading, hard color segmentation" I got this...


With Inference Steps: 175, Guidance Scale: 50, Prompt Strength: 0.5, and "a ornately framed ultra flat ultra pale water color with heavy inked lines poster of a male elf warrior, in art nouveau style, ultra simple color, by alfons mucha, ultra flat, abstract, pale colors, heavy lines, ((((ultra ornate, complicated border frame in the style of art nouveau))))" I got this instead:


I did have it generate several and try different settings before I found versions that seemed okay though. :}


By the way are you modeled to loosely resemble a World of Warcraft night elf male?
Lol, don't get me started. Yes, I like WoW's elf ears. I always found Night-Elf guys in WoW too creepy to play for long. Body, Animation, Voice... I guess there's a reason shape-shifting druids were kind of that race's 'main thing'. Blood Elves were fine, but the voicing kinda made me gravitate towards other races. Goblin/Dwarf/Troll/Forsaken... and later Worgen/Nightborne/Void Elf.

The gray coloring was temporary... more inspired from Elder Scrolls Online which I played a lot after giving up on WoW. Most of my alts are Dunmer (dark elves), like this guy:

 

Bartholomew Gallacher

Well-known member
Joined
Sep 26, 2018
Messages
6,769
SL Rez
2002
Well, Easy Diffusion is a great way to get started. It is a web UI, but different from the most common one for Stable Diffusion, which is Automatic1111's. I am using a fork of Automatic1111 due to my GPU. Anyway, what I am showing here should also be doable in Easy Diffusion, because the underlying generator is the same.

First of all, a lesson learned for me: it is important with SD to honor the aspect ratio, so I changed the output to a height of 768 pixels and a width of 512 pixels. I am using your SL image as the reference.
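Picking those dimensions can be automated. Here's a hypothetical little helper (not part of any UI; the function name is made up) that chooses SD-friendly dimensions which roughly preserve a source image's aspect ratio, snapped to multiples of 64, since SD 1.x latents work in 64-pixel blocks:

```python
# Hypothetical helper -- not part of Easy Diffusion or Automatic1111.

def sd_dimensions(src_w, src_h, short_side=512):
    """Scale the image so its short side is `short_side`, then snap
    both sides to the nearest multiple of 64 (SD 1.x works in
    64-pixel latent blocks)."""
    scale = short_side / min(src_w, src_h)
    snap = lambda v: max(64, int(round(v * scale / 64)) * 64)
    return snap(src_w), snap(src_h)

# A portrait-orientation SL snapshot maps to the 512x768 used above:
print(sd_dimensions(1024, 1536))  # (512, 768)
```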

I am using img2img here. First I used "Interrogate CLIP" to let the UI generate a starting point for the input prompt, which gave me this:



Well, it's a start but has some issues, so I tweaked the prompt to this manually:
a male elf with a long goatee and long, pointed ears dressed as space ranger and a body of water in the background, (a character portrait:0.828), sci fi, intricate detail, masterpiece, green hair

Since we are going for sci-fi and elves, I used the Rev Animated model, which is well suited for fantasy and sci-fi, much better than the standard model. As the sampler I normally start with DDIM, because it stays closer to the original image, but I might change it later to UniPC.

For the picture here I am going with UniPC, which in my opinion leads to fancier and more detailed results. Settings are as depicted. You will note that I am using only 30 sampling steps (the same thing as inference steps), because for most samplers more than 30 steps is a waste of time: most samplers don't change much at all after 30 steps, meaning they converge to one picture, so there is really no need for 150 steps.

The only exception is ancestral samplers, like Euler A, which never really converge to a stable image.

Furthermore, the guidance scale is set to 7, the default value, to give the AI enough room to generate some variety.



The most important secret sauce for getting the pose right is ControlNet. It analyses your image for structural information, which is then used as an additional input to pose the guy in the image. I am using the depth ControlNet here, which leads to good results for this type of imagery. Another popular ControlNet is pose, which analyses the figure's pose and uses that. With pose you can just scribble some rough sketches and render full humans/figures in that pose using SD, which is a really powerful tool.



Anyway, that's my setup for making Jopsy Pendragon the space ranger elf. And this is one of the results; as we know, a different prompt/seed leads to different results. The overall time to get this result was about 10 minutes, mostly spent on tweaking. I could play further with emotion, background and so on, but I think this is pretty good for the time invested.



Same stuff with other seeds:



 

Argent Stonecutter

Emergency Mustelid Hologram
Joined
Sep 20, 2018
Messages
7,321
Location
Coonspiracy Central, Noonkkot
SL Rez
2005
Joined SLU
Sep 2009
SLU Posts
20780
Can you get a convincing ferret-guy with a lock of blue hair, in a leather jacket with an Akubra hat slung over his back?
 

Bartholomew Gallacher

Well-known member
Joined
Sep 26, 2018
Messages
6,769
SL Rez
2002
SD generates different styles of images when you alter the prompt slightly, so sometimes a style overview showing the impact of each keyword is handy. First you've got to enable the prompt matrix, like this:



Then put in some prompt, just like this: A cat with a bowtie | sepia photo | illustration | fantasy | concept art

And here is the result. Don't use too many keywords, because it will generate 2^n images, where n is the number of keywords separated by pipes.
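For reference, this is roughly how the matrix expands. A plain-Python sketch of the idea (not Automatic1111's actual implementation): the first segment is the base prompt, and each further pipe-separated keyword is independently included or left out, giving every subset.

```python
# Sketch of how a "prompt matrix" expands -- not Automatic1111's code.
from itertools import combinations

def prompt_matrix(prompt):
    """Split on '|'; the first part is the base prompt, the rest are
    optional keywords. Return one prompt per subset of the keywords,
    i.e. 2**n prompts for n keywords."""
    base, *options = [p.strip() for p in prompt.split("|")]
    prompts = []
    for r in range(len(options) + 1):
        for combo in combinations(options, r):
            prompts.append(", ".join((base,) + combo))
    return prompts

variants = prompt_matrix(
    "A cat with a bowtie | sepia photo | illustration | fantasy | concept art")
print(len(variants))  # 2**4 = 16 images to render
```

This is why the keyword count matters so much: each extra pipe doubles the number of images the grid has to render.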

 

Noodles

The sequel will probably be better.
Joined
Sep 20, 2018
Messages
5,751
Location
Illinois
SL Rez
2006
Joined SLU
04-28-2010
SLU Posts
6947
It looks like I need an add-on for ControlNet with Automatic1111. I may look into these other tools for doing images.
 

Jopsy Pendragon

Make Authoritarianism Go Away
Joined
Sep 20, 2018
Messages
2,996
Location
San Diego CA
SL Rez
2004
Joined SLU
2007
SLU Posts
11308
It turns out that ControlNet was disabled in my generated images above, my bad. This is one image with the depth ControlNet enabled:

Each of the images is amazing! But this one is by far my favorite... especially from the neck up. Even the mouth/eyebrows remind me of a skin that I'd used in SL for a long time before mesh obsoleted it. :)

Try as I might, I simply -can't- get Easy Diffusion to recognize the word 'goatee'.

You've given me a -lot- to unpack. Easy Diffusion doesn't have an overt Interrogate CLIP (I presume it has some kind of embedded version to handle img2img, but nothing I can use to generate a prompt with), and there's definitely no ControlNet stuff. I haven't started playing with different models yet; I'm still using sd-v1-4.

I find myself easily overwhelmed by option paralysis if I try to tackle too many different options at once, so I try to master a few variables at a time together, then gradually add more. :)
 

Argent Stonecutter

Emergency Mustelid Hologram
Joined
Sep 20, 2018
Messages
7,321
Location
Coonspiracy Central, Noonkkot
SL Rez
2005
Joined SLU
Sep 2009
SLU Posts
20780
Something like this?
Yeh, I have that problem too, where a color seeps out to other parts of the image, like all the green clothes in the Jopsy images. See my avatar for the lock of blue hair, brown akubra hat "slung over the back", and the brown "leather jacket", but I can't keep that blue from spreading everywhere.

But mostly I can't get toons without a ton of retries; humanoid animals seem to be verboten. When I add action words it creates a human doing whatever, with an animal standing, running, or sitting beside them. I have a whole folder of painters wearing a beret with a fox at their feet, because a fox wearing a beret and painting a portrait is too much for the neural network to produce.
 