Which one are you using? I probably just haven't looked hard enough, but I haven't been able to figure out how to do things like turning an existing image into a different style (real to cartoonish, or SL to realistic).
I tried img2img, but it never gives me anything remotely like the input unless I start with an SD-generated image.
I even tried training an embedding on some SL screenshots, which just made weird results. That was odd, because I've trained it on screenshots of people and gotten okay-ish results.
I'm currently using a vanilla install of 'Easy Diffusion' (the first thing I found when looking to download Stable Diffusion). No idea if it's more or less legit than the others; I did very little research before grabbing one.
The three things I experiment with most are:
"Inference Steps:" ranges from quick and dirty (low values like 25) to ultra-processed (500). You may get better hands and faces with higher values, but lose other bits that were interesting at lower values.
"Guidance Scale:" low values treat your prompt text as a -very- loose suggestion; high values try to follow it more exactly. Lower than 7 often produces chaos, and higher than 20 always ends up horribly oversaturated for me.
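I don't know exactly what Easy Diffusion does under the hood, but as far as I understand it, Guidance Scale is the classifier-free guidance multiplier: at each denoising step the model predicts the noise twice, once with your prompt and once with an empty prompt, then extrapolates between the two by that scale. A rough toy sketch (plain numbers standing in for the model's actual noise predictions):

```python
import numpy as np

def apply_guidance(uncond_noise, cond_noise, guidance_scale):
    """Classifier-free guidance: push the prediction away from the
    unconditional (empty-prompt) estimate, toward the prompt-conditioned
    one, scaled by guidance_scale."""
    return uncond_noise + guidance_scale * (cond_noise - uncond_noise)

# Toy stand-ins for the two noise predictions at one denoising step.
uncond = np.array([0.2, 0.4])
cond = np.array([0.3, 0.1])

print(apply_guidance(uncond, cond, 1.0))   # scale 1: just the prompt-conditioned prediction
print(apply_guidance(uncond, cond, 7.5))   # typical default: prompt matters a lot more
```

That extrapolation is why very high values blow out the colors: the result gets pushed far outside the range of anything the model would predict on its own.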
"Prompt Strength" (a new slider that only becomes available when you give it an initial image for img2img): low values retain more of the original image, high values overwrite more of it with what the prompt describes.
Finding a good balance of Guidance Scale and Prompt Strength seems to take a lot of trial and error, depending on the source image and how well the engine can handle both the prompt and the source image. If it's struggling with either one, you'll get some heinous distortion immediately.
While iterating through my SL snapshot... it turned the background water into spooky trees, then spider webs, then a waterfall, and finally a snowstorm before it gave up trying to make it resemble anything other than just 'gray'. =D
Anyway, with Inference Steps: 150, Guidance Scale: 50, Prompt Strength: 0.5, and the prompt "a rough sketch, Japanese anime drawing of an elf warrior, manga, hentai style, black and white, printed page, paper grain, crosshatched, flat, heavy cell shading, block shading, hard color segmentation" I got this...
With Inference Steps: 175, Guidance Scale: 50, Prompt Strength: 0.5, and: "a ornately framed ultra flat ultra pale water color with heavy inked lines poster of a male elf warrior, in art nouveau style, ultra simple color, by alfons mucha, ultra flat, abstract, pale colors, heavy lines, ((((ultra ornate, complicated border frame in the style of art nouveau))))" I got this instead:
I did have it generate several and tried different settings before I found versions that seemed okay, though. :}
By the way are you modeled to loosely resemble a World of Warcraft night elf male?
Lol, don't get me started. Yes, I like WoW's elf ears. I always found Night-Elf guys in WoW too creepy to play for long. Body, Animation, Voice... I guess there's a reason shape-shifting druids were kind of that race's 'main thing'. Blood Elves were fine, but the voicing kinda made me gravitate towards other races. Goblin/Dwarf/Troll/Forsaken... and later Worgen/Nightborne/Void Elf.
The gray coloring was temporary... more inspired from Elder Scrolls Online which I played a lot after giving up on WoW. Most of my alts are Dunmer (dark elves), like this guy: