
How does your avatar look when enhanced by AI?


Danielle Atheria

I was able to get ControlNet working for a short time, but I only managed to get a few images before I started getting more error messages.

The original image:

Snapshot_120.png.dc9c760b71f9cf8d44c9509939e4b305.png

A watercolor prompt:

Beautiful_Lighting__Watercolor_S0_St50_G7.5(1).jpeg.c615c4972ef486db1f9dd03f310881f9.jpeg

Here is a realistic prompt below:

Beautiful_Lighting__Realistic_S2315265894_St50_G1.1.jpeg.a896726b86a445a8358f53d202c0700c.jpeg

I had to use a LoRA to ensure the faces did not turn out funky. The watercolor image had a higher guidance scale of 7.5, which is why you see a different outfit; I find that if you lower it, the outfit changes less. Both of these were done with 50 steps, using CyberRealistic as the model. I would have liked to do more fine touch-up work and add more steps, but my video card is older and it takes forever to render, especially with ControlNet.
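For anyone who wants to reproduce this kind of setup outside Easy Diffusion, a minimal sketch with the Hugging Face diffusers library might look like the following. It assumes locally downloaded CyberRealistic checkpoint and face LoRA files; the filenames below are placeholders, not the actual files used above.

```python
# Rough sketch: SD 1.5 img2img with a community checkpoint plus a LoRA, via diffusers.
# The .safetensors filenames are placeholders, not the exact files used in the post.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_single_file(
    "cyberrealistic.safetensors",            # placeholder: a CyberRealistic checkpoint file
    torch_dtype=torch.float16,
).to("cuda")
pipe.load_lora_weights(".", weight_name="face_detail.safetensors")  # placeholder LoRA file

init = load_image("Snapshot_120.png").convert("RGB")  # the source snapshot
result = pipe(
    prompt="beautiful lighting, watercolor",
    image=init,
    num_inference_steps=50,   # 50 steps, as in the post
    guidance_scale=7.5,       # higher guidance follows the prompt more, so the outfit drifts more
).images[0]
result.save("watercolor.jpg")
```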


2 minutes ago, Istelathis said:

I was able to get ControlNet working for a short time, but I only managed to get a few images before I started getting more error messages.


Interesting how in the watercolor, your breasts were replaced with a tasteful blouse!


2 minutes ago, Marianne Little said:

And not just that. The cap disappeared, and the hair changed. I also think the head has a slight tilt forward, and the eyes down.

It's like one of those "Spot the Differences" challenges! I was thinking it was the AI making editorial changes because "watercolors must be just so, to be proper; one must maintain standards of decorum!"


29 minutes ago, Love Zhaoying said:

It's like one of those "Spot the Differences" challenges! I was thinking it was the AI making editorial changes because "watercolors must be just so, to be proper; one must maintain standards of decorum!"

It has totally changed the avatar. Maybe there are no caps or crop tops in whatever the AI is using as reference for "watercolor"?


1 hour ago, Marianne Little said:

And not just that. The cap disappeared, and the hair changed. I also think the head has a slight tilt forward, and the eyes down.

 

1 hour ago, Love Zhaoying said:

It's like one of those "Spot the Differences" challenges! I was thinking it was the AI making editorial changes because "watercolors must be just so, to be proper; one must maintain standards of decorum!"

From the experimentation I have done, it has to do with the guidance scale as well as the prompt strength in Easy Diffusion, at least in my experience. Here is one using img2img with a guidance scale of 7.5 and a prompt strength of 0.55, and a prompt for a woman holding a cat. It's not a great picture, but it uses the same original image I posted above. I ran this one with 100 steps trying to get the cat's legs to form, but alas, they decided to remain hidden.

extra.jpeg.774847c9a4a41bdd5a053828f58ceaf9.jpeg

 

 

Another one using the same img2img, with the prompt strength raised to 0.99 and the guidance scale set to 15:

image.jpeg.a4b7410f30d16dc5b5f716279411622d.jpeg
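The two variations above differ only in guidance scale and prompt strength. In diffusers terms those map roughly to guidance_scale and strength on the img2img pipeline; a minimal sketch of the same comparison, assuming the base SD 1.5 checkpoint and a placeholder source image, could look like this:

```python
# Rough sketch: the same source image run twice, varying only guidance scale and
# prompt strength (called "strength" in diffusers). Checkpoint and filenames assumed.
import torch
from diffusers import StableDiffusionImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed base checkpoint
).to("cuda")

init = load_image("Snapshot_120.png").convert("RGB")
prompt = "a woman holding a cat"

# Lower strength keeps most of the original image, so new elements (the cat) struggle to form.
soft = pipe(prompt, image=init, strength=0.55, guidance_scale=7.5,
            num_inference_steps=100).images[0]

# Near-maximum strength almost repaints the image, so the composition drifts far from the source.
hard = pipe(prompt, image=init, strength=0.99, guidance_scale=15,
            num_inference_steps=100).images[0]

soft.save("gs7-5_ps0-55.jpg")
hard.save("gs15_ps0-99.jpg")
```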

 

 


1 hour ago, Marianne Little said:

It has totally changed the avatar. Maybe there are no caps or crop tops in whatever the AI is using as reference for "watercolor"?

I totally agree with you. AI, as we all know, is only as good as whatever has been programmed into it, and it would make sense for it to produce a more demure image. I guess if we were all using the same software, we'd start to see more similarities, like that fun software a couple of years ago (Toon Me) where you'd upload a photo of your avatar (or real-life self) and be given four different cartoon versions.

It all seems very clever, and must have taken plenty of man (woman/person) hours to program. It's hard to go back into Second Life itself and see myself without all the make-up, as it were, after creating an AI representation of my avatar. 


If I may - a few words on the various settings and what they do:

CFG or Guidance Scale - Governs how much you let the AI off the leash. Low CFG values mean it all but ignores your prompt, while high CFG means it will adhere strictly to the prompt but neglect what it knows about image composition and the like. A value of 6.5-7 is usually ideal unless you explicitly want it to go wild or stay strict.

Steps and Sampler - These need to be configured in unison. A sampler is what turns noise into the resulting image. Each sampler needs at least a certain number of steps to work, but anything past that is just pointlessly blowing GPU power. The exception is so-called ancestral samplers (usually marked with an "a" in their name, like "Euler a"); these can run forever, continuously adding new noise. As a rule of thumb: most will do fine with 25 steps. DDIM will work with 10; Euler is older and may need 80 or so.

Denoise or Denoising - How much the image is meant to change from the source image you give it, ranging from 0 to 1.0. A value of 1 will all but throw out the source image, and a 0 will make no changes at all. This is the setting you will need to find a decent balance with.
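For readers working in code rather than a GUI, these three knobs map directly onto the diffusers img2img API: guidance_scale is the CFG value, the scheduler is the sampler, and strength is the denoising value. A minimal sketch, assuming the base SD 1.5 checkpoint and a placeholder source image:

```python
# Rough sketch: CFG, sampler/steps and denoising as they appear in the diffusers img2img API.
import torch
from diffusers import StableDiffusionImg2ImgPipeline, DDIMScheduler
from diffusers.utils import load_image

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed checkpoint
).to("cuda")

# Sampler: swap the scheduler on an existing pipeline (DDIM gets by with few steps).
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)

image = pipe(
    prompt="portrait, soft lighting",
    image=load_image("source.png").convert("RGB"),  # placeholder source image
    num_inference_steps=25,   # steps: most samplers do fine around 25
    guidance_scale=7.0,       # CFG: 6.5-7 is a sensible default
    strength=0.6,             # denoise: 0 changes nothing, 1 all but ignores the source
).images[0]
image.save("out.png")
```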

The choice of model is also an important one, and it helps to understand how and why things came to be. There are three major revisions of Stable Diffusion: 1.5, 2.x and SDXL. Out of these you can outright dismiss 2.x; it was a failed attempt. It tried to remove nudity from the model, and it quickly became clear that nudity was what made the model understand human anatomy. The resulting images were a mess of uncanny horrors.

If you've got a good PC, use SDXL and 1.5. If you've got a less capable PC, use 1.5. That leaves the choice of model, and here it helps to understand that there were, for the most part, three major development paths that have all converged by now: anime, based on leaked anime models (usually NovelAI); base Stable Diffusion; and... Asian girls. There was a lot of hype from Korea early on. These have all merged into one by now, just with various weights attributed to this or that and maybe some additional training in certain aspects.

Newer models generally work much better than older models. NSFW models will do better with anatomy (including hands) but need heavy counter-prompting to put on clothes. The merge with anime models means that unless you want massive tracts of land, you'll need to actually prompt OUT certain anatomical sizes. Furthermore, it helps to understand that the anime models used a different style of prompting, based on booru tags (no, that is not safe for work to google).

If you want certain clothes, you need to prompt for them.

Last but not least, keep prompt length and image resolution in mind. SD 1.5 was trained on 512x512; you can go up to 768, but anything beyond that goes kaputt. Use upscaling. SDXL was trained even more rigidly on a handful of resolutions around 1024x1024 and can also go ±256 in either direction (a bit more, but eh, find a cheat sheet yourself :P). As for prompt length: each prompt is split into one or several brackets of 75 tokens. This is important to know because it governs things pretty strongly. Let's say you've got 76 tokens in your prompt. The first 75 tokens will form one bracket, and the remaining token will form a bracket on its own. Then both brackets will be weighted equally, resulting in that one token coming on super strong. So stay below 75 tokens ideally, or look into "BREAK"ing prompts.
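A small sketch for checking whether a prompt spills past the 75-token bracket, using the standard CLIP tokenizer that SD 1.5's text encoder is built on (the example prompt is made up):

```python
# Rough sketch: count CLIP tokens to check whether a prompt spills past one 75-token bracket.
from transformers import CLIPTokenizer

# SD 1.5 uses the standard CLIP ViT-L/14 tokenizer.
tokenizer = CLIPTokenizer.from_pretrained("openai/clip-vit-large-patch14")

prompt = "watercolor portrait of a woman wearing a cap and crop top, soft lighting, detailed face"
ids = tokenizer(prompt)["input_ids"]
n_tokens = len(ids) - 2  # drop the begin/end-of-text markers
print(f"{n_tokens} tokens ->",
      "fits in one bracket" if n_tokens <= 75 else "spills into a second, equally weighted bracket")
```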

Alright, enough of the SD crash course. It took some time to try a more direct conversion without my LoRA. This was with a denoising value of 0.6, some prompting and a lot of cherry-picking on SDXL. SDXL tends to make faces come out a bit strong on bone structure, and none of the Stable Diffusion models can deal with my eye colour in any reliable shape or form.

sltoai.thumb.png.bc91638c1ba6a3a142a1c1c3058a92f0.pngsltoaires.thumb.png.4c0f4d28b94f31e2beb7044fd175c12b.png

If you want more consistency than that with Stable Diffusion, you either want to make heavy use of ControlNets or train your own LoRA, something I've done together with a friend of mine. That's also what allows you to translate images into different styles more easily. The result above is cherry-picked; the result below is not (but could be better):

sltoaires2.thumb.png.5d73271ffe51f438fdda3c56a4bf1549.png

Footnote: Talking about bias in training data, there's a funny social one, and that's age. Most of us want to appear younger than we are as we get older, and some won't answer the age question truthfully. Long story short, the training data thinks a forty-year-old will look like an actual sixty-year-old, because so many sixty-year-olds claimed to be forty - and the forty-year-olds did the same, so you'll find them in the thirty-year-old category, and if you want thirty-year-olds, you're going to find them in the 20-25 bracket. I found that funny :D.
 

Edit:

Oh, an interesting use case for SL photography! You can push an SL screenshot through image-to-image at a mid denoise value and then take cues on lighting your scene more believably. If you compare the three shots, the biggest flaw of my SL one is the flat frontal lighting.

Heck, you can even use it as a little indicator where to manually add highlights in post.


5 hours ago, Istelathis said:

I ran this one with 100 steps trying to get the cat's legs to form, but alas, they decided to remain hidden.

I had seen some articles saying AI is bad at getting "human hands" right (number of fingers, positions, etc.), so you'd think that a cat's legs would be easier (or maybe not).


3 minutes ago, Love Zhaoying said:

I had seen some articles saying AI is bad at getting "human hands" right (number of fingers, positions, etc.), so you'd think that a cat's legs would be easier (or maybe not).

Hands need some heavy lifting to get right. It's a definite weakness, for the reason you've mentioned. It can be done, though. As for why the cat is a bit of a mess - that happens with multiple subjects in an image. You'll need to do either so-called regional prompting or inpainting. For comparison, here is a majestic chonker with the right number of legs.

00018-87301247.thumb.png.834c6d5b658cee9547b81f42000e83db.png


5 hours ago, Maitimo said:

These all look lovely but sadly I can't participate - the video instructions lost me in the first minute. Is there an easier way without all the GitHub and low-level installation?

I think the same. Tomorrow I will try to get some help, and see if they can install it for me.


img2img and ControlNet Image:

Snapshot_142.png.cd953bd1c5a3d6695cca586c400dfd90.png

Software used:

Easy Diffusion

Prompt:

Illustration, Character Design

Negative Prompt:
Naked

Extra Details:

Seed: 2220702992, Dimensions: 768x400, Sampler: unipc_tu, Inference Steps: 25, Guidance Scale: 1.1, Model: dreamshaper_8, Negative Prompt: naked, Prompt Strength: 0.55, Preserve Color Profile: true, ControlNet Model: control_v11f1p_sd15_depth
Processed 1 image in 1 minute 3 seconds

Illustration__Character_Design_S2220702992_St25_G1.1.jpeg.fa58ff55aa3efe7d4a0b030395a543b5.jpeg
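A rough equivalent of the settings above in diffusers terms might look like the sketch below, for anyone not using Easy Diffusion. It assumes the DreamShaper 8 checkpoint published on the Hugging Face Hub and a generic depth estimator for the ControlNet input, and it treats Easy Diffusion's "Prompt Strength" as the img2img strength; it is not the exact pipeline Easy Diffusion runs internally.

```python
# Rough sketch: img2img plus a depth ControlNet, approximating the settings listed above.
import torch
from diffusers import ControlNetModel, StableDiffusionControlNetImg2ImgPipeline
from diffusers.utils import load_image
from transformers import pipeline as hf_pipeline

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/control_v11f1p_sd15_depth", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetImg2ImgPipeline.from_pretrained(
    "Lykon/dreamshaper-8",                   # assumed Hub mirror of dreamshaper_8
    controlnet=controlnet, torch_dtype=torch.float16,
).to("cuda")

init = load_image("Snapshot_142.png").convert("RGB").resize((768, 400))
depth = hf_pipeline("depth-estimation")(init)["depth"]  # depth map for the ControlNet

image = pipe(
    prompt="Illustration, Character Design",
    negative_prompt="naked",
    image=init,
    control_image=depth,
    strength=0.55,                            # "Prompt Strength" in Easy Diffusion
    guidance_scale=1.1,
    num_inference_steps=25,
    generator=torch.Generator("cuda").manual_seed(2220702992),
).images[0]
image.save("illustration_character_design.jpg")
```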

 


2 hours ago, Marianne Little said:

Did you ask for it to change so much? Only the face is a bit similar to SL face, but more realistic.

And why did it slim you? Give you pink hair? Put you in a different environment? Turn your clothes into something cosplay-Aladdin style?

I used the fantasy prompt, or whatever it's called. Style.


On 11/7/2023 at 10:49 AM, ValKalAstra said:

If I may - a few words on the various settings and what they do:


I like what you did here.


On 11/7/2023 at 12:22 PM, Maitimo said:

These all look lovely but sadly I can't participate - the video instructions lost me in the first minute. Is there an easier way without all the GitHub and low-level installation?

Oh, sorry - I kind of totally missed your post there. There are somewhat easier ways, but at the end of the day the technology is still pretty early and user-unfriendly. For local installs, some swear by Easy Diffusion, its main selling point being a "one-click installer" that takes care of the technical aspects. I can't vouch for it as I've never used it, but @Istelathis seems to be using it: https://github.com/easydiffusion/easydiffusion

But... yeah. It's still user-unfriendly. There are also some websites that aim to provide a more user-friendly alternative. I haven't kept up with them, but most offer some free generations per day and then ask you to pay; as long as you stay below the threshold, it should be fine. One example might be https://playgroundai.com/ (caveat: you need to remember to mark your session private there, and they tend to label some things differently, such as denoising being called image strength). Good enough to toy around with, though.

/edit: PlaygroundAI seems to allow 500 free images per day.

 


1 hour ago, ValKalAstra said:

Oh, sorry - I kind of totally missed your post there. There are somewhat easier ways, but at the end of the day the technology is still pretty early and user-unfriendly. For local installs, some swear by Easy Diffusion, its main selling point being a "one-click installer" that takes care of the technical aspects. I can't vouch for it as I've never used it, but @Istelathis seems to be using it: https://github.com/easydiffusion/easydiffusion

But... yeah. It's still user-unfriendly. There are also some websites that aim to provide a more user-friendly alternative. I haven't kept up with them, but most offer some free generations per day and then ask you to pay; as long as you stay below the threshold, it should be fine. One example might be https://playgroundai.com/ (caveat: you need to remember to mark your session private there, and they tend to label some things differently, such as denoising being called image strength). Good enough to toy around with, though.

/edit: PlaygroundAI seems to allow 500 free images per day.

 

Just tried PlaygroundAI, on my phone for now. I would post my results, but I need to work on my prompts for it... also, the results ended up coming out a bit X-rated, so nope, not here, lol!

Will try Easy Diffusion later. The minimum specs aren't too high; I can actually run it on my five-year-old laptop (still my main PC for SL... *sniff*). The Linux version is a huge plus for me - lower OS overhead there versus Win11. Thanks!


2 minutes ago, JeromFranzic said:

Just tried PlaygroundAI, on my phone for now. I would post my results, but I need to work on my prompts for it... also, the results ended up coming out a bit X-rated, so nope, not here, lol!

Will try Easy Diffusion later. The minimum specs aren't too high; I can actually run it on my five-year-old laptop (still my main PC for SL... *sniff*). The Linux version is a huge plus for me - lower OS overhead there versus Win11. Thanks!

Aye - that's where negative prompts come in handy. Put "nude" or "naked" in the negative prompt field (I think you need to toggle it to active on PlaygroundAI). If it still comes out undressed, you can increase the weight of those words like so: (nude:1.5).
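In code-based tools the same idea is just an extra argument; the (word:1.5) weighting syntax belongs to UIs like Easy Diffusion and AUTOMATIC1111, while the plain diffusers library takes an unweighted negative_prompt string. A minimal sketch, with the checkpoint id assumed:

```python
# Rough sketch: a plain negative prompt in diffusers. The (word:1.5) weighting syntax is a
# feature of UIs like Easy Diffusion / AUTOMATIC1111, not of the base library call below.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed checkpoint
).to("cuda")

image = pipe(
    prompt="portrait of a woman in a summer dress, watercolor",
    negative_prompt="nude, naked, nsfw",  # terms the sampler is steered away from
    guidance_scale=7.0,
    num_inference_steps=25,
).images[0]
image.save("dressed.png")
```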

 


3 minutes ago, ValKalAstra said:

Aye - that's where negative prompts come in handy. Put "nude" or "naked" in the negative prompt field (I think you need to toggle it to active on PlaygroundAI). If it still comes out undressed, you can increase the weight of those words like so: (nude:1.5).

 

Right... need to add extra negative prompts. Ok. :^)


All right then... I really need to work on my prompts, but I was able to run Easy Diffusion on my PC. I installed it on Win11 and Linux; the results below are from Linux (Ubuntu Studio LTS), and I'll test Win11 later. Running it heats up my laptop about as much as SL does, just in shorter bursts.

Original SL photo


image.thumb.jpeg.494ed773fec1849d38feb6bafa09fcc6.jpeg

Results: the first one is at 768 resolution, the second at 960. I'm limited to that by GPU memory (4 GB) and the GPU itself (a GTX 1050). The first result took 2 minutes to generate, the second a little over 4 minutes.

image.thumb.jpeg.9305a4f77410506143648477be92f1d0.jpeg

image.thumb.jpeg.74f2c3a2a850ef4f80fbb5e6d37f4232.jpeg
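Not Easy Diffusion-specific, but for anyone else squeezing Stable Diffusion onto a 4 GB card, the diffusers library exposes a few switches that trade speed for VRAM. A minimal sketch, with the checkpoint id assumed:

```python
# Rough sketch: VRAM-saving switches for small GPUs (e.g. a 4 GB GTX 1050).
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # assumed checkpoint
)
pipe.enable_attention_slicing()        # compute attention in slices: slower, much lower peak VRAM
pipe.enable_sequential_cpu_offload()   # keep weights in system RAM, stream layers to the GPU

image = pipe("portrait, golden hour lighting",
             height=512, width=512, num_inference_steps=25).images[0]
image.save("lowvram.png")
```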

