Best text to image generator

Billd111@lemmy.world · 11 months ago

Best text to image generator

rickdg@lemmy.world · 11 months ago

Can you give an example of a complete prompt? Are you using Dall-E, Midjourney, Stable Diffusion…?

It seems that all models need prompts crafted specifically for them, and you need to follow up with corrections. The follow-up is critical for pretty much anything these LMMs output.

Ragdoll X@lemmy.world · edit-2 11 months ago

Image-to-image also helps a lot with SD. Even some roughly-drawn blobs can be the difference between the image almost matching what you had in mind vs. looking exactly how you intended.
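For anyone who wants to script this rather than click through a UI, here is a rough sketch of what an image-to-image request body might look like. It assumes the Automatic1111 web UI started with its `--api` flag and its `/sdapi/v1/img2img` endpoint; the field names are as I understand that API, so check the `/docs` page on your own install. `build_img2img_payload` is a made-up helper name, and the bytes are a placeholder for a real sketch image.

```python
import base64
import json


def build_img2img_payload(sketch_bytes, prompt, denoising_strength=0.6):
    """Build a JSON-serializable body for an img2img request (hypothetical helper).

    Lower denoising_strength keeps the result closer to the rough sketch;
    higher values let the model repaint more of it.
    """
    if not 0.0 <= denoising_strength <= 1.0:
        raise ValueError("denoising_strength must be between 0 and 1")
    return {
        # The API is assumed to take base64-encoded images as strings.
        "init_images": [base64.b64encode(sketch_bytes).decode("ascii")],
        "prompt": prompt,
        "denoising_strength": denoising_strength,
    }


# Placeholder bytes stand in for a real PNG of your roughly drawn blobs.
payload = build_img2img_payload(b"fake-png-bytes", "watercolor fox in a forest")
body = json.dumps(payload)  # this string would be POSTed to the endpoint
```

Nothing is sent over the network here; the point is only to show that the sketch image and the prompt travel together, with one knob controlling how much the model is allowed to change.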

vibinya@lemmy.world · 11 months ago

My favorite has been locally hosting Automatic1111’s UI. The setup process was super easy and you can get great checkpoints and models on Civitai. This gives me complete control over the models and the generation process. I think it’s an expectation thing as well. Learning how to write the correct prompt, adjusting the right settings for the loaded checkpoint, and running enough iterations to get what you’re looking for can take a bit of patience and time. It may be worth learning how the AI actually ‘draws’ things to adjust how you’re interacting with it and writing prompts. There’s actually A LOT of control you gain by locally hosting - ControlNet, LoRA, checkpoint merging, etc. Definitely look up guides on prompt writing and learn about weights, order, and how negative prompts actually influence generation.
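To make weights, order, and negative prompts a bit more concrete, here is a small sketch. It assumes the `(term:weight)` emphasis syntax used by the Automatic1111 UI and the `prompt` / `negative_prompt` / `steps` / `cfg_scale` / `seed` fields of its txt2img API; the helper names and the example terms are mine, not part of any library.

```python
def weighted(term, weight=1.0):
    """Wrap a term in (term:weight) emphasis syntax; leave weight 1.0 unwrapped."""
    if weight == 1.0:
        return term
    return f"({term}:{weight:g})"


def build_prompt(terms):
    """Join (term, weight) pairs in order; earlier terms tend to matter more."""
    return ", ".join(weighted(term, weight) for term, weight in terms)


positive = build_prompt([
    ("portrait photo", 1.0),
    ("denim jacket", 1.3),    # push this harder
    ("soft backlight", 0.8),  # ease off this one
])
negative = build_prompt([("blurry", 1.0), ("extra fingers", 1.2)])

# Assumed request body for the txt2img endpoint; values are illustrative only.
txt2img_payload = {
    "prompt": positive,
    "negative_prompt": negative,
    "steps": 25,
    "cfg_scale": 7,
    "seed": -1,  # -1 is assumed to mean "pick a random seed"
}

print(positive)  # portrait photo, (denim jacket:1.3), (soft backlight:0.8)
print(negative)  # blurry, (extra fingers:1.2)
```

The negative prompt is a separate field rather than a "do not include" phrase inside the main prompt, which is a big part of why it behaves more predictably.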

[She/Her] EdgeRunner 🏳️‍⚧️@lemmy.dbzer0.com · 11 months ago

I’ve started with stablediffusion_webui, I feel you!!

silas@programming.dev · edit-2 11 months ago

Talking to a text-to-image model is kinda like meeting someone from a different generation and culture that only half knows your language. You have to spend time with them to be able to communicate with them better and understand the “generational and cultural differences” so to speak.

Try checking out PromptHero or Civitai to see what prompts people are using to generate certain things.

Also, most text-to-image models are not made to be conversational and will work better if your prompts are similar to what you’d type in when searching for a photo on Google Images. For example, instead of a command like “Generate a photo for me of a…”, do “Disposable camera portrait photo, from the side, backlight…”
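A toy sketch of that rewrite, in case it helps: strip the conversational command off the front and lead with search-style descriptors. The regex and the `to_keyword_prompt` name are purely illustrative and only cover a handful of English phrasings.

```python
import re

# Matches openers like "Generate a photo for me of a ..." (illustrative, not exhaustive).
_COMMAND_PREFIX = re.compile(
    r"^\s*(please\s+)?(generate|create|make|draw)\s+(me\s+)?(an?\s+|the\s+)?"
    r"((photo|picture|image)\s+)?(for\s+me\s+)?(of\s+)?(an?\s+|the\s+)?",
    re.IGNORECASE,
)


def to_keyword_prompt(request, descriptors=()):
    """Turn a chatty request into a comma-separated, caption-like prompt."""
    subject = _COMMAND_PREFIX.sub("", request).strip().rstrip(".")
    parts = [*descriptors, subject] if subject else list(descriptors)
    return ", ".join(parts)


prompt = to_keyword_prompt(
    "Generate a photo for me of a woman on a pier",
    ["disposable camera portrait photo", "from the side", "backlight"],
)
print(prompt)
# disposable camera portrait photo, from the side, backlight, woman on a pier
```

In practice you would just write the prompt this way by hand; the code is only there to show the shape of the transformation.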

simple@lemmy.world · 11 months ago

Dall-E 3 is the easiest to use and usually understands prompts the best. You can use it for free via Bing Image Creator.

[She/Her] EdgeRunner 🏳️‍⚧️@lemmy.dbzer0.com · 11 months ago

It’s time to promote https://lemmy.dbzer0.com/c/stable_diffusion_art.

Very helpful and relaxing.

CommunityLinkFixerBot@lemmings.world · 11 months ago

Hi there! Looks like you linked to a Lemmy community using a URL instead of its name, which doesn’t work well for people on different instances. Try fixing it like this: !stable_diffusion_art@lemmy.dbzer0.com

Usernameblankface@lemmy.world · edit-2 11 months ago

I use Dall-E 3 through the Bing Image Creator website. It’s free and happens to work well with the way I describe things.

For the full body picture, describe their shoes as well as their hat or hair. Or describe what they’re standing on and what they’re looking at.

Most of the time, Dall-E will take “do not include thing” to mean “~~do not~~ include thing.” Sometimes starting from Bing Chat and asking it to draw a picture not including a thing works better.

Altima NEO@lemmy.zip · 11 months ago

Dall-E 3 seems to be the easiest to use and from my experience, does pretty well with prompts like that.

The issue is that it’s quick to throttle you after a while, and it’s heavily censored, even for seemingly innocuous words.

Stable Diffusion can be a bit dumb sometimes, occasionally giving you an image of a person wearing jean everything. Now if you’re willing to put in the time to learn to use Stable Diffusion, and you are able to run it on your PC, it’s got a lot of freedom and unlimited image output as fast as your GPU can handle. You could use the “regional prompter” extension to mark zones where you want jeans to be, a specific shirt, etc. Or use inpaint to regenerate a masked area. It’s more work, but it’s very flexible and controllable.