StableDiffusion
- A Simple Quest for a Swing Set
So I work at a home improvement store, and one of my co-workers does some contracting work on the side. He is trying to encourage one of his neighbors to put some simple small-park kind of stuff on a plot of land he owns so that he (my co-worker) can pick up some extra business installing it.
He's seen me messing around with Stable Diffusion on some web apps at work during my downtime, and he asked whether I could take a photo of the site and use AI to insert some of these elements into it, so that he could show it to this potential client and maybe sell it to him that way.
"Sure," I said, thinking to myself, 'I can just use inpainting to blend this stuff into the image pretty seamlessly. Easy-peasy.'
It took me almost a full day of on-again, off-again work to get a picnic table I could live with. But I CANNOT get any model, any prompt, anything to make a swing set that I can live with. I've been pecking away at this problem for several days now, and every single attempt at a swing set has resulted in something that is mangled, twisted, or some terrible hybrid of OTHER playground equipment that my co-worker definitely doesn't want in the scene.
At this point I'm just working on it for the challenge, but I admit that I'm stumped. Short of training my own Lora, does anyone have any advice on how to make a coherent swing set to bring into this image? \>\_\< Yes, this is a silly problem to have, I admit that, but I've also learned a great deal about how Stable Diffusion 'thinks' in the last few days, so I consider it a learning experience. Still, does anyone else have any ideas?
- [SDXL] [ALBUM] SDXL Landscape Testing w/ Dynamic Prompt included.
cross-posted from: https://lemmy.dbzer0.com/post/1352467
> These have all been created with the base SDXL release + refiner in Automatic1111's web UI, using the extension that adds the "refiner" (https://github.com/wcde/sd-webui-refiner).
>
> 1216x832 resolution with Hi-Res Fix at 2x using the Siax-200k upscaler and DPM 2 a Karras sampler.
>
> Here's the dynamic prompt. You need to install the dynamic prompt extension from (https://github.com/adieyal/sd-dynamic-prompts).
>
> This is the prompt I used:
>
> {natural | forest | mountain | beach | desert| rural | urban | city | sci-fi | space | dystopian| futuristic},
>
> {cinematic shot | establishing shot | intimate scene | sweeping grandeur},(21:9 aspect ratio)
>
> {(ultrawide panoramic cinematic aspect ratio)}
>
> {Blender | Photoshop|Octane render|Unreal Engine 5|8K RAW Photo} ,
>
> {2-4$$lone twisted tree | winding river| mountain peak| crumbling ruins| abandoned cabin|wooden fence | dramatic cliffs | stormy sea | rolling thunder | howling wind | foggy moor | charred forest | broken-down cart| towering dunes | parched canyon | bone-strewn pit | petrified woods| wrecked galleon | beast's den| majestic waterfall | calm lake | moonlit trail | moss-covered stones | misty vale |ravaged battlefield | derelict mill}
>
> {cirrus clouds |stormy sky|cumulus clouds|stratus clouds|nimbostratus clouds|cumulonimbus clouds}
>
> {clear | atmospheric fog | mist | haze | pollution| dust |smoke |atmospheric halo| sun dogs | moon dogs | sun pillars | circumzenithal arcs|circumhorizontal arcs},
>
> {abstracted | concept art| Hyperrealistic| stylized| fantasy| impressionistic | photo| realistic }
>
> (16K, 32bit color, HDR, masterpiece, ultra quality)
>
> {brutalist | minimalist| whimsical| retro futurist}
>
> {muted tones | vibrant hues}
>
> {warm sunset tones |cool muted blues | colors}
>
> {natural | warm| dramatic }
>
> {god rays | sun beams | crepuscular rays| antisolar rays | volumetric light | light pillars | sun pillars | moon pillars},
>
> {dawn | sunset| night} {clear | overcast | fog}
>
> {winter | spring | summer | autumn}
>
> { volumetric shadows | volumetric ambiance | aerial perspective | depth fog},
>
> in the style of
> {1-2$$Dylan Furst |Ash Thorp | Simon Stålenhag | Bob Ross| Ralph McQuarrie | Syd Mead| Moebius| Daarken| Felix Yoon| Gustave Doré| Arnold Böcklin| William Blake | Frank Frazetta| John Constable |J.C. Dahl }
> and
> {1-2$$James Gurney | Craig Mullins| Android Jones |Justin Maller | John Berkey| Roger Dean| Rodney Matthews | Chris Foss| Nicolas Roeg | Geoffrey Hayes | John Harris| Dinotopia| Jon Foster| Brom| Brian Froud | Alan Lee},
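For anyone unsure what the `{a | b | c}` and `{2-4$$a | b | c}` groups in that prompt do, here is a rough Python sketch of the idea: each group is replaced by one random option, or by several distinct options joined with ", " when a count or range precedes `$$`. This is not the extension's code. The `expand` function is a made-up name, it only handles non-nested groups, and it ignores other features of the extension such as custom separators and wildcards.

```python
import random
import re

VARIANT = re.compile(r"\{([^{}]*)\}")  # matches non-nested {...} groups only


def expand(prompt: str, rng: random.Random) -> str:
    """Replace each {a|b|c} or {lo-hi$$a|b|c} group with a random pick."""

    def pick(match: re.Match) -> str:
        body = match.group(1)
        low = high = 1  # a plain {a|b|c} group picks exactly one option
        if "$$" in body:
            count, body = body.split("$$", 1)
            bounds = count.split("-")  # "2" -> ["2"], "2-4" -> ["2", "4"]
            low, high = int(bounds[0]), int(bounds[-1])
        options = [opt.strip() for opt in body.split("|") if opt.strip()]
        k = min(rng.randint(low, high), len(options))
        # sample() picks without replacement, so no option repeats in a group
        return ", ".join(rng.sample(options, k))

    return VARIANT.sub(pick, prompt)


template = (
    "{forest | desert | city}, "
    "{2-3$$winding river | calm lake | misty vale | foggy moor}, "
    "{dawn | sunset | night}"
)
print(expand(template, random.Random()))
```

Passing a seeded `random.Random(1234)` instead makes a given expansion repeatable, which helps when comparing prompts.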
- Cars with Juggernaut model.
Tried my hand at cars. I used the Juggernaut model. The prompt uses regional prompter extension and looks like this:
> at night city, summer, sweaty ADDCOMM
> RAW photo, Nikon Z6 II Mirrorless Camera, hyper realism, extremely detailed, 8k uhd, dslr, soft lighting, high quality, film grain ADDBASE
> ADDROW
> ADDCOL
> Aston Martin zeekars, hotrod, LED, lora:zeekars:.7 ADDROW
> Negative prompt: JuggernautNegative-neg, transition of shapes, blurry, (((numbers on door))), duplicates, close range angle, crop, illustration, drawing, painting, sketching, render, artwork, 3d, cgi, logo, text, letterbox, 3D, render, video game, anime, cartoon, sketch, caption, subtitle, signature, watermark, username, artist name
I use SD upscale and ControlNet Tile for upscaling with 4x\_NMKD-Siax\_200k. zeekars is a LoRA.
I hope SDXL will improve all the small details in mechanical, geometric forms. At first glance the image looks great, but when you look closely, the panel separations, headlights, and tyres look unrealistic. We will see tomorrow!
- Just got an RX 7900 XTX running Stable Diffusion on Debian Trixie
Just wanted to put this out there for anyone else who was in the same position, as I'd spent some time banging on this to find a functioning combination and would have appreciated having had success reports myself.
Running Debian Trixie, current as of July 22, 2023.
I see 512x512 speeds of about 2.2 it/s, which is significantly slower than a lower-end Nvidia card I'd used, and significantly slower (about 1/8th the speed) than what other people have reported getting from the same RX 7900 XTX card on Linux, so there is probably more work for me to do. But it's definitely running on the GPU and is much faster than running on the CPU, so I know that this combination (vanilla system Python, vanilla system drivers, torch nightly in a venv) does at least work, which was something I'd been unsure of up until now.
Running on the host, no Docker containers. Using a venv. Automatic1111 web UI, in-repository drivers, torch 2.1.0.dev20230715+rocm5.5 installed via pip in a venv, standard system Python 3.11 (i.e. did not need to set up Python 3.8, as I've seen some people do). Needs the `non-free-firmware` apt repo component enabled; I have `firmware-amd-graphics-20230515-3`. ROCm 5.6 is out from AMD as of this writing, but Debian Trixie presently only has 5.5 packaged and in the repos.
I did need to install `libstdc++-13-dev`; having only `libstdc++-12-dev` installed caused Automatic1111 to bail out with an error about not being able to find a `limits` C++ header when building some C++ code at runtime. Some users had run into a similar error and resolved it by installing `libstdc++-12-dev`, which was a bit confusing. I have both clang and g++ installed. I am not terribly familiar with the AMD ROCm stack, but my understanding is that part of it (libamdhip64?) performs some compilation at runtime; it apparently remembers the binaries it has compiled, because when I removed `libstdc++-13-dev` after a successful run, it continued to work.
The user running the Automatic1111 frontend needed to be added to the `render` and `video` groups to have access to the requisite device files.
I did not need to have `HSA_OVERRIDE_GFX_VERSION` set.
As for options being passed in `COMMANDLINE_ARGS`, just `--medvram` and `--api`.
`--xformers` does not work with AMD cards; Stable Diffusion (or Automatic1111, unsure about responsibility in the stack) apparently just ignores it there; passing it doesn't break anything.
Some `--opt-sdp` options, like `--opt-sdp-attention`, cause dramatic slowdown, I assume by causing the generation to run on the CPU instead of the GPU. I'd suggest that anyone trying to get a similar environment running not start including optimization flags until they have things working without them; this had complicated things for me.
I see 2.59 it/s, so something like 20% higher performance, without `--medvram` being passed to `COMMANDLINE_ARGS`.
I have not done extensive testing to see whether any issues show up elsewhere with Stable Diffusion.
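For anyone retracing these steps, a small Python sketch along these lines can check two of the prerequisites above before launching the web UI: whether the current user is in the `render` and `video` groups, and whether the torch build in the venv sees the GPU. The helper names are made up for illustration, and the torch part only runs if torch is importable. `torch.version.hip` is `None` on non-ROCm builds, and ROCm builds report the GPU through the `torch.cuda` API.

```python
import grp
import os

REQUIRED_GROUPS = {"render", "video"}  # the groups named in the post


def missing_groups(user_groups: set[str], required: set[str] = REQUIRED_GROUPS) -> set[str]:
    """Return the required groups that the user does not belong to."""
    return set(required) - set(user_groups)


def current_user_groups() -> set[str]:
    """Names of the groups the current process belongs to (Unix only)."""
    names = set()
    for gid in {*os.getgroups(), os.getegid()}:
        try:
            names.add(grp.getgrgid(gid).gr_name)
        except KeyError:
            pass  # gid with no name entry in the group database
    return names


if __name__ == "__main__":
    missing = missing_groups(current_user_groups())
    print("missing groups:", sorted(missing) or "none")
    try:
        import torch  # only present inside the venv that has torch installed

        print("torch:", torch.__version__)
        print("HIP runtime:", torch.version.hip)
        print("GPU visible:", torch.cuda.is_available())
    except ImportError:
        print("torch is not importable in this environment")
```

Note that a newly added group membership only applies to sessions started after the change, so the check can still report a group as missing until the user logs in again.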
- Trio of Gamers
Random illustration for a story idea; the story idea may not pan out, but I was proud of how the art came out!
Steps: 40, Sampler: Euler a, CFG scale: 10, Seed: 900795974, Size: 1536x1536, Model hash: 25ba966c5d, Model: aZovyaRPGArtistTools\_v3, Denoising strength: 0.3, Clip skip: 2, Token merging ratio: 0.5, Ultimate SD upscale upscaler: 4x-AnimeSharp, Ultimate SD upscale tile\_width: 512, Ultimate SD upscale tile\_height: 512, Ultimate SD upscale mask\_blur: 8, Ultimate SD upscale padding: 32, Version: v1.3.2
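The parameter line above is plain "key: value" text, so it is easy to pull apart if you want to compare settings between images. This is a minimal Python sketch, assuming no value contains ", " (true for this line, not for every infotext). The `parse_parameters` name is made up. The tile-grid arithmetic further assumes that `Size` is the final image size and that the upscaler walks a simple grid of tiles, ignoring padding.

```python
import math


def parse_parameters(line: str) -> dict[str, str]:
    """Split 'Key: value, Key: value' text into a dict (values stay strings)."""
    params: dict[str, str] = {}
    for field in line.split(", "):
        key, sep, value = field.partition(": ")
        if sep:  # skip fragments that are not "key: value"
            params[key.strip()] = value.strip()
    return params


line = (
    "Steps: 40, Sampler: Euler a, CFG scale: 10, Seed: 900795974, "
    "Size: 1536x1536, Denoising strength: 0.3, "
    "Ultimate SD upscale tile_width: 512, Ultimate SD upscale tile_height: 512"
)
params = parse_parameters(line)
width, height = (int(n) for n in params["Size"].split("x"))

# Assumption: tiles form a plain grid over the final image, padding ignored.
cols = math.ceil(width / int(params["Ultimate SD upscale tile_width"]))
rows = math.ceil(height / int(params["Ultimate SD upscale tile_height"]))
print(params["Sampler"], f"{cols}x{rows} tiles")
```

Under those assumptions, a 1536x1536 image with 512x512 tiles works out to a 3x3 grid.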
- Head of StabilityAI counting down to 2 on Twitter
I hope it’s the open release of SDXL - the beta on the Stable Diffusion discord is getting pretty impressive.
In any case, I’d like a tea serving drone 😁.
- QR Code Scene for kbin.social
I made a large-scale, working QR code scene that points to kbin.social. I used the new QR Code Control for SD 1.5 (released here) and multiple rounds of upscaling in img2img using it, ControlNet Tile, and the t2ia color control, starting from a QR code I made with this QR code generator.