
Get Stable Diffusion locally on your PC (RTX card needed): no restrictions and no censorship.

Tumle

Member
RTX specifically is not required. Any Nvidia card which can run CUDA, which is basically all of them for the past decade, can run this.

However your limit is VRAM. You need a lot. My RTX 3090 with 24 GB of VRAM throws errors trying to render 1024x1024, though it looks like for a lot of people the sweet spot is 704x512.
I can run 512x512 on my RTX 3070... the VRAM needed must climb very steeply with resolution.
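It does, if I understand it right: a big chunk of the cost is the self-attention inside the U-Net, which grows with the square of the number of latent pixels, so doubling each side costs roughly 16x. A back-of-the-envelope sketch in Python (my own illustration, not code from any repo):

```python
# Rough illustration only: SD works on an 8x-downscaled latent, and each
# self-attention map has tokens^2 entries, so memory climbs steeply.

def attention_entries(width: int, height: int, downscale: int = 8) -> int:
    """Entries in one self-attention map at the largest U-Net resolution."""
    tokens = (width // downscale) * (height // downscale)
    return tokens * tokens

base = attention_entries(512, 512)
for w, h in [(512, 512), (704, 512), (1024, 1024)]:
    print(f"{w}x{h}: {attention_entries(w, h) / base:.1f}x the attention memory of 512x512")
```

That puts 704x512 at about 1.9x the attention memory of 512x512, and 1024x1024 at 16x, which lines up with where people's cards start throwing errors.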
 

Tumle

Member
This is interesting and scary, as I thought art would never be a field that would be automated by machines. In my opinion the best application for this seems to be landscapes for concept art. I'm curious to know how this works, whether it's just "remixing" artwork or creating new pieces.
Not sure exactly how it works... but it's not just remixing pictures. I'll see if I can find the video I saw about it on YouTube. :)
 

Mikado

Member
It just refuses to put an entire character in the frame :messenger_face_steam:

(That's probably merciful, since it doesn't do a great job with faces in this hand-drawn style anyway)

TIkRF1P.png
TxyTLuL.png


sgcdhbe.png


9ff445f2-2bed-44b9-8701-d1f17a3e29a2_text.gif


Honestly? I'm not sure the Stable Diffusion method will really lead to consistently presentable AI-generated art for things like character design.
Not that such a thing won't be achieved, but I think it will take a different approach. SD just feels like a bit of a local maximum at the moment.

But it's great for breaking through a creative block and getting a bunch of ideas to riff on manually: "I like the helmet from this one, and the boots from this one."

Edit: Actually the biggest breakthrough is being able to run it locally with no cost-per-generation or other limits.

Like, I can just leave it running on a spare machine for hours. It generates a plausible concept every 12 seconds. In the time it would take to work up one design to a semi-polished state, hundreds or even thousands more sketches would be available. This isn't true for the web versions.
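The unattended loop itself is trivial. A minimal sketch using the Hugging Face diffusers library (an assumption on my part; the local scripts people in this thread are running differ in details, but the shape is the same):

```python
# Minimal unattended-generation loop; sketch assumes Hugging Face `diffusers`.
import os
import torch
from diffusers import StableDiffusionPipeline

os.makedirs("out", exist_ok=True)
pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

prompt = "concept art of a sci-fi helmet, intricate, highly detailed"
for _ in range(10_000):                      # or until you get bored
    seed = torch.seed()                      # fresh random seed each pass
    gen = torch.Generator("cuda").manual_seed(seed)
    image = pipe(prompt, generator=gen).images[0]
    image.save(f"out/{seed}.png")            # filename doubles as the seed
```

Saving the seed in the filename means any keeper can be regenerated, or re-run with a tweaked prompt.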
 

Mikado

Member
Also, I see people use exclamation marks and parentheses... do they have any effect on the outcome?
I tried this a bit. It seems like it sort of changes the weights on the tokens, and produces different images for the same seed (with an otherwise identical prompt). But I couldn't correlate those changes to any markup I made around specific words: if I write something like `face!!!!!!!!!!`, the face is still no more likely to be in the frame, much less emphasized. Instead, say, a boot will have a different pattern.
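If anyone wants to run the same A/B test, the trick is just pinning the seed. A sketch with the Hugging Face diffusers library (an assumption; people here are mostly on various local scripts, but the idea carries over):

```python
# Same-seed A/B test: only the prompt punctuation changes between runs,
# so any difference in output comes from the tokens, not the noise.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
).to("cuda")

SEED = 1234                                   # arbitrary, but pinned
for tag, prompt in [
    ("plain",    "portrait, detailed face, fantasy armor"),
    ("emphasis", "portrait, detailed face!!!!!!!!!!, fantasy armor"),
]:
    gen = torch.Generator("cuda").manual_seed(SEED)  # identical noise field
    pipe(prompt, generator=gen).images[0].save(f"ab_{tag}.png")
```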
 
Try MidJourney, it blows all these programs away and is terrifying...heres some of what ive done on it.

I see adobe integrating this into Photoshop so artists can quickly iterate their own ideas into things.

grid_0.png

grid_0.png

grid_0.png
 

Mikado

Member
Try MidJourney, it blows all these programs away and is terrifying...heres some of what ive done on it.

I see adobe integrating this into Photoshop so artists can quickly iterate their own ideas into things.

(Personal rant time)

Midjourney and all the "Other Peoples' Computers" services definitely make some great results but I have several problems with them from my own perspective (which is why I've been sleeping on this whole field so far):

- I hate "cloud" services in general, but especially for art tools.
- Pay-per-iteration is an incredibly stupid concept, especially for a stochastic process like this (I think you can pay $600 for some sort of subscription to MJ that might get you more tokens or something? Not sure, because screw that. And also, the UI frontend is a Discord channel? :nougat_rofl: )
- As a principle, I very much don't like the "Tools Dictate Their Usage" policy. Someone else gets to decide what you are allowed to create. Today maybe it's tits. Tomorrow maybe it's the wrong politics.

I have no expectation that Adobe's (inevitable) implementation will be any different since they already don't stop harassing people about making sure they save everything online instead of locally.

Being able to run your own tools, on your own machine, with a model of your own choosing (yes, someone else probably built the model but you can choose which one you integrate), is a Big Deal to me.

Some people's Hills To Die On are, like, paying full price for Day One cosmetic DLC. For me, it's non-local art tools. It is what it is.
 
Midjourney and all the "Other Peoples' Computers" services definitely make some great results but I have several problems with them from my own perspective […]

Oh dude, I'm right there with you; I don't like the idea of paying for this and would rather have it on my own PC. Though right now... the tools I tried here suck compared to MJ; I'm hoping that changes though. Fortunately MJ is actually cheap... the $600 is only for large AAA-type studios. It's $10/month for 200 images or $30/month for unlimited.

As an industry artist... yeah, this kinda sucks for my concept artist friends. I'm a 3D artist though, and I welcome lots of AI help with the more tedious things (Substance tools, for instance), but yeah, eventually AI will be coming for me too. Though it will take a little longer.

There will always be a need for both types of artists in some capacity though. Remember, AI only pools from the available data and input it was given... it's mimicking "happy woman" without knowing what one is. You can't paint in the style of Michelangelo if there was no Michelangelo to pull from.

That being said, it's basically the early stages of the holodeck... and I recall people still learning to play music, paint, and perform plays even with a holodeck.
 

EviLore

Expansive Ellipses
Staff Member
portrait of nick offerman, d & d, wet, shiny, fantasy, intricate, elegant, highly detailed, digital painting, artstation, concept art, smooth, sharp focus, illustration, art by artgerm and greg rutkowski and alphonse mucha


6091c171-e89f-47ff-8c09-15951067c4cf


900fe94f-9223-4662-95c3-286944ee4016


86f028fd-0695-439a-bce3-2b93845a52d3
 

Mikado

Member
As an industry artist... yeah, this kinda sucks for my concept artist friends. […]

There will always be a need for both types of artists in some capacity though. Remember, AI only pools from the available data and input it was given... it's mimicking "happy woman" without knowing what one is. You can't paint in the style of Michelangelo if there was no Michelangelo to pull from.

Right now (for professionals), I think it would mostly be useful as a modern Deck of Oblique Strategies. Speaking for myself, there is a tendency to fall back on familiar pattern-language (3 Holes in a Triangular Formation! 45 Degree Panel Lines! That X-Shaped Indentation!). Maybe there's some value to be had in "Just Letting it Happen" and possibly coming up with something that one wouldn't have thought of on their own under the same time constraints, then providing a layer of "intelligence" by repainting or re-combining items from the collection of "found objects".

Of course, I'm finding that this process has a pattern-language of its own, and certainly, as you mentioned, it's mostly just regurgitating elements learned from existing visual treatments. It's effectively collage, not creation. It will be interesting to see where it goes.
 

EviLore

Expansive Ellipses
Staff Member
hyperrealistic portrait of high detail amber heard as a bee queen in ornate black robe yellow swan feathers as the mistress in fear being chased horror. by jeremy mann, fantasy art, photo realistic, dynamic lighting, artstation, poster, volumetric lighting, 4 k, award winning


ebac1b93-199c-43b9-89f6-b7711902dad3


ab777179-a4dc-43dd-9422-974f4f62a6d4




Elsa from frozen portrait of Scarlett Johansson, au naturel, hyper detailed, digital art, trending in artstation, cinematic lighting, studio quality, smooth render, unreal engine 5 rendered, octane rendered, art style by klimt and nixeu and ian sprigger and wlop and krenz cushart


f439695a-975a-4f5f-a315-b0b7eff70160


scarlett johansson as delirium from sandman, ( hallucinating colorful soap bubbles ), by jeremy mann, by sandra chevrier, by jamie hewlett and richard avedon, punk rock, tank girl, high detailed, 8 k


97bdfdb8-be5b-4fb7-a128-4fb9d2a068c1


96e8da0a-5bc5-47f5-922e-bb57d0936f1d
 
yMjpSXg.png


QuMCE09.png


My prompt: "Super Mario as the Hunter from Bloodborne, in the style of el greco, late renaissance, hyper realistic, hyper detailed, trending on artstation"

This is wild. My RTX 3080 can generate one picture per 17 seconds. So much fun and the possibilities seem limitless.

zqpUldR.png

"Super Mario jumping on a realistic turtle." Turtle is not pleased.


bsNPjXD.png

"Dynamic concept art of Max Payne wearing a leather jacket, neo noir, rainy streets, cyberpunk, neotokyo, stunning lighting, highly detailed, realistic, 4k"

Edit: Sidenote: using myself as a case study, there is still artistry involved in using this program, meaning that I can put mash-up/remix prompts in, but I'm not getting the kind of outputs that would win an art contest. In that way, I think it might change WHAT an artist is, but won't necessarily change WHO the artists are, if that makes sense. But it's very fun to play with as a layperson!
 

Mikado

Member
i8a1J1N.png
JBxvznL.png


I'd have watched whatever OVA these were designs for, back in the '90s.

I wish we could get a list of the elements this thing pulled in to make these images so we could check out the original works. Some of these are way too clean not to be literal rips of something that exists.
 

Wildebeest

Member
It just refuses to put an entire character in the frame :messenger_face_steam:

I tried some magic words to get characters in frame, and used a subject where it didn't matter that the face looked like a dog's breakfast. Seemed to work.

"Greenskin Half Orc warrior brooding in an attractive way, by Akim Kaliberda, pathfinder character art digital art trending on artstation HQ"


XirsHau.png
 

Mikado

Member
Can someone TL;DR this for me? So this is all generated by AI in the form of text and scripting?

Paraphrasing and simplifying here, but this approach sort of works as follows:

It starts with a field of noise. Then, through a series of iterative steps, it tries to reverse the noise back into an image by selecting from a group of input functions (themselves created by processing a large collection of input images with associated keywords, mostly scraped off the internet), weighted by the user's prompt keywords. If you don't run enough iterations, it just produces chaos.
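In code terms the skeleton is just a loop. A structural sketch (the trained, prompt-conditioned U-Net is replaced by a stand-in here, and I'm assuming the Hugging Face diffusers scheduler; this is not the literal script behind the images below):

```python
# Structural sketch of the reverse-diffusion loop only.
import torch
from diffusers import DDPMScheduler

scheduler = DDPMScheduler(num_train_timesteps=1000)
scheduler.set_timesteps(50)               # the "iterations" shown below

sample = torch.randn(1, 4, 64, 64)        # start from pure noise; a 512x512
                                          # image is a 64x64x4 latent in SD

def stand_in_unet(x, t):
    # In the real model this network predicts the noise to strip away,
    # steered by the embedding of your prompt keywords.
    return torch.zeros_like(x)

for t in scheduler.timesteps:
    noise_pred = stand_in_unet(sample, t)
    sample = scheduler.step(noise_pred, t, sample).prev_sample
# After enough steps, `sample` would be a clean latent for the VAE to decode.
```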

Prompt:
close up face shot of a beautiful young woman, cute face, tacticool, sci-fi, cables, digital illustration, line art by yoji shinkawa and masamune shirow

1 Iteration:
KBk8Z8t.png


2 Iterations:
uJ4f6rs.png


4 Iterations
YljNQX0.png


5 Iterations:
ailiApb.png


10 Iterations:
zv7VVzK.png



50 Iterations:
uSZ9tz8.png


The entire approach is strongly influenced by the breadth and quality of the source data; the result of training on that data is what's called the "model".

Edit: You can see some of the locality at work by slightly changing the prompt and keeping the same seed. I just added the word sunglasses to the above prompt and got this:
bItnVmB.png


You can see that while the overall shape of the image (determined largely by the first few iterations) stays the same, the addition of the sunglasses element clearly brought in a different input for a lot of nearby face details, so the mouth and chin structure ended up completely different despite no change in guidance there. Also, the tops of the frames are... weird, probably where it's slightly indecisive about whether the sunglasses are a better match for the original dark eyes.
Screwing with the weighting of the sunglasses keyword by writing it as sunglasses!!!!!!!!! does seem to influence the weighting, but that doesn't mean it makes better sunglasses (the computer doesn't know anything about the elements it's bringing in). It apparently just increases the weight of image functions that carry the sunglasses keyword, and that keyword-to-function association is only as good as the original tagging on the source images.

Disturbing results ensue:
43fh7KY.png


Overall, it's a neat trick, but it's not magic.
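(As for why the overall composition survives a small prompt tweak: the seed alone pins the starting noise field. A quick check in plain PyTorch, my own illustration:)

```python
# Same seed -> bit-identical starting noise, so the first few iterations
# lay down the same composition regardless of small prompt edits.
import torch

g1 = torch.Generator().manual_seed(42)
g2 = torch.Generator().manual_seed(42)
a = torch.randn(1, 4, 64, 64, generator=g1)
b = torch.randn(1, 4, 64, 64, generator=g2)
assert torch.equal(a, b)
```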
 

Droxcy

Member
This is interesting and scary, as I thought art would never be a field that would be automated by machines. In my opinion the best application for this seems to be landscapes for concept art. I'm curious to know how this works, whether it's just "remixing" artwork or creating new pieces.

It's a good starting point for level design and character design. I'll generate some prompts for an art direction or feel I want and then go from there. But replacing real workers? That won't happen.
 
Paraphrasing and simplifying here, but this approach sort of works as follows […]


Pretty fucking cool
 
Paraphrasing and simplifying here, but this approach sort of works as follows […]

That's really helpful to understand how it's working. Thank you.
 

Tumle

Member
Try MidJourney, it blows all these programs away and is terrifying […]

Midjourney is implementing Stable Diffusion in a beta right now 😊
Also not a fan of having to use Discord to generate...
They all have their strengths and weaknesses.
 

Makoto-Yuki

Gold Member
I just tried to install this but I can't get it working.

I downloaded Anaconda and extracted the GitHub repo into it. I had to download something else, but I couldn't figure out where the fuck to download it. I signed up to the site but I don't see any download link. How hard is it for websites these days to give you a simple download link?

Anyone able to help me out?

Edit: Never mind. I never read the OP properly lol. I was trying to do it all from the GitHub page.
 

Tumle

Member
I just tried to install this but I can't get it working. […] Edit: Never mind. I never read the OP properly lol. I was trying to do it all from the GitHub page.
Yeah, you shouldn't have to do all the GitHub stuff, just download and execute 😊
 

Pakoe

Member
how are you getting such good results? what are your prompts?
old city! on fire on a rainy battlefield. by daniel f. gerhartz and matt stewart, fantasy, photorealistic, octane render, unreal engine, dynamic lighting, beautiful, perfect factions, trending on artstation, poster, volumetric lighting, very detailed faces, 4 k, award winning
 

Makoto-Yuki

Gold Member
old city! on fire on a rainy battlefield. by daniel f. gerhartz and matt stewart, fantasy, photorealistic, octane render, unreal engine, dynamic lighting, beautiful, perfect factions, trending on artstation, poster, volumetric lighting, very detailed faces, 4 k, award winning
What's with the exclamation marks? What do they do?

And what's the "by *artist names*" part? Does this connect online to find them?
 

Pakoe

Member
What's with the exclamation marks? What do they do?

And what's the "by *artist names*" part? Does this connect online to find them?
I'm no expert, let that be clear lol.
I've read that the marks put emphasis on a word; I haven't completely tested it yet.
As for the artists, I think it tries to emulate their style.
 

Tumle

Member
What's with the exclamation marks? What do they do?

And what's the "by *artist names*" part? Does this connect online to find them?
No, it doesn't connect online at all; it's all stored in the trained model. I tested by running a prompt without being connected to the internet and it still worked 😊
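You can even force the tooling to refuse network access if you go the Python route. A sketch assuming the Hugging Face diffusers loader with the weights already downloaded (the local one-click installs bundle things differently):

```python
# Generation needs nothing but the local weights; fail loudly rather than
# phone home. Sketch assumes Hugging Face `diffusers` with cached weights.
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "CompVis/stable-diffusion-v1-4",
    torch_dtype=torch.float16,
    local_files_only=True,    # error out instead of hitting the network
).to("cuda")
pipe("a lighthouse at dusk, oil painting").images[0].save("offline_test.png")
```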
 

Mikado

Member
Boaty McBoatface 2: The Boatening:
9shPXeR.png
MHMVfUr.png

T9ZyfAx.png



Tempted to keep generating these till the heat death of the universe.

I'm not convinced that this algorithm is going to have that much effect on professional art creation.
But career meme-crafters are going to have to start seriously looking for new jobs.
 

Makoto-Yuki

Gold Member
Not mine, but found on Lexica. These are amazing. I don't know how people are getting such good results!

0eec9748-48f3-4545-b982-5242730815b1
27af1c69-af76-45ab-9117-a0208165de6d


Edit: This is the best I've got so far lol, I think it's kinda cool. The program isn't working anymore for me (something about CUDA memory?) so that's my fun over.

qK9gMfT.jpg
 

Tumle

Member
Not mine, but found on Lexica. These are amazing. […] The program isn't working anymore for me (something about CUDA memory?) so that's my fun over.
Sounds like it doesn't clear the memory on your graphics card after your prompts.. 😕
Oh, and try copying their prompts for details and settings, but insert your own scene prompts.
Also try the seed number they used and any other settings they changed 😊
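If it's the usual "CUDA out of memory" error, some standard PyTorch housekeeping between prompts can help. Generic calls only, not specific to any one SD script, and no cure if a single render genuinely needs more VRAM than the card has:

```python
# Generic PyTorch VRAM housekeeping between generations.
import gc
import torch

def free_vram() -> None:
    """Release cached GPU memory between prompts."""
    gc.collect()                      # drop Python refs to old tensors
    torch.cuda.empty_cache()          # hand cached blocks back to the driver
    print(f"allocated: {torch.cuda.memory_allocated() / 2**20:.0f} MiB")

# Typical usage: run inference under no_grad so autograd buffers never
# pile up, then clean house before the next prompt.
# with torch.no_grad():
#     image = pipe(prompt).images[0]
# free_vram()
```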
 