Stable Diffusion makes videos now?
(twitter.com)
Comments (14)
If this is true, we are only 5 to 10 years away from the death of Hollywood and big media, and on our way to massive VR/Holodecks
This stuff is still super hardware/power intensive. It can take 6-24 hours to render a movie like that, and you're leaving a LOT up to chance.
And it's still bad at doing anything with strong coherence. It can make pieces that look technically impressive, but they lack composition. If there's any kind of context or meaning in the pic, it's a fluke.
That's what I think every time I see some 'modern art'. Fitting.
A lot of modern art is money laundering bullshit. Get away from the heavily promoted shit and you'll find some quality stuff out there.
The ONLY way I found to get any coherency is making textual inversions, and even there you occasionally hit weird issues. I think I fucked myself when I picked a character that has a hairpin on one side of her face: I flipped the images to get more training data, and now she occasionally has hairpins on both sides, lol. Even if you work on coherency you still run into issues like that. I'd expect this to be worked on within a couple of years, though, and I expect something better to come along for textual inversion too.
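The hairpin problem above is a side effect of a standard data-augmentation trick: mirroring doubles your training images, but any asymmetric feature then appears on both sides of the training set, so the model learns it can go on either (or both). A minimal NumPy sketch, with tiny arrays standing in for real training images:

```python
import numpy as np

# Stand-in "image": the 1 marks a hairpin on the LEFT side of the face.
original = np.array([
    [1, 0, 0],
    [0, 0, 0],
])

# Horizontal flip doubles the training set...
flipped = original[:, ::-1]

# ...but now half the data shows the hairpin on the RIGHT,
# so the model sees the hairpin on both sides and learns both.
training_set = [original, flipped]
print(flipped[0].tolist())  # hairpin column is mirrored: [0, 0, 1]
```

The usual workaround is to skip flip augmentation (or flip only images where it doesn't matter) when the subject has asymmetric features.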
The training side of AI especially is super hardware intensive, and that's before you even try animation: a 30 fps movie means 30 images it has to generate for every single second of footage... but I've not done a lot of animation with AI (yet).
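The frame math adds up fast. A quick back-of-envelope calculation; the per-frame render time here is an assumed figure for illustration, not a benchmark:

```python
# How many images a 30 fps clip needs, and roughly how long that takes
# at an assumed render speed (hardware varies wildly).
fps = 30
clip_seconds = 60               # one minute of video
seconds_per_frame = 30          # assumed single-image render time

total_frames = fps * clip_seconds
render_hours = total_frames * seconds_per_frame / 3600

print(total_frames)   # 1800 images for one minute of video
print(render_hours)   # 15.0 hours at that assumed speed
```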
Good
I didn't realize SD was that capable. I'll have to look into it.
I think Stable Diffusion is just the name for the algorithm that makes the output incrementally more like the inputs - it can apply to any inputs or outputs.
Normally people run a prompt on a random noise image and it 'hallucinates' the final image out of it, like watching old-style TV static resolve, but it's way cooler if you start with an actual image: then it re-imagines it as something similar. For example, I took a little 2D clipart image and it restyled it as photorealistic while keeping the same shape. Pretty sure what this guy did was start with his video zooming in on a donut and have it generate a new image for each frame, one by one.
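The difference between the two modes described above (start from pure noise vs. start from an existing image) can be sketched with a toy denoising loop. This only illustrates the incremental idea, it is not the actual Stable Diffusion sampler:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.full((8, 8), 0.5)        # stand-in for "what the prompt implies"
clipart = np.zeros((8, 8))           # stand-in for the user's starting image
clipart[2:6, 2:6] = 1.0              # a simple square shape

def denoise(x, steps, rate=0.2):
    """Each step nudges x a little closer to the target (toy 'denoiser')."""
    for _ in range(steps):
        x = x + rate * (target - x)
    return x

# txt2img: hallucinate the result out of pure noise, many steps
txt2img = denoise(rng.normal(size=(8, 8)), steps=50)

# img2img: add only partial noise to a real image and run fewer steps,
# so the original shape survives while the style drifts toward the target
img2img = denoise(clipart + 0.3 * rng.normal(size=(8, 8)), steps=5)
```

With many steps the noise start converges to the target, while the few-step image start still carries the square, which is why img2img keeps the shape of your clipart.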
Unfortunately, the distribution I found that lets you start with an image only had Nvidia support, so each small image took 10 minutes and I deleted it... but it was really awesome.
Yeah, there's a shit-ton of these videos linked on reddit and 4chan, where people use prompts plus the previous frame as input for the next one. At the very least I expect the tools to be heavily used for surreal pseudo-futuristic advertising campaigns, perhaps even at the next Super Bowl.
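The frame-feedback trick those videos use (feed each generated frame back in as the init image for the next one) has a very simple loop structure. A sketch, where `img2img()` is a hypothetical stand-in for a real model call:

```python
def img2img(prompt, init_frame, strength=0.5):
    # Hypothetical stand-in: a real call would run a diffusion model on
    # init_frame. Here each pixel just drifts toward 1.0 so the loop runs.
    return [p + strength * (1.0 - p) for p in init_frame]

start_frame = [0.0, 0.2, 0.4]        # e.g. the first photo of the donut
frames = [start_frame]
for _ in range(10):                  # each output becomes the next input
    frames.append(img2img("surreal melting donut", frames[-1]))

print(len(frames))  # 11 frames total
```

Because every frame is conditioned on the previous one, small changes compound over the sequence, which is where the characteristic morphing/zooming look of these clips comes from.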
Not sure if this is the right place - but I know a lot of you are AI experts and this amazed me.
Forgive me for linking to reddit, but yes, animated video is absolutely on the table.
https://old.reddit.com/r/StableDiffusion/search?q=animated&restrict_sr=on&sort=relevance&t=all
There is also very impressive AI audio generation; this repo in particular produces ridiculously realistic-sounding sentences.
https://github.com/neonbjb/tortoise-tts
Blender already has a Stable Diffusion add-on for rendering a video.
I got one of the OP to Lain.
This one is using low settings that aren't fundamentally changing the picture. It looks more like a deep-fried filter than a meaningful use of the technology.