AI Videos with Stable Diffusion
Updated: Feb 27
By Harris Terry (dragonsbane2023)
While watching a number of incredible AI-altered videos online, I noticed that some had much more going on than a simple iPhone filter. I wondered how it was done and, after some experimentation, learned how to do a basic version using only Stable Diffusion to alter the images and FFmpeg to split and reassemble the video. There are plenty of ways to do this; this is simply how I get these results at the moment, and new programs or Stable Diffusion add-ons may well automate parts of it. Here are a few samples of what I mean:
As you can see, some of the videos have things morphing all around, including clothes changing or even the back wall of a room turning into a spaceport city. The more time you are willing to spend on the process, and the faster your Stable Diffusion install can iterate, the better your videos will be: you can practice more and waste less time on a video that renders for hours and comes out differently than you wanted.
Install
First, you will need to install FFmpeg from https://ffmpeg.org/download.html and follow the install directions at https://phoenixnap.com/kb/ffmpeg-windows carefully, as it is not a simple .exe installer but rather involves adding its directory to your Windows PATH. Do not give up! If you are using a different OS, make sure to find the correct installation instructions for it.
Second, you will need to have Stable Diffusion installed or at your disposal, specifically the img2img Batch functionality, which can alter hundreds of images in one run. Personally, I run SD at home with a 3070 Ti 8GB video card, 64GB of RAM, an i7-13700K processor at 5 GHz, and all M.2 SSDs. This allows me to alter 1920x1080 widescreen frames, or 1080x1920 portrait iPhone frames, without issue, although not very quickly. I will assume you are using a Web-UI version of Stable Diffusion, as I have never used it with just the command line.
Here is the workflow process from beginning to end:
I first create 4 folders: Input, Output, Process, and Backup. I make sure these folders are on my fastest drive. They keep the images organized.
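If you would rather script the folder setup than click through Explorer, here is a minimal Python sketch. The function name and the idea of a single project root are my own assumptions for illustration, not part of the workflow above.

```python
from pathlib import Path

# The four working folders named in the workflow.
FOLDER_NAMES = ("Input", "Output", "Process", "Backup")


def create_work_folders(root):
    """Create the four working folders under root and return them by name."""
    root = Path(root)
    folders = {}
    for name in FOLDER_NAMES:
        folder = root / name
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
        folders[name] = folder
    return folders
```

Calling create_work_folders("D:/sd_video") (a hypothetical path) would create all four folders on that drive in one step.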
Next, I take the video I am using and place it in the Process folder. Start with a short video, no more than 30 seconds, or the process might take many hours. I then click the Address Bar of the Process folder in Windows Explorer and type CMD to bring up the command prompt. Another way would be to type CMD in the Windows Search bar and then navigate to the correct folder, but I find clicking the Address Bar of the correct folder and typing CMD there faster and easier.
Next, I paste this into the command prompt:
ffmpeg -i in.mp4 img%04d.png
Make sure to change the name in.mp4 to the name of your video. It does not need to be an mp4, as I have used .mov files with no issues.
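The same extraction can be launched from a script, which is handy if you repeat the process often. This is only a sketch: it builds the exact command shown above and hands it to ffmpeg through Python's subprocess module. The function names are my own, and it assumes ffmpeg is already on your PATH.

```python
import subprocess
from pathlib import Path


def build_extract_command(video, frames_dir, pattern="img%04d.png"):
    """Return the ffmpeg arguments that split a video into numbered PNG frames."""
    return ["ffmpeg", "-i", str(video), str(Path(frames_dir) / pattern)]


def extract_frames(video, frames_dir):
    """Run ffmpeg; raises CalledProcessError if ffmpeg reports an error."""
    Path(frames_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_extract_command(video, frames_dir), check=True)
```

For example, extract_frames("in.mp4", "Process") would write img0001.png, img0002.png, and so on into the Process folder.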
When you press Enter, a bunch of data will begin to show in the command prompt window, hopefully with no errors. Once done, the entire movie will be individual images in the Process folder, all numbered and named. Place a copy of these into the Backup folder in case you need them for something. I leave them in the Process folder as well, so if I later remove frames, the replacements will be there already.
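Copying hundreds of frames by hand works fine, but the copy can also be scripted. A small sketch, assuming the frames follow the img0001.png naming from the command above; the function name is hypothetical. The same function can be pointed at the Input folder for the next step.

```python
import shutil
from pathlib import Path


def copy_frames(src_dir, dst_dir, pattern="img*.png"):
    """Copy every frame matching pattern from src_dir to dst_dir; return the count."""
    dst = Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    count = 0
    for frame in sorted(Path(src_dir).glob(pattern)):
        # copy2 keeps file timestamps; an existing file of the same name is overwritten.
        shutil.copy2(frame, dst / frame.name)
        count += 1
    return count
```

For instance, copy_frames("Process", "Backup") followed by copy_frames("Process", "Input").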
Next, take these images and copy them into the Input Folder. Open Stable Diffusion, and go to the img2img section.
First, we will test out a frame to make sure it is what we are looking for, generally. In the first video above, I decided to make it in a Rococo style. I have been looking into art styles and I really love that one and find it very beautiful and elegant.
I put this in the prompt for that particular video:
“modelshoot style, (extremely detailed CG unity 8k wallpaper), full shot body photo of the most beautiful artwork in the world, Rococo style, angel wings, pristine, silk scarves, silk, professional majestic oil painting by Watteau, trending on ArtStation, trending on CGSociety, Intricate, High Detail, Sharp focus, dramatic, photorealistic painting art by midjourney and greg rutkowski”
You could change just the style and artist and a few words, and have a new style. I felt Watteau was the artist best representing the Rococo style, and I wanted things found in that style, like angel wings, silk scarves, intricate, and so on.
I used this as the negative prompt:
“(((wide face))), black and white, nose ring, jpeg artifacts, cartoon, disfigured, bad art, deformed, blurry, bad anatomy, bad proportions, gross proportions, b&w, weird colors, duplicate, morbid, mutilated, mutation, out of frame, cross-eye, body out of frame, extra heads, extra limbs, extra fingers, extra arms, extra legs, malformed limbs, missing arms, missing legs, fused fingers, too many fingers, long neck, cloned face, mutated hands, poorly drawn hands, poorly drawn face, poorly drawn feet, Photoshop, video game, tiling, 3d render”
Then set the rest as you like. I usually use Euler, 30 steps, CFG from 5 for something conservative to 10 for something wild, and Denoise from 0.3 to keep the frame very similar up to 0.5 for bigger changes. The right Denoise value may vary with the size of the images. Make sure to set a fixed seed so the frames stay similar; otherwise things change too much from frame to frame, in my opinion, unless that is what you are going for. Drop the first image from the Input folder (the first frame of the video) into the img2img window, click “Generate”, and see if you like the result.
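I do all of this through the Web-UI, but for readers who want to script the single-frame test, here is a hedged sketch. It assumes you are running the AUTOMATIC1111 web UI started with its --api option on the default local address, which is an assumption on my part and not something this workflow requires. The field names follow that UI's img2img endpoint; the function names and defaults are illustrative.

```python
import base64
import json
import urllib.request
from pathlib import Path

# Assumed address of a local AUTOMATIC1111 web UI launched with --api.
API_URL = "http://127.0.0.1:7860/sdapi/v1/img2img"


def build_payload(frame_path, prompt, negative_prompt, seed,
                  steps=30, cfg_scale=5, denoising_strength=0.3,
                  width=1920, height=1080):
    """Bundle one frame and the settings from the text into a request body."""
    image_b64 = base64.b64encode(Path(frame_path).read_bytes()).decode("ascii")
    return {
        "init_images": [image_b64],
        "prompt": prompt,
        "negative_prompt": negative_prompt,
        "sampler_name": "Euler",
        "steps": steps,
        "cfg_scale": cfg_scale,                    # 5 conservative, 10 wild
        "denoising_strength": denoising_strength,  # 0.3 similar, 0.5 bigger changes
        "seed": seed,                              # fixed seed keeps frames similar
        "width": width,                            # match the frame size
        "height": height,
    }


def generate(payload, url=API_URL):
    """Send the payload and return the first result image as PNG bytes."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        result = json.load(response)
    return base64.b64decode(result["images"][0])
```

Only generate talks to the server; build_payload can be tried on its own to check the settings before anything is rendered.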
If you don’t, then refine by changing values, prompt, seed, and so on. Once you have an image you like, switch to the Batch area.
In the Batch section of img2img, we will set a few things. Make sure the Input Path and Output Path point to the folders we created, and make sure your settings are the same as in the test (I think they stick anyhow). Click “Generate” and you should see the batch process begin. If you watch the Output folder, you will see the images begin to appear, at a pace that depends on how powerful your computer is. After the first few are there, check them to see if they are what you are looking for. You do not want to process hundreds of images only to discover they are not what you wanted! If they are good, let it continue until done.
Next, go to the Output folder and, if you want, remove any frames that look out of place. I usually try not to remove any, even if they are not great, unless something pornographic or offensive pops up. I then copy all of the Output images back into the Process folder and overwrite everything. If you removed frames, the original versions you left in the Process folder earlier fill those gaps, so the Process folder should now hold every frame, processed, in numerical order with no gaps.
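Since the next step reads the frames as a numbered sequence, a gap in the numbering can cut the video short. Here is a small sketch that lists any missing frame numbers in a folder. It assumes the img0001.png naming used above with numbering that starts at 1; the function names are my own.

```python
import re
from pathlib import Path

# Matches names like img0001.png and captures the number.
FRAME_RE = re.compile(r"img(\d+)\.png$")


def frame_numbers(folder):
    """Return the set of frame numbers found in folder."""
    numbers = set()
    for path in Path(folder).iterdir():
        match = FRAME_RE.match(path.name)
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def missing_frames(folder):
    """List frame numbers absent between 1 and the highest number present."""
    present = frame_numbers(folder)
    if not present:
        return []
    return [n for n in range(1, max(present) + 1) if n not in present]
```

An empty list from missing_frames("Process") means the sequence is complete. Any numbers it returns can be restored from the Backup folder.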
Next, click the Address Bar of the Process folder in Windows Explorer, type CMD to bring up the command line again, and paste:
ffmpeg -r 30 -f image2 -s 1920x1080 -i img%04d.png -vcodec libx264 -crf 10 -pix_fmt yuv420p final.mp4
Again, before you hit Enter, set 30 to your exact frame rate (sometimes it's 29.97 instead of 30), or to something slower if you want slow motion (24 is what I used for the first video above). Set the resolution where 1920x1080 appears (again, the same as the source video), and change final.mp4 to the name you want for your video file.
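If you are not sure of the exact frame rate or resolution, ffprobe (installed alongside ffmpeg) can report them. The sketch below reads the first video stream's width, height, and frame rate, then builds the same encode command as above. The function names are my own. Note that ffprobe reports the rate as a fraction such as 30000/1001, which ffmpeg's -r option accepts as-is.

```python
import json
import subprocess
from fractions import Fraction


def parse_probe(ffprobe_json):
    """Pull (width, height, frame rate) out of ffprobe's JSON output."""
    stream = json.loads(ffprobe_json)["streams"][0]
    return stream["width"], stream["height"], stream["r_frame_rate"]


def probe_video(video):
    """Ask ffprobe about the first video stream of the original video."""
    output = subprocess.run(
        ["ffprobe", "-v", "error", "-select_streams", "v:0",
         "-show_entries", "stream=width,height,r_frame_rate",
         "-of", "json", str(video)],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_probe(output)


def rate_as_float(rate):
    """Convert a rate like '30000/1001' into a decimal number for display."""
    return float(Fraction(rate))


def build_encode_command(rate, width, height, output="final.mp4"):
    """Return the ffmpeg arguments that turn the numbered frames into a video."""
    return ["ffmpeg", "-r", str(rate), "-f", "image2",
            "-s", f"{width}x{height}", "-i", "img%04d.png",
            "-vcodec", "libx264", "-crf", "10",
            "-pix_fmt", "yuv420p", output]
```

Passing a lower rate than the one reported, such as 24, gives the slow-motion effect described above.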
To reverse the video, I used this command, as it looked cool going backward:
ffmpeg -i originalVideo.mp4 -vf reverse reversedVideo.mp4
Just make sure to change originalVideo.mp4 to the name of your video, and reversedVideo.mp4 to a new title.
I then open the movie in a video editor to add music (funnily, I use Video Editor from Windows, as it's easy to add music or change the video speed to 0.8 to make it more slow-motion).
And voila! You should have your movie with animations from the prompt you set. Again, it is important to check a few images in the Output folder early to make sure they look the way you want. Rendering hundreds of frames takes a long time, and it is discouraging to find the result messed up because the Denoise or CFG was not set the way you wanted.
You can find my work at