Furkan Gözükara

Ovi SECourses Premium App to Generate 121-Frame Videos with Audio from Text and Images - Supports All GPUs Including RTX 5000 Series - Has Flash Attention, Batch Processing and Block Swapping - As Low as 8 GB VRAM - Like VEO 3 and SORA 2 - 1-Click Install on Windows, RunPod and Massed Compute

Added 2025-10-08 21:00:00 +0000 UTC

Patreon exclusive posts index to find our scripts easily, Patreon scripts updates history to see which updates arrived to which scripts, and amazing Patreon special generative scripts list that you can use in any of your tasks.

Join discord to get help, chat, discuss and also tell me your discord username to get your special rank : SECourses Discord

Please also Star, Watch and Fork our Stable Diffusion & Generative AI GitHub repository and join our Reddit subreddit and follow me on LinkedIn (my real profile)

=======

Latest zip file : Ovi_Pro_v8.zip

Full scale ultra advanced app for Ovi - an open source project that can generate videos from both text prompts and image + text prompts with real audio.

When Clear All Memory is selected (default in the 32 GB and below presets), make sure to click the Cancel button first and then close CMD, or generation will continue running as a subprocess

Project page is here : https://aaxwaz.github.io/Ovi/
I have developed an ultra advanced and easy to use Gradio app and much better pipeline that fully supports block swapping
- Our block swapping is based on the Kohya Musubi tuner implementation, thus it is the best in the world right now
Our app also supports Block Based FP8 Scaling, which is also based on Kohya Musubi tuner and is also the best in the world right now in terms of quality
- So we are not using base FP8; when enabled, we use FP8_Scaled, which has higher quality
- With intelligent Block Based Scaling there is almost no quality loss
- Our FP8_Scaled Base model reduces VRAM usage by about 10 GB and its safetensors file will be auto downloaded as well
Now we can generate full quality videos with as low as 6 GB VRAM with Block Swapping + Tiled VAE
- Our tiled-VAE implementation works the same way as ComfyUI's, so it is perfect quality and the best out there
The 1-click installer will install into Python 3.10.11 venv and will auto download models as well so it is literally 1-click
- My installer auto installs with Torch 2.8, CUDA 12.9, Flash Attention 2.8.3 and it supports literally all GPUs like RTX 3000 series, 4000 series, 5000 series, H100, B200, etc
All generations will be saved inside outputs folder and we support so many features like batch folder processing, number of generations, full preset save and load
- All generations will have metadata txt files saved as well
Look at the examples to understand how to prompt the model; this is extremely important
- You can use Google AI Studio and Gemini for free to write new amazing prompts; hopefully I will show this in an upcoming tutorial
Look at our screenshots below to see the app features
50 steps are recommended but you can go lower too, like 20
1-Click to install on Windows, RunPod and Massed Compute
Optimized presets for literally every GPU (starting from 6 GB to 96 GB)

15 October 2025 V8.3

New checkbox Merge LoRAs on GPU added
This makes LoRA merging faster on cloud services
This is auto enabled in the 80 GB and 96 GB configs but should work with 48 GB GPUs as well

9 October 2025 V8.1

Lots of amazing new examples added
Added new example tabs T2V Video Extend Examples and I2V Video Extend Examples
- These tabs will auto set Video Extend examples and set duration to 4 seconds for each clip - 12 seconds total
- Now when you click load example, it will auto switch back to generate tab
New zip file has prompt_generate_guide_in_Gemini which you can use to generate prompts
Bug in automatic Aspect Ratio setting with different base resolutions fixed
Bugs in manually setting Video Width and Video Height fixed
Please run Windows_Install_and_Update.bat to very quickly update to the latest version; you should see 8.1 at the top left of your Gradio interface

Windows Requirements

Python 3.10.11, FFmpeg, CUDA 12.9, cuDNN 9.12, C++ Tools, MSVC and Git
If you get any errors, follow the video below and its source link
https://youtu.be/DrhUHnYfwC0
Source post : https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-111553210

Massed Compute (Recommended Cloud) :

Please register via this link : https://vm.massedcompute.com/signup?linkId=lp_034338&sourceId=secourses&tenantId=massed-compute
- Use our coupon SECourses
- Our coupon works on all GPUs now
  - H100 has amazing price and speed but you can use like RTX A6000 ADA as well
  - Full details here : https://www.patreon.com/posts/26671823
- Then select our image SECourses from Creator dropdown
- Then follow Massed_Compute_Instructions_READ.txt
- Same as any of my other Massed Compute installer scripts
- Example tutorial to learn how to install and use Massed Compute
  - (Starts at 12:58) : https://youtu.be/KW-MHmoNcqo?si=G1WbG-Qw4ujWvOtG&t=778

RunPod (Cloud):

Please register via this link : https://runpod.io?ref=1aka98lq
- Then follow Runpod_Instructions_READ.txt
- Same as any of my other RunPod installer scripts
- Use the template written in Runpod_Instructions_READ.txt file
- Example tutorial to learn how to install and use RunPod
  - (starts at 22:03) : https://youtu.be/KW-MHmoNcqo?si=QN8X8Sjn13ZYu-EU&t=1323

9 October 2025 V7.6

This is a massive bug fix and performance improvement update
This time we fixed it for real, finally :D
Now we have prompt caching feature - auto enabled
- It will generate hash value of your Video Prompt + Video Negative Prompt + Audio Negative Prompt + FP8_Scaled enabled or not
- With this hash value, it will check if T5 encoding exists in prompt_cache or not
- If it exists, it will skip T5 encoding, which speeds things up immensely; if not, it will encode, cache and save
- This should work in all cases we have like single processing, batch processing, multi line, video extend, etc
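The cache lookup described above can be sketched roughly as follows. This is only an illustration and not the app's actual code: the function names `prompt_cache_key` and `get_or_encode`, the choice of SHA-256, the JSON file format and the `encode_fn` callback are all assumptions. Only the hashed inputs and the `prompt_cache` folder name come from the notes above.

```python
import hashlib
import json
from pathlib import Path


def prompt_cache_key(video_prompt, video_negative, audio_negative, fp8_scaled):
    """Build a stable hash from the four inputs listed in the changelog."""
    # json.dumps gives an unambiguous serialization, so ("ab", "c") and
    # ("a", "bc") can never collapse into the same key.
    payload = json.dumps([video_prompt, video_negative, audio_negative, bool(fp8_scaled)])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def get_or_encode(cache_dir, key, encode_fn):
    """Return (embedding, cache_hit); encode_fn runs only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    cache_file = cache_dir / f"{key}.json"  # hypothetical on-disk format
    if cache_file.exists():
        return json.loads(cache_file.read_text(encoding="utf-8")), True
    embedding = encode_fn()  # stands in for the slow T5 encoding step
    cache_file.write_text(json.dumps(embedding), encoding="utf-8")
    return embedding, False
```

Including the FP8_Scaled flag in the key means toggling that option produces a different key, so an encoding made with one setting is never reused for the other.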
I have entirely re-factored T5 system for above change and hopefully now we have finally fixed T5 infinite loop error, so update and try again
- The loop error was caused by auto enabling Delete T5 After Encoding and Clear All Memory at the same time, and this issue is now fixed
I have fixed the RAM leak that was happening when Clear All Memory was not enabled
- Therefore now 96 GB RAM PCs can disable Clear All Memory; it may work with 64 GB RAM too, so test
- Still, if you get an error, enable Clear All Memory
When Clear All Memory was not enabled, changing video duration after first run was not working since it was initializing and never changing again
- This bug fixed and changing video duration should work in all cases now
Auto pad for 32px divisibility checkbox added
- When enabled, it won't crop any part of the image; it will auto downscale to the target resolution and fill the missing parts with black pixels to make it divisible by 32
- Use with Auto Crop Image - don't disable it
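The padding arithmetic can be illustrated with a small sketch. It is an assumption-laden example, not the app's code: the function name `fit_and_pad` is hypothetical, and centering the image inside the black border is a guess, since the notes above only say that missing parts are filled with black pixels.

```python
def fit_and_pad(img_w, img_h, target_w, target_h, multiple=32):
    """Return ((new_w, new_h), (left, top, right, bottom)) without cropping."""
    if target_w % multiple or target_h % multiple:
        raise ValueError(f"target size must be divisible by {multiple}")
    # Scale so the whole image fits inside the target; nothing is cropped.
    scale = min(target_w / img_w, target_h / img_h)
    new_w = min(target_w, max(1, round(img_w * scale)))
    new_h = min(target_h, max(1, round(img_h * scale)))
    # The leftover area is filled with black; centering is an assumption.
    pad_x, pad_y = target_w - new_w, target_h - new_h
    left, top = pad_x // 2, pad_y // 2
    return (new_w, new_h), (left, top, pad_x - left, pad_y - top)
```

For a 1920x1080 image and a 960x544 target, this scales the image to 960x540 and adds 2 black rows at the top and 2 at the bottom. With an image library, the scaled image would then be pasted at (left, top) onto a black canvas of the target size.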
Do not auto enforce validation check was not working in all cases but this issue is fixed and should work now
Prompt validation system improved
- Now it will show you errors of single generation on Gradio
- So you will see your error
- When doing batch processing, it won't start the batch and will show the errors on cmd
  - You can always enable Do not auto enforce validation check and skip auto enforce
Sorry for the errors

8 October 2025 V6.4

LoRAs were not working accurately with the FP8_Scaled base model and this issue is fixed
However, for LoRAs to work, we have to re-scale on the fly each time, so the cached scale won't be used
- This adds a delay of about 10 seconds
- The LoRA-merged FP8_Scaled version won't be saved; we can add a save feature if you wish, but that means an 11.5 GB model file for every LoRA combination
- I will look further into whether there is any way to apply LoRA to the FP8_Scaled base model cache
In some cases, users reported that T5 encoding was looping infinitely; this issue is hopefully fixed
Hopefully I will add T5 text encoder caching to a cache directory, so the same prompt will use the cache directly instead of re-encoding
Hopefully I will add a load background sound feature, so you can upload background music and have it auto added
Just run Windows_Install_and_Update.bat to very quickly update to the latest version; you should see 6.4 at the top left of your Gradio interface

6 October 2025 V6.3

Exact resolution issue fixed and the model will use exactly the resolution you give on interface Video Width and Video Height
When doing batch processing, based on your base Width and Height, it will auto crop and resize your input folder images accurately to exact resolutions
- e.g. when base resolution is 960x960, a 1152x1728px image will be generated as 768x1184
Be careful, bigger resolution uses more VRAM
When doing video extension, it was not keeping the Sage Attention selection; now it will respect your selection
First example prompt issue fixed
Do not auto enforce validation check added
- So you can generate videos without any speech tags
Now we are supporting up to 4 LoRAs
- Put your loras into lora or loras folder - case insensitive
- You can apply LoRA to Video Layer, Sound Layer or Both
- An example working LoRA : https://civitai.com/models/1936797/glowing-eyes-wan-22-5b-i2v?modelVersionId=2192059
  - Verified working
- LoRA feature is not fully tested yet but seems to be working perfectly
Just run Windows_Install_and_Update.bat to very quickly update to the latest version; you should see 6.3 at the top left of your Gradio interface

6 October 2025 V5.9

This is a super important major update that brings our app close to maximum quality
Now you can use both videos and images as input
- When you upload a video as input, it will get the last frame, auto crop it if enabled and use it as a reference image. Then it will generate your video and merge it back with your input video, so you basically have a video extension feature for existing videos right now
- If you don't want auto combine, enable the Don't auto combine video input checkbox; it won't auto combine and will just use the last frame of the video as input
Now we have Multi-line Prompts feature
- When this is enabled, the prompt box input will be separated into lines; every line will become an individual prompt and a video will be generated for each prompt
- Lines shorter than 3 characters will be ignored, so you can leave 2 blank lines between prompts if you wish
- This will work with batch processing as well, just have your prompts multiple lines in your batch processing folder
- Don't enable Multi-line Prompts and Video Extension at the same time
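The line-splitting rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `split_prompts` is hypothetical, and stripping surrounding whitespace before measuring the length is a guess about how the 3-character rule is applied.

```python
def split_prompts(text, min_chars=3):
    """Turn a multi-line prompt box value into one prompt per usable line."""
    prompts = []
    for line in text.splitlines():
        line = line.strip()  # assumption: whitespace does not count as content
        if len(line) >= min_chars:  # lines shorter than 3 characters are ignored
            prompts.append(line)
    return prompts
```

Blank separator lines and very short fragments are dropped, so only real prompts are kept, one per generated video.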
Now we have Video Extension (Last Frame Based) feature
- When this is enabled, it will extend your video based on the number of lines you have
- Let's say you have a prompt that is 3 lines
- So first line will be base prompt and will generate 0001.mp4
- The second line will be second prompt, it will get last frame of 0001.mp4 and use it as an input image, use second line prompt and will generate 0001_ext1.mp4
- The third line will be third prompt, it will get last frame of 0001_ext1.mp4 and use it as an input image, use third line prompt and will generate 0001_ext2.mp4
- After all generations done it will merge all generations and generate 0001_final.mp4
- You can extend as many times as you want via the number of lines; it is fully automatic and working pretty well
- I will hopefully add example for this into examples tab soon
- Don't enable Multi-line Prompts and Video Extension at the same time
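The chaining and file naming described in the steps above can be sketched as a small planner. This is only an illustration: the function name `plan_extension` and the dictionary layout are hypothetical, and actual generation and merging are left out. The file names follow the pattern given above.

```python
def plan_extension(prompt_lines, index=1):
    """Plan which clip feeds which, following the naming in the notes above."""
    base = f"{index:04d}"
    steps = []
    previous = None  # the first clip has no earlier clip to take a frame from
    for i, prompt in enumerate(prompt_lines):
        name = f"{base}.mp4" if i == 0 else f"{base}_ext{i}.mp4"
        steps.append({"prompt": prompt, "last_frame_from": previous, "output": name})
        previous = name  # the next line starts from this clip's last frame
    return steps, f"{base}_final.mp4"
```

For a 3-line prompt this plans 0001.mp4, 0001_ext1.mp4 and 0001_ext2.mp4, with each extension taking its input image from the clip before it, and names the merged result 0001_final.mp4.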
Now we have presets for all lower VRAM GPUs for Scaled FP8 Base Model
- Scaled FP8 Base Model is working perfectly and 24 GB GPUs can generate 5 second videos without any Block Swap, thus ultra fast
I have implemented Sage Attention and it is working perfectly
- It sped up inference by 15% and now it is auto enabled in all presets
Auto cropping logic improved, some bugs fixed and made more robust
Automatic prompt format validation system implemented
- When you click Generate button it will check and throw error if not valid
- There is also Validate Prompt Format button now for you to validate and see errors
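As a rough idea of what such a check could look like, here is a sketch that only verifies the `<S>` ... `<E>` speech tags used in the example prompts. The exact rules the app enforces are not listed in this post, so the rules below (tags must be paired, must not nest, and at least one speech segment must exist) are assumptions, and the function name `validate_speech_tags` is hypothetical.

```python
import re


def validate_speech_tags(prompt):
    """Return a list of error messages; an empty list means the prompt passed."""
    errors = []
    open_tag = False
    pairs = 0
    for match in re.finditer(r"<S>|<E>", prompt):
        if match.group() == "<S>":
            if open_tag:
                errors.append("<S> opened again before the previous one was closed with <E>")
            open_tag = True
        else:
            if open_tag:
                pairs += 1
            else:
                errors.append("<E> found without a preceding <S>")
            open_tag = False
    if open_tag:
        errors.append("<S> is never closed with <E>")
    if pairs == 0 and not errors:
        errors.append("no <S>...<E> speech segment found")
    return errors
```

Returning a list of messages instead of raising on the first problem lets an interface show every error at once.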
Entire Gradio app font changed to Tahoma for better readability
New prompting feature: setting duration in prompts
- If you write {2} at the beginning of your prompt, that video generation will be 2 seconds long
- This feature is useful for multi line generation and video extension features
- So the format is {x}; if there is no such prefix, it will use the value set on the duration slider
- Example prompt
  - {4} A man is doing a podcast video. He is saying <S> Hi guys! How are you! Did you know, I am not real? <E> He continues to talk.
  - {2} A pod cast making man talking. He is saying <S> Like for real, I just found out. I was made by Furkan's Ovi app! <E> He then giggles <S> Hi-hi-hi! <E>
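Parsing the {x} prefix can be sketched like this. It is a minimal example under stated assumptions: the function name `parse_duration` is hypothetical, and accepting decimal values such as {2.5} is a guess, since only whole numbers appear in the examples above.

```python
import re

# Matches a leading "{number}" plus any whitespace around it.
_DURATION_PREFIX = re.compile(r"^\s*\{(\d+(?:\.\d+)?)\}\s*")


def parse_duration(prompt, slider_seconds):
    """Return (duration_seconds, prompt_without_prefix)."""
    match = _DURATION_PREFIX.match(prompt)
    if match is None:
        # No {x} prefix: fall back to the duration slider value.
        return slider_seconds, prompt
    return float(match.group(1)), prompt[match.end():]
```

Because the prefix is removed before the text is returned, the duration marker itself is not passed on as part of the prompt.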
You can also write speaking prompts like this
- <S>[strong sound] I am an artificial intelligence android robot.<E>
- <S>[soft whisper] I am an artificial intelligence android robot.<E>
Don't modify our presets directly; save your changes as your own presets instead, since our presets will be overwritten back to the originals when you update
Don't close the running cmd window immediately; first click Cancel and then close cmd

5 October 2025 V4.4

This is a super important update performance-wise
Now Delete T5 After Encoding will be auto enabled if your RAM is under 64 GB, but if you don't have 96 GB RAM I really recommend enabling it
Now when Delete T5 After Encoding is enabled, T5 encoding will start as a sub-process, therefore there will be 0 RAM and VRAM leakage
Now CPU-Only T5 will load about 2x faster
New option Scaled FP8 Base Model added
- This saves about 10 GB VRAM
- If you had generated it with V4.0, delete it from the Ovi_Pro\ckpts\Ovi folder and regenerate
With FP8 Base Model, 24 GB GPUs can generate without any Block Swap
- It uses about 17 GB VRAM at the moment during inference without Block Swap
T5 loading speed increased for all BF16, FP8_Scaled and CPU
FP8 Scaled T5 and Base model will be auto downloaded now
Working on fixing longer generation - reported to be broken
Just run Windows_Install_and_Update.bat to very quickly update to latest version

5 October 2025 V3.8

This is a big update and I am still testing so many new amazing features
- Still in testing so report errors
Fully automatic aspect ratio resolutions and detection based on your entered Base Width and Base Height
For example, set 960x960 and you will get 1280x704 automatically for 16:9
Moreover, now you can generate bigger resolution videos like 1280x704, but remember it will use more VRAM than 960x544, which is what the 720x720 base resolution gives
Longer generations are also available so give them a try
Now it will show uploaded image resolution
Now it will auto crop to the new desired resolution you give immediately and show cropped image resolution too
Batch processing fixed and auto cropping perfected
- It will auto recognize your images in your folder and process them with their closest aspect ratio based on your base resolution
Just run Windows_Install_and_Update.bat to very quickly update to latest version

5 October 2025 V3.4

Ok this is a massive update
We have added full tiled-VAE, same as ComfyUI, and it is working amazingly
Now with Block Swapping + tiled-VAE + T5 Text Encoding on CPU (still super fast) we can generate 121-frame, 5 second videos on GPUs with as low as 6 GB
I have added presets for every GPU out there and the app will automatically detect your GPU and select your preset the first time you install and start
Cancel button was not working properly and is now working perfectly
3:4 and 4:3 aspect ratios added as well
The original repo was forcing all resolutions to be 720x720. I have added a new feature called Force Exact Resolution and with this you can generate at higher resolutions like 1280x704
- It must be divisible by 32
- Auto crop will auto handle this
- All presets have this feature enabled by default but remember VRAM presets are made for 720x720 base resolution, higher resolution uses more VRAM
Much more robust preset system developed to not have any errors when saving or loading older presets
If you are low on both VRAM and RAM try this Delete T5 After Encoding + Scaled FP8 T5 + CPU-Only T5
Inaccurately showing previous generation result on interface fixed
Just run Windows_Install_and_Update.bat to very quickly update to latest version

4 October 2025 V2.9

Delete Text Encoder After Encoding : Now you can enable or disable
Now it will load Text Encoder directly into VRAM, encode and then delete or move according to your selection - therefore it is faster than before
Clear All Memory added and recommended - 0 VRAM and RAM leak
Scaled FP8 T5 - reduces T5 VRAM usage but slower to load - quality same
- It will be auto enabled when your VRAM is below 23 GB and you don't load a preset
Delete T5 After Encoding - enable if low on RAM
Preset save and load fully working now
Consecutive generation error fixed with Clear All Memory
With CPU Offload, we now move VAE to RAM while it is not needed, thus with 29 Block Swap it uses only 6 GB VRAM during inference
Upcoming, hopefully: tiled VAE, FP8_Scaled model loading and video extending - loop
With v2.9 it will save the FP8_Scaled version of T5 and use it the next time you use Scaled FP8 T5 to speed up loading
- The file will be saved inside ckpts\Wan2.2-TI2V-5B
Just run Windows_Install_and_Update.bat to very quickly update to latest version

Full tutorial video coming soon hopefully

Comments

looks like your lora is incompatible. so far we have verified Wan 2.2 5B model loras working. which lora are you trying?

Furkan Gözükara

2025-10-26 16:42:02 +0000 UTC

Hi!, everything is great, the samples work I'm using 3080Ti 12GB . However, I'm having issues with LORA, it will not apply, please assist:
[VIDEO MODEL] Merging 1 LoRA(s)...
WARNING:root:⚠ DETECTED: bfloat16 CPU matmul is catastrophically slow on this system!
WARNING:root: Automatically enabling float32 workaround for LoRA merging
WARNING:root: Recommendation: Downgrade PyTorch to 2.4.x or 2.5.x (current: 2.8.0+cu129)
Merging LoRA layers: 0%| | 0/1035 [00:00CPU transfers for each layer!
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.v.weight: shape '[3072, 3072]' is invalid for input of size 26214400
Merging LoRA layers: 1%|▊ | 15/1035 [00:00<00:07, 139.50it/s]
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.o.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.v.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.o.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.ffn.0.weight: shape '[14336, 3072]' is invalid for input of size 70778880
WARNING:root:Failed to merge LoRA for layer blocks.0.ffn.2.weight: shape '[3072, 14336]' is invalid for input of size 70778880
WARNING:root:Failed to merge LoRA for layer blocks.1.self_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
Merging LoRA layers: 4%|██ | 37/1035 [00:00<00:05, 170.06it/s]
WARNING:root:Failed to merge LoRA for layer blocks.1.self_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400

TokyoIdolsAFK

2025-10-26 11:57:37 +0000 UTC

Hello again. Not VRAM, how much RAM do you have? Did you set 100 GB virtual RAM? Can you set it and let me know after restarting Windows : https://www.windowscentral.com/software-apps/windows-11/how-to-manage-virtual-memory-on-windows-11

Furkan Gözükara

2025-10-25 19:26:12 +0000 UTC

Hi. I have VRAM 24Gb, all parameters as you write. I checked all items and then I reinstalled all. Again, The app calculates/generates, but in the end I always have only this message : "Error during video generation: [WinError 2] The system cannot find the file specified" without any video in the Output. only wav founded in Output folder.
================================================================================
STARTING VAE DECODE - VRAM before: 2.07 GB
================================================================================
VAE DECODE PROGRESS: Decoding video (standard mode)...
VAE DECODE PROGRESS: Standard decode completed
================================================================================
VAE DECODE COMPLETE
VRAM after: 2.81 GB
Peak during decode: 14.33 GB
VRAM used by decode: 12.25 GB
================================================================================
Error during video generation: [WinError 2] The system cannot find the file specified
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found in N:\AI\AI_Video\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 332.54 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No
================================================================================
[MEMORY CLEANUP] Final cleanup completed - all generation memory freed
Please I need help. Thank you for your job and other programs and scripts.

Ant-2014

2025-10-25 18:50:22 +0000 UTC

yes out of RAM. how much RAM you have?

Furkan Gözükara

2025-10-25 18:26:28 +0000 UTC

Hello, I have all python, cuda etc locally, but always get this error , whatever t2v, i2v : Can you help me, please?
VAE DECODE COMPLETE
VRAM after: 2.38 GB
Peak during decode: 7.51 GB
VRAM used by decode: 5.46 GB
================================================================================
Error during video generation: [WinError 2] The system cannot find the file specified
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found in N:\AI\AI_Video\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 188.18 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No
================================================================================
[MEMORY CLEANUP] Final cleanup completed - all generation memory freed

Ant-2014

2025-10-25 17:51:49 +0000 UTC

thanks. next gen will be hopefully even better model and app :)

Furkan Gözükara

2025-10-20 22:09:40 +0000 UTC

So after trying again this isn't for me especially how you can't let the app generate what it wants to say for you instead of you telling it everything. Anyway great job on the app

James Woodill

2025-10-20 21:56:08 +0000 UTC

i am looking as well. i hope there will be speed loras

Furkan Gözükara

2025-10-19 10:51:35 +0000 UTC

Thank you very much for this great work. I've already done some tests and am happy about the possibility of working locally. For the first time, I have video and sound, and it works with German voice. I hope there will soon be an option to achieve significantly reduced generation times with a 4- or 8-step LORA. Are you working on this?

thom mick

2025-10-18 01:04:55 +0000 UTC

which preset you using? changed any settings? it should be instant normally

Furkan Gözükara

2025-10-13 15:18:30 +0000 UTC

If I load a lora, the process requires quite some (big) time: [VIDEO MODEL] Merging 1 LoRA(s)... Merging LoRA layers: 79%|█ It takes about 5 minutes, but I see my runpod (H100 + cpu xeon platinum 8352Y), the cpu is stuck at almost 100%, and gpu not running. Maybe there's space for some optimization on lora loading (maybe now device is set on cpu). Hope this can help!

FalconBravery

2025-10-13 15:13:07 +0000 UTC

out of RAM. how much RAM you have? did you set 100 gb virtual disk?

Furkan Gözükara

2025-10-13 10:35:36 +0000 UTC

I've got this error on first generation. What could it be?
Initial VRAM: 0.00 GB
Removing weight norm...
================================================================================
SCALED FP8 T5: Loading T5 in Scaled FP8 format
Expected VRAM savings: ~50% (~5-6GB saved)
================================================================================
[FP8 CACHE] Found cached FP8 checkpoint: E:\AI\Ovi_Pro_v8\Ovi_Pro\ckpts\Wan2.2-TI2V-5B\models_t5_umt5-xxl-enc-fp8_scaled.safetensors
[FP8 CACHE] Creating structure on CPU first (avoids BF16 VRAM allocation)
[T5 LOAD][FP8] Structure created on CPU in 32.66s (FP8 cached path)
[SUBPROCESS] Generation failed with return code: 3221225477
[GENERATION 1/1] No output file found in E:\AI\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 63.85 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No

Pedro Burle

2025-10-13 01:44:28 +0000 UTC

Wan 2.2 5b loras tested and working. i didnt test others. sadly you cant set ending frame. only beginning frame

Furkan Gözükara

2025-10-12 23:51:20 +0000 UTC

Would it ever be possible to set a starting frame and ending frame with this? Also, what all are the types of LoRA that can be used?

Diggy Dre

2025-10-12 22:38:02 +0000 UTC

i suppose closer shot is better. also it has 2 options you can try and see : Audio Guidance Scale - SLG Layer (Skip Layer Guidance layer - affects audio-video synchronization)

Furkan Gözükara

2025-10-12 19:05:53 +0000 UTC

This is very nice. I can't wait to try the batch feature. I'm trying to use an anime-style character. How can I get the mouth movements to be more accurate (this is for pronunciation teaching)

Taiga

2025-10-12 14:00:30 +0000 UTC

hi we have it in requirements. that means your install failed for some reason. can you run installer again and email me logs? please delete venv before : monstermmorpg@gmail.com

Furkan Gözükara

2025-10-12 12:20:02 +0000 UTC

Hi, I get this error when trying to run the app, I have downloaded everything and run the update. Do you know this error?
[STARTUP] Set MKL/OMP threads to 20 for optimal CPU performance
Traceback (most recent call last):
  File "C:\OVI\Ovi_Pro\premium.py", line 21, in
    from ovi.utils.io_utils import save_video
  File "C:\OVI\Ovi_Pro\ovi\utils\io_utils.py", line 5, in
    from moviepy.editor import ImageSequenceClip, AudioFileClip
ModuleNotFoundError: No module named 'moviepy'
Press any key to continue . . .

Dan

2025-10-12 11:51:46 +0000 UTC

nah they are official sources no issues. possibly safe tensor can be used if there is accurate version

Furkan Gözükara

2025-10-11 19:46:18 +0000 UTC

Would you be able to make it use all .safetensors files instead of it having some .pt and .pth files (that I think are riskier because those could contain pickled content in theory)?

cool1

2025-10-11 17:26:28 +0000 UTC

thanks a lot for comment. i believe what you want and what i also want will become available soon. currently vibevoice can generate other languages. hopefully i will publish an app for it soon

Furkan Gözükara

2025-10-11 15:03:23 +0000 UTC

Thank you always! If I may express a personal wish, it would be great if Korean and Japanese voice options were also available. And if there were features like generating videos that lip-sync to the input voice, or generating voices that match sample voices (e.g., GPT-SoVITS), that would be truly amazing. I believe that someday, something even better encompassing these features will emerge. :) Anyway, thank you so much!!

Mimic

2025-10-11 14:25:38 +0000 UTC

this app can't do it but multi talk is doing that : https://youtu.be/8cMIwS9qo4M

Furkan Gözükara

2025-10-11 08:44:53 +0000 UTC

I have an interesting question. Suppose you had some pre-recorded audio that you wanted this thing to animate to, would that ever be a feature? Almost like giving it something to lip-sync to?

Diggy Dre

2025-10-11 07:03:06 +0000 UTC

i don't know of any flags sadly. but --share will start on gradio live so you can use it from anywhere

Furkan Gözükara

2025-10-10 23:36:14 +0000 UTC

that is gradio error. not important. what else do you see after this?

Furkan Gözükara

2025-10-09 22:36:35 +0000 UTC

Hello there. I have just tried one of the examples and this is the error im getting:
traceback (most recent call last):
  File "C:\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Python310\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
    self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054]

rSandor

2025-10-09 22:34:19 +0000 UTC

Hey Furkan, thanks for fixing the looping bug. I am able to get video output now. Is there a flag to run the Gradio server on the local network?

Justin

2025-10-09 15:26:19 +0000 UTC

Did you try v8 zip file? it only has txt and bat files you can see what is inside. so please allow and download it

Furkan Gözükara

2025-10-09 14:06:02 +0000 UTC

its getting flagged as a virus when downloading the installer

Abu Bhakar

2025-10-09 13:16:16 +0000 UTC

because it is wrong. it is not s s. we even added check :D it is <s> </s>

Furkan Gözükara

2025-10-09 00:42:07 +0000 UTC

I am now able to get video, which is pretty cool, but I can't get the voice part to actually read through the script. It randomly bounces back and forth between previous words within the S and E speech brackets.

Diggy Dre

2025-10-08 23:34:48 +0000 UTC

hi are you on 7.2? can you copy cmd logs starting from first line copy into txt file and email me? monstermmorpg@gmail.com

Furkan Gözükara

2025-10-08 17:43:07 +0000 UTC

Juanmyth

2025-10-08 17:41:53 +0000 UTC

Windows error. You can see every file content with notepad. You have to allow it to download.

Furkan Gözükara

2025-10-08 10:08:22 +0000 UTC

Yes I think so

Furkan Gözükara

2025-10-08 10:08:03 +0000 UTC

What languages does this version support? Only English?

Hoàng Giang Sơn Trương

2025-10-08 02:41:48 +0000 UTC

The latest version does not allow downloading, the Windows system marks it as a virus. I was able to download it without any problem. Thank you very much.

carlos chavez

2025-10-08 02:31:06 +0000 UTC

Well you can mute sound. I added checkbox to generate without speech tags option. Try that too . But will check if no audio option available or not

Furkan Gözükara

2025-10-07 20:03:44 +0000 UTC

Hi Furkan. Can I generate videos without speech or background sounds, or with one or the other?

michele carlone

2025-10-07 19:26:27 +0000 UTC

Update latest version, delete outputs and send me entire cmd log as email : monstermmorpg@gmail.com You can use T5 cpu almost same speed

Furkan Gözükara

2025-10-07 17:26:46 +0000 UTC

I wasn't able to get this to work on my 4080. Also, for some reason, it kept trying to offload the text encoder to my 9950x.

Diggy Dre

2025-10-07 16:54:32 +0000 UTC

clear your output folder. i need to make a fix for this. after doing that restart and let me know. also try v 6.2

Furkan Gözükara

2025-10-07 16:04:52 +0000 UTC

Greetings. I did a clean install from v3 (working) to v5. Now my generations are consistently stuck in a loop back into the T5 encoding process. Only way to stop the process is closing Python. Thanks in advance

Oliver

2025-10-07 15:33:18 +0000 UTC

you can edit all of the files with notepad and see content. 100% false positive. also virustotal has 0 : https://www.virustotal.com/gui/file/9b4b81a000308cc5ce9d01a138cbd0737820331b6ed24799521637dea3b5336e

Furkan Gözükara

2025-10-07 10:24:07 +0000 UTC

you can change seed. currently it is 99 unless you enable randomize seed

Furkan Gözükara

2025-10-07 10:22:54 +0000 UTC

In the samples on the project page https://aaxwaz.github.io/Ovi/ the voices vary, mine sound all the same. Is there a way to vary the voice when using an image as a starting point for the video? A prompt trigger word or something

Neil Rhodes

2025-10-07 08:07:07 +0000 UTC

Tags are not allowed in the comments lol, so I will use * instead of the tag symbols :p In the "how to use" window, you said: check the *S*...*/S* tag format, add *AUDCAP*...*/ENDAUDCAP* descriptions, etc. But the examples and the model page use the tag formats *S* *E* (for speech) and *AUDCAP* *ENDAUDCAP* (for audio description). Which ones are the correct ones?

thecatzman

2025-10-07 05:25:24 +0000 UTC
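The thread does not settle which tag format the app expects. As a minimal sketch only, assuming the format the commenter reports seeing in the examples and on the model page (speech between S and E tags, audio description between AUDCAP and ENDAUDCAP tags), a prompt could be assembled like this. `build_prompt` and its parameters are hypothetical names for illustration and are not part of the app.

```python
def build_prompt(scene, speech_lines=(), audio_caption=None):
    """Join a scene description with tagged speech and an audio description.

    Assumed tag format (as reported by the commenter, not confirmed in the thread):
    <S>spoken text<E> for speech, <AUDCAP>description<ENDAUDCAP> for audio.
    """
    parts = [scene.strip()]
    for line in speech_lines:
        parts.append(f"<S>{line.strip()}<E>")
    if audio_caption:
        parts.append(f"<AUDCAP>{audio_caption.strip()}<ENDAUDCAP>")
    return " ".join(parts)


prompt = build_prompt(
    "A man waves at the camera.",
    speech_lines=["Hello there."],
    audio_caption="Birds chirping in the background.",
)
print(prompt)
```

If the app's "how to use" window turns out to be right about the closing-tag style instead, only the two f-strings would need to change.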

In the "how to use" window, you said: check the <S>...</S> tag format, add <AUDCAP>...</ENDAUDCAP> descriptions, etc. But the examples and the model page use the tag formats <S> <E> (for speech) and <AUDCAP> <ENDAUDCAP> (for audio description). Which ones are the correct ones?

thecatzman

2025-10-07 05:20:26 +0000 UTC

New version is lighting up my antivirus "Threat found - action needed. 10/6/2025 9:38 PM Severe Detected: Trojan:Script/Wacatac.C!ml Status: Active Active threats have not been remediated and are running on your device. Date: 10/6/2025 9:38 PM Details: This program is dangerous and executes commands from an attacker. Affected items: file: C:\Users\ Downloads\Ovi_Pro_v5.zip Learn more"

leem0nchu

2025-10-07 01:42:45 +0000 UTC

I have dual GPUs too. REM means the line is commented out. So make a copy of the bat file and add this line before the call line: SET CUDA_VISIBLE_DEVICES=1. Message me on Discord and I will send you the bat file.

Furkan Gözükara

2025-10-06 22:42:45 +0000 UTC
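The same idea can be sketched in Python for anyone launching the app from a script rather than a bat file. This is a minimal illustration, not the app's own launcher: `env_for_gpu` and `is_commented_out` are hypothetical helper names, and the only thing taken from the thread is the `CUDA_VISIBLE_DEVICES` variable and the fact that a line starting with REM is ignored.

```python
import os


def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one CUDA device."""
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env


def is_commented_out(bat_line):
    """True if a batch-file line starts with REM, so cmd.exe skips it."""
    tokens = bat_line.strip().split(maxsplit=1)
    return bool(tokens) and tokens[0].upper() == "REM"


# A line left as "REM SET ..." has no effect, whatever number follows it.
print(is_commented_out("REM SET CUDA_VISIBLE_DEVICES=0"))  # True
print(is_commented_out("SET CUDA_VISIBLE_DEVICES=1"))      # False

# The returned dict can be passed as env= to subprocess.run when starting the app.
print(env_for_gpu(1, base_env={})["CUDA_VISIBLE_DEVICES"])  # 1
```

The variable has to be set before the process initializes CUDA, which is why it goes before the call line in the bat file. Which index corresponds to which card is something to verify on the machine itself, since device ordering can differ between tools.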

Hi, great work. Everything worked and I managed to create a video just fine. I have two GPUs, a 4060 8 GB and a 5060 Ti 16 GB, yet it ignores the 16 GB card even though I changed the REM SET CUDA_VISIBLE_DEVICES=0 line, as it was =1 (or set CUDA_VISIBLE_DEVICES=0). The 8 GB card is plugged into the monitor, so I use the 5060 in OVI/ComfyUI etc. for rendering. What can I do to fix this?

Daniel Smith

2025-10-06 21:45:39 +0000 UTC

Try v4.5 and email me the CMD logs: monstermmorpg@gmail.com

Furkan Gözükara

2025-10-06 09:12:05 +0000 UTC

I’ve updated to version 3, but v2 worked better. It did find my card, but it is not building. It gets stuck.

James Charleston II

2025-10-06 02:18:04 +0000 UTC

What is 1tgb? Are you on v4.5? How much RAM do you have, and which GPU?

Furkan Gözükara

2025-10-05 20:52:28 +0000 UTC

Hello, I got everything installed and running, but running this even with the 1tgb preset it slows down my computer a lot and a few times it slowed it down so much I had to hold in the button to turn it off. I'm not sure what to really do about it.

James Woodill

2025-10-05 20:48:28 +0000 UTC

Hi, this is an install error. Please make sure Python 3.10.11 is installed and follow this tutorial: https://youtu.be/DrhUHnYfwC0

Furkan Gözükara

2025-10-05 17:01:03 +0000 UTC

I ran the installer OK on Windows 64, but when trying to run I get the error:

The system cannot find the path specified.
Traceback (most recent call last):
  File "C:\Ovi_Pro_v3\Ovi_Pro\premium.py", line 1, in <module>
    import gradio as gr
ModuleNotFoundError: No module named 'gradio'

Alexandre Rangel

2025-10-05 16:53:52 +0000 UTC

thanks gonna check now

Furkan Gözükara

2025-10-05 15:30:45 +0000 UTC

Thank you for this awesome release. I'm currently testing on an H100 with 60 steps (3 min generation per video). I may have found a bug: no matter which video length I set, the videos are always 5 seconds long, even if I set 10 seconds or any other number.

FalconBravery

2025-10-05 15:27:34 +0000 UTC

I think you need to change the prompt. Did you test the example prompts in the example tab? They are perfectly animated with the default presets.

Furkan Gözükara

2025-10-05 10:30:24 +0000 UTC

Most of the time I get a static video with sound. Which parameter do I have to change to avoid this?

Fco Muñoz

2025-10-05 10:16:07 +0000 UTC

I just updated the app to v3.3 for you and made the preset more robust. Please email me the logs of the Windows_Install_and_Update.bat file (monstermmorpg@gmail.com) and try again.

Furkan Gözükara

2025-10-05 09:14:12 +0000 UTC

How much RAM and VRAM do you have, and which preset are you using?

Furkan Gözükara

2025-10-05 08:40:16 +0000 UTC

This one is easy to solve. Set your virtual memory (paging file) to 100 GB: https://www.windowscentral.com/software-apps/windows-11/how-to-manage-virtual-memory-on-windows-11 and try again. I will try to add FP8 scaled loading today, which will reduce the RAM needed.

Furkan Gözükara

2025-10-05 08:39:25 +0000 UTC

Updated to v3, RTX 4060 8 GB GPU and 64 GB RAM, got this error:

[OK] OviFusionEngine initialized successfully (models will load on first generation)
[GENERATION 1/1] Starting with seed: 99
================================================================================
STEP 1/2: Loading T5 text encoder FIRST to minimize RAM usage
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 29 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Removing weight norm...
================================================================================
T5 CPU-ONLY MODE: Loading T5 on CPU for CPU inference
This saves VRAM but text encoding will be slower
================================================================================
[T5 CPU-ONLY MODE] Loading T5 text encoder on CPU for CPU inference
[T5 CPU-ONLY MODE] This saves VRAM but encoding will be slower
[T5 LOAD][BF16] Encoder structure created in 33.61s
[T5 LOAD][BF16] Weights loaded in 6.59s (total 40.20s)
[T5 CPU-ONLY MODE] T5 encoder ready on CPU
================================================================================
T5 loaded. Fusion model will load AFTER text encoding.
================================================================================
STEP 2/2: Encoding text and optionally deleting T5 before loading fusion model
================================================================================
Encoding text prompts...
Text embeddings encoded on CPU and moved to GPU
Keeping T5 on CPU (already in CPU-only mode)
================================================================================
STEP 3/3: Loading fusion model (T5 already deleted if enabled)
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 29 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Step 1/6: Creating model structure on meta device...
Score model (Fusion) all parameters:11660753108
Step 2/6: Loading checkpoint weights to CPU...
Error during video generation: The paging file is too small for this operation to complete. (os error 1455)
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found

2025-10-05 06:51:02 +0000 UTC
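A rough back-of-the-envelope calculation may help explain why "Loading checkpoint weights to CPU" can exhaust the paging file. The parameter count below is taken from the log above. The bytes-per-parameter values are assumptions: 2 bytes for a BF16 checkpoint (the log only confirms BF16 for T5) and 1 byte for an 8-bit format such as FP8, ignoring any scale tensors. `weights_gib` is a hypothetical helper name.

```python
GIB = 1024 ** 3

# "Score model (Fusion) all parameters" as printed in the log above.
PARAM_COUNT = 11_660_753_108


def weights_gib(param_count, bytes_per_param):
    """Approximate size of the raw weights in GiB for a given storage width."""
    return param_count * bytes_per_param / GIB


bf16_gib = weights_gib(PARAM_COUNT, 2)  # assumption: checkpoint stored as BF16
fp8_gib = weights_gib(PARAM_COUNT, 1)   # assumption: 8-bit weights, scales ignored

print(f"BF16 weights:  {bf16_gib:.1f} GiB")
print(f"8-bit weights: {fp8_gib:.1f} GiB")
```

Under these assumptions the fusion model alone is on the order of 20 GiB in BF16, on top of the T5 encoder kept on the CPU and whatever else the system has committed, which is consistent with the advice to enlarge the paging file and with the plan to add FP8 scaled loading.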

Thanks for that! A few things to mention (I used the update batch file to update to 3.0, but I'll try deleting venv and running it again just to make sure the update really happened):
- Cancelling doesn't work.
- Saving a new preset also doesn't work:

Traceback (most recent call last):
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\route_utils.py", line 354, in call_process_api
    output = await app.get_blocks().process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 2112, in process_api
    inputs = await self.preprocess_data(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 1774, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\components\dropdown.py", line 206, in preprocess
    raise Error(
gradio.exceptions.Error: "Value: True is not in the list of choices: ['BlahBlah']"
Traceback (most recent call last):
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\route_utils.py", line 354, in call_process_api
    output = await app.get_blocks().process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 2112, in process_api
    inputs = await self.preprocess_data(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 1774, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\components\dropdown.py", line 206, in preprocess
    raise Error(
gradio.exceptions.Error: "Value: True is not in the list of choices: ['BlahBlah']"

Richard Nagy

2025-10-05 06:09:16 +0000 UTC

I'm getting this error with v3.1:

Initial VRAM: 0.02 GB
Step 1/6: Creating model structure on meta device...
Score model (Fusion) all parameters:11660753108
Step 2/6: Loading checkpoint weights to CPU...
[SUBPROCESS] Generation failed with return code: 3221225477
[GENERATION 1/1] No output file found

Agino Terra

2025-10-05 04:18:46 +0000 UTC

Let me check whether it is broken.

Furkan Gözükara

2025-10-05 00:05:14 +0000 UTC

It won’t run in text to video mode for me. I have to start with an image or nothing happens.

James Charleston II

2025-10-05 00:03:05 +0000 UTC

I don't know, sadly. I don't have an AMD card to test with.

Furkan Gözükara

2025-10-04 22:09:21 +0000 UTC

I don't know about either of them. I am working on improving the VAE for lower-VRAM GPUs, but I will check them later.

Furkan Gözükara

2025-10-04 22:09:10 +0000 UTC

Hi Furkan. Thanks for this fantastic work. I tried generating videos in Italian, but the audio is horrible. Is it possible to improve this language? Can videos be generated that are at least 10 seconds long?

michele carlone

2025-10-04 22:02:20 +0000 UTC

Does it run on AMD GPUs?

Alexander Hempel

2025-10-04 21:18:26 +0000 UTC

Yes, it works perfectly. You can use the default config.

Furkan Gözükara

2025-10-04 21:07:53 +0000 UTC

Will this work with 4090?

guni

2025-10-04 19:45:04 +0000 UTC

The 5090 is really fast, but I don't know about the 3080 Ti. Give it a try. I am also making improvements. You can also generate in 20 steps to speed things up.

Furkan Gözükara

2025-10-04 18:51:56 +0000 UTC

What is the generation time? For example, on an RTX 3080 Ti with 16 GB VRAM and 32 GB RAM?

ranjeet

2025-10-04 17:06:51 +0000 UTC

You are welcome, thanks for the comment.

Furkan Gözükara

2025-10-04 14:35:02 +0000 UTC

Amazing work, really. Thank you so much :)

Damjan Žakelj

2025-10-04 13:50:12 +0000 UTC

Update to the latest version and try block swap 12.

Furkan Gözükara

2025-10-04 13:11:22 +0000 UTC

[GENERATE_VIDEO] Called with clear_all=False, num_generations=1
================================================================================
INITIALIZING OVI FUSION ENGINE IN MAIN PROCESS
Block Swap: 8 blocks (0 = disabled)
CPU Offload: True
Image Generation: False
No Block Prep: False
Note: Models will be loaded in main process (Clear All Memory disabled)
================================================================================
[OK] OviFusionEngine initialized successfully (models will load on first generation)
[GENERATION 1/1] Starting with seed: 99
================================================================================
STEP 1/2: Loading T5 text encoder FIRST to minimize RAM usage
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 8 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Removing weight norm...
Loading T5 text encoder directly to GPU (BEFORE fusion model to save RAM)...
Error during video generation: CUDA out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total capacity of 12.00 GiB of which 0 bytes is free. Of the allocated memory 18.23 GiB is allocated by PyTorch, and 361.40 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] Failed in subprocess

REGINALDO BARBOSA

2025-10-04 12:58:41 +0000 UTC
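The error message above suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. As a small sketch of how that could be done from a launcher script, the helper below merges one option into the variable without discarding options that are already set. `with_alloc_option` is a hypothetical name, and whether the setting actually helps in this case is not established anywhere in the thread. Block swapping, as suggested in the reply, targets the underlying shortage more directly.

```python
def with_alloc_option(env, key, value):
    """Return a copy of env with key:value merged into PYTORCH_CUDA_ALLOC_CONF.

    The variable holds comma-separated option:value pairs, so existing
    options are parsed, the new one is added or overwritten, and the
    string is rebuilt.
    """
    env = dict(env)
    options = {}
    for item in env.get("PYTORCH_CUDA_ALLOC_CONF", "").split(","):
        if ":" in item:
            name, val = item.split(":", 1)
            options[name.strip()] = val.strip()
    options[key] = value
    env["PYTORCH_CUDA_ALLOC_CONF"] = ",".join(f"{k}:{v}" for k, v in options.items())
    return env


new_env = with_alloc_option({}, "expandable_segments", "True")
print(new_env["PYTORCH_CUDA_ALLOC_CONF"])
```

The resulting dict would be passed as the environment of the process that imports PyTorch, since the allocator reads the variable when it initializes rather than on every allocation.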

Fixed in v2.5. Just run the installer again and make sure Clear All Memory is enabled.

Furkan Gözükara

2025-10-04 12:15:28 +0000 UTC

Fixed in v2.5. Just run the installer again and make sure Clear All Memory is enabled.

Furkan Gözükara

2025-10-04 12:14:57 +0000 UTC

Hi Furkan, thanks a lot for your great work on Ovi Pro Fusion! I found a bug on my RTX 4090: with block swap = 12 and CPU offload = true, the first generation works perfectly, but the second generation with the same settings always fails. Error message: RuntimeError: Input type (CUDABFloat16Type) and weight type (CPUBFloat16Type) should be the same. It looks like the patch_embedding layer stays on the CPU after the first run, while the inputs are already on CUDA. Without block swap/offload it doesn’t run well on 24 GB, so this is important. Restarting the app always fixes it for one run. Maybe the block swap step needs to re-sync that layer on every run. Thanks again for this amazing project!

macmotu

2025-10-04 11:09:53 +0000 UTC

Thanks for your amazing work, dear Furkan. Unfortunately, it did not work for me, I will wait for the next version. Here is the error I got, in case you're interested:

"INFERENCE STARTING - VRAM: 16.58 GB
Block swap active: 12/30 blocks on CPU
================================================================================
2it [08:37, 258.78s/it]
ERROR:root:Traceback (most recent call last):
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\ovi_fusion_engine.py", line 455, in generate
    pred_vid_pos, pred_audio_pos = self.model(
  File "E:\AI\Ovi_Pro\Ovi_Pro\venv\lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\venv\lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\fusion.py", line 310, in forward
    vid, audio = gradient_checkpointing(
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\model.py", line 22, in gradient_checkpointing
    return module(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\fusion.py", line 223, in single_fusion_block_forward
    assert not torch.equal(og_audio, audio), "Audio should be changed after cross-attention!"
torch.AcceleratorError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Error during video generation: cannot unpack non-iterable NoneType object"

JP LONB

2025-10-04 09:51:42 +0000 UTC

8 GB of memory is too low. So far I have gotten as low as 8.2 GB. I am trying to add FP8_Scaled right now; let's see if that can fix your issue.

Furkan Gözükara

2025-10-04 08:53:26 +0000 UTC

RTX 4060 with 8 GB VRAM and 64 GB RAM is not working. It seems like a memory issue.

2025-10-04 08:51:06 +0000 UTC

Thanks a lot. I am working to add more features today, hopefully.

Furkan Gözükara

2025-10-04 08:28:34 +0000 UTC

You are amazing Dr! thank you once again!!

Hipno

2025-10-04 03:05:33 +0000 UTC

Looks like it's not going to work for me. Started fresh. 512x512. Max block swapping. Ran out of memory.

DanO..

2025-10-04 03:03:10 +0000 UTC

Just gave it a try. Ran out of memory with an RTX 3060 12 GB VRAM. I lowered the resolution and maxed out the swapped blocks, but it wouldn't generate. It appears to leave system memory (40 GB) and VRAM maxed out after a failed attempt. It needs to clean up after a failure, or some mechanism should be added to clear out memory. Restarting to try with lower resolution and maxed block swapping.

DanO..

2025-10-04 03:02:16 +0000 UTC
