SamuZai
Furkan Gözükara

Ovi SECourses Premium App to Generate 121-Frame Videos with Audio from Text and Images - Supports All GPUs Including RTX 5000 Series - Has Flash Attention, Batch Processing and Block Swapping - As Low as 8 GB VRAM - Like VEO 3 and SORA 2 - 1-Click Install on Windows, RunPod and Massed Compute

Patreon exclusive posts index to find our scripts easily, Patreon scripts update history to see which updates arrived for which scripts, and the Patreon special generative scripts list that you can use in any of your tasks.

Join our Discord to get help, chat and discuss, and tell me your Discord username to get your special rank: SECourses Discord

Please also Star, Watch and Fork our Stable Diffusion & Generative AI GitHub repository, join our Reddit subreddit, and follow me on LinkedIn (my real profile).

=======

Latest zip file : Ovi_Pro_v8.zip

Full scale ultra advanced app for Ovi - an open source project that can generate videos from both text prompts and image + text prompts with real audio.

When Clear All Memory is selected (the default in the 32 GB and below presets), make sure to click the Cancel button first and then close the CMD window, or it will continue working as a subprocess.

15 October 2025 V8.3

9 October 2025 V8.1

Windows Requirements

Massed Compute (Recommended Cloud):

RunPod (Cloud):

9 October 2025 V7.6

8 October 2025 V6.4

6 October 2025 V6.3

6 October 2025 V5.9

5 October 2025 V4.4

5 October 2025 V3.8

5 October 2025 V3.4

4 October 2025 V2.9

Full tutorial video coming soon hopefully

Comments

Looks like your LoRA is incompatible. So far we have verified that Wan 2.2 5B model LoRAs work. Which LoRA are you trying?

Furkan Gözükara

Hi! Everything is great and the samples work. I'm using a 3080 Ti 12 GB. However, I'm having issues with LoRA: it will not apply. Please assist:
[VIDEO MODEL] Merging 1 LoRA(s)...
WARNING:root:⚠ DETECTED: bfloat16 CPU matmul is catastrophically slow on this system!
WARNING:root: Automatically enabling float32 workaround for LoRA merging
WARNING:root: Recommendation: Downgrade PyTorch to 2.4.x or 2.5.x (current: 2.8.0+cu129)
Merging LoRA layers: 0%| | 0/1035 [00:00CPU transfers for each layer!
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.v.weight: shape '[3072, 3072]' is invalid for input of size 26214400
Merging LoRA layers: 1%|▊ | 15/1035 [00:00<00:07, 139.50it/s]
WARNING:root:Failed to merge LoRA for layer blocks.0.self_attn.o.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.v.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.cross_attn.o.weight: shape '[3072, 3072]' is invalid for input of size 26214400
WARNING:root:Failed to merge LoRA for layer blocks.0.ffn.0.weight: shape '[14336, 3072]' is invalid for input of size 70778880
WARNING:root:Failed to merge LoRA for layer blocks.0.ffn.2.weight: shape '[3072, 14336]' is invalid for input of size 70778880
WARNING:root:Failed to merge LoRA for layer blocks.1.self_attn.q.weight: shape '[3072, 3072]' is invalid for input of size 26214400
Merging LoRA layers: 4%|██ | 37/1035 [00:00<00:05, 170.06it/s]
WARNING:root:Failed to merge LoRA for layer blocks.1.self_attn.k.weight: shape '[3072, 3072]' is invalid for input of size 26214400

TokyoIdolsAFK
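The shape errors in the log above can be checked with simple arithmetic. The sketch below is not part of the app; it only uses the numbers printed in the log and suggests, as an inference rather than a confirmed diagnosis, that the LoRA was built for a model with hidden size 5120 instead of the 3072 that the merge expects. The helper names `infer_square_dim` and `fits` are made up for illustration.

```python
import math

def infer_square_dim(numel):
    """Return n if numel == n * n, otherwise None."""
    root = math.isqrt(numel)
    return root if root * root == numel else None

def fits(target_shape, numel):
    """True if a tensor with `numel` elements can be reshaped to `target_shape`."""
    return math.prod(target_shape) == numel

# Values copied from the warnings in the log.
attn_target, attn_numel = (3072, 3072), 26214400
ffn_target, ffn_numel = (14336, 3072), 70778880

print(fits(attn_target, attn_numel))            # False
hidden = infer_square_dim(attn_numel)
print(hidden)                                   # 5120
print(ffn_numel // hidden, ffn_numel % hidden)  # 13824 0
```

If that reading is right, every layer will fail the same way, which matches the reply that only Wan 2.2 5B LoRAs have been verified.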

Hello again. Not VRAM; how much RAM do you have? Did you set 100 GB of virtual memory? Can you set it and let me know after restarting Windows: https://www.windowscentral.com/software-apps/windows-11/how-to-manage-virtual-memory-on-windows-11

Furkan Gözükara

Hi. I have 24 GB of VRAM and all parameters set as you wrote. I checked all items and then reinstalled everything. Again, the app calculates/generates, but in the end I always get only this message: "Error during video generation: [WinError 2] The system cannot find the file specified", without any video in the output. Only a WAV file is found in the output folder.
================================================================================
STARTING VAE DECODE - VRAM before: 2.07 GB
================================================================================
VAE DECODE PROGRESS: Decoding video (standard mode)...
VAE DECODE PROGRESS: Standard decode completed
================================================================================
VAE DECODE COMPLETE
VRAM after: 2.81 GB
Peak during decode: 14.33 GB
VRAM used by decode: 12.25 GB
================================================================================
Error during video generation: [WinError 2] The system cannot find the file specified
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found in N:\AI\AI_Video\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 332.54 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No
================================================================================
[MEMORY CLEANUP] Final cleanup completed - all generation memory freed
Please, I need help. Thank you for your work and your other programs and scripts.

Ant-2014

Yes, out of RAM. How much RAM do you have?

Furkan Gözükara

Hello, I have Python, CUDA, etc. all installed locally, but I always get this error, whether with t2v or i2v. Can you help me, please?
VAE DECODE COMPLETE
VRAM after: 2.38 GB
Peak during decode: 7.51 GB
VRAM used by decode: 5.46 GB
================================================================================
Error during video generation: [WinError 2] The system cannot find the file specified
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found in N:\AI\AI_Video\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 188.18 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No
================================================================================
[MEMORY CLEANUP] Final cleanup completed - all generation memory freed

Ant-2014

Thanks. The next generation will hopefully be an even better model and app :)

Furkan Gözükara

So after trying again, this isn't for me, especially since you can't let the app generate what it wants to say for you instead of you telling it everything. Anyway, great job on the app.

James Woodill

I am looking as well. I hope there will be speed LoRAs.

Furkan Gözükara

Thank you very much for this great work. I've already done some tests and am happy about the possibility of working locally. For the first time, I have video and sound, and it works with German voice. I hope there will soon be an option to achieve significantly reduced generation times with a 4- or 8-step LORA. Are you working on this?

thom mick

Which preset are you using? Did you change any settings? It should normally be instant.

Furkan Gözükara

If I load a LoRA, the process takes quite a long time: [VIDEO MODEL] Merging 1 LoRA(s)... Merging LoRA layers: 79%|█ It takes about 5 minutes, and on my RunPod instance (H100 + Xeon Platinum 8352Y CPU) the CPU is stuck at almost 100% while the GPU is not running. Maybe there's room for some optimization in LoRA loading (maybe the device is currently set to CPU). Hope this helps!

FalconBravery

Out of RAM. How much RAM do you have? Did you set 100 GB of virtual memory?

Furkan Gözükara

I got this error on the first generation. What could it be?
Initial VRAM: 0.00 GB
Removing weight norm...
================================================================================
SCALED FP8 T5: Loading T5 in Scaled FP8 format
Expected VRAM savings: ~50% (~5-6GB saved)
================================================================================
[FP8 CACHE] Found cached FP8 checkpoint: E:\AI\Ovi_Pro_v8\Ovi_Pro\ckpts\Wan2.2-TI2V-5B\models_t5_umt5-xxl-enc-fp8_scaled.safetensors
[FP8 CACHE] Creating structure on CPU first (avoids BF16 VRAM allocation)
[T5 LOAD][FP8] Structure created on CPU in 32.66s (FP8 cached path)
[SUBPROCESS] Generation failed with return code: 3221225477
[GENERATION 1/1] No output file found in E:\AI\Ovi_Pro_v8\Ovi_Pro\outputs after retries
Total generation time: 63.85 seconds
================================================================================
VIDEO GENERATION COMPLETED
Final output path: None
File exists: No

Pedro Burle
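The return code 3221225477 in the log above is easier to read in hexadecimal. A minimal sketch, assuming the number is a Windows NTSTATUS value reported as the subprocess exit code (the helper name and the small lookup table are illustrative, not part of the app):

```python
# A few well-known NTSTATUS values; the table is deliberately incomplete.
KNOWN_NTSTATUS = {
    0xC0000005: "STATUS_ACCESS_VIOLATION",
    0xC00000FD: "STATUS_STACK_OVERFLOW",
    0xC000013A: "STATUS_CONTROL_C_EXIT",
}

def describe_exit_code(return_code):
    """Format a process exit code as unsigned 32-bit hex plus a name if known."""
    unsigned = return_code & 0xFFFFFFFF  # also handles the signed form
    name = KNOWN_NTSTATUS.get(unsigned, "unknown")
    return f"0x{unsigned:08X} ({name})"

print(describe_exit_code(3221225477))   # 0xC0000005 (STATUS_ACCESS_VIOLATION)
print(describe_exit_code(-1073741819))  # 0xC0000005 (STATUS_ACCESS_VIOLATION)
```

The code only says that the subprocess crashed on an invalid memory access. It does not name the cause, so the "out of RAM" diagnosis in the reply is the author's reading of the situation rather than something the number proves.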

Wan 2.2 5B LoRAs are tested and working. I didn't test others. Sadly you can't set an ending frame, only a beginning frame.

Furkan Gözükara

Would it ever be possible to set a starting frame and ending frame with this? Also, what all are the types of LoRA that can be used?

Diggy Dre

I suppose a closer shot is better. Also, there are 2 options you can try: Audio Guidance Scale and SLG Layer (Skip Layer Guidance layer - affects audio-video synchronization).

Furkan Gözükara

This is very nice. I can't wait to try the batch feature. I'm trying to use an anime-style character. How can I get the mouth movements to be more accurate? (This is for pronunciation teaching.)

Taiga

Hi, we have it in the requirements. That means your install failed for some reason. Can you run the installer again and email me the logs? Please delete the venv first: monstermmorpg@gmail.com

Furkan Gözükara

Hi, I get this error when trying to run the app. I have downloaded everything and run the update. Do you know this error?
[STARTUP] Set MKL/OMP threads to 20 for optimal CPU performance
Traceback (most recent call last):
  File "C:\OVI\Ovi_Pro\premium.py", line 21, in <module>
    from ovi.utils.io_utils import save_video
  File "C:\OVI\Ovi_Pro\ovi\utils\io_utils.py", line 5, in <module>
    from moviepy.editor import ImageSequenceClip, AudioFileClip
ModuleNotFoundError: No module named 'moviepy'
Press any key to continue . . .

Dan
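A `ModuleNotFoundError` like the one above usually means the virtual environment is missing a package. A small sketch for checking this before launching, using only the standard library; the module names are taken from the tracebacks in this thread, and the helper name `missing_modules` is made up for illustration:

```python
import importlib.util

def missing_modules(names):
    """Return the top-level module names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]

if __name__ == "__main__":
    # Run this with the Python executable inside the app's venv,
    # otherwise it checks the wrong environment.
    missing = missing_modules(["gradio", "moviepy"])
    print("Missing modules:", missing if missing else "none")
```

If anything is reported missing, the advice in the reply applies: delete the venv and run the installer again.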

Nah, they are from official sources, no issues. Possibly safetensors can be used if there is an accurate version.

Furkan Gözükara

Would you be able to make it use all .safetensors files instead of it having some .pt and .pth files (that I think are riskier because those could contain pickled content in theory)?

cool1

Thanks a lot for the comment. I believe what you want, and what I also want, will become available soon. Currently VibeVoice can generate other languages. Hopefully I will publish an app for it soon.

Furkan Gözükara

Thank you always! If I may express a personal wish, it would be great if Korean and Japanese voice options were also available. And if there were features like generating videos that lip-sync to the input voice, or generating voices that match sample voices (e.g., GPT-SoVITS), that would be truly amazing. I believe that someday, something even better encompassing these features will emerge. :) Anyway, thank you so much!!

Mimic

This app can't do it, but MultiTalk does: https://youtu.be/8cMIwS9qo4M

Furkan Gözükara

I have an interesting question. Suppose you had some pre-recorded audio that you wanted this thing to animate to, would that ever be a feature? Almost like giving it something to lip-sync to?

Diggy Dre

I don't know of any flags, sadly. But --share will start it on Gradio live so you can use it from anywhere.

Furkan Gözükara

That is a Gradio error. It is not important. What else do you see after this?

Furkan Gözükara

Hello there. I have just tried one of the examples and this is the error I'm getting:
Traceback (most recent call last):
  File "C:\Python310\lib\asyncio\events.py", line 80, in _run
    self._context.run(self._callback, *self._args)
  File "C:\Python310\lib\asyncio\proactor_events.py", line 165, in _call_connection_lost
    self._sock.shutdown(socket.SHUT_RDWR)
ConnectionResetError: [WinError 10054]

rSandor

Hey Furkan, thanks for fixing the looping bug. I am able to get video output now. Is there a flag to run the Gradio server on the local network?

Justin

Did you try the v8 zip file? It only has txt and bat files, and you can see what is inside. So please allow it and download it.

Furkan Gözükara

It's getting flagged as a virus when downloading the installer.

Abu Bhakar

because it is wrong. it is not s s. we even added check :D it is <s> </s>

Furkan Gözükara

I am now able to get video, which is pretty cool, but I can't get the voice part to actually read through the script. It randomly bounces back and forth between previous words within the S and E speech brackets.

Diggy Dre

Hi, are you on 7.2? Can you copy the CMD logs, starting from the first line, into a txt file and email it to me? monstermmorpg@gmail.com

Furkan Gözükara

Juanmyth

Windows error. You can see every file content with notepad. You have to allow it to download.

Furkan Gözükara

Yes I think so

Furkan Gözükara

What languages does this version support? Only English?

Hoàng Giang Sơn Trương

The latest version does not allow downloading, the Windows system marks it as a virus. I was able to download it without any problem. Thank you very much.

carlos chavez

Well, you can mute the sound. I added a checkbox option to generate without speech tags. Try that too. But I will check whether a no-audio option is available or not.

Furkan Gözükara

Hi Furkan. Can I generate videos without speech or background sounds, or with one or the other?

michele carlone

Update to the latest version, delete the outputs and send me the entire CMD log by email: monstermmorpg@gmail.com You can use T5 on CPU at almost the same speed.

Furkan Gözükara

I wasn't able to get this to work on my 4080. Also, for some reason, it kept trying to offload the text encoder to my 9950x.

Diggy Dre

Clear your output folder. I need to make a fix for this. After doing that, restart and let me know. Also try v6.2.

Furkan Gözükara

Greetings. I did a clean install from v3 (working) to v5. Now my generations are consistently stuck in a loop back into the T5 encoding process. Only way to stop the process is closing Python. Thanks in advance

Oliver

You can open all of the files with Notepad and see the content. 100% false positive. Also, VirusTotal shows 0: https://www.virustotal.com/gui/file/9b4b81a000308cc5ce9d01a138cbd0737820331b6ed24799521637dea3b5336e

Furkan Gözükara

You can change the seed. Currently it is 99 unless you enable randomize seed.

Furkan Gözükara

In the samples on the project page https://aaxwaz.github.io/Ovi/ the voices vary, mine sound all the same. Is there a way to vary the voice when using an image as a starting point for the video? A prompt trigger word or something

Neil Rhodes

Tags are not allowed in comments lol, so I will use * instead of the tag symbols :p In the "how to use" window, you said: Check *S*...*/S* tag format, Add *AUDCAP*...*/ENDAUDCAP* descriptions, etc. But the examples and the model page use the tag formats *S* *E* (for speech) and *AUDCAP* *ENDAUDCAP* (for audio description). Which ones are correct?

thecatzman

In the "how to use" window, you said: Check <S>...</S> tag format, Add <AUDCAP>...</ENDAUDCAP> descriptions, etc. But the examples and the model page use the tag formats <S> <E> (for speech) and <AUDCAP> <ENDAUDCAP> (for audio description). Which ones are correct?

thecatzman
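Since the tag format causes confusion in this thread, a quick check on a prompt before generating can help. The sketch below assumes the speech tags are `<S>` to open and `<E>` to close, as the commenter reports seeing in the examples and on the model page; that is an assumption here, and the function is a hypothetical helper, not the check built into the app. Both tags are parameters so a different convention can be tested.

```python
import re

def check_speech_tags(prompt, open_tag="<S>", close_tag="<E>"):
    """Return (balanced, segments) for speech wrapped in open_tag ... close_tag."""
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    segments = [s.strip() for s in re.findall(pattern, prompt, flags=re.DOTALL)]
    # Balanced means every opening tag has a closing tag and each pair matched.
    balanced = prompt.count(open_tag) == prompt.count(close_tag) == len(segments)
    return balanced, segments

print(check_speech_tags("A man waves and says <S>Hello there.<E> to the camera."))
# (True, ['Hello there.'])
print(check_speech_tags("A man says <S>Hello there.</S>"))
# (False, [])
```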

New version is lighting up my antivirus "Threat found - action needed. 10/6/2025 9:38 PM Severe Detected: Trojan:Script/Wacatac.C!ml Status: Active Active threats have not been remediated and are running on your device. Date: 10/6/2025 9:38 PM Details: This program is dangerous and executes commands from an attacker. Affected items: file: C:\Users\ Downloads\Ovi_Pro_v5.zip Learn more"

leem0nchu

I have dual GPUs too. REM means the line is commented out. So make a copy of the bat file and add this line before the call line: SET CUDA_VISIBLE_DEVICES=1 - message me on Discord and I will send you the bat file.

Furkan Gözükara

Hi, great work. Everything worked and I managed to create a video just fine. I have two GPUs, a 4060 8 GB and a 5060 Ti 16 GB, yet it ignores the 16 GB card even though I change the REM SET CUDA_VISIBLE_DEVICES=0 as it was =1. (or set CUDA_VISIBLE_DEVICES=0) The 8 GB card is plugged into the monitor, so I use the 5060 in OVI/ComfyUI etc. for rendering. What can I do to fix this?

Daniel Smith
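The same GPU selection can be done from Python by setting `CUDA_VISIBLE_DEVICES` in the environment of the launched process. This is only a sketch of the idea from the reply above, not how the app's bat file works; the helper name `env_for_gpu` is hypothetical, and `premium.py` is mentioned only because it appears in the tracebacks in this thread.

```python
import os

def env_for_gpu(gpu_index, base_env=None):
    """Return a copy of the environment that exposes only one CUDA device."""
    env = dict(os.environ if base_env is None else base_env)
    # The variable must be set before the CUDA runtime initializes,
    # so it belongs in the environment of the process being started.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    return env

if __name__ == "__main__":
    env = env_for_gpu(1)
    print(env["CUDA_VISIBLE_DEVICES"])  # 1
    # Hypothetical launch, left commented out:
    # import subprocess, sys
    # subprocess.run([sys.executable, "premium.py"], env=env)
```

A line that starts with REM in a bat file is a comment and has no effect, which is why editing the number on that line changes nothing. Note also that CUDA's default device order is not guaranteed to match what other tools show; setting `CUDA_DEVICE_ORDER=PCI_BUS_ID` makes the indices follow the PCI bus order.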

Try 4.5 and email me the CMD logs: monstermmorpg@gmail.com

Furkan Gözükara

I’ve updated to version 3 and v2 worked better. It did find my card but not building. It gets stuck.

James Charleston II

What is 1tgb? Are you on v4.5? How much RAM do you have, and what GPU?

Furkan Gözükara

Hello, I got everything installed and running, but running this even with the 1tgb preset it slows down my computer a lot and a few times it slowed it down so much I had to hold in the button to turn it off. I'm not sure what to really do about it.

James Woodill

Hi, that is an install error. Please have Python 3.10.11 installed. Follow this tutorial: https://youtu.be/DrhUHnYfwC0

Furkan Gözükara

I ran the installer OK on Windows 64 but when trying to run I get the error:
The system cannot find the path specified.
Traceback (most recent call last):
  File "C:\Ovi_Pro_v3\Ovi_Pro\premium.py", line 1, in <module>
    import gradio as gr
ModuleNotFoundError: No module named 'gradio'

Alexandre Rangel

thanks gonna check now

Furkan Gözükara

Thank you for this awesome release. I'm currently testing on an H100 with 60 steps (3 min generation per video). I may have found a bug: no matter which video length I set, the videos are always 5 seconds long, even if I set 10 seconds or any other number.

FalconBravery

I think you need to change the prompt. Did you test the example prompts in the example tab? They are perfectly animated with default presets.

Furkan Gözükara

Most of the time I get a static video with sound. Which parameter do I have to change to avoid it?

Fco Muñoz

Just updated the app to v3.3 for you and made the preset more robust. Please email me the logs of the Windows_Install_and_Update.bat file: monstermmorpg@gmail.com and try again.

Furkan Gözükara

How much RAM and VRAM do you have, and which preset?

Furkan Gözükara

This one is easy to solve. Set your virtual memory to 100 GB: https://www.windowscentral.com/software-apps/windows-11/how-to-manage-virtual-memory-on-windows-11 and try again. I will try to add FP8 scaled loading today, which will reduce the RAM needed.

Furkan Gözükara

Updated to v3 on an RTX 4060 8 GB GPU with 64 GB RAM and got this error:
[OK] OviFusionEngine initialized successfully (models will load on first generation)
[GENERATION 1/1] Starting with seed: 99
================================================================================
STEP 1/2: Loading T5 text encoder FIRST to minimize RAM usage
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 29 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Removing weight norm...
================================================================================
T5 CPU-ONLY MODE: Loading T5 on CPU for CPU inference
This saves VRAM but text encoding will be slower
================================================================================
[T5 CPU-ONLY MODE] Loading T5 text encoder on CPU for CPU inference
[T5 CPU-ONLY MODE] This saves VRAM but encoding will be slower
[T5 LOAD][BF16] Encoder structure created in 33.61s
[T5 LOAD][BF16] Weights loaded in 6.59s (total 40.20s)
[T5 CPU-ONLY MODE] T5 encoder ready on CPU
================================================================================
T5 loaded. Fusion model will load AFTER text encoding.
================================================================================
STEP 2/2: Encoding text and optionally deleting T5 before loading fusion model
================================================================================
Encoding text prompts...
Text embeddings encoded on CPU and moved to GPU
Keeping T5 on CPU (already in CPU-only mode)
================================================================================
STEP 3/3: Loading fusion model (T5 already deleted if enabled)
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 29 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Step 1/6: Creating model structure on meta device...
Score model (Fusion) all parameters:11660753108
Step 2/6: Loading checkpoint weights to CPU...
Error during video generation: The paging file is too small for this operation to complete. (os error 1455)
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] No output file found

b
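The parameter count printed in the log gives a rough idea of why the paging file matters. A back-of-the-envelope sketch, assuming the fusion weights are held in bf16 (2 bytes per parameter) while loading to CPU; the log does not state the checkpoint dtype, so that is an assumption, and the estimate ignores the T5 encoder, the VAE and any temporary copies:

```python
FUSION_PARAMS = 11_660_753_108  # "Score model (Fusion) all parameters" from the log

BYTES_PER_PARAM = {"fp32": 4, "bf16": 2, "fp8": 1}

def weights_size_gib(num_params, dtype):
    """Approximate size of the raw weights in GiB for a given dtype."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30

for dtype in ("fp32", "bf16", "fp8"):
    print(f"{dtype}: {weights_size_gib(FUSION_PARAMS, dtype):.2f} GiB")
```

Under the bf16 assumption the fusion weights alone come to roughly 21.7 GiB of committed memory, which is in line with the advice to enlarge virtual memory and with the expectation that FP8 scaled loading would cut the requirement roughly in half.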

Thanks for that! A few things to mention (I used the update batch file to update to 3.0, but I'll try deleting the venv and running it again just to make sure the update really happened):
- Cancelling doesn't work.
- Saving a new preset is also not working.
Traceback (most recent call last):
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\route_utils.py", line 354, in call_process_api
    output = await app.get_blocks().process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 2112, in process_api
    inputs = await self.preprocess_data(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 1774, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\components\dropdown.py", line 206, in preprocess
    raise Error(
gradio.exceptions.Error: "Value: True is not in the list of choices: ['BlahBlah']"
Traceback (most recent call last):
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\queueing.py", line 759, in process_events
    response = await route_utils.call_process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\route_utils.py", line 354, in call_process_api
    output = await app.get_blocks().process_api(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 2112, in process_api
    inputs = await self.preprocess_data(
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\blocks.py", line 1774, in preprocess_data
    processed_input.append(block.preprocess(inputs_cached))
  File "P:\AI\Ovi_Pro_v1\Ovi_Pro\venv\lib\site-packages\gradio\components\dropdown.py", line 206, in preprocess
    raise Error(
gradio.exceptions.Error: "Value: True is not in the list of choices: ['BlahBlah']"

Richard Nagy

I'm getting this error with 3.1:
Initial VRAM: 0.02 GB
Step 1/6: Creating model structure on meta device...
Score model (Fusion) all parameters:11660753108
Step 2/6: Loading checkpoint weights to CPU...
[SUBPROCESS] Generation failed with return code: 3221225477
[GENERATION 1/1] No output file found

Agino Terra

Let me check if broken

Furkan Gözükara

It won’t run in text to video mode for me. I have to start with an image or nothing happens.

James Charleston II

I don't know, sadly. I don't have an AMD card to test.

Furkan Gözükara

I don't know about either of them. I am working on improving the VAE for lower-VRAM GPUs, but I will check them later.

Furkan Gözükara

Hi Furkan. Thanks for this fantastic work. I tried generating videos in Italian, but the audio is horrible. Is it possible to improve this language? Can videos be generated that are at least 10 seconds long?

michele carlone

Does it run with AMD GPUs?

Alexander Hempel

Yes, it works perfectly. You can use the default config.

Furkan Gözükara

Will this work with 4090?

guni

The 5090 is really fast, but I don't know about the 3080 Ti. Give it a try. I am also making improvements. You can also generate in 20 steps to speed things up.

Furkan Gözükara

What is the generation time? For example on an RTX 3080 Ti with 16 GB VRAM and 32 GB RAM?

ranjeet

You are welcome, thanks for the comment.

Furkan Gözükara

Amazing work, really. Thank you so much :)

Damjan Žakelj

Update to the latest version and try block swap 12.

Furkan Gözükara

[GENERATE_VIDEO] Called with clear_all=False, num_generations=1
================================================================================
INITIALIZING OVI FUSION ENGINE IN MAIN PROCESS
Block Swap: 8 blocks (0 = disabled)
CPU Offload: True
Image Generation: False
No Block Prep: False
Note: Models will be loaded in main process (Clear All Memory disabled)
================================================================================
[OK] OviFusionEngine initialized successfully (models will load on first generation)
[GENERATION 1/1] Starting with seed: 99
================================================================================
STEP 1/2: Loading T5 text encoder FIRST to minimize RAM usage
================================================================================
================================================================================
Loading OVI models for first generation...
Block Swap: 8 blocks
CPU Offload: True
================================================================================
Initial VRAM: 0.00 GB
Removing weight norm...
Loading T5 text encoder directly to GPU (BEFORE fusion model to save RAM)...
Error during video generation: CUDA out of memory. Tried to allocate 160.00 MiB. GPU 0 has a total capacity of 12.00 GiB of which 0 bytes is free. Of the allocated memory 18.23 GiB is allocated by PyTorch, and 361.40 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
[SINGLE-GEN] Failed - no result returned
[SUBPROCESS] Generation failed with return code: 1
[GENERATION 1/1] Failed in subprocess

REGINALDO BARBOSA

Fixed with v2.5. Just run the installer again and make sure Clear All Memory is enabled.

Furkan Gözükara

Fixed with v2.5. Just run the installer again and make sure Clear All Memory is enabled.

Furkan Gözükara

Hi Furkan, thanks a lot for your great work on Ovi Pro Fusion! I found a bug on my RTX 4090: With block swap = 12 and CPU offload = true, the first generation works perfectly. But the second generation with the same settings always fails. Error message: RuntimeError: Input type (CUDABFloat16Type) and weight type (CPUBFloat16Type) should be the same It looks like the patch_embedding layer stays on CPU after the first run, while the inputs are already on CUDA. Without block swap/offload it doesn’t run well on 24 GB, so this is important. Restarting the app always fixes it for one run. Maybe the block swap step needs to re-sync that layer on every run. Thanks again for this amazing project!

macmotu

Thanks for your amazing work, dear Furkan. Unfortunately, it did not work for me, I will wait for the next version. Here is the error I got, in case you're interested:
"INFERENCE STARTING - VRAM: 16.58 GB
Block swap active: 12/30 blocks on CPU
================================================================================
2it [08:37, 258.78s/it]
ERROR:root:Traceback (most recent call last):
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\ovi_fusion_engine.py", line 455, in generate
    pred_vid_pos, pred_audio_pos = self.model(
  File "E:\AI\Ovi_Pro\Ovi_Pro\venv\lib\site-packages\torch\nn\modules\module.py", line 1773, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\venv\lib\site-packages\torch\nn\modules\module.py", line 1784, in _call_impl
    return forward_call(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\fusion.py", line 310, in forward
    vid, audio = gradient_checkpointing(
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\model.py", line 22, in gradient_checkpointing
    return module(*args, **kwargs)
  File "E:\AI\Ovi_Pro\Ovi_Pro\ovi\modules\fusion.py", line 223, in single_fusion_block_forward
    assert not torch.equal(og_audio, audio), "Audio should be changed after cross-attention!"
torch.AcceleratorError: CUDA error: an illegal memory access was encountered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
Error during video generation: cannot unpack non-iterable NoneType object"

JP LONB

8 GB of memory is too low. So far I have gotten it as low as 8.2 GB. I am trying to add FP8_Scaled right now, let's see if that can fix your issue.

Furkan Gözükara

RTX 4060 with 8 GB VRAM and 64 GB RAM is not working; it seems like a memory issue.

b

Thanks a lot. Working to add more features today, hopefully.

Furkan Gözükara

You are amazing Dr! thank you once again!!

Hipno

Looks like it's not going to work for me. Started fresh. 512X512. Max block swapping. Ran out of memory.

DanO..

Just gave it a try. Ran out of memory with RTX 3060 12GB VRAM. I lowered resolution and maxed out swapped blocks but wouldn't generate. It appears as though it leaves system memory (40GB) and VRAM maxed out after failed attempt. Needs to clean it out after failure or add some mechanism to clear out memory. Restarting to try with lower resolution and maxed block swapping.

DanO..

