How much GPU memory does the diffusers implementation of the HunyuanVideo model take? I tried to run it on an H100, but it didn't work; I got the following error. Did anyone manage to run it successfully and get an output?
File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1747, in _call_impl
return forward_call(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/diffusers/models/attention_processor.py", line 588, in forward
return self.processor(
^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/diffusers/models/transformers/transformer_hunyuan_video.py", line 117, in __call__
hidden_states = F.scaled_dot_product_attention(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torch.OutOfMemoryError: CUDA out of memory. Tried to allocate 26.40 GiB. GPU 0 has a total capacity of 79.32 GiB of which 19.74 GiB is free. Process 2282098 has 59.57 GiB memory in use. Of the allocated memory 56.78 GiB is allocated by PyTorch, and 2.06 GiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation. See documentation for Memory Management (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)
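(The error message itself suggests setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True. Assuming it has to be in place before torch makes its first CUDA allocation, the simplest spot seems to be the very top of the script, although this only addresses fragmentation and not the 26.40 GiB attention buffer itself:)

import os
# Must be set before torch initializes the CUDA caching allocator, so it goes first,
# before "import torch".
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"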
The code I was using is:

import torch
from diffusers import BitsAndBytesConfig as DiffusersBitsAndBytesConfig, HunyuanVideoTransformer3DModel, HunyuanVideoPipeline
from diffusers.utils import export_to_video

# from hyvideo.modules.models import HUNYUAN_VIDEO_CONFIG
# from hyvideo.constants import PROMPT_TEMPLATE_ENCODE, PROMPT_TEMPLATE_ENCODE_VIDEO
# print(list(HUNYUAN_VIDEO_CONFIG.keys()))

# Load the transformer in 8-bit via bitsandbytes to reduce its weight memory.
quant_config = DiffusersBitsAndBytesConfig(load_in_8bit=True)
transformer_8bit = HunyuanVideoTransformer3DModel.from_pretrained(
    "tencent/HunyuanVideo",
    subfolder="transformer",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
    revision="refs/pr/18",
)

pipeline = HunyuanVideoPipeline.from_pretrained(
    "tencent/HunyuanVideo",
    transformer=transformer_8bit,
    torch_dtype=torch.float16,
    revision="refs/pr/18",
    device_map="balanced",
)  # .to("cuda")

prompt = "A cat walks on the grass, realistic style."
output = pipeline(
    prompt=prompt,
    height=720,
    width=1280,
    num_frames=129,
    num_inference_steps=30,
).frames[0]

save_path = "cat.mp4"
export_to_video(output, save_path, fps=15)
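For reference, the OOM is raised inside F.scaled_dot_product_attention, so that allocation should scale with the latent sequence length (height x width x num_frames). Below is only a sketch of two things that might reduce peak memory with standard diffusers knobs, not something I have confirmed fits in 80 GB: VAE tiling, plus a smaller placeholder run to check the rest of the setup before retrying 1280x720 with 129 frames. (diffusers also exposes pipeline.enable_model_cpu_offload(), but as far as I understand it does not combine with the device_map="balanced" loading used above, so it is not shown here.)

# Sketch only: tile the VAE decode to cap its peak memory, and use placeholder
# height/width/num_frames (not the target resolution) to verify the pipeline runs.
pipeline.vae.enable_tiling()

output = pipeline(
    prompt=prompt,
    height=320,          # placeholder value for a test run
    width=512,           # placeholder value for a test run
    num_frames=61,       # 4*k + 1 frames, same pattern as the 129 used above
    num_inference_steps=30,
).frames[0]
export_to_video(output, "cat_small.mp4", fps=15)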