
Precision loss after converting the model to diffusers #655

Open
2 tasks
linwenzhao1 opened this issue Jan 11, 2025 · 4 comments
linwenzhao1 commented Jan 11, 2025

System Info / 系統信息

diffusers:0.32.dev0
cuda:12.0

Information / 问题信息

  • The official example scripts / 官方的示例脚本
  • My own modified scripts / 我自己修改的脚本和任务

Reproduction / 复现过程

I used convert_weight_sat2hf.py to convert a fully fine-tuned 2B model, then loaded it for inference with the commands below. The generated videos show a significant loss of quality.
pipe = CogVideoXPipeline.from_pretrained(args.pretrained_model_name_or_path, torch_dtype=torch.float16).to(device)
pipe.scheduler = CogVideoXDPMScheduler.from_config(pipe.scheduler.config, timestep_spacing="trailing")
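For context, the timestep_spacing="trailing" option in the scheduler line above changes which training timesteps the sampler visits. A rough pure-Python sketch of what "trailing" spacing computes (my own reconstruction of the diffusers convention, not the library code itself):

```python
def trailing_timesteps(num_train_timesteps: int, num_inference_steps: int) -> list:
    # "trailing" spacing counts down from the final training timestep,
    # so the very last training step is always included.
    step = num_train_timesteps / num_inference_steps
    timesteps = []
    t = float(num_train_timesteps)
    while t > 0:
        timesteps.append(round(t) - 1)
        t -= step
    return timesteps

print(trailing_timesteps(1000, 4))  # [999, 749, 499, 249]
```

With "leading" or "linspace" spacing the sampler would instead start from (or near) timestep 0's side of the schedule, which is known to matter for few-step sampling quality.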

Expected behavior / 期待表现

No significant precision loss after converting the model.

@zRzRzRzRzRzRzR zRzRzRzRzRzRzR self-assigned this Jan 11, 2025
@zRzRzRzRzRzRzR
Member

This model should be converted and run in FP16; during conversion, the VAE module should be loaded in FP32.
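Half precision carries only about three significant decimal digits, which is why keeping the VAE in FP32 matters. A standalone sketch that round-trips a value through IEEE 754 fp16 using Python's struct module (0.18215 is just an example scaling-factor-like constant, not a value taken from this repo):

```python
import struct

def roundtrip_fp16(x: float) -> float:
    # Pack to IEEE 754 half precision ('e' format) and unpack back to a float.
    return struct.unpack('e', struct.pack('e', x))[0]

x = 0.18215  # arbitrary example constant
y = roundtrip_fp16(x)
print(x, y, abs(x - y))  # fp16 keeps only ~3 significant decimal digits
```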


linwenzhao1 commented Jan 11, 2025

> This model should be converted and run in FP16; during conversion, the VAE module should be loaded in FP32.

I pasted the wrong code just now. It should not be a dtype issue; below is the conversion code, and the precision loss still occurs.
dtype = torch.float16
args.transformer_ckpt_path = "/home/workspace/CogVideo/sat/ckpts_lora_2b/lora-disney_full_2b_0f-01-04-10-01/9500/mp_rank_00_model_states.pt"
if args.transformer_ckpt_path is not None:
    init_kwargs = get_transformer_init_kwargs(args.version)  # 1.0
    transformer = convert_transformer(
        args.transformer_ckpt_path,
        args.num_layers,  # 30
        args.num_attention_heads,  # 30
        args.use_rotary_positional_embeddings,  # False
        args.i2v,  # False
        dtype,  # fp16
        init_kwargs,
    )

args.vae_ckpt_path = "/home/CogVideoX-2b-sat/vae/3d-vae.pt"
if args.vae_ckpt_path is not None:
    # Keep VAE in float32 for better quality
    vae = convert_vae(args.vae_ckpt_path, args.scaling_factor, args.version, torch.float32)

text_encoder_id = "/home/t5-v1_1-xxl/"
tokenizer = T5Tokenizer.from_pretrained(text_encoder_id, model_max_length=TOKENIZER_MAX_LENGTH)
text_encoder = T5EncoderModel.from_pretrained(text_encoder_id, cache_dir=args.text_encoder_cache_dir)

if args.typecast_text_encoder:
    text_encoder = text_encoder.to(dtype=dtype)

Also, the transformer conversion does not run as-is; some operators have to be removed manually.

@zRzRzRzRzRzRzR
Member

What is the error message when it fails to run?


linwenzhao1 commented Jan 13, 2025

> What is the error message when it fails to run?

It runs normally if this line is changed to strict=False: transformer.load_state_dict(original_state_dict, strict=False). Otherwise the conditioner and first_stage_model operators have to be removed; for CogVideoX-5B, patch_embed.pos_embedding must also be removed before the state dict will load.
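Rather than relying on strict=False (which silently ignores mismatched keys in both directions), one option is to filter the SAT-only keys out of the state dict before loading. A hypothetical sketch of that filtering step, using plain dicts so it runs anywhere; the prefixes come from the keys mentioned above, and the sample keys are made up for illustration:

```python
# SAT-side modules that the diffusers transformer does not expect.
SAT_ONLY_PREFIXES = ("conditioner.", "first_stage_model.", "patch_embed.pos_embedding")

def filter_state_dict(state_dict: dict) -> dict:
    # Keep only entries that do not belong to the SAT-only modules,
    # so load_state_dict can then be called with strict=True.
    return {k: v for k, v in state_dict.items()
            if not k.startswith(SAT_ONLY_PREFIXES)}

sd = {
    "transformer_blocks.0.attn.weight": 1,   # made-up transformer key
    "conditioner.embedder.weight": 2,        # SAT text conditioner
    "first_stage_model.decoder.weight": 3,   # SAT VAE
    "patch_embed.pos_embedding": 4,          # extra positional embedding (5B)
}
print(sorted(filter_state_dict(sd)))  # ['transformer_blocks.0.attn.weight']
```

Filtering explicitly also makes it visible when a genuinely needed weight is missing, which strict=False would hide.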

However, after conversion the 2B model generates extra objects, or places objects incorrectly, for example a cup floating in mid-air next to a table instead of on it.
