I've been experimenting with the default fairseq `examples/speech_recognition/infer.py` and with this repo's `recognize.py`, to see whether it is possible to run inference with a model we fine-tuned ourselves from a base model. We can get `infer.py` to work, but I've noticed that it needs to find the original base model on disk. That makes moving the fine-tuned checkpoint to a different machine cumbersome: the base model has to sit at the same path on the target machine.

I've spent almost a day studying how the model loading works, but I can't wrap my head around it. I think the checkpoint only needs some args from the original base model, but there is a lot of conversion going on between formats and names: `cfg`, `w2v_args`, OmegaConf, and Namespace.
Both `recognize.py` and `recognize.hydra.py` break on loading such a checkpoint file (though they work on published fine-tuned models). It would help me if there were a way to produce, from the original base model and a checkpoint, a model file that works with `recognize.py`. I have not been able to find such a tool. I believe it is as simple as adding the correct `.cfg.w2v_args` info to the checkpoint, but I don't understand how.

I can get `recognize.py` to work with a checkpoint file using the patch below, but model loading then still refers to the original base model.
OK, I think I got a little further. Here I posted a little script that seems to solve this issue for us.

I can't really say I've gained much understanding of the module-loading process, but with the script it is possible to convert a `checkpoint_best.pt` file obtained during fine-tuning into an independent model file that can be loaded with `recognize.py`.
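For anyone hitting the same problem, the core of such a conversion can be sketched roughly as follows. This is an untested sketch, not the script linked above: the function name `embed_w2v_args` is my own, and the key names (`cfg`, `model`, `w2v_args`, `w2v_path`) reflect the fairseq checkpoint layout as I understand it; verify them against your own checkpoint files.

```python
def embed_w2v_args(finetuned_state, base_state):
    """Copy the base model's config into a fine-tuned checkpoint dict.

    Both arguments are checkpoint dicts as returned by
    torch.load(path, map_location="cpu"). The fine-tuned checkpoint
    normally stores only a *path* to the base model; embedding the base
    model's full config instead makes the checkpoint self-contained.
    """
    model_cfg = finetuned_state["cfg"]["model"]
    # Replace the on-disk reference with the base model's full config.
    model_cfg["w2v_args"] = base_state["cfg"]
    # Clear the path so loading no longer looks for the base model file.
    model_cfg["w2v_path"] = None
    return finetuned_state
```

Usage would be roughly: load both checkpoints with `torch.load(path, map_location="cpu")`, call `embed_w2v_args`, and write the result back with `torch.save`.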