This repository has been archived by the owner on Aug 1, 2024. It is now read-only.
Replies: 1 comment
Similar question here! Is there a padding token of some sort we can use to rank insertions/deletions?
-
For a given PDB file, the native sequence extracted from the structure may differ in length from the variant sequences of interest. I want to score those variants with the ESM-IF model, but the model can't evaluate them because of a tensor size mismatch. I tried using "-" in the variants input file to pad the sequences to equal length, but that doesn't work because "-" is not in the alphabet object (I think this is a constraint imposed by the biotite.Alphabet and ProteinSequence objects).
Is there a way to use padding, or to represent insertions and deletions, in the input variant tokenization? Is it possible to implement this simply? I don't know enough about the internals of the esm package to pull this off with a simple workaround.
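To make the two failure modes concrete, here is a minimal sketch of a pre-check that sorts variants into those that match the native length and those that would fail. It does not call the esm package and does not solve the indel problem; `triage_variants`, `STANDARD_AA`, and the example sequences are hypothetical names and data made up for illustration, and the 20-letter alphabet is an assumption about which symbols are accepted.

```python
# Hypothetical pre-check, not part of the esm package: it only reports which
# variants could be scored against the native backbone without modification.
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")  # assumption: 20 standard residues only


def triage_variants(native_seq, variants):
    """Split variants into scorable ones and rejected ones (with a reason)."""
    scorable, rejected = [], {}
    for seq in variants:
        bad = sorted(set(seq) - STANDARD_AA)
        if bad:
            # e.g. "-" used as padding is not a residue symbol
            rejected[seq] = f"symbols not in alphabet: {''.join(bad)}"
        elif len(seq) != len(native_seq):
            # insertions/deletions change the length relative to the structure
            rejected[seq] = f"length {len(seq)} != native length {len(native_seq)}"
        else:
            scorable.append(seq)
    return scorable, rejected


native = "MKTAYIAK"  # made-up native sequence
variants = ["MKTAYIAR", "MKTAYIA", "MKTA-IAK", "MKTAYIAKQ"]
scorable, rejected = triage_variants(native, variants)
for seq, reason in rejected.items():
    print(seq, "->", reason)
```

Only the substitution variant passes here; the deletion, the insertion, and the "-"-padded sequence are the cases I am asking how to handle.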