MSA tensor format #56
Comments
@panganqi Hi! I just wanted to demonstrate that the MSA and the primary sequence do not have to be the same length (although in practice they would probably be aligned). The framework is in a good enough place that I'll start thinking about how to tackle data preprocessing (I'd like to make it as seamless and easy as possible). How is the data laid out in your directory at the moment?
@panganqi Are you working with templates, by any chance?
I use the combined sidechainnet data, which does not contain MSAs, and we run hhblits on the CASP data to get the MSA files. I want to combine the two into a new dataset. The MSA and the primary sequence are the same length.
No, I'm working in Free Modelling mode.
@panganqi Do you want to chat about this on Discord? We have an alphafold2 channel.
The Usage section contains code like:
seq = torch.randint(0, 21, (1, 128)).cuda()
msa = torch.randint(0, 21, (1, 5, 64)).cuda()
If I have an a3m MSA file, how do I encode it into this tensor? And why is the seq length 128 while the msa is 5 × 64 (5 sequences of half the seq length)?
Could you give an example of how to use it, or of how to generate the msa tensor?
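For anyone landing here with the same question, below is a minimal sketch of one way to turn the text of an a3m file into integer tokens with the same layout as the `msa` tensor in the Usage snippet, which appears to be (batch, number of sequences, length). The vocabulary is an assumption, not the repository's confirmed mapping: the 20 standard amino acids in alphabetical one-letter order map to 0..19, and gaps or any other symbol map to 20, which fits the `randint(0, 21, ...)` range. The helper names `parse_a3m`, `encode` and `a3m_to_tokens` are hypothetical. Lowercase letters in a3m mark insertions relative to the query, so they are dropped to leave every row the same length as the query.

```python
import string

# ASSUMPTION: token ids are not taken from the repository; this ordering is
# only illustrative. 20 standard amino acids -> 0..19, anything else -> 20.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
TOKEN_OF = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
OTHER_TOKEN = 20  # gaps ("-") and unknown residues such as "X"

# In a3m, lowercase letters (and sometimes ".") are insertions relative to
# the query. Removing them leaves rows aligned to the query columns.
_DROP_INSERTIONS = str.maketrans("", "", string.ascii_lowercase + ".")


def parse_a3m(text, max_seqs=None):
    """Return the aligned sequences of an a3m string, insertions removed."""
    seqs, chunks = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A header closes the previous record, if there was one.
            if chunks:
                seqs.append("".join(chunks))
                chunks = []
        else:
            chunks.append(line.translate(_DROP_INSERTIONS))
    if chunks:
        seqs.append("".join(chunks))
    return seqs[:max_seqs]


def encode(seq):
    """Map one aligned sequence to a list of integer tokens."""
    return [TOKEN_OF.get(ch, OTHER_TOKEN) for ch in seq]


def a3m_to_tokens(text, max_seqs=None):
    """Encode an a3m string as a (num_seqs, length) nested list of ints."""
    rows = parse_a3m(text, max_seqs)
    if len({len(r) for r in rows}) > 1:
        raise ValueError("aligned rows differ in length")
    return [encode(r) for r in rows]


example = """>query
MKV
>hit1
M-vV
>hit2
MaK-
"""
tokens = a3m_to_tokens(example)
print(tokens)
```

To get a tensor, the nested list can be passed to `torch.tensor(tokens).unsqueeze(0)`, which adds the leading batch dimension. In an a3m file the first record is normally the query, so `tokens[0]` can double as `seq` when the MSA and the primary sequence are the same length.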