How can I get tfrecord data of proteins? #8

Open
eunos-1128 opened this issue Oct 5, 2023 · 2 comments

Comments

eunos-1128 commented Oct 5, 2023

Thank you for your work to help train/fine-tune AF2/OpenFold/pLMFold models.

I tried to run pLMFold training on my own protein datasets, but couldn't figure out how to generate the tfrecord data for the proteins.

Reading the paper, the Supplementary Data, and the README didn't help, because none of them describes in detail how to obtain the tfrecord data.

I tried using the AF2 modules to generate the data.
This seems to work, but some features described in the paper are missing from the features generated by the corresponding AF2 code (`template_all_atom_exists` and `pdb_cluster_size`).
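
The check for missing features can be sketched roughly as follows. This is only a minimal sketch, assuming the AF2 pipeline output is a plain dict keyed by feature name; `find_missing_features` and the sample `af2_features` dict are hypothetical, and the required names are just the two mentioned above, not the full list from the paper.

```python
from typing import Iterable, List, Mapping


def find_missing_features(features: Mapping[str, object],
                          required: Iterable[str]) -> List[str]:
    """Return the required feature names that are absent from `features`, sorted."""
    return sorted(set(required) - set(features))


# Feature names listed in the paper that were absent in this issue.
REQUIRED = ["template_all_atom_exists", "pdb_cluster_size"]

# Hypothetical stand-in for the feature dict produced by the AF2 data pipeline;
# the values are placeholders, since only the keys matter for this check.
af2_features = {
    "aatype": None,
    "residue_index": None,
    "template_all_atom_masks": None,
}

missing = find_missing_features(af2_features, REQUIRED)
print("Missing features:", missing)
```

Running the same check against a real feature dict would show whether the gap is limited to these two names.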

How can I obtain the necessary features from my own protein data to train/fine-tune the model?
Is there a tool for doing this?

I need your help.

Ref. #7

lijxgit commented Oct 7, 2023

I have the same problem. Could you help me?

eunos-1128 commented Oct 10, 2023

Also, if you used only AF2 modules to generate the training data, could you tell me which AF2 modules (functions/methods) you used?
