From b624563b73fcc8823d8108994cbd01bbb488a7fd Mon Sep 17 00:00:00 2001
From: Jan Kosinski
Date: Wed, 8 Jan 2025 08:57:45 +0100
Subject: [PATCH] Clarify data dir for structure prediction

---
 README.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/README.md b/README.md
index 5bfc4b58..d02f44ef 100644
--- a/README.md
+++ b/README.md
@@ -194,7 +194,7 @@ alphafold_database/ # Total: ~ 2.2 TB (download: 438
 > Uniclust30 is the version of the database generated before 2019, UniRef30 is the one generated after 2019. Please note that AlphaPulldown is using UniRef30_2023_02 by default. This version can be downloaded by [this script](https://github.com/KosinskiLab/alphafold/blob/main/scripts/download_uniref30.sh). Alternatively, please overwrite the default path to the uniref30 database using the --uniref30_database_path flag of create_individual_features.py.
 > [!NOTE]
-> Since the local installation of all genetic databases is space-consuming, you can alternatively use the [remotely-run MMseqs2 and ColabFold databases](https://github.com/sokrypton/ColabFold). Follow the corresponding [instructions](#13-run-using-mmseqs2-and-colabfold-databases-faster). However, for AlphaPulldown to function, you must download the parameters stored in the `params/` directory of the AlphaFold database.
+> Since the local installation of all genetic databases is space-consuming, you can alternatively use the [remotely-run MMseqs2 and ColabFold databases](https://github.com/sokrypton/ColabFold). Follow the corresponding [instructions](#13-run-using-mmseqs2-and-colabfold-databases-faster). However, for AlphaPulldown to function, you must download the parameters stored in the `params/` directory of the AlphaFold database by downloading and executing this script: https://github.com/google-deepmind/alphafold/blob/main/scripts/download_alphafold_params.sh

@@ -1075,7 +1075,7 @@ source activate AlphaPulldown
 run_multimer_jobs.py \
   --mode=custom \
   --monomer_objects_dir= \
-  --data_dir=
+  --data_dir=
   --protein_lists= \
   --output_path= \
   --num_cycle= \
@@ -1087,7 +1087,7 @@ Explanation of arguments:
 * Instead of `` provide the path to the directory containing the `.pkl` feature files generated in the [first step](#11-basic-run). The path is the same as `--output_dir` for `create_individual_features.py`.
 * Instead of `` provide the path to a text file containing a list of protein combinations to be modeled.
 * Instead of `` provide a path where subdirectories containing the final structures will be saved.
-* Instead of `` provide a path to the genetic database (see [0. Alphafold-databases](#installation) of the installation part).
+* Instead of `` provide a path to the directory containing `params` directory with AlphaFold parameters (see [0. Alphafold-databases](#installation) of the installation part). Sequence and PDB databases are not needed at this stage.
 * `--num_cycle`: specifies the number of times the AlphaFold neural network will run, using the output of one cycle as input for the next. Increasing this number may improve the quality of the final structures (especially for large complexes), but it will also increase the runtime.
 * `--num_predictions_per_model`: Specifies the number of predictions per model. The number of predicted structures is N\*5. The default value is 1, which gives 5 structures.