From b32fa50c4429555e2a4d8cc8f034b90938eb230a Mon Sep 17 00:00:00 2001 From: Dima <33123184+DimaMolod@users.noreply.github.com> Date: Thu, 24 Oct 2024 13:12:55 +0200 Subject: [PATCH 01/18] Fix #435 ...and remove `module load` in all slurm scripts examples --- README.md | 21 ++++----------------- 1 file changed, 4 insertions(+), 17 deletions(-) diff --git a/README.md b/README.md index 9b0e5de0..bd85db31 100644 --- a/README.md +++ b/README.md @@ -712,7 +712,7 @@ Create the `create_individual_features_SLURM.sh` script and place the following #SBATCH -o logs/create_individual_features_%A_%a_out.txt #qos sets priority -#SBATCH --qos=low +#SBATCH --qos=normal #Limit the run to a single node #SBATCH -N 1 @@ -721,11 +721,7 @@ Create the `create_individual_features_SLURM.sh` script and place the following #SBATCH --ntasks=8 #SBATCH --mem=64000 -module load HMMER/3.4-gompi-2023a -module load HH-suite/3.3.0-gompi-2023a eval "$(conda shell.bash hook)" -module load CUDA/11.8.0 -module load cuDNN/8.7.0.84-CUDA-11.8.0 conda activate AlphaPulldown # CUSTOMIZE THE FOLLOWING SCRIPT PARAMETERS FOR YOUR SPECIFIC TASK: @@ -742,13 +738,7 @@ create_individual_features.py \ ##### ``` -Make the script executable by running: - -```bash -chmod +x create_individual_features_SLURM.sh -``` - -Next, execute the following commands, replacing `` with the path to your input FASTA file: +Execute the following commands, replacing `` with the path to your input FASTA file: ```bash mkdir logs @@ -1160,11 +1150,8 @@ Create the `run_multimer_jobs_SLURM.sh` script and place the following code in i #Adjust this depending on the node #SBATCH --ntasks=8 #SBATCH --mem=64000 - -module load Anaconda3 -module load CUDA/11.8.0 -module load cuDNN/8.7.0.84-CUDA-11.8.0 -source activate AlphaPulldown +eval "$(conda shell.bash hook)" +conda activate AlphaPulldown MAXRAM=$(echo `ulimit -m` '/ 1024.0'|bc) GPUMEM=`nvidia-smi --query-gpu=memory.total --format=csv,noheader,nounits|tail -1` From 
83bf24bbb031f307971624178516ae7d1df24e24 Mon Sep 17 00:00:00 2001 From: Dima <33123184+DimaMolod@users.noreply.github.com> Date: Thu, 24 Oct 2024 13:15:04 +0200 Subject: [PATCH 02/18] Always install hhsuite and hmmer in conda --- README.md | 5 ----- 1 file changed, 5 deletions(-) diff --git a/README.md b/README.md index bd85db31..d8c7e053 100644 --- a/README.md +++ b/README.md @@ -430,11 +430,6 @@ AlphaPulldown can be used as a set of scripts for every particular step. ```bash conda create -n AlphaPulldown -c omnia -c bioconda -c conda-forge python==3.11 openmm==8.0 pdbfixer==1.9 kalign2 hhsuite hmmer modelcif -``` - -**Optionally**, if you do not have it yet on your system, install [HMMER](http://hmmer.org/documentation.html) from Anaconda: - -```bash source activate AlphaPulldown ``` This usually works, but on some compute systems, users may prefer to use other versions or optimized builds of HMMER and HH-suite that are already installed. From 94a790d736321165b262c1ed1cc5ea554ccd39e3 Mon Sep 17 00:00:00 2001 From: Dima <33123184+DimaMolod@users.noreply.github.com> Date: Thu, 24 Oct 2024 13:23:44 +0200 Subject: [PATCH 03/18] Fix #411 --- README.md | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/README.md b/README.md index d8c7e053..c99f2f40 100644 --- a/README.md +++ b/README.md @@ -320,6 +320,12 @@ example:2:1-50 example:1-50_example:1-50 example:1:1-50_example:1:1-50 ``` +One can also specify several amino acid ranges in one line to be modeled together: + +``` +example:1-50:70-100 +example:2:1-50:70-100 +``` This format similarly extends for the folding of heteromers: From 71de6512d6fbb6f75ba1a292b674079cc74531cc Mon Sep 17 00:00:00 2001 From: Dima <33123184+DimaMolod@users.noreply.github.com> Date: Thu, 24 Oct 2024 13:28:57 +0200 Subject: [PATCH 04/18] Fix #413 --- README.md | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/README.md b/README.md index c99f2f40..80b078df 100644 --- a/README.md +++ b/README.md @@ -296,7 +296,12 @@ 
You can delete the  now.
 
 ## 2. Configuration
-Adjust `config/config.yaml` for your particular use case.
+Adjust `config/config.yaml` for your particular use case. It is possible to use pre-calculated features (e.g. [downloaded from our features database](https://github.com/KosinskiLab/AlphaPulldown?tab=readme-ov-file#installation)) by adding the paths to these features to your `config/config.yaml`:
+
+```yaml
+feature_directory :
+  - "/path/to/directory/with/features/"
+```
 
 If you want to use CCP4 for analysis, open `config/config.yaml` in a text editor and change the path to the analysis container to:
 
 ```yaml

From 911a252d9df9cd03b9f9b7f19ca5510c4a850c21 Mon Sep 17 00:00:00 2001
From: Dima <33123184+DimaMolod@users.noreply.github.com>
Date: Thu, 24 Oct 2024 13:37:57 +0200
Subject: [PATCH 05/18] Fix #420

---
 README.md | 69 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 68 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 80b078df..01a11c76 100644
--- a/README.md
+++ b/README.md
@@ -378,9 +378,76 @@ Slurm specific parameters that do not need to be modified by non-expert users.
 **only_generate_features**
 If set to True, stops after generating features and does not perform structure prediction and reporting.
 
+
 ## 3. Execution
 
-After following the Installation and Configuration steps, you are now ready to run the snakemake pipeline. To do so, navigate into the cloned pipeline directory and run:
+After following the Installation and Configuration steps, you are now ready to run the Snakemake pipeline. 
To do so, navigate into the cloned pipeline directory and run:
+
+```bash
+snakemake \
+  --use-singularity \
+  --singularity-args "-B /scratch:/scratch \
+                      -B /g/kosinski:/g/kosinski \
+                      --nv " \
+  --jobs 200 \
+  --restart-times 5 \
+  --profile slurm_noSidecar \
+  --rerun-incomplete \
+  --rerun-triggers mtime \
+  --latency-wait 30 \
+  -n
+```
+
+> [!Warning]
+> Running Snakemake in the foreground on a remote server can cause the process to terminate if the session is disconnected. To avoid this, you can run Snakemake in the background and redirect the output to log files. Here are two approaches, depending on your environment:
+
+- **For SLURM clusters:** Use `srun` to submit the job in the background:
+
+  ```bash
+  srun --job-name=snakemake_job --output=snakemake_output.log --error=snakemake_error.log \
+  snakemake \
+    --use-singularity \
+    --singularity-args "-B /scratch:/scratch \
+                        -B /g/kosinski:/g/kosinski \
+                        --nv " \
+    --jobs 200 \
+    --restart-times 5 \
+    --profile slurm_noSidecar \
+    --rerun-incomplete \
+    --rerun-triggers mtime \
+    --latency-wait 30
+  ```
+
+- **For non-SLURM systems:** You can use `screen` to run the process in a persistent session:
+
+  1. Start a `screen` session:
+     ```bash
+     screen -S snakemake_session
+     ```
+  2. Run Snakemake as usual:
+     ```bash
+     snakemake \
+       --use-singularity \
+       --singularity-args "-B /scratch:/scratch \
+                           -B /g/kosinski:/g/kosinski \
+                           --nv " \
+       --jobs 200 \
+       --restart-times 5 \
+       --profile slurm_noSidecar \
+       --rerun-incomplete \
+       --rerun-triggers mtime \
+       --latency-wait 30
+     ```
+  3. Detach from the `screen` session by pressing `Ctrl + A` then `D`. You can later reattach with:
+     ```bash
+     screen -r snakemake_session
+     ```
+
+By following these methods, you ensure that Snakemake continues running even if the remote session disconnects.
```bash
snakemake \

From 42da7acdefc8d0f5bebb754a78ccc2a0d5eb7aad Mon Sep 17 00:00:00 2001
From: Dima <33123184+DimaMolod@users.noreply.github.com>
Date: Thu, 24 Oct 2024 13:40:29 +0200
Subject: [PATCH 06/18] Do not save logs again

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 01a11c76..2028e866 100644
--- a/README.md
+++ b/README.md
@@ -404,7 +404,7 @@ snakemake \
 - **For SLURM clusters:** Use `srun` to submit the job in the background:
 
   ```bash
-  srun --job-name=snakemake_job --output=snakemake_output.log --error=snakemake_error.log \
+  srun --job-name=snakemake_job \
   snakemake \
   --use-singularity \
   --singularity-args "-B /scratch:/scratch \

From aeddb643b557026ef005490eed058a083e90d916 Mon Sep 17 00:00:00 2001
From: Dima <33123184+DimaMolod@users.noreply.github.com>
Date: Thu, 7 Nov 2024 14:43:39 +0100
Subject: [PATCH 07/18] new line

---
 README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/README.md b/README.md
index b5b0538e..bc5c0fd3 100644
--- a/README.md
+++ b/README.md
@@ -304,6 +304,7 @@ feature_directory :
 ```
 > [!NOTE]
 > If your folders contain compressed features, you have to set the `--compress-features` flag to True; otherwise, AlphaPulldown will not recognize these features and will start calculations from scratch!
+ If you want to use CCP4 for analysis, open `config/config.yaml` in a text editor and change the path to the analysis container to: ```yaml From d496910ffc687b0a53ffc88a139ffdff654b36b8 Mon Sep 17 00:00:00 2001 From: Jan Kosinski Date: Fri, 15 Nov 2024 17:34:01 +0100 Subject: [PATCH 08/18] Refine create_individual_features.py SLURM example --- README.md | 12 +++--------- 1 file changed, 3 insertions(+), 9 deletions(-) diff --git a/README.md b/README.md index a14ea033..a8a1513e 100644 --- a/README.md +++ b/README.md @@ -719,20 +719,14 @@ Create the `create_individual_features_SLURM.sh` script and place the following #SBATCH --ntasks=8 #SBATCH --mem=64000 -module load HMMER/3.4-gompi-2023a -module load HH-suite/3.3.0-gompi-2023a -eval "$(conda shell.bash hook)" -module load CUDA/11.8.0 -module load cuDNN/8.7.0.84-CUDA-11.8.0 -conda activate AlphaPulldown +module load Mamba +source activate AlphaPulldown # CUSTOMIZE THE FOLLOWING SCRIPT PARAMETERS FOR YOUR SPECIFIC TASK: #### create_individual_features.py \ --fasta_paths=example_1_sequences.fasta \ - --data_dir=/scratch/AlphaFold_DBs/2.3.2 - -/ \ + --data_dir=/scratch/AlphaFold_DBs/2.3.2 \ --output_dir=/scratch/mydir/test_AlphaPulldown/ \ --max_template_date=2050-01-01 \ --skip_existing=True \ From ff9de9c4d27bbe9dc5beca0086a52fe54c8a31ab Mon Sep 17 00:00:00 2001 From: Dima Molodenskiy Date: Fri, 8 Nov 2024 14:25:19 +0100 Subject: [PATCH 09/18] New. 
--- test/test_data/protein_lists/test_long_name.txt | 1 + 1 file changed, 1 insertion(+) create mode 100755 test/test_data/protein_lists/test_long_name.txt diff --git a/test/test_data/protein_lists/test_long_name.txt b/test/test_data/protein_lists/test_long_name.txt new file mode 100755 index 00000000..602107c6 --- /dev/null +++ b/test/test_data/protein_lists/test_long_name.txt @@ -0,0 +1 @@ +A0A075B6L2,10,1-3,4-5,6-7,7-8 \ No newline at end of file From 776c268e7ceaee02db02cc18e71c660118c1eca2 Mon Sep 17 00:00:00 2001 From: Dima Molodenskiy Date: Fri, 8 Nov 2024 14:26:47 +0100 Subject: [PATCH 10/18] ignore tmp files --- .gitignore | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/.gitignore b/.gitignore index f2d4b96d..da03f8c1 100644 --- a/.gitignore +++ b/.gitignore @@ -1 +1,3 @@ -__pycache__* \ No newline at end of file +__pycache__* +.DS_Store* +.idea* From d204d8e6144e1c49a2360570cabcb6016b7cef97 Mon Sep 17 00:00:00 2001 From: Dima Molodenskiy Date: Fri, 8 Nov 2024 14:29:37 +0100 Subject: [PATCH 11/18] Add run_alphafold.py to scripts --- setup.cfg | 1 + 1 file changed, 1 insertion(+) diff --git a/setup.cfg b/setup.cfg index 0ec383ba..ae29647d 100644 --- a/setup.cfg +++ b/setup.cfg @@ -85,6 +85,7 @@ scripts = ./alphapulldown/scripts/create_individual_features.py ./alphapulldown/scripts/convert_to_modelcif.py ./alphapulldown/scripts/run_structure_prediction.py ./alphapulldown/scripts/truncate_pickles.py + ./alphafold/run_alphafold.py [options.package_data] alphafold.common = stereo_chemical_props.txt From f9083702b73be8ef4b5f8fa61fd105576e730714 Mon Sep 17 00:00:00 2001 From: Dima Molodenskiy Date: Fri, 8 Nov 2024 14:34:22 +0100 Subject: [PATCH 12/18] Fix #454 --- alphapulldown/scripts/run_structure_prediction.py | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/alphapulldown/scripts/run_structure_prediction.py b/alphapulldown/scripts/run_structure_prediction.py index a33f2d06..32462a2b 100644 --- 
a/alphapulldown/scripts/run_structure_prediction.py
+++ b/alphapulldown/scripts/run_structure_prediction.py
@@ -224,8 +224,9 @@ def pre_modelling_setup(
 
     if flags.use_ap_style:
         output_dir = join(output_dir, object_to_model.description)
-        if len(output_dir) > 100:
-            raise ValueError(f"Output directory path is too long: {output_dir}."
+        if len(output_dir) > 4096:  # max path length for most filesystems
+            # TODO: rename complex to something shorter
+            logging.warning(f"Output directory path is too long: {output_dir}. "
                             "Please use a shorter path with --output_directory.")
     makedirs(output_dir, exist_ok=True)
     # Copy features metadata to output directory

From 828bb1ccab1185808c3231fb0d7cbb7ec1fb45e1 Mon Sep 17 00:00:00 2001
From: Dima Molodenskiy
Date: Fri, 8 Nov 2024 15:11:34 +0100
Subject: [PATCH 13/18] update alphafold

---
 alphafold | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/alphafold b/alphafold
index fd47f2da..227fc696 160000
--- a/alphafold
+++ b/alphafold
@@ -1 +1 @@
-Subproject commit fd47f2dae7faf479f5c3c1c7de921863b3260be5
+Subproject commit 227fc696c872feabdd3ff9c620d23de15f23b4a7

From a372787de75a75013bea47e366ce2f412feac0a2 Mon Sep 17 00:00:00 2001
From: Dima Molodenskiy
Date: Mon, 11 Nov 2024 10:36:56 +0100
Subject: [PATCH 14/18] Bump 2.0.1

---
 alphapulldown/__init__.py | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/alphapulldown/__init__.py b/alphapulldown/__init__.py
index 8c0d5d5b..159d48b8 100755
--- a/alphapulldown/__init__.py
+++ b/alphapulldown/__init__.py
@@ -1 +1 @@
-__version__ = "2.0.0"
+__version__ = "2.0.1"

From 67965440fea2f4f2639b6c4c6fa536ab257a3308 Mon Sep 17 00:00:00 2001
From: Dima Molodenskiy
Date: Wed, 13 Nov 2024 16:04:57 +0100
Subject: [PATCH 15/18] add confidence metric back to prediction_results

---
 alphafold | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/alphafold b/alphafold
index 227fc696..b2bd782d 160000
--- a/alphafold
+++ b/alphafold
@@ -1 +1 @@
-Subproject commit 
227fc696c872feabdd3ff9c620d23de15f23b4a7 +Subproject commit b2bd782d3b7854dfc7a94c87161898a932c0a0c8 From 3005031d8b18b454e4b60dc19777b2b016ffa2b2 Mon Sep 17 00:00:00 2001 From: Dima Molodenskiy Date: Fri, 15 Nov 2024 16:54:07 +0100 Subject: [PATCH 16/18] Fix #457 --- .github/workflows/github_actions.yml | 1 + alphapulldown/utils/modelling_setup.py | 54 +++++----- test/test_parse_fold.py | 130 +++++++++++++++++++++++++ 3 files changed, 160 insertions(+), 25 deletions(-) create mode 100644 test/test_parse_fold.py diff --git a/.github/workflows/github_actions.yml b/.github/workflows/github_actions.yml index 1c5d4a45..6404c856 100644 --- a/.github/workflows/github_actions.yml +++ b/.github/workflows/github_actions.yml @@ -68,6 +68,7 @@ jobs: pytest -s test/test_modelcif.py pytest -s test/test_features_with_templates.py pytest -s test/test_post_prediction.py + pytest -s test/test_parse_fold.py #export PYTHONPATH=$PWD/alphapulldown/analysis_pipeline:$PYTHONPATH ## Test analysis pipeline #conda install -c bioconda biopandas diff --git a/alphapulldown/utils/modelling_setup.py b/alphapulldown/utils/modelling_setup.py index 1bbdcddc..b2aa325d 100644 --- a/alphapulldown/utils/modelling_setup.py +++ b/alphapulldown/utils/modelling_setup.py @@ -21,7 +21,7 @@ logging.set_verbosity(logging.INFO) -def parse_fold(input, features_directory, protein_delimiter): +def parse_fold(input_list, features_directory, protein_delimiter): """ Parses a list of protein fold specifications and returns structured folding jobs. @@ -37,50 +37,54 @@ def parse_fold(input, features_directory, protein_delimiter): FileNotFoundError: If any required protein features are missing. 
""" all_folding_jobs = [] - for i in input: - formatted_folds, missing_features, unique_features = [], [], [] + missing_features = set() # Initialize as a set to collect unique missing features + for i in input_list: + formatted_folds = [] protein_folds = [x.split(":") for x in i.split(protein_delimiter)] for protein_fold in protein_folds: name, number, region = None, 1, "all" - if len(protein_fold) ==1: - # protein_fold is in this format: [protein_name] + if len(protein_fold) == 1: + # Format: [protein_name] name = protein_fold[0] elif len(protein_fold) > 1: - name, number= protein_fold[0], protein_fold[1] - if ("-") in protein_fold[1]: - # protein_fold is in this format: [protein_name:1-10:14-30:40-100:etc] + name = protein_fold[0] + if "-" in protein_fold[1]: + # Format: [protein_name:1-10:14-30:40-100:etc] try: number = 1 region = protein_fold[1:] region = [tuple(int(x) for x in r.split("-")) for r in region] - except Exception as e: - logging.error(f"Your format: {i} is wrong. The programme will terminate.") + except Exception: + logging.error(f"Your format: {i} is wrong. The program will terminate.") sys.exit() else: - # protein_fold is in this format: [protein_name:copy_number:1-10:14-30:40-100:etc] + # Format: [protein_name:copy_number:1-10:14-30:40-100:etc] try: - number = protein_fold[1] - if len(protein_fold[2:]) > 0: + number = int(protein_fold[1]) + if len(protein_fold) > 2: region = protein_fold[2:] region = [tuple(int(x) for x in r.split("-")) for r in region] - except Exception as e: - logging.error(f"Your format: {i} is wrong. The programme will terminate.") + except Exception: + logging.error(f"Your format: {i} is wrong. 
The program will terminate.") sys.exit() - + number = int(number) - unique_features.append(name) - if not any([exists(join(monomer_dir, f"{name}.pkl")) or exists(join(monomer_dir, f"{name}.pkl.xz")) for - monomer_dir in features_directory]): - missing_features.append(name) + # Check for missing features + if not any( + exists(join(monomer_dir, f"{name}{ext}")) + for monomer_dir in features_directory + for ext in [".pkl", ".pkl.xz"] + ): + missing_features.add(name) # Use .add() since missing_features is a set formatted_folds.extend([{name: region} for _ in range(number)]) all_folding_jobs.append(formatted_folds) - missing_features = set(missing_features) - if len(missing_features): - raise FileNotFoundError( - f"{missing_features} not found in {features_directory}" - ) + + if missing_features: + raise FileNotFoundError( + f"{sorted(missing_features)} not found in {features_directory}" + ) return all_folding_jobs def pad_input_features(feature_dict: dict, diff --git a/test/test_parse_fold.py b/test/test_parse_fold.py new file mode 100644 index 00000000..da88ac03 --- /dev/null +++ b/test/test_parse_fold.py @@ -0,0 +1,130 @@ +import logging +from absl.testing import parameterized +from unittest import mock +from alphapulldown.utils.modelling_setup import parse_fold + +""" +Test parse_fold function with different scenarios +""" + +class TestParseFold(parameterized.TestCase): + + def setUp(self) -> None: + super().setUp() + # Set logging level to INFO + logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s') + + @parameterized.named_parameters( + { + 'testcase_name': 'single_protein_no_copy', + 'input': ['protein1'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': True, + 'dir1/protein1.pkl.xz': False, + }, + 'expected_result': [[{'protein1': 'all'}]], + }, + { + 'testcase_name': 'single_protein_with_copy_number', + 'input': ['protein1:2'], + 'features_directory': ['dir1'], 
+ 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': True, + 'dir1/protein1.pkl.xz': False, + }, + 'expected_result': [[{'protein1': 'all'}, {'protein1': 'all'}]], + }, + { + 'testcase_name': 'single_protein_with_region', + 'input': ['protein1:1-10'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': True, + 'dir1/protein1.pkl.xz': False, + }, + 'expected_result': [[{'protein1': [(1, 10)]}]], + }, + { + 'testcase_name': 'single_protein_with_copy_and_regions', + 'input': ['protein1:2:1-10:20-30'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': True, + 'dir1/protein1.pkl.xz': False, + }, + 'expected_result': [[{'protein1': [(1, 10), (20, 30)]}, {'protein1': [(1, 10), (20, 30)]}]], + }, + { + 'testcase_name': 'multiple_proteins', + 'input': ['protein1:2_protein2:1-50'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': True, + 'dir1/protein1.pkl.xz': False, + 'dir1/protein2.pkl': True, + 'dir1/protein2.pkl.xz': False, + }, + 'expected_result': [[{'protein1': 'all'}, {'protein1': 'all'}, {'protein2': [(1, 50)]}]], + }, + { + 'testcase_name': 'missing_features', + 'input': ['protein1', 'protein2'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': False, + 'dir1/protein1.pkl.xz': False, + 'dir1/protein2.pkl': False, + 'dir1/protein2.pkl.xz': False, + }, + 'expected_exception': FileNotFoundError, + 'expected_exception_message': "['protein1', 'protein2'] not found in ['dir1']", + }, + { + 'testcase_name': 'invalid_format', + 'input': ['protein1::1-10'], + 'features_directory': ['dir1'], + 'protein_delimiter': '_', + 'mock_side_effect': {}, + 'expected_exception': SystemExit, + }, + { + 'testcase_name': 'feature_exists_in_multiple_dirs', + 'input': ['protein1'], + 'features_directory': ['dir1', 'dir2'], + 
'protein_delimiter': '_', + 'mock_side_effect': { + 'dir1/protein1.pkl': False, + 'dir1/protein1.pkl.xz': False, + 'dir2/protein1.pkl': True, + 'dir2/protein1.pkl.xz': False, + }, + 'expected_result': [[{'protein1': 'all'}]], + }, + ) + def test_parse_fold(self, input, features_directory, protein_delimiter, mock_side_effect, + expected_result=None, expected_exception=None, expected_exception_message=None): + """Test parse_fold with different input scenarios""" + with mock.patch('alphapulldown.utils.modelling_setup.exists') as mock_exists, \ + mock.patch('sys.exit') as mock_exit: + mock_exists.side_effect = lambda path: mock_side_effect.get(path, False) + # Mock sys.exit to raise SystemExit exception + mock_exit.side_effect = SystemExit + logging.info(f"Testing with input: {input}, features_directory: {features_directory}, " + f"protein_delimiter: '{protein_delimiter}'") + logging.info(f"Mock side effects: {mock_side_effect}") + if expected_exception: + with self.assertRaises(expected_exception) as context: + result = parse_fold(input, features_directory, protein_delimiter) + if expected_exception_message: + self.assertEqual(str(context.exception), expected_exception_message) + else: + result = parse_fold(input, features_directory, protein_delimiter) + logging.info(f"Result: {result}, Expected: {expected_result}") + self.assertEqual(result, expected_result) From 18295814015fb95ff51c4dd6a433ac2a078e4d37 Mon Sep 17 00:00:00 2001 From: Jan Kosinski Date: Fri, 15 Nov 2024 17:41:57 +0100 Subject: [PATCH 17/18] Refine prediction slurm example Anaconda3 is no longer available and cuda modules no longer necessary --- README.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/README.md b/README.md index bc5c0fd3..ec881b5c 100644 --- a/README.md +++ b/README.md @@ -1226,7 +1226,7 @@ Create the `run_multimer_jobs_SLURM.sh` script and place the following code in i #Adjust this depending on the node #SBATCH --ntasks=8 #SBATCH --mem=64000 -eval "$(conda 
shell.bash hook)" +module load Mamba source activate AlphaPulldown MAXRAM=$(echo `ulimit -m` '/ 1024.0'|bc) From 1f7ecc561097ac24eaca3a8df1ee4b02fc29b1a0 Mon Sep 17 00:00:00 2001 From: Dima <33123184+DimaMolod@users.noreply.github.com> Date: Thu, 21 Nov 2024 12:26:01 +0100 Subject: [PATCH 18/18] Update README.md --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 22c5e398..83f69cb8 100644 --- a/README.md +++ b/README.md @@ -396,8 +396,7 @@ snakemake \ --profile slurm_noSidecar \ --rerun-incomplete \ --rerun-triggers mtime \ - --latency-wait 30 \ - -n + --latency-wait 30 ``` > [!Warning]