Add RoseTTAFold-All-Atom #220

Open: wants to merge 131 commits into base: dev
Commits
aa9be9f
feat(katana.config): Created file katana.config
nbtm-sh Jul 29, 2024
678b212
feat(katana.config): Added params for PBS queues
nbtm-sh Jul 29, 2024
82466e0
feat(katana.config): Added executor parameter to allow the use Katana…
nbtm-sh Jul 29, 2024
8d2a771
feat(katana.config): Added label configs for pushing to GPU partition
nbtm-sh Jul 29, 2024
1ad4996
Merge pull request #1 from Australian-Structural-Biology-Computing/cr…
nbtm-sh Jul 29, 2024
3c1fb28
feat(run_alphafold2): Added 'gpu_compute' label to the Alphafold process
nbtm-sh Jul 29, 2024
67704b4
Merge pull request #2 from Australian-Structural-Biology-Computing/ad…
nbtm-sh Jul 29, 2024
e8d2abb
feat(run_alphafold2_pred): Added 'gpu_compute' label
nbtm-sh Jul 29, 2024
2021623
Merge pull request #3 from Australian-Structural-Biology-Computing/ad…
nbtm-sh Jul 29, 2024
13dd1cb
revert(run_alphafold2.nf): Removed GPU compute label from pipeline
nbtm-sh Jul 29, 2024
b0d483e
Merge pull request #4 from Australian-Structural-Biology-Computing/ad…
nbtm-sh Jul 29, 2024
2444395
Updated database links
jscgh Jul 29, 2024
4ce2e18
Merge pull request #6 from Australian-Structural-Biology-Computing/da…
jscgh Jul 30, 2024
8b9452a
feat(pf_files): Added testing files
nbtm-sh Jul 30, 2024
32311ba
Merge branch 'unsw-dev' into add-testing-files
nbtm-sh Jul 30, 2024
e676c33
Merge pull request #7 from Australian-Structural-Biology-Computing/ad…
nbtm-sh Jul 30, 2024
0962f91
fix(proteinfold_test.sh): Made path to main.nf rel
nbtm-sh Jul 30, 2024
992d6d1
revert(base.config): Changed executor back to local for testing as cl…
nbtm-sh Jul 30, 2024
32d466c
fix(proteinfold_test.sh): Changed mode to 'split_msa_production'
nbtm-sh Jul 30, 2024
d135dc8
Merge pull request #2 from nf-core/master
ziadbkh Aug 1, 2024
b3140e7
fix(dbs.conf): Updated dbs.conf to work on UNSW infrastructure
nbtm-sh Aug 8, 2024
4047e62
fix(run_alphafold2_msa): Fixed incorrectly named files
nbtm-sh Aug 8, 2024
93513bc
fix(run_alphafold2_pred): Fixed incorrectly named files
nbtm-sh Aug 8, 2024
a007d5a
fix(proteinfold_test.sh): Added singulairty argument
nbtm-sh Aug 8, 2024
232c8c9
fix(samplesheet): Changed sample to a much smaller sample
nbtm-sh Aug 8, 2024
03f2575
fix(samplesheet): Changed sampel to a smaller sample
nbtm-sh Aug 8, 2024
632610b
Merge pull request #8 from Australian-Structural-Biology-Computing/ad…
nbtm-sh Aug 8, 2024
964f5d0
feat(conf/dbs): Added variables for database names, and file names
nbtm-sh Aug 8, 2024
2a79fe4
feat(conf/dbs): Changed config paths to use database variables instea…
nbtm-sh Aug 8, 2024
c218ad2
feat(run_alphafold2): Changed hardcoded paths to use variables and up…
nbtm-sh Aug 8, 2024
edff052
feat(run_alphafold2_msa): Removed hardcoded paths and changed variables
nbtm-sh Aug 8, 2024
ea4459a
feat(run_alphafold2_msa): Added code from run_alphafold2.nf so that t…
nbtm-sh Aug 8, 2024
83ca302
fix(conf/dbs): Changed variable names to have _prefix on the end to a…
nbtm-sh Aug 8, 2024
faba0ab
fix(conf/dbs): Changed existing variables to use new prefix variables
nbtm-sh Aug 8, 2024
04cad9d
feat(nextflow.config): Added new param variables and defaults to the …
nbtm-sh Aug 8, 2024
afeb122
feat(dbs): Made variables global
nbtm-sh Aug 9, 2024
3d615b7
fix(dbs): Changed database directory default
nbtm-sh Aug 16, 2024
cb29256
feat(katana): Temporarily removed PBS job scheduling
nbtm-sh Aug 16, 2024
5829301
fix(run_alphafold2): Fixed copy command to point to the correct direc…
nbtm-sh Aug 16, 2024
485e400
fix(run_alphafold2): Updated paths to point to the correct uniclust d…
nbtm-sh Aug 16, 2024
d008f38
fix(run_alphafold2): Fixed typo
nbtm-sh Aug 16, 2024
a0dbd9c
feat(run_alphafold2): Added symlink for params file
nbtm-sh Aug 16, 2024
598cc26
feat(nextflow): Changed default to use GPU
nbtm-sh Aug 16, 2024
eba4412
feat(nextflow): Included katana config
nbtm-sh Aug 16, 2024
4811d3c
feat(test): Added options to katana tests
nbtm-sh Aug 16, 2024
9212111
revert(nextflow): Changed default GPU to false
nbtm-sh Aug 16, 2024
0795469
revert(nextflow): Changed config back to base config
nbtm-sh Aug 16, 2024
3ade215
modified: conf/dbs.config
jscgh Aug 23, 2024
5fe8e1d
feat(katana): Added katana config
nbtm-sh Sep 5, 2024
9af55b1
feat(style): pushing uncommited changes
nbtm-sh Oct 10, 2024
8ce0598
Merge pull request #9 from Australian-Structural-Biology-Computing/cl…
nbtm-sh Oct 10, 2024
3a13ab7
deleted: null/pipeline_info/ as per https://github.com/Australian-…
jscgh Oct 11, 2024
1d0f413
Merge branch 'unsw-dev' into add-rosettafold-all-atom
jscgh Oct 14, 2024
c993475
Initial draft rosettafold-all-atom.nf
jscgh Oct 14, 2024
65897c8
added workflows/rosettafold-all-atom.nf first draft
jscgh Oct 16, 2024
f362ae9
Updating main.nf to current master version and adding RFAA lines
jscgh Oct 16, 2024
2299830
Imported subworkflows and fixed formatting errors with RFAA lines
jscgh Oct 18, 2024
4de580a
Adjusted naming to snake_case for compatibility, various minor change…
jscgh Oct 21, 2024
7faa62a
Added schema support for rosetta_fold_all_atom mode and .yaml or .yml…
jscgh Oct 21, 2024
375e330
modified: assets/schema_input.json
jscgh Oct 21, 2024
6fdb111
Updating input to work with .yaml https://github.com/Australian-Stru…
jscgh Oct 21, 2024
f910f1c
Merge branch 'master' into add-rosettafold-all-atom
jscgh Oct 22, 2024
3a9d7f2
Merging
jscgh Oct 22, 2024
2136354
Updated naming scheme with merged changes
jscgh Oct 22, 2024
b0f13c3
modified: modules/local/run_alphafold2.nf
jscgh Oct 22, 2024
903b6a2
For https://github.com/nf-core/proteinfold/issues/197
jscgh Oct 22, 2024
9aa5054
Cleaned up files
jscgh Oct 22, 2024
1f29711
Merge remote-tracking branch 'refs/remotes/origin/add-rosettafold-all…
jscgh Oct 22, 2024
2eae3c1
Ran nf-core schema build
jscgh Oct 22, 2024
1352649
Merged with dev
jscgh Oct 22, 2024
f07c612
Ran nf-core schema build
jscgh Oct 22, 2024
637d67c
Dealing with permissions
jscgh Oct 22, 2024
1dd9f03
Readding directory
jscgh Oct 22, 2024
09c64fa
modified: nextflow_schema.json
jscgh Oct 22, 2024
3dd9367
Removed deprecated "check_max"
jscgh Oct 22, 2024
0fb9735
Aligning input channels for RFAA
jscgh Oct 23, 2024
125a702
Aligning input channels for RFAA
jscgh Oct 23, 2024
ef6f516
Runs through rfaa -profile test and -stub successfully
jscgh Oct 23, 2024
1f6e1dd
modified: modules/local/run_rosettafold_all_atom.nf
jscgh Oct 23, 2024
4b68ed8
Debugging RFAA
jscgh Oct 28, 2024
0eba56d
Debugging RFAA
jscgh Oct 28, 2024
9a15bdf
RFAA now working to produce structures
jscgh Oct 29, 2024
51878df
Modified rfaa output to properly emit PDB file
jscgh Oct 30, 2024
076db5a
Fixed renaming pdb
jscgh Oct 30, 2024
c523139
Pipeline now completes successfully
jscgh Nov 1, 2024
70e8b6b
Cleaned up test configs
jscgh Nov 1, 2024
902ebaf
Built schema as per CONTRIBUTING.md
jscgh Nov 1, 2024
8379c50
Fixed db conflicts
jscgh Nov 1, 2024
7c9cf19
Troubleshooting benchmarks and having jobs queued and run by nextflow
jscgh Nov 4, 2024
7dd2e45
Removed leftover blast-2.2.6 references
jscgh Nov 4, 2024
a7aa7eb
Updated nextflow_schema
jscgh Nov 4, 2024
cbb7841
Katana HPC gpu compute option
jscgh Nov 4, 2024
43f7364
Fixing crashes caused by HPC not being able to reach the online custo…
jscgh Nov 4, 2024
e96e175
Ran nf-core linter
jscgh Nov 4, 2024
0c173e5
deleted: .github/workflows/linting_comment.yml
jscgh Nov 4, 2024
b58be9f
Linting files
jscgh Nov 4, 2024
11a2d9f
Genericised pdb emission
jscgh Nov 5, 2024
2558ee8
Fixing linting
jscgh Nov 5, 2024
37917d5
More linting
jscgh Nov 5, 2024
a19d4d4
Updated apptainer image paths
jscgh Nov 6, 2024
8cc32e1
Updated rosettafold_all_atom files
jscgh Nov 15, 2024
b78cdee
Updated pathing
jscgh Nov 15, 2024
122f05f
Merge remote-tracking branch 'upstream/dev' into add-rosettafold-all-…
jscgh Nov 20, 2024
1a710b1
Aligned with nf-core/dev
jscgh Nov 20, 2024
de835f9
Aligned RoseTTAFold-All-Atom module to dev base
jscgh Nov 20, 2024
ed1cf3a
Aligned RoseTTAFold-All-Atom module to dev base
jscgh Nov 20, 2024
11347c9
Aligned RoseTTAFold-All-Atom module to dev base
jscgh Nov 20, 2024
d4c1cbc
Fixed tests
jscgh Nov 20, 2024
5fcf98b
Fixed tests
jscgh Nov 20, 2024
0050033
Updated CHANGELOG started on other docs
jscgh Nov 20, 2024
b3582d7
Updated CHANGELOG started on other docs
jscgh Nov 20, 2024
6c62771
Working multiqc for RFAA
jscgh Nov 22, 2024
aaafcd2
RFAA dbs will now be downloaded by prepare_rosettafold_all_atom_dbs.nf
jscgh Nov 27, 2024
b2616f6
Added docs
jscgh Nov 27, 2024
4144328
Resolved conflict
jscgh Nov 27, 2024
fa6ef41
Merge remote-tracking branch 'upstream/dev' into add-rosettafold-all-…
jscgh Nov 27, 2024
5939ce6
Merged with nf-core/dev
jscgh Nov 27, 2024
362d06b
Passed linting
jscgh Nov 27, 2024
e1bcc66
Improved multiqc processing to fix RFAA decimals and added model to g…
jscgh Nov 28, 2024
4eef3db
Removed unrelated add-helixfold file
jscgh Nov 28, 2024
3dc1448
Merge remote-tracking branch 'upstream/dev' into add-rosettafold-all-…
jscgh Nov 29, 2024
dcaf470
Updated RFAA definition file
jscgh Dec 1, 2024
1dfe3a3
modified: dockerfiles/rosettafold_all_atom.def
jscgh Dec 2, 2024
0883a33
Added RFAA dockerfile
jscgh Dec 4, 2024
409da08
Prettier
jscgh Dec 5, 2024
ca466e6
Updated container path to repo
jscgh Dec 9, 2024
3db3aa0
Linted
jscgh Dec 10, 2024
6f04a07
Fixed LD_LIBRARY_PATH in RFAA dockerfile
jscgh Dec 10, 2024
bf08a5b
Linted
jscgh Dec 10, 2024
150e449
Linted
jscgh Dec 11, 2024
16fd780
Merge branch 'dev' into add-rosettafold-all-atom
jscgh Jan 10, 2025
6 changes: 3 additions & 3 deletions .github/CONTRIBUTING.md
@@ -29,7 +29,7 @@ If you're not used to this workflow with git, you can start with some [docs from
You have the option to test your changes locally by running the pipeline. For receiving warnings about process selectors and other `debug` information, it is recommended to use the debug profile. Execute all the tests with the following command:

```bash
nextflow run . --profile debug,test,docker --outdir <OUTDIR>
nextflow run . -profile debug,test,docker --outdir <OUTDIR>
```

When you create a pull request with changes, [GitHub Actions](https://github.com/features/actions) will run automatic tests.
@@ -78,8 +78,8 @@ If you wish to contribute a new step, please use the following coding standards:
5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool).
6. Add sanity checks and validation for all relevant parameters.
7. Perform local tests to validate that the new code works as expected.
8. If applicable, add a new test command in `.github/workflow/ci.yml`.
9. Update MultiQC config `assets/multiqc_config.yml` so relevant suffixes, file name clean up and module plots are in the appropriate order. If applicable, add a [MultiQC](https://https://multiqc.info/) module.
8. If applicable, add a new test command in `.github/workflows/ci.yml`.
9. Update MultiQC config `assets/multiqc_config.yml` so relevant suffixes, file name clean up and module plots are in the appropriate order. If applicable, add a [MultiQC](https://multiqc.info/) module.
10. Add a description of the output files and if relevant any appropriate images from the MultiQC report to `docs/output.md`.

### Default values
1 change: 1 addition & 0 deletions .github/workflows/ci.yml
@@ -44,6 +44,7 @@ jobs:
- "test_colabfold_download"
- "test_esmfold"
- "test_split_fasta"
- "test_rosettafold_all_atom"
isMaster:
- ${{ github.base_ref == 'master' }}
# Exclude conda and singularity on dev
6 changes: 4 additions & 2 deletions CHANGELOG.md
@@ -13,8 +13,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- [[#180](https://github.com/nf-core/proteinfold/issues/180)] - Implement Foldseek.
- [[#188](https://github.com/nf-core/proteinfold/issues/188)] - Fix colabfold image to run in gpus.
- [[PR #205](https://github.com/nf-core/proteinfold/pull/205)] - Change input schema from `sequence,fasta` to `id,fasta`.
- [[PR #210](https://github.com/nf-core/proteinfold/pull/210)] - Moving post-processing logic to a subworkflow, change wave images pointing to oras to point to https and refactor module to match nf-core folder structure.
- [[#214](https://github.com/nf-core/proteinfold/issues/214)] - Fix colabfold image to run in cpus after [#188](https://github.com/nf-core/proteinfold/issues/188) fix.
- [[PR #210](https://github.com/nf-core/proteinfold/pull/210)]- Moving post-processing logic to a subworkflow, change wave images pointing to oras to point to https and refactor module to match nf-core folder structure.
- [[#214](https://github.com/nf-core/proteinfold/issues/214)]- Fix colabfold image to run in cpus after [#188](https://github.com/nf-core/proteinfold/issues/188) fix.
- [[PR #220](https://github.com/nf-core/proteinfold/pull/220)] - Add RoseTTAFold-All-Atom module.
- [[#235](https://github.com/nf-core/proteinfold/issues/235)] - Update samplesheet to new version (switch from `sequence` column to `id`).

## [[1.1.1](https://github.com/nf-core/proteinfold/releases/tag/1.1.1)] - 2025-07-30
@@ -106,6 +107,7 @@ Thank you to everyone else that has contributed by reporting bugs, enhancements
| | `--esm2_t36_3B_UR50D_contact_regression` |
| | `--esmfold_params_path` |
| | `--skip_multiqc` |
| | `--rosettafold_all_atom_db` |

> **NB:** Parameter has been **updated** if both old and new parameter information is present.
> **NB:** Parameter has been **added** if just the new parameter information is present.
16 changes: 15 additions & 1 deletion README.md
@@ -39,6 +39,8 @@ On release, automated continuous integration tests run the pipeline on a full-si

v. [ESMFold](https://github.com/facebookresearch/esm) - Regular ESM

vi. [RoseTTAFold-All-Atom](https://github.com/baker-laboratory/RoseTTAFold-All-Atom/) - Regular RFAA

## Usage

> [!NOTE]
@@ -53,7 +55,7 @@ nextflow run nf-core/proteinfold \
--outdir <OUTDIR>
```

The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold or ESMFold. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`] or [`--esmfold_db`]. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you need to provide for each of the databases.
The pipeline takes care of downloading the databases and parameters required by AlphaFold2, Colabfold, ESMFold or RoseTTAFold-All-Atom. In case you have already downloaded the required files, you can skip this step by providing the path to the databases using the corresponding parameter [`--alphafold2_db`], [`--colabfold_db`], [`--esmfold_db`] or [`--rosettafold_all_atom_db`]. Please refer to the [usage documentation](https://nf-co.re/proteinfold/usage) to check the directory structure you need to provide for each of the databases.

- The typical command to run AlphaFold2 mode is shown below:

@@ -136,6 +138,18 @@ The pipeline takes care of downloading the databases and parameters required by
-profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```

- The rosettafold_all_atom mode can be run using the command below:

```console
nextflow run nf-core/proteinfold \
--input samplesheet.csv \
--outdir <OUTDIR> \
--mode rosettafold_all_atom \
--rosettafold_all_atom_db <null (default) | PATH> \
--use_gpu <true/false> \
-profile <docker/singularity/podman/shifter/charliecloud/conda/institute>
```

> [!WARNING]
> Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_; see [docs](https://nf-co.re/docs/usage/getting_started/configuration#custom-configuration-files).

13 changes: 10 additions & 3 deletions assets/schema_input.json
@@ -7,6 +7,12 @@
"items": {
"type": "object",
"properties": {
"sequence": {
"type": "string",
"pattern": "^\\S+$",
"errorMessage": "Sequence name must be provided and cannot contain spaces",
"meta": ["sequence"]
},
"id": {
"type": "string",
"pattern": "^\\S+$",
@@ -17,10 +23,11 @@
"type": "string",
"format": "file-path",
"exists": true,
"pattern": "^\\S+\\.fa(sta)?$",
"errorMessage": "Fasta file must be provided, cannot contain spaces and must have extension '.fa' or '.fasta'"
"pattern": "^\\S+\\.(fa(sta)?|yaml|yml|json)$",
"errorMessage": "Fasta, yaml or json file must be provided, cannot contain spaces and must have extension '.fa', '.fasta', '.yaml', '.yml', or '.json'"
}
},
"required": ["id", "fasta"]
"required": ["fasta"],
"anyOf": [{ "required": ["sequence"] }, { "required": ["id"] }]
}
}
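
As a rough illustration of what the updated `assets/schema_input.json` above accepts, the sketch below re-implements its two new rules in plain Python: the widened `fasta` pattern and the `anyOf` requirement that a row carries either `sequence` or `id`. The function name `check_row` is hypothetical, the `"exists": true` file check is deliberately left out, and the pipeline itself relies on JSON-schema validation rather than on code like this.

```python
import re

# Pattern copied from the "fasta" property in the updated schema.
FASTA_PATTERN = re.compile(r"^\S+\.(fa(sta)?|yaml|yml|json)$")
# Pattern shared by the "sequence" and "id" properties.
NAME_PATTERN = re.compile(r"^\S+$")


def check_row(row):
    """Return a list of problems for one samplesheet row (hypothetical helper)."""
    problems = []
    if not FASTA_PATTERN.match(row.get("fasta", "")):
        problems.append(
            "fasta must end in .fa, .fasta, .yaml, .yml or .json and contain no spaces"
        )
    # "anyOf": at least one of "sequence" or "id" has to be present.
    names = [row[key] for key in ("sequence", "id") if row.get(key)]
    if not names:
        problems.append("either 'sequence' or 'id' must be provided")
    for name in names:
        if not NAME_PATTERN.match(name):
            problems.append(f"name {name!r} cannot contain spaces")
    return problems


if __name__ == "__main__":
    print(check_row({"id": "T1024", "fasta": "inputs/T1024.yaml"}))
    print(check_row({"fasta": "inputs/T1024.pdb"}))
```

An empty list means the row passes both rules. Each string in a non-empty list names one violated rule.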
1 change: 1 addition & 0 deletions bin/generate_report.py
@@ -307,6 +307,7 @@ def pdb_to_lddt(pdb_files, generate_tsv):
"esmfold": "ESMFold",
"alphafold2": "AlphaFold2",
"colabfold": "ColabFold",
"rosettafold_all_atom": "Rosettafold_All_Atom",
}

parser = argparse.ArgumentParser()
12 changes: 12 additions & 0 deletions conf/dbs.config
@@ -48,6 +48,18 @@ params {
"alphafold2_ptm" : "alphafold_params_2021-07-14"
]

// RoseTTAFold_All_Atom links
uniref30_rosettafold_all_atom_link = 'http://wwwuser.gwdg.de/~compbiol/uniclust/2020_06/UniRef30_2020_06_hhsuite.tar.gz'
pdb100_rosettafold_all_atom_link = 'https://files.ipd.uw.edu/pub/RoseTTAFold/pdb100_2021Mar03.tar.gz'
bfd_rosettafold_all_atom_link = 'https://bfd.mmseqs.com/bfd_metaclust_clu_complete_id30_c90_final_seq.sorted_opt.tar.gz'
rfaa_paper_weights_link = 'http://files.ipd.uw.edu/pub/RF-All-Atom/weights/RFAA_paper_weights.pt'

// RoseTTAFold_All_Atom paths
uniref30_rosettafold_all_atom_path = "${params.rosettafold_all_atom_db}/uniref30/UniRef30_2020_06/*"
pdb100_rosettafold_all_atom_path = "${params.rosettafold_all_atom_db}/pdb100_2021Mar03/*"
bfd_rosettafold_all_atom_path = "${params.rosettafold_all_atom_db}/bfd/*"
rfaa_paper_weights_path = "${params.rosettafold_all_atom_db}/RFAA_paper_weights.pt"

// Esmfold links
esmfold_3B_v1 = 'https://dl.fbaipublicfiles.com/fair-esm/models/esmfold_3B_v1.pt'
esm2_t36_3B_UR50D = 'https://dl.fbaipublicfiles.com/fair-esm/models/esm2_t36_3B_UR50D.pt'
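
The RoseTTAFold-All-Atom paths added in this hunk imply a particular layout under `--rosettafold_all_atom_db`. A minimal Python sketch, assuming only the four `*_path` entries shown above, can report which of them are missing before a run is launched. `missing_rfaa_db_entries` is a hypothetical helper and not part of the pipeline.

```python
from pathlib import Path

# Relative locations taken from the *_path entries in conf/dbs.config.
# The trailing "/*" globs are dropped, so the directories are checked instead.
EXPECTED_ENTRIES = (
    "uniref30/UniRef30_2020_06",
    "pdb100_2021Mar03",
    "bfd",
    "RFAA_paper_weights.pt",
)


def missing_rfaa_db_entries(db_root):
    """Return the expected entries that do not exist under db_root."""
    root = Path(db_root)
    return [entry for entry in EXPECTED_ENTRIES if not (root / entry).exists()]


if __name__ == "__main__":
    import sys

    db_root = sys.argv[1] if len(sys.argv) > 1 else "."
    for entry in missing_rfaa_db_entries(db_root):
        print(f"missing: {entry}")
```

The check only looks at existence. It does not verify that the database files inside each directory are complete.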
22 changes: 22 additions & 0 deletions conf/modules_rosettafold_all_atom.config
@@ -0,0 +1,22 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Config file for defining DSL2 per module options and publishing paths
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Available keys to override module options:
ext.args = Additional arguments appended to command in module.
ext.args2 = Second set of arguments appended to command in module (multi-tool modules).
ext.args3 = Third set of arguments appended to command in module (multi-tool modules).
ext.prefix = File name prefix for output files.
----------------------------------------------------------------------------------------
*/

process {
withName: 'NFCORE_PROTEINFOLD:ROSETTAFOLD_ALL_ATOM:MULTIQC' {
publishDir = [
path: { "${params.outdir}/multiqc" },
mode: 'copy',
saveAs: { filename -> filename.equals('versions.yml') ? null : "rosettafold_all_atom_$filename" }
]
}

}
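
To make the `saveAs` closure above concrete, here is the same renaming rule as a small Python function. This is an illustrative translation only (the name `publish_name` is made up); the real logic is the Groovy closure in the config.

```python
def publish_name(filename):
    """Mirror the saveAs rule: skip versions.yml, prefix everything else."""
    if filename == "versions.yml":
        return None  # a null result tells Nextflow not to publish the file
    return f"rosettafold_all_atom_{filename}"


if __name__ == "__main__":
    for name in ("versions.yml", "multiqc_report.html"):
        print(name, "->", publish_name(name))
```

The prefix keeps MultiQC outputs from different modes apart when they are published to the same `multiqc` directory.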
37 changes: 37 additions & 0 deletions conf/test_rosettafold_all_atom.config
@@ -0,0 +1,37 @@
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Nextflow config file for running minimal tests
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Defines input files and everything required to run a fast and simple pipeline test.
Use as follows:
nextflow run nf-core/proteinfold -profile test_rosettafold_all_atom,<docker/singularity> --outdir <OUTDIR>
----------------------------------------------------------------------------------------
*/

stubRun = true

// Limit resources so that this can run on GitHub Actions
process {
resourceLimits = [
cpus: 4,
memory: '15.GB',
time: '1.h'
]
}

params {
config_profile_name = 'Test profile'
config_profile_description = 'Minimal test dataset to check pipeline function'

// Input data to test rosettafold_all_atom
mode = 'rosettafold_all_atom'
rosettafold_all_atom_db = "${projectDir}/assets/dummy_db_dir"
input = params.pipelines_testdata_base_path + 'proteinfold/testdata/samplesheet/v1.0/samplesheet.csv'
}

process {
withName: 'RUN_ROSETTAFOLD_ALL_ATOM' {
container = '/srv/scratch/sbf-pipelines/proteinfold/singularity/rosettafold_all_atom.sif'
}
}

35 changes: 35 additions & 0 deletions dockerfiles/Dockerfile_nfcore-proteinfold_rosettafold_all_atom
@@ -0,0 +1,35 @@
FROM nvidia/cuda:12.6.0-cudnn-devel-ubuntu24.04

LABEL Author="[email protected]" \
title="nfcore/proteinfold_rosettafold_all_atom" \
Version="0.9.0" \
description="Docker image containing all software requirements to run the RUN_ROSETTAFOLD_ALL_ATOM module using the nf-core/proteinfold pipeline"

ENV PYTHONPATH="/app/RoseTTAFold-All-Atom" \
PATH="/conda/bin:/app/RoseTTAFold-All-Atom:$PATH" \
DGLBACKEND="pytorch" \
LD_LIBRARY_PATH="/conda/lib:/usr/local/cuda-12.6/lib64:$LD_LIBRARY_PATH"

RUN apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y wget git && \
wget -q -P /tmp "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" && \
bash /tmp/Miniforge3-$(uname)-$(uname -m).sh -b -p /conda && \
rm -rf /tmp/Miniforge3-$(uname)-$(uname -m).sh /var/lib/apt/lists/* && \
apt-get autoremove -y && apt-get clean -y

RUN git clone --single-branch --depth 1 https://github.com/Australian-Structural-Biology-Computing/RoseTTAFold-All-Atom.git /app/RoseTTAFold-All-Atom && \
cd /app/RoseTTAFold-All-Atom && \
/conda/bin/mamba env create --file=environment.yaml && \
/conda/bin/mamba run -n RFAA bash -c \
"python /app/RoseTTAFold-All-Atom/rf2aa/SE3Transformer/setup.py install && \
bash /app/RoseTTAFold-All-Atom/install_dependencies.sh" && \
/conda/bin/mamba clean --all --force-pkgs-dirs -y

RUN cd /app/RoseTTAFold-All-Atom && \
wget https://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.26/blast-2.2.26-x64-linux.tar.gz && \
mkdir -p blast-2.2.26 && \
tar -xf blast-2.2.26-x64-linux.tar.gz -C blast-2.2.26 && \
cp -r blast-2.2.26/blast-2.2.26/ blast-2.2.26_bk && \
rm -r blast-2.2.26 && \
mv blast-2.2.26_bk/ blast-2.2.26 && \
rm -rf /root/.cache *.tar.gz && \
apt-get autoremove -y && apt-get remove --purge -y wget git && apt-get clean -y
41 changes: 41 additions & 0 deletions dockerfiles/rosettafold_all_atom.def
@@ -0,0 +1,41 @@
Bootstrap: docker
From: nvidia/cuda:12.4.1-cudnn-runtime-ubuntu22.04

%labels
Author [email protected]
Version 0.2.5

%post
apt update && DEBIAN_FRONTEND=noninteractive apt install --no-install-recommends -y wget git build-essential

wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
bash Miniforge3-Linux-x86_64.sh -b -p /opt/miniforge
rm Miniforge3-Linux-x86_64.sh
export PATH="/opt/miniforge/bin:$PATH"

git clone --single-branch --depth 1 https://github.com/Australian-Structural-Biology-Computing/RoseTTAFold-All-Atom.git /app/RoseTTAFold-All-Atom
cd /app/RoseTTAFold-All-Atom
mamba env create --file=environment.yaml

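# Run both install steps through a shell, as the Dockerfile does; `mamba run` expects a command, not a quoted script
mamba run -n RFAA bash -c \
'python rf2aa/SE3Transformer/setup.py install && \
bash install_dependencies.sh'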

wget https://ftp.ncbi.nlm.nih.gov/blast/executables/legacy.NOTSUPPORTED/2.2.26/blast-2.2.26-x64-linux.tar.gz
mkdir -p blast-2.2.26
tar -xf blast-2.2.26-x64-linux.tar.gz -C blast-2.2.26
cp -r blast-2.2.26/blast-2.2.26/ blast-2.2.26_bk
rm -r blast-2.2.26
mv blast-2.2.26_bk/ blast-2.2.26

apt autoremove -y && apt remove --purge -y wget git build-essential && apt clean -y
rm -rf /var/lib/apt/lists/* /root/.cache *.tar.gz
mamba clean --all --force-pkgs-dirs -y

%environment
export PYTHONPATH="/app/RoseTTAFold-All-Atom:$PYTHONPATH"
export PATH="/opt/miniforge/bin:/app/RoseTTAFold-All-Atom:$PATH"
export DGLBACKEND="pytorch"

%runscript
mamba run --name RFAA python -m rf2aa.run_inference --config-name "$@"
14 changes: 14 additions & 0 deletions docs/output.md
@@ -13,6 +13,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io/) and predicts pr
- [AlphaFold2](https://github.com/deepmind/alphafold)
- [ColabFold](https://github.com/sokrypton/ColabFold) - MMseqs2 (API server or local search) followed by ColabFold
- [ESMFold](https://github.com/facebookresearch/esm)
- [RoseTTAFold-All-Atom](https://github.com/baker-laboratory/RoseTTAFold-All-Atom/)

See main [README.md](https://github.com/nf-core/proteinfold/blob/master/README.md) for a condensed overview of the steps in the pipeline, and the bioinformatics tools used at each step.

@@ -176,6 +177,19 @@ Below you can find an indicative example of the TSV file with the pLDDT scores p
| 49 | CB | VAL | 7 | 52.74 |
| 50 | O | VAL | 7 | 56.46 |

### RoseTTAFold-All-Atom

<details markdown="1">
<summary>Output files</summary>

- `run/`
- `<SEQUENCE NAME>_rosettafold_all_atom.pdb` that is the structure with the highest pLDDT score (ranked first)
- `<SEQUENCE NAME>_plddt_mqc.tsv` that presents the pLDDT scores per residue for the predicted model
- `<SEQUENCE NAME>_aux.pt` that is a PyTorch file storing the confidence metrics (it can be loaded with `torch.load(file, map_location="cpu")`)
- `<SEQUENCE NAME>/` that contains the computed MSAs and the prediction metadata

</details>
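
For readers who want per-residue confidence straight from the ranked PDB file, a minimal Python sketch follows. It assumes that pLDDT is stored in the B-factor column, which is the AlphaFold-style convention and is not confirmed here for RoseTTAFold-All-Atom (its values may also be on a 0-1 rather than a 0-100 scale). The sample records reuse the pLDDT values from the TSV example above with made-up coordinates, and `mean_bfactor_per_residue` is a hypothetical helper.

```python
# Three ATOM records in fixed-width PDB layout; coordinates are invented.
SAMPLE_PDB = "\n".join(
    (
        "ATOM     49  CB  VAL A   7      10.000  10.000  10.000  1.00 52.74",
        "ATOM     50  O   VAL A   7      11.000  10.000  10.000  1.00 56.46",
        "ATOM     51  N   LEU A   8      12.000  10.000  10.000  1.00 60.00",
    )
)


def mean_bfactor_per_residue(pdb_text):
    """Average the B-factor column over the atoms of each residue."""
    per_residue = {}
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        # PDB columns (1-based): chain 22, residue number 23-26,
        # residue name 18-20, temperature factor 61-66.
        key = (line[21], int(line[22:26]), line[17:20].strip())
        per_residue.setdefault(key, []).append(float(line[60:66]))
    return {key: sum(values) / len(values) for key, values in per_residue.items()}


if __name__ == "__main__":
    for (chain, number, name), score in mean_bfactor_per_residue(SAMPLE_PDB).items():
        print(f"{chain} {name}{number}: {score:.2f}")
```

Fixed column slices are used instead of `str.split()` because adjacent PDB fields are not always separated by whitespace.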

### MultiQC report

<details markdown="1">
14 changes: 13 additions & 1 deletion docs/usage.md
@@ -39,7 +39,7 @@ Each FASTA file should contain a single protein sequence unless using multimer m

## Running the pipeline

The typical commands for running the pipeline on AlphaFold2, Colabfold and ESMFold modes are shown below.
The typical commands for running the pipeline on AlphaFold2, Colabfold, ESMFold and RoseTTAFold-All-Atom modes are shown below.

> You can run any combination of the models by providing them to the `--mode` parameter separated by a comma. For example: `--mode alphafold2,esmfold,colabfold` will run the three models in parallel.

@@ -428,6 +428,18 @@ If you specify the `--esmfold_db <PATH>` parameter, the directory structure of y

This will launch the pipeline with the `docker` configuration profile. See below for more information about profiles.

The RoseTTAFold-All-Atom mode can be run using this command:

```bash
nextflow run nf-core/proteinfold \
--input samplesheet.csv \
--outdir <OUTDIR> \
--mode rosettafold_all_atom \
--rosettafold_all_atom_db <null (default) | DB_PATH> \
--use_gpu <true/false> \
-profile <docker/singularity/.../institute>
```

Note that the pipeline will create the following files in your working directory:

```bash