chore: move config accesses in wrappers to params #536

Draft. Wants to merge 67 commits into base: main.

Commits
56db085
chore: alfred/qc: move config access to params
tedil Aug 2, 2024
d0c8d2c
chore: remove alfred/qc_external wrapper (unused)
tedil Aug 2, 2024
bd25fab
chore: arriba/run: move config access to params
tedil Aug 2, 2024
d55a86b
chore: arriba/run: move config access to params (pt 2)
tedil Aug 2, 2024
affc68b
chore: ascat/*: move config access to params
tedil Aug 2, 2024
59d5f32
chore: baf_file_generation: move config access to params
tedil Aug 2, 2024
ed6e0ca
chore: bbduk: move config access to params (might need str conversions)
tedil Aug 5, 2024
f9fe113
chore: {bcftools,platypus}/call_joint: move config access to params
tedil Aug 5, 2024
260cdd2
chore: bcftools/filter: rename params to args for consistency
tedil Aug 5, 2024
9d40314
chore: bcftools/gvcf_to_vcf: move config access to params
tedil Aug 5, 2024
d0a4163
chore: bcftools/heterozygous_variants: move config access to params
tedil Aug 5, 2024
5ad8316
chore: bcftools/merge_snv_vcf: move config access to params
tedil Aug 5, 2024
6c11e19
chore: bcftools/merge_vcf: move config access to params
tedil Aug 5, 2024
842225f
chore: bcftools/pileups: move config access to params
tedil Aug 5, 2024
6bcbcf3
chore: bcftools/protected: use snakemake type hints
tedil Aug 6, 2024
5aea5fd
chore: bcftools/regions: use snakemake type hints
tedil Aug 6, 2024
5040037
chore: bcftools/stats: remove unused wrapper
tedil Aug 6, 2024
a3d5ec7
chore: bcftools/TMB: move config access to params
tedil Aug 6, 2024
1c6b52f
chore: bcftools_call: move config access to params
tedil Aug 6, 2024
c4fca92
chore: bcftools_roh: move config access to params
tedil Aug 6, 2024
1d79c9b
chore: only import snakemake.script when type checking
tedil Aug 6, 2024
f8c9c9e
fix: alfred/qc incorrect param name
tedil Aug 6, 2024
1c136b9
chore: bed_jaccard_operations: import snakemake.script.snakemake for …
tedil Aug 6, 2024
a33de05
chore: bed_venn: import snakemake.script.snakemake for type hints
tedil Aug 6, 2024
dac4295
chore: bwa: move config access to params
tedil Aug 7, 2024
31821ae
chore: bwa_mem2: move config access to params
tedil Aug 7, 2024
aad0e80
chore: canvas/somatic_wgs: move config access to params
tedil Aug 7, 2024
6b7994a
chore: cbioportal/case_lists: move config access to params
tedil Aug 7, 2024
c5b1430
chore: cbioportal/clinical_data: move config access to params
tedil Aug 7, 2024
a086273
add FIXME for cbioportal/clinical_data wrapper+model combo
tedil Aug 7, 2024
6ace434
chore: defuse: move config access to params
ericblanc20 Dec 2, 2024
9dc8543
chore: delly2: move config to params (germline_cnv & somatic not used…
ericblanc20 Dec 3, 2024
38cfec0
chore: dkfz & eb_filter: move config access to params (params to args…
ericblanc20 Dec 3, 2024
38d9f2c
chore: fastp: move config access to params (with test of adapter trim…
ericblanc20 Dec 3, 2024
1e300f1
chore: expansionhunter: move config to params & input
ericblanc20 Dec 3, 2024
55b333d
chore: gatk3, gatk4 gatk_phase_by_transmission
ericblanc20 Jan 21, 2025
c28cc39
chore: gcnv
ericblanc20 Jan 21, 2025
bb1d66c
chore: gcnv once more
ericblanc20 Jan 21, 2025
87350f9
chore: featurecounts
ericblanc20 Jan 21, 2025
3990564
chore: cnvetti & control_freec
ericblanc20 Jan 21, 2025
ed52461
chore: maelstrom
ericblanc20 Jan 21, 2025
6b93fc9
chore: manta
ericblanc20 Jan 21, 2025
a1c40da
chore: mantis2
ericblanc20 Jan 21, 2025
7c75009
chore: mbcs
ericblanc20 Jan 21, 2025
67d3b35
chore: mehari
ericblanc20 Jan 21, 2025
5d9d77b
chore: melt
ericblanc20 Jan 21, 2025
13fb144
chore: ngs_chew
ericblanc20 Jan 21, 2025
ee4f8c2
chore: optitype
ericblanc20 Jan 21, 2025
2498fa1
chore: picard
ericblanc20 Jan 21, 2025
628ce2e
chore: pizzly
ericblanc20 Jan 21, 2025
5181d1b
chore: popdel
ericblanc20 Jan 21, 2025
046ca8b
chore: rnaqc
ericblanc20 Jan 21, 2025
77d21a5
chore: rseqc
ericblanc20 Jan 21, 2025
ef7b8f7
chore: salmon
ericblanc20 Jan 21, 2025
4258edb
chore: scalpel
ericblanc20 Jan 21, 2025
5dd5228
chore: scarHRD
ericblanc20 Jan 21, 2025
23ea2ae
chore: sequenza
ericblanc20 Jan 21, 2025
31c4655
chore: somatic_cnv_checking (& friends in bcftools & vcfpy)
ericblanc20 Jan 23, 2025
e91afcf
chore: somatic_variant_filtration
ericblanc20 Jan 23, 2025
19ad956
chore: star
ericblanc20 Jan 23, 2025
627b2d9
chore: starfusion
ericblanc20 Jan 23, 2025
998a72c
chore: strelka2
ericblanc20 Jan 23, 2025
80238e9
chore: varfish_annotator
ericblanc20 Jan 23, 2025
c9b6d6f
style: make ruff check happy
ericblanc20 Jan 23, 2025
925b414
chore: variant_filtration
ericblanc20 Jan 23, 2025
45a0ddf
style: make ruff happy
ericblanc20 Jan 24, 2025
d401ab4
chore: vep
ericblanc20 Jan 24, 2025
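
The commit messages above describe one recurring refactoring: values that a wrapper used to read from the global configuration are collected by the step part and handed to the wrapper through `params`. A minimal sketch of that idea follows. `DemoStepPart`, `demo_wrapper` and the fake `snakemake` object are hypothetical stand-ins for illustration, not classes from snappy_pipeline, and the wrapper side is an assumption because no wrapper code appears in this part of the diff.

```python
from types import SimpleNamespace
from typing import Any


class DemoStepPart:
    """Hypothetical step part that decides which values a wrapper receives."""

    def __init__(self, config: dict[str, Any], reference: str) -> None:
        self.config = config
        self.reference = reference

    def get_args(self, action: str) -> dict[str, Any]:
        # Only the values the wrapper needs are passed on, not the whole config.
        if action != "run":
            raise ValueError(f"Unsupported action: {action}")
        return {
            "reference": self.reference,
            "num_threads": self.config["num_threads"],
        }


def demo_wrapper(snakemake: Any) -> str:
    # Assumed wrapper side: read from params["args"] instead of snakemake.config.
    args = snakemake.params["args"]
    return f"tool --ref {args['reference']} --threads {args['num_threads']}"


part = DemoStepPart({"num_threads": 4, "unused": "x"}, "/path/to/ref.fa")
# Stand-in for the object Snakemake injects into a wrapper script.
fake_snakemake = SimpleNamespace(params={"args": part.get_args("run")})
command = demo_wrapper(fake_snakemake)
```

With this layout the wrapper's inputs are visible in the rule definition, which is presumably the motivation for the change.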
1 change: 1 addition & 0 deletions snappy_pipeline/workflows/adapter_trimming/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -237,6 +237,7 @@ def args_function(wildcards):
"reads_left": {key: reads_left[key] for key in sorted(reads_left.keys())},
"reads_right": {key: reads_right[key] for key in sorted(reads_right.keys())},
},
"config": dict(self.config.get(self.name)),
}

# Validate action
Expand Down
6 changes: 3 additions & 3 deletions snappy_pipeline/workflows/adapter_trimming/model.py
Original file line number Diff line number Diff line change
Expand Up @@ -96,7 +96,7 @@ class UmiLoc(Enum):


class Fastp(SnappyModel):
num_threads: int = 0
num_threads: int = 4
trim_front1: int = 0
"""
trimming how many bases in front for read1, default is 0 (int [=0])
Expand Down Expand Up @@ -361,9 +361,9 @@ class Bbduk(SnappyModel):
Field(
examples=[
[
"/fast/work/groups/cubi/projects/biotools/static_data/app_support/"
"/data/cephfs-1/work/groups/cubi/projects/biotools/static_data/app_support/"
"bbtools/39.01/resources/adapters.fa",
"/fast/work/groups/cubi/projects/biotools/static_data/app_support/"
"/data/cephfs-1/work/groups/cubi/projects/biotools/static_data/app_support/"
"bbtools/39.01/resources/phix174_ill.ref.fa.gz",
]
]
Expand Down
3 changes: 3 additions & 0 deletions snappy_pipeline/workflows/cbioportal_export/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -602,6 +602,8 @@ def get_args(self, action):
# Multiple libraries should not be returned by _yield_libraries
assert extraction_type not in donors[donor_name][sample_name]
donors[donor_name][sample_name][extraction_type] = lib.name
assert "__config" not in donors.keys(), "__config is a reserved key, not a valid donor"
donors["__config"] = dict(self.config)
return donors

@dictify
Expand Down Expand Up @@ -654,6 +656,7 @@ def get_args(self, action):
for sample_name in args["cnaseq"]["samples"]:
if sample_name in args["rna_seq_mrna"]["samples"]:
args["3way_complete"]["samples"] += [sample_name]
args["__cancer_study_id"] = self.config.study.cancer_study_identifier
return args

@dictify
Expand Down
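
The `__config` entry above shares a dictionary with donor names, so the consumer has to separate the two again. A minimal sketch of that round trip, assuming the consumer simply filters out the reserved key: `attach_config` and `split_config` are hypothetical helper names, not functions from this PR. The sketch raises `ValueError` rather than using `assert`, because assertions are skipped when Python runs with `-O`.

```python
from typing import Any

RESERVED_KEY = "__config"


def attach_config(donors: dict[str, Any], config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of ``donors`` with the configuration stored under the reserved key."""
    if RESERVED_KEY in donors:
        raise ValueError(f"{RESERVED_KEY} is a reserved key, not a valid donor")
    result = dict(donors)
    result[RESERVED_KEY] = dict(config)
    return result


def split_config(args: dict[str, Any]) -> tuple[dict[str, Any], dict[str, Any]]:
    """Separate the donor entries from the configuration again (assumed consumer side)."""
    donors = {key: value for key, value in args.items() if key != RESERVED_KEY}
    return donors, args[RESERVED_KEY]


args = attach_config({"donor1": {"sample1": {"DNA": "lib1"}}}, {"study": "demo"})
donors, config = split_config(args)
```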
2 changes: 2 additions & 0 deletions snappy_pipeline/workflows/cbioportal_export/model.py
Original file line number Diff line number Diff line change
Expand Up @@ -177,6 +177,8 @@ class CbioportalExport(SnappyStepModel):
datatype="NUMBER",
priority="2",
column="TMB",
# FIXME: the cbioportal/clinical_data wrapper mentions a key named "path",
# which seems to be mandatory but is not listed here
)
}
],
Expand Down
9 changes: 9 additions & 0 deletions snappy_pipeline/workflows/common/delly.py
Original file line number Diff line number Diff line change
Expand Up @@ -53,11 +53,20 @@ def __init__(self, parent):
for sheet in self.parent.shortcut_sheets:
self.donor_ngs_library_to_pedigree.update(sheet.donor_ngs_library_to_pedigree)

def get_args(self, action):
# Validate action
self._validate_action(action)
return {
"genome": self.w_config.static_data_config.reference.path,
"config": dict(self.config.get(self.name)),
}

@dictify
def _get_input_files_call(self, wildcards):
ngs_mapping = self.parent.sub_workflows["ngs_mapping"]
token = f"{wildcards.mapper}.{wildcards.library_name}"
yield "bam", ngs_mapping(f"output/{token}/out/{token}.bam")
yield "bai", ngs_mapping(f"output/{token}/out/{token}.bam.bai")

@dictify
def _get_output_files_call(self):
Expand Down
13 changes: 13 additions & 0 deletions snappy_pipeline/workflows/common/gcnv/gcnv_run.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
import warnings
from glob import glob
from itertools import chain
from typing import Any

from snakemake.io import Wildcards, expand, touch

Expand Down Expand Up @@ -426,6 +427,9 @@ def _get_input_files_joint_germline_cnv_segmentation(self, wildcards):
name_pattern = f"write_pedigree.{wildcards.library_name}"
yield "ped", f"work/{name_pattern}/out/{wildcards.library_name}.ped"

def _get_params_joint_germline_cnv_segmentation(self, wildcards: Wildcards) -> dict[str, Any]:
return {"reference": self.parent.w_config.static_data_config.reference.path}


class MergeMultikitFamiliesMixin:
"""Methods for merging families with multiple kits.
Expand Down Expand Up @@ -524,6 +528,15 @@ def get_params(self, action: str):
# Return requested function
return getattr(self, f"_get_params_{action}")

def _get_params_preprocess_intervals(self, wildcards: Wildcards) -> dict[str, Any]:
return {"reference": self.parent.w_config.static_data_config.reference.path}

def _get_params_coverage(self, wildcards: Wildcards) -> dict[str, Any]:
return {"reference": self.parent.w_config.static_data_config.reference.path}

def _get_params_joint_germline_cnv_segmentation(self, wildcards: Wildcards) -> dict[str, Any]:
return {"reference": self.parent.w_config.static_data_config.reference.path}

@listify
def get_result_files(self):
"""Return list of **concrete** paths to result files for the given configuration and sample sheets.
Expand Down
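
The `get_params` method above looks up a per-action method by name and returns the method itself, so Snakemake can call it later with the job's wildcards. A minimal sketch of that dispatch, assuming a plain class with no Snakemake dependency: `DemoParamsPart` and its actions are hypothetical.

```python
from typing import Any, Callable


class DemoParamsPart:
    """Hypothetical step part that dispatches to ``_get_params_<action>``."""

    actions = ("preprocess_intervals", "coverage")

    def __init__(self, reference: str) -> None:
        self.reference = reference

    def get_params(self, action: str) -> Callable[[Any], dict[str, Any]]:
        if action not in self.actions:
            raise ValueError(f"Unsupported action: {action}")
        # Return the bound method; it is called later with the wildcards.
        return getattr(self, f"_get_params_{action}")

    def _get_params_preprocess_intervals(self, wildcards: Any) -> dict[str, Any]:
        return {"reference": self.reference}

    def _get_params_coverage(self, wildcards: Any) -> dict[str, Any]:
        return {"reference": self.reference}


part = DemoParamsPart("/path/to/ref.fa")
params_function = part.get_params("coverage")
params = params_function(wildcards=None)  # wildcards are unused in this sketch
```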
6 changes: 6 additions & 0 deletions snappy_pipeline/workflows/common/manta.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,8 @@
These are used in both ``sv_calling_targeted`` and ``sv_calling_wgs``.
"""

from typing import Any

from snappy_pipeline.base import UnsupportedActionException
from snappy_pipeline.utils import dictify
from snappy_pipeline.workflows.abstract import BaseStepPart
Expand Down Expand Up @@ -80,3 +82,7 @@ def _get_output_files_run(self):
yield from augment_work_dir_with_output_links(
work_files, self.get_log_file().values()
).items()

def get_args(self, action: str) -> dict[str, Any]:
self._validate_action(action)
return {"reference": self.parent.w_config.static_data_config.reference.path}
7 changes: 7 additions & 0 deletions snappy_pipeline/workflows/common/melt.py
Original file line number Diff line number Diff line change
@@ -1,6 +1,7 @@
import re
import typing
from itertools import chain
from typing import Any

from biomedsheets.shortcuts import is_not_background
from snakemake.io import touch
Expand Down Expand Up @@ -228,3 +229,9 @@ def _get_output_files_merge_vcf(self):
@dictify
def _get_log_file_merge_vcf(self):
yield from self._get_log_file_with_infix("{mapper}.melt.{library_name}").items()

def get_args(self, action: str) -> dict[str, Any]:
self._validate_action(action)
return self.config.melt.model_dump(by_alias=True) | {
"reference": self.parent.w_config.static_data_config.reference.path
}
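
The `get_args` above merges a dumped configuration model with one extra key using the dict union operator (Python 3.9+). A minimal sketch of that merge, using a standard-library dataclass in place of the pydantic model: `DemoToolConfig` and its fields are hypothetical and do not reflect the real melt configuration.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class DemoToolConfig:
    # Hypothetical options; the real model is a pydantic class with other fields.
    option_a: str = "value"
    num_threads: int = 2


def build_args(tool_config: DemoToolConfig, reference: str) -> dict[str, Any]:
    # asdict() plays the role of model_dump(); "|" builds a new merged dict.
    return asdict(tool_config) | {"reference": reference}


args = build_args(DemoToolConfig(), "/path/to/reference.fa")

# On a key collision the right-hand operand wins.
merged = {"reference": "old"} | {"reference": "new"}
```

Because the right-hand side wins, a configuration field that happened to be named `reference` would be silently overwritten by the static reference path.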
Original file line number Diff line number Diff line change
Expand Up @@ -95,6 +95,8 @@ rule gene_expression_quantification_duplication_run:
decision=wf.get_strandedness_file("run"),
output:
**wf.get_output_files("duplication", "run"),
params:
**{"args": wf.get_args("duplication", "run")},
threads: wf.get_resource("duplication", "run", "threads")
resources:
time=wf.get_resource("duplication", "run", "time"),
Expand All @@ -113,6 +115,8 @@ rule gene_expression_quantification_dupradar_run:
decision=wf.get_strandedness_file("run"),
output:
**wf.get_output_files("dupradar", "run"),
params:
**{"args": wf.get_args("dupradar", "run")},
threads: wf.get_resource("dupradar", "run", "threads")
resources:
time=wf.get_resource("dupradar", "run", "time"),
Expand All @@ -131,6 +135,8 @@ rule gene_expression_quantification_rnaseqc_run:
decision=wf.get_strandedness_file("run"),
output:
**wf.get_output_files("rnaseqc", "run"),
params:
**{"args": wf.get_args("rnaseqc", "run")},
threads: wf.get_resource("rnaseqc", "run", "threads")
resources:
time=wf.get_resource("rnaseqc", "run", "time"),
Expand All @@ -143,12 +149,14 @@ rule gene_expression_quantification_rnaseqc_run:
wf.wrapper_path("rnaqc/rnaseqc")


rule gene_expression_quantification_star_run:
rule gene_expression_quantification_stats_run:
input:
unpack(wf.get_input_files("stats", "run")),
decision=wf.get_strandedness_file("run"),
output:
**wf.get_output_files("stats", "run"),
params:
**{"args": wf.get_args("stats", "run")},
threads: wf.get_resource("stats", "run", "threads")
resources:
time=wf.get_resource("stats", "run", "time"),
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
"""

import os
from typing import Any

from biomedsheets.shortcuts import GenericSampleSheet, is_not_background
from snakemake.io import expand
Expand Down Expand Up @@ -188,6 +189,8 @@ def args_function(wildcards):
)
if reads_right:
result["input"]["reads_right"] = reads_right
result |= self.config.salmon.model_dump(by_alias=True)
result["strand"] = self.config.strand
return result

assert action == "run", "Unsupported actions"
Expand Down Expand Up @@ -259,6 +262,10 @@ def get_output_files(self, action):
)
)

def get_args(self, action: str) -> dict[str, Any]:
self._validate_action(action)
return {"strand": self.config.strand}

@dictify
def get_log_file(self, action):
"""Return mapping of log files."""
Expand Down Expand Up @@ -287,6 +294,11 @@ class FeatureCountsStepPart(GeneExpressionQuantificationStepPart):
#: Class available actions
actions = ("run",)

def get_args(self, action: str) -> dict[str, Any]:
return super().get_args(action) | {
"path_annotation_gtf": self.config.featurecounts.path_annotation_gtf,
}

def get_resource_usage(self, action: str, **kwargs) -> ResourceUsage:
"""Get Resource Usage

Expand Down Expand Up @@ -373,6 +385,12 @@ class QCStepPartDupradar(GeneExpressionQuantificationStepPart):
#: Class available actions
actions = ("run",)

def get_args(self, action: str) -> dict[str, Any]:
return super().get_args(action) | {
"dupradar_path_annotation_gtf": self.config.dupradar.dupradar_path_annotation_gtf,
"num_threads": self.config.dupradar.num_threads,
}

def get_resource_usage(self, action: str, **kwargs) -> ResourceUsage:
"""Get Resource Usage

Expand All @@ -397,6 +415,12 @@ class QCStepPartRnaseqc(GeneExpressionQuantificationStepPart):
#: Class available actions
actions = ("run",)

def get_args(self, action: str) -> dict[str, Any]:
return super().get_args(action) | {
"reference": self.parent.w_config.static_data_config.reference.path,
"rnaseqc_path_annotation_gtf": self.config.rnaseqc.rnaseqc_path_annotation_gtf,
}

def get_resource_usage(self, action: str, **kwargs) -> ResourceUsage:
"""Get Resource Usage

Expand Down
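
The subclasses above extend the arguments of their parent class with `super().get_args(action) | {...}`. A minimal sketch of that pattern, assuming plain dictionaries for the configuration: `BaseDemoPart` and `CountsDemoPart` are hypothetical stand-ins, not the snappy_pipeline classes.

```python
from typing import Any


class BaseDemoPart:
    """Hypothetical base step part providing the arguments shared by all tools."""

    actions = ("run",)

    def __init__(self, config: dict[str, Any]) -> None:
        self.config = config

    def _validate_action(self, action: str) -> None:
        if action not in self.actions:
            raise ValueError(f"Unsupported action: {action}")

    def get_args(self, action: str) -> dict[str, Any]:
        self._validate_action(action)
        return {"strand": self.config["strand"]}


class CountsDemoPart(BaseDemoPart):
    """Hypothetical subclass adding one tool-specific argument."""

    def get_args(self, action: str) -> dict[str, Any]:
        # Validation happens in the parent call; the union adds the extra key.
        return super().get_args(action) | {
            "path_annotation_gtf": self.config["counts"]["path_annotation_gtf"],
        }


config = {"strand": 0, "counts": {"path_annotation_gtf": "/path/to/annotation.gtf"}}
counts_args = CountsDemoPart(config).get_args("run")
```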
Original file line number Diff line number Diff line change
Expand Up @@ -58,7 +58,7 @@ rule build_gcnv_model_preprocess_intervals:
log:
wf.get_log_file("gcnv", "preprocess_intervals"),
params:
step_key="helper_gcnv_model_targeted",
**{"args": wf.get_args("gcnv", "preprocess_intervals")},
wrapper:
wf.wrapper_path("gcnv/preprocess_intervals")

Expand All @@ -68,6 +68,8 @@ rule build_gcnv_model_annotate_gc:
unpack(wf.get_input_files("gcnv", "annotate_gc")),
output:
**wf.get_output_files("gcnv", "annotate_gc"),
params:
**{"args": wf.get_args("gcnv", "annotate_gc")},
threads: wf.get_resource("gcnv", "annotate_gc", "threads")
resources:
time=wf.get_resource("gcnv", "annotate_gc", "time"),
Expand All @@ -85,6 +87,8 @@ rule build_gcnv_model_coverage:
unpack(wf.get_input_files("gcnv", "coverage")),
output:
**wf.get_output_files("gcnv", "coverage"),
params:
**{"args": wf.get_args("gcnv", "coverage")},
threads: wf.get_resource("gcnv", "coverage", "threads")
resources:
time=wf.get_resource("gcnv", "coverage", "time"),
Expand Down Expand Up @@ -128,7 +132,7 @@ rule build_gcnv_model_contig_ploidy:
log:
wf.get_log_file("gcnv", "contig_ploidy"),
params:
step_key="helper_gcnv_model_targeted",
**{"args": wf.get_args("gcnv", "contig_ploidy")},
wrapper:
wf.wrapper_path("gcnv/contig_ploidy")

Expand Down
10 changes: 10 additions & 0 deletions snappy_pipeline/workflows/helper_gcnv_model_targeted/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -85,6 +85,7 @@

import os
import re
from typing import Any

from biomedsheets.shortcuts import GermlineCaseSheet, is_not_background
from snakemake.io import glob_wildcards
Expand Down Expand Up @@ -159,6 +160,15 @@ def _get_input_files_post_germline_calls(self, wildcards, checkpoints):
)
yield ext, "work/{name_pattern}/out/{name_pattern}/.done".format(name_pattern=name_pattern)

def get_args(self, action: str) -> dict[str, Any]:
gcnv_config = self.w_config.step_config["helper_gcnv_model_targeted"].gcnv
return {
"reference": self.parent.w_config.static_data_config.reference.path,
"path_par_intervals": gcnv_config.path_par_intervals,
"path_target_interval_list_mapping": gcnv_config.path_target_interval_list_mapping,
"path_uniquely_mapable_bed": gcnv_config.path_uniquely_mapable_bed,
}


class HelperBuildTargetSeqGcnvModelWorkflow(BaseStep):
"""Perform gCNV model building for WES samples by library kit"""
Expand Down
10 changes: 9 additions & 1 deletion snappy_pipeline/workflows/helper_gcnv_model_wgs/Snakefile
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,8 @@ rule build_gcnv_model_preprocess_intervals:
unpack(wf.get_input_files("gcnv", "preprocess_intervals")),
output:
**wf.get_output_files("gcnv", "preprocess_intervals"),
params:
**{"args": wf.get_args("gcnv", "preprocess_intervals")}
threads: wf.get_resource("gcnv", "preprocess_intervals", "threads")
resources:
time=wf.get_resource("gcnv", "preprocess_intervals", "time"),
Expand All @@ -66,6 +68,8 @@ rule build_gcnv_model_annotate_gc:
unpack(wf.get_input_files("gcnv", "annotate_gc")),
output:
**wf.get_output_files("gcnv", "annotate_gc"),
params:
**{"args": wf.get_args("gcnv", "annotate_gc")}
threads: wf.get_resource("gcnv", "annotate_gc", "threads")
resources:
time=wf.get_resource("gcnv", "annotate_gc", "time"),
Expand All @@ -83,6 +87,8 @@ rule build_gcnv_model_coverage:
unpack(wf.get_input_files("gcnv", "coverage")),
output:
**wf.get_output_files("gcnv", "coverage"),
params:
**{"args": wf.get_args("gcnv", "coverage")}
threads: wf.get_resource("gcnv", "coverage", "threads")
resources:
time=wf.get_resource("gcnv", "coverage", "time"),
Expand Down Expand Up @@ -117,6 +123,8 @@ rule build_gcnv_model_contig_ploidy:
unpack(wf.get_input_files("gcnv", "contig_ploidy")),
output:
**wf.get_output_files("gcnv", "contig_ploidy"),
params:
**{"args": wf.get_args("gcnv", "contig_ploidy")}
threads: wf.get_resource("gcnv", "contig_ploidy", "threads")
resources:
time=wf.get_resource("gcnv", "contig_ploidy", "time"),
Expand All @@ -126,7 +134,7 @@ rule build_gcnv_model_contig_ploidy:
log:
wf.get_log_file("gcnv", "contig_ploidy"),
params:
step_key="helper_gcnv_model_wgs",
**{"args": wf.get_args("gcnv", "contig_ploidy")}
wrapper:
wf.wrapper_path("gcnv/contig_ploidy")

Expand Down
9 changes: 9 additions & 0 deletions snappy_pipeline/workflows/helper_gcnv_model_wgs/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@
"""

import os
from typing import Any

import attr
from biomedsheets.shortcuts import GermlineCaseSheet, is_not_background
Expand Down Expand Up @@ -184,6 +185,14 @@ def _get_input_files_post_germline_calls(self, wildcards, checkpoints):
)
yield ext, "work/{name_pattern}/out/{name_pattern}/.done".format(name_pattern=name_pattern)

def get_args(self, action: str) -> dict[str, Any]:
gcnv_config = self.w_config.step_config["helper_gcnv_model_wgs"].gcnv
return {
"reference": self.parent.w_config.static_data_config.reference.path,
"path_par_intervals": gcnv_config.path_par_intervals,
"path_uniquely_mapable_bed": gcnv_config.path_uniquely_mapable_bed,
}

def get_resource_usage(self, action: str, **kwargs) -> ResourceUsage:
"""Get Resource Usage

Expand Down
2 changes: 2 additions & 0 deletions snappy_pipeline/workflows/hla_typing/__init__.py
Original file line number Diff line number Diff line change
Expand Up @@ -165,6 +165,8 @@ def args_function(wildcards):
)
if reads_right:
result["input"]["reads_right"] = reads_right
result["num_mapping_threads"] = self.config.optitype.num_mapping_threads
result["max_reads"] = self.config.optitype.max_reads
return result

assert action == "run", "Unsupported actions"
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -83,6 +83,8 @@ rule homologous_recombination_deficiency_scarHRD_run:
unpack(wf.get_input_files("scarHRD", "run")),
output:
**wf.get_output_files("scarHRD", "run"),
params:
**{"args": wf.get_args("scarHRD", "run")},
threads: wf.get_resource("scarHRD", "run", "threads")
resources:
time=wf.get_resource("scarHRD", "run", "time"),
Expand Down