How should we standardize config schema for files? #26
Comments
(1) I'm slightly wary of Snakemake's different interpretations of relative paths, but this should be okay as long as config files are always used as |
Copying over @jameshadfield's relevant comment:
files:
  exclude: &files_exclude 'config/exclude_accessions.txt'
filter:
  exclude: *files_exclude
One quick point which I think can help with clarity: I'd call these "input files" or similar, since "config files" already means the YAML/JSON files that are loaded into the workflow.
… how else would they be provided? If we revert back to a
That's very fair! If really needed, I think we could have utilities to standardize paths inside the workflow. But I think/hope it wouldn't be necessary with due care around
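For illustration, a path-standardizing utility of the kind mentioned above might look roughly like the following. This is only a sketch: the function names `resolve_input_path` and `standardize_files`, the `base_dir` argument, and the choice to anchor relative paths at a single base directory are all assumptions, not something agreed in this thread or provided by Snakemake.

```python
from pathlib import PurePosixPath


def resolve_input_path(path: str, base_dir: str) -> str:
    """Return `path` unchanged if it is absolute, else anchor it at `base_dir`.

    Hypothetical helper: `base_dir` stands in for whatever directory the
    workflow decides relative input file paths should be interpreted against.
    """
    candidate = PurePosixPath(path)
    if candidate.is_absolute():
        return str(candidate)
    return str(PurePosixPath(base_dir) / candidate)


def standardize_files(config: dict, base_dir: str) -> dict:
    """Resolve every path under the (assumed) top level `files` key."""
    files = config.get("files", {})
    return {name: resolve_input_path(path, base_dir) for name, path in files.items()}
```

With a config such as `{"files": {"exclude": "config/exclude_accessions.txt"}}` and a base directory of `/workflow`, the relative entry would be anchored under `/workflow`, while any absolute path would pass through untouched.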
Agreed.
To me, |
Context
This issue was originally brought up in nextstrain/measles#9 (review) and I wanted to document options and consensus on how we should standardize Nextstrain config schemas for files. This is specifically for discussing the config files that workflows require, such as reference.gb, exclude.txt, include.txt, etc.
Historically, these files were provided at the top of the Snakefile. The files are grouped together instead of within rule specific params because a single file may be used by multiple rules.
To make the files easily configurable, the ncov workflow moved the files to the config YAML under a top level `files` key. Other pathogen repos have taken similar paths of providing config files through the config YAML, but have varying schemas. mpox uses top level file name keys (i.e. drops the `files` key) and it also includes rule specific file name keys. rsv uses the top level `files` key and top level file name keys.
Open questions
Should we standardize on a top level `files` key, top level file name keys, or rule specific file name keys? Are there other suggested schemas?
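To make the three schema shapes concrete, here is a rough sketch of how each might look once the config YAML is loaded into a dict, together with a lookup helper. The helper name `lookup_input_file`, the example keys, and the precedence order (rule specific, then `files`, then top level) are assumptions made for illustration only; none of them is a decision from this issue.

```python
# Hypothetical config shapes, written as the dicts a workflow would see
# after loading the config YAML. The path is the one used in this thread.
files_key_config = {"files": {"exclude": "config/exclude_accessions.txt"}}
top_level_config = {"exclude": "config/exclude_accessions.txt"}
rule_specific_config = {"filter": {"exclude": "config/exclude_accessions.txt"}}


def lookup_input_file(config: dict, name: str, rule: str = None) -> str:
    """Find an input file path under any of the three schema shapes.

    Assumed precedence: rule specific key, then `files` key, then top level.
    """
    if rule is not None:
        rule_section = config.get(rule)
        if isinstance(rule_section, dict) and name in rule_section:
            return rule_section[name]
    files_section = config.get("files")
    if isinstance(files_section, dict) and name in files_section:
        return files_section[name]
    value = config.get(name)
    if isinstance(value, str):
        return value
    raise KeyError(f"no input file named {name!r} found in config")
```

A helper like this would let rules read the same file regardless of which schema a repo uses, though settling on a single schema would make it unnecessary.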