Description of feature

As far as I can tell, the fasterqdump module does not set disk.

Setting the disk to local-ssd for use with google-batch could be quite helpful for speeding up the writing of temp files (docs).

More generally, if disk is not set (and dynamically expanded for large SRA files), what would prevent the job from running out of disk space -- at least for cloud jobs?
You can configure the module args to change, for example, the temporary file path; the available options are listed here: https://github.com/ncbi/sra-tools/wiki/HowTo:-fasterq-dump. If you have enough memory and the files are not that big, you can even use an in-memory location for the temporary files.
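For instance, here is a minimal sketch of doing that through a custom config. It assumes the nf-core module is named SRATOOLS_FASTERQDUMP and that the task container mounts /dev/shm (tmpfs) with enough free memory; both are assumptions to check against your pipeline:

```groovy
// Sketch only: the module/selector name and the /dev/shm mount are assumptions.
process {
    withName: 'SRATOOLS_FASTERQDUMP' {
        // fasterq-dump writes its scratch files to the directory given by
        // --temp; pointing it at tmpfs keeps them in memory.
        ext.args = '--temp /dev/shm'
    }
}
```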
Unless I'm forgetting something, there is nothing preventing the processes from running out of disk space. Setting the disk limit doesn't prevent that either; it just makes the failure point explicit. Note that prefetch has a default maximum download size, so that's also an option for avoiding large files.
> If you have enough memory and the files are not that big, you can even use an in-memory location for the temporary files
As far as I can tell, fasterq-dump nf-core module doesn't provide a method of setting local-ssd, so how could I (adaptively) provision enough temp disk space? Moreover, local-ssd expands the space at /tmp, but /tmp is out-of-scope for output: paths, so the final reads must be in the process working directory. These final reads could be very large, so I need to (adaptively) provision enough ssd and boot disk space for each GCP Batch just running fasterq-dump.
> Setting the disk limit doesn't prevent that either; it just makes the failure point explicit. Note that prefetch has a default maximum download size, so that's also an option for avoiding large files
Yeah, the disk limit will just explicitly result in a non-zero exit. It doesn't seem that useful.
I'm using a max file size of 300 GB for prefetch, but that can still result in some very large fastq files.
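For reference, that cap is just prefetch's --max-size option passed through the module args; a sketch, with the module/selector name assumed:

```groovy
// Sketch only: the module/selector name is an assumption.
process {
    withName: 'SRATOOLS_PREFETCH' {
        // prefetch will not download runs larger than --max-size
        // (the default is 20G).
        ext.args = '--max-size 300G'
    }
}
```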