minimap2 alignment step fails because of samtools sort #145

Open
trum994 opened this issue Feb 16, 2024 · 6 comments
Labels
question Further information is requested

Comments

@trum994

trum994 commented Feb 16, 2024

Ask away!

I'm running this workflow on our HPC via Singularity. The input BAM (140 GB) is a single file made by merging the mini BAMs from a PromethION run with samtools merge. Is this a missing-header issue, a RAM issue, or something else?

This is the output I get:

ERROR ~ Error executing process > 'bam_ingress:minimap2_alignment (1)'

Caused by:
Process bam_ingress:minimap2_alignment (1) terminated with an error exit status (1)

Command executed:

samtools bam2fq -@ 1 -T 1 R8967.bam | minimap2 -y -t 8 -ax map-ont Homo_sapiens.GRCh38.dna.primary_assembly.fa - | samtools sort -@ 3 --write-index -o R8967.cram##idx##R8967.cram.crai -O cram --reference Homo_sapiens.GRCh38.dna.primary_assembly.fa -

Command exit status:
1

Command output:
(empty)

Command error:
[M::mm_idx_gen::84.693*1.70] collected minimizers
[M::mm_idx_gen::93.043*2.21] sorted minimizers
[M::main::93.043*2.21] loaded/built the index for 194 target sequence(s)
[M::mm_mapopt_update::95.457*2.18] mid_occ = 705
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 194
[M::mm_idx_stat::96.375*2.17] distinct minimizers: 100159079 (38.79% are singletons); average occurrences: 5.540; average spacing: 5.586; total length: 3099750718
[W::sam_hdr_create] Ignored @SQ SN:KI270330 : bad or missing LN tag
[E::sam_hrecs_update_hashes] Header includes @SQ line "KI270330" with no LN: tag
[E::sam_hrecs_update_hashes] Header includes @SQ line "KI270330" with no LN: tag
samtools sort: failed to change sort order header to 'SO:coordinate'
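
The errors above name an @SQ line with no LN: tag, so one thing worth ruling out is a malformed sequence dictionary. Below is a minimal sketch, not part of the workflow, that flags @SQ lines lacking an SN: or LN: tag. The function name `check_sq_lines` is made up for illustration; it only assumes tab-separated SAM header text on stdin.

```shell
# Hypothetical helper: report @SQ header lines that lack an SN: or LN: tag.
# Reads SAM header text on stdin; returns non-zero if any bad line is found.
check_sq_lines() {
  awk -F'\t' '
    $1 == "@SQ" {
      has_sn = 0; has_ln = 0
      # Scan every tab-separated field after the record type.
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^SN:./) has_sn = 1
        if ($i ~ /^LN:[0-9]+$/) has_ln = 1
      }
      if (!has_sn || !has_ln) { print "line " NR ": " $0; bad++ }
    }
    END { exit (bad > 0) }
  '
}
```

For example, `samtools dict Homo_sapiens.GRCh38.dna.primary_assembly.fa | check_sq_lines` would check the dictionary derived from the reference. If that comes back clean, the header was probably cut short somewhere in the pipe rather than wrong at the source, but that is an inference and not something the log proves.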

trum994 added the question label Feb 16, 2024
@SamStudio8
Member

Hi @trum994, it is quite likely that the workflow exceeded its memory limit during the alignment step. Can you confirm that you're using the latest version? We've recently made several improvements to the memory directives and to performance in general.

@stfacc

stfacc commented Mar 5, 2024

I'm getting a similar error with the gencode hg38 reference.

I'm using the current "master" version of the pipeline.

ERROR ~ Error executing process > 'bam_ingress:minimap2_alignment (1)'

Caused by:
  Process `bam_ingress:minimap2_alignment (1)` terminated with an error exit status (1)

Command executed:

  samtools bam2fq -@ 1 -T 1  merged_pass.bam | minimap2 -y -t 8 -ax map-ont GRCh38.primary_assembly.genome.fa -     | samtools sort -@ 3 --write-index -o sample_180057.bam##idx##sample_180057.bam.bai -O bam --reference GRCh38.primary_assembly.genome.fa -

Command exit status:
  1

Command output:
  (empty)

Command error:
  [M::mm_idx_gen::74.061*1.88] collected minimizers
  [M::mm_idx_gen::82.639*2.49] sorted minimizers
  [M::main::82.639*2.49] loaded/built the index for 194 target sequence(s)
  [M::mm_mapopt_update::84.520*2.46] mid_occ = 706
  [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 194
  [M::mm_idx_stat::85.610*2.44] distinct minimizers: 100159079 (38.75% are singletons); average occurrences: 5.545; average spacing: 5.581; total length: 3099750718
  [W::sam_hdr_create] Ignored @SQ line with missing SN: tag
  [E::sam_hrecs_error] Malformed key:value pair at line 157: "@SQ       SN"
  [E::sam_hrecs_error] Malformed key:value pair at line 157: "@SQ       SN"
  samtools sort: failed to change sort order header to 'SO:coordinate'
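
Here the header breaks off at line 157 although minimap2 reported 194 target sequences, which would fit a stream that stopped early. samtools sort is only the last stage of the pipe, so it may be reporting a failure that started upstream. The following is a small sketch for finding the first failing stage from per-stage exit codes. `first_failed_stage` is a hypothetical name, and the bash-only `PIPESTATUS` array in the usage comment is one way the codes could be collected.

```shell
# Hypothetical helper: given the exit code of each pipeline stage in order,
# print the first stage that exited non-zero. Plain POSIX sh.
first_failed_stage() {
  i=1
  for code in "$@"; do
    if [ "$code" -ne 0 ]; then
      echo "stage $i exited with $code"
      return 1
    fi
    i=$((i + 1))
  done
  echo "all stages exited with 0"
}

# Usage in bash, right after running a pipeline by hand:
#   stage1 | stage2 | stage3
#   first_failed_stage "${PIPESTATUS[@]}"
# An exit code of 137 (128 + SIGKILL) on an upstream stage would be
# consistent with that process being killed, for example by a memory limit.
```

If every upstream stage exits with 0, the truncation theory does not hold and the header itself deserves a closer look.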

@LisaHagenau

I am also having workflows fail with this error. I could complete the workflow with the same settings for 2 of 3 samples, but one keeps failing and it is not even a large file (8 GB cram). I have 64 GB memory available and have explicitly allocated 60 GB to run this workflow, but in the final report, the process bam_ingress:minimap2_alignment only has 16 GB allocated.


N E X T F L O W  ~  version 23.04.2
Launching `/home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation/main.nf` [adoring_carson] DSL2 - revision: 03fcebc94c
WARN: Found unexpected parameters:
* --client_fields: /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/client_fields.json
- Ignore this warning: params.schema_ignore_params = "client_fields" 
||||||||||   _____ ____ ___ ____  __  __ _____      _       _
||||||||||  | ____|  _ \_ _|___ \|  \/  | ____|    | | __ _| |__  ___
|||||       |  _| | |_) | |  __) | |\/| |  _| _____| |/ _` | '_ \/ __|
|||||       | |___|  __/| | / __/| |  | | |__|_____| | (_| | |_) \__ \
||||||||||  |_____|_|  |___|_____|_|  |_|_____|    |_|\__,_|_.__/|___/
||||||||||  wf-human-variation v2.0.0
--------------------------------------------------------------------------------
Core Nextflow options
  runName         : adoring_carson
  containerEngine : docker
  container       : ontresearch/wf-human-variation:shad3aed855cd007c653b8fc8cb16fe46c90199990f
  launchDir       : /mnt/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06
  workDir         : /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/work
  projectDir      : /home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation
  userName        : nanopore
  profile         : standard
  configFiles     : /home/nanopore/data-hdd/epi2melabs/workflows/epi2me-labs/wf-human-variation/nextflow.config, /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/local.config, /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/global.config
Workflow Options
  mod             : true
Main options
  sample_name     : 2Gy
  bam             : /home/nanopore/data-hdd/epi2melabs/instances/wf-basecalling_01HRRV905NBW78W9CHMS7870WV/output/2Gy.pass.cram
  ref             : /home/nanopore/data-hdd/genomes/hg38/GCA_000001405.15_GRCh38_no_alt_analysis_set.fna
  old_ref         : /home/nanopore/data-hdd/genomes/T2T_chm13_hg/chm13v2.0.fa
  basecaller_cfg  : [email protected]
  bam_min_coverage: 2
  out_dir         : /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/output
!! Only displaying parameters that differ from the pipeline defaults !!
--------------------------------------------------------------------------------
If you use epi2me-labs/wf-human-variation for your analysis please cite:
* The nf-core framework
  https://doi.org/10.1038/s41587-020-0439-x
--------------------------------------------------------------------------------
This is epi2me-labs/wf-human-variation v2.0.0.
--------------------------------------------------------------------------------
[8a/d5a790] Submitted process > cram_cache (1)
[f6/7655a2] Submitted process > getVersions
[bf/9b3bf9] Submitted process > index_ref_fai (1)
[2e/76b4ae] Submitted process > getParams
[3c/aee16c] Submitted process > publish_artifact (1)
[c3/9a1bc6] Submitted process > bam_ingress:check_for_alignment (1)
[b6/410aad] Submitted process > bam_ingress:minimap2_alignment (1)
[a7/2f5a43] Submitted process > publish_artifact (2)
[7c/4642e3] Submitted process > publish_artifact (4)
[31/d9f933] Submitted process > publish_artifact (3)
[97/0c1c3e] Submitted process > getAllChromosomesBed (1)
ERROR ~ Error executing process > 'bam_ingress:minimap2_alignment (1)'
Caused by:
  Process `bam_ingress:minimap2_alignment (1)` terminated with an error exit status (1)
Command executed:
  samtools bam2fq -@ 1 -T 1 --reference chm13v2.0.fa 2Gy.pass.cram | minimap2 -y -t 8 -ax map-ont GCA_000001405.15_GRCh38_no_alt_analysis_set.fna -     | samtools sort -@ 3 --write-index -o 2Gy.cram##idx##2Gy.cram.crai -O cram --reference GCA_000001405.15_GRCh38_no_alt_analysis_set.fna -
Command exit status:
  1
Command output:
  (empty)
Command error:
  WARNING: Your kernel does not support swap limit capabilities or the cgroup is not mounted. Memory limited without swap.
  [M::mm_idx_gen::30.736*1.65] collected minimizers
  [M::mm_idx_gen::34.937*2.41] sorted minimizers
  [M::main::34.937*2.41] loaded/built the index for 195 target sequence(s)
  [M::mm_mapopt_update::35.965*2.37] mid_occ = 694
  [M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 195
  [M::mm_idx_stat::36.486*2.35] distinct minimizers: 100167746 (38.80% are singletons); average occurrences: 5.519; average spacing: 5.607; total length: 3099922541
  [W::sam_hdr_create] Ignored @SQ SN:chrUn_K : bad or missing LN tag
  [E::sam_hrecs_update_hashes] Header includes @SQ line "chrUn_K" with no LN: tag
  [E::sam_hrecs_update_hashes] Header includes @SQ line "chrUn_K" with no LN: tag
  samtools sort: failed to change sort order header to 'SO:coordinate'
Work dir:
  /home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/work/b6/410aadc15d15d2ba94b7949fdc4554
Tip: view the complete command output by changing to the process work dir and entering the command `cat .command.out`
 -- Check '/home/nanopore/data-hdd/epi2melabs/instances/wf-human-variation_01HRYP06TJJ6N6BYMKKAAX9T06/nextflow.log' file for details
Execution cancelled -- Finishing pending tasks before exit
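
The report mentioned above shows the alignment process with 16 GB although 60 GB were set aside for the run, and the Docker warning in the log suggests a container memory limit was being applied. If memory really is the cause, one thing to try is a per-process override in an extra Nextflow config file. This is only a sketch: the helper name, the file name, and the 32 GB default are illustrative guesses, and whether the workflow honours the override depends on its own configuration.

```shell
# Hypothetical helper: write a Nextflow config that raises the memory request
# for the alignment process. The default of 32 GB is an illustrative guess.
write_memory_override() {
  out_file="${1:-override.config}"
  mem="${2:-32 GB}"
  # withName selects a process by name; the name is taken from the error log.
  cat > "$out_file" <<EOF
process {
    withName: 'bam_ingress:minimap2_alignment' {
        memory = '$mem'
    }
}
EOF
}

# Usage: write_memory_override override.config '48 GB'
# then pass the file to Nextflow with its -c option alongside the usual
# workflow parameters.
```

The process name is written inside the helper to keep the sketch short; a different process would need its own selector block.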

@LisaHagenau

I was able to run this workflow today after restarting the computer, which emptied my swap. Before the restart, htop showed 6 GB of swap in use (of 8 GB total). I'm not sure what was filling the swap: only EPI2ME was running, and the swap emptied again once the workflow completed successfully.
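
For anyone who wants to check swap before launching a run without opening htop, here is a minimal sketch that reads the same numbers from a meminfo-style file. `swap_used_kb` is a hypothetical name, and the default path assumes a Linux host that exposes /proc/meminfo.

```shell
# Hypothetical helper: print swap usage in kB from a meminfo-style file.
# Defaults to /proc/meminfo (Linux); another path can be passed as $1.
swap_used_kb() {
  awk '
    $1 == "SwapTotal:" { total = $2 }
    $1 == "SwapFree:"  { free = $2 }
    END { printf "swap used: %d kB of %d kB\n", total - free, total }
  ' "${1:-/proc/meminfo}"
}
```

A reading close to the total before the workflow starts would match the situation described above, though it says nothing about which process filled the swap.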

@SamStudio8
Member

SamStudio8 commented Mar 15, 2024 via email

@RenzoTale88
Contributor

@LisaHagenau @stfacc @trum994 we have recently released several updates to the workflow. Could you please check whether you still run into the same problem?


5 participants