ChaissonLab / danbing-tk

Toolkit for VNTR genotyping and repeat-pan genome graph construction

danbing-tk: src/aQueryFasta_thread.h:457: Assertion `fin' failed

hehuiying1125 opened this issue

Dear ChaissonLab,

I tried running the test example with snakemake and encountered this error:
"danbing-tk: src/aQueryFasta_thread.h:457: void readBinaryIndex(kmerIndex_uint32_umap&, std::vector<unsigned int>&, std::string&): Assertion `fin' failed."

I would appreciate any suggestions on how to get the command to run successfully.
The complete log follows:

Building DAG of jobs...
Using shell: /bin/bash
Provided cores: 4
Rules claiming more threads will be scaled down.
Job counts:
	count	jobs
	1	GenPanGenomeGraph
	2	GenPrunedGenomeGraph
	2	GenRawGenomeGraph
	1	GenSerializedGraphAndIndex
	1	all
	7

[Mon Feb 14 23:45:53 2022]
rule GenRawGenomeGraph:
input: /public/home/software/danbing-tk-1.3/test/output/HG00514.0.tr.fasta, /public/home/software/danbing-tk-1.3/test/output/HG00514.1.tr.fasta, /public/home/software/danbing-tk-1.3/test/input/HG00514.filtered.reads.bam, /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv
output: /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawIL.tr.kmers
jobid: 13
wildcards: genome=HG00514
priority: 95
resources: cores=24, mem=25

set -eu
ulimit -c 20000
cd /public/home/software/danbing-tk-1.3/test/output/
module load gcc

/public/home/software/danbing-tk-1.3//bin/vntr2kmers_thread -g -m <(cut -f $((0+1)),$((0+2)) /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv) -k 21 -fs 700 -ntr 700 -o HG00514.rawPB -fa 2 /public/home/software/danbing-tk-1.3/test/output/HG00514.0.tr.fasta /public/home/software/danbing-tk-1.3/test/output/HG00514.1.tr.fasta

if [ 1 == "1" ]; then
samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00514.filtered.reads.bam |
/public/home/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin |
/public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00514.rawPB -fai /dev/stdin -o HG00514.rawIL -p 24 -cth 45 -rth 0.5
fi

rule GenRawGenomeGraph:
input: /public/home/software/danbing-tk-1.3/test/output/HG00733.0.tr.fasta, /public/home/software/danbing-tk-1.3/test/output/HG00733.1.tr.fasta, /public/home/software/danbing-tk-1.3/test/input/HG00733.filtered.reads.bam, /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv
output: /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawIL.tr.kmers
jobid: 14
wildcards: genome=HG00733
priority: 95
resources: cores=24, mem=25

set -eu
ulimit -c 20000
cd /public/home/software/danbing-tk-1.3/test/output/
module load gcc

/public/home/software/danbing-tk-1.3//bin/vntr2kmers_thread -g -m <(cut -f $((2+1)),$((2+2)) /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv) -k 21 -fs 700 -ntr 700 -o HG00733.rawPB -fa 2 /public/home/software/danbing-tk-1.3/test/output/HG00733.0.tr.fasta /public/home/software/danbing-tk-1.3/test/output/HG00733.1.tr.fasta

if [ 1 == "1" ]; then
samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00733.filtered.reads.bam |
/public/home/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin |
/public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00733.rawPB -fai /dev/stdin -o HG00733.rawIL -p 24 -cth 45 -rth 0.5
fi

Using orthology map, total number of loci: Using orthology map, total number of loci: 11

building and counting building and counting /public/home/software/danbing-tk-1.3/test/output/HG00514.0.tr.fasta/public/home/software/danbing-tk-1.3/test/output/HG00733.0.tr.fasta kmers
kmers
building and counting /public/home/software/danbing-tk-1.3/test/output/HG00733.1.tr.fasta kmers
writing outputs
building and counting /public/home/software/danbing-tk-1.3/test/output/HG00514.1.tr.fasta kmers
writing outputs
fname: /dev/stdin
fname: /dev/stdin
use baitDB: 0
extract fasta: 0
interleaved: 1
sim mode: 0
trim mode: 0
augmentation mode: 0
graph threading mode: 1
output alignment: 0
use baitDB: 0
extract fasta: 0
interleaved: 1output successfully aligned reads only:
0sim mode:
0k:
21trim mode:
0# of subsampled kmers in pre-filtering:
4augmentation mode:
0minimal # of matches in pre-filtering:
1graph threading mode:
1Cthreshold:
45output alignment:
0Rthreshold:
0.5output successfully aligned reads only:
0threading Cthreshold:
50k:
21Running both step1 (kmer-based filtering) and step2 (threading)

# of subsampled kmers in pre-filtering: fastx: 4/dev/stdin

minimal # of matches in pre-filtering: query: 1/public/home/software/danbing-tk-1.3/test/output//HG00514.rawPB
.(tr/ntr).kmersCthreshold:
45

total number of loci in Rthreshold: /public/home/software/danbing-tk-1.3/test/output//HG00514.rawPB.tr.kmers: 0.5
threading Cthreshold: 50
Running both step1 (kmer-based filtering) and step2 (threading)
fastx: /dev/stdin
query: /public/home/software/danbing-tk-1.3/test/output//HG00733.rawPB.(tr/ntr).kmers

total number of loci in /public/home/software/danbing-tk-1.3/test/output//HG00733.rawPB.tr.kmers: 0
0
deserializing kmerDBi.umap
deserializing kmerDBi.umap
danbing-tk: src/aQueryFasta_thread.h:457: void readBinaryIndex(kmerIndex_uint32_umap&, std::vector<unsigned int>&, std::string&): Assertion `fin' failed.
danbing-tk: src/aQueryFasta_thread.h:457: void readBinaryIndex(kmerIndex_uint32_umap&, std::vector<unsigned int>&, std::string&): Assertion `fin' failed.
/bin/bash: line 11: 220388 Broken pipe samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00514.filtered.reads.bam
220389 | /public/home/zhaoFL/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin
220390 Aborted (core dumped) | /public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00514.rawPB -fai /dev/stdin -o HG00514.rawIL -p 24 -cth 45 -rth 0.5
/bin/bash: line 11: 220385 Broken pipe samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00733.filtered.reads.bam
220386 | /public/home/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin
220387 Aborted (core dumped) | /public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00733.rawPB -fai /dev/stdin -o HG00733.rawIL -p 24 -cth 45 -rth 0.5
[Mon Feb 14 23:45:54 2022]
Error in rule GenRawGenomeGraph:
[Mon Feb 14 23:45:54 2022]
jobid: 13
Error in rule GenRawGenomeGraph:
output: /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawIL.tr.kmers
jobid: 14
shell:

set -eu
ulimit -c 20000
cd /public/home/software/danbing-tk-1.3/test/output/
module load gcc

/public/home/software/danbing-tk-1.3//bin/vntr2kmers_thread -g -m <(cut -f $((0+1)),$((0+2)) /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv) -k 21 -fs 700 -ntr 700 -o HG00514.rawPB -fa 2 /public/home/software/danbing-tk-1.3/test/output/HG00514.0.tr.fasta /public/home/software/danbing-tk-1.3/test/output/HG00514.1.tr.fasta

if [ 1 == "1" ]; then
samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00514.filtered.reads.bam |
/public/home/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin |
/public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00514.rawPB -fai /dev/stdin -o HG00514.rawIL -p 24 -cth 45 -rth 0.5
fi

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)
output: /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawIL.tr.kmers

shell:

set -eu
ulimit -c 20000
cd /public/home/software/danbing-tk-1.3/test/output/
module load gcc

/public/home/software/danbing-tk-1.3//bin/vntr2kmers_thread -g -m <(cut -f $((2+1)),$((2+2)) /public/home/software/danbing-tk-1.3/test/output/OrthoMap.v2.tsv) -k 21 -fs 700 -ntr 700 -o HG00733.rawPB -fa 2 /public/home/software/danbing-tk-1.3/test/output/HG00733.0.tr.fasta /public/home/software/danbing-tk-1.3/test/output/HG00733.1.tr.fasta

if [ 1 == "1" ]; then
samtools fasta -@2 -n /public/home/software/danbing-tk-1.3/test/input/HG00733.filtered.reads.bam |
/public/home/software/danbing-tk-1.3//bin/bam2pe -fai /dev/stdin |
/public/home/software/danbing-tk-1.3//bin/danbing-tk -g 50 -k 21 -qs /public/home/software/danbing-tk-1.3/test/output//HG00733.rawPB -fai /dev/stdin -o HG00733.rawIL -p 24 -cth 45 -rth 0.5
fi

    (one of the commands exited with non-zero exit code; note that snakemake uses bash strict mode!)

Removing output files of failed job GenRawGenomeGraph since they might be corrupted:
/public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00514.rawIL.tr.kmers
Removing output files of failed job GenRawGenomeGraph since they might be corrupted:
/public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.tr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.ntr.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawPB.graph.kmers, /public/home/software/danbing-tk-1.3/test/output/HG00733.rawIL.tr.kmers
Shutting down, this might take some time.
Exiting because a job execution failed. Look above for error message

Best wishes,
Huiying

Hi Huiying,

Could you pull the latest version and see if it works? The pruning step should be off this time.

Thanks,
-Tony

Is "pruning" still a supported step or is it not recommended anymore?

This error occurs because the pruning step here uses the per-assembly graph, which has not yet been serialised or indexed.

I tested this manually for my "O" sample by first running:

danbing-tk//bin/ktools serialize O.rawPB
danbing-tk//bin/ktools ksi O.rawPB.tr.kmers > O.rawPB.tr.ksi

This allowed the corresponding line in the Snakemake rule to proceed.
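
For anyone with more than one sample, here is a rough sketch of the same workaround wrapped in a shell function. It only repeats the two ktools calls quoted above; the function name `prepare_graph` and the `KTOOLS` variable are my own placeholders, not part of danbing-tk, and the default path is an assumption you will need to adjust.

```shell
#!/usr/bin/env bash
set -eu

# Assumed location of ktools; override by exporting KTOOLS before sourcing.
KTOOLS="${KTOOLS:-danbing-tk/bin/ktools}"

# prepare_graph PREFIX  (hypothetical helper, not provided by danbing-tk)
# Runs the two manual steps from the comment above for one sample prefix:
#   1. serialize the per-assembly graph
#   2. write PREFIX.tr.ksi from PREFIX.tr.kmers
prepare_graph() {
    local prefix="$1"
    "$KTOOLS" serialize "$prefix"
    "$KTOOLS" ksi "$prefix.tr.kmers" > "$prefix.tr.ksi"
}
```

It would be called once per sample from the output directory before the danbing-tk step, for example `prepare_graph HG00514.rawPB`. I have only tried the underlying commands by hand on my own sample, so treat this as a starting point.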

I started a draft in #19, which will hopefully allow pruning in the new pipeline.