soedinglab / plass

sensitive and precise assembly of short sequencing reads

Home Page: https://plass.mmseqs.com

mem or disk issue?

colindaven opened this issue

Current Behavior

Plass died. I am unsure whether this is due to a RAM issue or a tmp space issue.
Server: 512 GB, Ubuntu 16.04.

Failed to mmap memory dataSize=0 File=/tmp/6803214812655189031/nucl_6f_long. Error 22.

Thanks

Steps to Reproduce (for bugs)

srun -c 48 /mnt/ngsnfs/tools/plass/plass/bin/plass assemble --threads 48 MBCF_117_S38_R1.fastq out.fa /tmp/

Plass Output (for bugs)

Program call:
assemble --threads 48 MBCF_117_S38_R1.fastq out.fa /tmp/

MMseqs Version: 26b5d66
Sub Matrix blosum62.out
Rescore mode 0
Remove hits by seq.id. and coverage false
E-value threshold 1e-05
Coverage threshold 0
Coverage Mode 0
Seq. Id Threshold 0.9
Seq. Id. Mode 0
Include identical Seq. Id. false
Sort results 0
In substitution scoring mode, performs global alignment along the diagonal false
Preload mode 0
Threads 48
Verbosity 3
Alphabet size 13
Kmer per sequence 60
Mask Residues 0
K-mer size 14
Max. sequence length 65535
Shift hash 5
Split Memory Limit 0
Include only extendable true
Skip sequence with n repeating k-mers 8
Min codons in orf 45
Max codons in length 2147483647
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Protein Filter Threshold 0.2
Filter Proteins 1
Number search iterations 12
Remove Temporary Files false
Sets the MPI runner

Program call:
createdb MBCF_117_S38_R1.fastq /tmp/6803214812655189031/nucl_reads --max-seq-len 65535 --dont-split-seq-by-len 0 --dont-shuffle 1 --id-offset 0 -v 3

MMseqs Version: 26b5d66
Max. sequence length 65535
Split Seq. by len false
Do not shuffle input database true
Offset of numeric ids 0
Verbosity 3

................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
...........Time for merging files: 0h 0m 2s 140ms
Time for merging files: 0h 0m 2s 28ms
Touch data file /tmp/6803214812655189031/nucl_reads ... Done.
Time for merging files: 0h 0m 15s 353ms
Touch data file /tmp/6803214812655189031/nucl_reads_h ... Done.
Time for merging files: 0h 0m 15s 312ms
Time for processing: 0h 1m 55s 831ms
Program call:
extractorfs /tmp/6803214812655189031/nucl_reads /tmp/6803214812655189031/nucl_6f_start --min-length 20 --max-length 45 --max-gaps 0 --contig-start-mode 1 --contig-end-mode 0 --orf-start-mode 0 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --use-all-table-starts 0 --id-offset 0 --threads 48 -v 3

MMseqs Version: 26b5d66
Min codons in orf 20
Max codons in length 45
Max orf gaps 0
Contig start mode 1
Contig end mode 0
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Threads 48
Verbosity 3

................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 16 Mio. sequences processed
................................................................................................................................................................................................................................................................................................................................................................................................................................. 10 Mio. sequences processed
..... 8 Mio. sequences processed
................................................................................................................................................. 14 Mio. sequences processed
.. 15 Mio. sequences processed
. 13 Mio. sequences processed
....... 7 Mio. sequences processed
...................................... 11 Mio. sequences processed
........... 12 Mio. sequences processed
............................................................................................ 9 Mio. sequences processed
................................ 6 Mio. sequences processed
........................ 5 Mio. sequences processed
...... 1 Mio. sequences processed
.......................................... 3 Mio. sequences processed
.................... 2 Mio. sequences processed
4 Mio. sequences processed
.................................Time for merging files: 0h 0m 0s 96ms
Time for merging files: 0h 0m 0s 95ms
Time for processing: 0h 0m 5s 85ms
Program call:
translatenucs /tmp/6803214812655189031/nucl_6f_start /tmp/6803214812655189031/aa_6f_start --translation-table 1 --add-orf-stop 1 -v 3 --threads 48

MMseqs Version: 26b5d66
Translation Table 1
Add Orf Stop true
Verbosity 3
Threads 48

...............................Time for merging files: 0h 0m 0s 202ms
Time for processing: 0h 0m 0s 452ms
Program call:
extractorfs /tmp/6803214812655189031/nucl_reads /tmp/6803214812655189031/nucl_6f_long --min-length 45 --max-length 2147483647 --max-gaps 0 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 0 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --use-all-table-starts 0 --id-offset 0 --threads 48 -v 3

MMseqs Version: 26b5d66
Min codons in orf 45
Max codons in length 2147483647
Max orf gaps 0
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Threads 48
Verbosity 3

............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 16 Mio. sequences processed
............................................................................................................................................................................................................................................................................................................................................................................................. 3 Mio. sequences processed
........................... 14 Mio. sequences processed
........................................... 15 Mio. sequences processed
.................................................... 13 Mio. sequences processed
......................................... 2 Mio. sequences processed
..... 11 Mio. sequences processed
............................... 9 Mio. sequences processed
............................................................... 8 Mio. sequences processed
..... 5 Mio. sequences processed
............................ 6 Mio. sequences processed
.................... 10 Mio. sequences processed
.......................................................................................................... 12 Mio. sequences processed
................ 1 Mio. sequences processed
.. 7 Mio. sequences processed
..................................... 4 Mio. sequences processed
......Time for merging files: 0h 0m 0s 1ms
Time for merging files: 0h 0m 0s 1ms
Time for processing: 0h 0m 4s 905ms
Program call:
translatenucs /tmp/6803214812655189031/nucl_6f_long /tmp/6803214812655189031/aa_6f_long --translation-table 1 --add-orf-stop 1 -v 3 --threads 48

MMseqs Version: 26b5d66
Translation Table 1
Add Orf Stop true
Verbosity 3
Threads 48

Failed to mmap memory dataSize=0 File=/tmp/6803214812655189031/nucl_6f_long. Error 22.
Error: translatenucs long step died
srun: error: hpc-rc03: task 0: Exited with exit code 1

How does your input data look? What is the average read length?
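One quick way to answer the read-length question is a short script like the sketch below. It assumes a plain four-line-per-record FASTQ file (no wrapped sequence lines), optionally gzip-compressed; the function names `read_lengths` and `mean_read_length` are made up for this illustration and are not part of Plass.

```python
import gzip
import os
import tempfile


def read_lengths(fastq_path):
    """Yield the length of every sequence line in a 4-line-per-record FASTQ."""
    # Assumption: records are exactly four lines each (header, sequence, '+', qualities).
    opener = gzip.open if str(fastq_path).endswith(".gz") else open
    with opener(fastq_path, "rt") as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:  # second line of each record is the sequence
                yield len(line.strip())


def mean_read_length(fastq_path):
    """Return the average read length, or 0.0 for an empty file."""
    total = count = 0
    for length in read_lengths(fastq_path):
        total += length
        count += 1
    return total / count if count else 0.0


# Toy two-record FASTQ (reads of 4 and 6 nucleotides) to exercise the functions.
toy = "@r1\nACGT\n+\nIIII\n@r2\nACGTAC\n+\nIIIIII\n"
with tempfile.NamedTemporaryFile("w", suffix=".fastq", delete=False) as tmp:
    tmp.write(toy)
demo_mean = mean_read_length(tmp.name)
os.remove(tmp.name)
print(demo_mean)  # 5.0 for the toy file
```

On real data the call would be `mean_read_length("MBCF_117_S38_R1.fastq")`, using the file name from the command above.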

This error can occur when the ORF extraction module could not extract a single ORF because every candidate fell below the minimum ORF length cutoff.

If your reads are only 100 residues long, you should use a lower cutoff (something like --min-length 30).

We intend to fix this in the next release by always using a fraction of the sequence length as the cutoff for ORF extraction.
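The arithmetic behind the cutoff can be sketched in a few lines of Python. This assumes that `--min-length` is counted in codons (the log prints it as "Min codons in orf 45") and that a read of n nucleotides holds at most floor(n / 3) complete codons in its best frame. The helper names are hypothetical and are not Plass functions.

```python
def max_codons(read_length: int, frame_offset: int = 0) -> int:
    """Upper bound on complete codons in one reading frame of a read."""
    return max(read_length - frame_offset, 0) // 3


def can_yield_orf(read_length: int, min_codons: int) -> bool:
    """True if the longest possible frame could pass the codon cutoff."""
    return max_codons(read_length) >= min_codons


# Compare the default cutoff from the log (45) with the two suggested values.
for read_length in (75, 100, 150):
    for cutoff in (45, 30, 20):
        print(read_length, cutoff, can_yield_orf(read_length, cutoff))
```

Under these assumptions a 75 bp read holds at most 25 codons, so no ORF can pass a cutoff of 45 or 30, while a cutoff of 20 can be met. A 100 bp read holds at most 33 codons, which is why 30 is a workable value there.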

Thanks, that got me a lot further. The reads were only 1x75 bp, so I set the minimum ORF length with --min-length 20.

Thanks!

The sensitivity of Plass can suffer with such short reads because we compute an e-value for the overlap, and it is difficult for an overlap to reach significance with such short fragments.
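As a rough illustration of why short overlaps struggle to reach significance, assume the overlap e-value follows the standard Karlin-Altschul form. The thread does not confirm that Plass uses exactly this formula, so treat this as a sketch. Here S is the raw overlap score, m and n are the effective query and database lengths, K and λ are parameters of the scoring system, L is the overlap length in residues, and s_max is the largest per-residue score in the substitution matrix:

```latex
E = K \, m \, n \, e^{-\lambda S},
\qquad
S \le L \, s_{\max}
\;\Longrightarrow\;
E \ge K \, m \, n \, e^{-\lambda L \, s_{\max}}
```

The lower bound on E grows as L shrinks, so even a perfect overlap between very short fragments may not get below the 1e-05 threshold shown in the log.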