Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads


vanbie opened this issue · comments

Describe the bug
An error occurred while I was trying to assemble corrected data. I wonder whether it is caused by a parameter setting.

Error message

hostname
cd /root/nd/NextDenovo/Mo/Mo_out/02.cns_align/
time /usr/bin/python3 /root/nd/NextDenovo/lib/ -f /root/nd/NextDenovo/Mo/input.fofn -l 37491 -c 6
[INFO] 2021-08-11 22:39:29,574 Split step options:
[INFO] 2021-08-11 22:39:29,574 Namespace(count=6, fofn='/root/nd/NextDenovo/Mo/input.fofn', index=True, min_len=37491, outdir='./', rename=True)
Traceback (most recent call last):
  File "/root/nd/NextDenovo/lib/", line 155, in
  File "/root/nd/NextDenovo/lib/", line 129, in main
    f.cutf(args.count, rn = args.rename, ml = args.min_len, pdir = args.outdir, index = args.index)
  File "/root/nd/NextDenovo/lib/", line 108, in cutf
    print('>%d %d %f pid=%s\n%s' % (t, lens, 1, name, seq), file=fa_files[i])
Command exited with non-zero status 1
90.50user 132.68system 4:47.14elapsed 77%CPU (0avgtext+0avgdata 245949344maxresident)k
48378432inputs+8outputs (119major+61515086minor)pagefaults 0swaps

Genome characteristics
genome size=490m, heterozygous rate=1.3%, repeat content=58%

Input data
Total base count=62880679007bp, sequencing depth=129, average/N50 read length=30172
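
The reported depth can be checked against the base count with simple arithmetic (depth = total bases / genome size). The snippet below is only a sketch: `parse_size` and `depth` are hypothetical helper names, not part of NextDenovo, and it assumes the `m` suffix means 10^6 bases.

```python
# Hypothetical helpers for a quick sanity check; not part of NextDenovo.
UNITS = {"k": 10**3, "m": 10**6, "g": 10**9}


def parse_size(text: str) -> int:
    """Convert a size such as '485m' to a number of bases (assumes m = 10^6)."""
    text = text.strip().lower()
    if text and text[-1] in UNITS:
        return int(float(text[:-1]) * UNITS[text[-1]])
    return int(text)


def depth(total_bases: int, genome_size: int) -> float:
    """Sequencing depth = total bases / genome size."""
    return total_bases / genome_size


total = 62880679007  # total base count from this report
print(round(depth(total, parse_size("485m")), 1))  # genome_size in the config -> 129.7
print(round(depth(total, parse_size("490m")), 1))  # genome size stated above -> 128.3
```

Both values are close to the stated depth of 129, so the numbers in this report are consistent with each other.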

Config file
job_type = local
job_prefix = nextDenovo
task = assemble # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = yes
rerun = 3
parallel_jobs = 8
input_type = corrected
read_type = ont
input_fofn = ./input.fofn
workdir = M_out

genome_size = 485m

minimap2_options_cns = -t 4
nextgraph_options = -a 1

Operating system
Ubuntu 18.04 64bit

gcc version 7.5.0

Python 3.6.9

nextDenovo v2.4.0

To Reproduce (Optional)

Additional context (Optional)
32core 256G server

It seems you have a lot of data. I think you can run NextDenovo on the raw (uncorrected) data, which may be faster. As for the error you mentioned, I do not actually know why the print expression causes a MemoryError; I need more time to figure it out.
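
One clue may be the `time` line in the error log. Assuming it comes from GNU `time`, whose default format reports `maxresident` in kilobytes, the peak memory of the split step can be converted as in the sketch below. `max_rss_gib` is a hypothetical helper name, and the kilobyte unit is an assumption about the tool that produced the log.

```python
import re


def max_rss_gib(time_line: str) -> float:
    """Extract '<N>maxresident)k' from a GNU time summary line and return GiB.

    Assumes N is in kilobytes (1024 bytes), as in GNU time's default format.
    """
    match = re.search(r"(\d+)maxresident\)k", time_line)
    if match is None:
        raise ValueError("no maxresident field found")
    return int(match.group(1)) / 1024**2


# The summary line copied from the error log in this issue.
line = ("90.50user 132.68system 4:47.14elapsed 77%CPU "
        "(0avgtext+0avgdata 245949344maxresident)k")
print(round(max_rss_gib(line), 1))  # -> 234.6
```

If that unit assumption holds, the split step peaked at roughly 234.6 GiB, which is close to the 256G of the server and would be consistent with a MemoryError raised wherever the next allocation happened to occur, including inside the `print` call.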

Thanks. The data was released only as corrected reads, so I can only download the clean data. The original report also used NextDenovo for its analysis, but did not give many details.