How to improve the N50 and reduce the number of contigs?
cj2jy opened this issue · comments
Hi, I finished an assembly and the result is:
Type Length (bp) Count (#)
N10 22880485 3
N20 10335838 10
N30 6877938 22
N40 5222529 39
N50 3377214 63
N60 1919981 103
N70 927783 178
N80 440773 335
N90 218142 666
Min. 28326 -
Max. 57348206 -
Ave. 742827 -
Total 992417917 1336
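For readers unfamiliar with the table above: each Nx row gives the length of the contig at which the running total, taken from the longest contig downward, first reaches x% of the assembly, plus how many contigs that took. The sketch below illustrates that standard definition on made-up toy lengths. The function name `nx` is hypothetical, and NextDenovo's own statistics script may handle ties or rounding differently.

```python
def nx(lengths, x):
    """Return (Nx length, contig count) for a list of contig lengths.

    Standard definition: sort contigs longest first and walk down the list
    until the running sum reaches x% of the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, count
    raise ValueError("no contigs given")


# Toy contig lengths (not from the assembly above); total is 300.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(nx(toy_contigs, 50))
print(nx(toy_contigs, 90))
```

Read this way, the N50 row above says that the 63 longest contigs, each at least 3,377,214 bp, together cover half of the 992 Mb assembly.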
run.cfg:
[General]
job_type = slurm
submit = sbatch --cpus-per-task=20 --mem-per-cpu=4g -o {out} -e {err} {script}
job_prefix = nextDenovo
task = all # 'all', 'correct', 'assemble'
rewrite = yes # yes/no
deltmp = yes
rerun = 1
parallel_jobs = 5
input_type = raw
read_type = ont
input_fofn = ./input.fofn
workdir = ./02_rundir
[correct_option]
read_cutoff = 2k
genome_size = 850M
seed_cutoff = 25000
pa_correction = 3
sort_options = -m 20g -t 18
minimap2_options_raw = -t 18
correction_options = -p 18
[assemble_option]
random_round = 20
minimap2_options_cns = -t 18 -k 23 -w 10
nextgraph_options = -a 1 -q 10
What can I do to increase the N50 and reduce the total number of contigs? I want a better result for 3d-DNA.
Looking forward to your reply. Thank you.
It's hard to say; if I had a better solution, I would set it as the default value. However, I think you can try to optimize these parameters: seed_cutoff, and -k, -w, -f in minimap2_options_raw and minimap2_options_cns. BTW, you should make sure you are using the latest version of NextDenovo. You could also sequence more ultra-long ONT SUP reads. Finally, you can try some other assemblers.
Thank you, I will change those parameters and try again. But I don't know what -f means or how to optimize it. Do you have any suggestions?
Try -f 0.0001 or less.
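For context, the minimap2 manual describes -f, when given as a fraction, as the top fraction of most frequent minimizers to ignore, with a default of 0.0002. A value of 0.0001 therefore ignores fewer repetitive minimizers. Below is a minimal sketch of how the two minimap2 option lines from the run.cfg above might be updated. The helper `set_flag` is hypothetical and is not part of NextDenovo. It assumes every flag is followed by exactly one value, and whether the new setting actually improves the assembly is untested here.

```python
def set_flag(options, flag, value):
    """Return an option string with `flag` set to `value`.

    Replaces the value if the flag is already present, otherwise appends
    the flag. Assumes each flag is followed by exactly one value token.
    """
    tokens = options.split()
    if flag in tokens and tokens.index(flag) + 1 < len(tokens):
        tokens[tokens.index(flag) + 1] = str(value)
    else:
        tokens += [flag, str(value)]
    return " ".join(tokens)


# Option strings copied from the run.cfg posted above.
raw_options = "-t 18"
cns_options = "-t 18 -k 23 -w 10"

print("minimap2_options_raw =", set_flag(raw_options, "-f", "0.0001"))
print("minimap2_options_cns =", set_flag(cns_options, "-f", "0.0001"))
```

The same helper could be used to vary -k or -w one at a time, so that each rerun changes a single parameter and the resulting N50 values stay comparable.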
Thank you, I ran it again and it is still running. Can I use my last assembly result, nd.asm.fasta, as input to run the assemble step again? Would that give a better result?
No