Nextomics / NextDenovo

Fast and accurate de novo assembler for long reads

Genome assembly of the autotetraploid plant

fengyuanli304 opened this issue

Question or Expected behavior
Hi, I recently assembled a genome from nanopore reads, but the results are not good, possibly because the plant is autotetraploid. Could you please give me some suggestions? Thank you very much.

  1. nextdenovo + nextpolish
    Type Length (bp) Count (#)
    N10 3398628 14
    N20 2568560 32
    N30 1871279 58
    N40 1356392 91
    N50 939240 138
    N60 640978 207
    N70 451803 308
    N80 316553 449
    N90 218120 654
    Min. 33288 -
    Max. 5440223 -
    Ave. 528507 -
    Total 538020557 1018

C:96.7%[S:72.4%,D:24.3%],F:0.7%,M:2.6%,n:1375
1329 Complete BUSCOs (C)
995 Complete and single-copy BUSCOs (S)
334 Complete and duplicated BUSCOs (D)
10 Fragmented BUSCOs (F)
36 Missing BUSCOs (M)
1375 Total BUSCO groups searched
  2. wtdbg2 + nextpolish
    Type Length (bp) Count (#)
    N10 7117562 7
    N20 5320460 16
    N30 3664794 29
    N40 1774681 51
    N50 816375 95
    N60 316343 206
    N70 151485 470
    N80 79657 986
    N90 35990 2038
    Min. 2066 -
    Max. 11154505 -
    Ave. 101844 -
    Total 560244256 5501

C:90.9%[S:80.6%,D:10.3%],F:1.7%,M:7.4%,n:1375
1250 Complete BUSCOs (C)
1108 Complete and single-copy BUSCOs (S)
142 Complete and duplicated BUSCOs (D)
24 Fragmented BUSCOs (F)
101 Missing BUSCOs (M)
1375 Total BUSCO groups searched

  3. necat4 + nextpolish
    Type Length (bp) Count (#)
    N10 1255211 77
    N20 940996 191
    N30 758574 333
    N40 617986 509
    N50 508059 724
    N60 416166 985
    N70 319953 1315
    N80 236195 1754
    N90 146935 2391
    Min. 504 -
    Max. 2499020 -
    Ave. 288546 -
    Total 1204105988 4173

C:97.0%[S:27.5%,D:69.5%],F:0.9%,M:2.1%,n:1375
1333 Complete BUSCOs (C)
378 Complete and single-copy BUSCOs (S)
955 Complete and duplicated BUSCOs (D)
13 Fragmented BUSCOs (F)
29 Missing BUSCOs (M)
1375 Total BUSCO groups searched

  4. canu (corrected) + smartdenovo + nextpolish
    Type Length (bp) Count (#)
    N10 2180472 18
    N20 1526697 47
    N30 1236412 85
    N40 941656 133
    N50 780034 193
    N60 584817 269
    N70 442020 371
    N80 288149 518
    N90 150110 760
    Min. 9906 -
    Max. 4119799 -
    Ave. 367003 -
    Total 515272840 1404

C:96.9%[S:73.6%,D:23.3%],F:0.5%,M:2.6%,n:1375
1332 Complete BUSCOs (C)
1012 Complete and single-copy BUSCOs (S)
320 Complete and duplicated BUSCOs (D)
7 Fragmented BUSCOs (F)
36 Missing BUSCOs (M)
1375 Total BUSCO groups searched
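
To make the four results easier to compare, the BUSCO counts and total sizes listed above can be put side by side. This is only a rough sketch: the dictionary layout and the `summarize` helper are made up for illustration, and the numbers are copied from the tables above.

```python
# Single-copy (S) and duplicated (D) BUSCO counts and total assembly
# sizes, copied from the four results above. n = 1375 BUSCO groups.
N_BUSCO = 1375

assemblies = {
    "nextdenovo + nextpolish": {"S": 995, "D": 334, "total_bp": 538020557},
    "wtdbg2 + nextpolish": {"S": 1108, "D": 142, "total_bp": 560244256},
    "necat4 + nextpolish": {"S": 378, "D": 955, "total_bp": 1204105988},
    "canu + smartdenovo + nextpolish": {"S": 1012, "D": 320, "total_bp": 515272840},
}


def summarize(entry, n=N_BUSCO):
    """Return single-copy %, duplicated % and total size in Mb (hypothetical helper)."""
    return {
        "S_pct": round(100 * entry["S"] / n, 1),
        "D_pct": round(100 * entry["D"] / n, 1),
        "size_mb": round(entry["total_bp"] / 1e6, 1),
    }


summary = {name: summarize(entry) for name, entry in assemblies.items()}

# Print one line per assembly, sorted by duplicated BUSCO percentage.
for name, row in sorted(summary.items(), key=lambda kv: kv[1]["D_pct"]):
    print(f"{name:35s} S={row['S_pct']:5.1f}%  D={row['D_pct']:5.1f}%  size={row['size_mb']:7.1f} Mb")
```

Sorted this way, the necat4 assembly has both the highest duplicated fraction and roughly double the total size of the other three, which may mean it keeps more haplotypes as separate contigs.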
Which of these is most suitable for this autotetraploid plant?
I chose the nextdenovo + nextpolish result for downstream analyses. After removing haplotigs and contig overlaps with purge_dups, the assembly became smaller:

contigs 598

Largest contig 5440223
Total length 389747736
GC (%) 34.54
N50 1536005
N75 638325
L50 77
L75 176

N's per 100 kbp 0.22

C:92.8%[S:81.2%,D:11.6%],F:1.8%,M:5.4%,n:1375
1275 Complete BUSCOs (C)
1116 Complete and single-copy BUSCOs (S)
159 Complete and duplicated BUSCOs (D)
25 Fragmented BUSCOs (F)
75 Missing BUSCOs (M)
1375 Total BUSCO groups searched
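
The effect of purge_dups can be worked out from the numbers before and after purging. This is a minimal sketch with made-up variable names, using only figures from this post.

```python
# nextdenovo + nextpolish assembly before and after purge_dups,
# with values copied from the statistics in this post.
before = {"total_bp": 538020557, "contigs": 1018, "complete": 1329, "duplicated": 334}
after = {"total_bp": 389747736, "contigs": 598, "complete": 1275, "duplicated": 159}

# Sequence removed by purging, in bp and as a share of the original assembly.
removed_bp = before["total_bp"] - after["total_bp"]
removed_pct = round(100 * removed_bp / before["total_bp"], 1)

# Change in BUSCO counts: duplicates removed and complete genes lost.
removed_dup = before["duplicated"] - after["duplicated"]
lost_complete = before["complete"] - after["complete"]

print(f"removed sequence: {removed_bp} bp ({removed_pct}% of the assembly)")
print(f"contigs: {before['contigs']} -> {after['contigs']}")
print(f"duplicated BUSCOs removed: {removed_dup}")
print(f"complete BUSCOs lost: {lost_complete}")
```

Purging removed a little over a quarter of the assembly and cut the duplicated BUSCOs roughly in half, but some complete BUSCOs were lost along with them.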
How can I improve this result?
I look forward to your reply.
Operating system
CentOS Linux release 7.6.1810

GCC
What version of GCC are you using?
4.8.5 20150623 (Red Hat 4.8.5-36)

Python
What version of Python are you using?
Python 2.7.18

NextDenovo
What version of NextDenovo are you using?
nextdenovo2.4


What exactly is your question? If it is about the smaller assembly size, please see the FAQ.