Polish with raw or corrected ONT reads?

Question

tbrown91 opened this issue 2 years ago

Hi,

I'm assembling a rather large (>20 Gb) genome using only ONT reads. So far I've had fairly good success correcting the reads with NextDenovo (which uses about 10% of the disk space that Canu does), assembling the corrected reads, and then polishing with NextPolish.

My question is whether I should use the raw ONT reads as input for NextPolish, or the corrected reads produced by NextDenovo. I expect that read correction could make some errors worse, but that the corrected reads should generally be more accurate. However, I don't know whether using these slightly more accurate reads would violate some assumptions built into NextPolish. Given the size of the genome and the volume of data, everything takes a long time and a lot of space to run, so I would like to avoid running too many tasks in parallel.

What would you recommend - raw or corrected reads for NextPolish?

Cheers,

Tom

Hu Jiang commented 2 years ago

raw
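
For anyone following the recommendation above, here is a minimal sketch of collecting the raw (uncorrected) ONT read files into a file of filenames, one path per line. It assumes NextPolish takes its long-read input as such a list; the `lgs.fofn` name, the `raw_reads` directory, and the `write_fofn` helper are illustrative assumptions, not something stated in this thread, so check the NextPolish documentation for the exact config keys before relying on it.

```python
from pathlib import Path

# File extensions treated as read files (an assumption; adjust to your data).
READ_SUFFIXES = (".fastq", ".fq", ".fastq.gz", ".fq.gz", ".fasta", ".fa")


def write_fofn(read_dir, fofn_path, suffixes=READ_SUFFIXES):
    """Write the absolute path of every read file in read_dir to fofn_path.

    Hypothetical helper: one path per line, sorted, non-read files skipped.
    Returns the list of paths that were written.
    """
    read_dir = Path(read_dir)
    reads = sorted(
        p.resolve()
        for p in read_dir.iterdir()
        if p.is_file() and p.name.endswith(suffixes)
    )
    Path(fofn_path).write_text("".join(f"{p}\n" for p in reads))
    return reads


if __name__ == "__main__":
    # Point this at the directory holding the raw reads,
    # not at the NextDenovo-corrected output.
    raw_dir = Path("raw_reads")  # hypothetical directory name
    if raw_dir.is_dir():
        written = write_fofn(raw_dir, "lgs.fofn")
        print(f"listed {len(written)} raw read files in lgs.fofn")
```

Listing the files rather than concatenating them avoids making another copy of the reads, which matters at this data volume.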