luntergroup / octopus

Bayesian haplotype-based mutation calling


Please let the user decide if the files are okay ...

HMPNK opened this issue

HMPNK commented:

Sorry, guys, I am a bit devastated. I ran Octopus for more than 4 days on our server with 80 cores, and what came out is this:
During the run I checked some of the temporary BCF files and they looked okay; I was not expecting too much for a diverged hexaploid. But now Octopus has deleted all of those files and failed to write the final VCF. Ouch, I have just moved the universe one step closer to heat death...

[2023-06-22 03:29:22] - 100% 4d 12h -
[2023-06-22 03:37:01] Starting Call Set Refinement (CSR) filtering
[2023-06-22 03:37:02] Removed 84 temporary files
[2023-06-22 03:37:04] A program error has occurred:
[2023-06-22 03:37:04]
[2023-06-22 03:37:04] Encountered an exception during calling 'VCF file
[2023-06-22 03:37:04] /data2/octopus-temp/HIFI.octopus.unfiltered.vcf
[2023-06-22 03:37:04] is too big'. This means there is a bug and your results are
[2023-06-22 03:37:04] untrustworthy.
[2023-06-22 03:37:04]
[2023-06-22 03:37:04] To help resolve this error run in debug mode and send the log file to
[2023-06-22 03:37:04] https://github.com/luntergroup/octopus/issues.
[2023-06-22 03:37:04] ------------------------------------------------------------------------

Hi, sorry for the slow response - agreed, this is rather frustrating. I'd recommend always using compressed output (either .vcf.gz or .bcf) rather than raw VCF; then you'll never run into this issue.

Regarding the runtime, that seems rather excessive. I see your files are called "HIFI", so I'm assuming you're using PacBio CCS data - what configuration options are you using?
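For reference, a minimal sketch of an invocation that writes compressed output directly; the reference and read file names here are placeholders, and Octopus picks the output format from the file extension:

octopus -R reference.fa -I HIFI.bam -o HIFI.octopus.vcf.gz --threads 80

Equivalently, passing -o HIFI.octopus.bcf would write BCF. Either way the final call set is compressed as it is written, so the "VCF file is too big" failure at the end of a multi-day run cannot occur.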