OptiType killed OOM

Question

Akazhiel opened this issue 3 years ago

Jonatan Gonzalez Rodriguez commented 3 years ago

Hello!
I'm trying to run OptiType in the cloud on a machine with 64 GB of RAM, using RNA samples that are about 5 GB each because they are paired-end. The process is killed because it runs out of memory. I have also run OptiType in a pipeline on DNA tumor-normal pair samples, which are HLA-typed in parallel, without any memory problems.

My question is: how much memory is needed to run OptiType?

Best regards!

Hugues Fontenelle · Answer 1 · Fri Feb 04 2022 23:38:18 GMT+0800 (China Standard Time)

In my experience, 12 GB is sufficient.
However, I don't feed in the entire FASTQ files, only the reads from the relevant HLA region on chr6.

b-niu · Answer 2 · Sat Feb 26 2022 13:52:55 GMT+0800 (China Standard Time)

Hello @Akazhiel, in my case a server with 128 GB of RAM often runs out of memory when processing WES samples whose fastq.gz files are about 20 GB in size.
This is really frustrating, and I am looking for a workaround.

Perhaps extracting the chr6 reads from the BAM files and running OptiType on those would help?

Richard A. Schäfer · Answer 3 · Sat Nov 04 2023 13:16:20 GMT+0800 (China Standard Time)

Hello, is there a solution for this? I found that files with more than 100,000 reads trigger a kill signal, even though these files are only 400-500 MB. It works well if I stay below that.

karlestira · Answer 4 · Sun Apr 14 2024 01:47:13 GMT+0800 (China Standard Time)

If you are running razers3 and it runs out of memory, you can try splitting the input file.

bgzip -cd [your fastq] | split -l 40000000 -a 5 -d --filter='razers3 -i 97 -m 99999 --distance-range 0 -pa -tc 0 -o $FILE.bam [your ref] /dev/stdin' /dev/stdin [split prefix]
samtools cat [split prefix]*.bam | samtools view -o res.bam

Then give the merged BAM file to OptiType instead of the FASTQ files.

You can use gzip instead of bgzip (but it is much slower) and adjust the split size (40M lines means 10M reads in my case, which needs about 4 GB of memory during alignment).
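As a rough illustration of the line-count arithmetic, here is a small sketch that only needs GNU coreutils. It assumes standard 4-line FASTQ records (no wrapped sequence lines). The helper name `lines_for_reads`, the toy file and the `chunk_` prefix are made up for the example, and the `cat` filter only stands in for the razers3 call.

```shell
# Hypothetical helper: turn a reads-per-chunk target into a value for split -l.
# Assumes every FASTQ record is exactly 4 lines.
lines_for_reads() {
    echo $(( $1 * 4 ))
}

lines_for_reads 10000000   # the 10M reads mentioned above -> 40000000 lines

workdir=$(mktemp -d)

# Build a toy FASTQ with 10 reads.
for i in $(seq 1 10); do
    printf '@read%d\nACGT\n+\nIIII\n' "$i"
done > "$workdir/toy.fastq"

# Split into chunks of 4 reads each. The filter here just writes each chunk
# to its own file; in the real command it would be the razers3 invocation.
split -l "$(lines_for_reads 4)" -a 5 -d \
    --filter='cat > "$FILE.fastq"' "$workdir/toy.fastq" "$workdir/chunk_"

# Because the chunk size is a multiple of 4, every chunk starts on a header line.
for f in "$workdir"/chunk_*.fastq; do
    head -n 1 "$f"
done
```

If the chunk size were not a multiple of 4, records would be cut in the middle and the aligner would see broken input, so it is safer to derive the value from a read count than to type it by hand.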
I strongly suggest using samtools view to recompress the BAM file produced by samtools cat, because samtools cat uses a tricky method to merge BAM files that older decompressors may not support. Also, very importantly, if you write the samtools cat output directly to a file, make sure the output file is not in the input list, or the output file will grow without bound.
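One way to avoid the "output file is in the input list" trap is to check the glob before merging. This is only a sketch: `output_in_inputs` is a hypothetical helper, the file names are placeholders, and it compares path strings literally, so it will not notice the same file reached through a different path.

```shell
# Hypothetical guard: succeed (return 0) if the output path is one of the inputs.
output_in_inputs() {
    local out=$1
    shift
    local f
    for f in "$@"; do
        if [ "$f" = "$out" ]; then
            return 0
        fi
    done
    return 1
}

workdir=$(mktemp -d)
# Empty placeholder files standing in for per-chunk BAMs.
touch "$workdir/part_00000.bam" "$workdir/part_00001.bam" "$workdir/part_merged.bam"

# An output named part_merged.bam is matched by part_*.bam, so it is unsafe.
if output_in_inputs "$workdir/part_merged.bam" "$workdir"/part_*.bam; then
    echo "unsafe: output matches the input glob"
fi

# An output named res.bam is not matched by part_*.bam.
if ! output_in_inputs "$workdir/res.bam" "$workdir"/part_*.bam; then
    echo "safe: res.bam is not among the inputs"
fi
```

Picking an output name that the split prefix can never match, such as res.bam in the command above, avoids the problem without needing a check.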

Another thing to note: you should force razers3 to run single-threaded. Its multi-threaded mode is inconsistent, which may cause small problems in the results, and multi-threading seems to give very little speed-up with a small reference.

If you think splitting the FASTQ is not a good plan, you can also use bowtie2 to filter the FASTQ.

bowtie2 --no-unal --very-sensitive-local --local --omit-sec-seq -p 10 --reorder .....(index and fastq)

Bowtie2 uses about 200 MB of memory (this does not increase as the input gets larger) and can produce a filtered BAM with the useless sequences removed. You can then convert that BAM to FASTQ for OptiType. However, I'm not sure this method gives output that is completely consistent with running directly on the raw FASTQ.