FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data

Individual random samples failing

egnst opened this issue

I have been using OptiType v1.3.1, installed via Singularity. I've been trying to submit large batches of samples from a shell script, using the following command:

singularity exec -B /home/usr:/home/biodocker /home/usr/optitype/optitype.simg OptiTypePipeline.py -i ./${sample_number}_R1.fastq ./${sample_number}_R2.fastq --config /home/usr/OptiType/config.ini --dna --prefix Sample_${sample_number} -v --outdir ./OptiType_Results

When I submit batches of jobs, some samples are typed successfully and others fail with the error below. The failing samples are not processed any differently from the ones that succeed, and the problem persists regardless of whether I use position-filtered FASTQ files. At this point I am out of ideas as to what is causing this error, and I could really use some help figuring out what's wrong.
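
For anyone reproducing this, here is a minimal sketch of a batch driver that records which samples fail instead of relying on scrolling through logs. It reuses the command and paths from the post above; `build_command` and `run_batch` are hypothetical helper names, and it assumes that a failed OptiType run (such as the traceback below) produces a non-zero exit code that `singularity exec` passes through.

```python
import subprocess


def build_command(sample_number, outdir="./OptiType_Results"):
    """Assemble the OptiType call from the post for one sample."""
    return [
        "singularity", "exec",
        "-B", "/home/usr:/home/biodocker",
        "/home/usr/optitype/optitype.simg",
        "OptiTypePipeline.py",
        "-i", f"./{sample_number}_R1.fastq", f"./{sample_number}_R2.fastq",
        "--config", "/home/usr/OptiType/config.ini",
        "--dna",
        "--prefix", f"Sample_{sample_number}",
        "-v",
        "--outdir", outdir,
    ]


def run_batch(sample_numbers, runner=subprocess.run):
    """Run every sample; return {sample: exit code} for the failures only."""
    failures = {}
    for sample in sample_numbers:
        result = runner(build_command(sample))
        if result.returncode != 0:
            failures[sample] = result.returncode
    return failures


# Example (not executed here): failures = run_batch(range(1, 11))
```

Keeping the failing sample numbers and their exit codes makes it easier to see later whether the same samples fail on every submission.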

[E::hts_open_format] Failed to open file ./Optitype_Results/2019_05_07_13_05_51_1.bam

mapping with 16 threads...

0:00:02.02 Mapping 1_R1.fastq to GEN reference...

1:09:51.06 Mapping 1_R2.fastq to GEN reference...

2:18:54.67 Generating binary hit matrix.
Traceback (most recent call last):
  File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 310, in
    pos, read_details = ht.pysam_to_hdf(bam_paths[0])
  File "/usr/local/bin/OptiType/hlatyper.py", line 186, in pysam_to_hdf
    sam = pysam.AlignmentFile(samfile, sam_or_bam)
  File "pysam/libcalignmentfile.pyx", line 728, in pysam.libcalignmentfile.AlignmentFile.__cinit__
  File "pysam/libcalignmentfile.pyx", line 918, in pysam.libcalignmentfile.AlignmentFile._open
IOError: [Errno 2] could not open alignment file ./Optitype_Results/2019_05_07_13_05_51_1.bam: No such file or directory

Hi,

I think I have a similar problem. The only other related issue I could find was #43.

I suspect that RazerS3 is not properly configured.

If I watch the output directory, I never see the BAM file appear.
Well, it turns out the latest RazerS3 in bioconda is broken (for me).

$ razers3 --help
Illegal instruction

Downgrading to 3.5.0:


$ razers3 
razers3 - Faster, fully sensitive read mapping
==============================================
    razers3 [OPTIONS] <GENOME FILE> <READS FILE>
    razers3 [OPTIONS] <GENOME FILE> <PE-READS FILE1> <PE-READS FILE2>
    Try 'razers3 --help' for more information.

I will report back how this turns out, but can you check where your razers3 binary comes from and if it works properly?
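
A rough way to run that check from a script, as a sketch only: it looks up the binary on `PATH` and reports whether `--help` exits cleanly or is killed by a signal. `check_binary` and `describe_exit` are hypothetical helper names, and it assumes a healthy `razers3 --help` exits with code 0. For a containerized install, it would have to run inside the container to test the right binary.

```python
import shutil
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable status."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, subprocess reports death by signal N as -N.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return f"exit code {returncode}"


def check_binary(name="razers3"):
    """Return (resolved path, status) for `<name> --help`."""
    path = shutil.which(name)
    if path is None:
        return None, "not found on PATH"
    proc = subprocess.run(
        [path, "--help"],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return path, describe_exit(proc.returncode)


print(check_binary())
```

An "Illegal instruction" crash like the one above would show up here as `killed by SIGILL`.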

** edit: Downgrading to 3.5.0 fixes the problem. **

Best,
Clemens

Hi Clemens,
Let me know if downgrading solved anything. In the Travis build and the Dockerfile, razers3 is built from source (SeqAn master), and both seem to work fine. But if it only happens with specific input, and you see a benefit in downgrading, we can change it to an older revision.

@egnst Could you e-mail me such a position-filtered input fastq file?

Hi,

We did a bit more research, and it looks like my problem arose because the newer razers3 binaries in bioconda were built with instruction sets that our CPUs don't support. That's why the older version works and the newest one doesn't; it is not a bug in razers3 per se.
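
To compare nodes, one option is to look at the CPU feature flags each node reports. The sketch below assumes x86 Linux, where `/proc/cpuinfo` has a `flags` line. The list of wanted flags is only an illustrative guess, since the thread does not say which instruction sets the bioconda build actually requires, and `parse_cpu_flags` and `missing_flags` are hypothetical helper names.

```python
def parse_cpu_flags(cpuinfo_text):
    """Return the set of CPU feature flags from /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            return set(value.split())
    return set()


def missing_flags(cpuinfo_text, wanted=("sse4_2", "popcnt", "avx", "avx2")):
    """List the wanted flags (an assumed, illustrative set) the CPU lacks."""
    have = parse_cpu_flags(cpuinfo_text)
    return [flag for flag in wanted if flag not in have]


# Example on a Linux node (not executed here):
#   with open("/proc/cpuinfo") as handle:
#       print(missing_flags(handle.read()))
```

Running this on a node where razers3 works and on one where it crashes should show whether the two differ in supported instruction sets.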

Now, I have no idea how this translates to the Docker world. But I remember a similar problem that we had a while back, where the solver would crash on some nodes and not on others because of different CPU architecture versions.

So make sure that it is always the same file that fails, not different files on the same nodes.
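
One way to tell the two cases apart is to log the sample, the node, and the outcome for every run and then tally them. This is a sketch under the assumption that such records exist. The `(sample, host, succeeded)` tuple format and the `summarize` helper are hypothetical, and the host name could come from `socket.gethostname()` inside each job.

```python
from collections import defaultdict


def summarize(runs):
    """runs: iterable of (sample, host, succeeded) tuples.

    Returns (samples that never succeeded, hosts where nothing succeeded).
    """
    by_sample = defaultdict(list)
    by_host = defaultdict(list)
    for sample, host, succeeded in runs:
        by_sample[sample].append(succeeded)
        by_host[host].append(succeeded)
    bad_samples = sorted(s for s, results in by_sample.items() if not any(results))
    bad_hosts = sorted(h for h, results in by_host.items() if not any(results))
    return bad_samples, bad_hosts


# Made-up records for illustration only.
example_runs = [
    ("1", "nodeA", True),
    ("2", "nodeB", False),
    ("3", "nodeB", False),
    ("2", "nodeA", True),
]
print(summarize(example_runs))
```

If the failures cluster on particular hosts rather than particular samples, that points at the node (for example its CPU) and not at the input files.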

Back to the problem at hand:

I just noticed the time stamps in the log. 1:09:51.06 is 1 hour, right? That means that razers3 is actually doing something. Might you be running out of memory?
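
A quick way to get a number for that is to wrap the run and read the peak resident set size afterwards. This is only a sketch: `run_with_peak_rss` is a hypothetical helper, the units are KiB on Linux (they differ on other systems), and with a container launcher in between, the figure covers only descendants that were waited for, so treat it as a rough estimate. The child process below is a harmless stand-in for the real OptiType command.

```python
import resource
import subprocess
import sys


def run_with_peak_rss(cmd):
    """Run cmd and return (exit code, peak RSS of waited-for children).

    RUSAGE_CHILDREN reports the maximum over all children this process
    has waited for so far, not just the most recent one.
    """
    proc = subprocess.run(cmd)
    usage = resource.getrusage(resource.RUSAGE_CHILDREN)
    return proc.returncode, usage.ru_maxrss


# Stand-in workload; replace with the real command list to measure a run.
rc, peak = run_with_peak_rss([sys.executable, "-c", "data = bytearray(50_000_000)"])
print(f"exit code {rc}, peak RSS about {peak / 1024:.0f} MiB (if KiB units apply)")
```

A return code of -9 from `subprocess.run` means the process was killed with SIGKILL, which is the signal the Linux out-of-memory killer sends.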

Thanks for your help, everybody. Downgrading to an older version of RazerS3 seems to have solved the problem.