FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data

Unable to get docker container to work

mhalagan opened this issue

Steps I've taken:

  1. Filter out MHC region from WGS BAM file
    • samtools view -b subject1.bam chr6:29940260-33086201 > subject1.mhc.bam
  2. Sort BAM file by read name and extract paired reads as fastq files
    • samtools sort -n -l 1 subject1.mhc.bam subject1.mhc.sorted
    • bedtools bamtofastq -i subject1.mhc.sorted.bam -fq subject1.mhc-end1.fq -fq2 subject1.mhc-end2.fq
  3. Use razers3 to filter the reads, as suggested
    • razers3 --percent-identity 90 --max-hits 1 --distance-range 0
      --output subject1.mhc-raz1.sam OptiType/data/hla_reference_dna.fasta subject1.mhc-end1.fq
    • razers3 --percent-identity 90 --max-hits 1 --distance-range 0
      --output subject1.mhc-raz2.sam OptiType/data/hla_reference_dna.fasta subject1.mhc-end2.fq
  4. Convert SAM files to fastq
    • cat subject1.mhc-raz1.sam | grep -v ^@ | awk '{print "@"$1"\n"$10"\n+\n"$11}' > subject1.mhc-raz1.fastq
    • cat subject1.mhc-raz2.sam | grep -v ^@ | awk '{print "@"$1"\n"$10"\n+\n"$11}' > subject1.mhc-raz2.fastq
  5. Finally I'm trying to run the OptiType container on the two fastq files.
    • docker run -v /home/biodocker:/data/ -t fred2/optitype -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq -d -o /home/biodocker
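
For anyone checking the SAM-to-fastq conversion in step 4, here is a minimal sketch that runs the same grep/awk pipeline on a tiny input. The file names example.sam and example.fastq and the single read in them are made up for illustration; they are not from the data above.

```shell
# Hypothetical one-read SAM file: a header line plus one tab-separated alignment record.
printf '@SQ\tSN:HLA:HLA00001\tLN:3503\n' > example.sam
printf 'read1\t0\tHLA:HLA00001\t1\t255\t8M\t*\t0\t0\tACGTACGT\tIIIIIIII\n' >> example.sam

# Drop header lines (they start with @), then build a four-line FASTQ record
# from the read name (column 1), sequence (column 10) and qualities (column 11).
grep -v '^@' example.sam | awk '{print "@"$1"\n"$10"\n+\n"$11}' > example.fastq

cat example.fastq
```

The result should be one four-line FASTQ record per alignment line in the SAM file, with the header lines gone.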

Here is the error I'm getting:

docker run -v /home/biodocker:/data/ -t fred2/optitype -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq -d -o /home/biodocker
Traceback (most recent call last):
  File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 299, in <module>
    pos, read_details = ht.pysam_to_hdf(bam_paths[0])
  File "/usr/local/bin/OptiType/hlatyper.py", line 186, in pysam_to_hdf
    sam = pysam.AlignmentFile(samfile, sam_or_bam)
  File "pysam/libcalignmentfile.pyx", line 397, in pysam.libcalignmentfile.AlignmentFile.__cinit__ (pysam/libcalignmentfile.c:5831)
  File "pysam/libcalignmentfile.pyx", line 558, in pysam.libcalignmentfile.AlignmentFile._open (pysam/libcalignmentfile.c:7556)
IOError: file `/home/biodocker/2017_05_10_23_04_02/2017_05_10_23_04_02_1.bam` not found

Any thoughts on what I may be doing wrong? Any advice or suggestions would be greatly appreciated.

Hi,

In your docker call, change -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq to -i subject1.mhc-raz1.fastq subject1.mhc-raz2.fastq or -i /data/subject1.mhc-raz1.fastq /data/subject1.mhc-raz2.fastq and -o /home/biodocker to -o /data.

Inside the Docker container you can no longer access the path /home/biodocker, only /data, which is where you mounted it.
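
As a rough sketch of that path mapping, the snippet below rewrites the host paths into the paths seen inside the container and prints the adjusted call instead of running it. The helper to_container_path and the variables host_dir and container_dir are hypothetical names used only for illustration; the image name and options are taken from the command in the question.

```shell
# Host directory and the mount point it is bound to via -v (from the original call).
host_dir=/home/biodocker
container_dir=/data

# Hypothetical helper: strip the host prefix and prepend the container mount point.
to_container_path() {
  printf '%s\n' "${container_dir}${1#"$host_dir"}"
}

fq1=$(to_container_path /home/biodocker/subject1.mhc-raz1.fastq)
fq2=$(to_container_path /home/biodocker/subject1.mhc-raz2.fastq)

# Assemble the command as text so it can be inspected before it is run.
cmd="docker run -v $host_dir:$container_dir/ -t fred2/optitype -i $fq1 $fq2 -d -o $container_dir"
echo "$cmd"
```

This prints a call with -i /data/subject1.mhc-raz1.fastq /data/subject1.mhc-raz2.fastq and -o /data, while the -v option still names the host directory on its left-hand side.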

It works now! Thank you!

Do I need to do the razers3 filtering step when using the docker container or is that done within the container?

The Docker container just wraps OptiTypePipeline.py, so no, it does not perform the pre-filtering.

However, the filtering step only serves to speed up OptiType. You can run OptiType without filtering first; it will just take much longer.