Unable to get docker container to work
mhalagan opened this issue
Steps I've taken:
- Extract the MHC region from the WGS BAM file
- samtools view -b subject1.bam chr6:29940260-33086201 > subject1.mhc.bam
- Sort the BAM file by read name and extract the paired reads as FASTQ files
- samtools sort -n -l 1 subject1.mhc.bam subject1.mhc.sorted
- bedtools bamtofastq -i subject1.mhc.sorted.bam -fq subject1.mhc-end1.fq -fq2 subject1.mhc-end2.fq
- Use razers3 to filter the reads, as suggested
- razers3 --percent-identity 90 --max-hits 1 --distance-range 0 --output subject1.mhc-raz1.sam OptiType/data/hla_reference_dna.fasta subject1.mhc-end1.fq
- razers3 --percent-identity 90 --max-hits 1 --distance-range 0 --output subject1.mhc-raz2.sam OptiType/data/hla_reference_dna.fasta subject1.mhc-end2.fq
- Convert the SAM files to FASTQ
- cat subject1.mhc-raz1.sam | grep -v ^@ | awk '{print "@"$1"\n"$10"\n+\n"$11}' > subject1.mhc-raz1.fastq
- cat subject1.mhc-raz2.sam | grep -v ^@ | awk '{print "@"$1"\n"$10"\n+\n"$11}' > subject1.mhc-raz2.fastq
- Finally, I'm trying to run the OptiType container on the two FASTQ files.
- docker run -v /home/biodocker:/data/ -t fred2/optitype -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq -d -o /home/biodocker
Here is the error I'm getting:
docker run -v /home/biodocker:/data/ -t fred2/optitype -i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq -d -o /home/biodocker
Traceback (most recent call last):
File "/usr/local/bin/OptiType/OptiTypePipeline.py", line 299, in <module>
pos, read_details = ht.pysam_to_hdf(bam_paths[0])
File "/usr/local/bin/OptiType/hlatyper.py", line 186, in pysam_to_hdf
sam = pysam.AlignmentFile(samfile, sam_or_bam)
File "pysam/libcalignmentfile.pyx", line 397, in pysam.libcalignmentfile.AlignmentFile.__cinit__ (pysam/libcalignmentfile.c:5831)
File "pysam/libcalignmentfile.pyx", line 558, in pysam.libcalignmentfile.AlignmentFile._open (pysam/libcalignmentfile.c:7556)
IOError: file `/home/biodocker/2017_05_10_23_04_02/2017_05_10_23_04_02_1.bam` not found
Any thoughts on what I may be doing wrong? Any advice or suggestions would be greatly appreciated.
Hi,
In your docker call, change
-i /home/biodocker/subject1.mhc-raz1.fastq /home/biodocker/subject1.mhc-raz2.fastq
to
-i subject1.mhc-raz1.fastq subject1.mhc-raz2.fastq
or
-i /data/subject1.mhc-raz1.fastq /data/subject1.mhc-raz2.fastq
and change
-o /home/biodocker
to
-o /data
Inside the Docker container you can no longer access the path /home/biodocker, only /data, which is where you mounted it.
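To make the path mapping concrete, here is a minimal sketch that rewrites host paths into the paths the container sees and prints the resulting docker call instead of running it. The directories, image name, and flags are taken from the command in this thread; the helper name to_container_path and the variables are hypothetical, only for illustration.

```shell
#!/usr/bin/env bash
# Assumed mount, as in the thread: -v /home/biodocker:/data/
HOST_DIR=/home/biodocker   # host side of the -v mount
CONTAINER_DIR=/data        # container side of the -v mount

# Hypothetical helper: map a host path under HOST_DIR to its container path.
to_container_path() {
  local host_path=$1
  case "$host_path" in
    "$HOST_DIR"/*) printf '%s\n' "$CONTAINER_DIR/${host_path#"$HOST_DIR"/}" ;;
    "$HOST_DIR")   printf '%s\n' "$CONTAINER_DIR" ;;
    *) echo "error: $host_path is not under $HOST_DIR" >&2; return 1 ;;
  esac
}

fq1=$(to_container_path /home/biodocker/subject1.mhc-raz1.fastq)
fq2=$(to_container_path /home/biodocker/subject1.mhc-raz2.fastq)
out=$(to_container_path /home/biodocker)

# Print the command for inspection; remove "echo" to actually run it.
echo docker run -v "$HOST_DIR:$CONTAINER_DIR/" -t fred2/optitype -i "$fq1" "$fq2" -d -o "$out"
```

Anything outside the mounted directory has no container-side equivalent, which is why the helper refuses such paths rather than guessing.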
It works now! Thank you!
Do I need to do the razers3 filtering step when using the docker container or is that done within the container?
The Docker container is just a virtualization of OptiTypePipeline.py. So no, it does not perform the pre-filtering.
However, the filtering step only serves to speed up OptiType. You can run OptiType without filtering first; it will just take much longer.
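If you keep the pre-filtering outside the container, the SAM-to-FASTQ conversion from the original post can be wrapped in a small function. This is only a sketch of that one step: the function name sam_to_fastq is hypothetical, and the SAM record below is a made-up toy input, not real data.

```shell
#!/usr/bin/env bash
# Convert SAM records to FASTQ: drop header lines (starting with "@"), then
# emit the read name (column 1), sequence (column 10), and qualities (column 11).
sam_to_fastq() {
  grep -v '^@' "$1" | awk '{print "@"$1"\n"$10"\n+\n"$11}'
}

# Toy SAM file with one header line and one made-up record (illustration only).
tmp_sam=$(mktemp)
printf '@HD\tVN:1.0\n' > "$tmp_sam"
printf 'read1\t0\tHLA:HLA00001\t1\t255\t4M\t*\t0\t0\tACGT\tIIII\n' >> "$tmp_sam"

sam_to_fastq "$tmp_sam"
rm -f "$tmp_sam"
```

For the files in this thread, the call would be sam_to_fastq subject1.mhc-raz1.sam > subject1.mhc-raz1.fastq, and likewise for the second read file.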