FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data

Error at determining minimal set of non-overshadowed alleles

JaneMerlevede opened this issue

Hello,

I am using OptiType with Python 2.7.10. After installing some modules, I was able to start the analysis with:
python /OptiType/OptiTypePipeline.py -d -v -i $curDir/NeoEpitopePrediction/${Tumor}_1.fastq $curDir/NeoEpitopePrediction/${Tumor}_2.fastq -o $curDir/NeoEpitopePrediction/HLA/
It runs until the "determining minimal set of non-overshadowed alleles" step and then stops:

```
0:00:00.38 Mapping Sample_214310406_T-AL-O_1.fastq to GEN reference...
0:00:19.11 Mapping Sample_214310406_T-AL-O_2.fastq to GEN reference...
0:00:38.76 Generating binary hit matrix.
0:00:38.76 Loading alleles and read IDs from /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_11_59_59/2017_04_25_11_59_59_0.sam...
0:00:40.06 11179 alleles and 2016 reads found.
0:00:40.06 Initializing mapping matrix...
0:00:40.07 2016x11179 mapping matrix initialized. Populating 1077618 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:03:35.02 1077618 elements filled. Matrix sparsity: 1 in 20.91
0:03:44.25 Loading alleles and read IDs from /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_11_59_59/2017_04_25_11_59_59_1.sam...
0:03:45.11 11179 alleles and 2177 reads found.
0:03:45.11 Initializing mapping matrix...
0:03:45.12 2177x11179 mapping matrix initialized. Populating 992781 hits from SAM file...
10% completed
20% completed
30% completed
40% completed
50% completed
60% completed
70% completed
80% completed
90% completed
100% completed
0:06:28.36 992781 elements filled. Matrix sparsity: 1 in 24.51
0:06:40.68 temporary pruning of identical rows and columns
0:06:40.71 Size of mtx with unique rows and columns: (312, 446)
0:06:40.71 determining minimal set of non-overshadowed alleles
```

Could this problem be related to the solver? I tried both solvers, cbc and glpk, after adding them to my $PATH.

Thank you in advance for your help.

Hi Jane,

That step doesn't involve ILP solving, so it must be something else. I noticed from your output log that OptiType was using an old fallback method, which it only does if the pysam Python module is not available. Can you try installing pysam and re-running it? That would tell us whether the issue is associated with the old method or with your data.

Hello Andras,

Thank you for your answer.
I tried to re-install pysam but it seems fine:
pip install --user pysam
Requirement already satisfied: pysam in /cm/shared/bioinfo/python/2.7.10/lib/python2.7/site-packages

That looks good. Does it perhaps fail during the import? Try it with a file containing a single line "import pysam" or call the interpreter and just type it in. You're using the current version of OptiType, right?

Yes, I am using the current version (1.0).
With the output SAM file or a test file:

 python -c "import pysam" /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_09_52_01/2017_04_25_09_52_01_0.sam
python -c "import pysam" /data/Analysis/NeoEpitopePrediction/HLA/2017_04_25_09_52_01/test

nothing happens...

Could it be related to HDF5?
I checked on our cluster: I cannot find HDF5, and I have not set the variables in my bashrc.
I doubt it, though, since my files are not compressed.
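
For reference, `python -c "import pysam"` exiting silently means the import itself succeeded; the file path after the command is only passed along as an argument and is never read. A slightly more informative check would report which version of each dependency the interpreter actually picks up. The following is only a sketch: `module_status` is a hypothetical helper, not part of OptiType, and the list of module names is an assumption based on the packages mentioned in this thread.

```python
import importlib


def module_status(name):
    """Describe whether `name` can be imported and which version was found."""
    try:
        module = importlib.import_module(name)
    except ImportError as exc:
        return "%s: import failed (%s)" % (name, exc)
    # Not every module exposes __version__ or __file__, so fall back gracefully.
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", "built-in")
    return "%s: version %s, loaded from %s" % (name, version, location)


if __name__ == "__main__":
    # Assumed list of dependencies to check; adjust to your installation.
    for name in ("pysam", "pandas", "numpy"):
        print(module_status(name))
```

Running this with the same interpreter that the cluster job uses should show whether the job sees the same packages as the interactive shell.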

When running the command line outside a job, I get a more detailed error message:

...
0:09:01.07 Size of mtx with unique rows and columns: (312, 446)
0:09:01.07 determining minimal set of non-overshadowed alleles
Traceback (most recent call last):
  File "/home/Project/Immuno/OptiType/OptiTypePipeline.py", line 284, in <module>
    minimal_alleles = ht.prune_overshadowed_alleles(temp_pruned)
  File "/home/Project/Immuno/OptiType/hlatyper.py", line 399, in prune_overshadowed_alleles
    non_overshadowed = covariance.columns.diff(overshadowed)
AttributeError: 'Index' object has no attribute 'diff'

I don't know if it helps.
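
The traceback points at `Index.diff`, which newer pandas releases no longer provide; `Index.difference` is the replacement method. That suggests a mismatch between this OptiType checkout and the installed pandas rather than a problem with the data, although this is not confirmed in the thread. A minimal sketch of a compatibility shim, assuming OptiType cannot be updated right away; `index_difference` is a hypothetical helper and not part of OptiType or pandas:

```python
def index_difference(index, other):
    """Return the labels of `index` that are not in `other`.

    Uses Index.difference when it exists and falls back to the older
    Index.diff spelling otherwise.
    """
    if hasattr(index, "difference"):
        return index.difference(other)
    return index.diff(other)


# Hypothetical use at the failing line in hlatyper.py:
# non_overshadowed = index_difference(covariance.columns, overshadowed)
```

Patching the source this way is a stopgap at best; moving to an OptiType checkout that matches the installed dependencies would likely be the cleaner route.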

I now have hdf5 1.10.0-patch1, but the error is still the same.
Any hint would help me a lot.

Hi Jane,

While this shouldn't happen in any OptiType version, you're not running the current one. Dependencies have changed since then, and that could have broken something. Can you try using the latest OptiType checkout from GitHub?

Hello Andras,
Sorry for the delayed answer.
On our cluster, we have Version: 1.0 as indicated in README.md.
We will reinstall it and I will keep you informed.
Thank you