FRED-2 / OptiType

Precision HLA typing from next-generation sequencing data

invalid literal for int() with base 10: 'c'

hartmaier opened this issue

This is my first time running OptiType, so forgive me if I missed something obvious. I am getting an odd parsing error right after the solver finishes. It also occurs when I run the test dataset.

Command:
python OptiTypePipeline.py -i ./sample_opti.1.fq ./sample_opti.2.fq --dna -c ./config.ini -v -o ./optitype_work/

Output:

...
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 24641

Solving LP relaxation...
GLPK Simplex Optimizer, v4.59
24641 rows, 13057 columns, 387578 non-zeros
      0: obj =  -0.000000000e+00 inf =   6.000e+00 (6)
      6: obj =  -5.000000000e-02 inf =   0.000e+00 (0)
*   500: obj =   3.181086340e+04 inf =   2.501e-14 (5968)
*  1000: obj =   3.630040867e+04 inf =   2.998e-15 (5832) 1
*  1500: obj =   3.858366231e+04 inf =   1.865e-14 (5697) 1
*  2000: obj =   4.039086154e+04 inf =   8.253e-15 (5604)
*  2500: obj =   4.164897250e+04 inf =   8.357e-15 (5524) 1
*  3000: obj =   4.272261500e+04 inf =   5.965e-15 (5427) 1
*  3500: obj =   4.343465417e+04 inf =   8.882e-15 (5356) 2
*  4000: obj =   4.420333917e+04 inf =   6.217e-15 (5259) 1
*  4500: obj =   4.513628286e+04 inf =   1.753e-14 (5178) 4
*  5000: obj =   4.557089286e+04 inf =   0.000e+00 (5148) 1
*  5500: obj =   4.586324125e+04 inf =   0.000e+00 (5136) 1
*  6000: obj =   4.677771714e+04 inf =   4.199e-14 (5013) 1
*  6500: obj =   4.804854833e+04 inf =   1.110e-16 (4827) 3
*  7000: obj =   4.926832833e+04 inf =   0.000e+00 (4595) 2
*  7500: obj =   4.933832833e+04 inf =   0.000e+00 (4564)
*  8000: obj =   4.940799500e+04 inf =   3.331e-15 (4515)
*  8500: obj =   4.947966167e+04 inf =   0.000e+00 (4492) 1
*  9000: obj =   4.955066167e+04 inf =   0.000e+00 (4462)
*  9500: obj =   5.095554333e+04 inf =   0.000e+00 (4141) 2
* 10000: obj =   5.227048333e+04 inf =   0.000e+00 (3768) 1
* 10500: obj =   5.347663333e+04 inf =   0.000e+00 (3333) 1
* 11000: obj =   5.448885333e+04 inf =   0.000e+00 (2891)
* 11500: obj =   5.521412000e+04 inf =   0.000e+00 (2429) 1
* 12000: obj =   5.594569333e+04 inf =   0.000e+00 (1965)
* 12500: obj =   5.638794833e+04 inf =   0.000e+00 (1485)
* 13000: obj =   5.676556000e+04 inf =   0.000e+00 (1004)
* 13500: obj =   5.715105500e+04 inf =   0.000e+00 (516)
* 14000: obj =   5.752945500e+04 inf =   0.000e+00 (34) 1
* 14037: obj =   5.755625833e+04 inf =   0.000e+00 (0)
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
+ 14037: mip =     not found yet <=              +inf        (1; 0)
+ 14038: >>>>>   5.755594000e+04 <=   5.755594000e+04   0.0% (2; 0)
+ 14038: mip =   5.755594000e+04 <=     tree is empty   0.0% (0; 3)
INTEGER OPTIMAL SOLUTION FOUND
Time used:   12.8 secs
Memory used: 75.9 Mb (79626336 bytes)
Writing MIP solution to '/var/folders/76/zt8rzbc5077bnxl8pvhpw6d4wvz933/T/tmpH4rygQ.glpk.raw'...
37709 lines were written
invalid literal for int() with base 10: 'c'
Traceback (most recent call last):
  File "~/GitHub/OptiType/OptiTypePipeline.py", line 373, in <module>
    result = op.solve(args.enumerate)
  File "~/GitHub/OptiType/model.py", line 149, in solve
    res = self.__solver.solve(self.__instance, options={}, tee=self.__verbosity)
  File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/base/solvers.py", line 578, in solve
    result = self._postsolve()
  File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 161, in _postsolve
    results = self.process_output(self._rc)
  File "~/anaconda2/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 220, in process_output
    self.process_soln_file(results)
  File "~/anaconda2/lib/python2.7/site-packages/pyomo/solvers/plugins/solvers/GLPK.py", line 445, in process_soln_file
    raise ValueError(msg)
ValueError: Error parsing solution data file, line 1

When I open up the MIP solution tmp file it has the format of:

c Problem:    
c Rows:       24642
c Columns:    13058
c Non-zeros:  387579
c Status:     INTEGER OPTIMAL
c Objective:  x13058 = 57555.94 (MAXimum)
c
s mip 24642 13058 o 57555.9400000032
i 1 2
i 2 2
i 3 2
i 4 2
...

My guess is that the parser is trying to read the leading 'c' on line 1 of this file as an integer, but I am not sure why.
Any ideas?
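To make that guess concrete, here is a minimal sketch in Python. It assumes the reader expects line 1 to start with an integer; that is inferred from the error message only, and `naive_first_int` and `read_summary` are hypothetical helpers, not Pyomo's actual code. The meaning of the fields on the `s` line is likewise an assumption based on the `c` header comments in the file.

```python
# A few lines copied from the solution file shown above.
SAMPLE = """\
c Problem:
c Rows:       24642
c Columns:    13058
c Status:     INTEGER OPTIMAL
c
s mip 24642 13058 o 57555.9400000032
i 1 2
i 2 2
"""


def naive_first_int(text):
    """Hypothetical reader that expects line 1 to begin with an integer."""
    first_line = text.splitlines()[0]
    return int(first_line.split()[0])


def read_summary(text):
    """Hypothetical tolerant reader: skip 'c' comment lines, parse the 's' line."""
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "c":
            continue  # comment or empty line
        if fields[0] == "s":
            # Assumed layout: s <kind> <rows> <columns> <status> <objective>
            return {
                "kind": fields[1],
                "rows": int(fields[2]),
                "columns": int(fields[3]),
                "status": fields[4],
                "objective": float(fields[5]),
            }
    return None


try:
    naive_first_int(SAMPLE)
    error = None
except ValueError as exc:
    error = str(exc)

summary = read_summary(SAMPLE)
print(error)
print(summary)
```

The first helper fails with the same message as in the traceback, while the second one gets past the comment lines. This only illustrates the mismatch; it is not a patch for the solver plugin.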

I think I figured it out. I just saw the other issue discussing the different versions of GLPK. I removed v4.59, installed v4.55, and was able to run the test data without errors. I am running my own data now and expect this to solve the problem there as well. Closing issue.
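For anyone who wants to check which GLPK they have before running the pipeline, here is a rough sketch. It assumes `glpsol` is on the PATH and prints a banner containing something like `v4.55` when called with `--version`. The helper names are made up for illustration, and the two version tuples reflect only what was observed in this thread (4.55 worked, 4.59 did not), not an exact cutoff.

```python
import re
import subprocess

# Versions reported in this thread only; the exact cutoff is not known here.
KNOWN_GOOD = (4, 55)
KNOWN_BAD = (4, 59)


def parse_glpk_version(banner):
    """Extract (major, minor) from text such as 'GLPK Simplex Optimizer, v4.59'."""
    match = re.search(r"v(\d+)\.(\d+)", banner)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2)))


def installed_glpk_version():
    """Ask glpsol for its version; returns None if glpsol cannot be run."""
    try:
        out = subprocess.check_output(["glpsol", "--version"])
    except (OSError, subprocess.CalledProcessError):
        return None
    return parse_glpk_version(out.decode("utf-8", "replace"))


# Banner line taken from the solver log in the first post.
version = parse_glpk_version("GLPK Simplex Optimizer, v4.59")
print(version, version == KNOWN_BAD)
```

Calling `installed_glpk_version()` and comparing the result against `KNOWN_GOOD` before starting a long run would at least flag the mismatch early.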