Error when running with GLPK solver
Hi,
When running OptiType against the test data, I get the error below. I think it traces back to GLPK, but I'm not sure. Can someone help me debug it?
Here's the shell script I used to run it:
`
#!/bin/bash
export SAMTOOLS=/Biomarker/ngs/software/samtools/samtools-1.2/bin
export GLPK=/Biomarker/ngs/software/glpk/glpk-4.59/bin
export PATH=$SAMTOOLS:$GLPK:$PATH
export HDF5_DIR=/Biomarker/ngs/software/HD5/hdf5-1.8.16-linux-centos7-x86_64-gcc483-shared
export LD_LIBRARY_PATH=/Biomarker/ngs/software/HD5/hdf5-1.8.16-linux-centos7-x86_64-gcc483-shared/lib
/Biomarker/ngs/software/bin/python OptiType-master/OptiTypePipeline.py -i OptiType-master/test/exome/NA11995_SRR766010_1_fished.fastq OptiType-master/test/exome/NA11995_SRR766010_2_fished.fastq --dna --verbose --config OptiType-master/config.ini -o OptiType-master/test/exome/
`
The head of the .raw file looks like this:
c Problem:
c Rows: 450
c Columns: 282
c Non-zeros: 1715
c Status: INTEGER OPTIMAL
c Objective: x282 = 1135.192 (MAXimum)
c
s mip 450 282 o 1135.192
i 1 1
i 2 2
i 3 2
i 4 1
i 5 1
i 6 1
i 7 1
i 8 2
i 9 2
i 10 1
ERROR (at the bottom):
0:00:01.08 Mapping NA11995_SRR766010_1_fished.fastq to GEN reference...
0:00:31.21 Mapping NA11995_SRR766010_2_fished.fastq to GEN reference...
0:00:57.64 Generating binary hit matrix.
0:00:57.66 Loading OptiType-master/test/exome/2016_03_23_16_57_45/2016_03_23_16_57_45_1.bam started. Number of HLA reads loaded (updated every thousand):
1K...
0:01:00.97 1909 reads loaded. Creating dataframe...
0:01:01.22 Dataframes created. Shape: 1909 x 11179, hits: 688669 (1249465), sparsity: 1 in 17.08
0:01:01.60 Loading OptiType-master/test/exome/2016_03_23_16_57_45/2016_03_23_16_57_45_2.bam started. Number of HLA reads loaded (updated every thousand):
1K...
0:01:04.73 1876 reads loaded. Creating dataframe...
0:01:04.92 Dataframes created. Shape: 1876 x 11179, hits: 657359 (1192811), sparsity: 1 in 17.58
0:01:05.67 Alignment pairing completed. 1681 paired, 359 unpaired, 32 discordant
0:01:11.14 temporary pruning of identical rows and columns
0:01:11.32 Size of mtx with unique rows and columns: (496, 776)
0:01:11.32 determining minimal set of non-overshadowed alleles
0:01:13.67 Keeping only the minimal number of required alleles (62,)
0:01:13.67 Creating compact model...
0:01:13.82 Initializing OptiType model...
GLPSOL: GLPK LP/MIP Solver, v4.59
Parameter(s) specified in the command line:
--write /tmp/tmpGZXIuT.glpk.raw --wglp /tmp/tmpmXCoNz.glpk.glp --cpxlp /tmp/tmpWPTOBn.pyomo.lp
Reading problem data from '/tmp/tmpWPTOBn.pyomo.lp'...
/tmp/tmpWPTOBn.pyomo.lp:3620: warning: lower bound of variable 'x1' redefined
/tmp/tmpWPTOBn.pyomo.lp:3620: warning: upper bound of variable 'x1' redefined
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
3791 lines were read
Writing problem data to '/tmp/tmpmXCoNz.glpk.glp'...
3276 lines were written
GLPK Integer Optimizer, v4.59
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
Preprocessing...
2 hidden packing inequaliti(es) were detected
95 hidden covering inequaliti(es) were detected
444 rows, 280 columns, 1705 non-zeros
170 integer variables, all of which are binary
Scaling...
A: min|aij| = 1.000e+00 max|aij| = 6.000e+00 ratio = 6.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 444
Solving LP relaxation...
GLPK Simplex Optimizer, v4.59
444 rows, 280 columns, 1705 non-zeros
0: obj = -0.000000000e+00 inf = 5.000e+00 (5)
5: obj = -3.000000000e-02 inf = 0.000e+00 (0)
- 241: obj = 1.135192000e+03 inf = 3.064e-14 (0)
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
- 241: mip = not found yet <= +inf (1; 0)
- 241: >>>>> 1.135192000e+03 <= 1.135192000e+03 0.0% (1; 0)
- 241: mip = 1.135192000e+03 <= tree is empty 0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
Time used: 0.0 secs
Memory used: 0.7 Mb (722870 bytes)
Writing MIP solution to '/tmp/tmpGZXIuT.glpk.raw'...
741 lines were written
invalid literal for int() with base 10: 'c'
WARNING: Solver does not support multi-threading. Please change the config file accordingly. Falling back to single-threading.
GLPSOL: GLPK LP/MIP Solver, v4.59
Parameter(s) specified in the command line:
--write /tmp/tmpz_UceC.glpk.raw --wglp /tmp/tmpW8xrDS.glpk.glp --cpxlp /tmp/tmphE7GB3.pyomo.lp
Reading problem data from '/tmp/tmphE7GB3.pyomo.lp'...
/tmp/tmphE7GB3.pyomo.lp:3620: warning: lower bound of variable 'x1' redefined
/tmp/tmphE7GB3.pyomo.lp:3620: warning: upper bound of variable 'x1' redefined
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
3791 lines were read
Writing problem data to '/tmp/tmpW8xrDS.glpk.glp'...
3276 lines were written
GLPK Integer Optimizer, v4.59
450 rows, 282 columns, 1715 non-zeros
171 integer variables, all of which are binary
Preprocessing...
2 hidden packing inequaliti(es) were detected
95 hidden covering inequaliti(es) were detected
444 rows, 280 columns, 1705 non-zeros
170 integer variables, all of which are binary
Scaling...
A: min|aij| = 1.000e+00 max|aij| = 6.000e+00 ratio = 6.000e+00
Problem data seem to be well scaled
Constructing initial basis...
Size of triangular part is 444
Solving LP relaxation...
GLPK Simplex Optimizer, v4.59
444 rows, 280 columns, 1705 non-zeros
0: obj = -0.000000000e+00 inf = 5.000e+00 (5)
5: obj = -3.000000000e-02 inf = 0.000e+00 (0)
- 241: obj = 1.135192000e+03 inf = 3.064e-14 (0)
OPTIMAL LP SOLUTION FOUND
Integer optimization begins...
- 241: mip = not found yet <= +inf (1; 0)
- 241: >>>>> 1.135192000e+03 <= 1.135192000e+03 0.0% (1; 0)
- 241: mip = 1.135192000e+03 <= tree is empty 0.0% (0; 1)
INTEGER OPTIMAL SOLUTION FOUND
Time used: 0.0 secs
Memory used: 0.7 Mb (722870 bytes)
Writing MIP solution to '/tmp/tmpz_UceC.glpk.raw'...
741 lines were written
invalid literal for int() with base 10: 'c'
Traceback (most recent call last):
File "OptiType-master/OptiTypePipeline.py", line 374, in <module>
result = op.solve(args.enumerate)
File "/Biomarker/ngs/software/OptiType/OptiType-master/model.py", line 150, in solve
res = self.__solver.solve(self.__instance, options={}, tee=self.__verbosity)
File "/Biomarker/ngs/software/python/latest/lib/python2.7/site-packages/pyomo/opt/base/solvers.py", line 578, in solve
result = self._postsolve()
File "/Biomarker/ngs/software/python/latest/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 161, in _postsolve
results = self.process_output(self._rc)
File "/Biomarker/ngs/software/python/latest/lib/python2.7/site-packages/pyomo/opt/solver/shellcmd.py", line 220, in process_output
self.process_soln_file(results)
File "/Biomarker/ngs/software/python/latest/lib/python2.7/site-packages/pyomo/solvers/plugins/solvers/GLPK.py", line 445, in process_soln_file
raise ValueError(msg)
ValueError: Error parsing solution data file, line 1
Hi,
Have you looked at issue #28? It seems that newer versions of GLPK cause problems with Pyomo. You might also try CBC as the solver (https://projects.coin-or.org/Cbc). CBC is also free and open source, and much faster than GLPK.
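To illustrate the mismatch, here is a minimal sketch of a reader for the line-tagged layout visible in the .raw head you posted. It is not Pyomo's actual parser. The names `parse_raw_head` and `SAMPLE` are hypothetical, and the meaning of each field on the `s` line (kind, rows, columns, status, objective) is an assumption inferred from the comment lines in your sample. The last lines show that calling `int()` on the first token of such a file gives the same message as in your log, which suggests the installed Pyomo plugin expects the file to start with integers.

```python
# First lines of the .raw file as posted in this issue.
SAMPLE = """c Problem:
c Rows: 450
c Columns: 282
c Non-zeros: 1715
c Status: INTEGER OPTIMAL
c Objective: x282 = 1135.192 (MAXimum)
c
s mip 450 282 o 1135.192
i 1 1
i 2 2
i 3 2
"""


def parse_raw_head(text):
    """Read the tagged records seen in the sample above.

    Assumed layout (inferred from the sample, not from GLPK docs):
      c ...                                  comment, ignored
      s <kind> <rows> <cols> <status> <obj>  summary line
      i <index> <value>                      one row record
    """
    summary = None
    rows = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] == "c":
            continue  # skip blank lines and comments
        if fields[0] == "s":
            summary = {
                "kind": fields[1],
                "rows": int(fields[2]),
                "cols": int(fields[3]),
                "status": fields[4],
                "objective": float(fields[5]),
            }
        elif fields[0] == "i":
            rows[int(fields[1])] = float(fields[2])
    return summary, rows


summary, rows = parse_raw_head(SAMPLE)
print(summary)
print(rows)

# A reader that expects the file to start with an integer fails
# on the leading "c" of the first comment line.
try:
    int(SAMPLE.split()[0])
except ValueError as err:
    print(err)
```

Skipping the `c` lines is all it takes to get past line 1, but the rest of the file would still need a parser written for this layout, so downgrading GLPK or switching solvers is the simpler route.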
Ugh. I forgot to search closed issues. Thanks. I'll downgrade for now and install CBC later once I've completed testing. Sorry for not seeing that post earlier.
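Since the script above prepends a GLPK directory to PATH, it may be worth confirming which `glpsol` actually gets picked up after the downgrade. Below is a rough Python 3 sketch. The helper names `parse_glpk_version` and `installed_glpk` are hypothetical, and it assumes `glpsol --version` prints the same banner that appears at the top of the solver log ("GLPSOL: GLPK LP/MIP Solver, v4.59").

```python
import re
import shutil
import subprocess

# Matches the banner seen in the log, e.g. "GLPSOL: GLPK LP/MIP Solver, v4.59".
BANNER_RE = re.compile(r"GLPK LP/MIP Solver,? v(\d+)\.(\d+)")


def parse_glpk_version(banner):
    """Return (major, minor) from a glpsol banner, or None if absent."""
    match = BANNER_RE.search(banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def installed_glpk():
    """Return (path, version) for the glpsol found on PATH.

    Assumes `glpsol --version` prints the banner parsed above.
    Returns (None, None) when no glpsol is on PATH.
    """
    path = shutil.which("glpsol")
    if path is None:
        return None, None
    result = subprocess.run(
        [path, "--version"],
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return path, parse_glpk_version(result.stdout)
```

Calling `installed_glpk()` from the same environment the pipeline runs in shows both the resolved binary and its version, so a stale 4.59 install earlier on PATH would be easy to spot.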