2.10.9 update crashes in CbcBaseModel::waitForThreadsInTree
jschueller opened this issue · comments
Since the Cbc 2.10.9 update, Cbc crashes when called via Bonmin on its simple C++ example (https://github.com/coin-or/Bonmin/tree/master/examples/CppExample)
******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
Ipopt is released as open source code under the Eclipse Public License (EPL).
For more information visit https://github.com/coin-or/Ipopt
******************************************************************************
NLP0012I
Num Status Obj It time Location
NLP0014I 1 OPT -2.618034 13 0.005785
NLP0014I 2 OPT -2.618034 9 0.003997
NLP0014I 3 OPT -2 10 0.004034
NLP0014I 4 OPT -1.7071068 8 0.003429
NLP0014I 5 OPT -2.5001414 20 0.006979
Cbc0010I After 0 nodes, 1 on tree, 1e+50 best solution, best possible -1.7976931e+308 (0.02 seconds)
NLP0014I 6 OPT -2.5001414 20 0.003155
NLP0014I 7 OPT -1.7071068 8 0.003414
NLP0014I 8 OPT -2.5001414 19 0.006966
Program received signal SIGSEGV, Segmentation fault.
CbcBaseModel::waitForThreadsInTree (this=0x5602947cca40, type=type@entry=2) at CbcThread.cpp:831
831 if (!baseModel->tree()->empty()) {
#0 CbcBaseModel::waitForThreadsInTree (this=0x5602947cca40, type=type@entry=2) at CbcThread.cpp:831
#1 0x00007fba92d30b15 in CbcModel::branchAndBound (this=0x7ffe141d0708, doStatistics=0) at CbcModel.cpp:5194
#2 0x00007fba9378f7ba in Bonmin::Bab::branchAndBound(Bonmin::BabSetupBase&) () from /usr/lib/libbonmin.so.4
#3 0x0000560294328a8b in main (argc=<optimized out>, argv=<optimized out>) at MyBonmin.cpp:80
Careful: the dependent packages vol, bcp and bonmin have to be recompiled after cbc is updated in order to see this exact error; otherwise it might crash elsewhere.
I used the latest releases of everything: cgl 0.60.7, clp 1.17.8, osi 0.108.8, coinutils 2.11.8, bonmin 1.8.9, vol 1.5.4, bcp 1.4.4, ipopt 3.14.12
It is fine if I revert to 2.10.8 (and rebuild vol/bcp/bonmin).
This is with gcc 12.2.1 on Arch Linux.
CXXFLAGS=
-march=x86-64 -mtune=generic -O2 -pipe -fno-plt -fexceptions \
-Wp,-D_FORTIFY_SOURCE=2 -Wformat -Werror=format-security \
-fstack-clash-protection -fcf-protection -g
LDFLAGS="-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now"
Could you say more about exactly how you built the projects? What were your configure options? For example, it looks like you built Cbc with --enable-cbc-parallel? I built Bonmin 1.8 with the tip of each dependent stable branch. For the Cbc stack and Bonmin itself, these are identical to the current latest releases. With that setup, everything worked fine on my machine. Can you try that also and see if it works? If so, that will give us a data point for narrowing down the issue.
coinbrew fetch Bonmin@1.8
coinbrew build Bonmin
I use the coin packages from archlinux:
https://archlinux.org/packages/?sort=&q=coin-or&maintainer=&flagged=
Indeed cbc has this option:
https://github.com/archlinux/svntogit-community/blob/packages/coin-or-cbc/trunk/PKGBUILD
Here are the other scripts:
https://github.com/archlinux/svntogit-community/blob/packages/coin-or-coinutils/trunk/PKGBUILD
https://github.com/archlinux/svntogit-community/blob/packages/coin-or-osi/trunk/PKGBUILD
https://github.com/archlinux/svntogit-community/blob/packages/coin-or-clp/trunk/PKGBUILD
https://github.com/archlinux/svntogit-community/blob/packages/coin-or-cgl/trunk/PKGBUILD
I tried running coinbrew but it failed to build mumps:
$ coinbrew build Bonmin@stable/1.8.9
...
##################################################
### Building ThirdParty/Mumps 1.6
##################################################
.../ThirdParty/Mumps/MUMPS/src/dmumps_comm_buffer.F:2667:23:
2658 | CALL MPI_PACK( WHAT, 1, MPI_INTEGER,
| 2
......
2667 | CALL MPI_PACK( LIST_SLAVES, NSLAVES, MPI_INTEGER,
| 1
Error: Rank mismatch between actual argument at (1) and actual argument at (2) (scalar and rank-1)
.../ThirdParty/Mumps/MUMPS/src/dmumps_comm_buffer.F:2670:23:
I will try to rebuild without --enable-cbc-parallel and bisect cbc next week.
For the Mumps issue, I guess adding ADD_FFLAGS=-fallow-argument-mismatch should fix the problem, per coin-or/coinbrew#47.
I now built the full stack with --enable-cbc-parallel and it still works fine for me.
It also happens with default compilation flags, so this is not caused by the Arch Linux compilation flags.
It seems to come from 1e6e301. What does this do, @jjhforrest?
1e6e3016003941406c5a63dd022c53d958111c35 is the first bad commit
commit 1e6e3016003941406c5a63dd022c53d958111c35
Author: John Forrest <jjhforrest@gmail.com>
Date: Fri May 27 17:37:23 2022 +0100
to allow root symmetry orbital
Cbc/src/CbcModel.cpp | 388 ++++++++++++-
Cbc/src/CbcModel.hpp | 16 +
Cbc/src/CbcNode.cpp | 107 ++++
Cbc/src/CbcSymmetry.cpp | 1377 +++++++++++++++++++++++++++++++++++++++--------
Cbc/src/CbcSymmetry.hpp | 82 ++-
5 files changed, 1720 insertions(+), 250 deletions(-)
bisect found first bad commit
The error disappears if I disable nauty.
@tkralphs, what distro / compiler do you use? Do you enable nauty?
I am unable to reproduce this error, with or without nauty. The symmetry breaking code is only entered if the user has set the option, and that is not set in the example.
what distro / compiler are you using ?
Bonmin 1.8.9 and gcc 11.3
Is that the version from Ubuntu?
I'm using gcc 12.2
Ubuntu
I think I figured it out: the private macro COIN_HAS_NTY is used in the public header CbcModel.hpp,
so 3rd party code doesn't see a class with the same size. I guess this causes the crash with gcc 12, an alignment issue or something.
If I just drop the COIN_HAS_NTY guard from the definition of the attributes, it fixes my crash locally.
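To illustrate the kind of mismatch described above, here is a minimal sketch. It is not the real CbcModel: the struct names, member names and the SymmetryInfo type are all hypothetical. It models the two views of one class that result when a header guards members with a macro that the library build defines but client code does not.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a symmetry helper type; not the real CbcSymmetry.
struct SymmetryInfo;

// Layout the library is compiled with (feature macro defined at build time).
struct ModelAsBuiltByLibrary {
    int numberNodes;
    SymmetryInfo *symmetryInfo; // guarded member: present in the library build
    int *rootSymmetryData;      // guarded member: present in the library build
    void *tree;
};

// Layout client code sees when the same header is parsed without the macro.
struct ModelAsSeenByClient {
    int numberNodes;
    void *tree;
};

constexpr std::size_t librarySize() { return sizeof(ModelAsBuiltByLibrary); }
constexpr std::size_t clientSize() { return sizeof(ModelAsSeenByClient); }
constexpr std::size_t libraryTreeOffset() { return offsetof(ModelAsBuiltByLibrary, tree); }
constexpr std::size_t clientTreeOffset() { return offsetof(ModelAsSeenByClient, tree); }

// The two views disagree on both the object size and where `tree` lives.
inline void checkLayoutsDisagree() {
    assert(librarySize() > clientSize());
    assert(libraryTreeOffset() > clientTreeOffset());
}
```

If the client allocates the object using the smaller layout and the library then reads or writes it using the larger one, the library touches memory outside the object, which is undefined behaviour. That would be consistent with a crash that only shows up with some compilers or builds, although the sketch does not prove that this is what happens in Cbc.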
Yes, that could be it. The use of COIN_HAS_NTY in public headers was indeed introduced with 1e6e301.
(On master, this is named CBC_HAS_NTY and the define is exported in CbcConfig.h, but not on this ancient stable branch.)
I proposed to drop it, but maybe it's best to define it in the config header as you mentioned.
The problem with the name COIN_HAS_NTY is that it is too generic. If someone still tries to build Couenne, then it could give a conflict there (probably just a compiler warning, though).
Let's drop it then, as in #593?
I missed that there were more than just bug fixes being introduced in this release. #593 looks to be the right fix. I will merge and make a new release unless @jjhforrest objects.
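For readers following along, the idea of dropping the guard from the header can be sketched as below. This is an illustration of the general pattern only, not the actual patch in #593; the class, its members and the macro HYPOTHETICAL_HAS_NTY are made up for the example.

```cpp
#include <cassert>

// Hypothetical stand-in for a symmetry helper type; not the real CbcSymmetry.
struct SymmetryInfo;

// Public header: members are declared unconditionally, so every translation
// unit sees the same layout whether or not HYPOTHETICAL_HAS_NTY is defined.
class Model {
public:
    bool symmetryAvailable() const;
    SymmetryInfo *symmetryInfo() const { return symmetryInfo_; }
    int numberNodes() const { return numberNodes_; }
    bool hasTree() const { return tree_ != nullptr; }

private:
    int numberNodes_ = 0;
    SymmetryInfo *symmetryInfo_ = nullptr; // always present, may stay null
    void *tree_ = nullptr;
};

// Implementation file: only the behaviour depends on the build-time macro.
bool Model::symmetryAvailable() const {
#ifdef HYPOTHETICAL_HAS_NTY
    return true;
#else
    return false;
#endif
}

// Without the macro, the member still exists but is simply left unused.
inline void checkDefaultModel() {
    Model m;
    assert(m.symmetryInfo() == nullptr);
    assert(m.numberNodes() == 0);
    assert(!m.hasTree());
}
```

The cost is a few unused bytes per object in builds without the feature; the benefit is that the object layout no longer depends on a macro that consumers of the header cannot see.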
go ahead - I will try and remember not to put anything interesting into stable - just fixes.
yes, boring stuff only please :]