tpoechtrager / cctools-port

Apple cctools port for Linux and *BSD

ld64 aborts with _GLIBCXX_ASSERTIONS enabled

Amavect opened this issue

I'm trying to build osxcross for Artix Linux.
The build worked, but when it got to testing the compilers, it failed:

testing x86_64h-apple-darwin20.4-clang ... /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/bits/stl_vector.h:1045: std::vector::reference std::vector<ld::Fixup, std::allocator<ld::Fixup>>::operator[](std::vector::size_type) [_Tp = ld::Fixup, _Alloc = std::allocator<ld::Fixup>]: Assertion '__n < this->size()' failed.
clang-11: error: unable to execute command: Aborted
clang-11: error: linker command failed due to signal (use -v to see invocation)
failed (ignored)
testing x86_64h-apple-darwin20.4-clang++ ... /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/bits/stl_vector.h:1045: std::vector::reference std::vector<ld::Fixup, std::allocator<ld::Fixup>>::operator[](std::vector::size_type) [_Tp = ld::Fixup, _Alloc = std::allocator<ld::Fixup>]: Assertion '__n < this->size()' failed.
clang-11: error: unable to execute command: Aborted
clang-11: error: linker command failed due to signal (use -v to see invocation)
failed (ignored)

Not a very helpful error message. Something to do with std::vector.

Here's the output with -v to see invocation.

$ ./../target/bin/o64-clang -O2 -Wall -o test -v test.c
clang version 11.1.0
Target: x86_64-apple-darwin20.4
Thread model: posix
InstalledDir: /usr/bin
 "/usr/bin/clang-11" -cc1 -triple x86_64-apple-macosx10.9.0 -Wundef-prefix=TARGET_OS_ -Werror=undef-prefix -Wdeprecated-objc-isa-usage -Werror=deprecated-objc-isa-usage -emit-obj -disable-free -disable-llvm-verifier -discard-value-names -main-file-name test.c -mrelocation-model pic -pic-level 2 -mframe-pointer=all -fno-rounding-math -munwind-tables -faligned-alloc-unavailable -target-sdk-version=11.3 -fcompatibility-qualified-id-block-type-checking -target-cpu core2 -debugger-tuning=lldb -target-linker-version 609 -v -resource-dir /usr/lib/clang/11.1.0 -isystem /usr/bin/../lib/clang/11.1.0/include -isysroot /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk -cxx-isystem /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/usr/include/c++/v1 -internal-isystem /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/usr/local/include -internal-isystem /usr/lib/clang/11.1.0/include -internal-externc-isystem /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/usr/include -O2 -Wno-liblto -Wall -fdebug-compilation-dir /home/bird/src/aur/osxcross-git/src/osxcross/oclang -ferror-limit 19 -stack-protector 1 -fblocks -fencode-extended-block-signature -fregister-global-dtors-with-atexit -fgnuc-version=4.2.1 -fmax-type-align=16 -fcolor-diagnostics -vectorize-loops -vectorize-slp -o /tmp/test-595334.o -x c test.c
clang -cc1 version 11.1.0 based upon LLVM 11.1.0 default target x86_64-pc-linux-gnu
ignoring nonexistent directory "/home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/usr/local/include"
ignoring nonexistent directory "/home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/Library/Frameworks"
ignoring duplicate directory "/usr/bin/../lib/clang/11.1.0/include"
#include "..." search starts here:
#include <...> search starts here:
 /usr/bin/../lib/clang/11.1.0/include
 /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/usr/include
 /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk/System/Library/Frameworks (framework directory)
End of search list.
 "/home/bird/src/aur/osxcross-git/src/osxcross/target/bin/x86_64-apple-darwin20.4-ld" -demangle -lto_library /usr/lib/libLTO.dylib -dynamic -arch x86_64 -platform_version macos 10.9.0 11.3 -syslibroot /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk -o test /tmp/test-595334.o -lSystem
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/bits/stl_vector.h:1045: std::vector::reference std::vector<ld::Fixup, std::allocator<ld::Fixup>>::operator[](std::vector::size_type) [_Tp = ld::Fixup, _Alloc = std::allocator<ld::Fixup>]: Assertion '__n < this->size()' failed.
clang-11: error: unable to execute command: Aborted
clang-11: error: linker command failed due to signal (use -v to see invocation)

The issue lies in the linker, ld64. It builds fine, but trips an assertion at run time.

Compiling with debug symbols and running GDB to get a stack trace:

(gdb) run -demangle -lto_library /usr/lib/libLTO.dylib -dynamic -arch x86_64 -platform_version macos 10.9.0 11.3 -syslibroot /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk -o test test.o -lSystem
Starting program: /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/x86_64-apple-darwin20.4-ld -demangle -lto_library /usr/lib/libLTO.dylib -dynamic -arch x86_64 -platform_version macos 10.9.0 11.3 -syslibroot /home/bird/src/aur/osxcross-git/src/osxcross/target/bin/../SDK/MacOSX11.3.sdk -o test test.o -lSystem
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/usr/lib/libthread_db.so.1".
[New Thread 0x7ffff7697640 (LWP 19455)]
[New Thread 0x7ffff6696640 (LWP 19456)]
/usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/bits/stl_vector.h:1045: std::vector::reference std::vector<ld::Fixup, std::allocator<ld::Fixup>>::operator[](std::vector::size_type) [_Tp = ld::Fixup, _Alloc = std::allocator<ld::Fixup>]: Assertion '__n < this->size()' failed.

Thread 1 "x86_64-apple-da" received signal SIGABRT, Aborted.
0x00007ffff7763ef5 in raise () from /usr/lib/libc.so.6
(gdb) bt
#0  0x00007ffff7763ef5 in raise () from /usr/lib/libc.so.6
#1  0x00007ffff774d862 in abort () from /usr/lib/libc.so.6
#2  0x00005555555cc088 in std::__replacement_assert (__file=<optimized out>, __line=<optimized out>, __function=<optimized out>, 
    __condition=<optimized out>)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/x86_64-pc-linux-gnu/bits/c++config.h:504
#3  0x00005555556f23a8 in std::vector<ld::Fixup, std::allocator<ld::Fixup> >::operator[] (this=<optimized out>, __n=<optimized out>)
    at /usr/bin/../lib64/gcc/x86_64-pc-linux-gnu/11.1.0/../../../../include/c++/11.1.0/bits/stl_vector.h:1045
#4  mach_o::relocatable::Atom<x86_64>::fixupsEnd (this=<optimized out>) at macho_relocatable_file.cpp:763
#5  0x000055555569c5ef in ld::tool::Resolver::convertReferencesToIndirect (this=0x7fffffffcb28, atom=...) at Resolver.cpp:848
#6  0x000055555569c1da in ld::tool::Resolver::doAtom (this=<optimized out>, atom=...) at Resolver.cpp:783
#7  0x00005555556ef06c in mach_o::relocatable::File<x86_64>::forEachAtom (this=0x7ffff0000ba0, handler=...) at macho_relocatable_file.cpp:4472
#8  0x00005555555c9b19 in ld::tool::InputFiles::forEachInitialAtom (this=0x7fffffffd7b8, handler=..., state=...) at InputFiles.cpp:1324
#9  0x000055555569f52a in ld::tool::Resolver::buildAtomList (this=0x7fffffffcb28) at Resolver.cpp:313
#10 ld::tool::Resolver::resolve (this=0x7fffffffcb28) at Resolver.cpp:2024
#11 0x00005555555d24d9 in main (argc=<optimized out>, argv=<optimized out>) at ld.cpp:1527

And finally we can see the issue.

The following code in mach_o::relocatable::Atom<x86_64>::fixupsEnd at macho_relocatable_file.cpp:763 has the issue:

	virtual ld::Fixup::iterator					fixupsEnd()	const	{ return &machofile()._fixups[_fixupsStartIndex+_fixupsCount]; }

Because C++ is annoying, this is not an array access. It calls the following in std::vector<ld::Fixup, std::allocator<ld::Fixup> >::operator[] at stl_vector.h:1045:

      reference
      operator[](size_type __n) _GLIBCXX_NOEXCEPT
      {
	__glibcxx_requires_subscript(__n);
	return *(this->_M_impl._M_start + __n);
      }

__glibcxx_requires_subscript is obviously a macro, defined in debug/assertions.h:

#ifndef _GLIBCXX_ASSERTIONS
# define __glibcxx_requires_non_empty_range(_First,_Last)
# define __glibcxx_requires_nonempty()
# define __glibcxx_requires_subscript(_N)
#else

// Verify that [_First, _Last) forms a non-empty iterator range.
# define __glibcxx_requires_non_empty_range(_First,_Last)	\
  __glibcxx_assert(_First != _Last)
# define __glibcxx_requires_subscript(_N)	\
  __glibcxx_assert(_N < this->size())
// Verify that the container is nonempty
# define __glibcxx_requires_nonempty()		\
  __glibcxx_assert(!this->empty())
#endif

And that is the root of the problem.

So, what's going on is that my system defines _GLIBCXX_ASSERTIONS by default.
This enables a bounds check when fixupsEnd() calls the [] operator.
That check fails because _fixupsStartIndex+_fixupsCount is one past the last valid index.

Of course, we only take the address of the result and never read through it, so it has never been a problem in practice.
However, the documentation says that out-of-range accesses with operator[] are undefined behavior.
A quick search gives this mailing-list discussion:
https://www.mail-archive.com/gcc-bugs@gcc.gnu.org/msg619891.html
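
To make the failing condition concrete, here is a minimal sketch of the index arithmetic. It is not ld64's actual code: Fixup, AtomRange, endIndex, and passesSubscriptCheck are hypothetical names made up for illustration, and only the n < size() condition comes from the libstdc++ macro quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ld::Fixup.
struct Fixup { int kind; };

// Hypothetical stand-in for an atom's (_fixupsStartIndex, _fixupsCount) pair.
struct AtomRange {
    std::size_t start;
    std::size_t count;
};

// Index that a fixupsEnd()-style accessor passes to operator[]:
// one past the atom's last fixup.
std::size_t endIndex(const std::vector<Fixup>& fixups, AtomRange r) {
    assert(r.start + r.count <= fixups.size());  // atom must lie inside the table
    return r.start + r.count;
}

// Mirrors the condition that __glibcxx_requires_subscript asserts.
bool passesSubscriptCheck(const std::vector<Fixup>& fixups, std::size_t n) {
    return n < fixups.size();
}
```

For a table of 5 fixups, an atom covering indices 3 and 4 has an end index of 5, which equals size() and so fails the check, while an atom covering indices 0 to 2 has an end index of 3 and passes. The atom whose fixups sit at the very end of the vector is the one that trips the assertion.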

I've found a workaround, namely replacing each &vec[n] with (vec.begin()+n).base() so it no longer asserts.
It feels very kludgy, though.
It is also a lot of work, because the &vec[n] idiom is used everywhere in ld64's code.
And some of the matches will be plain pointers, where ptr+n is all that's needed.
find . -type f | xargs grep '\&.*\[' | wc -l reports at least 1500 lines.

I'm not much of a C++ programmer, so before I try to manually replace each instance, what are your thoughts?
I would prefer to keep _GLIBCXX_ASSERTIONS enabled, since that is "correct".
Is there a better way than replacing with begin() and base()?
I'm willing to dig through and change everything.
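
One possible alternative to begin() and base(), sketched under assumptions and not tested against ld64: since C++11, std::vector::data() returns a pointer to the underlying array, and data() + n is plain pointer arithmetic that never goes through operator[], so the _GLIBCXX_ASSERTIONS subscript check is not involved. Forming the one-past-the-end pointer this way is valid as long as it is never dereferenced. It also avoids base(), which is a member of libstdc++'s iterator wrapper and not something the C++ standard guarantees. MiniFile and MiniAtom below are hypothetical stand-ins for ld64's classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for ld::Fixup.
struct Fixup { int kind; };

// Hypothetical stand-in for the object file that owns the fixup table.
struct MiniFile {
    std::vector<Fixup> fixups;
};

// Hypothetical stand-in for an atom that refers to a slice of that table.
struct MiniAtom {
    const MiniFile* file;
    std::size_t fixupsStartIndex;
    std::size_t fixupsCount;

    // Pointer to the atom's first fixup.
    const Fixup* fixupsBegin() const {
        return file->fixups.data() + fixupsStartIndex;
    }

    // Pointer one past the atom's last fixup; may equal data() + size().
    const Fixup* fixupsEnd() const {
        assert(fixupsStartIndex + fixupsCount <= file->fixups.size());
        return file->fixups.data() + fixupsStartIndex + fixupsCount;
    }
};
```

Where the operand is already a raw pointer, ptr + n is enough and nothing needs to change.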

Same on Arch when building from the cctools-git AUR package. I had to edit the PKGBUILD and add CFLAGS='-U_GLIBCXX_ASSERTIONS' and CXXFLAGS='-U_GLIBCXX_ASSERTIONS' to the configure line to get it to work properly (it would probably work without the CFLAGS change, leaving only CXXFLAGS modified).

Confirmed. With these envvars in place I managed to build osxcross on Arch. Thanks a lot!