tpoechtrager / cctools-port

Apple cctools port for Linux and *BSD

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

ld64 + thin-lto wrongfully merges static data with non-static data

glandium opened this issue · comments

Consider the following testcase:

$ cat <<EOF > a.c
static const char foo[] = "foo";

char get_foo(int i) {
    return foo[i];
}
EOF
$ cat <<EOF > b.c
const char foo[] = "bar";

char get_foo2(int i) {
  return foo[i];
}
EOF
$ /builds/worker/workspace/build/src/clang/bin/clang -target x86_64-apple-darwin11 -B /builds/worker/workspace/build/src/cctools/bin -isysroot /builds/worker/workspace/build/src/MacOSX10.11.sdk -flto=thin -o libfoo.dylib -shared -fPIC [ab].c
$ /builds/worker/workspace/build/src/clang/bin/llvm-objdump -s libfoo.dylib

libfoo.dylib:   file format Mach-O 64-bit x86-64

Contents of section __text:
 0f70 554889e5 488d052d 00000089 7dfc4863  UH..H..-....}.Hc
 0f80 4dfc0fbe 04085dc3 90909090 90909090  M.....].........
 0f90 554889e5 488d050d 00000089 7dfc4863  UH..H.......}.Hc
 0fa0 4dfc0fbe 04085dc3                    M.....].
Contents of section __const:
 0fa8 62617200                             bar.
Contents of section __unwind_info:
 0fac 01000000 1c000000 00000000 1c000000  ................
 0fbc 00000000 1c000000 02000000 700f0000  ............p...
 0fcc 34000000 34000000 a90f0000 00000000  4...4...........
 0fdc 34000000 03000000 0c000100 10000100  4...............
 0fec 00000000 00000001                    ........

See how there is no data section containing the "foo" string, because it was "merged" with the "bar" string.

Interestingly, this does not happen with full lto:

$ /builds/worker/workspace/build/src/clang/bin/clang -target x86_64-apple-darwin11 -B /builds/worker/workspace/build/src/cctools/bin -isysroot /builds/worker/workspace/build/src/MacOSX10.11.sdk -flto -o libfoo.dylib -shared -fPIC [ab].c
$ /builds/worker/workspace/build/src/clang/bin/llvm-objdump -s libfoo.dylib

libfoo.dylib:   file format Mach-O 64-bit x86-64

Contents of section __text:
 0f70 554889e5 488d0531 00000089 7dfc4863  UH..H..1....}.Hc
 0f80 4dfc0fbe 04085dc3 0f1f8400 00000000  M.....].........
 0f90 554889e5 488d050d 00000089 7dfc4863  UH..H.......}.Hc
 0fa0 4dfc0fbe 04085dc3                    M.....].
Contents of section __const:
 0fa8 62617200                             bar.
Contents of section __cstring:
 0fac 666f6f00                             foo.
Contents of section __unwind_info:
 0fb0 01000000 1c000000 00000000 1c000000  ................
 0fc0 00000000 1c000000 02000000 700f0000  ............p...
 0fd0 34000000 34000000 a90f0000 00000000  4...4...........
 0fe0 34000000 03000000 0c000100 10000100  4...............
 0ff0 00000000 00000001                    ........

Enabling the logging code in lto_file.cpp says:

bitcode file: /tmp/a-88db1a.o
	0x000019A0 _get_foo
bitcode file: /tmp/b-cbe283.o
	0x000019A0 _get_foo2
	0x00001980 _foo
AtomSyncer, mach-o atom 0x55643d5d3900 synced to lto atom 0x7f6bbc01f190 (name=_get_foo)
  adjusting fixups in atom: _get_foo
    updating direct reference to 0x55643d5d3950 to be ref to 0x7f6bb400e9f8: _foo
AtomSyncer, mach-o atom 0x55643d5d3950 synced to lto atom 0x7f6bb400e9f8 (name=_foo)
  adjusting fixups in atom: _foo
AtomSyncer, mach-o atom 0x55643d5d39a0 is totally new (name=CIE)
  adjusting fixups in atom: CIE
AtomSyncer, mach-o atom 0x55643d5c4800 synced to lto atom 0x7f6bb400e9c0 (name=_get_foo2)
  adjusting fixups in atom: _get_foo2
    adding by-name symbol _foo
AtomSyncer, mach-o atom 0x55643d5c4850 synced to lto atom 0x7f6bb400e9f8 (name=_foo)
  adjusting fixups in atom: _foo
AtomSyncer, mach-o atom 0x55643d5c48a0 is totally new (name=CIE)
  adjusting fixups in atom: CIE

With full lto:

bitcode file: /tmp/a-9cfb57.o
	0x000019A0 _get_foo
bitcode file: /tmp/b-aeee6f.o
	0x000019A0 _get_foo2
	0x00001980 _foo
AtomSyncer, mach-o atom 0x5648887a01c0 synced to lto atom 0x7fef1401f190 (name=_get_foo)
  adjusting fixups in atom: _get_foo
AtomSyncer, mach-o atom 0x5648887a0210 synced to lto atom 0x7fef0c00e9c0 (name=_get_foo2)
  adjusting fixups in atom: _get_foo2
    adding by-name symbol _foo
AtomSyncer, mach-o atom 0x5648887a0260 is totally new (name=_foo.2)
  adjusting fixups in atom: _foo.2
AtomSyncer, mach-o atom 0x5648887a02b0 synced to lto atom 0x7fef0c00e9f8 (name=_foo)
  adjusting fixups in atom: _foo
AtomSyncer, mach-o atom 0x5648887a0300 is totally new (name=CIE)
  adjusting fixups in atom: CIE

And making foo static in b.c:

bitcode file: /tmp/a-70bbe5.o
	0x000019A0 _get_foo
bitcode file: /tmp/b-4778ce.o
	0x000019A0 _get_foo2
AtomSyncer, mach-o atom 0x557507baceb0 synced to lto atom 0x7f3ca401f190 (name=_get_foo)
  adjusting fixups in atom: _get_foo
AtomSyncer, mach-o atom 0x557507bacf00 is totally new (name=_foo)
  adjusting fixups in atom: _foo
AtomSyncer, mach-o atom 0x557507bacf50 is totally new (name=CIE)
  adjusting fixups in atom: CIE
AtomSyncer, mach-o atom 0x557507b9f580 synced to lto atom 0x7f3c9c00e9c0 (name=_get_foo2)
  adjusting fixups in atom: _get_foo2
AtomSyncer, mach-o atom 0x557507b9f5d0 is totally new (name=_foo)
  adjusting fixups in atom: _foo
AtomSyncer, mach-o atom 0x557507b9f620 is totally new (name=CIE)
  adjusting fixups in atom: CIE

FYI, I'm testing a possible fix.

FYI, I'm testing a possible fix.

This didn't work out. Some interesting fact: it's reproducible with Xcode 8, but not Xcode 9 (ld64-351.8, unfortunately not on https://opensource.apple.com/source/ld64/). I tried contacting ld64's listed author.

I tried contacting ld64's listed author.

Thank you. I can't do anything in this case. :(

I am not sure if Apple will continue releasing the source code of LD64. It's been a long time since the last release (> 1.5 years, I think).