root-project / cling

The cling C++ interpreter

Invalid value in JIT land (leads to hard crash)

jeaye opened this issue

  • Checked for duplicates

Cling singleton bug

This appears to be a bug related to static globals and an odd combination of unique_ptr<T> and T* fields. The result is that non-JIT code sees one value for the T* field, while JIT code sees another. All other fields show the same value.

I have created a standalone repo for this issue here, with a very minimal test case: https://github.com/jeaye/cling-singleton-bug

Building and running

$ mkdir build ; cd build
$ cmake .. -DCling_DIR=/path/to/cling-build/builddir
$ make
$ ./cling-demo
before: 0
cling: (int *) 0x1 <invalid memory address>
after: 0

As shown in the output above, this demo prints the same field from non-JIT code and from JIT code, and the two values don't match: non-JIT code sees 0, while Cling sees 0x1. However, if the unique_ptr field before the T* field is a T* or even a shared_ptr<T>, this issue doesn't manifest.
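For readers who don't want to open the repository, the shape of the test case can be sketched roughly as follows. Only the two fields of wrapper are taken from the diff further down in this report; the singleton accessor, offset_of_b, and report are hypothetical names made up for illustration, and the real repo may be structured differently. The sketch only shows what compiled (non-JIT) code sees.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <memory>

// Field layout taken from the diff in this report:
// a unique_ptr<T> field followed by a raw T* field.
template <typename T>
struct wrapper
{
  std::unique_ptr<T> a;
  T *b{};
};

// Hypothetical singleton accessor (an assumption, not the repo's code).
template <typename C>
C &singleton()
{
  static C instance;
  return instance;
}

// Byte offset of b inside a live wrapper<T>, computed without offsetof.
template <typename T>
std::size_t offset_of_b(wrapper<T> const &w)
{
  auto const base = reinterpret_cast<char const *>(&w);
  auto const field = reinterpret_cast<char const *>(&w.b);
  return static_cast<std::size_t>(field - base);
}

// Prints what compiled code sees for the singleton's b field and layout.
inline void report(char const *label)
{
  auto &w = singleton<wrapper<int>>();
  // The accessor must hand back the same object on every call.
  assert(&w == &singleton<wrapper<int>>());
  std::cout << label << ": b=" << static_cast<void const *>(w.b)
            << " sizeof=" << sizeof(w)
            << " offset_of_b=" << offset_of_b(w) << "\n";
}
```

Printing the size and offset alongside the pointer is a deliberate choice: if the same header is also evaluated inside the interpreter, any difference in those numbers would point at the two sides disagreeing about the struct's layout rather than about the value stored in it.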

Environment

I have reproduced this in the following configurations:

  • Arch Linux, GCC 12.2.0, Cling stable debug build (from cpt)
  • Arch Linux, Clang 14.0.6, Cling stable debug build (from cpt)
  • Arch Linux, Clang 14.0.6, Cling llvm-13 debug build (from cpt)
  • Ubuntu 22.04, Clang 14, Cling stable release build (from cpt)

Curiously, I have not been able to reproduce it on:

  • macOS 12, AppleClang 12, Cling stable release build (from cpt)

Bug

I suspect that this is a bug, since I do not believe this to be UB and I haven't seen any Cling documentation stating this should not work. Also, the oddities around unique_ptr are perplexing. I'm hoping you folks can reproduce and have a better idea of what's going on.

Further info

I've also noted that, by adding a custom default value to b, I can really mix things up. For example, given this patch:

diff --git a/include/singleton.hpp b/include/singleton.hpp
index 8dc4197..4f7f750 100644
--- a/include/singleton.hpp
+++ b/include/singleton.hpp
@@ -9,7 +9,7 @@ struct wrapper
    * doesn't exist. Also, if it's shared_ptr, it doesn't exist. */
   std::unique_ptr<T> a;
   /* This is 0x0 outside of cling and 0x1 inside of Cling. */
-  T *b{};
+  T *b{ reinterpret_cast<T*>(0x2) };
 };
 
 template <typename C>

The output becomes this:

$ ./cling-demo 
before: 0x2
cling: (int *) 0x7fc7a48a53a0
after: 0x2

Hi! Any chance of getting this looked at, to see if my test case is reproducible for you and go from there? 🙂

Ok, following up on this with some additional information. I have done some more testing and found some interesting results.

  1. If I use https://github.com/Axel-Naumann/cling-all-in-one to build Cling, rather than my own cpt usage, the same problem exists
  2. If I use the precompiled Cling from Nix, everything works fine

So, that means I've tried building the following on Linux:

  1. Stable cling via cpt
  2. LLVM13 cling via cpt
  3. Stable cling via cling-all-in-one (tried this on Arch Linux with GCC 12.2.0)

All of them lead to this issue, but prebuilt binaries from Nix do not. I haven't been able to get Cling from anywhere else (like AUR) properly compiling and set up with cmake, so I'm not sure about how those would work yet.

Is there a chance to compare the compile flags for the working and non-working versions?

Nix is quite opaque in the way it builds things, so that would be tough. It looks like the Nix expression is here: https://github.com/NixOS/nixpkgs/blob/master/pkgs/development/interpreters/cling/default.nix

If the test case is working for you on Linux, would you mind sharing your distro and build details so I can try to set that up locally?

I figured the cling-all-in-one repo would solve everything, since that's maintained by Axel Naumann. Unfortunately not. Do you guys maintain any prebuilts I could try to see how they work? Or docker images for using Cling which work for you guys?

I see that Nix has version 0.7. Is this the version of the package? Did the rest have the same version?

Hmmm, interesting. Everything I've built (cling-all-in-one and cling with cpt) has been Cling master (with LLVM 9), aside from the LLVM 13 branch. I will try building 0.7 locally.

I have built 0.7 (with LLVM 5) locally, using cpt. It required some small patches to the code to compile, but you got it, Vassil 🙂. The issue does not reproduce on 0.7.

Naturally, I built 0.8 next (also LLVM 5). It also works correctly, given my test case.

Finally, I built the 0.9 tag (with LLVM 9) and verified the problem shows up! So this is something between Cling 0.8 and 0.9 for me. I would hope that, if you try with 0.9 or with master, you can also reproduce it.

For all of these, I used cpt to build:

$ ./cpt.py --create-dev-env=Debug \
           --with-workdir=./cling-build/ \
           --allow-dirty \
           --with-cling-branch=v0.7 \
           "$@"

I had to hack in the LLVM version used for each branch by checking the LastKnownGoodLLVMSVNRevision.txt for that tag, though, since cpt doesn't handle it properly.

I've fixed this, Vassil. The issue was on my end all along. I was building jank with my system Clang and, even though I aimed to match the versions, apparently there were differences. If I instead build Cling first, and then use Cling's Clang to build jank, everything lines up better and this issue goes away.

I've updated my build system and docs to require building Cling first and then building jank with that very specific Clang.
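One way to catch this kind of mismatch early is to have each side describe the toolchain it was built with and compare the results. The following is only a sketch under assumptions: toolchain_fingerprint and require_same_toolchain are hypothetical helpers, not part of Cling or jank, and matching fingerprints do not prove the two compilers are ABI-compatible. It relies only on the standard predefined compiler and standard library version macros.

```cpp
#include <cassert>
#include <memory>
#include <sstream>
#include <string>

// Describes the compiler and standard library that built this translation
// unit, plus one layout-sensitive size as a cheap extra check.
inline std::string toolchain_fingerprint()
{
  std::ostringstream out;
#if defined(__clang__)
  out << "clang " << __clang_major__ << "." << __clang_minor__ << "."
      << __clang_patchlevel__;
#elif defined(__GNUC__)
  out << "gcc " << __GNUC__ << "." << __GNUC_MINOR__ << "."
      << __GNUC_PATCHLEVEL__;
#else
  out << "unknown compiler";
#endif
#if defined(__GLIBCXX__)
  out << " libstdc++ " << __GLIBCXX__;
#elif defined(_LIBCPP_VERSION)
  out << " libc++ " << _LIBCPP_VERSION;
#endif
  out << " sizeof(unique_ptr<int>)=" << sizeof(std::unique_ptr<int>);
  return out.str();
}

// Hypothetical debug-time guard: compares the host's fingerprint with one
// obtained from the other side (for example, produced by JIT-evaluated code).
inline bool require_same_toolchain(std::string const &host,
                                   std::string const &other)
{
  bool const same = (host == other);
  assert(same && "host and JIT were built with different toolchains");
  return same;
}
```

The idea would be to compile toolchain_fingerprint into the host binary, evaluate the same function through the interpreter, and compare the two strings at startup. A system Clang and Cling's bundled Clang would then show up as different fingerprints even when their major versions look the same.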

It seems the fact that this worked on 0.8 and not 0.9 is just coincidental. Thanks so much for your support on this.

Glad to hear it is resolved!