[BUG]: Build gets stuck at 99% on NixOS
D3vil0p3r opened this issue
Describe the bug
The KLEE build gets stuck at 99%.
KLEE version: 3.0.
Here is how it is built in NixOS.
Expected behavior
Building should continue after 99% without hanging.
Additional context
The output stucks at:
89% [====================================================-------(B](B ETA: 00:00:01
90% [=====================================================------(B](B ETA: 00:00:01
90% [=====================================================------(B](B ETA: 00:00:01
90% [=====================================================------(B](B ETA: 00:00:01
90% [=====================================================------(B](B ETA: 00:00:01
90% [=====================================================------(B](B ETA: 00:00:01
91% [=====================================================------(B](B ETA: 00:00:01
91% [=====================================================------(B](B ETA: 00:00:01
91% [======================================================-----(B](B ETA: 00:00:01
91% [======================================================-----(B](B ETA: 00:00:01
92% [======================================================-----(B](B ETA: 00:00:01
92% [======================================================-----(B](B ETA: 00:00:01
92% [======================================================-----(B](B ETA: 00:00:01
92% [======================================================-----(B](B ETA: 00:00:01
93% [======================================================-----(B](B ETA: 00:00:01
93% [=======================================================----(B](B ETA: 00:00:01
93% [=======================================================----(B](B ETA: 00:00:01
93% [=======================================================----(B](B ETA: 00:00:01
94% [=======================================================----(B](B ETA: 00:00:01
94% [=======================================================----(B](B ETA: 00:00:01
94% [=======================================================----(B](B ETA: 00:00:00
94% [=======================================================----(B](B ETA: 00:00:00
95% [========================================================---(B](B ETA: 00:00:00
95% [========================================================---(B](B ETA: 00:00:00
95% [========================================================---(B](B ETA: 00:00:00
95% [========================================================---(B](B ETA: 00:00:00
96% [========================================================---(B](B ETA: 00:00:00
96% [========================================================---(B](B ETA: 00:00:00
96% [========================================================---(B](B ETA: 00:00:00
96% [=========================================================--(B](B ETA: 00:00:00
97% [=========================================================--(B](B ETA: 00:00:00
97% [=========================================================--(B](B ETA: 00:00:00
97% [=========================================================--(B](B ETA: 00:00:00
97% [=========================================================--(B](B ETA: 00:00:00
98% [=========================================================--(B](B ETA: 00:00:00
98% [=========================================================--(B](B ETA: 00:00:00
98% [==========================================================-(B](B ETA: 00:00:00
98% [==========================================================-(B](B ETA: 00:00:00
99% [==========================================================-(B](B ETA: 00:00:00
99% [==========================================================-(B](B ETA: 00:00:00
99% [==========================================================-(B](B ETA: 00:00:00
Thanks @D3vil0p3r for opening an issue. I'm not using NixOS and I'm not familiar with its details, but it looks like the build is stuck running the test suite.
Still, it should time out eventually and report a trace of the error. Can you use some kind of debug mode there?
And which LLVM version is used?
The LLVM version is 16.0.6. Is there a debug mode option I can add to the build command when building KLEE?
Well, that might explain it already. Currently, you cannot build KLEE 3.0 against LLVM 16. Appropriate support should be added soon, hopefully. As for debugging, I guess it's more about using the debug mode of NixOS, but I don't know how it is enabled.
Which LLVM version is currently compatible with KLEE? And is there an ETA for making it compatible with the latest LLVM version?
For the released KLEE 3.0, we support LLVM 14. As soon as #1664 is merged, the main branch will support up to LLVM 16. It's just a matter of days.
Thank you very much. I will keep this ticket open, as well as the one on the NixOS repository, so that I can test the new KLEE version in the next few days.
We can probably do a patch to get NixOS' KLEE building against the latest LLVM.
#1664 is now merged.
@MartinNowack @ccadar when will the next stable release of KLEE containing that change be released?
We will release a new version of KLEE this month.
That works for us. Looks like we'll have to update uclibc as well.
@numinit @D3vil0p3r KLEE 3.1 has just been tagged.
When updating, please note that klee-uclibc is distinct from uclibc.
Thank you for your packaging efforts!
I made the KLEE nix package into a flake, and it seems the issue still occurs with KLEE 3.1. Two tests hang seemingly deterministically, and so far I have no idea why:
PID USER PRI NI VIRT RES SHR S CPU%▽MEM% TIME+ Command
1758077 nixbld01 20 0 1172G 104M 62296 R 100.0 0.1 7:53.50 /build/source/build/bin/klee --output-dir=/build/source/build/test/Runtime/Uclibc/Output/strcpy_chk.c.tmp.klee-out --libc=uclibc --exit-on-error /build/source/build/test/Runtime/Uclibc/Output/strcpy_chk.c.tmp2.bc
1758311 nixbld01 20 0 1172G 71040 53248 R 100.0 0.1 7:53.31 /build/source/build/bin/klee --output-dir=/build/source/build/test/Runtime/klee-libc/Output/strcat_chk.c.tmp.klee-out --libc=klee --exit-on-error /build/source/build/test/Runtime/klee-libc/Output/strcat_chk.c.tmp2.bc
In total, I got some 37 test failures, including the 2 hanging cases that have to be terminated manually. This is a log file of this build attempt.
Nine of those issues are due to lit not forwarding PYTHONPATH. I added forwarding for PYTHONPATH and reran the build, which got me down to 28 test failures:
Failed Tests (28):
KLEE :: Feature/KleeStatsTermClasses.c
KLEE :: Feature/ubsan/ubsan_alignment-assumption.c
KLEE :: Feature/ubsan/ubsan_alignment-assumption_with_offset.c
KLEE :: Feature/ubsan/ubsan_alignment-type-mismatch.c
KLEE :: Feature/ubsan/ubsan_array_bounds.c
KLEE :: Feature/ubsan/ubsan_builtin.c
KLEE :: Feature/ubsan/ubsan_float_cast_overflow.c
KLEE :: Feature/ubsan/ubsan_float_divide_by_zero.c
KLEE :: Feature/ubsan/ubsan_implicit_integer_sign_change.c
KLEE :: Feature/ubsan/ubsan_implicit_signed_integer_truncation.c
KLEE :: Feature/ubsan/ubsan_implicit_unsigned_integer_truncation.c
KLEE :: Feature/ubsan/ubsan_nonnull_attribute.c
KLEE :: Feature/ubsan/ubsan_null.c
KLEE :: Feature/ubsan/ubsan_nullability_arg.c
KLEE :: Feature/ubsan/ubsan_nullability_assign.c
KLEE :: Feature/ubsan/ubsan_nullability_return.c
KLEE :: Feature/ubsan/ubsan_pointer_overflow-applying_nonzero_offset_to_nonnull_pointer.c
KLEE :: Feature/ubsan/ubsan_pointer_overflow-applying_nonzero_offset_to_null_pointer.c
KLEE :: Feature/ubsan/ubsan_pointer_overflow-applying_zero_offset_to_null_pointer.c
KLEE :: Feature/ubsan/ubsan_pointer_overflow-pointer_arithmetic.c
KLEE :: Feature/ubsan/ubsan_return.cpp
KLEE :: Feature/ubsan/ubsan_returns_nonnull_attribute.c
KLEE :: Feature/ubsan/ubsan_signed_integer_overflow.c
KLEE :: Feature/ubsan/ubsan_unreachable.c
KLEE :: Feature/ubsan/ubsan_unsigned_integer_overflow.c
KLEE :: Feature/ubsan/ubsan_vla_bound.c
KLEE :: Runtime/Uclibc/strcpy_chk.c
KLEE :: Runtime/klee-libc/strcat_chk.c
There are a bunch of "KLEE: WARNING: unimplemented intrinsic: llvm.load.relative.i64" messages, which may be related to the "KLEE: ERROR: runtime/Sanitizer/ubsan/ubsan_handlers.cpp:26: unimplemented intrinsic" coming right after in those examples.
The KleeStatsTermClasses.c failure may be unrelated to the other issues.
Just compiling one of the tests (I used Feature/ubsan/ubsan_alignment-assumption.c) with a nix clang does not introduce llvm.load.relative.
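To make the clustering in the failure list easier to see, here is a minimal Python sketch that counts failed tests per directory from lit's "KLEE :: path" lines. The function name group_failures and the abbreviated sample are illustrative only, not part of KLEE or lit; the sample is a short excerpt of the list above, not the full log.

```python
from collections import Counter


def group_failures(log_text: str) -> Counter:
    """Count failed tests per directory from lines of the form 'KLEE :: path'."""
    prefix = "KLEE :: "
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        if not line.startswith(prefix):
            continue  # skip headers and unrelated output
        path = line[len(prefix):]
        # Everything before the last '/' is treated as the test's directory.
        directory = path.rsplit("/", 1)[0] if "/" in path else "."
        counts[directory] += 1
    return counts


# Abbreviated excerpt of the failure list (hypothetical selection of five entries).
sample = """\
KLEE :: Feature/KleeStatsTermClasses.c
KLEE :: Feature/ubsan/ubsan_null.c
KLEE :: Feature/ubsan/ubsan_return.cpp
KLEE :: Runtime/Uclibc/strcpy_chk.c
KLEE :: Runtime/klee-libc/strcat_chk.c
"""

for directory, n in group_failures(sample).most_common():
    print(f"{n:3d}  {directory}")
```

Run against the full list, a grouping like this shows at a glance that nearly all failures sit under Feature/ubsan, which points toward a common cause in the UBSan runtime.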
Further investigation shows that our UBSan runtime looks completely different from what I built without nix. I have tracked down all remaining issues to the "fortify" hardening, which enables -O2 behind our backs. This causes (at least) the UBSan runtime to be miscompiled, which causes all other issues, including the hangs originally reported here.
Disabling this hardening in the derivation of my test flake and passing PYTHONPATH through makes all tests pass.
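The PYTHONPATH forwarding mentioned above might look roughly like the following sketch. It assumes that the lit configuration exposes the test environment as a dictionary (commonly config.environment in lit config files); a plain dict stands in for it here, and the helper name forward_env and the path are hypothetical. This is not necessarily how the flake or KLEE's own lit configuration does it.

```python
import os


def forward_env(target_env, names, source_env=os.environ):
    """Copy the selected variables from source_env into target_env, if they are set."""
    for name in names:
        if name in source_env:
            target_env[name] = source_env[name]
    return target_env


# Stand-in for the environment lit hands to tests; in a lit config this
# would be the mapping that the config object provides (an assumption here).
test_env = forward_env(
    {},
    ["PYTHONPATH"],
    {"PYTHONPATH": "/hypothetical/site-packages", "HOME": "/homeless-shelter"},
)
print(test_env)
```

Only the variables named explicitly are copied, so the test environment stays otherwise isolated from the build environment.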
The tabulate issues can be resolved better by ensuring that the "right" python is first in PATH, such that patchShebangs picks up a version of the python interpreter that has access to tabulate without requiring environment variables.
That also ensures that klee-stats can find tabulate in the final environment (e.g., nix shell github:danielschemmel/nix-klee). To check your final package, just run klee-stats --help and see if the list of supported table formats is empty (bad) or contains a list like {klee,csv,readable-csv,simple,plain,...} (good).
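As a complementary check, one can ask a given python interpreter directly whether it can see tabulate. The sketch below uses only the standard library; the helper name has_module is illustrative. It has to be run with the same interpreter that the patched klee-stats shebang points to, otherwise the result says nothing about the package.

```python
import importlib.util
import sys


def has_module(name: str) -> bool:
    """Return True if this interpreter can locate the named top-level module."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    if has_module("tabulate"):
        print("tabulate is available")
    else:
        print("tabulate is missing")
```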
Wow, thanks for the detective work @danielschemmel! If you have time, do you mind submitting a PR and adding me to the reviewers?
Should be fixed in nixpkgs.
Thanks, @numinit, let's close this then.
Thank you guys
No worries! Appreciate the work on the project.