Tests crashing on Ubuntu 20.04
mfikes opened this issue
If you update to Ubuntu 20.04 and run the unit tests, you get a failure:
script/test: line 9: 26808 Segmentation fault (core dumped) script/test-unit
This happens with both fast and regular builds.
These crashes don't occur on Ubuntu 18.04.
To get more info, export ASAN_OPTIONS=detect_leaks=0, then revise planck-c/CMakeLists.txt to set(CMAKE_BUILD_TYPE Debug) and uncomment the CMAKE_C_FLAGS setting line with -fsanitize=address.
With this:
script/test
Running unit tests...
AddressSanitizer:DEADLYSIGNAL
=================================================================
==485996==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000005 (pc 0x7fb83c7913a0 bp 0x62d000043b00 sp 0x7ffe9add4ea0 T0)
==485996==The signal is caused by a READ memory access.
==485996==Hint: address points to the zero page.
#0 0x7fb83c79139f in JSC::JSFunction::getOwnPropertySlot(JSC::JSObject*, JSC::JSGlobalObject*, JSC::PropertyName, JSC::PropertySlot&) (/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18+0x112339f)
#1 0x7fb83c47e379 (/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18+0xe10379)
#2 0x7fb7f4a63a5c (<unknown module>)
AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV (/lib/x86_64-linux-gnu/libjavascriptcoregtk-4.0.so.18+0x112339f) in JSC::JSFunction::getOwnPropertySlot(JSC::JSObject*, JSC::JSGlobalObject*, JSC::PropertyName, JSC::PropertySlot&)
==485996==ABORTING
Reproducing with Docker
Ok, I know you have vagrant stuff setup, but I'm guessing it might be a little out of date? Maybe? Maybe not, what do I know? But I think docker might be a more CI-friendly way to go (this might be reusable work).
So to reproduce locally, I whipped up the following Dockerfile:
FROM ubuntu:20.04
ARG CLOJURE_CLI_VERSION=1.11.1.1105
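# Set timezone to avoid interactive prompts when installing packages.
# CONTAINER_TIMEZONE was never defined in the original; Etc/UTC is an assumed default.
ARG CONTAINER_TIMEZONE=Etc/UTC
RUN ln -snf /usr/share/zoneinfo/$CONTAINER_TIMEZONE /etc/localtime && echo $CONTAINER_TIMEZONE > /etc/timezone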
RUN apt-get -y update
RUN apt-get install -y \
cmake xxd git curl libjavascriptcoregtk-4.0 libglib2.0-dev libzip-dev libcurl4-gnutls-dev libicu-dev unzip
RUN apt install -y build-essential apt-utils openjdk-8-jdk
RUN curl -OL "https://download.clojure.org/install/linux-install-${CLOJURE_CLI_VERSION}.sh" \
&& chmod +x linux-install-${CLOJURE_CLI_VERSION}.sh \
&& /linux-install-${CLOJURE_CLI_VERSION}.sh \
&& rm /linux-install-${CLOJURE_CLI_VERSION}.sh
And then from the same dir, built the docker image via:
docker build -t lread:planck-test .
And ran the docker image via:
docker run -it lread:planck-test
Planck master against Ubuntu 20.04.4 seg faults
And then ran tests for planck master:
$ git clone https://github.com/planck-repl/planck.git
$ cd planck
$ script/build -Werror --fast
$ ./planck-c/build/planck --version
2.26.0
$ script/test
And I see a seg fault:
script/test: line 9: 24095 Segmentation fault script/test-unit
Planck 2.25.0 against Ubuntu 20.04.4 seg faults
Kill docker session and restart session:
$ git clone https://github.com/planck-repl/planck.git
$ cd planck
$ git reset 2.25.0 --hard
HEAD is now at 4b61b2a 2.25.0
$ script/build -Werror --fast
$ ./planck-c/build/planck --version
2.25.0
$ script/test
And again we see a seg fault:
script/test: line 9: 23525 Segmentation fault script/test-unit
Next Up
I'll continue to poke around.
Feel free to point me in a direction, if you think of one.
@lread Also, it is possible to attach gdb to the Planck process, and, while a little tricky since this involves tests provoked by a unit test, if things crash while gdb is attached then of course things are easy to see.
Thanks @mfikes, I have foggy memories of gdb from a previous life. 🙂
More data points (this is probably known to you, but I wanted to witness it):
Both 2.25.0 and master succeed for me on Ubuntu 18.04.6.
On a whim, I decided to try running jsc, WebKit's CLI JavaScript interpreter, on Ubuntu 20.04.
It wasn't a rousing success.
$ jsc
Segmentation fault
Oh. Same deal for Ubuntu 18.04.
Oh. Thought I was onto something here with these planck build warnings from Ubuntu 22:
In function 'maybe_load_user_file',
inlined from 'maybe_load_user_file' at /planck/planck-c/engine.c:414:6:
/planck/planck-c/engine.c:418:9: error: 'JSObjectCallAsFunction' reading 8 bytes from a region of size 0 [-Werror=stringop-overread]
418 | JSObjectCallAsFunction(ctx, get_function("planck.repl", "maybe-load-user-file"), JSContextGetGlobalObject(ctx),
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
419 | 0, arguments, &ex);
| ~~~~~~~~~~~~~~~~~~
/planck/planck-c/engine.c: In function 'maybe_load_user_file':
/planck/planck-c/engine.c:418:9: note: referencing argument 5 of type 'const struct OpaqueJSValue * const*'
And this note in the JavaScriptCore docs that arguments should be NULL when argumentCount is 0:
arguments: A JSValueRef array of arguments to pass to the function. Pass NULL if argumentCount is 0.
And things do pass after correcting this on Ubuntu 22, but still seg fault on Ubuntu 20.
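For illustration, the documented rule can be folded into a tiny helper. This is only a minimal sketch and not the actual change made in Planck: args_or_null is a hypothetical name, and the JSValueRef typedef is a stand-in that mirrors the type shown in the compiler note above so the snippet builds without the JavaScriptCore headers.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in so the sketch compiles without JavaScriptCore installed
   (assumption for illustration; the real headers declare this type). */
typedef const struct OpaqueJSValue *JSValueRef;

/* Hypothetical helper: picks the pointer to pass as the `arguments`
   parameter. The docs quoted above say to pass NULL when argumentCount
   is 0, so an unused array is never handed over. */
const JSValueRef *args_or_null(size_t argument_count,
                               const JSValueRef *arguments) {
    /* A non-zero count with no array would be a caller bug. */
    assert(argument_count == 0 || arguments != NULL);
    return argument_count == 0 ? NULL : arguments;
}
```

At a call site like the one in maybe_load_user_file, the fifth argument would then be args_or_null(0, arguments), or simply a literal NULL.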
I'll continue to poke around.
Hiya @mfikes! After some distractions, I was about to roll up my sleeves and take another go at this issue.
But I see that planck tests are now passing off master (even without addressing the potential issue above) against the current Ubuntu 20.04 and libjavascriptcoregtk library.
My original testing was against libjavascriptcoregtk 2.34.6.
Tests are passing against 2.36.0.
So... I'm wondering what you recommend.
Should I take a stab at testing against various versions of libjavascriptcoregtk on Ubuntu?
Hmm... As far as I can tell it is not terribly easy to install arbitrary previous package versions. So, as I understand it (could be wrong, dunno), if we wanted to test against various versions of libjavascriptcoregtk we'd be building those from source.
For example, if I do an apt-cache showpkg libjavascriptcoregtk-4.0, I only see the current installed version 2.36.0 and 2.28.1. Same results for apt list -a libjavascriptcoregtk-4.0-18.
Since master passes against current libjavascriptcoregtk and current Ubuntu 20.04, I'd be tempted to release the binaries, perhaps with a tip to apt-get update then apt-get upgrade to current. Whaddya think?
While digging around I did notice the following and could help to address as separate issues if any of them make sense to you too:
- Make build output verbose. The Planck build is pretty terse on output. There are even some build steps that route output to /dev/null. This made it difficult for me to understand the details of what was happening during the build. What do you think of going verbose for the build?
- Optionally look at JS exceptions. I noticed in several JavaScriptCore APIs there is an exception parameter. The API supports passing NULL to discard the exception. Planck often passes in NULL here. I often wondered if these exceptions might tell us something valuable. (Maybe you found otherwise? Maybe they are mostly useless noise?) I was thinking maybe a thin wrapper over the JavaScriptCore APIs we use might be useful. This thin wrapper could optionally log all exceptions.
- Make a note on lldb. I found that lldb gave more details than gdb when listing backtraces. I think it might be that lldb better supports C++? Would a note in the dev guide docs be helpful here?
- Generate binaries from CI. Would some CI support to generate all planck binaries be useful? Not sure how you do this currently.
- Release from CI. Would some CI support to release all planck binaries be useful? I don't even know how this would work, but it might be interesting to be able to release Ubuntu and macOS binaries from CI?
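To make the thin-wrapper idea from the JS exceptions item a bit more concrete, here is a rough sketch of the pattern. It deliberately avoids the JavaScriptCore headers so it builds anywhere: js_value stands in for JSValueRef, js_call is a hypothetical shape for "some JavaScriptCore call with an exception out-parameter", and the demo_* functions exist only to exercise the wrapper. None of these names come from Planck.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for JavaScriptCore's JSValueRef (assumption for illustration). */
typedef const void *js_value;

/* Hypothetical call shape: returns a value and reports a thrown exception
   through the out-parameter, leaving it untouched when nothing is thrown. */
typedef js_value (*js_call)(void *call_data, js_value *exception);

static unsigned long logged_exceptions = 0;

/* Number of exceptions the wrapper has logged so far. */
unsigned long logged_exception_count(void) {
    return logged_exceptions;
}

/* Runs `call` and captures the exception even when the caller passed NULL
   to discard it; logs to stderr whenever one was reported. */
js_value call_logging_exception(const char *label, js_call call,
                                void *call_data, js_value *exception) {
    js_value local = NULL;
    js_value *slot = (exception != NULL) ? exception : &local;

    assert(call != NULL);
    *slot = NULL; /* start clean so a stale value is not taken for a throw */
    js_value result = call(call_data, slot);

    if (*slot != NULL) {
        /* A real wrapper would stringify the exception before printing,
           for example via JSValueToStringCopy. */
        fprintf(stderr, "JS exception in %s\n", label);
        logged_exceptions++;
    }
    return result;
}

/* Stand-in callees, used only to demonstrate the wrapper. */
static const char demo_exception_marker = 'x';

js_value demo_call_ok(void *call_data, js_value *exception) {
    (void)exception;
    return call_data;
}

js_value demo_call_throws(void *call_data, js_value *exception) {
    (void)call_data;
    *exception = &demo_exception_marker;
    return NULL;
}
```

Call sites that pass NULL today would keep that behavior but gain a log line, and the logging could sit behind a flag if it turns out to be mostly noise.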
@mfikes, when you find some time and interest, lemme know if you'd like me to raise issues for all or some of #1087 (comment).
@mfikes Ubuntu 22.04 is now GA on GitHub Actions.
Past experiments showed binaries building and passing on 22.04.
So maybe we can skip 20.04 and build binaries for 22.04?
Would that make sense?