Segfaults in concurrent-ruby test suite on TruffleRuby 23.1 Oracle GraalVM Native
eregon opened this issue · comments
For example https://github.com/ruby-concurrency/concurrent-ruby/actions/runs/7546112438/job/20543196288
[ [ SegfaultHandler caught a segfault in thread 0x00007f126c000b80 ] ]
siginfo: si_signo: 11, si_code: 1, si_addr: 0x0000000000000000
...
Stacktrace for the failing thread 0x00007f126c000b80 (A=AOT compiled, J=JIT compiled, D=deoptimized, i=inlined):
SP 0x00007f1275ffc8e8 IP 0x0000000000000000 deoptFrame=null IP is not within Java code. Aborting stack trace printing.
Warning: stack pointer is not aligned to 16 bytes.
Starting the stack walk in a possible caller:
i SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144 [image code] java.util.HashMap.hash(HashMap.java:338)
i SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144 [image code] java.util.HashMap.put(HashMap.java:618)
i SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144 [image code] com.oracle.svm.core.deopt.SubstrateSpeculationLog.collectFailedSpeculations(SubstrateSpeculationLog.java:90)
A SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144 [image code] org.graalvm.compiler.truffle.compiler.TruffleTierContext.getSpeculationLog(TruffleTierContext.java:150)
A SP 0x00007f1275ffc980 IP 0x00007f13b4ff0570 size=192 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.truffleTier(TruffleCompilerImpl.java:540)
A SP 0x00007f1275ffca40 IP 0x00007f13b4fe767f size=176 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.compileAST(TruffleCompilerImpl.java:483)
A SP 0x00007f1275ffcaf0 IP 0x00007f13b4fe6a1e size=48 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl$TruffleCompilationWrapper.performCompilation(TruffleCompilerImpl.java:738)
i SP 0x00007f1275ffcb20 IP 0x00007f13b4820c8c size=32 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl$TruffleCompilationWrapper.performCompilation(TruffleCompilerImpl.java:672)
A SP 0x00007f1275ffcb20 IP 0x00007f13b4820c8c size=32 [image code] org.graalvm.compiler.core.CompilationWrapper.run(CompilationWrapper.java:222)
A SP 0x00007f1275ffcb40 IP 0x00007f13b4fead2e size=112 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.doCompile(TruffleCompilerImpl.java:231)
A SP 0x00007f1275ffcbb0 IP 0x00007f13b4feb50d size=32 [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.doCompile(TruffleCompilerImpl.java:205)
A SP 0x00007f1275ffcbd0 IP 0x00007f13b2f963d8 size=64 [image code] com.oracle.truffle.runtime.OptimizedTruffleRuntime.compileImpl(OptimizedTruffleRuntime.java:902)
A SP 0x00007f1275ffcc10 IP 0x00007f13b2f97af9 size=96 [image code] com.oracle.truffle.runtime.OptimizedTruffleRuntime.doCompile(OptimizedTruffleRuntime.java:885)
i SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64 [image code] com.oracle.truffle.runtime.CompilationTask$1.accept(CompilationTask.java:63)
i SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64 [image code] com.oracle.truffle.runtime.CompilationTask$1.accept(CompilationTask.java:57)
i SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64 [image code] com.oracle.truffle.runtime.CompilationTask.call(CompilationTask.java:232)
A SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64 [image code] com.oracle.truffle.runtime.CompilationTask.call(CompilationTask.java:55)
A SP 0x00007f1275ffccb0 IP 0x00007f13b42d19d7 size=80 [image code] java.util.concurrent.FutureTask.run(FutureTask.java:317)
A SP 0x00007f1275ffcd00 IP 0x00007f13b431cee2 size=96 [image code] java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
A SP 0x00007f1275ffcd60 IP 0x00007f13b4319ee9 size=16 [image code] java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
i SP 0x00007f1275ffcd70 IP 0x00007f13b3c34a0e size=32 [image code] java.lang.Thread.runWith(Thread.java:1596)
A SP 0x00007f1275ffcd70 IP 0x00007f13b3c34a0e size=32 [image code] java.lang.Thread.run(Thread.java:1583)
A SP 0x00007f1275ffcd90 IP 0x00007f13b2f2dcec size=32 [image code] com.oracle.truffle.runtime.BackgroundCompileQueue$TruffleCompilerThreadFactory$1.run(BackgroundCompileQueue.java:303)
A SP 0x00007f1275ffcdb0 IP 0x00007f13b08a4e7b size=48 [image code] com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:833)
A SP 0x00007f1275ffcde0 IP 0x00007f13b081c11b size=32 [image code] com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:211)
A SP 0x00007f1275ffce00 IP 0x00007f13b0696e40 size=96 [image code] com.oracle.svm.core.code.IsolateEnterStub.PosixPlatformThreads_pthreadStartRoutine_38d96cbc1a188a6051c29be1299afe681d67942e(IsolateEnterStub.java:0)
It also happens outside of concurrent-ruby, but the concurrent-ruby test suite seems particularly prone to hitting this bug.
It occurs more often with a smaller heap (or with less RAM, which implies a smaller default heap), for example with TRUFFLERUBYOPT=--vm.Xmx1G.
This is fixed on master in oracle/graal@d7ee198.
With that fix, 100 runs of the concurrent-ruby test suite did not segfault a single time, whereas the suite failed in 6 out of 100 runs on 23.1.2.
The fix will be part of the 24.0 release.
Internal issues: GR-48811, GR-51437
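
For anyone who wants to repeat the 100-run comparison above, here is a minimal sketch of a harness that runs a command repeatedly with a small heap and counts the failing runs. It is not the script used for the numbers in this issue. The helper name `count_failures` is hypothetical, and the command that starts the test suite is something you need to supply for your checkout. Note that it counts every non-zero exit status, so ordinary test failures are counted along with segfaults.

```python
import os
import subprocess


def count_failures(cmd, runs, extra_env=None):
    """Run `cmd` `runs` times and return how many runs exited with a non-zero status."""
    # Start from the current environment and layer the extra variables on top,
    # e.g. {"TRUFFLERUBYOPT": "--vm.Xmx1G"} to force a smaller heap.
    env = dict(os.environ)
    env.update(extra_env or {})

    failures = 0
    for _ in range(runs):
        result = subprocess.run(
            cmd,
            env=env,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        # Any non-zero status is counted: this includes crashes, but also
        # plain test failures, so inspect the logs before blaming a segfault.
        if result.returncode != 0:
            failures += 1
    return failures
```

A call such as `count_failures(["bundle", "exec", "rake"], 100, {"TRUFFLERUBYOPT": "--vm.Xmx1G"})` would then give the number of failing runs out of 100, assuming `bundle exec rake` is how the suite is started in your setup.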