oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.

Home Page:https://www.graalvm.org/ruby/

Segfaults in concurrent-ruby test suite on TruffleRuby 23.1 Oracle GraalVM Native

eregon opened this issue · comments

For example, see https://github.com/ruby-concurrency/concurrent-ruby/actions/runs/7546112438/job/20543196288

[ [ SegfaultHandler caught a segfault in thread 0x00007f126c000b80 ] ]
siginfo: si_signo: 11, si_code: 1, si_addr: 0x0000000000000000
...
Stacktrace for the failing thread 0x00007f126c000b80 (A=AOT compiled, J=JIT compiled, D=deoptimized, i=inlined):
  SP 0x00007f1275ffc8e8 IP 0x0000000000000000  deoptFrame=null  IP is not within Java code. Aborting stack trace printing.
  
  Warning: stack pointer is not aligned to 16 bytes.
  
  Starting the stack walk in a possible caller:
  i  SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144   [image code] java.util.HashMap.hash(HashMap.java:338)
  i  SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144   [image code] java.util.HashMap.put(HashMap.java:618)
  i  SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144   [image code] com.oracle.svm.core.deopt.SubstrateSpeculationLog.collectFailedSpeculations(SubstrateSpeculationLog.java:90)
  A  SP 0x00007f1275ffc8f0 IP 0x00007f13b5009d91 size=144   [image code] org.graalvm.compiler.truffle.compiler.TruffleTierContext.getSpeculationLog(TruffleTierContext.java:150)
  A  SP 0x00007f1275ffc980 IP 0x00007f13b4ff0570 size=192   [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.truffleTier(TruffleCompilerImpl.java:540)
  A  SP 0x00007f1275ffca40 IP 0x00007f13b4fe767f size=176   [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.compileAST(TruffleCompilerImpl.java:483)
  A  SP 0x00007f1275ffcaf0 IP 0x00007f13b4fe6a1e size=48    [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl$TruffleCompilationWrapper.performCompilation(TruffleCompilerImpl.java:738)
  i  SP 0x00007f1275ffcb20 IP 0x00007f13b4820c8c size=32    [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl$TruffleCompilationWrapper.performCompilation(TruffleCompilerImpl.java:672)
  A  SP 0x00007f1275ffcb20 IP 0x00007f13b4820c8c size=32    [image code] org.graalvm.compiler.core.CompilationWrapper.run(CompilationWrapper.java:222)
  A  SP 0x00007f1275ffcb40 IP 0x00007f13b4fead2e size=112   [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.doCompile(TruffleCompilerImpl.java:231)
  A  SP 0x00007f1275ffcbb0 IP 0x00007f13b4feb50d size=32    [image code] org.graalvm.compiler.truffle.compiler.TruffleCompilerImpl.doCompile(TruffleCompilerImpl.java:205)
  A  SP 0x00007f1275ffcbd0 IP 0x00007f13b2f963d8 size=64    [image code] com.oracle.truffle.runtime.OptimizedTruffleRuntime.compileImpl(OptimizedTruffleRuntime.java:902)
  A  SP 0x00007f1275ffcc10 IP 0x00007f13b2f97af9 size=96    [image code] com.oracle.truffle.runtime.OptimizedTruffleRuntime.doCompile(OptimizedTruffleRuntime.java:885)
  i  SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64    [image code] com.oracle.truffle.runtime.CompilationTask$1.accept(CompilationTask.java:63)
  i  SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64    [image code] com.oracle.truffle.runtime.CompilationTask$1.accept(CompilationTask.java:57)
  i  SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64    [image code] com.oracle.truffle.runtime.CompilationTask.call(CompilationTask.java:232)
  A  SP 0x00007f1275ffcc70 IP 0x00007f13b2f3e387 size=64    [image code] com.oracle.truffle.runtime.CompilationTask.call(CompilationTask.java:55)
  A  SP 0x00007f1275ffccb0 IP 0x00007f13b42d19d7 size=80    [image code] java.util.concurrent.FutureTask.run(FutureTask.java:317)
  A  SP 0x00007f1275ffcd00 IP 0x00007f13b431cee2 size=96    [image code] java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
  A  SP 0x00007f1275ffcd60 IP 0x00007f13b4319ee9 size=16    [image code] java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
  i  SP 0x00007f1275ffcd70 IP 0x00007f13b3c34a0e size=32    [image code] java.lang.Thread.runWith(Thread.java:1596)
  A  SP 0x00007f1275ffcd70 IP 0x00007f13b3c34a0e size=32    [image code] java.lang.Thread.run(Thread.java:1583)
  A  SP 0x00007f1275ffcd90 IP 0x00007f13b2f2dcec size=32    [image code] com.oracle.truffle.runtime.BackgroundCompileQueue$TruffleCompilerThreadFactory$1.run(BackgroundCompileQueue.java:303)
  A  SP 0x00007f1275ffcdb0 IP 0x00007f13b08a4e7b size=48    [image code] com.oracle.svm.core.thread.PlatformThreads.threadStartRoutine(PlatformThreads.java:833)
  A  SP 0x00007f1275ffcde0 IP 0x00007f13b081c11b size=32    [image code] com.oracle.svm.core.posix.thread.PosixPlatformThreads.pthreadStartRoutine(PosixPlatformThreads.java:211)
  A  SP 0x00007f1275ffce00 IP 0x00007f13b0696e40 size=96    [image code] com.oracle.svm.core.code.IsolateEnterStub.PosixPlatformThreads_pthreadStartRoutine_38d96cbc1a188a6051c29be1299afe681d67942e(IsolateEnterStub.java:0)

It also happens outside of concurrent-ruby, but the concurrent-ruby test suite seems prone to hitting this bug.
It occurs more often with a smaller heap (or with less RAM, which implies a smaller default heap), for example with TRUFFLERUBYOPT=--vm.Xmx1G.

This is fixed on master in oracle/graal@d7ee198.
With that fix, 100 runs of the concurrent-ruby test suite did not segfault a single time, whereas 6 out of 100 runs failed on 23.1.2.
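A repeated-run check like the one described above could be scripted roughly as follows. This is only a sketch, not the script used for the numbers in this issue: the helper name `count_failed_runs` is hypothetical, and the `bundle exec rake` invocation shown in the comments is an assumed way of running the test suite. Only the `TRUFFLERUBYOPT=--vm.Xmx1G` setting comes from the report itself.

```ruby
# Hypothetical helper: runs `command` (an argv array) `runs` times with `env`
# merged into the child's environment and returns how many runs did not
# exit successfully.
def count_failed_runs(command, runs:, env: {})
  failures = 0
  runs.times do
    # Kernel#system returns true only when the child exits with status 0;
    # it returns false or nil otherwise (non-zero exit, crash, or the
    # command could not be started), so all of those count as failures here.
    ok = system(env, *command, out: File::NULL, err: File::NULL)
    failures += 1 unless ok
  end
  failures
end

# Example usage (assumed test command; a smaller heap makes the crash more likely):
#   env = { "TRUFFLERUBYOPT" => "--vm.Xmx1G" }
#   failed = count_failed_runs(%w[bundle exec rake], runs: 100, env: env)
#   puts "#{failed} out of 100 runs failed"
```

Note that this counts every unsuccessful run, so ordinary test failures would have to be told apart from segfaults by inspecting the output of the failing runs.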

The fix will be part of the 24.0 release.

Internal issues: GR-48811, GR-51437