jhass / crystal-gobject

gobject-introspection for Crystal

GtkSourceView crash on set_language method.

hugopl opened this issue · comments

I'm experiencing a crash with GtkSourceView. It looks like memory corruption or a bad cast, but I'm not sure.
I wrote an equivalent line-by-line C version and it works perfectly; the Crystal version crashes with:

$ ./bin/crash
Markup/Markdown
Invalid memory access (signal 11) at address 0x0
[0x56443a8eab76] *CallStack::print_backtrace:Int32 +118
[0x56443a8dd64e] __crystal_sigfault_handler +286
[0x7f52fcd17800] ???
[0x7f52fcf016e4] pcre_exec +3156
[0x7f52fcfa331d] g_match_info_next +157
[0x7f52fcfa4510] g_regex_match_full +128
[0x7f52fcfa46b9] g_regex_replace_eval +201
[0x7f52fde1941d] ???
[0x7f52fde25ab7] ???
[0x7f52fde24c4f] ??? (5 times)
[0x7f52fde4ff4d] ???
[0x7f52fde6a6d6] gtk_source_buffer_set_language +294
[0x56443a941623] *GtkSource::Buffer#language=<(GtkSource::Language | Nil)>:Nil +67
[0x56443a8c8da7] __crystal_main +1767
[0x56443a942446] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +6
[0x56443a8dd6bc] main +60
[0x7f52fcae3023] __libc_start_main +243
[0x56443a8c85ee] _start +46
[0x0] ???

Crystal code:

require "gobject/gtk/autorun"
require_gobject "GtkSource"

GtkSource.init

builder = Gtk::Builder.new_from_file("#{__DIR__}/main.glade")
builder.connect_signals

# Get editor
editor = GtkSource::View.cast(builder["editor"])
buffer = GtkSource::Buffer.cast(editor.buffer)
buffer.set_text("Some contents", -1)

# Here is the problem; it seems to happen only with the Markdown syntax.
lang = GtkSource::LanguageManager.default.guess_language("README.md", nil)
puts "#{lang.try(&.section)}/#{lang.try(&.name)}"
buffer.language = lang

# Show main window
main_window = Gtk::Window.cast(builder["main_window"])
main_window.show_all

C version

#include <stdio.h>
#include <gtk/gtk.h>
#include <gtksourceview/gtksource.h>

int main(int argc, char *argv[]) {
  gtk_init(&argc, &argv);
  gtk_source_init();

  GtkBuilder* builder = gtk_builder_new_from_file("./src/main.glade");
  gtk_builder_connect_signals(builder, NULL);
  
  // get editor
  GtkSourceView* editor = GTK_SOURCE_VIEW(gtk_builder_get_object(builder, "editor"));
  GtkTextBuffer* buffer = gtk_text_view_get_buffer(GTK_TEXT_VIEW(editor));
  gtk_text_buffer_set_text(buffer, "Some contents", -1);

  // Sets markdown syntax
  GtkSourceLanguageManager* manager = gtk_source_language_manager_get_default();
  GtkSourceLanguage* lang = gtk_source_language_manager_guess_language(manager, "README.md", NULL);
  if (lang)
    printf("%s/%s\n", gtk_source_language_get_section(lang), gtk_source_language_get_name(lang));
  gtk_source_buffer_set_language(GTK_SOURCE_BUFFER(buffer), lang);


  // Show main window
  GtkWidget* window = GTK_WIDGET(gtk_builder_get_object(builder, "main_window"));
  gtk_widget_show_all(GTK_WIDGET(window));
  gtk_main();  
  gtk_source_finalize();
}

XML used by GTK builder in the examples above.

<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated with glade 3.22.2 -->
<interface>
  <requires lib="gtk+" version="3.20"/>
  <requires lib="gtksourceview" version="4.0"/>
  <object class="GtkApplicationWindow" id="main_window">
    <property name="can_focus">False</property>
    <property name="default_width">800</property>
    <property name="default_height">600</property>
    <signal name="destroy" handler="gtk_main_quit" swapped="no"/>
    <child type="titlebar">
      <placeholder/>
    </child>
    <child>
      <object class="GtkBox">
        <property name="visible">True</property>
        <property name="can_focus">False</property>
        <child>
          <object class="GtkSourceView" id="editor">
            <property name="can_focus">False</property>
          </object>
          <packing>
            <property name="expand">True</property>
            <property name="fill">True</property>
            <property name="position">0</property>
          </packing>
        </child>
      </object>
    </child>
  </object>
</interface>

Expected behavior

It should work like the C version, i.e. just show a window containing the text "Some contents".

Current behavior

A crash.

More info

Oddly, if I change the file name used to guess the language to e.g. "foo.yaml", it doesn't crash.

Tested on ArchLinux GTK package gtk3 1:3.24.20-1, GTK source view package gtksourceview4 4.6.0-1.

C version compiled with:

gcc `pkg-config --cflags gtk+-3.0` `pkg-config --cflags gtksourceview-4` crash.c -o crash_c `pkg-config --libs gtk+-3.0` `pkg-config --libs gtksourceview-4`

Tested with crystal-gobject master branch at 34890cad190bf7da65228676ef5d9af1372827f2.

Curious, I can reproduce on Arch but on macOS it works just fine!

Okay, probably by chance, it seems to be a GC issue. crystal run -Dgc_none gtk_source_view.cr works fine.

Or maybe that's just another side effect, because this does not look good at all:

+ BufferPrivate (struct, GtkSource)
  * size = 0
  * gtype = false

IIRC the buffer is lazily created by GtkSourceView. The weird thing is that if you change the GtkSourceLanguage by passing e.g. "foo.yaml", it works.

Probably by chance. I rebuilt gtksourceview4 locally to have debug symbols, and then it also works; it just throws a metric ton of assertion errors. Something is corrupting the memory layout, so probably some struct is too big or too small. The weird thing is that the Crystal side doesn't actually allocate any of them, it's all pointers...

So by now I'm pretty sure the GC is to blame.

Just running the binary with GC_DONT_GC=1 set already avoids any weirdness. If we run with GC_PRINT_VERBOSE_STATS=1, right before the mayhem we can see

Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers

--> Marking for collection #2 after 338144 allocated bytes
Pushed 1 thread stacks
Starting marking for mark phase number 1
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Finished mark helper 2
Finished mark helper 0
Finished mark helper 1
Finished marking for mark phase number 1
GC #2 freed 0 bytes, heap 356 KiB (+ 0 KiB unmapped)
World-stopped marking took 0 msecs (0 in average)
Bytes recovered before sweep - f.l. count = -74880
In-use heap: 4% (15 KiB pointers + 0 KiB other)
Immediately reclaimed 205328 bytes, heapsize: 364544 bytes (0 unmapped)
7 finalization entries; 0/0 short/long disappearing links alive
0 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 0 msecs

It seems the GC knows there's memory allocated but it doesn't know it's referenced and cleans it up....

I haven't found a way to tell the GC about that memory (partly because I'm unsure where that memory even is).
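To make the suspicion above concrete, here is a deliberately tiny model in C of a conservative mark phase. It is not bdwgc's algorithm, and every name in it (`toy_add_root`, `toy_mark`, `toy_survives`) is made up for illustration. It only shows the general failure mode being described: an object stays alive only if a pointer to it sits in a region the collector actually scans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_OBJECTS 4
#define MAX_ROOTS 4

/* Toy "GC heap": a few fixed-size objects, each with a mark bit. */
typedef struct { uintptr_t payload[2]; } Object;
static Object heap[HEAP_OBJECTS];
static bool marked[HEAP_OBJECTS];

/* A root is a range of words that the toy collector scans. */
typedef struct { const uintptr_t *start; size_t words; } Root;
static Root roots[MAX_ROOTS];
static size_t root_count;

/* Register a memory range so that toy_mark() will scan it. */
static void toy_add_root(const uintptr_t *start, size_t words) {
  assert(root_count < MAX_ROOTS);
  roots[root_count].start = start;
  roots[root_count].words = words;
  root_count++;
}

/* Return the heap index if the word looks like a heap pointer, else -1. */
static int toy_object_index(uintptr_t word) {
  uintptr_t base = (uintptr_t)&heap[0];
  uintptr_t end = base + sizeof(heap);
  if (word < base || word >= end) return -1;
  return (int)((word - base) / sizeof(Object));
}

/* Conservative mark: any scanned word that points into the heap keeps
 * its object alive. Memory that is not a registered root is never read.
 * (Simplification: objects themselves are not traced.) */
static void toy_mark(void) {
  for (size_t i = 0; i < HEAP_OBJECTS; i++) marked[i] = false;
  for (size_t r = 0; r < root_count; r++) {
    for (size_t w = 0; w < roots[r].words; w++) {
      int index = toy_object_index(roots[r].start[w]);
      if (index >= 0) marked[index] = true;
    }
  }
}

/* Would this object survive the last toy_mark() run? */
static bool toy_survives(const Object *object) {
  return marked[object - heap];
}
```

In this model, a pointer stored only in a buffer that was never passed to `toy_add_root` (standing in for memory owned by another allocator) does not keep its object alive; registering that buffer and marking again does.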

In the ancient past some people had luck hooking the GC up to GLib through g_mem_set_vtable, which these days.... is just a no-op. Sigh.

I don't know :(

I can reproduce this crash with the clean example below, without any shards, using the C calls directly, so I think the issue is related to the Crystal GC, as you said. Do you think it is worth reporting this to Crystal?

@[Link("gtk-3")]
lib LibGtk
  fun gtk_init(argc : Int32*, argv : UInt8***) : Void
  fun gtk_main : Void
  fun gtk_builder_new_from_string(string : UInt8*, length : Int64) : Void*
  fun gtk_builder_connect_signals(this : Void*, user_data : Void*) : Void
  fun gtk_builder_get_object(this : Void*, name : UInt8*) : Void*
  fun gtk_widget_show_all(this : Void*) : Void
  fun gtk_text_view_get_buffer(this : Void*) : Void*
end

@[Link("gtksourceview-4")]
lib LibGtkSource
  fun init = gtk_source_init : Void
  fun gtk_source_language_manager_get_default : Void*
  fun gtk_source_language_manager_guess_language(this : Void*, filename : UInt8*, content_type : UInt8*) : Void*
  fun gtk_source_buffer_set_language(this : Void*, language : Void*) : Void
end

xml = <<-EOF
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated with glade 3.22.2 -->
<interface>
  <requires lib="gtk+" version="3.20"/>
  <requires lib="gtksourceview" version="4.0"/>
  <object class="GtkApplicationWindow" id="main_window">
    <property name="can_focus">False</property>
    <property name="default_width">800</property>
    <property name="default_height">600</property>
    <signal name="destroy" handler="gtk_main_quit" swapped="no"/>
    <child type="titlebar">
      <placeholder/>
    </child>
    <child>
      <object class="GtkBox">
        <property name="visible">True</property>
        <property name="can_focus">False</property>
        <child>
          <object class="GtkSourceView" id="editor">
            <property name="can_focus">False</property>
          </object>
          <packing>
            <property name="expand">True</property>
            <property name="fill">True</property>
            <property name="position">0</property>
          </packing>
        </child>
      </object>
    </child>
  </object>
</interface>
EOF

LibGtk.gtk_init pointerof(ARGC_UNSAFE), pointerof(ARGV_UNSAFE)
LibGtkSource.init

# builder = LibGtk.gtk_builder_new_from_file("#{__DIR__}/main.glade")
builder = LibGtk.gtk_builder_new_from_string(xml, -1)
LibGtk.gtk_builder_connect_signals(builder, nil)

# Get editor
editor = LibGtk.gtk_builder_get_object(builder, "editor")
buffer = LibGtk.gtk_text_view_get_buffer(editor)

# Here is the problem; it seems to happen only with the Markdown syntax.
lang_manager = LibGtkSource.gtk_source_language_manager_get_default
lang = LibGtkSource.gtk_source_language_manager_guess_language(lang_manager, "README.md", nil)
LibGtkSource.gtk_source_buffer_set_language(buffer, lang)

# Show main window
main_window = LibGtk.gtk_builder_get_object(builder, "main_window")
LibGtk.gtk_widget_show_all(main_window)
LibGtk.gtk_main

Maybe, though I can't pinpoint it to something that Crystal would be doing wrong. It just integrates https://github.com/ivmai/bdwgc, which is so popular, most distros just package it as gc or libgc. But then just initializing the GC in the C version of the program and maybe doing a GC_malloc or two is not enough to reproduce the issue.

Some possible hints in https://stackoverflow.com/q/43141659/2199687 since the example is essentially using GtkTextView (a subclass).

What's weird is that in this example GTK doesn't call any Crystal code (a signal handler, etc.), and if I disable the GC just for the set_language call, it works.

GC.disable
LibGtkSource.gtk_source_buffer_set_language(buffer, lang)
GC.enable

I'll submit the issue there. Even if it isn't a problem with Crystal, maybe someone can at least discover something more about the issue.

I was looking for other languages that have a GTK binding and use libgc. I only found Mono, but it seems they have already changed their GC.

if I disable the GC just for the set_language call, it works.

Unfortunately, looking at the GC debug output, that seems to just delay the second GC collection cycle, which is the one causing the issues. Adding a GC.collect after the GC.enable, for example, brings some issues back for me.

So, after staring at https://github.com/ivmai/bdwgc/blob/master/doc/README.macros for a while, reading somewhere that bdwgc needs to know about new pthreads (it even provides #defines to shadow the original functions), and reading somewhere else that GtkTextView creates new pthreads, I recompiled bdwgc with CFLAGS="-DUSE_PROC_FOR_LIBRARIES", and that does seem to avoid the crash.

But it doesn't feel like a really workable solution either.

That would only work if the shard provided a static version of libgc compiled with this flag. But this is why I think the problem is on the Crystal side, not the GTK side: the same issue may happen with other libraries as well, and similar reports will come in as the Crystal user base grows.
BTW, I reported this to Crystal to see if we can get some help: crystal-lang/crystal#9226

To be honest, I don't quite see what Crystal can do here. It's GLib that provides no means to hook its allocation and thread spawning functions, so it cannot offer the interface that bdwgc wants. And it's bdwgc that requires this kind of cooperation from libraries. I guess that's where bindings for other GCed languages have better luck: they have GCs that limit garbage collection to allocations made by themselves? Or maybe bdwgc is not aware of malloc'ed memory and reuses it (given that it allocates with sbrk)? I don't know; I still don't understand enough of all the interactions there.

You reminded me of something: Inkscape uses GTK3 (gtkmm3) and libgc, at least according to the Arch Linux package dependencies... and it works.

Good catch, unfortunately just setting the same options as it does at https://gitlab.com/inkscape/inkscape/-/blob/master/src/inkgc/gc.cpp#L32-34 does not seem to be enough :/

Well, this rabbit hole is deep. Very deep.

After fixing three bugs in the compiler, working around three more and working around that we removed global variables, I have something that seems to make some things work. Adding this shim code seems to fix it for me: https://gist.github.com/jhass/53b56308b1d4013c32cf2325d800e130

Well, until I turn on release mode that is. Then:

(lldb) r
Process 11033 launched: '/home/jhass/projects/crystal-gobject/samples/gtk_source_view_pointers' (x86_64)
Process 11033 stopped
* thread #1, name = 'gtk_source_view', stop reason = signal SIGSEGV: invalid address (fault address: 0x7f0)
    frame #0: 0x00007ffff7076489 libglib-2.0.so.0`g_slice_alloc at gslice.c:520:23
   517
   518 	      n_magazines = MAX_SLAB_INDEX (allocator);
   519 	      tmem = g_private_set_alloc0 (&private_thread_memory, sizeof (ThreadMemory) + sizeof (Magazine) * 2 * n_magazines);
-> 520 	      tmem->magazine1 = (Magazine*) (tmem + 1);
   521 	      tmem->magazine2 = &tmem->magazine1[n_magazines];
   522 	    }
   523 	  return tmem;
(lldb) bt
* thread #1, name = 'gtk_source_view', stop reason = signal SIGSEGV: invalid address (fault address: 0x7f0)
  * frame #0: 0x00007ffff7076489 libglib-2.0.so.0`g_slice_alloc at gslice.c:520:23
    frame #1: 0x00007ffff707643c libglib-2.0.so.0`g_slice_alloc at gslice.c:505
    frame #2: 0x00007ffff7076420 libglib-2.0.so.0`g_slice_alloc(mem_size=96) at gslice.c:1002
    frame #3: 0x00007ffff70aee0f libglib-2.0.so.0`g_hash_table_new_full(hash_func=(libglib-2.0.so.0`g_str_hash at ghash.c:2333:15), key_equal_func=(libglib-2.0.so.0`g_str_equal at ghash.c:2299:1), key_destroy_func=0x0000000000000000, value_destroy_func=0x0000000000000000) at ghash.c:1071:16
    frame #4: 0x00007ffff708cf5c libglib-2.0.so.0`g_quark_init at gquark.c:61:14
    frame #5: 0x00007ffff7fe209a ld-2.31.so`call_init.part.0 + 154
    frame #6: 0x00007ffff7fe21a1 ld-2.31.so`_dl_init + 129
    frame #7: 0x00007ffff7fd313a ld-2.31.so`_dl_start_user + 50

Sigh. Maybe I'm in the wrong tunnel of the rabbit hole after all. I don't know.

Well... I could debug it enough to suspect calloc of not allocating enough memory. I added some debug printing to it... and the crash went away. I removed the debug printing but kept compiling with debug symbols: still no crash. I recompiled without debug symbols and it's back. I suspect there's still some bad codegen lurking somewhere :/

Okay, I guess I was kinda tired, I found the remaining issue from above: My calloc override didn't have a return type annotation and apparently for funs that makes crystal type it as void, so it never returned the pointer 😆
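For readers less familiar with what a calloc replacement owes its callers, here is a minimal C sketch of the contract. `shim_calloc` is a hypothetical name, and plain `malloc` stands in for whatever allocator the real shim delegates to (the gist itself is not reproduced here). The point is the final `return`: the bug described above amounts to a replacement that did all the work and then handed nothing back.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical calloc replacement: allocate nmemb * size bytes, zero
 * them, and return the block. malloc() is only a stand-in backend. */
static void *shim_calloc(size_t nmemb, size_t size) {
  /* Refuse requests whose byte count would overflow size_t. */
  if (size != 0 && nmemb > SIZE_MAX / size) return NULL;

  size_t total = nmemb * size;
  /* Ask for at least one byte so a zero-sized request still yields a
   * pointer that can be passed to free(). */
  void *ptr = malloc(total != 0 ? total : 1);
  if (ptr != NULL) memset(ptr, 0, total);

  /* Callers dereference this immediately; a replacement that drops the
   * return value leaves them with an invalid pointer. */
  return ptr;
}
```

A caller such as GLib's slice allocator writes to the returned block right away, which fits the fault at `tmem->magazine1 = ...` in the lldb output above.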

I updated the gist above, @hugopl do you want to give it a try? I suppose you also have a bigger thingy to stress test it a bit. If it works for you I might package it up as a shard. I don't feel confident enough about it yet, both in approach and implementation, to just make it a dependency here, but it might be something to point people to if they get trouble like this.

Yes, I can try it. I will find some time later today (GMT+3), try it, and report back here.

It worked! 🎉 🎉

It failed to compile with the current Crystal, and worked without any crashes with your Crystal branch.

Okay cool. Alright, so battle plan:

  • Get the compiler fixes upstreamed
  • Turn the gist into a shard
  • Update this project's readme with some explanation blob referring to that shard
  • Release 0.7.0

More good news... as my pet project got bigger, I experienced different crashes in it (all related to GtkSourceView4), including a crash at exit due to a double deallocation, but all of them are fixed with Crystal from git + the malloc/pthread shim 🎉

Thanks very much!

I'm still having crashes, though much less often than before. I'm still trying to find a good way to reproduce them; once I do, I'll submit a new issue with more details. For now I only have a backtrace + GC stats output, which may not be too useful.

$ GC_PRINT_VERBOSE_STATS=1 ./bin/tijolo .
Grow heap to 64 KiB after 0 bytes allocated
Number of processors = 8
Started 7 mark helper threads
Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers

--> Marking for collection #1 after 0 allocated bytes
Pushed 1 thread stacks
Starting marking for mark phase number 0
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 0
Finished mark helper 1
Finished mark helper 5
Finished mark helper 3
Finished mark helper 4
Finished mark helper 6
Finished mark helper 7
Finished mark helper 2
Finished marking for mark phase number 0
GC #1 freed 0 bytes, heap 64 KiB (+ 0 KiB unmapped)
World-stopped marking took 1 msecs (1 in average)
Bytes recovered before sweep - f.l. count = 0
In-use heap: 0% (0 KiB pointers + 0 KiB other)
Immediately reclaimed 0 bytes, heapsize: 65536 bytes (0 unmapped)
0 finalization entries; 0/0 short/long disappearing links alive
0 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 1 msecs
Adding block map for size of 3 granules (48 bytes)
Grow heap to 156 KiB after 48 bytes allocated
Adding block map for size of 0 granules (0 bytes)
Adding block map for size of 1 granules (16 bytes)
Adding block map for size of 16 granules (256 bytes)
Adding block map for size of 32 granules (512 bytes)
Adding block map for size of 128 granules (2048 bytes)
Adding block map for size of 2 granules (32 bytes)
Adding block map for size of 4 granules (64 bytes)
Adding block map for size of 6 granules (96 bytes)
Adding block map for size of 9 granules (144 bytes)
Adding block map for size of 5 granules (80 bytes)
Adding block map for size of 8 granules (128 bytes)
Adding block map for size of 10 granules (160 bytes)
Adding block map for size of 12 granules (192 bytes)
Grow heap to 220 KiB after 114704 bytes allocated
Adding block map for size of 64 granules (1024 bytes)
Grew fo table to 1 entries
Adding block map for size of 15 granules (240 bytes)
Grew fo table to 2 entries
Grow heap to 296 KiB after 140464 bytes allocated
Adding block map for size of 11 granules (176 bytes)
Grew fo table to 4 entries
Adding block map for size of 7 granules (112 bytes)
Adding block map for size of 17 granules (272 bytes)
Grew fo table to 8 entries
Grew dl table to 1 entries
Adding block map for size of 20 granules (320 bytes)
Grow heap to 400 KiB after 185744 bytes allocated
Adding block map for size of 14 granules (224 bytes)
Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers

--> Marking for collection #2 after 237360 allocated bytes
Pushed 1 thread stacks
Starting marking for mark phase number 1
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 5
Finished mark helper 1
Finished mark helper 0
Finished mark helper 4
Finished mark helper 2
Finished mark helper 3
Finished mark helper 6
Finished mark helper 7
Finished marking for mark phase number 1
GC #2 freed 0 bytes, heap 400 KiB (+ 0 KiB unmapped)
World-stopped marking took 1 msecs (1 in average)
Bytes recovered before sweep - f.l. count = -109440
In-use heap: 52% (207 KiB pointers + 5 KiB other)
Immediately reclaimed -109440 bytes, heapsize: 409600 bytes (0 unmapped)
8 finalization entries; 1/0 short/long disappearing links alive
1 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 1 msecs
Grow heap to 584 KiB after 0 bytes allocated
Adding block map for size of 42 granules (672 bytes)
Adding block map for size of 24 granules (384 bytes)
Adding block map for size of 13 granules (208 bytes)
Adding block map for size of 50 granules (800 bytes)
Adding block map for size of 84 granules (1344 bytes)
Grow heap to 780 KiB after 361072 bytes allocated
Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers

--> Marking for collection #3 after 806192 allocated bytes
Pushed 1 thread stacks
Starting marking for mark phase number 2
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 3
Finished mark helper 0
Finished mark helper 7
Finished mark helper 1
Finished mark helper 2
Finished mark helper 5
Finished mark helper 4
Finished mark helper 6
Finished marking for mark phase number 2
Recycle 65536/65536 scratch-allocated bytes at 0x7fdc78e54000
Grew mark stack to 8192 frames
GC #3 freed -33696 bytes, heap 844 KiB (+ 0 KiB unmapped)
World-stopped marking took 1 msecs (1 in average)
Bytes recovered before sweep - f.l. count = -95184
In-use heap: 68% (575 KiB pointers + 7 KiB other)
Immediately reclaimed -82896 bytes, heapsize: 864256 bytes (0 unmapped)
7 finalization entries; 1/0 short/long disappearing links alive
2 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 1 msecs
Grow heap to 1136 KiB after 414112 bytes allocated
Adding block map for size of 18 granules (288 bytes)
Adding block map for size of 19 granules (304 bytes)
Adding block map for size of 21 granules (336 bytes)
Adding block map for size of 23 granules (368 bytes)
Grow heap to 1532 KiB after 1335664 bytes allocated
Adding block map for size of 22 granules (352 bytes)
Initiating full world-stop collection!
0 bytes in heap blacklisted for interior pointers

--> Marking for collection #4 after 1853200 allocated bytes
Pushed 3 thread stacks
Starting marking for mark phase number 3
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 5
Finished mark helper 0
Finished mark helper 1
Finished mark helper 4
Finished mark helper 2
Finished mark helper 6
Finished mark helper 3
Finished mark helper 7
Finished marking for mark phase number 3
Recycle 131072/131072 scratch-allocated bytes at 0x7fdc704e3000
Grew mark stack to 16384 frames
GC #4 freed -3024 bytes, heap 1660 KiB (+ 0 KiB unmapped)
World-stopped marking took 2 msecs (1 in average)
Bytes recovered before sweep - f.l. count = -191552
In-use heap: 66% (1099 KiB pointers + 5 KiB other)
Immediately reclaimed -175168 bytes, heapsize: 1699840 bytes (0 unmapped)
7 finalization entries; 1/0 short/long disappearing links alive
0 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 2 msecs
Grow heap to 2216 KiB after 348608 bytes allocated
Grow heap to 2956 KiB after 951696 bytes allocated
Initiating full world-stop collection!
12288 bytes in heap blacklisted for interior pointers

--> Marking for collection #5 after 2024992 allocated bytes
Pushed 3 thread stacks
Starting marking for mark phase number 4
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 0
Finished mark helper 2
Finished mark helper 3
Finished mark helper 4
Finished mark helper 5
Finished mark helper 6
Finished mark helper 7
Finished mark helper 1
Finished marking for mark phase number 4
Recycle 262144/262144 scratch-allocated bytes at 0x7fdc6f0ee000
Grew mark stack to 32768 frames
GC #5 freed -83264 bytes, heap 3212 KiB (+ 0 KiB unmapped)
World-stopped marking took 4 msecs (1 in average)
Bytes recovered before sweep - f.l. count = -75792
In-use heap: 77% (2498 KiB pointers + 5 KiB other)
Immediately reclaimed -75792 bytes, heapsize: 3289088 bytes (0 unmapped)
7 finalization entries; 1/0 short/long disappearing links alive
0 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 4 msecs
Grow heap to 4284 KiB after 529744 bytes allocated
Grow heap to 5716 KiB after 1923664 bytes allocated
Grew fo table to 16 entries
2020-05-16T20:51:58.792820000Z   INFO - Project root: /home/hugo/src/tijolo
2020-05-16T20:51:58.794442000Z   INFO - files scan: 00:00:00.001559238
Initiating full world-stop collection!
12288 bytes in heap blacklisted for interior pointers

--> Marking for collection #6 after 7868080 allocated bytes
Pushed 3 thread stacks
Starting marking for mark phase number 5
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 2
Finished mark helper 3
Finished mark helper 5
Finished mark helper 4
Finished mark helper 6
Finished mark helper 0
Finished mark helper 1
Finished mark helper 7
Finished marking for mark phase number 5
Recycle 524288/524288 scratch-allocated bytes at 0x7fdc6ee66000
Grew mark stack to 65536 frames
GC #6 freed 61632 bytes, heap 6228 KiB (+ 0 KiB unmapped)
World-stopped marking took 7 msecs (2 in average)
Bytes recovered before sweep - f.l. count = -237632
In-use heap: 79% (4897 KiB pointers + 81 KiB other)
Immediately reclaimed -194608 bytes, heapsize: 6377472 bytes (0 unmapped)
7 finalization entries; 1/0 short/long disappearing links alive
9 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 7 msecs
Grew fo table to 32 entries
Grew fo table to 64 entries
Grew fo table to 128 entries
Grew fo table to 256 entries
Grew fo table to 512 entries
Grew fo table to 1024 entries
Initiating full world-stop collection!
12288 bytes in heap blacklisted for interior pointers

--> Marking for collection #7 after 646801 allocated bytes
Pushed 3 thread stacks
Starting marking for mark phase number 6
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 6
Finished mark helper 7
Finished mark helper 3
Finished mark helper 1
Finished mark helper 5
Finished mark helper 0
Finished mark helper 4
Finished mark helper 2
Finished marking for mark phase number 6
Recycle 1048576/1048576 scratch-allocated bytes at 0x7fdc6e4ee000
Grew mark stack to 131072 frames
GC #7 freed -105488 bytes, heap 7252 KiB (+ 0 KiB unmapped)
World-stopped marking took 8 msecs (3 in average)
Bytes recovered before sweep - f.l. count = -68448
In-use heap: 73% (5267 KiB pointers + 90 KiB other)
Immediately reclaimed -25392 bytes, heapsize: 7426048 bytes (0 unmapped)
8 finalization entries; 1/0 short/long disappearing links alive
613 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 8 msecs
Initiating full world-stop collection!
16384 bytes in heap blacklisted for interior pointers

--> Marking for collection #8 after 2844064 allocated bytes
Pushed 3 thread stacks
Starting marking for mark phase number 7
Starting mark helper 0
Starting mark helper 1
Starting mark helper 2
Starting mark helper 3
Starting mark helper 4
Starting mark helper 5
Starting mark helper 6
Starting mark helper 7
Finished mark helper 5
Finished mark helper 3
Finished mark helper 0
Finished mark helper 2
Finished mark helper 6
Finished mark helper 7
Finished mark helper 4
Finished mark helper 1
Finished marking for mark phase number 7
GC #8 freed 154944 bytes, heap 7252 KiB (+ 0 KiB unmapped)
World-stopped marking took 10 msecs (4 in average)
Bytes recovered before sweep - f.l. count = -264992
In-use heap: 84% (6015 KiB pointers + 99 KiB other)
Immediately reclaimed -203552 bytes, heapsize: 7426048 bytes (0 unmapped)
7 finalization entries; 1/0 short/long disappearing links alive
514 finalization-ready objects; 0/0 short/long links cleared
Finalize plus initiate sweep took 0 + 0 msecs
Complete collection took 10 msecs
Grow heap to 9676 KiB after 64416 bytes allocated
2020-05-16T20:52:07.523985000Z   INFO - Locator found 50 results in 00:00:00.012240419
Invalid memory access (signal 11) at address 0x21
[0x56253b233196] *Exception::CallStack::print_backtrace:Int32 +118
[0x56253b220e8e] __crystal_sigfault_handler +286
[0x56253b344271] sigfault_handler +40
[0x7fdc77caa800] ???
[0x7fdc77ccf009] GC_free +137
[0x56253b216f12] free +82
[0x7fdc77f26ecf] ???
[0x7fdc77f2757d] g_slice_free1 +413
[0x7fdc780231fe] g_type_free_instance +462
[0x7fdc78956eee] ???
[0x7fdc77f60c32] ???
[0x7fdc77f60d90] g_hash_table_unref +48
[0x7fdc78956efd] ???
[0x7fdc7895c515] ???
[0x7fdc7895c71d] ???
[0x7fdc7895c76b] ??? (6 times)
[0x7fdc78971b09] ???
[0x7fdc7802d0a0] g_signal_emit_valist +4800
[0x7fdc7802e6b0] g_signal_emit +144
[0x7fdc785ac32b] ???
[0x7fdc785bacec] ???
[0x7fdc77f4cc04] ???
[0x7fdc77f4d58f] g_main_context_dispatch +335
[0x7fdc77f4f531] ???
[0x7fdc77f4f571] g_main_context_iteration +49
[0x7fdc781063fe] g_application_run +526
[0x56253b2f2de2] *Gtk::Application +50
[0x56253b2f28d9] *Application#run:(Array(Log::Entry) | IO+ | Int32 | Nil) +25
[0x56253b2086d7] __crystal_main +1863
[0x56253b3437f6] *Crystal::main_user_code<Int32, Pointer(Pointer(UInt8))>:Nil +6
[0x56253b34368c] *Crystal::main<Int32, Pointer(Pointer(UInt8))>:Int32 +44
[0x56253b2137d6] main +6
[0x7fdc77a78023] __libc_start_main +243
[0x56253b207ebe] _start +46
[0x0] ???
code 11

This was compiled with Crystal:

Crystal 0.35.0-dev [b977a952e] (2020-05-14)

LLVM: 10.0.0
Default target: x86_64-pc-linux-gnu

and using jhass/crystal-malloc_pthread_shim

Mmmh, it hits the free shim

[0x7fdc77ccf009] GC_free +137
[0x56253b216f12] free +82

maybe GC_is_heap_ptr is unreliable?

https://github.com/jhass/crystal-malloc_pthread_shim/blob/6b85e95981fc4a1d3311cf3b4ba28d9611a1011c/src/malloc_pthread_shim.cr#L103
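The dispatch in question can be sketched in plain C with a toy arena standing in for the GC heap. All names here (`arena_alloc`, `arena_owns`, `shim_free`) are hypothetical, and the ownership test is a simple address range check, which is an assumption for illustration and not how `GC_is_heap_ptr` is implemented. The sketch only shows why the classification has to be reliable: each pointer is released by exactly one of two deallocators, chosen by that test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define ARENA_SIZE 1024

/* Toy bump arena standing in for "the collector's heap". */
static _Alignas(16) unsigned char arena[ARENA_SIZE];
static size_t arena_used;

/* Counters so the routing decision can be observed. */
static int arena_frees;
static int system_frees;

/* Carve a 16-byte-aligned block out of the arena, or return NULL. */
static void *arena_alloc(size_t size) {
  size_t rounded = (size + 15u) & ~(size_t)15u;
  if (rounded < size || rounded > ARENA_SIZE - arena_used) return NULL;
  void *ptr = &arena[arena_used];
  arena_used += rounded;
  assert(arena_used <= ARENA_SIZE);
  return ptr;
}

/* Ownership test: does the address fall inside the arena? */
static bool arena_owns(const void *ptr) {
  uintptr_t address = (uintptr_t)ptr;
  uintptr_t base = (uintptr_t)arena;
  return address >= base && address < base + ARENA_SIZE;
}

/* free() replacement: route each pointer to the allocator it came from. */
static void shim_free(void *ptr) {
  if (ptr == NULL) return;
  if (arena_owns(ptr)) {
    arena_frees++;  /* bump arena: nothing to release individually */
  } else {
    system_frees++;
    free(ptr);      /* must only ever see pointers from malloc() */
  }
}
```

If the ownership test ever gave the wrong answer, a malloc'ed block would reach the arena path or an arena block would reach `free`, and either case would corrupt one of the two allocators. That is the kind of failure the backtrace above would be consistent with, though nothing here proves it is the cause.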

No idea... the shim code seems OK to me (apart from the alignment stuff, which I don't fully understand)... or maybe more C functions need to be wrapped, like you did with strndup, etc. BTW, why do these functions need to be wrapped if you already wrap malloc? Or do these functions not use malloc?

Got another similar stack trace... again on free calling GC_free instead of "real" free.

Invalid memory access (signal 11) at address 0x21
[0x56089c4b3426] *Exception::CallStack::print_backtrace:Int32 +118
[0x56089c4a0f4e] __crystal_sigfault_handler +286
[0x56089c5c6081] sigfault_handler +40
[0x7f30b9923960] ???
[0x7f30b9948009] GC_free +137
[0x56089c496fd2] free +82
[0x7f30b9b9eecf] ???
[0x7f30b9b9f57d] g_slice_free1 +413
[0x7f30b9c9b1fe] g_type_free_instance +462
[0x7f30ba5ceeee] ???
[0x7f30b9bd8c32] ???
[0x7f30b9bd8d90] g_hash_table_unref +48

Also, I have no idea why the address is always 0x21.

I added the stupid code:

  if gc_initialized?
    if ptr.address == 0x21 # compare the numeric address, not the Pointer itself
      puts "********************       CRASH!?"
      return LibC.real_free(ptr)
    end
    GC.is_heap_ptr(ptr) ? enter_gc { LibGC.free(ptr) } : LibC.real_free(ptr)
  end

Just to check whether it still crashes... if so, there's always the chance of bad GTK code written by me.

why do these functions need to be wrapped if you already wrap malloc? Or do these functions not use malloc?

I just wrapped everything for which I found a GC_ equivalent.

Does this crash happen when closing the application or something?

It may also help to recompile at least glib with debug symbols to look for similarities in the stacktraces.

diff --git a/PKGBUILD b/PKGBUILD
index f87b1d9..9a6d6d2 100644
--- a/PKGBUILD
+++ b/PKGBUILD
@@ -10,6 +10,7 @@ url="https://wiki.gnome.org/Projects/GLib"
 license=(LGPL2.1)
 arch=(x86_64)
 depends=(pcre libffi libutil-linux zlib)
+options=(!strip)
 makedepends=(gettext gtk-doc shared-mime-info python libelf git util-linux
              meson dbus)
 checkdepends=(desktop-file-utils)
@@ -35,7 +36,7 @@ prepare() {
 }

 build() {
-  CFLAGS+=" -DG_DISABLE_CAST_CHECKS"
+  CFLAGS+=" -g -DG_DISABLE_CAST_CHECKS"
   arch-meson glib build \
     -D selinux=disabled \
     -D man=true \

> Does this crash happen when closing the application or something?

Sometimes when I switch back to my application with Alt+Tab, but I haven't found a pattern yet.

> It may also help to recompile at least glib with debug symbols to look for similarities in the stacktraces.

It's on my todo list: get the original Arch packages for glib, gtk and gtksourceview4, add debug symbols, remove strip, let them install into /opt, then run my app against them. I'll post more info when I have them. I'll also probably release the code that is causing these bug reports this week, but I'm really trying to figure out steps that reproduce the crash 100% of the time.

IME stuff works fine if you just replace the package, no need to battle with /opt :)

It's because I use GNOME as my desktop, so if I just replace the file my whole system will be slow as hell.

Anyway, I already compiled glib in debug mode and installed the package. I'll be pasting better errors here soon.

Did you try? Symbols should just blow up the file size of the .so a bit; it doesn't mean you have to use an unoptimized build :)

Didn't measure anything, was just a precaution to not mess with the system.

I've gone a good amount of time without crashes by using:

ENV["G_SLICE"] = "always-malloc"
ENV["G_DEBUG"] = "gc-friendly"

at the very top of my main.cr file.

Huh, there's another allocator than malloc in GSlice? I only noticed some abstraction for that in its code, but didn't see another one implemented. Mmh, or maybe it makes it not use posix_memalign, which is easily the most hacky part. Actually that might be it, getting a free for the offset pointer, which is invalid. But then the traces you shared didn't really look like that.

Could you do me a favor and try this branch without the environment variables? https://github.com/jhass/crystal-malloc_pthread_shim/tree/track_alignments

I kinda doubt it will help, given you somehow ended up with just 0x21 in a free call, which certainly is anything but a heap pointer of any kind. But then it feels more correct to do it this way, so if it doesn't make stuff worse for you, I think I'll push that to master anyways.
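To make the "free for the offset pointer" concern concrete, here is a minimal C sketch of the general technique of building aligned allocation on top of an allocator that cannot align by itself. It is an assumption-laden illustration, not the code of the shim or of the track_alignments branch: `sketch_aligned_alloc`, `sketch_aligned_free`, and `sketch_demo` are hypothetical names, and plain `malloc` stands in for the underlying allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: aligned allocation on top of plain malloc.
   alignment must be a power of two. */
void *sketch_aligned_alloc(size_t alignment, size_t size) {
  if (alignment == 0 || (alignment & (alignment - 1)) != 0) return NULL;
  /* Room for the payload, worst-case padding, and the saved base pointer. */
  size_t total = size + alignment - 1 + sizeof(void *);
  if (total < size) return NULL; /* size_t overflow */
  void *base = malloc(total);
  if (base == NULL) return NULL;
  uintptr_t start = (uintptr_t)base + sizeof(void *);
  uintptr_t aligned = (start + alignment - 1) & ~(uintptr_t)(alignment - 1);
  assert(aligned % alignment == 0);
  /* Remember the base just below the pointer handed to the caller. */
  memcpy((void *)(aligned - sizeof(void *)), &base, sizeof base);
  return (void *)aligned;
}

/* The caller only knows the offset pointer, so free must map it back
   to the base. Passing the offset pointer to free() directly is invalid. */
void sketch_aligned_free(void *ptr) {
  if (ptr == NULL) return;
  void *base;
  memcpy(&base, (unsigned char *)ptr - sizeof(void *), sizeof base);
  free(base);
}

/* Usage: allocate 100 bytes on a 64-byte boundary, touch them, release. */
void sketch_demo(void) {
  void *p = sketch_aligned_alloc(64, 100);
  assert(p != NULL);
  assert((uintptr_t)p % 64 == 0);
  memset(p, 0xAB, 100);
  sketch_aligned_free(p);
}
```

Storing the base next to the block is one way to track it; a side table keyed by the offset pointer is another. Either way, the pointer the caller frees is not the pointer the underlying allocator returned, and that mismatch is the hazard described above.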

Btw., in my tests shared library initialization code often runs before main, so there's actually a race when setting these environment variables in the program (in main). They might not have been set yet when the shared library initialization runs, and it looks like GLib does initialize GSlice there early on.
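The ordering problem can be shown with a small C sketch. It uses the GCC/Clang `constructor` attribute as a stand-in for library initialization code, which is an assumption about mechanism rather than a statement about how GLib is built. The variable name and all `sketch_` identifiers are hypothetical.

```c
#define _POSIX_C_SOURCE 200112L /* for setenv */
#include <assert.h>
#include <stdlib.h>

/* Hypothetical variable name, chosen so it is unlikely to be set already. */
#define SKETCH_VAR "SKETCH_ALWAYS_MALLOC_DEMO"

int sketch_ctor_ran = 0;     /* set when the initializer runs */
int sketch_seen_at_init = 0; /* was the variable visible at that time? */

/* GCC/Clang extension: runs during startup, before main. It plays the
   role of library initialization that reads its configuration once. */
__attribute__((constructor))
static void sketch_library_init(void) {
  sketch_ctor_ran = 1;
  sketch_seen_at_init = getenv(SKETCH_VAR) != NULL;
}

/* What a program does when it sets the variable at the top of main. */
int sketch_set_late(void) {
  return setenv(SKETCH_VAR, "1", 1);
}

/* Usage: call from main. The variable is set now, but the initializer
   already ran and did not see it. */
void sketch_check(void) {
  int rc = sketch_set_late();
  assert(rc == 0);
  assert(getenv(SKETCH_VAR) != NULL);
  assert(sketch_ctor_ran == 1);
  assert(sketch_seen_at_init == 0);
  (void)rc;
}
```

If the same ordering applies to the real library, exporting the variables in the shell before launching the program (for example `G_SLICE=always-malloc ./bin/app`) avoids the race, because the environment is then already in place when initialization code runs.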

> Huh, there's another allocator than malloc in GSlice? I only noticed some abstraction for that in its code, but didn't see another one implemented. Mmh, or maybe it makes it not use posix_memalign, which is easily the most hacky part. Actually that might be it, getting a free for the offset pointer, which is invalid. But then the traces you shared didn't really look like that.

Maybe it was just a coincidence that the env vars avoided the crash, since I experienced a g_slice crash yesterday :-(.

Sure, I'm gonna try https://github.com/jhass/crystal-malloc_pthread_shim/tree/track_alignments.

BTW, I think this now deserves its own issue, since the original one was solved and the issue now seems very specific to g_slice, despite still being a GC-GTK issue as well.

How did you try GC_set_all_interior_pointers(1)?