jhass / crystal-gobject

gobject-introspection for Crystal

Objects never die due to references in ClosureDataManager

hugopl opened this issue · comments

Hi,

In the following example I expected the same error as in #88. The error doesn't happen, however, because a reference to the @window object is stored as a Void* in the ClosureDataManager, so the @window object is never destroyed by the GC. The solution I found was to use WeakRef in ClosureDataManager. Below is the code that I expected to have the same problem as #88:

require "gobject/gtk/autorun"

class Foo
  def initialize
    @window = Gtk::Window.new(title: "Hello World!", border_width: 10)
    # With this line enabled, a ref to Foo is held in the closure and the GC never collects Foo.
    # This can be solved with `@window.disconnect(id)`, but that is too error prone.
    id = @window.on_activate_focus(&->hold_a_reference_in_a_closure(Gtk::Window))

    @window.connect("destroy") do
      puts "destroyed!"
      GLib.idle_add do
        puts "real goodbye!"
        Gtk.main_quit
        false
      end
    end
    @window.show
  end

  def hold_a_reference_in_a_closure(window)
    puts "I got focus"
  end

  def finalize
    LibC.printf("finalize!\n")
  end
end

# Needs to be in a block; otherwise, for some reason, the GC won't collect
1.times do
  puts "Foo.new"
  Foo.new
end
puts "collect 1"
GC.collect
puts "collect 2"
GC.collect

Here's the proposed patch for discussion:

diff --git a/src/closure_data_manager.cr b/src/closure_data_manager.cr
index 4b6ad3c5..1a9e6c09 100644
--- a/src/closure_data_manager.cr
+++ b/src/closure_data_manager.cr
@@ -18,18 +33,19 @@ module GObject
     end
 
     private def initialize
-      @closure_data = Hash(Void*, Int32).new { |h, k| h[k] = 0 }
+      @closure_data = Hash(::WeakRef(Void*), Int32).new(0)
     end
 
     def register(data)
-      @closure_data[data] += 1 if data
+      @closure_data[::WeakRef.new(data)] += 1
       data
     end
 
     def deregister(data)
-      @closure_data[data] -= 1
-      if @closure_data[data] <= 0
-        @closure_data.delete data
+      weak_ptr = ::WeakRef.new(data)
+      @closure_data[weak_ptr] -= 1
+      if @closure_data[weak_ptr] <= 0
+        @closure_data.delete(weak_ptr)
       end
     end
   end

This patch has 3 problems:

  1. WeakRef.new("hey") != WeakRef.new("hey")
  2. WeakRef.new("hey").hash != WeakRef.new("hey").hash
  3. Creating a WeakRef(Pointer(Void)) raises a compiler error when calling the .value method:

Error: instantiating 'value()'


In /usr/lib/crystal/weak_ref.cr:30:5

 30 | @target.as(T?)
      ^
Error: can't cast Pointer(Void) to (Pointer(Void) | Nil)

My solution was to patch WeakRef itself. However, I'm not sure such a patch would be acceptable for Crystal 1.x, so if you think it deserves some discussion on the Crystal side I can raise it there. Here's the monkey patch I used to test the fix:

class WeakRef(T)
  delegate :hash, to: value

  def ==(other)
    value == other.value
  end

  def value
    @target.as(T) if @target
  end
end

I think this would be a valid behavior for WeakRef since other classes like String behave this way, e.g.:

a = "foo".reverse
b = "oof"
pointerof(a) != pointerof(b) # => true
a == b # => true
a.hash == b.hash # => true

A solution that doesn't depend on patching Crystal would be to create our own WeakRef class, maybe by just inheriting from the original WeakRef and overriding these methods.

Well, the entire point of ClosureDataManager is to prevent the GC from collecting these pointers. The original problem it solves is that when a pointer to a closure is passed to Gtk land, the pointer is no longer visible in GC-allocated memory, so the GC would go on to clean it up, leaving a dangling pointer on the Gtk side of things. On registration there should always also be a callback registered that removes the pointer from ClosureDataManager again, allowing the closure to be collected when the object it was attached to is destroyed. Maybe this is missing somewhere. In any case, turning the internal ClosureDataManager references into WeakRefs seems counterproductive to its original goal.
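
To make the purpose described above concrete, here is a minimal sketch of a reference-counted pin registry in plain Crystal. The names `PinRegistry` and `pinned?` are hypothetical and this is not the library's actual ClosureDataManager code; the sketch only assumes that keeping a Void* as a hash key in GC-managed memory keeps the closure data it points to reachable.

```crystal
# Hypothetical stand-in for the registry described above (not the library's code).
class PinRegistry
  def initialize
    # Pointer => number of native-side users. While a pointer is a key here,
    # the GC can still see it and will not free the closure data.
    @counts = Hash(Void*, Int32).new(0)
  end

  # Record one more native-side user of `data`.
  def register(data : Void*) : Void*
    @counts[data] += 1 unless data.null?
    data
  end

  # Drop one user; forget the pointer once nobody uses it anymore.
  def deregister(data : Void*) : Nil
    return unless @counts.has_key?(data)
    remaining = @counts[data] - 1
    if remaining <= 0
      @counts.delete(data)
    else
      @counts[data] = remaining
    end
  end

  def pinned?(data : Void*) : Bool
    @counts.has_key?(data)
  end
end

registry = PinRegistry.new
counter = 0
handler = ->{ counter += 1 } # captures `counter`, so it has closure data
data = handler.closure_data

registry.register(data)
registry.register(data)
registry.deregister(data)
puts registry.pinned?(data) # one user left, still pinned
registry.deregister(data)
puts registry.pinned?(data) # no users left, eligible for collection again
```

The leak discussed in this issue corresponds to a `register` call that is never matched by a `deregister`: the count never reaches zero, so the closure and everything it captures stay alive.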

🤔 You are right, I forgot the main point... I was assuming that a pointer to the GTK object will always exist somewhere while it's alive. That is true, but sometimes the pointer lives only in GTK-owned memory, so the GC can't see it.

I'll try the Ruby bindings to check how they behave... Otherwise the user needs to disconnect all signals to allow object destruction, which is basically manual memory management. Another workaround is to never use closures on signals, but this makes the code a bit uglier.

Long story short, the problem is: how do I let my objects get destroyed automatically by the GC when using signals with closures?

Ruby doesn't seem to destroy the GTK object either, or I failed miserably at checking whether it does, since even after removing the signal_connect call I can't see the finalizer being called.

# frozen_string_literal: true
require "gtk3"

class Foo
  def initialize
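    # Caveat: this lambda is created inside an instance method, so it captures `self`,
    # which by itself can keep the object from ever being collected. Finalizers are
    # also called with the object id, which a zero-argument lambda does not accept.
    ObjectSpace.define_finalizer(self, ->{ puts "collected" })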

    @window = Gtk::Window.new("Hello World!")
    @window.signal_connect("destroy") do
      puts "destroyed!"
      GLib::Idle.add do
        puts "real goodbye!"
        Gtk.main_quit
        false
      end
    end
    @window.show
  end
end

def start_foo_and_hope_for_gc_collect
  puts "Foo.new"
  Foo.new
  nil
end

start_foo_and_hope_for_gc_collect
puts "collect 1"
GC.start
puts "collect 2"
GC.start

Gtk.main

Ruby has a generational GC so a single GC.start or GC.collect cycle may not evict the object yet.

I was wondering if having an extra parameter or a different version of connect that doesn't store any references would be enough to solve this, something like:

@window.on_activate_focus(&->slot(Gtk::Window), :dont_hold_closure)

The current behavior would remain the default.

I'm not sure, seems like a foot gun. Looking at this again I'm actually surprised Proc#closure_data is not nilable, I guess it might return a null pointer if there's no closure? Anyways, fixing up the ref counting ought to be enough, but also easier said than done :/

Yes, :dont_hold_closure could lead to crashes if there's no Crystal code referencing the object. Without it the code will never crash, but it will leak memory unless you manage the memory yourself.

I could manage the memory myself by calling unref on the GTK object, but then when the Crystal GC collects the Crystal object (the Foo instance) it will call the finalizer of the GTK wrapper (@window in the example), which calls unref on an already destroyed object.

All this is hard because sometimes we want the reference to be alive and not collected by the GC, so no crashes happen, and sometimes we want the GC to collect the Crystal objects and automatically call the unref on GTK objects freeing up memory.

IMO both are valid use cases.

Mmh, when do we want an object to be alive without any reference to it, either in Crystal or in Gtk land?

No, it will always have a ref, in Crystal or in GTK. If the ref is only in GTK the ref count should guarantee it will not be destroyed.

Anyway, I was wondering whether the ClosureDataManager could simply be removed, so that if an object connected to some signal has no reference in Crystal code, the user needs to ref it. I think this is basically what you said in a comment above.

Connecting a signal handler to some object should return a handler ID. This ID should be used to disconnect the handler again in finalize so that gtk can drop the ref count.

ClosureDataManager prevents collection of a proc's closure data, a Crystal allocated pointer, when the only reference to it stays within Gtk allocated memory. There's nothing to ref here.

Apparently other GC languages came across the same issues of memory management. We could draw some lessons from their success/failures.

https://gjs.guide/guides/gjs/memory-management.html#basics

https://feaneron.com/2018/04/20/the-infamous-gnome-shell-memory-leak/

After reading a bit more on the subject I'll close this issue, since there's no way to have objects freed automatically while they have a connected signal, i.e. users need to disconnect the signals if they want the object to have any chance of being freed by the GC.
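
For anyone landing here later, the explicit-disconnect pattern from this conclusion could look roughly like the sketch below. `FakeEmitter` and `Watcher` are hypothetical stand-ins written in plain Crystal so the example runs without GTK; they only model the bookkeeping of a connect call that returns a handler ID and a disconnect call that takes it, and they are not the crystal-gobject API.

```crystal
# Hypothetical stand-in for a GObject wrapper; it only tracks handlers by ID.
class FakeEmitter
  def initialize
    @handlers = Hash(UInt64, Proc(Nil)).new
    @next_id = 0_u64
  end

  # Store the handler and return its ID, like a signal connection would.
  def connect(&block : -> Nil) : UInt64
    @next_id += 1
    @handlers[@next_id] = block
    @next_id
  end

  # Forget the handler, dropping the emitter's reference to its closure.
  def disconnect(id : UInt64) : Nil
    @handlers.delete(id)
  end

  def emit : Nil
    @handlers.each_value(&.call)
  end

  def handler_count : Int32
    @handlers.size
  end
end

class Watcher
  getter events = 0
  @handler_id : UInt64?

  def initialize(@emitter : FakeEmitter)
    # The block uses an instance variable, so the closure holds a reference to `self`.
    @handler_id = @emitter.connect do
      @events += 1
      nil
    end
  end

  # Explicit teardown: the owner decides when the handler goes away,
  # instead of waiting for a finalizer that may never run.
  def close : Nil
    if id = @handler_id
      @emitter.disconnect(id)
      @handler_id = nil
    end
  end
end

emitter = FakeEmitter.new
watcher = Watcher.new(emitter)
emitter.emit
puts watcher.events
puts emitter.handler_count
watcher.close
puts emitter.handler_count
```

The teardown lives in an explicit `close` method rather than in `finalize` because, as this thread shows, a handler whose closure references the owner can prevent the owner from ever being finalized.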