gilzoide / godot-dispatch-queue

Threaded and synchronous Dispatch Queues for Godot

Destruction of object while method is in `yield` state

tavurth opened this issue

commented

When a task is suspended in a `yield` and the task object is freed with `queue_free`, there can be weird errors and crashes.

Any pointers as to where you check for task state?

Hi @tavurth!

When you say "task object is queue_freed", do you mean the actual Task object that is returned from the dispatch method?
Since Task extends Reference, it doesn't have a queue_free method. Also, you should never free Reference objects directly; it's best to let Godot manage the reference counting and free the objects at the right time.

Now, regarding yield: the current implementation does not take it into account. It takes the result of the method call and immediately assumes that the task is completed (see Task.execute). To be honest, I've never tried using yield in these tasks, which is probably why I didn't implement support for it. Thanks for pointing that out!
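
To illustrate why this matters, here is a minimal sketch in Godot 3.x GDScript. It is not code from the plugin, and `slow_task` is a hypothetical name. In Godot 3, calling a method that yields returns a GDScriptFunctionState to the caller right away, so code that treats the return value as the final result sees the suspended state object instead.

```gdscript
extends Node

# Hypothetical method, standing in for any task method that yields.
func slow_task():
	# Execution is suspended here; the caller receives a
	# GDScriptFunctionState instead of the final return value.
	yield(get_tree(), "idle_frame")
	return 42

func _ready():
	var result = slow_task()
	if result is GDScriptFunctionState:
		# The method has not finished yet. Treating `result` as the
		# task's return value at this point would be wrong.
		result = yield(result, "completed")
	# `result` now holds the value returned after the yield.
	print(result)
```

The `is GDScriptFunctionState` check is the part a task runner would need: only when it is false can the call be considered complete.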

Anyway, do you have a reproduction project where I could see how you are using yield in the tasks? This way I could have a better idea of what you're trying to do and test any fixes against this reproduction project.

commented

Hi @gilzoide!

Sorry for the misunderstanding, by task object I mean the object on which the task is calling the method.

Actually, it seems like it's not even a queue_free issue; something weird is going on with yielding inside the function that's called by the thread. I created a sample repository:

dispatch-queue-test.zip

Running this project crashes for me about one run in three. Details below:

~/src/dispatch-queue-test
▶ godot --verbose
arguments
0: /Applications/Godot.app/Contents/MacOS/Godot
1: --verbose
Current path: /Users/will/src/dispatch-queue-test
Godot Engine v3.4.3.stable.official.242c05d12 - https://godotengine.org
Using GLES2 video driver
OpenGL debugging not supported!
OpenGL ES 2.0 Renderer: Intel(R) Iris(TM) Plus Graphics 655
OpenGL ES Batching: ON
	OPTIONS
	max_join_item_commands 16
	colored_vertex_format_threshold 0.25
	batch_buffer_size 16384
	light_scissor_area_threshold 1
	item_reordering_lookahead 4
	light_max_join_items 32
	single_rect_fallback False
	debug_flash False
	diagnose_frame False
CoreAudio: detected 2 channels
CoreAudio: audio buffer frames: 512 calculated latency: 11ms

Registered camera FaceTime HD Camera (Built-in) with id 1 position 0 at index 0
CORE API HASH: 15296446336143176771
EDITOR API HASH: 4915204304684122520
Loading resource: res://default_env.tres
Loading resource: res://addons/dispatch_queue/dispatch_queue_node.gd
Loading resource: res://addons/dispatch_queue/dispatch_queue.gd
Loaded builtin certs
Loading resource: res://Main.tscn
Loading resource: res://Main.gd
Loading resource: res://TaskWhichIsFreed.gd
ERROR: Condition "p_I->data != this" is true. Returned: false
   at: erase (./core/list.h:150)
Godot(18521,0x118f90600) malloc: Heap corruption detected, free list is damaged at 0x6000015e0240
*** Incorrect guard value: 140704381972240
Godot(18521,0x118f90600) malloc: *** set a breakpoint in malloc_error_break to debug
[1]    18521 abort      /Applications/Godot.app/Contents/MacOS/Godot --verbose
commented

Other times I get the following:

ERROR: Error calling method from signal 'idle_frame': 'GDScriptFunctionState::': Method not found..
   at: emit_signal (core/object.cpp:1236)
ERROR: Disconnecting nonexistent signal 'idle_frame', slot: 1302:.
   at: _disconnect (core/object.cpp:1538)
ERROR: Condition "p_I->data != this" is true. Returned: false
   at: erase (./core/list.h:150)

Could this be a bug on the Godot threading side of things?

Interestingly, if I change the code in Main.gd to delay even 500ms the project runs every time, perhaps because the _ready function is now also yielding to the parent.

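extends Node2D

var TaskWhichIsFreed = preload("res://TaskWhichIsFreed.gd")

func _ready():
	# Create the objects whose "task" method will run on the queue.
	for _i in range(20):
		self.add_child(TaskWhichIsFreed.new())

	# Wait half a second before dispatching anything.
	yield(get_tree().create_timer(0.5), "timeout")

	# Threads is presumably the autoloaded dispatch queue node.
	for child in self.get_children():
		Threads.dispatch(child, "task")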

Btw, I love this plugin it's super helpful and clean. Great job!

Ok, now I got it!
I ran some tests here and both problems happened, though not on every run: sometimes it works, sometimes not, just like you mentioned.

For the crashes, it's most likely some race condition, since in multithreaded code a lot can go wrong if more than one thread accesses the same memory. I can't tell yet whether the problem occurs when yielding or after resuming, probably the latter.

For the second problem (Method not found..), I think that when the Task object is destroyed, the GDScriptFunctionState is not valid anymore (GDScriptFunctionState.is_valid docs mention the object must exist for the resume to happen) and the error occurs.
So if we are going to support yielding from Tasks, we'll need to keep a reference to the Task alive until it completes.
This shouldn't be too hard: first we check if Task.execute returns a GDScriptFunctionState (here and here). If so, we keep it in an Array or Dictionary or something like that and listen to its finished signal to remove it from there afterwards.
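
The idea above could be sketched roughly as follows, again in Godot 3.x GDScript. This is an illustration under assumptions, not the plugin's code or the fix that was eventually merged: the helper class, the `track` method, the `_pending` dictionary and the `task_completed` signal are all hypothetical names. The sketch listens to the `completed` signal that GDScriptFunctionState emits when the suspended method returns.

```gdscript
extends Reference

# Hypothetical helper, not the plugin's actual code. It keeps strong
# references to tasks whose method yielded, until the method completes.
signal task_completed(task, result)

# Maps each pending task to its GDScriptFunctionState, so neither is
# freed while the method is suspended.
var _pending = {}

# Call with the value returned by the task's method call.
# Returns true if the task yielded and is now being tracked.
func track(task, result) -> bool:
	if not (result is GDScriptFunctionState):
		return false
	_pending[task] = result
	# `completed` passes the method's final return value; `task` is
	# appended as an extra bound argument.
	result.connect("completed", self, "_on_state_completed", [task], CONNECT_ONESHOT)
	return true

func _on_state_completed(result, task):
	# Drop the strong reference now that the method has finished.
	_pending.erase(task)
	emit_signal("task_completed", task, result)
```

A caller would pass each task and the value returned by executing it to `track`, and treat the task as finished immediately only when `track` returns false. Two caveats apply: the helper itself has to be kept alive by whoever owns it, and since `completed` may be emitted on a different thread than the one that called `track`, access to `_pending` would likely need a Mutex in threaded use.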

> if I change the code in Main.gd to delay even 500ms the project runs every time

Hmm, that's really weird... It happens a lot less, but if you try enough, it may still crash. I've been able to reproduce it with a 0.5s delay after lots of runs.

> Btw, I love this plugin it's super helpful and clean. Great job!

Thanks! I'm really glad you like it ^^

commented

Thank you for the pointers! With your help I managed to write a solution which seems to function well.

I've tested it on my sample repository both with and without yield inside the child, and everything seems to be running well 🥳

Please see the attached PR if you would like to merge it with your repo. I tried to keep the code isolated so you can easily change it for 4.x.