godotengine / godot-cpp

C++ bindings for the Godot script API

Calling Variant() dereferences freed and invalid object pointers, leading to potential use-after-frees.

id01 opened this issue · comments

commented

Godot version

4.2.1.stable

godot-cpp version

4.2.1.stable

System information

Arch Linux, Windows 10

Issue description

Hey y'all, hope you're ready for this one because it's a banger!

The Problem

I started getting some weird crashes on Windows exports; no stack trace, no nothing. I opened the crash in WinDbg and dumped the stack trace. It happened in the free() function of my GDExtension, in a callback to the Godot binary. (Below is the relevant portion of the WinDbg traceback; the rest is trash about embree, due to the lack of debug symbols in the Godot binary used.)

libshipgame_simulation_windows_template_debug_x86_64!ZN28ShipBuildBlockInfoCollection4freeEPvS0_+0x74

If you open this portion up in IDA, the return address points to the function here:

call    _ZN5godot7VariantC2EPKNS_6ObjectE

Here, in the compiled code, a godot::Variant is constructed from the object pointer and then passed into UtilityFunctions::is_instance_valid().

Here's the relevant portion of the GDCLASS macro, which defines free() for all classes. After executing the destructor, it calls free_static on cls, which, according to the disassembler, seems to involve an implicit cast converting cls into a Variant.

	static void free(void *data, GDExtensionClassInstancePtr ptr) { \
		if (ptr) { \
			m_class *cls = reinterpret_cast<m_class *>(ptr); \
			cls->~m_class(); \
			::godot::Memory::free_static(cls); \
		} \
	}
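
For readers less familiar with this pattern, here is a minimal standalone sketch of the destroy-then-deallocate sequence that the macro's free() follows. It does not use godot-cpp: Tracked, create_instance, free_instance and destroy_once are hypothetical names, and std::malloc/std::free merely stand in for Godot's allocator. It only illustrates that the destructor should run exactly once and that the pointer is dangling afterwards.

```cpp
#include <cassert>
#include <cstdlib>
#include <new>

// Hypothetical class standing in for m_class; counts how often it is destroyed.
struct Tracked {
	static int destroyed;
	~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Allocate raw storage, then construct the object in place (placement new).
void *create_instance() {
	void *raw = std::malloc(sizeof(Tracked));
	return raw ? new (raw) Tracked() : nullptr;
}

// Same shape as the macro's free(): explicit destructor call, then release
// of the raw storage. Once this returns, the caller's pointer is dangling.
void free_instance(void *ptr) {
	if (ptr) {
		Tracked *cls = reinterpret_cast<Tracked *>(ptr);
		cls->~Tracked();
		std::free(cls);
	}
}

// Creates and frees one instance, then returns the number of destructor runs.
int destroy_once() {
	Tracked::destroyed = 0;
	void *instance = create_instance();
	assert(instance != nullptr);
	free_instance(instance);
	instance = nullptr;      // clear the dangling pointer...
	free_instance(instance); // ...so a repeated call is a harmless no-op
	return Tracked::destroyed;
}
```

The null check inside free_instance only helps because the caller cleared its pointer. If the second call received the stale address instead, it would be a double free, which is the kind of failure described in this report.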

On Linux, this caused no issue, but on Windows, I happened to have a class destructor that called queue_free() on some elements that were also children of the class. It seems that on Windows, the freeing happened during the call, which meant that free() was being called twice, and the failsafe with is_instance_valid was dereferencing the freed memory, causing a crash. As to why queue_free() seems to be freeing the memory here, I'll have to look into it further.

Looking at the godot-cpp code further, it seems like we're passing this over to the main Godot binary, which matches my observations, but the Variant() constructor in the main Godot binary dereferences the object pointer without actually checking if it is valid.

Variant constructor in godot-cpp:

Variant::Variant(const Object *v) {
	if (v) {
		from_type_constructor[OBJECT](_native_ptr(), const_cast<GodotObject **>(&v->_owner));
	} else {
		GodotObject *nullobject = nullptr;
		from_type_constructor[OBJECT](_native_ptr(), &nullobject);
	}
}
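
The if (v) guard above only filters out null. As a standalone illustration (not godot-cpp code; FakeObject, passes_null_check, make_bogus_pointer and null_check_demo are hypothetical names), a non-null garbage pointer like the one in the reproduction case passes the same test, so the constructor would go on to read _owner through it:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for godot::Object; only the _owner field matters here.
struct FakeObject {
	void *_owner = nullptr;
};

// The same test the constructor performs: is the pointer non-null?
// Nothing is dereferenced, so this is safe to call with a bogus pointer.
bool passes_null_check(const FakeObject *v) {
	return v != nullptr;
}

// Builds the same kind of invalid pointer as the reproduction case (address 1).
const FakeObject *make_bogus_pointer() {
	return reinterpret_cast<const FakeObject *>(static_cast<std::uintptr_t>(1));
}

// Compares a real object, a null pointer and a bogus pointer.
void null_check_demo() {
	FakeObject real;
	assert(passes_null_check(&real));                // valid object: passes
	assert(!passes_null_check(nullptr));             // null: correctly rejected
	assert(passes_null_check(make_bogus_pointer())); // garbage: also passes
}
```

A null check cannot tell a live object from a freed or invalid one; that needs information from outside the pointer itself.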

Variant setter for object in main Godot binary:

	_FORCE_INLINE_ static void object_assign(Variant *v, const Variant *o) {
		object_assign(v, o->_get_obj().obj);
	}

Due to this, Variant() was dereferencing an invalid (freed) pointer which led to a memory access violation.

Impact

First of all, this would patch the double-free on Windows, but there is also a broader issue for users of godot-cpp who want to handle freed objects. In GDScript, a freed object can still be passed around; it causes errors when referenced and shows [previously freed] when printed. In godot-cpp, we probably don't want casting a freed node to a Variant to crash the application, particularly because UtilityFunctions::is_instance_valid() casts the freed node to a Variant.

Possible Solutions

Possible solutions for this problem depend on how you want to handle invalid pointers. If it's okay to convert invalid pointers to null pointers in godot-cpp, a workaround could simply be to add a UtilityFunctions::is_instance_valid-esque call to the verification (note that is_instance_valid itself takes a Variant, so it can't be called directly here), but this would mean that [previously freed] instances become null instances, which may not be intuitive for application logic.

The other solution is to add the check in the main Godot binary, creating an invalid instance with the [previously freed] that we recognize from GDScript. I'm not particularly sure about the internals of previously freed instances, but this would be more consistent with GDScript.
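
To make the two options concrete, here is a minimal standalone sketch of validity checks that go through an ID lookup instead of dereferencing a pointer. It is a model only, not how godot-cpp or the engine implements this: Registry, Node, get, is_valid and describe are all hypothetical names invented for the illustration. get() models the first option (a freed object turns into null), and describe() models the second (a freed object is reported as [previously freed]).

```cpp
#include <cstdint>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical object type for the model.
struct Node {
	std::string name;
};

// Hypothetical registry that owns objects and hands out integer IDs.
class Registry {
public:
	// Creates an object and returns its ID. ID 0 is reserved for "null".
	std::uint64_t add(const std::string &name) {
		std::uint64_t id = next_id_++;
		live_[id] = std::make_unique<Node>(Node{name});
		return id;
	}

	// Frees the object; the ID stays around as a harmless stale handle.
	void remove(std::uint64_t id) { live_.erase(id); }

	// Option 1: a freed or unknown ID simply yields a null pointer.
	Node *get(std::uint64_t id) const {
		auto it = live_.find(id);
		return it == live_.end() ? nullptr : it->second.get();
	}

	// Validity check that never touches freed memory.
	bool is_valid(std::uint64_t id) const { return get(id) != nullptr; }

	// Option 2: keep "null" and "previously freed" distinguishable.
	std::string describe(std::uint64_t id) const {
		if (id == 0) {
			return "<null>";
		}
		if (const Node *n = get(id)) {
			return n->name;
		}
		return "[previously freed]";
	}

private:
	std::unordered_map<std::uint64_t, std::unique_ptr<Node>> live_;
	std::uint64_t next_id_ = 1;
};
```

In this model, after remove(id), is_valid(id) returns false and describe(id) returns "[previously freed]", without any read through a stale pointer. The trade-off between the two options is the one described above: get() loses the distinction between "never set" and "freed", while describe() keeps it.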

Thank you for listening to my TED talk

Steps to reproduce

You can create an invalid pointer using reinterpret_cast. Freed pointers are just invalid pointers which may or may not point to valid memory.

class Test : public Node {
	GDCLASS(Test, Node);
private:
	Variant ptr;

protected:
	static void _bind_methods() {
		ClassDB::bind_method(D_METHOD("test"), &Test::test);
	}

public:
	Test() {}
	~Test() {}

	void test() {
		// Build a Variant from a deliberately invalid object pointer, then print it.
		ptr = Variant(reinterpret_cast<Node *>(1));
		UtilityFunctions::print(ptr);
	}
};

Expected result:
test() should print [invalid object], [previously freed], or <null>

Actual result:
Program crashes

This may cause issues in scenarios like the below:

Node* ptr = reinterpret_cast<Node*>(1); // Invalid ptr, from free() or otherwise
if (UtilityFunctions::is_instance_valid(ptr)) { // Crash!! Implicit cast to Variant()
   // ... do things ...
}

Minimal reproduction project

Compile the code in Steps to Reproduce and add it to a GDExtension. Call Test.test(). The program crashes.

commented

This seems to be a combination of two issues: a crash on Windows that could be mitigated through a change to Variant(), and the underlying double free itself. If the first one is an issue with the laws of physics (after all, we don't want dangling pointers that now point to other objects to be reported as valid), I might need to debug the double-free issue.

Just deleted the previous message about the freeing issue - what I thought was the cause of the double free on Windows wasn't the cause. Still looking at it.

call    _ZN5godot7VariantC1EPKNS_6ObjectE ; godot::Variant::Variant(godot::Object const*)
mov     rcx, rbp
call    _ZN5godot16UtilityFunctions17is_instance_validERKNS_7VariantE ; godot::UtilityFunctions::is_instance_valid(godot::Variant const&)
mov     rcx, rbp
mov     esi, eax        ; godot::Variant *
call    _ZN5godot7VariantD1Ev ; godot::Variant::~Variant()

The full context of the crash is above. It seems like the pointer being passed to free() was invalid or already freed for some reason. I never used free(), only queue_free(), in my code, so I'm not sure why this would be, nor why this only happens on Windows.

I might try to run WinDbg on it with a breakpoint on this specific class's free() function to see if it's called twice, and where it's called from if so.

commented

Alright, I found the other issue. I'm going to split this up into two bug reports with hopefully-better readability instead of autistic debug ramblings.