sugarme / gotch

Go binding for Pytorch C++ API (libtorch)

Possible Memory Leak From C.malloc(0)

bobbyluig opened this issue

Hello! I was playing around with gotch and ran a few tests to make sure I was dropping tensors and freeing memory correctly. However, even when I called MustDrop() on every tensor, memory usage slowly crept up. I'm running Go 1.20 on Ubuntu 22.04 in WSL2. Here is a minimal example that reproduces it.

func main() {
	for i := 0; i < 100000000; i++ {
		a := ts.MustRand([]int64{1}, gotch.Float, gotch.CPU)
		a.MustDrop()
	}
}

I took a look at the underlying interface with libtorch and saw that most functions receive the output of a C function through an out-pointer allocated with C.malloc(0).

func Rand(...) (...) { 
	ptr := (*lib.Ctensor)(unsafe.Pointer(C.malloc(0)))
	// Some C call that stores an allocated tensor at *ptr.
	retVal = &Tensor{ctensor: *ptr}
	return retVal, err
} 

However, while MustDrop() frees the tensor stored at *ptr, the block that ptr itself points to (the one returned by C.malloc(0)) is never freed. I think this is what is causing the memory leak.
By changing the above function to either of the two versions below, I was able to run the simple example without any memory leak. I think the second one results in fewer cgo calls, but I'm not sure whether the usage is safe.

func Rand(...) (...) { 
	ptr := (*lib.Ctensor)(unsafe.Pointer(C.malloc(0)))
	defer func() { C.free(unsafe.Pointer(ptr)) }()
	// Some C call that stores an allocated tensor at *ptr.
	retVal = &Tensor{ctensor: *ptr}
	return retVal, err
}

func Rand(...) (...) {
	var untypedPtr uintptr
	ptr := (*lib.Ctensor)(unsafe.Pointer(&untypedPtr))
	// Some C call that stores an allocated tensor at *ptr.
	retVal = &Tensor{ctensor: *ptr}
	return retVal, err
} 

Same problem

This does seem to be a problem. Do we know how those x-generated.go files are generated? 🤔 Maybe we can help create a fix?

Upgraded with a new switch to use Go garbage collection.