garbage collection reference error while running tests
shagler opened this issue
Shawn Hagler commented
Just got an M1 Pro (16 GB) MacBook and set it up. Cloned tinygrad, installed it, ran the tests, and got:
========================================== FAILURES ==========================================
_______________________________________ TestGC.test_gc _______________________________________
self = <test.test_gc.TestGC testMethod=test_gc>
def test_gc(self):
a = Tensor.rand(4, 4, requires_grad=True)
b = Tensor.zeros(4, 4, requires_grad=True)
(a*b).mean().backward()
> assert(tensors_allocated() > 0)
test/test_gc.py:16:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
def tensors_allocated():
> return sum([isinstance(x, Tensor) for x in gc.get_objects()])
E ReferenceError: weakly-referenced object no longer exists
test/test_gc.py:8: ReferenceError
___________________________________ TestGC.test_gc_complex ___________________________________
self = <test.test_gc.TestGC testMethod=test_gc_complex>
def test_gc_complex(self):
a = Tensor(np.zeros((4, 4), dtype=np.float32), requires_grad=True)
b = Tensor.rand(4, 4, requires_grad=True)
> assert(tensors_allocated() == 3)
test/test_gc.py:23:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
def tensors_allocated():
> return sum([isinstance(x, Tensor) for x in gc.get_objects()])
E ReferenceError: weakly-referenced object no longer exists
test/test_gc.py:8: ReferenceError
====================================== warnings summary ======================================
test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
/opt/homebrew/lib/python3.12/site-packages/librosa/core/intervals.py:15: DeprecationWarning: path is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
with resources.path("librosa.core", "intervals.msgpack") as imsgpack:
test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
/opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:16: DeprecationWarning: 'aifc' is deprecated and slated for removal in Python 3.13
import aifc
test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
/opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:17: DeprecationWarning: 'audioop' is deprecated and slated for removal in Python 3.13
import audioop
test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
/opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:19: DeprecationWarning: 'sunau' is deprecated and slated for removal in Python 3.13
import sunau
test/test_dtype.py::TestHalfDType::test_casts_to
/Users/shagler/build/tinygrad/test/test_dtype.py:55: RuntimeWarning: overflow encountered in cast
_test_op(lambda: a.cast(target_dtype), target_dtype, list(a.numpy().astype(_to_np_dtype(target_dtype))))
test/test_dtype.py::TestHalfDType::test_casts_to
test/test_dtype.py::TestFloatDType::test_casts_to
/Users/shagler/build/tinygrad/tinygrad/tensor.py:135: RuntimeWarning: invalid value encountered in cast
else: data = _fromnp(data.astype(npdtype) if dtype is not None and (npdtype:=_to_np_dtype(dtype)) is not None else data)
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float32
test/test_dtype_alu.py::TestDTypeALU::test_float32
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in subtract
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float32
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in add
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in sin
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py: 26 warnings
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in sqrt
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py: 27 warnings
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in log
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: divide by zero encountered in reciprocal
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: divide by zero encountered in log
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float32
test/test_dtype_alu.py::TestDTypeALU::test_float32
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: overflow encountered in multiply
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float32
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in multiply
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: overflow encountered in exp
numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))
test/test_dtype_alu.py: 65 warnings
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:92: RuntimeWarning: invalid value encountered in cast
numpy_value = op2[1](op1[1](an, bn).astype(_to_np_dtype(d2)), cn)
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
/Users/shagler/build/tinygrad/test/test_dtype_alu.py:92: RuntimeWarning: invalid value encountered in subtract
numpy_value = op2[1](op1[1](an, bn).astype(_to_np_dtype(d2)), cn)
test/test_gc.py::TestGC::test_gc
test/test_gc.py::TestGC::test_gc_complex
/opt/homebrew/lib/python3.12/site-packages/torch/distributed/distributed_c10d.py:366: UserWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
warnings.warn(
test/test_ops.py::TestOps::test_std_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:896: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,3)), forward_only=True)
test/test_ops.py::TestOps::test_std_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:898: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,3), correction=5), forward_only=True)
test/test_ops.py::TestOps::test_std_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:901: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,4), correction=5))
test/test_ops.py::TestOps::test_std_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:891: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3)))
test/test_ops.py::TestOps::test_std_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:892: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3), correction=0))
test/test_ops.py::TestOps::test_std_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:893: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3), correction=5))
test/test_ops.py::TestOps::test_var_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:869: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,3)), forward_only=True)
test/test_ops.py::TestOps::test_var_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:871: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,3), correction=5), forward_only=True)
test/test_ops.py::TestOps::test_var_one_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:874: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,4), correction=5), forward_only=True)
test/test_ops.py::TestOps::test_var_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:864: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3)))
test/test_ops.py::TestOps::test_var_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:865: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3), correction=0))
test/test_ops.py::TestOps::test_var_zero_in_axis
/Users/shagler/build/tinygrad/test/test_ops.py:866: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3), correction=5))
test/test_tensor.py::TestTinygrad::test_tensor_list_special_values
/Users/shagler/build/tinygrad/tinygrad/tensor.py:131: RuntimeWarning: overflow encountered in cast
else: data = _fromnp(np.array(data).astype(_to_np_dtype(dtype)))
test/test_tensor.py::TestTinygrad::test_tensor_list_special_values
/Users/shagler/build/tinygrad/test/test_tensor.py:310: RuntimeWarning: overflow encountered in cast
np.testing.assert_allclose(Tensor(data, dtype=dtypes.float16).numpy(), np.array(data).astype(np.float16))
test/unit/test_disk_tensor.py::TestTorchLoad::test_load_resnet
/opt/homebrew/lib/python3.12/site-packages/torch/serialization.py:1129: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
tar.extract('storages', path=tmpdir)
test/unit/test_disk_tensor.py::TestTorchLoad::test_load_resnet
/opt/homebrew/lib/python3.12/site-packages/torch/serialization.py:1157: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
tar.extract('tensors', path=tmpdir)
-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================== short test summary info ===================================
FAILED test/test_gc.py::TestGC::test_gc - ReferenceError: weakly-referenced object no longer exists
FAILED test/test_gc.py::TestGC::test_gc_complex - ReferenceError: weakly-referenced object no longer exists
===== 2 failed, 1668 passed, 194 skipped, 36 xfailed, 181 warnings in 267.75s (0:04:27) ======
Not sure if this is related to my setup or an issue with tinygrad.
Just making an issue for tracking.
George Hotz commented
Can't repro on my machine. Often the gc tests fail if another test fails, but I don't see that here. If you are still hitting the issue, feel free to reopen with a full repro that I can try to replicate.
George Hotz commented
Oh, actually I can reproduce this. It looks like it's caused by contiguous_child.
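For anyone hitting the same traceback, here is a minimal sketch of how this error can arise in general. It is not taken from the tinygrad codebase and says nothing about how contiguous_child is implemented. The assumption is that gc.get_objects() returned a weakref.proxy whose referent had already been collected, and isinstance() on such a proxy raises ReferenceError. The Tensor class and tensors_allocated_safe below are hypothetical stand-ins for illustration only.

```python
import gc
import weakref


class Tensor:
    """Hypothetical stand-in for tinygrad's Tensor (illustration only)."""


def tensors_allocated_safe(objects=None):
    """Count Tensor instances, skipping weak proxies whose referent is gone.

    Hypothetical helper: a defensive variant of the tensors_allocated()
    function shown in the traceback, not the project's actual fix.
    """
    if objects is None:
        objects = gc.get_objects()
    count = 0
    for obj in objects:
        try:
            if isinstance(obj, Tensor):
                count += 1
        except ReferenceError:
            # obj is a weakref.proxy pointing at an already-collected object;
            # isinstance() fails while looking up obj.__class__.
            continue
    return count


# Build a dead proxy: drop the only strong reference to the referent.
target = Tensor()
dead_proxy = weakref.proxy(target)
del target
gc.collect()  # not needed under CPython refcounting, but harmless

live = Tensor()

try:
    isinstance(dead_proxy, Tensor)
    raised = False
except ReferenceError:
    raised = True
```

With these definitions, raised ends up True, and tensors_allocated_safe([dead_proxy, live]) counts only the live tensor. Swallowing the exception hides the dangling proxy rather than removing it, so this is a way to confirm the diagnosis more than a fix.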