tinygrad / tinygrad

You like pytorch? You like micrograd? You love tinygrad! ❤️

garbage collection reference error while running tests

shagler opened this issue

Just got an M1 Pro (16 GB) MacBook and set it up. Cloned tinygrad, installed it, ran the tests, and got:

========================================== FAILURES ==========================================
_______________________________________ TestGC.test_gc _______________________________________

self = <test.test_gc.TestGC testMethod=test_gc>

    def test_gc(self):
      a = Tensor.rand(4, 4, requires_grad=True)
      b = Tensor.zeros(4, 4, requires_grad=True)
      (a*b).mean().backward()
>     assert(tensors_allocated() > 0)

test/test_gc.py:16:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def tensors_allocated():
>     return sum([isinstance(x, Tensor) for x in gc.get_objects()])
E     ReferenceError: weakly-referenced object no longer exists

test/test_gc.py:8: ReferenceError
___________________________________ TestGC.test_gc_complex ___________________________________

self = <test.test_gc.TestGC testMethod=test_gc_complex>

    def test_gc_complex(self):
      a = Tensor(np.zeros((4, 4), dtype=np.float32), requires_grad=True)
      b = Tensor.rand(4, 4, requires_grad=True)
>     assert(tensors_allocated() == 3)

test/test_gc.py:23:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

    def tensors_allocated():
>     return sum([isinstance(x, Tensor) for x in gc.get_objects()])
E     ReferenceError: weakly-referenced object no longer exists

test/test_gc.py:8: ReferenceError
====================================== warnings summary ======================================
test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
  /opt/homebrew/lib/python3.12/site-packages/librosa/core/intervals.py:15: DeprecationWarning: path is deprecated. Use files() instead. Refer to https://importlib-resources.readthedocs.io/en/latest/using.html#migrating-from-legacy for migration advice.
    with resources.path("librosa.core", "intervals.msgpack") as imsgpack:

test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
  /opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:16: DeprecationWarning: 'aifc' is deprecated and slated for removal in Python 3.13
    import aifc

test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
  /opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:17: DeprecationWarning: 'audioop' is deprecated and slated for removal in Python 3.13
    import audioop

test/models/test_whisper.py::TestWhisper::test_transcribe_batch12
  /opt/homebrew/lib/python3.12/site-packages/audioread/rawread.py:19: DeprecationWarning: 'sunau' is deprecated and slated for removal in Python 3.13
    import sunau

test/test_dtype.py::TestHalfDType::test_casts_to
  /Users/shagler/build/tinygrad/test/test_dtype.py:55: RuntimeWarning: overflow encountered in cast
    _test_op(lambda: a.cast(target_dtype), target_dtype, list(a.numpy().astype(_to_np_dtype(target_dtype))))

test/test_dtype.py::TestHalfDType::test_casts_to
test/test_dtype.py::TestFloatDType::test_casts_to
  /Users/shagler/build/tinygrad/tinygrad/tensor.py:135: RuntimeWarning: invalid value encountered in cast
    else: data = _fromnp(data.astype(npdtype) if dtype is not None and (npdtype:=_to_np_dtype(dtype)) is not None else data)

test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float32
test/test_dtype_alu.py::TestDTypeALU::test_float32
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in subtract
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float16
test/test_dtype_alu.py::TestDTypeALU::test_float32
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in add
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in sin
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py: 26 warnings
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in sqrt
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py: 27 warnings
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: invalid value encountered in log
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: divide by zero encountered in reciprocal
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float16_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: divide by zero encountered in log
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float32
test/test_dtype_alu.py::TestDTypeALU::test_float32
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: overflow encountered in multiply
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float32
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:62: RuntimeWarning: invalid value encountered in multiply
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)), np.array([b]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
test/test_dtype_alu.py::TestDTypeALU::test_float32_unary
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:73: RuntimeWarning: overflow encountered in exp
    numpy_value = op[1](np.array([a]).astype(_to_np_dtype(dtype)))

test/test_dtype_alu.py: 65 warnings
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:92: RuntimeWarning: invalid value encountered in cast
    numpy_value = op2[1](op1[1](an, bn).astype(_to_np_dtype(d2)), cn)

test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
test/test_dtype_alu.py::TestDTypeALU::test_float_midcast_int32
  /Users/shagler/build/tinygrad/test/test_dtype_alu.py:92: RuntimeWarning: invalid value encountered in subtract
    numpy_value = op2[1](op1[1](an, bn).astype(_to_np_dtype(d2)), cn)

test/test_gc.py::TestGC::test_gc
test/test_gc.py::TestGC::test_gc_complex
  /opt/homebrew/lib/python3.12/site-packages/torch/distributed/distributed_c10d.py:366: UserWarning: torch.distributed.reduce_op is deprecated, please use torch.distributed.ReduceOp instead
    warnings.warn(

test/test_ops.py::TestOps::test_std_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:896: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,3)), forward_only=True)

test/test_ops.py::TestOps::test_std_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:898: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,3), correction=5), forward_only=True)

test/test_ops.py::TestOps::test_std_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:901: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.std(axis=(0,4), correction=5))

test/test_ops.py::TestOps::test_std_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:891: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3)))

test/test_ops.py::TestOps::test_std_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:892: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3), correction=0))

test/test_ops.py::TestOps::test_std_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:893: UserWarning: std(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.std(axis=(1,3), correction=5))

test/test_ops.py::TestOps::test_var_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:869: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,3)), forward_only=True)

test/test_ops.py::TestOps::test_var_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:871: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,3), correction=5), forward_only=True)

test/test_ops.py::TestOps::test_var_one_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:874: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,2,3,1,5)], lambda x: x.var(axis=(0,4), correction=5), forward_only=True)

test/test_ops.py::TestOps::test_var_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:864: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3)))

test/test_ops.py::TestOps::test_var_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:865: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3), correction=0))

test/test_ops.py::TestOps::test_var_zero_in_axis
  /Users/shagler/build/tinygrad/test/test_ops.py:866: UserWarning: var(): degrees of freedom is <= 0. Correction should be strictly less than the reduction factor (input numel divided by output numel). (Triggered internally at /Users/runner/work/pytorch/pytorch/pytorch/aten/src/ATen/native/ReduceOps.cpp:1807.)
    helper_test_op([(1,0,3,0,5)], lambda x: x.var(axis=(1,3), correction=5))

test/test_tensor.py::TestTinygrad::test_tensor_list_special_values
  /Users/shagler/build/tinygrad/tinygrad/tensor.py:131: RuntimeWarning: overflow encountered in cast
    else: data = _fromnp(np.array(data).astype(_to_np_dtype(dtype)))

test/test_tensor.py::TestTinygrad::test_tensor_list_special_values
  /Users/shagler/build/tinygrad/test/test_tensor.py:310: RuntimeWarning: overflow encountered in cast
    np.testing.assert_allclose(Tensor(data, dtype=dtypes.float16).numpy(), np.array(data).astype(np.float16))

test/unit/test_disk_tensor.py::TestTorchLoad::test_load_resnet
  /opt/homebrew/lib/python3.12/site-packages/torch/serialization.py:1129: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
    tar.extract('storages', path=tmpdir)

test/unit/test_disk_tensor.py::TestTorchLoad::test_load_resnet
  /opt/homebrew/lib/python3.12/site-packages/torch/serialization.py:1157: DeprecationWarning: Python 3.14 will, by default, filter extracted tar archives and reject files or modify their metadata. Use the filter argument to control this behavior.
    tar.extract('tensors', path=tmpdir)

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html
================================== short test summary info ===================================
FAILED test/test_gc.py::TestGC::test_gc - ReferenceError: weakly-referenced object no longer exists
FAILED test/test_gc.py::TestGC::test_gc_complex - ReferenceError: weakly-referenced object no longer exists
===== 2 failed, 1668 passed, 194 skipped, 36 xfailed, 181 warnings in 267.75s (0:04:27) ======

Not sure if this is something related to my setup, or an issue with tinygrad.

Just making an issue for tracking.

Can't repro on my machine. Often the gc tests fail if another test fails, but I don't see that here. If you are still hitting the issue, feel free to reopen with a full repro that I can replicate.

Oh, actually I can reproduce this. It looks like it's caused by contiguous_child.
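
For anyone reading the traceback above: the error is raised by `isinstance(x, Tensor)` while iterating `gc.get_objects()`. In CPython, one way this happens is when the list contains a `weakref.proxy` whose referent has already been freed, because `isinstance` looks up `__class__` through the proxy. The sketch below reproduces that mechanism in isolation. The `Tensor` class is a stand-in, `make_dead_proxy` and `tensors_allocated_safe` are hypothetical names, and nothing here is tinygrad's actual code or the fix that was applied; how `contiguous_child` is involved is not shown in this thread.

```python
import gc
import weakref


class Tensor:
    """Stand-in for tinygrad's Tensor (assumption, for illustration only)."""


def make_dead_proxy():
    # The local Tensor is freed when this function returns (CPython
    # reference counting), so the returned proxy points at nothing.
    t = Tensor()
    return weakref.proxy(t)


def tensors_allocated_naive():
    # Same shape as the helper in the traceback: isinstance() dereferences
    # any proxy it meets and raises ReferenceError on a dead one.
    return sum([isinstance(x, Tensor) for x in gc.get_objects()])


def tensors_allocated_safe():
    # Hypothetical defensive variant: skip objects whose referent is gone.
    count = 0
    for x in gc.get_objects():
        try:
            if isinstance(x, Tensor):
                count += 1
        except ReferenceError:
            pass
    return count


dead = make_dead_proxy()
try:
    tensors_allocated_naive()
except ReferenceError as err:
    print("naive count failed:", err)
print("safe count:", tensors_allocated_safe())
```

Catching `ReferenceError` only hides the symptom in the counting helper. Note also that a live proxy to a `Tensor` would still pass the `isinstance` check, so a test that wants an exact count may prefer comparing `type(x)` instead.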