bloomberg / memray

Memray is a memory profiler for Python

Home Page: https://bloomberg.github.io/memray/

Memray leaks native backtrace state

tonybaloney opened this issue · comments

Bug Report

I've recently tried combining flaky and pytest-memray with the leak analysers. On GitHub Actions, the pytest process was being sent SIGKILL by the kernel, which is weird.

The behaviour is that while pytest is running, memory usage slowly climbs (toward 10 GiB, or until the machine runs out of memory).

[Screenshot: memory usage graph, 2023-09-28 14:31]
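
A minimal way to watch this from outside the process, assuming Linux and /proc, is to poll VmRSS for the pytest PID (the PID below is whatever your system reports for the run):

import sys
import time

def rss_mib(pid: int) -> float:
    # VmRSS in /proc/<pid>/status is reported in KiB; convert to MiB.
    with open(f"/proc/{pid}/status") as f:
        for line in f:
            if line.startswith("VmRSS:"):
                return int(line.split()[1]) / 1024
    return 0.0

if __name__ == "__main__":
    pid = int(sys.argv[1])  # e.g. the PID reported by `pgrep -f pytest`
    while True:
        print(f"{time.strftime('%H:%M:%S')}  RSS = {rss_mib(pid):.1f} MiB")
        time.sleep(5)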

So I looked and looked at which tests were leaking memory. Yes, there were some. But then it crossed my mind to take my own library out of the equation and just test stdlib modules instead.

Yes, I appreciate this is a ridiculous issue to raise. But I think that either memray or flaky is leaking significant memory. I might be totally wrong or doing something silly in these tests and we can all ignore this issue :-)

  • memray 1.5.0
  • pytest 7.4.2
  • flaky 3.7.0
  • Python 3.10.12 (main, Jun 7 2023, 19:32:10) [GCC 10.2.1 20210110]

Input Code

import sys
import pytest
import logging

def filter_gc(stack):
    # Only count a leak if one of its top frames comes from the extension
    # module under test; allocations made purely by the interpreter or the
    # test harness are ignored.
    for frame in stack.frames[:4]:
        if "bananacore" in frame.filename:
            return True

    return False


levels = [
    (logging.DEBUG, "DEBUG"),
    (logging.INFO, "INFO"),
    (logging.WARNING, "WARNING"),
    (logging.ERROR, "ERROR"),
    (logging.CRITICAL, "CRITICAL"),
    (logging.NOTSET, "NOTSET"),
]


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
@pytest.mark.parametrize("level, level_name", levels)
def test_getlevelname(level, level_name):
    assert logging.getLevelName(level) == level_name
    assert logging.getLevelName(level_name) == level


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_value_error_invalid_string_names():
    with pytest.raises(ValueError):
        assert logging.getLevelName("EXample") == "Level EXample"


junk_level_names = [None, 3.2, (), [], {}, 100]


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
@pytest.mark.parametrize("level", junk_level_names)
def test_getlevelname_invalid_level(level):
    with pytest.raises((TypeError, ValueError)):
        logging.getLevelName(level)


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_critical(capsys):
    logging.root.handlers = []
    logging.critical("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "CRITICAL:root:test\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_fatal(capsys):
    logging.root.handlers = []
    logging.fatal("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "CRITICAL:root:test\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_error(capsys):
    logging.root.handlers = []
    logging.error("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "ERROR:root:test\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_exception(capsys):
    logging.root.handlers = []
    logging.exception("test", exc_info=Exception("bork bork bork"))

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "ERROR:root:test\nException: bork bork bork\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_warning(capsys):
    logging.root.handlers = []
    logging.warning("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "WARNING:root:test\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_warn(capsys):
    logging.root.handlers = []
    logging.warn("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == "WARNING:root:test\n"


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_info(capsys):
    logging.root.handlers = []
    logging.info("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == ""


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_debug(capsys):
    logging.root.handlers = []
    logging.debug("test")

    cap = capsys.readouterr()
    assert cap.out == ""
    assert cap.err == ""


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_root_logger_log():
    logging.root.handlers = []
    logging.log(logging.DEBUG, "test")


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_basic_config_with_stream_and_filename_without_handlers():
    logging.root.handlers = []

    with pytest.raises(ValueError):
        logging.basicConfig(stream=sys.stderr, filename="log.txt")


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_basic_config_with_stream_or_filename_with_handlers():
    handler = logging.StreamHandler(sys.stderr)

    with pytest.raises(ValueError):
        logging.basicConfig(handlers=[handler], stream=sys.stdout)


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_basic_config_invalid_style():
    with pytest.raises(ValueError):
        logging.basicConfig(style="!")


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_basic_config_with_level():
    logging.basicConfig(level=logging.INFO)
    assert logging.root.level == logging.INFO


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_basic_config_invalid_arguments():
    logging.root.handlers = []
    with pytest.raises(ValueError):
        logging.basicConfig(invalid_argument="value")


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
def test_make_log_record():
    log_record = logging.makeLogRecord({"levelno": logging.WARNING})

    assert log_record.levelno == logging.WARNING


@pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
@pytest.mark.parametrize("encoding", ["utf-8", None])
def test_basic_config_encoding(encoding):
    logging.basicConfig(filename="test.txt", encoding=encoding)

Running this code with memray, native traces, tracing of Python allocators, and flaky forced to re-run each test 9-10 times:

python -m pytest testproject/ --memray --stacks=7 --native --trace-python-allocators --force-flaky --min-passes=9 --max-runs=10
========================================================= test session starts =========================================================
platform linux -- Python 3.10.12, pytest-7.4.2, pluggy-1.3.0
rootdir: /workspaces/picologging
plugins: repeat-0.9.1, flaky-3.7.0, Faker-19.6.2, hypothesis-6.87.0, memray-1.5.0, anyio-4.0.0, cov-4.1.0
collected 30 items                                                                                                                    

testproject/test_logging.py ......FFF...F..........FFF...                                                                       [100%] [100%] [100%] [100%] [100%] [100%] [100%] [100%]. [100%]

============================================================== FAILURES ===============================================================
________________________________________________ test_value_error_invalid_string_names ________________________________________________

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    def test_value_error_invalid_string_names():
>       with pytest.raises(ValueError):
E       Failed: DID NOT RAISE <class 'ValueError'>

testproject/test_logging.py:32: Failed
________________________________________________ test_getlevelname_invalid_level[None] ________________________________________________

level = None

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    @pytest.mark.parametrize("level", junk_level_names)
    def test_getlevelname_invalid_level(level):
>       with pytest.raises((TypeError, ValueError)):
E       Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)

testproject/test_logging.py:42: Failed
________________________________________________ test_getlevelname_invalid_level[3.2] _________________________________________________

level = 3.2

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    @pytest.mark.parametrize("level", junk_level_names)
    def test_getlevelname_invalid_level(level):
>       with pytest.raises((TypeError, ValueError)):
E       Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)

testproject/test_logging.py:42: Failed
________________________________________________ test_getlevelname_invalid_level[100] _________________________________________________

level = 100

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    @pytest.mark.parametrize("level", junk_level_names)
    def test_getlevelname_invalid_level(level):
>       with pytest.raises((TypeError, ValueError)):
E       Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)

testproject/test_logging.py:42: Failed
_______________________________________ test_basic_config_with_stream_or_filename_with_handlers _______________________________________

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    def test_basic_config_with_stream_or_filename_with_handlers():
        handler = logging.StreamHandler(sys.stderr)
    
>       with pytest.raises(ValueError):
E       Failed: DID NOT RAISE <class 'ValueError'>

testproject/test_logging.py:144: Failed
___________________________________________________ test_basic_config_invalid_style ___________________________________________________

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    def test_basic_config_invalid_style():
>       with pytest.raises(ValueError):
E       Failed: DID NOT RAISE <class 'ValueError'>

testproject/test_logging.py:150: Failed
____________________________________________________ test_basic_config_with_level _____________________________________________________

    @pytest.mark.limit_leaks("64B", filter_fn=filter_gc)
    def test_basic_config_with_level():
        logging.basicConfig(level=logging.INFO)
>       assert logging.root.level == logging.INFO
E       assert 30 == 20
E        +  where 30 = <RootLogger root (WARNING)>.level
E        +    where <RootLogger root (WARNING)> = logging.root
E        +  and   20 = logging.INFO

testproject/test_logging.py:157: AssertionError
========================================================== warnings summary ===========================================================
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
testproject/test_logging.py::test_root_logger_warn
  /workspaces/picologging/testproject/test_logging.py:99: DeprecationWarning: The 'warn' function is deprecated, use 'warning' instead
    logging.warn("test")

-- Docs: https://docs.pytest.org/en/stable/how-to/capture-warnings.html

============================================================ MEMRAY REPORT ============================================================
Allocation results for testproject/test_logging.py::test_root_logger_exception at the high watermark

         πŸ“¦ Total memory allocated: 3.8KiB
         πŸ“ Total allocations: 348
         πŸ“Š Histogram of allocation sizes: |  β–‚ β–‡ β–ˆ β–„|
         πŸ₯‡ Biggest allocating functions:
                - __init__:/home/vscode/.local/lib/python3.10/site-packages/exceptiongroup/_formatting.py:47 -> 216.0B
                - format:/home/vscode/.local/lib/python3.10/site-packages/exceptiongroup/_formatting.py:248 -> 208.0B
                - wrapper:/home/vscode/.local/lib/python3.10/site-packages/pytest_memray/plugin.py:191 -> 168.0B
                - error:/usr/local/lib/python3.10/logging/__init__.py:1506 -> 168.0B
                - error:/usr/local/lib/python3.10/logging/__init__.py:2105 -> 168.0B


Allocation results for testproject/test_logging.py::test_root_logger_warn at the high watermark

         πŸ“¦ Total memory allocated: 2.5KiB
         πŸ“ Total allocations: 222
         πŸ“Š Histogram of allocation sizes: |▁ β–‡ β–ˆ ▁  |
         πŸ₯‡ Biggest allocating functions:
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 1.1KiB
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 168.0B
                - wrapper:/home/vscode/.local/lib/python3.10/site-packages/pytest_memray/plugin.py:191 -> 168.0B
                - _showwarnmsg:/usr/local/lib/python3.10/warnings.py:112 -> 128.0B
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 80.0B


Allocation results for testproject/test_logging.py::test_basic_config_invalid_arguments at the high watermark

         πŸ“¦ Total memory allocated: 2.2KiB
         πŸ“ Total allocations: 149
         πŸ“Š Histogram of allocation sizes: |  β–ˆ β–…    |
         πŸ₯‡ Biggest allocating functions:
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 1.1KiB
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 168.0B
                - raises:/home/vscode/.local/lib/python3.10/site-packages/_pytest/python_api.py:937 -> 83.0B
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 80.0B
                - __init__:/usr/local/lib/python3.10/logging/__init__.py:790 -> 80.0B


Allocation results for testproject/test_logging.py::test_root_logger_critical at the high watermark

         πŸ“¦ Total memory allocated: 2.1KiB
         πŸ“ Total allocations: 212
         πŸ“Š Histogram of allocation sizes: |β–‚ β–ˆ β–… β–‚  |
         πŸ₯‡ Biggest allocating functions:
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 1.1KiB
                - wrapper:/home/vscode/.local/lib/python3.10/site-packages/pytest_memray/plugin.py:191 -> 168.0B
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 168.0B
                - __init__:/usr/local/lib/python3.10/logging/__init__.py:790 -> 80.0B
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 80.0B


Allocation results for testproject/test_logging.py::test_root_logger_fatal at the high watermark

         πŸ“¦ Total memory allocated: 2.1KiB
         πŸ“ Total allocations: 212
         πŸ“Š Histogram of allocation sizes: |β–‚ β–ˆ β–… β–‚  |
         πŸ₯‡ Biggest allocating functions:
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 1.1KiB
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 168.0B
                - wrapper:/home/vscode/.local/lib/python3.10/site-packages/pytest_memray/plugin.py:191 -> 168.0B
                - __init__:/usr/local/lib/python3.10/logging/__init__.py:790 -> 80.0B
                - validate:/usr/local/lib/python3.10/logging/__init__.py:428 -> 80.0B


===Flaky Test Report===

test_getlevelname[10-DEBUG] passed 1 out of the required 9 times. Running test again until it passes 9 times.
test_getlevelname[10-DEBUG] passed 2 out of the required 9 times. Running test again until it passes 9 times.
...
test_basic_config_encoding[None] passed 9 out of the required 9 times. Success!

===End Flaky Test Report===
======================================================= short test summary info =======================================================
FAILED testproject/test_logging.py::test_value_error_invalid_string_names - Failed: DID NOT RAISE <class 'ValueError'>
FAILED testproject/test_logging.py::test_getlevelname_invalid_level[None] - Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)
FAILED testproject/test_logging.py::test_getlevelname_invalid_level[3.2] - Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)
FAILED testproject/test_logging.py::test_getlevelname_invalid_level[100] - Failed: DID NOT RAISE (<class 'TypeError'>, <class 'ValueError'>)
FAILED testproject/test_logging.py::test_basic_config_with_stream_or_filename_with_handlers - Failed: DID NOT RAISE <class 'ValueError'>
FAILED testproject/test_logging.py::test_basic_config_invalid_style - Failed: DID NOT RAISE <class 'ValueError'>
FAILED testproject/test_logging.py::test_basic_config_with_level - assert 30 == 20
============================================== 7 failed, 23 passed, 9 warnings in 56.35s ==============================================
$ python -m pytest testproject --memray --stacks=7 --native --trace-python-allocators

Running without forcing flaky re-runs (the command above) also uses significant memory, but not the same total amount.

I'm not able to run my entire test suite because it gets killed by the OOM killer.

Who memory profiles the memory profilers?

Well, to my immense surprise, I can reproduce this. I'll need to spend some time figuring out what is going on. Our working set size shouldn't be nearly this large, so clearly more is being kept alive from one test to the next than we intended.

Some initial notes:

  • It's not related to flaky; I see the same thing with pytest-repeat, and I'm confident we'd see the same with tests that are just parametrized to run many times
  • Memray 1.5.0 is fairly old, but the issue reproduces even with the latest version, so it's not something we've fixed already
  • While the process is chugging through memory, I've spotted 65k entries in /proc/$pid/maps, which is... a lot. We've got many copies of every shared library mapped into memory, presumably from either the unwinding or symbolifying that we need to do to support native traces (a quick way to inspect this is sketched after this list)...
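
A quick way to inspect this, assuming Linux, is to count the entries in /proc/<pid>/maps and see which shared objects appear most often; this only reads /proc and is nothing Memray-specific:

import collections
import sys

def summarize_maps(pid: int, top: int = 10) -> None:
    counts = collections.Counter()
    total = 0
    with open(f"/proc/{pid}/maps") as f:
        for line in f:
            total += 1
            fields = line.split()
            # The pathname, when present, is the sixth field of a maps entry.
            if len(fields) >= 6 and ".so" in fields[5]:
                counts[fields[5]] += 1
    print(f"{total} total mappings")
    for path, n in counts.most_common(top):
        print(f"{n:6d}  {path}")

if __name__ == "__main__":
    summarize_maps(int(sys.argv[1]))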

So, yep, real issue. We're leaking the state associated with libbacktrace's unwinding for each test case. We have to leak it - the API of libbacktrace is designed so that's the only option, surprisingly:

Calling this function allocates resources that cannot be freed.
There is no backtrace_free_state function. The state is used to
cache information that is expensive to recompute. Programs are
expected to call this function at most once and to save the return
value for all later calls to backtrace functions. */

But, we ought to be reusing it from one test case to the next, instead of leaking it between each test and creating a new one for the next test. This has been wrong for a long time, but we never noticed because only pytest-memray reads many different capture files during the lifetime of the program - it's not possible to trigger this misbehavior through the public Memray APIs or the memray CLI entry point, because they only ever operate on one single capture file.

It's easy enough to fix - we just need to move some state to be explicitly global rather than recreating it each time a new capture file is read, and wrap a lock around it. While looking at the surrounding code I spotted some other things that I don't like, though, so I'm gonna try to clean up a few different things here...
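
The shape of the fix, sketched in Python purely for illustration (the real change lives in our C++ extension, so none of these names are the actual ones): create the expensive, never-freed state once, keep it in a module-level global, and guard it with a lock so every capture-file reader shares a single instance instead of leaking a fresh one per file.

import threading

_state_lock = threading.Lock()
_backtrace_state = None  # created at most once and reused by every reader

def _create_state():
    # Stand-in for the expensive, cannot-be-freed initialization
    # (backtrace_create_state in the real code).
    return object()

def get_backtrace_state():
    global _backtrace_state
    with _state_lock:
        if _backtrace_state is None:
            _backtrace_state = _create_state()
        return _backtrace_state

Every call returns the same object, which is the sharing behaviour we want between test cases.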

> So, yep, real issue

Is it wrong that I breathed a sigh of relief?

Thanks for figuring it out. I'll update and retest when there's a patch.

@tonybaloney the latest release includes a bug fix for this. Hopefully it works this time. 😊