Stonesjtu / pytorch_memlab

Profiling and inspecting memory in pytorch

Request: solving the lack of incremental reporting in loops / functions

stas00 opened this issue

This is a great tool for finding where the memory has gone - thank you!

I have a request:

  1. add support for partial loop unrolling - reporting first iterations separately
  2. same for functions

Problem:

Memory is reported misleadingly for any loop or function: since those run multiple times, the report doesn't show how the peak/active memory counters progressed during, say, the first iteration, and instead shows aggregated data for the whole loop/function after it has run more than once. That is correct for the final iteration, but not for the first one, and it's crucial to see the first moment memory usage goes up and where it peaks.

This functionality is typically not needed in a normal memory profiler, where all we want to know is the frequency of calls and the total usage for a given line; but since this is an investigation tool, we need to see the first few uses. I hope I was able to convey the issue clearly.

I tried to solve this manually by unrolling the loop in the code I was profiling and replicating the iteration code multiple times, but that is not very sustainable.
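For illustration, the manual unrolling looks roughly like this (just a sketch; net and the loop body are made up):

from pytorch_memlab import profile

@profile
def func(net, x):
    # was: for _ in range(3): x = net(x)
    x = net(x)  # iteration 1
    x = net(x)  # iteration 2
    x = net(x)  # iteration 3
    return x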

It's also typical that the memory footprint changes from iteration 1 to 2 and then stabilizes at iteration 3 and onward (if there is no leak, that is). So there could probably be an option to record and print 3 different stats:

  • iteration 1
  • iteration 2
  • all iterations (like it's done now)

The same applies to functions.

I'm thinking perhaps the low-hanging fruit is to give users an option to record a loop iteration or function only the first time it's run and report that. That alone would already be very useful and perhaps not too difficult to implement.
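To make the desired per-iteration numbers concrete, here is roughly the kind of data I mean, sketched with the raw CUDA counters (net and x are assumed to exist; this is not pytorch_memlab API):

import torch

for i in range(10):
    torch.cuda.reset_peak_memory_stats()  # start a fresh peak for this iteration
    x = net(x)
    print(f"iter {i}: active {torch.cuda.memory_allocated() / 2**20:.1f} MiB, "
          f"peak {torch.cuda.max_memory_allocated() / 2**20:.1f} MiB")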

Thank you!

Hi Stas,

Thanks for your detailed feature description.

I would like to propose an example to make sure I get the point.

statement unrolling:

Suppose I have a function like this:

import torch
from pytorch_memlab import profile

@profile
def func():
    net = torch.nn.Linear(5, 5)
    x = torch.Tensor(5, 5)
    for _ in range(10):
        x = net(x)

The expected output should be:

def func():
    for _ in range(10):  # first hit
        x = net(x)
    for _ in range(10):  # second hit
        x = net(x)
    for _ in range(10):  # third hit (probably)
        x = net(x)
    # ...

function unrolling:

If you want to profile a function like inner:

def inner(x):
    x = net(x)
    return x

def outer(x):
    for _ in range(10):
        x = inner(x)
    return x

you can simply add @profile_every to inner to get memory stats for every iteration,
which prints something like:

def inner(x):
    x = net(x)
    return x
def inner(x):
    x = net(x)
    return x
def inner(x):
    x = net(x)
    return x
def inner(x):
    x = net(x)
    return x
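For completeness, the decorated version would be (a sketch; net is assumed to be defined elsewhere):

from pytorch_memlab import profile_every

@profile_every(1)  # print per-line stats on every call
def inner(x):
    x = net(x)
    return x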

For functions, yes, @profile_every(1) does the trick, thank you! Would it also be possible to add a limit? Say my function runs 100 times and I want to see its 1st, 2nd, and last run, without wading through a huge output that isn't needed. Perhaps @profile_these_times(1, 2, -1)? Or, if that's too complicated, perhaps just add a stop, as in @profile_every(1, 3) == profile every run and stop after 3 runs.

For loops, no: we want to do the same as for functions, so that the body of the loop can be profiled. Written symbolically:

     @profile_every(1) # not legal python
     for _ in range(10): 
        x = net(x)

output:

        x = net(x)  # first hit
        x = net(x)  # 2nd hit
        x = net(x)  # 3rd hit
        [...]
        x = net(x)  # 10th hit

I hope I was able to explain in the OP how one doesn't get the incremental info when the loop is repeated multiple times.

One workaround would be to turn the body of the loop into a function and profile it, but that may require significant alterations to the user's code.
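For reference, a sketch of that workaround (loop_body is a made-up name for the extracted body; net is assumed to exist):

from pytorch_memlab import profile_every

# before: the body was inline in the loop
#   for _ in range(10):
#       x = net(x)

@profile_every(1)  # each call of the extracted body now gets its own report
def loop_body(x):
    return net(x)

for _ in range(10):
    x = loop_body(x)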