alex-sherman / deco

Memory isn't freed when creating and returning large objects in concurrent function

twolf90 opened this issue

First of all thanks for the great tool, it's really easy to use.

However, I've found some strange behaviour when creating relatively large structures (e.g. a list with 1000000 elements) in the concurrent function and returning them to the synchronized function. The memory allocated for the list in the concurrent function is simply never freed again. I guess some reference is left over, maybe? I've built a little stand-alone script to reproduce the problem:

The setup looks like the following:

import multiprocessing
from deco import concurrent, synchronized


@concurrent
def my_conc():
    tmp = range(1000000)
    return tmp


@synchronized
def my_sync(my_dict):
    new_dict = {}

    for key, value in my_dict.iteritems():
        new_dict[key] = my_conc()


def main():
    cpus = multiprocessing.cpu_count()
    my_dict = {}
    for i in xrange(cpus):
        my_dict[i] = 0

    for i in xrange(100):
        print i
        my_sync(my_dict)


if __name__ == '__main__':
    main()

So, depending on the number of CPUs, I build n lists of 1000000 ints each and call the synchronized function repeatedly in a for loop. The allocated memory keeps increasing until all of it is used and my PC starts swapping...

As soon as I remove the decorators, everything works fine (although not concurrently) ;). Also, this only happens if I return tmp from the my_conc() function. Once I replace it with 'return 0', everything's fine again.

I'm sorry if I misunderstood some limitation of the tool - it's my first time using parallel processing in Python.

Thanks in advance!

That is a great catch, actually! This bug came with some logic errors as well, so it was pretty important to resolve. The commit I just made, c8ecc63, should sort it out, although it's a pretty small change. Thanks for finding this!

If you're interested in what was going on: during the synchronized function's execution, deco keeps track of all the assignments that happen. Each tracked assignment includes a reference to Pool's async result object, which in turn holds a reference to the return value of the concurrent function. During each synchronization event, deco goes through the list and performs the assignments. Turns out I had forgotten to clear the list of assignments, though! So every event would perform every earlier assignment over again, and the list also caused this memory leak, since it kept a reference to every result ever returned.
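For anyone who wants to see the mechanism in isolation, here is a minimal Python 3 sketch of the pattern described above. It is not deco's actual code: the `PendingAssignments` class, its `clear_after_sync` flag, and the `run` helper are hypothetical names made up for illustration, and a `ThreadPool` stands in for the process pool so the example runs anywhere. It only shows that a list of (target, key, async result) entries which is never cleared keeps every result alive, while clearing it after each synchronization releases them.

```python
from multiprocessing.pool import ThreadPool


def make_big_list(n):
    # Stand-in for the concurrent function from the report.
    return list(range(n))


class PendingAssignments:
    """Hypothetical bookkeeping object, NOT deco's real implementation."""

    def __init__(self, clear_after_sync):
        self.clear_after_sync = clear_after_sync
        self.pending = []  # (target dict, key, AsyncResult) triples

    def record(self, target, key, async_result):
        self.pending.append((target, key, async_result))

    def synchronize(self):
        # Perform every recorded assignment, blocking on each result.
        for target, key, async_result in self.pending:
            target[key] = async_result.get()
        if self.clear_after_sync:
            # The fix described above: drop the references once applied.
            self.pending = []


def run(clear_after_sync, rounds=3, keys=2, size=1000):
    tracker = PendingAssignments(clear_after_sync)
    with ThreadPool(2) as pool:
        for _ in range(rounds):
            new_dict = {}
            for key in range(keys):
                result = pool.apply_async(make_big_list, (size,))
                tracker.record(new_dict, key, result)
            tracker.synchronize()
    # Number of async results (and their payloads) still referenced.
    return len(tracker.pending)


leaky_count = run(clear_after_sync=False)
fixed_count = run(clear_after_sync=True)
print(leaky_count, fixed_count)
```

In the variant that never clears, the number of retained entries grows with every round (rounds times keys), which matches the steadily growing memory in the report; with clearing, nothing is retained once a synchronization has finished.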

Once again, really appreciate you finding this! Let me know if there is still some memory leak in your application or feel free to close this.