PyTorch training code may have a memory leak
DrRyanHuang opened this issue · comments
Two days ago, I used `gc` to analyze the memory leak. It seemed that the dataset was not released after training/eval for one epoch, but I am not very sure because I didn't have enough time to dig deeper. Hope this helps you solve the problem. I added the code below after `train_one_epoch`:
```python
import gc

if cuda_empty_cache:
    # Drop the per-epoch logger and force a collection so that anything
    # it kept alive becomes visible to the gc diagnostics below.
    del metric_logger
    gc.collect()
    # torch.cuda.empty_cache()
    print(f"Number of objects in gc.garbage: {len(gc.garbage)}")

    # Pick out COCO-style annotation dicts that are still alive.
    ann = []
    for cycle in cycles:
        if isinstance(cycle, dict) and 'bbox' in cycle:
            ann.append(cycle)

    # Show what is still holding a reference to the first one.
    for obj in ann:
        referrers = gc.get_referrers(obj)
        print(f"Referrers of {obj}: {referrers}")
        break
```
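One caveat with the snippet above: on Python 3.4+, `gc.garbage` stays empty in normal operation, so the `len(gc.garbage)` print will usually report 0. To make the collector keep the objects it frees for later inspection, it has to be put in debug mode first; a minimal sketch using the standard `gc` API (this setup line is not part of the original code):

```python
import gc

# Tell the collector to store every unreachable object it finds in
# gc.garbage instead of freeing it, so leaked cycles can be inspected.
gc.set_debug(gc.DEBUG_SAVEALL)

gc.collect()
print(f"Number of objects in gc.garbage: {len(gc.garbage)}")
```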
Hello, would you mind providing the full file? I'm confused about how to use your solution. For example, I don't understand what's contained in the `cycles` variable.
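For reference, one plausible way such a `cycles` list could be built is by snapshotting every object the collector still tracks after the epoch. The following is a hypothetical reconstruction (the names `cycles` and `ann` mirror the snippet above, but this is not the original author's code):

```python
import gc

# Hypothetical reconstruction: after the epoch, collect and then treat
# every dict the collector still tracks as a candidate leaked object.
gc.collect()
cycles = [obj for obj in gc.get_objects() if isinstance(obj, dict)]

# COCO-style annotation dicts are recognizable by their 'bbox' key.
ann = [c for c in cycles if 'bbox' in c]
for obj in ann:
    # gc.get_referrers shows what is still keeping the annotation alive.
    print(f"Annotation with keys {list(obj.keys())} has "
          f"{len(gc.get_referrers(obj))} referrers")
    break
```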