nlohmann / json

JSON for Modern C++

Home Page: https://json.nlohmann.me

Why nlohmann does not release memory

mohammad-nazari opened this issue

We have two problems:
1. Why does nlohmann::json use so much memory to parse data?
2. After calling the parser locally in a function, as in the code below, the memory is not released. My JSON data is about 8 MB and the parser uses more than 50 MB to parse it. I parsed this JSON data 10 times and memory usage went up to about 600 MB; after the function finished, the memory was not released.

    void GenerateNlohmann() {
      std::string filePath{FILE_ADDRESS};
      std::ifstream iFile(filePath.c_str(), std::ios::in);
      std::string data{};
      if (iFile.is_open()) {
        data = std::string((std::istreambuf_iterator<char>(iFile)),
                           std::istreambuf_iterator<char>()); // About 8 MB in size
        iFile.close();
      }
      if (!data.empty()) {
        nlohmann::json json = nlohmann::json::parse(data); // Uses about 50 MB of memory
        std::vector<nlohmann::json> jsons{};
        for (int i = 0; i < 10; ++i) {
          nlohmann::json j = nlohmann::json::parse(data);
          jsons.emplace_back(j);
        }
        while (!jsons.empty()) {
          jsons.pop_back();
        }
      }
    }

    int main() {
      GenerateNlohmann();

      // Now memory usage is about 600 MB
      std::cout << "Input a number to exit" << std::endl;
      int i;
      std::cin >> i;

      return 0;
    }

Our platform is Ubuntu 18.04 with CMake 3.15.3 and g++ 7.4.0.

This is strange, as the library does release all allocated memory in the destructor, and we have no issues with Valgrind and ASAN. I need to double check your code.

Hi @nlohmann, it is even stranger if I tell you that everything is OK on Windows. So I am waiting for your response.
Thanks.

This is indeed strange. It could be due to the allocator not releasing the memory and caching it for later use. More info here and here.

I couldn't release the memory using malloc_trim() as the SO answers suggest, but I reproduced a similar behavior (~300 MB for VmSize reported in /proc/<pid>/status) after calling the following function:

void TestMemAllocation()
{
    std::vector<std::vector<int>> all_v{};
    for (int i = 0; i < 100; i++)
    {
        std::vector<int> v;
        v.resize(5000000);      // ~19 MiB of zero-initialized ints
        all_v.emplace_back(v);  // copied into the outer vector
    }
}                               // everything is freed here, but the allocator may keep the pages cached
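
For reference, here is a minimal, Linux-only sketch of how the resident set size can be observed around such an allocation test. It assumes glibc (for malloc_trim from <malloc.h>) and reads VmRSS from /proc/self/status; the CurrentRss() helper is just something written for illustration, not part of the library:

#include <malloc.h>   // malloc_trim (glibc-specific)
#include <fstream>
#include <iostream>
#include <string>
#include <vector>

// Return the VmRSS line (resident set size) from /proc/self/status.
std::string CurrentRss()
{
    std::ifstream status("/proc/self/status");
    std::string line;
    while (std::getline(status, line))
    {
        if (line.rfind("VmRSS:", 0) == 0)
        {
            return line;
        }
    }
    return "VmRSS: unknown";
}

void TestMemAllocation()
{
    std::vector<std::vector<int>> all_v{};
    for (int i = 0; i < 100; i++)
    {
        std::vector<int> v;
        v.resize(5000000);
        all_v.emplace_back(v);
    }
}

int main()
{
    std::cout << "before allocation:  " << CurrentRss() << std::endl;
    TestMemAllocation();   // all vectors are destroyed on return
    std::cout << "after deallocation: " << CurrentRss() << std::endl;
    malloc_trim(0);        // ask glibc to hand cached free memory back to the OS
    std::cout << "after malloc_trim:  " << CurrentRss() << std::endl;
    return 0;
}

Whether the last two values differ depends on the allocator's caching decisions.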

This is a memory paging policy.

No idea or solution?

It is strange. I tried to reproduce your problem on both Windows and Linux, but everything is OK and there are no issues with Valgrind and ASAN.

I agree with @nickaein, and will try to test and verify.

I cannot reproduce your example. In my tests, all allocated memory is released.

Did you test my example? How do you check that the memory is released? If you test with Valgrind or ASAN or the like, it looks OK, but in a monitoring system you can see the memory is not released. We call the parser continually in a big project and the memory goes up more and more. We checked all the other modules and libraries; they were OK.

Yes:

(screenshot: memory graph showing usage rising during parsing and then flattening)

You see the memory is going up during parsing, but then goes flat once GenerateNlohmann() is left.

Thank you very much.
What is this tool? Please send me the download page.

It’s Xcode.

I tested the example on Ubuntu. Please check it on Ubuntu.

I still believe this is an optimization by the allocator (probably glibc in your case) and unrelated to the library.

After a closer examination, I see that by calling malloc_trim(0) the memory is indeed returned to the OS. Note that the virtual memory size might not shrink, which is a non-issue since glibc will (hopefully) keep the VM size below the maximum. However, the size of the resident memory (the amount of memory residing in physical RAM) does shrink. Compare the values of VmSize and VmRSS in the two cases:

# without calling malloc_trim
$ cat /proc/$(pidof without-malloc_trim.out)/status | grep Vm
VmPeak:	  571516 kB
VmSize:	  388364 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	  528560 kB
VmRSS:	  385600 kB
VmData:	  382576 kB
VmStk:	     136 kB
VmExe:	      84 kB
VmLib:	    3380 kB
VmPTE:	     800 kB
VmSwap:	       0 kB

# with calling malloc_trim
$ cat /proc/$(pidof with-malloc_trim.out)/status | grep Vm
VmPeak:	  571516 kB
VmSize:	  370860 kB
VmLck:	       0 kB
VmPin:	       0 kB
VmHWM:	  528632 kB
VmRSS:	    3436 kB
VmData:	  365072 kB
VmStk:	     136 kB
VmExe:	      84 kB
VmLib:	    3380 kB
VmPTE:	     764 kB
VmSwap:	       0 kB

I reproduced it successfully on Ubuntu.
My JSON data is only about 1.4 KB, but I parsed it many times.
Here are my test results:

Before Run:
KiB Mem :  8167476 total,  5461204 free,   284120 used,  2422152 buff/cache

1000 times:
KiB Mem :  8167476 total,  5456600 free,   288724 used,  2422152 buff/cache

10000 times:
KiB Mem :  8167476 total,  5405916 free,   339376 used,  2422184 buff/cache

100000 times:
KiB Mem :  8167476 total,  4893176 free,   852104 used,  2422196 buff/cache

After entering the int (after the run):
KiB Mem :  8167476 total,  5462208 free,   283116 used,  2422152 buff/cache

There is indeed a problem, but this is an optimization by the allocator (probably glibc in your case) and unrelated to the library, as @nickaein said.

If you add malloc_trim(0) to your code:

	while (!jsons.empty()) {
	  jsons.pop_back();
	}
	
+	malloc_trim(0);
  }

you will find everything will be OK.
On Windows we cannot reproduce it because the allocator there is not glibc, I think.
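
For completeness, a self-contained sketch of the fixed function might look like the following, assuming FILE_ADDRESS is defined to the JSON file's path as in the original report; the only change from the code above is the malloc_trim(0) call and the <malloc.h> include it needs:

#include <malloc.h>   // malloc_trim (glibc-specific)
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

void GenerateNlohmann() {
  std::string filePath{FILE_ADDRESS};
  std::ifstream iFile(filePath.c_str(), std::ios::in);
  std::string data{};
  if (iFile.is_open()) {
    data = std::string((std::istreambuf_iterator<char>(iFile)),
                       std::istreambuf_iterator<char>());
    iFile.close();
  }
  if (!data.empty()) {
    nlohmann::json json = nlohmann::json::parse(data);
    std::vector<nlohmann::json> jsons{};
    for (int i = 0; i < 10; ++i) {
      jsons.emplace_back(nlohmann::json::parse(data));
    }
    while (!jsons.empty()) {
      jsons.pop_back();
    }
    malloc_trim(0); // return the pages freed above from the allocator's cache to the OS
  }
}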

Other test:
I wrote another program that mallocs a lot of small memory blocks with glibc, and the problem is still there. My program is unrelated to the library; it just mallocs and frees many small blocks.

Anyway, the problem is unrelated to the library.
If we added malloc_trim(0) inside the library, there would be many calls during parsing, which would reduce performance. So the better solution is to add malloc_trim(0) in your code.

Thank you very much @dota17. And thanks to @nlohmann and @nickaein. You saved me and our team.
Just one last question: is it possible to bypass glibc's caching in Linux-based projects so we can avoid using malloc_trim(0)? Otherwise we have to find the bottleneck places in the code and inject malloc_trim(0) there.

I highly doubt this behavior can cause any problems in most applications, except maybe, and only maybe, in very memory-constrained situations such as an embedded system with very little physical memory. Note that the memory cached by this optimization isn't going to grow infinitely. You can verify that by calling GenerateNlohmann() multiple times: the cached memory stays the same.

Nevertheless, if this is really problematic in your case, these are the workarounds I can think of:

  1. glibc provides a tuning framework which you can use to tune memory-allocation parameters. You might play around with the parameters to see which combination diminishes the caching (e.g. trim_threshold, arena_max, mxfast, etc.).

  2. As mentioned earlier, calling malloc_trim(0) after a large de-allocation would be a solution.

  3. Use another standard library implementation (a comparison on some libraries). Note that other libraries have probably implemented such caching too, since calling into the kernel for each free() would significantly impact performance.

  4. The JSON library supports overriding the default allocator (an example). You might try other allocators such as tcmalloc to see if they are helpful; see the sketch after this list.
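
To make workaround 4 concrete, here is a minimal sketch of how a different allocator can be plugged into the library via the basic_json template. MyAllocator is just a placeholder alias for std::allocator; in a real test it would be replaced by e.g. a tcmalloc-backed or pool allocator:

#include <cstdint>
#include <map>
#include <memory>
#include <string>
#include <vector>
#include <nlohmann/json.hpp>

// Placeholder: any allocator template with the usual std::allocator interface.
template <typename T>
using MyAllocator = std::allocator<T>;

// basic_json takes the allocator as its eighth template parameter.
using my_json = nlohmann::basic_json<
    std::map,         // ObjectType
    std::vector,      // ArrayType
    std::string,      // StringType
    bool,             // BooleanType
    std::int64_t,     // NumberIntegerType
    std::uint64_t,    // NumberUnsignedType
    double,           // NumberFloatType
    MyAllocator>;     // AllocatorType

int main() {
  my_json j = my_json::parse(R"({"answer": 42})");
  return j["answer"] == 42 ? 0 : 1;
}

The other template arguments shown are the defaults that nlohmann::json itself uses, so only the allocator is swapped out.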

Great suggestions @nickaein. I will analyse your options and select the best solution.
Thanks.

Unfortunately, I tested the library with tcmalloc and the problem is still there.
More info: link