microsoft / WSL

Issues found on WSL

Home Page:https://docs.microsoft.com/windows/wsl

WSL 2 consumes massive amounts of RAM and doesn't return it

LordMonoxide opened this issue

  • Your Windows build number: 18917

  • What's wrong / what should be happening instead: WSL 2 starts using huge amounts of RAM after a while, just using it like normal. At the moment I'm using phpstorm, and did a dump/load of a database. Vmmem is using 7 GB of my 16 GB of RAM and not returning any, even though Ubuntu is actually using much less. I have seen it grow until nearly 100% of my system memory is in use, and it will not release it until I shut down the WSL 2 VM.

This may or may not be related to #4159

corey@Corey-Laptop:/mnt/c/WINDOWS/system32$ vmstat -s
     15235516 K total memory
       920348 K used memory
      1886048 K active memory
      6434312 K inactive memory
      6606548 K free memory
        76280 K buffer memory
      7632340 K swap cache
            0 K total swap
            0 K used swap
            0 K free swap
       163729 non-nice user cpu ticks
          298 nice user cpu ticks
        13177 system cpu ticks
     68988300 idle cpu ticks
         8962 IO-wait cpu ticks
            0 IRQ cpu ticks
        10022 softirq cpu ticks
            0 stolen cpu ticks
      1481417 pages paged in
      6792976 pages paged out
            0 pages swapped in
            0 pages swapped out
      1079177 interrupts
      5131981 CPU context switches
   1560599814 boot time
         8772 forks
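The gap between process memory and cache in the vmstat -s output above can be made explicit with a few lines of Python. This is a minimal sketch, not part of the original report: the function name parse_vmstat_s is hypothetical, the sample is a subset of the lines quoted above, and it assumes the `<number> K <label>` line format shown there. The line that vmstat labels "swap cache" appears to be the page cache rather than swap usage (total swap is 0 here), so it is added to the buffers.

```python
import re

def parse_vmstat_s(text):
    """Parse the 'K'-unit memory lines of `vmstat -s` into {label: KiB}."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"\s*(\d+) K (.+?)\s*$", line)
        if m:
            stats[m.group(2)] = int(m.group(1))
    return stats

# Subset of the output quoted in the report above.
SAMPLE = """\
     15235516 K total memory
       920348 K used memory
      6606548 K free memory
        76280 K buffer memory
      7632340 K swap cache
"""

stats = parse_vmstat_s(SAMPLE)
cache_kib = stats["buffer memory"] + stats["swap cache"]
print(f"used by processes: {stats['used memory'] / 1024**2:.1f} GiB")
print(f"buffers + cache:   {cache_kib / 1024**2:.1f} GiB")
```

On these numbers, processes account for under 1 GiB while buffers and cache hold more than 7 GiB, which is roughly in line with the 7 GB that Vmmem was reported to use.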

Same here with Docker running on Ubuntu 18.04
image

Thanks for opening the issue. We have a fix for this in the works.

After moving to WSL2, VmMem seems to be constantly pegging my CPU. Please see attached screenshots.

Anything I can do to help your team troubleshoot?

Relevant details:

  • Ubuntu 18.04 and Debian on WSL2, but none are running anything at the moment.
  • Windows insider build 18917.1000 on Macbook Air (8 GB, 4 proc).

image
image

I'm experiencing this too. It happens every time I run wsl after some elapsed time (varying from minutes to hours). It continues regardless of what's running in wsl, and I need to issue a wsl --shutdown to stop it.

Windows insider build: 18922.1000
Linux distro: 18.04

I assume this really belongs in a new issue, but wanted to group my comment with that of @mithunshanbhag so I'll leave it here.

Hi,
Same problem here.
Lots of memory consumption with WSL2, nodejs app and VS Code with Remote - WSL while very low consumption in WSL1 with the same tasks.

Same issue on Win build 18950

@benhillis - Do we have a rough ETA on the fix? I'm actually hitting OOM issues on a machine with 32 gigs while just running a normal ninja build that should succeed.

Build 18955

@zachChilders - ETA on this one is a bit hard unfortunately. Certainly before WSL2 ships to non-insiders.

Can confirm this on 18963 as well. WSL 2 ate most (29GB+ of 32GB) of my RAM after less than 15 minutes of uptime for no good reason; all I've done is an rsync from the host NTFS volume to the ext4 volume, and an apt-get update.

I can confirm the same issue on Microsoft Windows [Version 10.0.18963.1000]

C:\Users\david_d>wsl -l -v
  NAME            STATE           VERSION
* Ubuntu-18.04    Running         2
  Ubuntu          Stopped         1

image

I was running gdb at the time.

how about a workaround? this just started happening for me (with 18965) and it has effectively killed my WSL 2 environment

Depending on your usage, the nocache utility can be used as a workaround to greatly reduce WSL 2 memory consumption, especially if whatever you're running inside it does a lot of filesystem operations...

does nocache follow children? I could start my shell with "nocache"...

Haven't tried, but if I understood how nocache works correctly, it should...

thanks! nocache has changed my env from "completely broken" to "merely sucks"

Hi, could you explain how nocache should be used?

Hi @benhillis , any update on when a fix for this might be deployed to insiders? My 8gb of ram is really getting hammered by this issue... :(

I wrap compilations in nocache to keep limping along. Wrapping nocache around a shell works, but breaks Emacs. I've also ordered more RAM :-(

Sorry no firm timeline, certainly before WSL2 is out of Insider builds though.

Hi @benhillis , do you have a workaround you could recommend though ?

Well...I need it all day long and the memory consumption goes crazy after only a few minutes (using docker and a quite heavy stack: elastic search, lots of workers, some databases etc). So...not an option unfortunately. Maybe go back to linux while it is not fixed 😎

I am facing the same issue. Is there any workaround (except shutting wsl down) for this @benhillis ?

Not currently. For some additional context this change requires some changes to the Linux kernel that are in the process of being upstreamed. We will be integrating these changes into the WSL2 kernel as soon as we can.

Workaround: Create a %UserProfile%\.wslconfig file in Windows and use it to limit memory assigned to WSL2 VM.

Example

[wsl2]
memory=6GB
swap=0
localhostForwarding=true

This will still consume the entire 6 GB regardless of Linux memory usage, but at least it will stop growing beyond that.

Supported settings are documented here.

Hey @apostolos, got here from Google after running into the memory leak issue in my WSL2 instance. I'm on build 18970 and I created a .wslconfig file in my home directory as follows:

[wsl2]
memory=2GB
swap=0
localhostForwarding=true

but it doesn't seem to have any effect on the memory actually allocated by the instance - right now I'm running a compile job and it's at 5.6GB allocated and climbing. Do I have to reboot or do something else to get WSL2 to detect the configuration?

@apostolos @pikajude didn't work for me either. Ditched WSL 2 for now due to that.

@arthurgeron and @pikajude I created a .wslconfig file this morning and it's been working for me. Did you run wsl --shutdown to fully restart WSL?

@joemaller which build are you on? I've run wsl --shutdown many, many times since creating the config file. Also what distro?

@pikajude Oh so many times... I'm running build 18970, with the unlabeled Ubuntu (18.04.3 LTS), clean installed yesterday

Same here :/ I wonder why yours works and mine doesn't.

@pikajude I'm not sure. I haven't done anything special. This is a clean Ubuntu 18.04.3 LTS installation (not upgraded from WSL1). OS Build is 18970.1005

You placed the .wslconfig in your Windows home directory and not in your Linux home, right?

I've now capped it at 8GB, still works fine:

wslconfig

@apostolos I had placed it in my Windows home directory... I will try again later today.

@apostolos Yeah, I'm not very familiar with windows so I typed %UserProfile% into Explorer and it took me to my home directory, so I created the file there at C:\Users\me\.wslconfig. I did edit it with vim inside my WSL instance, so I wonder whether the line endings are different and if that interferes with it.

@pikajude I saved the file from Windows. Vim reports dos line endings:
".wslconfig" [noeol][dos] 4L, 52C

Okay, I fixed this. When I opened the .wslconfig file in Notepad, the encoding was set to UTF-16 LE, which maybe is the default encoding for text files? Anyway, I used File > Save As and picked UTF-8 instead, and the setting now seems to be respected.

Using UTF-8 CRLF didn't work for me. I had to use the encoding DOS CP437 CRLF in VS Code to get it working.
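The encoding reports above suggest that the file needs to be plain single-byte text. As a hedged sketch (not an official tool), the Python below writes a .wslconfig as UTF-8 without a byte order mark and with CRLF line endings; for ASCII-only content such as these settings, that is byte-for-byte identical to CP437. The function name write_wslconfig is hypothetical, the keys and values are the ones quoted in this thread, and the idea that a BOM or UTF-16 encoding is what made the file unreadable is inferred from the comments above rather than confirmed.

```python
import tempfile
from pathlib import Path

def write_wslconfig(path, settings):
    """Write a [wsl2] section as BOM-free UTF-8 with CRLF line endings."""
    lines = ["[wsl2]"] + [f"{key}={value}" for key, value in settings.items()]
    # The "utf-8" codec (unlike "utf-8-sig") does not emit a byte order mark.
    data = ("\r\n".join(lines) + "\r\n").encode("utf-8")
    Path(path).write_bytes(data)
    return data

# Demonstration against a temporary directory so nothing real is overwritten.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / ".wslconfig"
    written = write_wslconfig(
        demo_path,
        {"memory": "6GB", "swap": "0", "localhostForwarding": "true"},
    )
    print(written)
```

To use it for real, one would run it under Windows Python with Path.home() / ".wslconfig" as the path, then run wsl --shutdown so the VM restarts with the new limits.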

I'm facing the same issue when I build Docker images.
Basically, the more images I build, the more RAM is used. It looks like WSL2 isn't releasing the memory after each build, so the usage keeps climbing.

I would like to note that even though the wslconfig file does prevent my instance from consuming all my system's RAM, it's not actually releasing any unused RAM to the instance itself, which means all of my WSL processes eventually OOM given enough time. I've downgraded to WSL1 to work around this for now, even though the FS performance is nothing to write home about. Anxiously awaiting the fix! :)

On a laptop with 8 GB of RAM, setting the WSL2 VM RAM to 4 GB and increasing the swap size (backed by an SSD) allowed me to perform operations that were failing when I only set a lower WSL2 VM RAM limit.

Example %UserProfile%\.wslconfig file

[wsl2]
memory=4GB
swap=16GB
localhostForwarding=true

Same thing happens in build 18990.

Does it happen with other Hyper-V VMs, or is it a WSL2-specific problem?

A fix is on the way, ETA ~2 weeks.

I ran this in my Ubuntu environment and it seems to have cleared the cache.
You must run sudo su first:
sync; echo 1 > /proc/sys/vm/drop_caches

** Putting it in a cron job as root works great too. It seems to be allowing me to import a large DB into my database via Docker / MySQL, and I have not run out of memory :)

Will the fix come through Windows Update in the Fast Insider ring, or is there another way?

This did not seem to work for me; had to wsl --shutdown instead.

[Build 18995]

It doesn't release, it just clears the cache so the vm can use it again. I have it running on a 1-minute cron job and it works. However, what it does is clear all caches and should not be run in production or where lots of cache is needed.
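For reference, the cache-dropping step discussed above can be sketched in Python. This is only an illustration, assuming it runs as root inside the WSL 2 distro; the function name drop_caches and its path parameter are hypothetical (the parameter exists so the logic can be exercised against an ordinary file). Per the Linux kernel documentation for /proc/sys/vm/drop_caches, writing 1 frees the page cache, 2 frees dentries and inodes, and 3 frees both; only clean pages are dropped, which is why a sync comes first.

```python
import os

DROP_CACHES = "/proc/sys/vm/drop_caches"

def drop_caches(level=3, path=DROP_CACHES):
    """Flush dirty pages, then ask the kernel to drop clean caches."""
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    os.sync()  # write dirty pages to disk so they become droppable
    with open(path, "w") as f:
        f.write(f"{level}\n")

if __name__ == "__main__":
    try:
        drop_caches(1)
        print("page cache dropped")
    except OSError as exc:
        # Typically a permission error when not running as root.
        print(f"could not drop caches: {exc}")
```

As noted in the comment above, this frees cache for reuse inside the VM and costs performance afterwards, so it is a stopgap rather than a fix.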

@pcottrell good to know -- this might explain why when building docker containers I get weird allocation issues from the underlying VM when trying to build the container.

@benhillis , any updated ETA for the fix?

does not appear to be a part of the 19002 insiders build

I've been running into this more and more, where all memory is consumed by the VM instance: this causes programs running in Windows to crash because they can't allocate more memory. Then, when I attempt to reboot, the computer hangs and needs to be forced off using the power button.

I think the reboot issue is separate as I've been running into that even when I don't use WSL.

Ah, good to know.

The reboot issue got sorted in an update yesterday. Available in the fast ring.

Sounds great, I'll upgrade accordingly.

@cmeiklejohn However, if you have an AMD GPU, I strongly recommend NOT upgrading to this update (Windows Version 19002.1002). In this version, AMD GPU performance is pretty bad. It seems to be caused by a DirectX driver issue.
Related discussion: https://amp.reddit.com/r/windowsinsiders/comments/dk3mo1/windows_insider_build_19002_bad_performance_with/

I'm on Surface Book 2 13", so it's the NVIDIA GPU that just got fixed. :)

I just bumped to 19002 and I've still got the reboot issue -- is there an issue somewhere on GitHub tracking this issue?

@cmeiklejohn please note that it got stuck at reboot for me one last time, once the update had been downloaded and was about to be installed. I then force rebooted and attempted the installation again. That time it actually got installed, and it has worked from then on.

It seems like this is diverging from the original issue.

@benhillis Can we get an update on the fix for the VM memory issue? ETA was 2 weeks almost 3 weeks ago.

@jimjenkins5 - there are a lot of variables of when a given fix will make an Insider build. I would expect this change in the next week or so.

I updated to the latest "fast" build, and the issue still exists.

My computer is useless while converting an existing Ubuntu install to WSL 2. Is there any workaround to release or clear memory without interrupting the conversion?

@dariothornhill, the best workaround that is working for me is to create a wslconfig file (%UserProfile%\.wslconfig) limiting the amount of RAM of the wsl2 vm and using additional swap space for RAM demanding operations.

[wsl2]
memory=4GB
swap=16GB
localhostForwarding=true

It runs really smoothly on a laptop with SSD storage for the swap.

For the changes to take effect, you have to restart the WSL 2 VM by running wsl --shutdown from PowerShell or cmd.

@luciorq I've tried this, but these settings do not seem to be restricting the size of the VM for me. I have the same file with the same contents, but I still see that process using up all available memory on the machine.

@cmeiklejohn, I'm not sure if it makes any difference, but I created the file in VSCode and using UTF-8 encoding and CRLF line endings.

Also restarted the machine and the WSL 2 VM.

Alternative to nocache : memcached. This should be used anyways for good practice.

Same problem on 19008.1000. RAM usage surges when you compile or open VS Code, and it comes back very slowly after you close VS Code and so on. In my test, RAM went from 280 MB to 1280 MB during a compile and dropped to 1032 MB afterwards, but never got back to 280 MB. Fortunately, the .wslconfig is really useful.

How can I fix this?

Read above: they said they have fixed this error; the fix only needs to ship in the next Windows update (possibly).

Okay, thank you

Thanks a lot, it really saved my 8GB ram laptop. I set ram limit to 512MB, and use 8GB swap as below:

[wsl2]
memory=512MB
swap=8GB
localhostForwarding=true

Before this change, RAM usage always spiked to 1.5 GB+ after I ran Docker. Now RAM usage is limited to 512 MB, as Task Explorer shows below, and you can also see inside the VM that WSL2 starts to use swap once memory use exceeds 500 MB.

image
image

Weird, the %userprofile%\.wslconfig still doesn't work for me. I verified the right path and the right line endings. Do you have a blank line at the end of the file, perhaps?

@cmeiklejohn - Can you elaborate? what settings are you using?
It just worked for me as is.

@cmeiklejohn - Correct, 19013 has the fix for this.

My Windows 10 Build 19013.vb_release.191025-1609 is still bothered by this issue. WSL2 Ubuntu 18.04 LTS.

ver

I'm using VS Code to remotely edit code in WSL2. Vmmem consumes a large amount of memory and does not release it.

mem

I close VSCode without executing wsl --shutdown. Vmmem does not release the memory.

mem2

Only after wsl --shutdown, the memory is given back to Windows.

Run free -h. Most likely the cache is occupying RAM; free the cache and see whether the memory returns to Windows (before, this did not work).

free -h shows that buff/cache is not released.

As a test, I copied a large folder to another location. The copy operation caused buff/cache to increase a lot. When the copy finished, buff/cache was not released, and neither was Vmmem's memory. See the following picture.

mem_free

You may need a command like sync; echo 3 > /proc/sys/vm/drop_caches to purge the Linux cache.

OK. It works.

Can WSL2 automatically drop some cache when the Windows host has pressure on memory?

I want to report another behavior related to VSCode. I use VS Code to remotely do code development. I'm not so sure if this issue should be reported here or not.

In Windows 10 Build 19013.vb_release.191025-1609, with VS Code version 1.39.2 and the Remote-WSL plugin 0.39.9, doing a search in VS Code causes Vmmem to consume a large amount of memory, and I need to manually release the cache with sudo sh -c "/bin/echo 3 > /proc/sys/vm/drop_caches". This behavior is really annoying. Why does searching in VS Code have this effect?

Or should I report this to VS Code team?

vscode

Linux uses RAM as a cache when writing to disk in order to gain speed. This is normal kernel behavior, and yes, you can control it. Research Linux and the ways to control the cache. The comments above discuss how to mitigate this behavior automatically.

Regarding whether WSL2 can automatically drop some cache when the Windows host is under memory pressure: I feel like it does, but I'm not sure. Over a short period such as 1 minute, I cannot see any decrease, but it seems to release the RAM after some time.

We've released a blog post to help go over the details of this feature: Memory Reclaim in the Windows Subsystem for Linux 2. It should help clear up how memory becomes freed!

I created a .wslconfig file in %UserProfile% but it doesn't take effect. I tested it by setting the following:

[wsl2]
swap=0
processors=3

but when I do 'top' it still shows values from before this change. What am I missing?

Great, now the issue is just about buff/cache not being released.

It's an issue that is being a bit overlooked. Linux really does not have the habit of freeing memory from its cache/buffers and likes to hold everything in memory forever (only pushing out things that have not been accessed in a long time, to make space for new things to cache). So it's a valid point that is unfortunately somewhat ignored.

I think the expectation of the MS team is: you are done with your task, you close bash, and it automatically frees the memory. But if you have VS Code open and you put your PC to sleep every day (more productive than reopening everything every day), it never gets freed. Thus it keeps growing as you compile more code or do other file operations, and you are forced to put memory limits on those WSL instances.

By default, WSL2 needs to have some kind of limit, or force the Linux kernel to be more aggressive in reclaiming the cache. For example: reclaim everything in the cache that has not been used for one hour (a setting you could also override per WSL2 instance).

/Edit:

And it becomes an issue fast when you work with a lot of microservices, have a lot of WSL2 instances open, or like to keep your data cleanly separated per WSL2/VM instance.

@Indribell - we are looking at improving the behavior regarding Linux's cached memory in the future.

Wouldn't adding a line like the following via sudo crontab -e have the desired effect?

0 * * * * echo 3 > /proc/sys/vm/drop_caches

I don't know the implications for the machine's performance, but since it's not meant to be used in a production environment, I think clearing the memory cache every hour seems like a nice way to work around the issue.

@ErvalhouS - The performance impacts are substantial. When you drop the page cache you're going to be reading from disk so it's not something we want to do without being sure you're done with the files that are cached.
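Given that cost, one possible compromise (an assumption for illustration, not something proposed by the WSL team) is to drop caches only when the cache has grown past a threshold rather than on a fixed schedule. The sketch below parses /proc/meminfo-style text and reports whether Cached exceeds a chosen fraction of MemTotal. The function names and the 50% default threshold are hypothetical, and the sample values are borrowed from the vmstat output earlier in the thread purely as illustrative numbers.

```python
def parse_meminfo(text):
    """Return {field: KiB} for lines such as 'Cached:  7632340 kB'."""
    info = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            info[name.strip()] = int(parts[0])
    return info

def cache_exceeds(info, fraction=0.5):
    """True when the page cache is larger than `fraction` of total RAM."""
    return info["Cached"] > fraction * info["MemTotal"]

# Illustrative sample in /proc/meminfo format (not captured from a real system).
SAMPLE = (
    "MemTotal:       15235516 kB\n"
    "MemFree:         6606548 kB\n"
    "Cached:          7632340 kB\n"
)

info = parse_meminfo(SAMPLE)
print("over 50% of RAM:", cache_exceeds(info))
print("over 75% of RAM:", cache_exceeds(info, 0.75))
```

On a real system one would pass the contents of /proc/meminfo instead of the sample and run the sync; echo 3 > /proc/sys/vm/drop_caches step as root only when the check returns true.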

@benhillis et al., I'm a relative noob to the inner workings of the Linux kernel, but might one way of addressing this be a whitelist of processes whose memory effects can be cleared after they are done? For example, I unzip a .sql.gz file in my WSL instance, and now it permanently has an extra gig of memory tied up in the instance. Would there be an easy way to release those unzipped files from the cache?

This might be closer to a Docker issue, but I have been using the edge Docker for Windows with WSL 2. In my case, buff/cache can take 12 GB during the build process, and once everything is running it is using 20 GB before dropping the cache.
Whereas running purely in the old Hyper-V version, I had no issues leaving the environment at 12 GB max.

Can't run the command "sudo sync; echo 3 > /proc/sys/vm/drop_caches". It says permission denied

1- sudo su
2- sync; echo 3 > /proc/sys/vm/drop_caches

So I'm on build 19025 and I still get this problem. Does this still need to be fixed, or is there something I'm doing wrong?

Same, came on here because despite having a single docker container running with WSL 2 backend and one VM with two bash prompts vmmem was using 10+ gigs of RAM. Build 19013 (slow ring).

There are two gotchas with "sudo sync; echo 3 > /proc/sys/vm/drop_caches". First, sudo has no effect on echo, because the ; makes echo a separate statement from sudo sync. Second, it still does not work if you change ; to &&, because the > redirection is performed by your own non-root shell rather than by sudo. echo 3 | sudo tee /proc/sys/vm/drop_caches works.

I'm on build 19033 and seeing the same issue with memory consumption. The Vmmem process is currently using 31GB of memory. I'm exclusively running VSCode on WSL2, no containers or the like.

@adam-stamand workaround from #4166 (comment) still works like a charm.

It worked on Microsoft Windows [Version 10.0.19037.1]

EDIT: This is unrelated to memory compaction. The issue was resolved with #4737. Leaving the original below for posterity

Original Post:

Opening a new Ubuntu tab in Windows Terminal (preview) takes an eternity (~40s) for the prompt to appear on [Version 10.0.19037.1]. I ran dmesg and the last thing to occur in the kernel is:

image

Is it possible that long-running memory compaction is leading to excruciatingly long load times?

Things that I don't think are related to the issue:

  • My hardware: NVMe SSD, Intel 8700k, and 32GB of RAM
  • My .bashrc: it sources instantly.

The interesting thing is that it used to appear instantly, and then suddenly slowed down. The last thing I remember doing before this started was installing gcc and running make to install VIM 8.1.
I have disabled fast boot and rebooted quite a few times to no avail.

long_loads_with_overlay