microsoft / WSL

Issues found on WSL

Home Page: https://docs.microsoft.com/windows/wsl

Output from WSL command pipeline is randomly truncated

petsuter opened this issue

Environment

Windows build number: 10.0.17134.765

Steps to reproduce

  • Open cmd.exe and enter the following commands:
mkdir test && cd test
FOR /L %i IN (100,1,999) DO echo %i >> data.txt
FOR /L %i IN (1,1,100) DO copy data.txt data%i.txt
wsl cat * | wsl sort | wsl uniq

Repeat the last line multiple times to reproduce the problem; the behavior differs randomly between runs.
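For anyone who wants to recreate the same data set without the cmd.exe FOR loops, here is a minimal Python sketch. The function name `make_test_data` and its parameters are hypothetical, and the trailing space plus CRLF per line is an assumption about what `echo %i >> data.txt` writes in cmd.exe.

```python
import shutil
from pathlib import Path


def make_test_data(directory, first=100, last=999, copies=100):
    """Create data.txt plus data1.txt..data<copies>.txt and return the file names."""
    root = Path(directory)
    root.mkdir(parents=True, exist_ok=True)
    source = root / "data.txt"
    # Assumption: cmd.exe's `echo %i >> data.txt` emits "<number><space>CRLF".
    # newline="" stops Python from translating the explicit "\r\n".
    with open(source, "w", newline="", encoding="ascii") as handle:
        for value in range(first, last + 1):
            handle.write(f"{value} \r\n")
    # Equivalent of: FOR /L %i IN (1,1,100) DO copy data.txt data%i.txt
    for index in range(1, copies + 1):
        shutil.copyfile(source, root / f"data{index}.txt")
    return sorted(path.name for path in root.iterdir())
```

Calling `make_test_data("test")` and then running the pipeline from inside that directory should correspond to the steps above.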

Expected behavior

After the last command the terminal should display all integers from 100 to 999.

Actual behavior

After the last command the terminal displays only the integers from 100 to some random integer smaller than 999 (e.g. 978). Sometimes the last integer is not shown completely (e.g. only "97" instead of "978").
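Eyeballing 900 lines for a missing tail is error-prone, so a small checker can help when comparing runs. This is only a sketch: `find_truncation` is a hypothetical helper, and it assumes the pipeline output has already been captured as a string.

```python
def find_truncation(output, first=100, last=999):
    """Return the first expected integer missing from output, or None if all are present."""
    # strip() tolerates trailing spaces and stray carriage returns on each line.
    seen = {line.strip() for line in output.splitlines()}
    for value in range(first, last + 1):
        if str(value) not in seen:
            return value
    return None
```

For the truncated example above, where the output ends in "97", the function would report 978 as the first missing value.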


Is the output truncated? Is one of the programs in the pipeline aborted early? It might be a timing issue. Repeating the last command often results in the same "random" number, but sometimes it stops later or earlier.

The problem also occurs with different counts (e.g. 100 to 400 instead of 100 to 999).

Using slightly different commands the problem does not occur, e.g. this works correctly:

wsl
cat * | sort | uniq

In issue microsoft/terminal#1078 it was suggested this is a bug in WSL.

Ref #3716 on the off chance. The issue title over there is misleading (has nothing to do with the clipboard). That one also looks like the problem was limited to the single character case, which doesn't match this scenario. But it does have the nondeterministic factor going for it, shrug.

wsl cat * | sort | uniq

That won't run at all unless you've got a MSYS/Cygwin uniq.exe in your Windows path. The problem does seem limited to more than one cmd.exe pipe tho, at least on a glance. This works:

C:\Data> wsl.exe cat *.txt | wsl.exe /bin/bash -c "/usr/bin/sort | /usr/bin/uniq"

By

wsl
cat * | sort | uniq

I mean running wsl to start an interactive WSL shell and then running cat * | sort | uniq there. So no MSYS/Cygwin/uniq.exe is involved.

I think I experienced the same problem. I was using the cpp inside WSL because that's hack of a lot easier than doing it with Visual Studio.

WSL let me down on this line of code:

bash -c "cpp ./input-file" | Out-File tmp

But I also experienced the problem with these commands:

bash -c "cpp ./input-file" > tmp
wsl cpp ./input-file > tmp

Sometimes the output file would just stop in the middle, breaking the whole pipeline.

Alright; appreciate the me2 that helps. Note this issue is most likely chirping crickets on the "old" (in WSL terms, 17134) Win10 build cited in the OP. If people are seeing Windows->WSL pipe problems on 1903 aka 18362 aka 19H1 aka "May 2019 update" (or later) probably wanna speak up. Use wsl.exe since bash.exe is long deprecated.

Your wsl.exe cpp ./input-file > tmp scenario ought work, and if it doesn't that's a bug. That said, doing such a thing is nearly always going to be the wrong approach. If you are calling Linux /usr/bin/cpp, call it from inside WSL (ie from /bin/bash) and operate on /mnt/c/Users/you files (if you must). Not the other way around. Read: Just because it ought work doesn't make it a great idea. With caveat "Free Country" natch.

I'm on 18362, so I guess it should work. I also changed my script to use wsl - didn't take note of that change, thanks for the hint.

But could you re-iterate on why my usage of wsl is subpar? Should my script do this: wsl /bin/bash -c "cpp ./input" > tmp? And if so, is there any reference to give me that explains this in more detail?

I can confirm the original problem still occurs on Win10 1903 aka 18362 aka 19H1 aka "May 2019 update".

But could you re-iterate on why my usage of wsl is subpar?

Because you are crossing a separation of concerns boundary for no reason. You have no idea if that ./input file's character encoding is in UTF16 (endianess in BOM because Windows). You invited differing EOL convention to the party. All "because that's hack of a lot easier", which it is not, on multiple objective levels.
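The encoding and EOL concerns can be checked before a file crosses the boundary. The sketch below is an illustration only: `describe_text_bytes` is a hypothetical helper, and it looks just at byte-order marks and line endings (it does not detect BOM-less UTF-16 or UTF-32).

```python
import codecs


def describe_text_bytes(data):
    """Report which BOM (if any) starts data and how many CRLF vs bare LF endings it has."""
    if data.startswith(codecs.BOM_UTF16_LE):
        bom = "utf-16-le"
    elif data.startswith(codecs.BOM_UTF16_BE):
        bom = "utf-16-be"
    elif data.startswith(codecs.BOM_UTF8):
        bom = "utf-8-sig"
    else:
        bom = None
    # "utf-16" consumes the BOM and picks the byte order from it;
    # "utf-8-sig" accepts UTF-8 with or without a BOM.
    codec = "utf-16" if bom in ("utf-16-le", "utf-16-be") else "utf-8-sig"
    text = data.decode(codec, errors="replace")
    crlf = text.count("\r\n")
    lf = text.count("\n") - crlf  # newlines not preceded by a carriage return
    return {"bom": bom, "crlf": crlf, "lf": lf}
```

A result with a UTF-16 BOM or a non-zero `crlf` count suggests the file was written by a Windows tool and may surprise Linux programs reading it.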

The guidance isn't for your use case specifically. Doing a wsl cat * | wsl sort | wsl uniq is a fraught path to the imagined end result desired. Rule being, if you aren't looking for a time-sink: keep Windows stuff in Windows and WSL stuff in WSL, and cross the boundary when you have to.

wsl /bin/bash -c "cpp ./input" > tmp

Maybe:

C:\> wsl /bin/bash -c "/usr/bin/cpp /mnt/c/path/to/input > /mnt/c/path/to/tmp"

But mostly, from a "Native Tools Command Prompt for VS 2019"...

C:\>cl.exe /E input > output

Which is demonstrably a hack of a lot easier. Or if MSVS isn't your soup, install Windows clang.

Also reproducing on 1903 18362.267

I have the issue with

  • wsl cat * | wsl sort | wsl uniq
  • wsl cat data.txt | node -e "process.stdin.pipe(process.stdout)"
    I wanted to make sure it's not related to wsl's input. This one actually doesn't print anything for me.
  • wsl git blame some-file | node -e "process.stdin.pipe(process.stdout)"
    output is truncated
  • Calling wsl git blame ... as child process and reading output. Output is truncated the same way

I was trying to implement a windows git.exe that runs the wsl git ... and translate the linux paths to windows paths in the output. I got issues when output is big (git log, git blame, ...)
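The path-translation half of such a wrapper might look roughly like the sketch below. It is not the commenter's code: `to_windows_paths` is a hypothetical name, it assumes the default `/mnt/<drive>` mount point, and it does not handle paths containing spaces.

```python
import re

# Match /mnt/<drive> optionally followed by a path; stop at whitespace or ':'
# so that "file.txt:12" style suffixes in tool output are left intact.
_MNT_PATH = re.compile(r"(?<![\w./-])/mnt/([a-z])(/[^\s:]*)?(?=[\s:]|$)")


def to_windows_paths(text):
    """Rewrite /mnt/c/... style paths in text to C:\\... style paths."""
    def replace(match):
        drive = match.group(1).upper()
        rest = (match.group(2) or "/").replace("/", "\\")
        return f"{drive}:{rest}"

    return _MNT_PATH.sub(replace, text)
```

The translation itself is independent of the truncation problem; the wrapper would still receive incomplete input whenever the pipe drops data.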

I am observing a similar issue on 18362.535 (also with git output, but that doesn't seem to be related).
It worked OK on WSL2 when I checked.

This seems to be triggered by "fast writer slow reader", like in #610.
This python script demonstrates the issue (can be reproduced using similar code in Java):
https://gist.github.com/AMPivovarov/30a5222c82344f6742c88ce078b11643

Expected output (Linux / when executed inside WSL / WSL2):

True 110000
False 110000
True 110000
False 110000
True 110000
False 110000

Actual output (ubuntu.exe, 18362.535):

True 69630
False 110000
True 73700
False 110000
True 73700
False 110000
True 73711
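The linked gist is not reproduced here. As a rough illustration of what a "fast writer, slow reader" test can look like, the sketch below spawns a child that writes lines as fast as it can and counts how many arrive, optionally pausing while reading. Everything in it is an assumption: the names `WRITER` and `count_lines`, the delay values, and the reading of the boolean in the output above as a "slow reader" flag. The child is a plain Python process so the sketch runs anywhere; on an affected machine one would substitute a wsl.exe command.

```python
import subprocess
import sys
import time

# Source of the stand-in child process: write n short lines without pausing.
WRITER = "import sys\nfor i in range({n}): sys.stdout.write('line %d\\n' % i)"


def count_lines(command, slow_reader, delay=0.0005, pause_every=1000):
    """Run command, read its stdout line by line, and return how many lines arrived."""
    received = 0
    with subprocess.Popen(command, stdout=subprocess.PIPE) as child:
        for _ in child.stdout:
            received += 1
            if slow_reader and received % pause_every == 0:
                time.sleep(delay)  # deliberately fall behind the writer
    return received


if __name__ == "__main__":
    total = 110000
    command = [sys.executable, "-c", WRITER.format(n=total)]
    for slow in (True, False, True, False):
        print(slow, count_lines(command, slow))
```

When the pipe is intact, every count should equal `total` regardless of the reader's speed; a smaller count on the slow-reader runs would match the pattern reported above.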

I can't repro the OP test scenario on 19551, WSL1+WSL2. There has been some water under the bridge since 18362. If I can get a second from someone on Insiders I'll close "mysterious reasons".

I can still reproduce the issue on Windows 10 Pro Insider Preview Build 19569.rs_prerelease.200214-1419.

To reproduce, just run the Python script outlined by @AMPivovarov using WSL v1 and WSL v2.

Result for WSL v1:

True 77913
False 110000
True 50193
False 110000
True 77836
False 110000
True 79365
False 110000
True 65527
False 110000

Result for WSL v2:

True 110000
False 110000
True 110000
False 110000
True 110000
False 110000
True 110000
False 110000
True 110000
False 110000

Looks to still be an issue in 2022

Edition       Windows 10 Enterprise
Version       21H1
Installed on  2021-12-16
OS build      19043.1706
Experience    Windows Feature Experience Pack 120.2212.4170.0

Just reproduced the same on WSL2. Proof:
[image]

One of the invocations of uname -sm was just truncated (and it was repeatable about every 8–10 times).

Proof that I have WSL2:
[image]

Windows 11
Version 21H2 (OS Build 22000.1219)

The problem is gone after wsl --shutdown and starting the instance again.

I have a feeling that WSL2 is much better but still not bulletproof w.r.t. this problem :(

This issue has been automatically closed since it has not had any activity for the past year. If you're still experiencing this issue please re-file this as a new issue or feature request.

Thank you!