apnadkarni / iocp

Implements Tcl channels based on Windows I/O completion ports.

Home Page: https://iocp.magicsplat.com

Very rare loss of sync when receiving data on iocp::inet::socket

Kazmirchuk opened this issue

commented

Hello Ashok,
First of all, many thanks for your library - it literally saved my project. I find it a shame that your implementation of IOCP-based sockets is still not merged into core Tcl. However, I've run into a difficult-to-reproduce problem, and maybe you'll be able to help me.

My setup is a scientific instrument sending telemetry at a fairly high rate over 3 TCP connections (~650 Mbit/s in total - standard Tcl sockets on Windows can't sustain such a load).
It is received by 3 client processes (one process per connection), all running in a pure tclsh on a Windows 10 Pro server (2x Xeon CPUs, hyperthreading off). The bitrate per connection is between 180 and 260 Mbit/s and doesn't change over time.

The traffic contains packets that start with a standard PCAP header followed by a payload of 5-10 KB each.
The incl_len field in the PCAP header tells me the size of the payload. I extract it using a simple

binary scan $pcapHdr @8i incl_len
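
(For reference, the standard libpcap per-record header is four 32-bit little-endian fields, ts_sec, ts_usec, incl_len and orig_len, so a full decode would look roughly like the sketch below; the @8i above just jumps straight to incl_len at byte offset 8.)

# decode the four standard libpcap record-header fields (little-endian)
binary scan $pcapHdr iiii ts_sec ts_usec incl_len orig_len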

Then the payload itself is simply written to an SSD using [chan copy].

The problem is that sometimes this incl_len field is extracted completely wrong (a negative value). In such a case I log the whole PCAP header, and it doesn't look like a PCAP header. Instead it looks like data from somewhere in the middle of my payload, every time with the same offset. For now, it's dummy payload that looks like 0x0A0B0C0D etc.

So, it seems like part of a packet gets duplicated or truncated, and so I lose sync on the incoming traffic and can't read the next PCAP header. It happens quite rarely - maybe once or twice over an 8-hour session.

There's a similar application written in C# that doesn't manifest this bug, so this rules out problems at the hardware/Windows level.

This is how I configure the socket:

chan configure $wzlSock -translation binary -buffersize 1000000 -keepalive 1
chan event $wzlSock readable [list WIZIF::readSocket $wzlSock]

proc readSocket reads one packet at a time. The socket is in blocking mode. While troubleshooting this, I've added extra checks that every [read] returns as many bytes as requested (or throws an error in case of EOF, which shuts down the whole process).
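
Roughly, the check amounts to a small helper like this (just a sketch of what I described; the proc name is illustrative):

proc readExactly {chan n} {
    # read exactly n bytes from a blocking channel or raise an error
    set data [read $chan $n]
    if {[string length $data] != $n} {
        error "short read or EOF: wanted $n bytes, got [string length $data]"
    }
    return $data
}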

sorcvbuf and sosndbuf are left at the default 65K. Is it worth increasing them to 1 MB too?
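
(For completeness, this is what I mean, assuming the package exposes them as the -sorcvbuf/-sosndbuf channel options the names above refer to:)

# bump the OS-level socket buffers to match -buffersize (assumed option names)
chan configure $wzlSock -sorcvbuf 1000000 -sosndbuf 1000000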

If my description rings any bells for you, I'd like to hear about it.

Thanks in advance,
Petro

Sorry, Petro, nothing similar that I've encountered comes to mind.

If I understand correctly, the sending end is not Tcl, only the client side, right?

Can you post the readSocket proc? That may help me target a specific code path to examine for bugs.

Out of curiosity, if you are using event-driven I/O, why is the socket kept in blocking mode?

Also, could you confirm you still see the issue with Tcl 8.6.13? It fixed two bugs in channel buffering. I don't think they would impact your case (binary), but it would nevertheless be good to confirm you still see the issue in 8.6.13.

If the code is open source, you can just send me a link to look at it.

/Ashok

commented

Thanks a lot for the pointer to 8.6.13! Good to hear there have been some fixes in this area 👍 I saw the announcement on comp.lang.tcl, but didn't look much through the bug fixes. We're using 8.6.12 at the moment.

The sending side is some proprietary SW (not Tcl) running on Windows 10 x64. The LAN is 10GbE using Intel adapters. All this, including the server running our Tcl application, is our partner's HW that I've only accessed remotely. We've received a C#-based simulator of the sending side, and we tried running it in-house with the Tcl application, but the bug doesn't occur in that setup.

I've tried increasing sorcvbuf and sosndbuf to 1 MB - it didn't make any difference.

if you are using event driven i/o, why is the socket kept in blocking mode?

I thought it would be simpler to implement readSocket with blocking [read] and [chan copy]: just read one packet at a time and rely on Tcl to invoke readSocket again (~4000 pkts/s). Basically something like:

proc readSocket {sock} {
    set pcapHdr [read $sock 44]
    binary scan $pcapHdr @8i incl_len
    if {$incl_len < 24 || $incl_len > 262144} {
        error "Invalid incl_len"
    }
    set pktHeader [read $sock 20]  ;# CCSDS packet header: extract some values and insert into Postgres; done in a separate thread
    ...
    chan copy $sock $::binFile -size $pktPayloadLength ;# copy packet payload straight into a file
}

After a bit more debugging I noticed that when the bug occurs, the very first [read] 'skips' the first 256 bytes received from TCP, so I land in the middle of the packet payload. Interestingly, it's always 256 bytes.

In hindsight I realize that letting the Tcl event loop call readSocket 4000 times/s probably wasn't the best idea :-) Maybe I should have another proc on top that would use [chan pending] and then call readSocket in a loop until all available packets are read (in practice they have the same size), as in the sketch below. Or switch to a non-blocking socket... [chan copy] would be inconvenient in that case, but I can replace it with plain [read].
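
Something like this is what I have in mind (only a sketch, assuming fixed-size packets so that a [chan pending] result of at least one packet's worth of bytes means readSocket can complete without blocking; dispatchPackets is an illustrative name):

proc dispatchPackets {sock pktSize} {
    # the readable event fired, so at least one packet should be arriving
    WIZIF::readSocket $sock
    # then drain any further complete packets already buffered by Tcl
    while {[chan pending input $sock] >= $pktSize} {
        WIZIF::readSocket $sock
    }
}
chan event $sock readable [list dispatchPackets $sock $pktSize]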

Unfortunately, the HW is now being packaged for delivery to a customer, and they seem to accept it as-is for now, so I won't be able to experiment further until we start working on a 2nd delivery in a few months. I will let you know.

Before calling chan copy, are you turning off the fileevent handler (in the ... section of readSocket)? I would be a little uncomfortable with using chan copy while an event handler was registered. It is possible there would be a race condition between the event handler firing and chan copy's internal loop. But I don't know for sure.
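
A minimal sketch of what I mean, reusing the names from your snippet (suspend the handler, do the blocking copy, restore the handler):

chan event $sock readable {}
chan copy $sock $::binFile -size $pktPayloadLength
chan event $sock readable [list WIZIF::readSocket $sock]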

/Ashok

commented

I don't. The [chan copy] docs mention turning off fileevent handlers only when doing a background copy. I believed that it wasn't necessary when doing a blocking copy, like here. Anyway, while troubleshooting this I did try replacing [chan copy] with plain [read]+[puts], and it made no difference, neither in performance nor with the bug. BTW, I guess for packets of 5-10 KB, using [chan copy] is probably more about showing off my Tcl skills than making a real difference. I guess it becomes worthwhile starting from... 1 MB?
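
(The replacement was essentially a one-liner along these lines, assuming both channels are already in binary mode:)

puts -nonewline $::binFile [read $sock $pktPayloadLength]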

Also, do you know if [chan pending] is a cheap operation, or should it be cached in a variable? E.g. if in a fileevent handler I want to read from a socket until there's no more than 10 KB left:

while {[chan pending input $chan] > 10000} {
    set data [read $chan $numBytes]
    # process $data
}

Is this an OK implementation for a high-performance loop, or is it better to call [chan pending] once before the loop?

E.g. recently I accidentally discovered that getting all socket options with [chan configure $chan] is in fact very expensive, and the other side disconnected me as a slow client :-)

On a more general note, do you know of a reasonably modern Tcl open-source project that is a good example of properly combining TCP sockets and coroutines, with full error handling and testing? (in addition to your book, of course!) A couple of years ago, while working on my NATS client, I tried looking around and couldn't find much, so I had to learn many things through trial and error. E.g. the fact that [socket -async] can throw an error when given an invalid host name, because DNS resolution is done synchronously anyway, was quite a surprise :D
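
For instance, something like this can fail immediately rather than asynchronously (a minimal sketch; host, port and onConnected are just illustrative names):

if {[catch {socket -async $host $port} sock]} {
    # e.g. the host name did not resolve; handle the error here,
    # not only in the writable handler
    puts stderr "connect failed: $sock"
    return
}
chan event $sock writable [list onConnected $sock]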

commented

Oh, and another completely unrelated question :-) Do you plan to publish a 2nd edition of your Tcl book in the near future? I'm going to buy it and thought maybe it's worth waiting a bit for an update.

I don't want to say never but no plans currently for a second edition. The thought of proofing again is too daunting :-)

Having said that, purchasing the PDF version (from gumroad) will also allow access to future editions.

Regarding your other questions, fconfigure on sockets can be expensive because of the reverse DNS lookup on the remote address.
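
Asking for just the option you need, rather than dumping them all, avoids that lookup. For example:

# query a single option instead of the full list (which includes -peername)
set bufSize [chan configure $chan -buffersize]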

Sorry, I don't know the answer to your other questions regarding sockets (chan pending etc.). Probably best to measure and see.
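
For example, Tcl's built-in [time] gives a rough per-call cost (a sketch, assuming $chan is an already open socket):

# prints the average number of microseconds per [chan pending] call
puts [time {chan pending input $chan} 10000]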

/Ashok

Forgot to mention. I had another look through the code paths on the receive side but didn't see anything that would explain the sporadic data corruption. That does not mean it doesn't exist, of course. I'll set up a long-term sink test and see if I can spot it. I can't produce data at the rates you are seeing, though.

/Ashok

commented

Don't worry too much. We've suggested a workaround to our customer (reading the missed packets later from an archive), and they are fine with it. I couldn't reproduce the problem in-house, so it could be limited to the specific HW setup. The next time I get a chance to look at it will be in a few months, so I suppose it doesn't make sense to keep the issue open. If I have any news, I'll comment here. Thank you for all the replies and for your immense contribution to the Tcl ecosystem!

commented

After updating Tcl from 8.6.10 to 8.6.13, the loss of sync doesn't occur anymore, and we have a stable throughput of 650 Mbit/s (as received from our instrument) with very little CPU and memory consumption.

I've been working with Tcl for almost 10 years, and discovering that the standard Tcl sockets on Windows are still based on the slow API from the late 90s came as quite a shock. I've just tried googling "tcl socket performance on windows" and no results point to your package. That's why I think it is highly important to integrate iocp_inet into the core for Tcl 9.

It's interesting that the sync issues went away with 8.6.13. Thanks for letting me know.

Yes, I'm aware of the need for better networking performance in the Tcl core. But Tcl 9 is in such a state of flux that I'm reluctant to add one more risk factor right now, and I do not have the time either. But eventually, once things settle down.

/Ashok