Azure / azure-relay-dotnet

☁️ .NET Standard client library for Azure Relay Hybrid Connections

Home Page:https://docs.microsoft.com/en-us/azure/service-bus-relay/relay-what-is-it

Send > 1K data, client onmessage gets called several times

danny8002 opened this issue · comments

Actual Behavior

  1. The server sends (1024 + 1) = 1025 bytes of data and the client receives 2 onmessage calls: the first call gets 1024 bytes and the second call gets 1 byte.

Expected Behavior

  1. The server sends (1024 + 1) = 1025 bytes of data and the client receives 1 onmessage call containing the full data.

See Azure/azure-relay-node#13.

After digging into the code, the bug appears to be at line 176 of WebSocketStream.cs:

    await this.webSocket.SendAsync(
        new ArraySegment<byte>(buffer, offset, count),
        this.WriteMode == WriteMode.Binary ? WebSocketMessageType.Binary : WebSocketMessageType.Text,
        true,
        linkedCancelSource.Token).ConfigureAwait(false);

From MSDN:

    public override Task SendAsync(
        ArraySegment<byte> buffer,
        WebSocketMessageType messageType,
        bool endOfMessage,
        CancellationToken cancellationToken
    )

From this we can see that endOfMessage is always true. StreamWriter uses a buffer size of 1024, so when we write 1025 bytes of data, SendAsync is called twice, which means the WebSocket sends 2 frames and every frame is marked 'FIN = 1'.

The mitigation is to write the data to HybridConnectionStream directly, e.g.:

    public static async Task SendAsync(HybridConnectionStream stream, string str)
    {
        var bytes = Encoding.UTF8.GetBytes(str);

        await stream.WriteAsync(bytes, 0, bytes.Length);
    }

Are you using a BufferedStream on top of the sending HybridConnectionStream?

HybridConnectionStream is built on top of .NET's WebSocket support, which doesn't guarantee that sending 16 KB with endOfMessage = true through ClientWebSocket will result in one read on the other end with exactly that many bytes. Instead, the data could be split into multiple reads on the other end, with only the last read having endOfMessage = true.

This is the behavior in the .NET Framework, which Relay is built upon:
WebSocket Sender calls ClientWebSocket.SendAsync with 16384 bytes and endOfMessage = true
WebSocket Reader gets a read of ~16300 bytes with endOfMessage = false
WebSocket Reader gets a read of ~84 bytes with endOfMessage = true.
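
For illustration, here is a minimal sketch of how a reader holding a raw System.Net.WebSockets.WebSocket (not a HybridConnectionStream) would typically reassemble such a split message by looping until EndOfMessage is true. The class name, method name, and 4096-byte buffer size are assumptions made for this example; they are not part of Microsoft.Azure.Relay.

```csharp
using System;
using System.IO;
using System.Net.WebSockets;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration only.
public static class WebSocketMessageReader
{
    // Reads partial results until the one flagged EndOfMessage, then returns
    // the concatenated bytes. Returns null if the peer sends a close message.
    public static async Task<byte[]> ReceiveFullMessageAsync(
        WebSocket socket, CancellationToken cancellationToken)
    {
        var buffer = new byte[4096]; // arbitrary chunk size (assumption)
        using (var message = new MemoryStream())
        {
            WebSocketReceiveResult result;
            do
            {
                result = await socket.ReceiveAsync(
                    new ArraySegment<byte>(buffer), cancellationToken).ConfigureAwait(false);

                if (result.MessageType == WebSocketMessageType.Close)
                {
                    return null;
                }

                // Each read may carry only part of the message.
                message.Write(buffer, 0, result.Count);
            }
            while (!result.EndOfMessage);

            return message.ToArray();
        }
    }
}
```

This only works where WebSocketReceiveResult is visible to the caller, which is exactly the information a plain Stream read does not carry.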

The problem is that HybridConnectionStream, being a System.IO.Stream, doesn't currently have any way to tell its caller whether a given read had endOfMessage set to true or false. .NET's Stream class typically doesn't guarantee that the reader gets the bytes in the exact chunk sizes that the writer writes them.

A new feature would need to be added to Microsoft.Azure.Relay.dll to pass this information along to readers (dare I say 'relay' this information?).

As a work-around you could prepend a 32-bit (as an example) integer containing the size when writing your atomic chunks to the HybridConnectionStream.
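
A minimal sketch of the writing side of that work-around, assuming the stream is any System.IO.Stream (HybridConnectionStream is one). The LengthPrefixedWriter class and WriteMessageAsync method are hypothetical names for this example, and the prefix uses BitConverter's native byte order, so the reader has to decode it with BitConverter.ToInt32 on a machine with the same endianness.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

// Hypothetical helper for illustration only; not part of Microsoft.Azure.Relay.
public static class LengthPrefixedWriter
{
    // Writes a 4-byte length prefix followed by the payload.
    public static async Task WriteMessageAsync(Stream stream, byte[] payload)
    {
        // Machine byte order (little-endian on most platforms); the reader
        // must use the matching BitConverter.ToInt32 call.
        byte[] header = BitConverter.GetBytes(payload.Length);

        // Combine prefix and payload so they go out in a single write call.
        byte[] frame = new byte[header.Length + payload.Length];
        Buffer.BlockCopy(header, 0, frame, 0, header.Length);
        Buffer.BlockCopy(payload, 0, frame, header.Length, payload.Length);

        await stream.WriteAsync(frame, 0, frame.Length).ConfigureAwait(false);
    }
}
```

The reader then reads exactly 4 bytes, decodes the size, and keeps reading until it has that many payload bytes, regardless of how the reads are split.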

It's generous to call it a workaround when you have to effectively create your own protocol and manually chunk the message. It seems like a questionable design to create another abstraction (HybridConnectionStream) which completely hides the underlying socket and has different semantics.

In case anyone is having an issue with this API, here is an example of reading a message with a size prefix:


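private async Task RunAsync()
{
    var header = new byte[4];
    while (true)
    {
        // A single ReadAsync may return fewer bytes than requested, even for the 4-byte header.
        if (!await ReadExactAsync(header, header.Length)) return; // stream closed
        var size = BitConverter.ToInt32(header, 0); // assumes the writer used BitConverter.GetBytes
        var message = new byte[size];
        if (!await ReadExactAsync(message, size)) return; // stream closed mid-message
        //process message
    }
}

// Fills the first `count` bytes of `buffer`; returns false if the stream ends first.
private async Task<bool> ReadExactAsync(byte[] buffer, int count)
{
    var read = 0;
    while (read < count)
    {
        var n = await mConnection.ReadAsync(buffer, read, count - read);
        if (n == 0) return false; // 0 bytes means end of stream
        read += n;
    }
    return true;
}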

Would adding bool HybridConnectionStream.EndOfMessage which comes from the inner WebSocketReceiveResult.EndOfMessage be sufficient to know when all frames (chunks) have been received for a given "Message"?

Makes sense to me. It wouldn't break anything and seems like the best way to bridge the two metaphors.

@rossbrower's workaround is the correct solution to meet this requirement with streams.
@danny8002, I am closing this issue, but feel free to ask for it to be reopened if you still have questions.