facebook / mvfst

An implementation of the QUIC transport protocol.

mvfst receive data fail from quic-go

raulftang opened this issue · comments

Hi,

We started a server with quic-go and want to use mvfst as the client. The quic-go server is started as follows:

listener, err := quic.ListenAddr("0.0.0.0:4242", generateTLSConfig(), nil)

func generateTLSConfig() *tls.Config {
	key, err := rsa.GenerateKey(rand.Reader, 1024)
	if err != nil {
		panic(err)
	}
	template := x509.Certificate{SerialNumber: big.NewInt(1)}
	certDER, err := x509.CreateCertificate(rand.Reader, &template, &template, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	keyPEM := pem.EncodeToMemory(&pem.Block{Type: "RSA PRIVATE KEY", Bytes: x509.MarshalPKCS1PrivateKey(key)})
	certPEM := pem.EncodeToMemory(&pem.Block{Type: "CERTIFICATE", Bytes: certDER})

	tlsCert, err := tls.X509KeyPair(certPEM, keyPEM)
	if err != nil {
		panic(err)
	}
	return &tls.Config{
		Certificates: []tls.Certificate{tlsCert},
		NextProtos:   []string{"quic-echo-example"},
	}
}
Then we run the mvfst tperf as follows:

tperf -duration 1 -host 127.0.0.1 -mode client -port 4242 -v=4

However, it always stops with the following error after running for 10+ seconds:
I1230 08:27:59.497284 3295 QuicReadCodec.cpp:274] Client cannot read key phase one packet server=42c78f7f client=
I1230 08:27:59.497326 3295 QuicReadCodec.cpp:274] Client cannot read key phase one packet server=42c78f7f client=
I1230 08:27:59.580900 3295 QuicReadCodec.cpp:274] Client cannot read key phase one packet server=42c78f7f client=
I1230 08:28:00.339814 3295 QuicReadCodec.cpp:274] Client cannot read key phase one packet server=42c78f7f client=
I1230 08:28:00.339854 3295 QuicReadCodec.cpp:274] Client cannot read key phase one packet server=42c78f7f client=
I1230 08:28:02.421409 3295 QuicTransportBase.cpp:2473] lossTimeoutExpired Exceeded max PTO client CID= server CID=42c78f7f peer address=127.0.0.1:4242
I1230 08:28:02.421435 3295 QuicTransportBase.cpp:328] Closing transport due to abandoned connection client CID= server CID=42c78f7f peer address=127.0.0.1:4242
I1230 08:28:02.421447 3295 QuicTransportBase.cpp:2747] Clearing datagram callback
I1230 08:28:02.421454 3295 QuicTransportBase.cpp:2750] Clearing 0 peek callbacks
E1230 08:28:02.421465 3295 tperf.cpp:719] TPerfClient error: Connection abandoned; errStr=Exceeded max PTO

Best regards

By the way, I tried the quic-go and lsquic clients, and both worked well with the same server.

The tperf client does not send a request to the server. It expects the server to start sending data immediately upon accepting the connection (like iperf). The code you shared doesn't show whether your server implementation does that. If your server is expecting a request, you probably want to use hq from the proxygen repo: https://github.com/facebook/proxygen/tree/main/proxygen/httpserver/samples/hq.

In general, you can check the QUIC interop dashboard to see if Mvfst has any issues with other QUIC implementations: https://interop.seemann.io/. This uses hq for testing Mvfst as a server and a client.

If the problem persists, please share a full main.go file to reproduce the issue.

Hi jbeshay,

Yes, the server starts sending data immediately upon accepting the connection. The tperf client works for some time and then stops with the error shown above. I have attached the full main.go.
For tperf, I added the following setting to support version verification, and I use the default transport settings.

quicClient_->setSupportedVersions({QuicVersion::QUIC_DRAFT});

main.go.zip

Thanks

Thanks @raulftang. I am not able to reproduce the behavior you are seeing. I compiled your main.go file and was able to run tperf successfully after I made the following changes to tperf:

  • Set quic version as you mentioned.
  • Use the same ALPN in fizzClientContext as you have for the server (quic-echo-example).

Here is the output I get:
Server:

./main
stream write fail

Client:

./tperf -duration 1 -host 127.0.0.1 -mode client -port 4242
I0105 15:16:38.084794 2176165 tperf.cpp:763] TPerfClient connecting to 127.0.0.1:4242
I0105 15:16:38.087875 2176165 tperf.cpp:673] TPerfClient: onTransportReady
I0105 15:16:39.088536 2176165 tperf.cpp:683] TPerfClient connection end
I0105 15:16:39.088603 2176165 tperf.cpp:604] Received 91557000 bytes in 1 seconds.
I0105 15:16:39.088614 2176165 tperf.cpp:606] Overall throughput: 698.524Mb/s
I0105 15:16:39.088645 2176165 tperf.cpp:610] Average per Stream throughput: 698.524Mb/s over 1 streams
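
For readers checking the numbers, here is a minimal Go sketch of how the throughput figure in this log appears to be derived. It assumes the formula from the tperf timeoutExpired code quoted later in this thread (bytesPerMegabit = 131072); the function name throughputMbps is hypothetical and is not part of mvfst.

```go
package main

import "fmt"

// bytesPerMegabit mirrors the constant in the tperf code quoted in this
// thread: 2^20 bits divided by 8 bits per byte.
const bytesPerMegabit = 131072.0

// throughputMbps (hypothetical helper) converts a byte count received over
// a duration in seconds into Mb/s, the same way the quoted tperf code does.
func throughputMbps(receivedBytes uint64, seconds float64) float64 {
	return float64(receivedBytes) / bytesPerMegabit / seconds
}

func main() {
	// 91557000 bytes in 1 second, taken from the log above.
	fmt.Printf("%.3fMb/s\n", throughputMbps(91557000, 1)) // prints 698.524Mb/s
}
```

Note that this treats a megabit as 2^20 bits rather than 10^6 bits, so the figure is slightly lower than a decimal Mb/s calculation would give.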

Thanks @jbeshay, very sorry for leaving out some important info.
I also made the following change to keep tperf running.
In the timeoutExpired function, I commented out "quicClient_->closeNow" and restart the timer at the end. It then runs for about 10s before failing.
void timeoutExpired() noexcept override {
  // Do not close the client; keep running.
  // quicClient_->closeNow(folly::none);
  constexpr double bytesPerMegabit = 131072;
  LOG(INFO) << "Received " << receivedBytes_ << " bytes in "
            << duration_.count() << " seconds.";
  LOG(INFO) << "Overall throughput: "
            << (receivedBytes_ / bytesPerMegabit) / duration_.count()
            << "Mb/s";
  // Per Stream Stats
  LOG(INFO) << "Average per Stream throughput: "
            << ((receivedBytes_ / receivedStreams_) / bytesPerMegabit) /
          duration_.count()
            << "Mb/s over " << receivedStreams_ << " streams";
  if (receivedStreams_ != 1) {
    LOG(INFO) << "Histogram per Stream bytes: " << std::endl;
    LOG(INFO) << "Lo\tHi\tNum\tSum";
    for (const auto bytes : bytesPerStream_) {
      bytesPerStreamHistogram_.addValue(bytes.second);
    }
    std::ostringstream os;
    bytesPerStreamHistogram_.toTSV(os);
    std::vector<std::string> lines;
    folly::split("\n", os.str(), lines);
    for (const auto& line : lines) {
      LOG(INFO) << line;
    }
  }

  // Reset receivedBytes_ and restart the timer.
  averageThroughput();
  receivedBytes_ = 0;
  eventBase_.timer().scheduleTimeout(this, duration_);
}

Now I am able to reproduce the issue. The root cause is that quic-go automatically triggers a key update every 100K packets: https://github.com/lucas-clemente/quic-go/blob/84e03e59760ceee37359688871bb0688fcc4e98f/internal/protocol/params.go#L183
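
As a rough illustration of why the failure only shows up on longer runs, the Go sketch below estimates how long a transfer takes to reach the 100K-packet key update threshold. The 1200-byte packet size is an assumption, the throughput is the 91557000 bytes/s from the tperf run earlier in this thread, and the function name secondsUntilKeyUpdate is hypothetical. The real timing depends on the actual packet size and throughput, and tperf only gives up some seconds after the key update, once the maximum PTO count is exceeded.

```go
package main

import "fmt"

// keyUpdateInterval is the packet count after which quic-go initiates a key
// update, as stated in this thread (100K packets).
const keyUpdateInterval = 100000.0

// secondsUntilKeyUpdate (hypothetical helper) estimates the time until the
// sender has emitted keyUpdateInterval packets, given a steady byte rate and
// an assumed fixed number of payload bytes per packet.
func secondsUntilKeyUpdate(bytesPerSecond, bytesPerPacket float64) float64 {
	packetsPerSecond := bytesPerSecond / bytesPerPacket
	return keyUpdateInterval / packetsPerSecond
}

func main() {
	// 91557000 bytes/s comes from the tperf output above; 1200 bytes per
	// packet is an assumption, not a measured value.
	fmt.Printf("%.2f s\n", secondsUntilKeyUpdate(91557000, 1200)) // prints 1.31 s
}
```

At a lower throughput the threshold is reached later, which would be consistent with a run that lasts around 10 seconds before failing.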

Mvfst does not yet support key updates, which is why tperf cannot decrypt any packets after the key update, causing the connection to time out.
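
To make the "key phase one" wording in the client log concrete, here is a small Go sketch of how the Key Phase bit is read from a short-header packet. The bit positions follow RFC 9000, Section 17.3.1; the function names keyPhase and canDecrypt are hypothetical, and canDecrypt is a simplified model of a receiver without key update support, not mvfst's actual code. The first byte is assumed to have had header protection removed already.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	headerFormBit = 0x80 // 1 = long header, 0 = short (1-RTT) header
	keyPhaseBit   = 0x04 // Key Phase bit in the short header's first byte
)

// keyPhase returns the Key Phase bit (0 or 1) of a short-header packet.
// firstByte must be the first byte after header protection is removed.
func keyPhase(firstByte byte) (int, error) {
	if firstByte&headerFormBit != 0 {
		return 0, errors.New("long header packets carry no key phase bit")
	}
	if firstByte&keyPhaseBit != 0 {
		return 1, nil
	}
	return 0, nil
}

// canDecrypt models a receiver that never installs updated keys: it holds
// keys for key phase 0 only, so every key phase 1 packet is undecryptable.
func canDecrypt(firstByte byte) bool {
	phase, err := keyPhase(firstByte)
	return err == nil && phase == 0
}

func main() {
	fmt.Println(canDecrypt(0x40)) // key phase 0: prints true
	fmt.Println(canDecrypt(0x44)) // key phase 1: prints false
}
```

Once the peer flips the bit, a receiver like this drops every subsequent packet, so from its point of view the connection goes silent until the PTO limit is reached.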

Thanks @jbeshay,
Got it. I'll check whether quic-go lets me switch key updates off, or look for another workaround.