JKRhb / dtls2

A DTLS library for Dart based on OpenSSL.

Crash when calling _maintainOutgoing --> _libSsl.SSL_ctrl

Ifilehk opened this issue · comments

Hello Jan,

Here is a crash trace. Could you check it?

Thanks.

Tarik

===== CRASH =====
si_signo=Segmentation fault(11), si_code=1, si_addr=0x1f0
version=2.19.3 (stable) (Unknown timestamp) on "linux_x64"
pid=3214466, thread=3214944, isolate_group=main(0x5626b38ac000), isolate=main(0x5626b394d000)
os=linux, arch=x64, comp=no, sim=no
isolate_instructions=5626b1d74c80, vm_instructions=5626b1d74c80
  pc 0x00007fe557d815b6 fp 0x00007fe5a1ffce18 /lib/x86_64-linux-gnu/libssl.so+0x255b6
  pc 0x00007fe557e2fe0c fp 0x00007fe5a1ffce60 Unknown symbol
  pc 0x00007fe557e2f989 fp 0x00007fe5a1ffcee0 Unknown symbol
  pc 0x00007fe557e2e1c0 fp 0x00007fe5a1ffcf50 Unknown symbol
  pc 0x00007fe557e2bcae fp 0x00007fe5a1ffcfa0 Unknown symbol
  pc 0x00007fe557e2b237 fp 0x00007fe5a1ffcfd8 Unknown symbol
  pc 0x00007fe557e23eaf fp 0x00007fe5a1ffd050 Unknown symbol
  pc 0x00007fe557e22e63 fp 0x00007fe5a1ffd0a8 Unknown symbol
  pc 0x00007fe557e22c51 fp 0x00007fe5a1ffd128 Unknown symbol
  pc 0x00007fe557e22714 fp 0x00007fe5a1ffd180 Unknown symbol
  pc 0x00007fe557e2204d fp 0x00007fe5a1ffd1c0 Unknown symbol
  pc 0x00007fe557e21d5c fp 0x00007fe5a1ffd200 Unknown symbol
  pc 0x00007fe557e21ab2 fp 0x00007fe5a1ffd240 Unknown symbol
  pc 0x00007fe557e21763 fp 0x00007fe5a1ffd280 Unknown symbol
  pc 0x00007fe557e23a5d fp 0x00007fe5a1ffd2c0 Unknown symbol
  pc 0x00007fe557e1ef52 fp 0x00007fe5a1ffd318 Unknown symbol
  pc 0x00007fe59ec651df fp 0x00007fe5a1ffd360 Unknown symbol
  pc 0x00007fe59ec64e2e fp 0x00007fe5a1ffd3a0 Unknown symbol
  pc 0x00007fe59ec64d59 fp 0x00007fe5a1ffd3c8 Unknown symbol
  pc 0x00007fe59ec64c7e fp 0x00007fe5a1ffd408 Unknown symbol
  pc 0x00007fe59ec24208 fp 0x00007fe5a1ffd448 Unknown symbol
  pc 0x00007fe5a608300c fp 0x00007fe5a1ffd4c0 Unknown symbol
  pc 0x00005626b1eedf79 fp 0x00007fe5a1ffd560 dart::DartEntry::InvokeCode(dart::Code const&, unsigned long, dart::Array const&, dart::Array const&, dart::Thread*)+0x139
  pc 0x00005626b1eeddf5 fp 0x00007fe5a1ffd5c0 dart::DartEntry::InvokeFunction(dart::Function const&, dart::Array const&, dart::Array const&, unsigned long)+0x145
  pc 0x00005626b1ef0154 fp 0x00007fe5a1ffd600 dart::DartLibraryCalls::HandleMessage(long, dart::Instance const&)+0x144
  pc 0x00005626b1f13738 fp 0x00007fe5a1ffdb90 dart::IsolateMessageHandler::HandleMessage(std::__2::unique_ptr<dart::Message, std::__2::default_delete<dart::Message>>)+0x348
  pc 0x00005626b1f3bfca fp 0x00007fe5a1ffdc10 dart::MessageHandler::HandleMessages(dart::MonitorLocker*, bool, bool)+0x15a
  pc 0x00005626b1f3c6eb fp 0x00007fe5a1ffdc60 dart::MessageHandler::TaskCallback()+0x1db
  pc 0x00005626b206791b fp 0x00007fe5a1ffdce0 dart::ThreadPool::WorkerLoop(dart::ThreadPool::Worker*)+0x13b
  pc 0x00005626b2067d68 fp 0x00007fe5a1ffdd10 dart::ThreadPool::Worker::Main(unsigned long)+0x78
  pc 0x00005626b1fd91d6 fp 0x00007fe5a1ffddd0 /usr/lib/dart/bin/dart+0x22281d6
-- End of DumpStackTrace
  pc 0x0000000000000000 fp 0x00007fe5a1ffce18 sp 0x0000000000000000 Cannot find code object
  pc 0x00007fe557e2fe0c fp 0x00007fe5a1ffce60 sp 0x00007fe5a1ffce28 [Optimized] FfiTrampoline__SSL_ctrl
  pc 0x00007fe557e2f989 fp 0x00007fe5a1ffcee0 sp 0x00007fe5a1ffce70 [Unoptimized] OpenSsl.SSL_ctrl
  pc 0x00007fe557e2e1c0 fp 0x00007fe5a1ffcf50 sp 0x00007fe5a1ffcef0 [Unoptimized] _DtlsServerConnection@106018504._maintainOutgoing@106018504
  pc 0x00007fe557e2bcae fp 0x00007fe5a1ffcfa0 sp 0x00007fe5a1ffcf60 [Unoptimized] _DtlsServerConnection@106018504._maintainState@106018504
  pc 0x00007fe557e2b237 fp 0x00007fe5a1ffcfd8 sp 0x00007fe5a1ffcfb0 [Unoptimized] _DtlsServerConnection@106018504._incoming@106018504
  pc 0x00007fe557e23eaf fp 0x00007fe5a1ffd050 sp 0x00007fe5a1ffcfe8 [Unoptimized] DtlsServer._handleSocketRead@106018504
  pc 0x00007fe557e22e63 fp 0x00007fe5a1ffd0a8 sp 0x00007fe5a1ffd060 [Unoptimized] DtlsServer._startListening@106018504.<anonymous closure>
  pc 0x00007fe557e22c51 fp 0x00007fe5a1ffd128 sp 0x00007fe5a1ffd0b8 [Unoptimized] _RootZone@4048458.runUnaryGuarded
  pc 0x00007fe557e22714 fp 0x00007fe5a1ffd180 sp 0x00007fe5a1ffd138 [Unoptimized] _BufferingStreamSubscription@4048458._sendData@4048458
  pc 0x00007fe557e2204d fp 0x00007fe5a1ffd1c0 sp 0x00007fe5a1ffd190 [Unoptimized] _BufferingStreamSubscription@4048458._add@4048458
  pc 0x00007fe557e21d5c fp 0x00007fe5a1ffd200 sp 0x00007fe5a1ffd1d0 [Unoptimized] _SyncStreamController@4048458._sendData@4048458
  pc 0x00007fe557e21ab2 fp 0x00007fe5a1ffd240 sp 0x00007fe5a1ffd210 [Unoptimized] _StreamController@4048458._add@4048458
  pc 0x00007fe557e21763 fp 0x00007fe5a1ffd280 sp 0x00007fe5a1ffd250 [Unoptimized] _StreamController@4048458.add
  pc 0x00007fe557e23a5d fp 0x00007fe5a1ffd2c0 sp 0x00007fe5a1ffd290 [Unoptimized] new _RawDatagramSocket@14069316..<anonymous closure>
  pc 0x00007fe557e1ef52 fp 0x00007fe5a1ffd318 sp 0x00007fe5a1ffd2d0 [Unoptimized] _NativeSocket@14069316.issueReadEvent.issue
  pc 0x00007fe59ec651df fp 0x00007fe5a1ffd360 sp 0x00007fe5a1ffd328 [Unoptimized] _microtaskLoop@4048458
  pc 0x00007fe59ec64e2e fp 0x00007fe5a1ffd3a0 sp 0x00007fe5a1ffd370 [Unoptimized] _startMicrotaskLoop@4048458
  pc 0x00007fe59ec64d59 fp 0x00007fe5a1ffd3c8 sp 0x00007fe5a1ffd3b0 [Unoptimized] _startMicrotaskLoop@4048458
  pc 0x00007fe59ec64c7e fp 0x00007fe5a1ffd408 sp 0x00007fe5a1ffd3d8 [Unoptimized] _runPendingImmediateCallback@1026248
  pc 0x00007fe59ec24208 fp 0x00007fe5a1ffd448 sp 0x00007fe5a1ffd418 [Unoptimized] _RawReceivePort@1026248._handleMessage@1026248
  pc 0x00007fe5a608300c fp 0x00007fe5a1ffd4c0 sp 0x00007fe5a1ffd458 [Stub] InvokeDartCode

More info related to this crash: it is definitely related to the libssl version mismatch between client and server.
The client uses 1.1.1 and the server 3.0.2.

Obvious fix: server and client must load the same version.

Thank you for reporting the issue! I will also have a deeper look into this :)

(I took the freedom of wrapping the log/stack trace in your comment into a code block to increase readability.)

Hmm, quick question: This crash occurs on the server side using OpenSSL 3.0.2, right?

From the stack trace, it seems to me as if the error occurs here, when _libSsl.SSL_ctrl is being called:

if (_libSsl.SSL_ctrl(_ssl, DTLS_CTRL_GET_TIMEOUT, 0, buffer.cast()) > 0) {
  _timer = Timer(buffer.cast<timeval>().ref.duration, _maintainState);
}

Edit: Oh, sorry, just realized that you also stated that in the issue title 😅

Hmm. Could you try out if the server works when you comment out lines 308 to 310 referenced above?

Could you also try upgrading to OpenSSL 3.0.8?

Reading through the OpenSSL documentation and discussions, it seems as if calling SSL_ctrl directly is a bad practice. I will try out replacing it with DTLSv1_get_timeout, maybe that resolves the issue.

Yes, on the server. Then, strangely, the client crashes too at almost the same time.

I don't want to hijack your nice work, but adding this would give access to the library version.

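` /// Returns an OpenSSL version/build string selected by [type].
ffi.Pointer<ffi.Char> OpenSSL_version(int type) {
  return _OpenSSL_version(type);
}

// The C prototype is `const char *OpenSSL_version(int type)`, so the native
// argument type is ffi.Int.
late final _OpenSSL_versionPtr = _lookup<
    ffi.NativeFunction<ffi.Pointer<ffi.Char> Function(ffi.Int)>>(
  'OpenSSL_version');
late final _OpenSSL_version = _OpenSSL_versionPtr
    .asFunction<ffi.Pointer<ffi.Char> Function(int)>();`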

> Reading through the OpenSSL documentation and discussions, it seems as if calling SSL_ctrl directly is a bad practice. I will try out replacing it with DTLSv1_get_timeout, maybe that resolves the issue.

Yes, I was reading it too, but I don't know if it will solve the problem because 1.1 and 3 seem to be incompatible. To what extent, I don't know. I will try it anyway. Thank you for your support.

Thank you for the suggestion, @Ifilehk! Since the function returns a formatted string that varies with the input value, I have now generated bindings to the functions that return the major, minor, and patch versions as numeric values instead; I think that should be a bit easier to work with.

I am not sure yet how to differentiate between the two cases, however... Have you also checked via Wireshark what data is exchanged between client and server?
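A minimal sketch of how the numeric version could be compared on both sides. It decodes the value that libcrypto's OpenSSL_version_num() returns, which is available in both 1.1.x and 3.x. The class OpenSslVersion and the method sameMajorAs are hypothetical names for illustration and are not part of dtls2. Treating a different major version as a warning sign only mirrors the combinations reported in this thread; it is not an official compatibility rule.

```dart
/// Parsed OpenSSL version triple (hypothetical helper, not part of dtls2).
class OpenSslVersion {
  const OpenSslVersion(this.major, this.minor, this.patch);

  final int major;
  final int minor;
  final int patch;

  /// Decodes the number returned by `OpenSSL_version_num()`.
  ///
  /// OpenSSL 3.x packs the value as 0xMNN00PP0, while 1.1.x uses
  /// 0xMNNFFPPS, where FF is the third version component ("fix").
  factory OpenSslVersion.fromVersionNumber(int number) {
    final major = (number >> 28) & 0xF;
    final minor = (number >> 20) & 0xFF;
    final patch = major >= 3
        ? (number >> 4) & 0xFF // 3.x: patch level
        : (number >> 12) & 0xFF; // 1.1.x: "fix" component
    return OpenSslVersion(major, minor, patch);
  }

  /// Assumption: only libraries with the same major version are
  /// expected to behave alike behind the same bindings.
  bool sameMajorAs(OpenSslVersion other) => major == other.major;

  @override
  String toString() => '$major.$minor.$patch';
}
```

For example, OpenSslVersion.fromVersionNumber(0x30000020) decodes to 3.0.2, and comparing it with a decoded 1.1.1 value via sameMajorAs returns false.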

> Reading through the OpenSSL documentation and discussions, it seems as if calling SSL_ctrl directly is a bad practice. I will try out replacing it with DTLSv1_get_timeout, maybe that resolves the issue.

> Yes, I was reading it too, but I don't know if it will solve the problem because 1.1 and 3 seem to be incompatible.

After trying a bit, I also noticed that this strategy does not really work since DTLSv1_get_timeout is a macro that is replaced by a call to SSL_ctrl (with the same arguments as in the server class). I also remembered that I tried something similar before already ;)
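To illustrate why the replacement changes nothing: in OpenSSL's ssl.h, DTLSv1_get_timeout(ssl, arg) expands to SSL_ctrl(ssl, DTLS_CTRL_GET_TIMEOUT, 0, (void *)(arg)), and DTLS_CTRL_GET_TIMEOUT is defined as 73. A rough Dart equivalent could look like the sketch below. The names SslCtrl and dtlsV1GetTimeout are hypothetical, the SSL_ctrl binding is passed in as a plain function so the snippet does not depend on a loaded library, and the null-handle check is a purely defensive assumption, not something this thread establishes as the cause of the crash.

```dart
import 'dart:ffi';

/// Value of DTLS_CTRL_GET_TIMEOUT in OpenSSL's <openssl/ssl.h>.
const int dtlsCtrlGetTimeout = 73;

/// Dart-side shape of `long SSL_ctrl(SSL *ssl, int cmd, long larg, void *parg)`.
typedef SslCtrl = int Function(
    Pointer<Void> ssl, int cmd, int larg, Pointer<Void> parg);

/// Hypothetical stand-in for the C macro `DTLSv1_get_timeout`.
///
/// [timeval] must point to memory large enough for a `struct timeval`.
/// Returns whatever `SSL_ctrl` returns; a value > 0 means a timeout is set.
int dtlsV1GetTimeout(
    SslCtrl sslCtrl, Pointer<Void> ssl, Pointer<Void> timeval) {
  // Defensive assumption: never hand a null SSL handle to native code.
  if (ssl.address == 0) {
    throw StateError('SSL handle is null; the connection may be closed.');
  }
  return sslCtrl(ssl, dtlsCtrlGetTimeout, 0, timeval);
}
```

In the server class, the first argument would be the generated binding (for example _libSsl.SSL_ctrl), so the native call is exactly the same as before.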

The segmentation fault does not occur on your OpenSSL 3.0 machine when running the tests, for example, does it?

Furthermore, is this issue somehow related to #46?

Yes, it occurs on the Linux machine running libssl 3.0.2, and then the client crashes.

New test, new info:
Client Android with OpenSSL 1.1.1c 28 May 2019 against Server Linux with OpenSSL 1.1.1l 24 Aug 2021

Server side:
On connection, in void _connectToPeer(), _libSsl.DTLSv1_listen(_ssl, _bioAddr) returns -1.
In void _handleError(int ret, void Function(Exception) errorHandler), _libSsl.SSL_get_error(_ssl, ret) returns SSL_ERROR_SSL, and _libCrypto.ERR_error_string(_libCrypto.ERR_get_error(), nullptr) returns

error:00000000:lib(0)::reason(0)

Thank you for the additional information! Does the client crash, though, or is it just the connection that fails?

> Yes it occurs on the Linux machine running libssl 3.0.2, then the client crashes.

So the error also occurs if you run dart test in a local copy of the repository?

Hm, it seems like error:00000000:lib(0)::reason(0) means that there is no actual error on the OpenSSL error stack. So, I guess, the connection attempt is simply not successful.
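As a small sketch of that observation: ERR_get_error() returns 0 when the error queue is empty, and formatting 0 yields the all-zero string shown above. A helper along the following lines could make that case explicit before the message ends up in an exception. The function name describeOpenSslError and the errorString parameter (standing in for a binding such as ERR_error_string) are hypothetical.

```dart
/// Turns the result of `ERR_get_error()` into a readable message.
///
/// [errorString] stands in for a binding that formats a non-zero code;
/// it is injected so that the sketch runs without loading libcrypto.
String describeOpenSslError(
    int errorCode, String Function(int code) errorString) {
  if (errorCode == 0) {
    // An empty error queue: OpenSSL itself recorded no failure reason.
    return 'No error queued by OpenSSL (the call failed without details).';
  }
  return errorString(errorCode);
}
```

This does not fix the failed connection attempt, but it avoids reporting a zero code as if it were a real OpenSSL error.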

More info:
Now same client against Cloud Linux Server with OpenSSL 1.1.1d 10 Sep 2019

Actually, this setup was working over the past days but was crashing after a few minutes. I could not trace the problem because I had no way to debug it there. I was just getting something like
===== CRASH =====
si_signo=Segmentation fault(11), si_code=1, si_addr=0x2d0
Aborted

This setup is able to send the identityHint, and it is received by the Android client. After that, both crash.

Could not get a libssl match yet but will investigate further.

> This setup is able to send the identityHint, and it is received by the Android client. After that, both crash.

Oh, maybe the identityHint is the root cause here, I think I have not tested it (that thoroughly) yet. I will have another look into it.

Checking the DTLS server running locally on my Linux machine against this client:
openssl s_client -dtls -connect etc.. etc...

gives the same result:
DtlsException: error:00000000:lib(0)::reason(0)

I am not sure about your assumption regarding the identityHint because all was working yesterday and I could send and receive messages after the successful handshake.

> I am not sure about your assumption regarding the identityHint because all was working yesterday and I could send and receive messages after the successful handshake.

Hmm, okay, then it is probably something else.

Now the difference:

6789
CONNECTED(00000003)
Can't use SSL_get_servername
40070EFA4B7F0000:error:0A000438:SSL routines:dtls1_read_bytes:tlsv1 alert internal error:../ssl/record/rec_layer_d1.c:613:SSL alert number 80
---
no peer certificate available
---
No client certificate CA names sent
---
SSL handshake has read 231 bytes and written 550 bytes
Verification: OK
---
New, TLSv1.2, Cipher is PSK-AES128-CCM8
Secure Renegotiation IS supported
Compression: NONE
Expansion: NONE
No ALPN negotiated
SSL-Session:
    Protocol  : DTLSv1.2
    Cipher    : PSK-AES128-CCM8
    Session-ID: 
    Session-ID-ctx: 
    Master-Key: 8EEF3C1AF84F27F894492B29867292AF1FF8DCFAFCC5B6B73F4A6BDA63FA387E3AC9D421B2602B60FD91858717088E28
    PSK identity: 7X2KT
    PSK identity hint: This is the identity hint!
    SRP username: None
    Start Time: 1678384562
    Timeout   : 7200 (sec)
    Verify return code: 0 (ok)
    Extended master secret: yes
---

This is the result of openssl s_client -dtls -connect etc.. etc..., this time with the IP of my cloud server, which runs the same code but loads a different libssl (locally I have OpenSSL 1.1.1l 24 Aug 2021, and on my cloud server OpenSSL 1.1.1d 10 Sep 2019).

The handshake is successful, but after that the server crashes:
===== CRASH =====
si_signo=Segmentation fault(11), si_code=1, si_addr=0x2d0

So there is definitely a problem with the way the libssl library is called

Forget this crash. It was related to the fact that the PSK key was not returned from the _serverPskCallback.
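For context, OpenSSL's server PSK callback is expected to copy the key into the buffer it is given and return the key length, with 0 signalling that no key is available for the identity. A minimal sketch of that contract in Dart follows. The function fillPsk and the map-based key store are hypothetical and only illustrate the return-value convention; they are not the actual dtls2 implementation of _serverPskCallback.

```dart
import 'dart:typed_data';

/// Copies the key for [identity] into [out] and returns its length.
///
/// Returns 0 (meaning "no key") for an unknown identity, an empty key,
/// or a key that does not fit into the buffer provided by the caller.
int fillPsk(
    Map<String, Uint8List> keyStore, String identity, Uint8List out) {
  final key = keyStore[identity];
  if (key == null || key.isEmpty || key.length > out.length) {
    return 0;
  }
  out.setRange(0, key.length, key);
  return key.length;
}
```

Returning 0 for an unknown identity lets the handshake fail cleanly instead of continuing with an unset key.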

Goooood news!!! It is working again like yesterday with the cloud server. While trying to debug the crash that was occurring from time to time, I went back to your example server but messed up the identity and key.

The fact remains that even though this configuration works, the one I need on my local Linux machine to trace the crash that occurs "sporadically" doesn't.

For now:

Client OpenSSL 1.1.1c 28 May 2019 <----------> Server OpenSSL 1.1.1d 10 Sep 2019:

  • Handshake OK
  • Transmission OK
  • Crash sporadically

Client OpenSSL 1.1.1c 28 May 2019 <----------> Server OpenSSL 1.1.1l 24 Aug 2021:

  • DtlsException: error:00000000:lib(0)::reason(0) that occurs before handshake

Hello Jan,

I have new info regarding this problem. I managed to build libssl and libcrypto 3 for my client.

Client OpenSSL 3.0.8 7 Feb 2023 <----------> Server OpenSSL 3.0.2 15 Mar 2022:

  • Handshake OK
  • Transmission OK
  • Crash on long run? I am checking this and will come back with traces, because this setup works on my local Linux server in the Dart environment

@Ifilehk Would you say that we can consider this issue resolved as well?

Yes resolved!

> Yes resolved!

Great, thank you for your feedback :)

Let me know if you run into any more issues :)