IBBoard / cawbird

A fork of the Corebird GTK Twitter client that continues to work with Twitter

Home Page:https://ibboard.co.uk/cawbird/

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Do not lose tweets on TLS issues

IBBoard opened this issue · comments

Sometimes (possibly just on one network) I get TLS connection errors. Normally this happens when RTing, but occasionally it happens when tweeting. Normally, Corebird/Cawbird keeps the last tweet, but in one specific situation that I think is related to the TLS errors then it forgets it. My assumption is that it doesn't handle the error correctly and treats it as posted.

Error is:

Exception: Error performing TLS handshake: An unexpected TLS packet was received in TweetUtils.vala:91

At least it's from Vala code and we get a line number!

This may be handled correctly on sending tweets. start_send_tweet in ComposeTweetWindow.vala only wipes the last message when the async finishes and the tweet succeeds (src).

But save_last_tweet only saves the text when we're not replying to something. So maybe we lose replies.

From initial testing then the patches are working. I've had an RT that failed that then reset back to the "not RTed" status so that I could RT properly. Previously the UI would say RTed, but the dialog says it failed, then you'd un-RT and it'd complain that the tweet didn't exist, and then you'd RT.

The only problem is that I've had to add a "Save tweet?" dialog to give users a way out when they quote a tweet, it fails for transient reasons (like the TLS issue), and then by the time they go to post it again then it fails to load the tweet and so they want to clear the draft tweet.

One thing I need to think about is attached media. I'm not sure how well we can handle keeping draft attached images.

Today was a REALLY bad day for these errors. Happened repeatedly. Not a clue what's happening with Twitter/the network. I briefly tried Wireshark but it wasn't hugely helpful.

However, I did find that I've got more work to do:

  • Fix the double notification (I'm considering rearchitecting how that works)
  • Don't inject the DM until we know it sent
  • Reset the "Send" status and stop the spinner when tweeting fails

Also, I still need to look at attached media.

Okay, I think that tweet, quote, reply, DMs, RT/unRT, favourite/un-favourite and delete are all working now (as in they catch the error and report it to the user, then put the system in a sensible state).

We've also got the "Save?" dialog when you close the compose window with text. I'm not hugely keen on it, but it gives us a way out of the "compose a reply, sending fails, re-open compose later, try to load tweet that you were replying to, tweet doesn't exist, cancel button saves tweet, re-open compose, try to load tweet, tweet doesn't exist…" loop. But as I type this then I'm wondering whether we're better off asking the user whether to ditch the saved tweet when fetching the reply fails.

Now I need to look at saving attached images and work out whether it's better to save the local paths and handle the user deleting them, or save the remote paths (which would lose incomplete uploads) and work out how to download the thumbnail.

Technical detail

From running Wireshark and decrypting my comms (instructions) and then running with GNUTLS_DEBUG_LEVEL set and poking around the gnutls source code then it looks like the problem comes when Twitter sends a "Server Hello" with "New Session Ticket" as well as "Change Cipher Spec".

gnutls[4]: HSK[…]: Server's version: 3.3
gnutls[4]: HSK[…]: SessionID length: 32
gnutls[4]: HSK[…]: SessionID: 5fe…
gnutls[4]: EXT[…]: Parsing extension 'Safe Renegotiation/65281' (1 bytes)
gnutls[4]: HSK[…]: Safe renegotiation succeeded
gnutls[5]: REC[…]: SSL 3.3 Handshake packet received. Epoch 0, length: 186
gnutls[5]: REC[…]: Expected Packet ChangeCipherSpec(20)
gnutls[5]: REC[…]: Received Packet Handshake(22) with length: 186
gnutls[5]: REC[…]: Decrypted Packet[1] Handshake(22) with length: 186
gnutls[3]: ASSERT: record.c[recv_hello_request]:774
gnutls[3]: ASSERT: record.c[_gnutls_recv_in_buffers]:1579
gnutls[3]: ASSERT: record.c[_gnutls_recv_int]:1777
gnutls[3]: ASSERT: handshake.c[recv_handshake_final]:3272
gnutls[3]: ASSERT: handshake.c[handshake_client]:3074
gnutls[3]: ASSERT: buffers.c[_gnutls_io_read_buffered]:589
gnutls[3]: ASSERT: record.c[_gnutls_recv_int]:1777
(cawbird:14730): cawbird-WARNING **: 11:43:55.926: Could not send tweet: Error performing TLS handshake: An unexpected TLS packet was received.

I still don't know why this is a problem, but the gnutls state machine appears to be in an incorrect state (or Twitter are sending things out of order) and it's expecting the wrong bit.

Question answered on the "save local image path or uploaded image" - from the docs:

  • The returned media_id and media_key are only valid for expires_after_secs seconds. Any attempt to use either after this time period in other endpoints will result in an HTTP 4xx Bad Request.

(Emphasis mine)

Given that we don't know when we'll come back to it, we'll have to assume that users keep the images that they're attaching (and if they don't then it's their fault!). The example shows 24h, which will generally be sufficient, but we can't guarantee that'll be the case.

Okay, I've improved how we handle saving - most users won't see a prompt. You'll now only get prompted if you compose with a draft reply/quote and it fails to load the referenced tweet (e.g. TLS error).

Only thing to do now is load/save draft images.

I've patched my custom version of Cawbird with the branch as it stands, so I'll do some testing before I merge.

I've put in some load/save draft images stuff now. Possibly not so relevant since we now don't close the window and lose the tweet if it fails anyway, but a) useful for when people give up and try again later and b) it's let me spot another TLS error case to handle!

Still to go:

  • TLS error during timeline loading
  • TLS error during image upload

Anything I missed?

I'm vaguely suspicious that it's a GnuTLS bug underlying all of this, but I can't track it down and I don't know that level of the network stack enough to just drop in to their community and go "what's going on here?". Also, it could be GnuTLS following the spec and Twitter not doing.

Okay, timeline loading issues have a basic cludge. Image upload is sorted. I'm going to run a test build for a while to see if I hit any issues.

It looks like GnuTLS are acknowledging this as a bug, so all of these changes will be defensive and shouldn't be needed in future. It appears to be something that Twitter changed that GnuTLS doesn't handle cleanly.

https://gitlab.com/gnutls/gnutls/issues/841#note_225110002

Okay, I think this is about done now. I've still got to look in to the GnuTLS side, but most things should now be handled correctly, fail and retry, or fail silently for some remaining less important things.

I've merged to Master and will release soon, so give me a shout if there's anything else that people find.

It looks like there's a simple fix upstream in GnuTLS that appears to work for me.

It's currently scheduled for v3.6.11 in December.

In the mean-time, I'll consider this as fixed as it's going to be, unless someone tells me otherwise.

@IBBoard Have you really checked compiling cawbird with GnuTLS from master?
I tried backporting the patches from this commit and compiled cawbird with a glib-networking that uses this patched GnuTLS in NixOS, but I still get TLS errors.

I want to test building cawbird agains the latest GnuTLS master but currently fail at building the latter.

I've been running with a custom build of GnuTLS for a few days that basically just adds MR 1087 to the 3.6.10 code and I've not seen a single TLS error.

I don't believe it's glib-networking that needs recompiling with the updated GnuTLS. I think it's libsoup. I've rebuilt GnuTLS but not glib-networking.

I don't believe it's glib-networking that needs recompiling with the updated GnuTLS. I think it's libsoup.

That is pretty interesting, as libsoup does not even take gnutls as a dependency in NixOS but just depends on glib-networking.
And that does take gnutls as a dependency.

Huh, maybe you're right. Maybe I was reading old docs where libsoup used GnuTLS directly. I just rebuilt them both and it worked, but looking at build specs and ldd output then libsoup doesn't reference gnutls but does reference GIO's g_tls_… functions and glib-networking provides /usr/lib64/gio/modules/libgiognutls.so, which loads libgnutls.so.30. I might not need to rebuild libsoup.

Good news, having built Cawbird against NixOS-unstable with the latest GnuTLS release, I haven't encountered any TLS issues for several hours so far \0/

Good job.

I just debugged it. The GnuTLS people knew the protocols enough to understand what to do to fix it!

I'm running Ubuntu 20.04 with GnuTLS 3.6.13 and still getting the error.

Well i'm running the snap which has a 18.04 base, so maybe that's the issue then.

I've never stopped having this error the first time I try to open any tweet in the timeline.
Pressing back and trying again then works on any tweet.
18.04, 1.0.5 now.

As far as I can tell, it looks like Ubuntu 18.04 is still using an old version of GnuTLS (possibly 3.5.18).

From what I understand of Snap, I think it bundles its own version of many libraries and I assume it is also outdated if you're seeing the errors.

There's nothing we can do. The changes in this ticket put in the best work-arounds we can without causing different issues. The only real solution is for distros and "application distribution systems" that insist on bundling their own copies of libraries to update their libraries to fix the real bug.

I've filed a bug with the snap dev of cawbird as this is fixable in snaps.

Patch to fix it in 18.04 and 19.10 was submitted by the snap maintainer.
https://bugs.launchpad.net/ubuntu/+source/gnutls28/+bug/1873565