ssh -t abduco

Question

AeliusSaionji opened this issue 8 years ago

abduco often has problems when called in this way: ssh host -t abduco -A irc weechat
First, if I'm creating a new session, it never works. I get a blank screen and I need to kill the abduco process to recover.
Attaching to an existing session works most of the time, but sometimes has rendering issues, and has failed to properly attach at least once (blank screen, must kill abduco).

Marc André Tanner · Answer 1 · Sat Apr 02 2016 20:10:39 GMT+0800 (China Standard Time)

I've been testing it the last couple of days with my abduco+dvtm setup and did not encounter any problems. That is both session creation and attaching did work as expected. Does it only happen with weechat or also with other applications?

Regarding the redraw issue, abduco just sends a SIGWINCH signal to the underlying process and expects it to redraw accordingly. I do not know what weechat does in this case. As mentioned at the end of the Quickstart section, if you encounter redraw issues after attaching, it is recommended to run the underlying application within dvtm.

For your case this would translate into: ssh host -t abduco -A irc dvtm weechat

Aelius · Answer 2 · Sat Apr 02 2016 20:57:21 GMT+0800 (China Standard Time)

I actually have been using dvtm, and still run into this issue infrequently. I guess irssi/weechat are troublemakers- I have tried to ditch tmux for abduco a few times now, but end up returning to tmux. How would I go about finding and submitting a core dump?

Marc André Tanner · Answer 3 · Sat Apr 02 2016 21:26:24 GMT+0800 (China Standard Time)

Strange, I never had this issue when using dvtm.

Does it also happen if you login normally and then attach from within your interactive SSH shell session?

local$ ssh host
 host$ abduco -A irc dvtm weechat

Also I assume resizing the terminal does not improve the situation?

According to your description abduco does not crash but simply hangs? If this is the case then no core dump will be produced. You could try to compile abduco with debug flags (make debug will also enable possibly annoying logging) and then attach with a debugger gdb -p <pid>.

There will be two abduco processes; the server will be reparented to init. Hence something like pgrep -P 1 abduco should print the server process id.

Having said that, the process will likely just be sitting in the select-based main loop, in which case further debugging would be required ...

Another option is to use strace(1) as described in the README.

Aelius · Answer 4 · Sat Apr 02 2016 21:53:23 GMT+0800 (China Standard Time)

Some behavior I have witnessed did include a crash and core dump, but simply hanging is the more frequent result.

I'll report back with more specifics next time it happens.

Aelius · Answer 5 · Fri May 06 2016 06:57:19 GMT+0800 (China Standard Time)

So I still don't really have much to give you; at least once a month, abduco will freeze, consuming 100% of a CPU according to htop. Trying to attach gives me a blank screen. It most recently just happened literally in the middle of me typing something on my IRC client weechat: I was typing, suddenly it became nonresponsive, and that was it. I ssh'd in without abduco and eventually had to just kill weechat and abduco.

Martin Zimmermann · Answer 6 · Mon May 16 2016 03:26:11 GMT+0800 (China Standard Time)

I can report something similar:

Gentoo Hardened
Alpine Linux 3.3 (grsec as well)

On both, running abduco under s6-supervise always results in 100% CPU usage. It's a bit tricky to set up, but it is something like this:

$ mkdir -p services/weechat
$ cat >services/weechat/run <<'EOF'
#!/bin/sh

USER=irc
HOME=`getent passwd $USER | cut -d: -f6`

exec s6-setuidgid $USER abduco -c weechat weechat
EOF
$ chmod +x services/weechat/run
$ s6-svscan services

I have no experience with gdb unfortunately, I might try again later. The same setup works perfectly fine with screen -D -m, although screen works differently with -D (it doesn't exit but also doesn't attach).

Marc André Tanner · Answer 7 · Tue May 17 2016 05:13:27 GMT+0800 (China Standard Time)

@posativ is it completely reproducible (i.e. does it reliably happen all the time?) with the environment you described? If so, that would be progress. Unfortunately I can't promise to look into it myself in the near future. But gdb backtrace/strace output would probably help diagnose the issue ...

Ghjuvan Lacambre · Answer 8 · Thu Jun 09 2016 00:24:30 GMT+0800 (China Standard Time)

Also experiencing this problem with abduco-0.5 on Gentoo.
Here's how I reproduce the issue every time:
Start abduco and weechat on the host with abduco -A irc weechat.
Then, try to attach to the session from the client by using ssh host abduco -A irc weechat. What happens here is that the screen is not refreshed and it seems that abduco isn't able to find out the window's size either.
When doing kill -SIGWINCH $(pidof weechat), the abduco client will be resized and displayed (almost) how it should be (the display is one line too high, which means you can't see the first line).
Detaching from the current session (and thus closing the ssh connection) doesn't close the abduco client which then starts eating the CPU.

I am far from being great with gdb, but when attaching to the process I noticed a whole lot of SIGPIPE signals being sent to abduco. Trying to get the backtrace after a SIGPIPE shows this:

Program received signal SIGPIPE, Broken pipe.
0x76eda260 in write () from /lib/libc.so.6
(gdb) backtrace
#0  0x76eda260 in write () from /lib/libc.so.6
#1  0x76e819b8 in _IO_file_write () from /lib/libc.so.6
#2  0x76e80da0 in ?? () from /lib/libc.so.6
#3  0x76e8200c in _IO_file_xsputn () from /lib/libc.so.6
#4  0x76e5b1b0 in ?? () from /lib/libc.so.6
Backtrace stopped: previous frame identical to this frame (corrupt stack?)

I tried to generate a core file. I am no expert with gdb so if I did something wrong let me know and give me instructions on how to generate a proper core file.
abduco_core.zip

Ghjuvan Lacambre · Answer 9 · Thu Jun 09 2016 16:46:44 GMT+0800 (China Standard Time)

Alright, found the culprit for the CPU-eating problem. You're not checking for ENOTTY.
In client.c, I replaced line 116 (if (len == -1 && errno != EAGAIN && errno != EINTR)) with if ((len == -1 && errno != EAGAIN && errno != EINTR) || errno == ENOTTY) and the problem suddenly went away. I figured this is not a proper solution as the client is not properly closed, thus I didn't want to create a pull request. Hope this helps.

Marc André Tanner · Answer 10 · Fri Jun 10 2016 00:43:08 GMT+0800 (China Standard Time)

@glacambre thanks for the information, you didn't specify -t when connecting via ssh? I think this is a different issue. I suspect the errno is set to ENOTTY by the failing ioctl(2) call. I will have to check whether it can also occur as a result of read(2). In any case we should not attempt to send resize events when not dealing with an interactive tty.
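A minimal sketch of that last point, assuming a client that queries the window size with ioctl(2) before sending a resize event. This is not abduco's actual code: `query_winsize` is a hypothetical helper name. It skips the ioctl when the descriptor is not a terminal and restores errno, so an ENOTTY from the size query cannot be mistaken for the result of a later read(2).

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not abduco's actual code): fetch the terminal size
 * only if fd refers to a tty. Returns true and fills *ws on success,
 * false if fd is not a tty or the ioctl fails.
 * errno is restored afterwards so that an ENOTTY set by isatty(3) or
 * ioctl(2) cannot leak into a later errno check.
 */
bool query_winsize(int fd, struct winsize *ws)
{
    assert(ws != NULL);
    int saved_errno = errno;
    bool ok = isatty(fd) && ioctl(fd, TIOCGWINSZ, ws) == 0;
    errno = saved_errno;
    return ok;
}
```

A caller would send a resize event only when `query_winsize(STDIN_FILENO, &ws)` returns true, which is never the case for ssh host abduco ... without -t.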

In general it might be best to always use ssh host -tt abduco ... for interactive sessions.

At some point I will have to review the signal handling code, SIGPIPE in the client should probably result in a detach.
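One common way to get that behaviour, sketched here under the assumption that the client writes to its output with write(2): ignore SIGPIPE so the failed write reports EPIPE, then treat EPIPE as "the other side is gone". The names `ignore_sigpipe` and `write_or_detach` are hypothetical and do not come from the abduco source.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

/* Ignore SIGPIPE so that writing to a closed peer fails with EPIPE
 * instead of delivering a signal. Returns 0 on success, -1 on failure. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/*
 * Hypothetical helper: perform one write(2) and set *peer_gone when the
 * reader has gone away. A client could react to that by detaching
 * instead of retrying the write.
 */
ssize_t write_or_detach(int fd, const void *buf, size_t len, bool *peer_gone)
{
    assert(peer_gone != NULL);
    ssize_t n = write(fd, buf, len);
    *peer_gone = (n == -1 && errno == EPIPE);
    return n;
}
```

After calling `ignore_sigpipe()` once at startup, the main loop can check `peer_gone` after each write and leave the loop when it is set.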

Dave Setchell · Answer 11 · Wed Jul 05 2017 23:33:22 GMT+0800 (China Standard Time)

I also have had some issues w/ dvtm running on and compiled on a grsec machine (alpine x86_64). Issues are both with display of vis/vim* and with locking when running commands like ls. The exact same build process on a non-grsec Linux (CentOS 7) gives a working dvtm. This is built w/ netbsd-curses btw.

I built w/ a cross compiler that I don't have a gdb built for yet. I'll update this issue when I've got more for you.

*vis/vim issues is essentially printing of non-printing chars. i.e. ^[ gets printed as .

Aelius · Answer 12 · Thu Jul 06 2017 23:04:11 GMT+0800 (China Standard Time)

fwiw I don't think I've had this issue in a very long time.

Aelius · Answer 13 · Mon Jul 24 2017 11:04:19 GMT+0800 (China Standard Time)

Nevermind, I had this issue twice this week.

mosh server -- abduco -A irc irssi

I log in to a blank screen. When I look at top, I can see abduco eats 100% of one CPU core.

Piotr Gaczkowski · Answer 14 · Wed Sep 27 2017 19:15:07 GMT+0800 (China Standard Time)

Hi! I'm trying to use abduco together with tmux. Although I believe this was not the original idea, I wanted to implement a scratchbook session that I can summon at any time.

When run from a terminal, abduco behaves as expected. But when I run tmux split-window "abduco -a tmux-scratch" the screen in the new window does not redraw and I have to force it with sleep 1 && tmux resize-pane -U && tmux resize-pane -D.

Is there a better way to handle this instead of this hackish workaround?

Marc André Tanner · Answer 15 · Sat Mar 17 2018 20:47:19 GMT+0800 (China Standard Time)

For those of you (@AeliusSaionji, @posativ, @glacambre) who experienced 100% CPU usage, could you please update to current git master HEAD and check whether it is still an issue?

EOF handling on the client side wasn't done properly which could lead to an infinite loop. See commit fdbda93, not sure whether it is the cause of the reported issue here, seems strange that I personally never experienced it.
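To illustrate the kind of bug meant here (a sketch only, not the code from that commit): read(2) returning 0 means end of file, and a loop that only checks for -1 will keep getting 0 from the same descriptor forever. The hypothetical helper `read_once` below classifies a single read so the caller can stop on EOF.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

enum read_status { READ_DATA, READ_RETRY, READ_EOF, READ_ERROR };

/*
 * Hypothetical helper: classify the result of one read(2) call.
 * cap must be greater than 0, otherwise a 0 return is ambiguous.
 * READ_EOF means the peer closed the connection; the caller must stop
 * watching fd, or select(2) will report it readable again immediately.
 */
enum read_status read_once(int fd, void *buf, size_t cap, size_t *got)
{
    assert(got != NULL);
    ssize_t n = read(fd, buf, cap);
    if (n > 0) {
        *got = (size_t)n;
        return READ_DATA;
    }
    *got = 0;
    if (n == 0)
        return READ_EOF;
    if (errno == EAGAIN || errno == EINTR)
        return READ_RETRY;          /* transient, try again later */
    return READ_ERROR;
}
```

A main loop would treat READ_EOF and READ_ERROR as reasons to leave the loop, and only go back to select(2) on READ_DATA and READ_RETRY.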

Aelius · Answer 16 · Sun Apr 08 2018 06:40:30 GMT+0800 (China Standard Time)

I have updated to the latest git master, but iirc I stopped experiencing the bug ~~when I switched from irssi to weechat~~. I recently gave irssi 1.0 a try, and I do not recall encountering the bug with irssi 1.0.

edit- or maybe it's because I stopped using ssh -t and started using mosh. Maybe something else. All I can say is I haven't had the issue in a while.

Marc André Tanner · Answer 17 · Wed May 16 2018 17:13:19 GMT+0800 (China Standard Time)

I have been using ssh ... -t abduco ... for the last couple of weeks and it seems to work fine for me with latest git master. I'm going to close this issue for now. Feel free to comment/re-open if you encounter problematic behaviour.

More generally I would like to eventually tag another release. If anyone has pending issues, now would be a good time to report them.

Gabriel Pettier · Answer 18 · Wed Mar 01 2023 00:20:59 GMT+0800 (China Standard Time)

I recently started using abduco on a CentOS 7 machine (abduco 0.6, installed from the repository), and I get the same issue: blank screen, 100% CPU on the session. It usually happens when connecting back, but I'm pretty sure I hit it before disconnecting earlier today: everything was stuck, so I killed my terminal (kitty), tried connecting again, and got the blank screen. Since 0.6 is the latest release, I'll try running the latest git version.

irisjae · Answer 19 · Sun Oct 29 2023 02:33:39 GMT+0800 (China Standard Time)

Last I debugged it with strace, I believe I saw abduco continuously get caught in EAGAIN

write(3, "\0\0\0\0\377\7\0\0\33[52;1H                 "..., 2055) = -1 EAGAIN (Resource temporarily unavailable)

I didn't pursue it further at the time, but I ran into it again recently. I'll try to put a patch that waits for a short period of time instead of spinning, and see if it resolves the issue.

irisjae · Answer 20 · Fri Nov 10 2023 16:19:59 GMT+0800 (China Standard Time)

Small update: I've managed to reliably reproduce the hanging problem when I access abduco from ssh, make the terminal output a lot of stuff (like while true; do echo a lot of stuff; done), and attempt to switch my VPN location (probably switching any network will do). I've attempted to simply sleep on EAGAIN, and now abduco no longer blindly spins (no longer consuming 100% of CPU), but it still gets caught in the EAGAIN loop after sleeping.

I'll investigate this a bit more later.
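A possible direction, sketched under assumptions rather than taken from abduco: on EAGAIN, wait in poll(2) for the descriptor to become writable, and give up after a timeout so a peer that never drains its buffer can be dropped instead of retried forever. `set_nonblocking` and `write_all_timeout` are hypothetical names, and the timeout policy is an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <unistd.h>

/* Put fd into non-blocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/*
 * Hypothetical helper: write all of buf to a non-blocking fd.
 * On EAGAIN it sleeps in poll(2) instead of retrying in a tight loop.
 * Returns 0 on success. Returns -1 on a write error, or with
 * errno = ETIMEDOUT if fd stays unwritable for timeout_ms, so the
 * caller can drop the stuck client. The timeout restarts after every
 * successful partial write. SIGPIPE is assumed to be ignored or handled.
 */
int write_all_timeout(int fd, const char *buf, size_t len, int timeout_ms)
{
    assert(buf != NULL || len == 0);
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n > 0) {
            buf += n;
            len -= (size_t)n;
            continue;
        }
        if (n == -1 && errno == EINTR)
            continue;
        if (n == -1 && errno != EAGAIN)
            return -1;              /* real error, e.g. EPIPE */

        /* Kernel buffer is full: sleep until it drains or we time out. */
        struct pollfd pfd = { .fd = fd, .events = POLLOUT };
        int ready = poll(&pfd, 1, timeout_ms);
        if (ready == 0) {
            errno = ETIMEDOUT;
            return -1;
        }
        if (ready == -1 && errno != EINTR)
            return -1;
    }
    return 0;
}
```

Sleeping alone only lowers CPU usage; the timeout is what lets the caller break out when the other end is gone for good.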

irisjae · Answer 21 · Sun Dec 10 2023 12:44:00 GMT+0800 (China Standard Time)

By the way, if you happen to be stuck in abduco on this issue, try to find the stuck abduco -a ... process left-over from when the connection was abruptly severed. In my experience, once you kill that process, the daemon process abduco -c ... will stop consuming 100% CPU and accept new connections normally.