Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

use_pty: after killing an SSH connection some application are still running #367

Closed
DispatchCode opened this issue Apr 5, 2024 · 12 comments

Comments

@DispatchCode
Copy link

Few months ago we discovered a bug that can be reproduced using Defaults use_pty while connected to a machine using SSH.
The bug was "more" related, and here you can find the report: more: exit if POLLHUP or POLLERR on stdin is received.

Now we are facing - more or less - the same problem but with tail. In short, you can reproduce the issue doing:

  1. be sure to use Defaults use_pty (that is now the default)
  2. sudo su -
  3. tail -f /var/log/messages
  4. kill the SSH connection

You will see the tail still running after that. strace produced this output:

strace: Process 2051 attached
select(5, [4], NULL, NULL, NULL)        = 1 (in [4])
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 34) = 16
fstat(3, {st_mode=S_IFREG|0640, st_size=2031533, ...}) = 0
read(3, "2024-02-21T10:51:06.467247+01:00"..., 8192) = 215
write(1, "2024-02-21T10:51:06.467247+01:00"..., 215) = -1 EIO (Errore di input/output)
read(3, "", 8192)                       = 0
select(5, [4], NULL, NULL, NULL)        = 1 (in [4])
read(4, "\1\0\0\0\2\0\0\0\0\0\0\0\0\0\0\0", 34) = 16
fstat(3, {st_mode=S_IFREG|0640, st_size=2031648, ...}) = 0
read(3, "2024-02-21T10:51:06.477515+01:00"..., 8192) = 115
write(1, "2024-02-21T10:51:06.477515+01:00"..., 115) = -1 EIO (Errore di input/output)
read(3, "", 8192)                       = 0

and it never ends. Basically tail keep reading from the file.

Considering that they are not the only tasks with a similar problem, can be this caused by sudo, specifically, the option use_pty ?

The use of KillUserProcesses=yes is not an option.

Thank you!

@millert
Copy link
Collaborator

millert commented Apr 5, 2024

You always need to check for POLLHUP when polling input since it is possible to get POLLHUP without POLLIN also being set. This is different from select() where readfds is set for both input and EOF conditions.

I'm curious why you would only see this with sudo and use_pty, though. Which version of sudo do you see this with?

@DispatchCode
Copy link
Author

Hi @millert , thanks for your reply. I'll report the informations in our internal ticket.

BTW, months ago I manage to ran many tests trying to reproduce the problem with more running sudo version 1.9.12p1 (same used by the customer).
With this last case (where "tail" is involved) I reproduced the problem with "sudo: 1.9.9".

I can confirm that after:

  • comment Defaults use_pty using visudo (and also reboot, just in case)
  • sudo su -
  • tail -f /var/log/messages

The session is correctly killed. Enabling use_pty, tail is still running.

I think is the behavior of use_pty after sudo su - that cause this issue... or better, I guess so. I noticed that if I'm logged in with two users (user and user that became root) I can see 3 users. This happen obviously with or without use_pty.

But, without use_pty if you kill the SSH connection (ALT+F4, directly) you can see 2 users connected; with use_pty the users are still 3.
This is what I can see after the ALT+F4 on the other shell:

UID        PID  PPID  PGID   SID  C STIME TTY          TIME CMD
root      2049  1988  2049  1986  0 09:56 ?        00:00:00 tail -f /var/log/messages

I suspect that there are no many reports just because in that case tail consume 0% of CPU. Indeed, we have 1 report for tail but 4, about more (because it reach 100% of CPU).

Anyhow, maybe you can find this interesting: this situation happen also only if you do sudo su - because on a "normal" root, the session is killed and tail (more, is the same) is not running. Still, if you run this with sudo tail -f /var/log/messages, the issue cannot be reproduced. You can only with tail on a user that is root, after sudo su -

@DispatchCode
Copy link
Author

Hi @millert just checking: you had the time to take a look to my previous comment? Anything you wanna share?

Thanks!

@millert
Copy link
Collaborator

millert commented Apr 18, 2024

When tail is running directly via sudo and the session is closed, sudo will receive SIGHUP from the kernel due to the tty going away, which it forwards to the command running in the other pty. So that is probably what is killing tail in the normal case. This also works when sudo runs the shell directly, for instance:

  • sudo bash --login
  • tail -f /var/log/messages

In this case, when the session is closed, sudo sends SIGHUP to bash which then kills the command.

However, this doesn't seem to be the case with "sudo su -" and, at least on Ubuntu, I see the sudo process is still running after the session is closed, probably because it is waiting for the command to exit.

gao-yan added a commit to gao-yan/pacemaker that referenced this issue Apr 23, 2024
Previously when crm_mon was running in console mode, if the attached
pseudo-terminal got lost, crm_mon would persist in the background and
raise the CPU usage to 100%.

The situation triggers if `use_pty` is enabled for `sudo`, in which case
it creates a separate pseudo-terminal device for its child process.

A producer:

- Enable `use_pty` by adding `Defaults use_pty` to /etc/sudoers

- Open a console and execute:
  > sudo su -
  # crm_mon

- Open another console, find the PID of crm_mon's bash parent and kill
  it:
  # kill -9 $(ps -C crm_mon -o ppid=)

The pty device created by `sudo` from the first console is basically
deleted, but crm_mon continues running in the background and raises the
CPU usage.

This commit fixes it by watching more conditions from stdin and exiting upon
(G_IO_ERR | G_IO_HUP).

The similar was reported and fixed for the `more` command:
util-linux/util-linux#2635
util-linux/util-linux#2795

The `tail` command is also impacted but hasn't been fixed so far.

There's the relevant discussion here:
sudo-project/sudo#367
gao-yan added a commit to gao-yan/pacemaker that referenced this issue Apr 23, 2024
Previously when crm_mon was running in console mode, if the attached
pseudo-terminal got lost, crm_mon would persist in the background and
raise the CPU usage to 100%.

The situation triggers if `use_pty` is enabled for `sudo`, in which case
it creates a separate pseudo-terminal device for its child process.

A producer:

- Enable `use_pty` by adding `Defaults use_pty` to /etc/sudoers

- Open a console and execute:
  > sudo su -
  # crm_mon

- Open another console, find the PID of crm_mon's bash parent and kill
  it:
  # kill -9 $(ps -C crm_mon -o ppid=)

The pty device created by `sudo` from the first console is basically
deleted, but crm_mon continues running in the background and raises the
CPU usage.

This commit fixes it by watching more conditions from stdin and exiting upon
(G_IO_ERR | G_IO_HUP).

The similar was reported and fixed for the `more` command:
util-linux/util-linux#2635
util-linux/util-linux#2795

The `tail` command is also impacted but hasn't been fixed so far.

There's the relevant discussion here:
sudo-project/sudo#367
gao-yan added a commit to gao-yan/pacemaker that referenced this issue Apr 23, 2024
Previously when crm_mon was running in console mode, if the attached
pseudo-terminal got lost, crm_mon would persist in the background and
raise the CPU usage to 100%.

The situation triggers if `use_pty` is enabled for `sudo`, in which case
it creates a separate pseudo-terminal device for its child process.

A producer:

- Enable `use_pty` by adding `Defaults use_pty` to /etc/sudoers

- Open a console and execute:
  > sudo su -
  # crm_mon

- Open another console, find the PID of crm_mon's bash parent and kill
  it:
  # kill -9 $(ps -C crm_mon -o ppid=)

The pty device created by `sudo` from the first console is basically
deleted, but crm_mon continues running in the background and raises the
CPU usage.

This commit fixes it by watching more conditions from stdin and exiting upon
(G_IO_ERR | G_IO_HUP).

The similar was reported and fixed for the `more` command:
util-linux/util-linux#2635
util-linux/util-linux#2795

The `tail` command is also impacted but hasn't been fixed so far.

There's the relevant discussion here:
sudo-project/sudo#367

Fixes T16
millert added a commit that referenced this issue Apr 28, 2024
Previously, we would simply close the pty leader in the main sudo
process.  This had the effect of revoking the pty, but the foreground
process would not necessarily receive SIGHUP.  By using TIOCNOTTY
in the monitor, the running command has a better chance of getting
SIGHUP.  Once the monitor has revoked the pty, the main sudo process
will close the pty leader, invalidating the pty.  GitHub issue #367.
@millert
Copy link
Collaborator

millert commented Apr 28, 2024

I just committed a change that should fix the problem of running "sudo su -" and then killing the session. Sudo will now try to revoke the terminal using the TIOCNOTTY ioctl before closing the pseudo-terminal. On Linux, this will cause the foreground process to receive SIGHUP, even if it is in a different process group that the original command (in this case su). In my testing that causes the tail command to exit, as well as the bash and su processes.

@DispatchCode
Copy link
Author

Thank you for this change! I'll look tomorrow and I will try your patch.

millert added a commit that referenced this issue Apr 28, 2024
We don't need to revoke the terminal in the monitor, just signal
the foreground process group.  This is more portable and has the
same effect as ioctl(TIOCNOTTY) would on Linux.  Since we now signal
the command from the monitor, there is no reason to forward SIGHUP
from the kernel.  GitHub issue #367.
@DispatchCode
Copy link
Author

Hi, sorry for my delay in respondig. I didn't received feedback from the support till now.

Anyhow, thank you for the patch! It works in that scenario.

Can be achived the same killing directly "su -"?
I have a shell where I launched:

sudo su -
tail -f /var/log/messages

Then from another shell:

germ335:~ # ps aux | grep tail
root      1066  0.0  0.0   2940   736 pts/2    S+   10:12   0:00 tail -f /var/log/messages
root      1069  0.0  0.0   8212   812 pts/0    S+   10:13   0:00 grep --color=auto tail

germ335:~ # ps -ft pts/2
UID        PID  PPID  C STIME TTY          TIME CMD
root      1010  1009  0 10:12 pts/2    00:00:00 sudo su -
root      1011  1010  0 10:12 pts/2    00:00:00 su -
root      1012  1011  0 10:12 pts/2    00:00:00 -bash
root      1066  1012  0 10:12 pts/2    00:00:00 tail -f /var/log/messages

germ335:~ # kill -9 1011

germ335:~ # ps aux | grep tail
root      1066  0.0  0.0   2940   736 ?        S    10:12   0:00 tail -f /var/log/messages

As you can see tail is still running.

Thanks!

@millert
Copy link
Collaborator

millert commented May 18, 2024

I see the same behavior without sudo. For example:

$ su -
Password:
root@linux-build:~# tail -f /var/log/syslog
...

From another terminal:

$ ps auxwwg | grep su
root 1657018 0.3 0.0 29964 4736 pts/0 S 08:44 0:00 su -
millert 1657034 0.0 0.0 6624 2048 pts/1 S+ 08:44 0:00 grep su

$ sudo kill -9 1657018
$ ps auxwwg | grep tail
root 1657030 0.0 0.0 5680 1920 pts/0 S 08:44 0:00 tail -f /var/log/syslog
millert 1657039 0.0 0.0 6624 2176 pts/1 S+ 08:44 0:00 grep tail

This was on Ubuntu but I'd expect it to be similar on other distros.

When running a command in a pty, sudo changes the controlling terminal to itself after the command exits before sending the exit status to the parent sudo process. This prevents the kernel from sending SIGHUP to commands spawned by the sudo-run command when use_pty is enabled to better match the behavior when use_pty is disabled.

This seems like the correct behavior to me.

millert added a commit that referenced this issue May 19, 2024
There's no need to send messages back and forth to the monitor
when the main process can just do it.  GitHub issue #367.
@DispatchCode
Copy link
Author

DispatchCode commented May 20, 2024

I see the same behavior without sudo. For example:

Yes, I don't know why I added sudo...
The problem reported is without sudo, but only with "su -".

Thanks for your reply, I'll look at your latest commit referenced here.

@DispatchCode
Copy link
Author

$ ps auxwwg | grep su root 1657018 0.3 0.0 29964 4736 pts/0 S 08:44 0:00 su - millert 1657034 0.0 0.0 6624 2048 pts/1 S+ 08:44 0:00 grep su

$ sudo kill -9 1657018 $ ps auxwwg | grep tail root 1657030 0.0 0.0 5680 1920 pts/0 S 08:44 0:00 tail -f /var/log/syslog millert 1657039 0.0 0.0 6624 2176 pts/1 S+ 08:44 0:00 grep tail

BTW, I can confirm that this is exactly the issue.

@millert
Copy link
Collaborator

millert commented May 20, 2024

If you kill su with SIGKILL there is no way for it to forward the signal to the child process and no notification for the child that the parent has exited. Linux su will forward some signals, such as SIGTERM to the child, but SIGKILL is not catchable.

@DispatchCode
Copy link
Author

I think we are free to close, if something else comes out I'll open another issue.

Thank you for all your help, really appreciated!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants