Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing real time signal prioritization #8899

Open
patacongo opened this issue Mar 24, 2023 · 11 comments
Open

Missing real time signal prioritization #8899

patacongo opened this issue Mar 24, 2023 · 11 comments
Labels
bug Something isn't working Standards NuttX application interfaces must compy to standards

Comments

@patacongo
Copy link
Contributor

patacongo commented Mar 24, 2023

Real Time Signals

Real time signals differ from standard signals in several different ways:

  • They are available for use by applications, never explicitly by the OS.
  • They are used for IPCs
  • The POSIX real time extensions that added real time signals specify special APIs that are intended for real time signals (we don't make this distinction at the interface level however).
  • Queuing of signal handling is optional for standard signals, it is required for real time signals.

But the most important difference and the one that NuttX does not address is:

  • Real time signals are prioritized.

Signal Prioritization

This is important for good real time performance of real time signals as an IPC.

SIGRTMIN is the number of the highest priority real time signal; Higher numbered signals have lower priority. Standard signals are unprioritized and, if queued, should be processed FIFO. Nuttx does queue signal handling, but all signals are unprioritized and handled FIFO.

POSIX does not specify the relative priority of standard signals and real time signals. Linux handles that in this way: All signals are prioritized. That is not quite consistent with POSIX and creates some odd prioritization (like CONFIG_SIG_USR1/2 are high priority than SIGTERM or SIGSTOP). It also means that all of the standard signals are higher in priority and any of the real time signals.

Adopting the Linux behavior is easy: The single FIFO queue becomes prioritized (but FIFO for pending signal handling of the same signal). Following POSIX is not much more difficult: We would need two queues: An unprioritized FIFO queue for the standard signals and a prioritized queue for the real time signals.

Nested Signal Handling

Currently, NuttX does not supported nested signal handlers. Here is where one-at-a-time signal handling is enforced for ARMv7M: https://github.com/apache/nuttx/blob/master/arch/arm/src/armv7-m/arm_schedulesigaction.c#L88

/* Refuse to handle nested signal actions */

if (tcb->xcp.sigdeliver == NULL)
  {

Interrupting one signal handler to process another implies some priority handling of signals. Why would you interrupt one handler to run another versus letting the current handler run to completion first.

If prioritized signal handling is implemented, then we should also reconsider this. It might not be a bad thing to permit a higher priority signal handler to interrupt a lower priority signal handler.

@patacongo patacongo added bug Something isn't working Standards NuttX application interfaces must compy to standards labels Mar 24, 2023
@pkarashchenko
Copy link
Contributor

pkarashchenko commented Mar 25, 2023

I have noticed that there are few more items to be checked:

  • The default action for a real-time signal is to terminate the receiving process -- I'm not sure that we have this
  • Real-time signals are queued to the receiving process. This applies even when the same real-time signal is received multiple times. In contrast, if a standard signal is blocked and multiple instances of it are to be delivered to a process, only one becomes pending and the rest are discarded. -- I think we can miss real-time signal if sigwait is called after signal is sent and signal is not blocked (sighold() is not called).

Should we have a separate issue for this?

@pkarashchenko
Copy link
Contributor

@patacongo should RT signals always go via sigqueue? Seems like they should

@pkarashchenko
Copy link
Contributor

I'm getting even more lost. Seems like sigqueue is not implemented:

/****************************************************************************
 * Name: nxsig_queue
 *
 * Description:
 *   This function sends the signal specified by signo with the signal
 *   parameter value to the process specified by pid.
 *
 *   If the receiving process has the signal blocked via the sigprocmask,
 *   the signal will pend until it is unmasked. Only one pending signal (per
 *   signo) is retained.  This is consistent with POSIX which states, "If
 *   a subsequent occurrence of a pending signal is generated, it is
 *   implementation defined as to whether the signal is delivered more than
 *   once.

@pkarashchenko
Copy link
Contributor

Sorry. Not a sigqueue, but a queue of signalling

@pkarashchenko
Copy link
Contributor

Seems I mix-up the realtime signals queue with pending signals queue. I will go and read specs to get things clarified

@pkarashchenko
Copy link
Contributor

Seems like signal queue and pending queue are the same, so we just need to implement separate logic for realtime signals that will allow multiple signals to be pushed to the pending queue.

But there is still an additional question is about should realtime signals always be passed via pending queue or only if the signal is blocked. Passing always via queue has a benefit that even if the realtime signal was published before consumer calls sigwait it will be delivered. On the other hand user has full control on the path by masking a signal during initialisation

@patacongo
Copy link
Contributor Author

patacongo commented Mar 26, 2023

Hi, @pkarashchenko . I just noticed the questions. Sorry, I don't subscribe to notifications unless the specifically address me by my username. I am not sure I can answer all of these questions as asked. Let me instead talk about the queuing logic itself. I am a little rusty with this and need the refresher anyway:

Signals are not queued. That is sloppy English. Most signal handling occurs immediately when the signal is received. For example, if a thread is waiting for a signal, then the thread will be awaken immediately upon receipt of the signal.

What is really queued is just the signal actions. Only one signal handler can run at once once for a given task. The queue that hold the pending signal actions is called sigpendactionq and the head of this list is in the thread's TCB.

A pending signal action is added to sigpendactionq in nxsig_tcbdispatch() if a received signal is blocked.

The queue is emptied by a loop in nxsig_deliver() which loops, running signal handlers until the queue is empty. It calls up_signal_dispatch which sets up signal handler to run via the trampoline arm_sigdeliver() for arm. When the signal handler returns, this function returns to nxsig_deliver() which will process the next queued signal handler.

The first question would be: When POSIX talks about queing signals, are they are referring to the same thing?

To emulate Linux behavior, we would simply replace the FIFO sigpendactionq with a prioritized queue. The priority would be based on the signal number; lower signal numbers have higher priority. Maybe the priority would be SIGRTMAX - signal_number? In this case, as it is in Linux, the lowered standard signals would all have priority of any real time signal.

And alternative would be to have two queues one prioritized like this just for real time signals and an non-prioritized FIFO for the standard signals. Which would be higher priority? It would seem like real time signals should have higher priority for better real timer performance, but higher priority standard signals would behave more like Linux.

Feedback/thoughts/disagreements welcome.

@patacongo
Copy link
Contributor Author

patacongo commented Mar 26, 2023

I have noticed that there are few more items to be checked:

* `The default action for a real-time signal is to terminate the receiving process` -- I'm not sure that we have this

That could be a good feature in some case. However, that means that the sigaction must always be set even if it is not used. For example if the signal only wakes up a sleep. The default action is SIG_DFL, at a minimum than it would have to be set to SIG_IGN.

* `Real-time signals are queued to the receiving process. This applies even when the same real-time signal is received multiple times. In contrast, if a standard signal is blocked and multiple instances of it are to be delivered to a process, only one becomes pending and the rest are discarded.` -- I think we can miss real-time signal if `sigwait` is called after signal is sent and signal is not blocked (`sighold()` is not called).

That is confusing. It could be the same real time signal or any other signal: In any case, only a single signal handler can run at at time; a signal handler cannot be interrupted by another signal handler. So any signal handling action received while a signal handler is running has to pend in a queue.

Blocked signals are always queued. The signal handler can run when the signal is unblocked.

Multiple standard signals with the same number can be queued, can't they??? I would need a POSIX reference to understand why you say that. Signals that terminate the task will, of course cause all pending signals to be discarded. But things like SIGCHLD, the death-of-a-child signal could happen several times if child tasks exit synchronously. I think you not want to lose those events.

The last time I looked, POSIX said the queuing of standard signals is optional. But again, we should probably clarify what POSIX means by queuing. In NuttX there is one queue for all pending signal actions; you seem to be implying a separate queue for each signal.

Should we have a separate issue for this?

Up to you. We can fold that into this issue if you like; or you can create a new issue.

@patacongo
Copy link
Contributor Author

  • The default action for a real-time signal is to terminate the receiving process -- I'm not sure that we have this

No we don't. The default actions are setup and handled in sched/signal/sig_default.c. It currently only addresses standard signals. Notice that is also supports "turning off" the default actions for many signals for which the default is abnormal termination. For example:

132 #ifdef CONFIG_SIG_SIGALRM_ACTION
133   { SIGALRM,   0,                nxsig_abnormal_termination },
134   { SIGVTALRM, 0,                nxsig_abnormal_termination },
135 #endif

witch is selected in sched/Kconfig:

1402 config SIG_SIGALRM_ACTION
1403         bool "SIGALRM SIGVTALRM"
1404         default n
1405         ---help---
1406                 Enable the default action for SIGALRM AND SIGVTALRM(terminate the task)
1407                 Make sure that your applications are expecting this POSIX behavior.
1408                 Backward compatible behavior would require that the application use
1409                 sigaction() to ignore SIGALRM.
1410

So, by default, the default action is turned off. No defconfig file currently enables any default action for any of the signals where the option is supported.

@patacongo
Copy link
Contributor Author

patacongo commented Mar 27, 2023

This is kind of old, but might still have useful information about the flow of signal processing (it pre-dates default signal actions): https://cwiki.apache.org/confluence/display/NUTTX/Signal+Handler+Tour

Handling of signals in multi-threaded tasks can be complex. Any thread that does not have the signal blocked via pthread_sigprocmask can be used to run the handler for that signal. Signals are sent to task groups and not to threads.

There are two two queues for pending action queues, there is one in the task group structure. All signals that are received are placed in the queue the moment that they are received: https://github.com/apache/nuttx/blob/master/include/nuttx/sched.h#L491

This one is a holding area for signals that are received but not yet dispatched. There is another pending signal action queue in each thread's TCB: https://github.com/apache/nuttx/blob/master/include/nuttx/sched.h#L617
When the signal action is dispatched, a container holding the action is added to another queue in the TCB of the task that will handle the signal: https://github.com/apache/nuttx/blob/master/sched/signal/sig_dispatch.c#L108

The queue in the group determines the order in which signals received are dispatched to different threads. It should prioritized but is currently FIFO. This queue determines the order in which signals are dispatched for one thread. This TCB queue should also be prioritized.

NOTE. A potential problem is that the decision of which thread will get the signal is made before signal action is queued in the TCB: https://github.com/apache/nuttx/blob/master/sched/signal/sig_dispatch.c#L584 Things could change when the signal is eventually processed. This is probably not an issue in a healthy system that receives signals at a low rate on each thread.

The queue is emptied by a loop in nxsig_deliver() which loops, running signal handlers until the queue is empty: https://github.com/apache/nuttx/blob/master/sched/signal/sig_deliver.c#L89 It calls up_signal_dispatch which sets up signal handler to run via the trampoline arm_sigdeliver() for arm. When the signal handler returns, this function returns to nxsig_deliver() which will free the entry from the pending action process the next queued signal handler: https://github.com/apache/nuttx/blob/master/sched/signal/sig_deliver.c#L206

The signal is blocked or a thread when each signal handler runs: https://github.com/apache/nuttx/blob/master/sched/signal/sig_deliver.c#L120 and unblocked when the signal handler returns: https://github.com/apache/nuttx/blob/master/sched/signal/sig_deliver.c#L190

@patacongo
Copy link
Contributor Author

@pkarashchenko @xiaoxiang781216 I think will have some time to work on the implementation of this in the next few days. Based on my summary above, I would do a very small change:

  1. Change the way that signal information is added to signal queues so that signals are enqueued in priority order (lower signal number has higher priority). These queues include
  1. There would be no change how signals are removed: The next signal removed from the head of either pending signal list will be the highest pending signal in the list.

  2. This probably involves creation of custom list insertion routines probably similar to nxsched_add_prioritized(). Two such routines may be required since the underlying form of the queue information is different.

How compliant is this with the POSIX requirements? I think it is compliant but does implement some optional, unspecified behaviors:

  1. Standard signals may or may not be queued. POSIX neither requires nor forbids this to my knowledge.
  2. Similarly, POSIX does not require standard signals to be prioritized, nor does it forbid them from being prioritized.

I like this solution because:

  1. It is very simple and lower risk to implement. It does not require any additional data structures or conditional logic for standard and real time signals. There is no change to the behavior except that signals are enqueued in priority order
  2. The prioritization is identical to the way that signals prioritization is done in Linux. In Linux:

Delivery priority

Lower number signals are given priority over higher number signals. In Linux this is true across the range of all signals, which incidentally gives the regular signals higher priority than all real-time signals (POSIX says that priority of real-time versus ordinary signals is unspecified, and only the realtime signals have a specified priority relative to each other).

From http:https://davmac.org/davpage/linux/rtsignals.html

  1. But uses the legacy queuing of standard signals, i.e., all signals are queued. Linux queues standard signals but merges signals if multiple instances of the same signal is received:

Signal queues

One of the primary differences between normal signals and real-time signals is that the real-time signals are queued. That is, for every real-time signal successfully sent, the process signal handler will be called (assuming that a handler has been established). Normal signals on the other hand will be merged if sent to a process which already has that signal pending, as explained in the libc manual.

From http:https://davmac.org/davpage/linux/rtsignals.html

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Standards NuttX application interfaces must compy to standards
Projects
None yet
Development

No branches or pull requests

2 participants