Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kernel] 被动挂起的线程,无法处理信号 #7663

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

weidongshan
Copy link

拉取/合并请求描述:(PR description)

[
如果一个线程不是调用rt_thread_delay等函数主动进入挂起状态,不是在rt_schedule函数中放弃运行,
那么其他线程给它发信号时,它是不会执行信号处理函数的。
这个Bug的原因在于:在_signal_deliver函数中,发现目标线程处于挂起状态时,
只做了3件事:把它唤醒、设置线程状态为RT_THREAD_STAT_SIGNAL_PENDING、调用rt_schedule。
我们期望目标线程被唤醒后,从rt_schedule继续运行时,发现线程状态为RT_THREAD_STAT_SIGNAL_PENDING时调用信号处理函数。
但是,目标线程挂起时,不一定是在rt_schedule函数中放弃运行,比如它可能是被其他线程调用rt_thread_suspend挂起的。
这时候,目标线程不会执行信号处理函数。

为什么提交这份PR (why to submit this PR)

你的解决方案是什么 (what is your solution)

修改了src/signal.c,即使目标线程处于挂起状态,也是去修改它的栈。

在什么测试环境下测试通过 (what is the test environment)

可以如下复现这个Bug:创建2个线程,让thread2挂起thread1,然后给thread1发信号。示例代码如下:
`static void thread1_entry(void *parameter)
{
const char *thread_name = parameter;
int cnt = 0;

/* 打印线程的信息 */
rt_kprintf("%s run ...\r\n", thread_name);

/* 安装信号,自定义处理函数 */
rt_signal_install(SIGUSR1, thread1_signal_handler);

/* 解除阻塞 */
rt_signal_unmask(SIGUSR1);

while(1)
{		
    rt_kprintf("%s %d\r\n", thread_name, cnt++);
}

}

/* 线程2的入口函数 */
static void thread2_entry(void *parameter)
{
const char *thread_name = parameter;
int cnt = 0;

/* 打印线程的信息 */
rt_kprintf("%s run ...\r\n", thread_name);

rt_thread_mdelay(100);	

while(1)
{		
    rt_kprintf("%s %d\r\n", thread_name, cnt++);
    rt_thread_suspend(thread1);    
    rt_thread_kill(thread1, SIGUSR1); //向线程1发送信号SIGUSR1
}

}`

]

当前拉取/合并请求的状态 Intent for your PR

必须选择一项 Choose one (Mandatory):

  • 本拉取/合并请求是一个草稿版本 This PR is for a code-review and is intended to get feedback
  • 本拉取/合并请求是一个成熟版本 This PR is mature, and ready to be integrated into the repo

代码质量 Code Quality:

我在这个拉取/合并请求中已经考虑了 As part of this pull request, I've considered the following:

  • 已经仔细查看过代码改动的对比 Already check the difference between PR and old code
  • 代码风格正确,包括缩进空格,命名及其他风格 Style guide is adhered to, including spacing, naming and other styles
  • 没有垃圾代码,代码尽量精简,不包含#if 0代码,不包含已经被注释了的代码 All redundant code is removed and cleaned up
  • 所有变更均有原因及合理的,并且不会影响到其他软件组件代码或BSP All modifications are justified and not affect other components or BSP
  • 对难懂代码均提供对应的注释 I've commented appropriately where code is tricky
  • 代码是高质量的 Code in this PR is of high quality
  • 已经使用formatting等源码格式化工具确保格式符合RT-Thread代码规范 This PR complies with RT-Thread code specification

@CLAassistant
Copy link

CLAassistant commented Jun 13, 2023

CLA assistant check
All committers have signed the CLA.

@supperthomas supperthomas changed the title 被动挂起的线程,无法处理信号 [kernel] 被动挂起的线程,无法处理信号 Jun 14, 2023
@polarvid
Copy link
Contributor

polarvid commented Jul 28, 2023

有个题外话。导致这个问题的直接原因 —— 调用 rt_thread_suspend() 挂起其它线程。从 API 的使用场景来说应该是滥用了吧?这个实现可能最初假定就是在 rt_schedule() 恢复运行时去做这些逻辑。因此我有点好奇这种不是主动挂起的场景是否具有合理性呢?

rt-thread/src/thread.c

Lines 966 to 982 in 307e9e5

/**
* @brief This function will suspend the specified thread and change it to suspend state.
*
* @note This function ONLY can suspend current thread itself.
* rt_thread_suspend(rt_thread_self());
*
* Do not use the rt_thread_suspend to suspend other threads. You have no way of knowing what code a
* thread is executing when you suspend it. If you suspend a thread while sharing a resouce with
* other threads and occupying this resouce, starvation can occur very easily.
*
* @param thread the thread to be suspended.
* @param suspend_flag status flag of the thread to be suspended.
*
* @return Return the operation status. If the return value is RT_EOK, the function is successfully executed.
* If the return value is any other values, it means this operation failed.
*/
rt_err_t rt_thread_suspend_with_flag(rt_thread_t thread, int suspend_flag)

Do not use the rt_thread_suspend to suspend other threads.

@weidongshan
Copy link
Author

有个题外话。导致这个问题的直接原因 —— 调用 rt_thread_suspend() 挂起其它线程。从 API 的使用场景来说应该是滥用了吧?这个实现可能最初假定就是在 rt_schedule() 恢复运行时去做这些逻辑。因此我有点好奇这种不是主动挂起的场景是否具有合理性呢?

rt-thread/src/thread.c

Lines 966 to 982 in 307e9e5

/**
* @brief This function will suspend the specified thread and change it to suspend state.
*
* @note This function ONLY can suspend current thread itself.
* rt_thread_suspend(rt_thread_self());
*
* Do not use the rt_thread_suspend to suspend other threads. You have no way of knowing what code a
* thread is executing when you suspend it. If you suspend a thread while sharing a resouce with
* other threads and occupying this resouce, starvation can occur very easily.
*
* @param thread the thread to be suspended.
* @param suspend_flag status flag of the thread to be suspended.
*
* @return Return the operation status. If the return value is RT_EOK, the function is successfully executed.
* If the return value is any other values, it means this operation failed.
*/
rt_err_t rt_thread_suspend_with_flag(rt_thread_t thread, int suspend_flag)

Do not use the rt_thread_suspend to suspend other threads.

使用rt_thread_suspend 暂停其他线程确实不是一个好的编程习惯,因为你不知道别的线程暂停时需要做什么额外的工作。比如要暂停一个音乐播放器,你使用rt_thread_suspend 的话,也许它会发出一个恒定不变的声音(比如使用PWM播放音乐时就会这样)。
这里提到的信号bug,也会出现在“主动阻塞的场景”里,比如thread1调用rt_thread_delay,然后thread2给thread1发信号:会唤醒thread1,但是thread1并不会执行信号函数。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants