-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sched: Unify the main thread and pthread behaivour #1187
Conversation
91f493f
to
1da698d
Compare
@patacongo could you give some feedback? so other committer can merge the code. |
12e2de2
to
e0273b5
Compare
ae2e14b
to
e1b03f8
Compare
Does anyone can help review the change? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this PR is complex.
can you consider to separate quick_exit into another PR?
sched/task/task_cancelpt.c
Outdated
* immediately, interrupting the thread with its processing." | ||
* | ||
* REVISIT: Does this rule apply to equally to both SIGKILL and | ||
* SIGINT? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't understand this REVISIT comment. how signals are related here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment is copied from sig_default, I will remove it.
@@ -85,6 +76,15 @@ int pthread_cancel(pthread_t thread) | |||
pthread_exit(PTHREAD_CANCELED); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this mean to make pthread_cancel a cancellation point?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behaviour is same as before, the change is just an optimizition. If you look at nxnotify_cancellation, the main job is call nxsem_wait_irq/nxsig_wait_irq/nxmq_wait_irq but the code unnecessary if some thread call pthread_cancel on self because the thread is impossible in the wait state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't think pthread_cancel(self) is supposed to terminate the thread in the case of deferred cancellation.
after your changes, it seems to do.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After re-reading the related document, you are right we need defer the termination even pthread_cancel self. I will drop this patch from PR.
sched/task/task.h
Outdated
****************************************************************************/ | ||
|
||
#define EXIT_NORMAL 0 | ||
#define EXIT_BYPASS 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you document what this bypasses?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok.
sched/task/task_delete.c
Outdated
* task group if this_task is a pthread. | ||
*/ | ||
|
||
group_kill_children(rtcb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this necessary?
why is this rtcb, not dtcb?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is this necessary?
I think that It's dangerous to terminate the main thread but leave other threads in the task keep running. And other similar function also do the same thing(e.g. task_restart, nxsig_abnormal_termination, exit...).
why is this rtcb, not dtcb?
good catch, I will fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is it dangerous? it's something i believe the most of pthread implementations allow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most people call task_delete mean to kill the whole task/process, just like you kill the process from task manager windows on Linux/Windows, but now this function just delete the main thread but let other threads in the same task keep running. And worse you can't kill other threads in this task by task_delete, then we don't have any API to terminate all threads in a task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR summary does not does not describe the changes of all of the 11 commits. This must be broken into separately traceable PRs.
Because nobody review my PR more than 7 days and some late patch depend on the previous one, I have to accumulate the patch in this PR. Anyway, I will remove the new patch from the list and send new PR when this PR get merged.
@@ -173,8 +165,7 @@ void pthread_cleanup_push(pthread_cleanup_t routine, FAR void *arg) | |||
*/ | |||
|
|||
sched_lock(); | |||
if ((tcb->cmn.flags & TCB_FLAG_TTYPE_MASK) == TCB_FLAG_TTYPE_PTHREAD && | |||
tcb->tos < CONFIG_PTHREAD_CLEANUP_STACKSIZE) | |||
if (tcb->tos < CONFIG_PTHREAD_CLEANUP_STACKSIZE) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we not bailing out here when called with a non pthread type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because main/kernel thread type is TCB_FLAG_TTYPE_TASK/TCB_FLAG_TTYPE_KERNEL, but both types should be able to call pthread_cleanup_push/pthread_cleanup_pop. The major target of whole patch set is ensure the main/kernel thread can call all most pthread API like pthread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because main/kernel thread type is TCB_FLAG_TTYPE_TASK/TCB_FLAG_TTYPE_KERNEL, but both types should be able to call pthread_cleanup_push/pthread_cleanup_pop. The major target of whole patch set is ensure the main/kernel thread can call all most pthread API like pthread.
But kernel threads cannot create pthreads. pthreads reside only in user space.
What would that even mean in a kernal build. It makes no sense. Which user space?
No pthread APIs should be availble to kernel threads. That must be forbidden by the OS. I believe that would result in severe, catastrophic breakage if permitted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't believe that use any pthread APIs should be permitted within the OS. Nor do I believe that the kernel thread TCB should carry the burden of the pthread data structures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tcb_s is shared by three entities(main task, kernel thread and pthread). To confirm POSIX standard, both main thread and pthread MUST support pthread_cleanup_[push|pop] and robust mutex. For kernel thread, it's not bad to support the cleanup stack. Yes, the robust mutex isn't need. But we only waste four byes for the late case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it is a totally unacceptable partitioning of user space and kernel interfaces. I am very opposed to supported this. It is wrong. It is a serious architectural mistake. Let's not go that way... it is very sickening design concept. Let's keep logic in place to prohibit the use of any pthread APIs from kernel threads. Please. Don't shit on the OS! We are entrusted you to maintain good functional partitioning and not to trash all of the interfaces.
I am super opposed to the concept. I am very disappointed in what you are doing to the OS. Very disappointed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My long term goals include:
- To separate kernel threads from all groups. To eliminate streams and file descriptors from strings,
- Move all pthread logic possible into user space as will all other Unix system.
You are ruining that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly I will never merge this code. I believe that it is an abminination to the concept of a properly modularized Unix system. But I leave that to the good sense of others if they want it or not. (God, I hope there is no one out there so foolish).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a shoddy, incorrect solution to an important issue. The correct solution that does not contaminate the kernel thread is significantly more complex. I cut corners and implemented a bad simple solution that is bad of the OS and must not be merged. You are doing one of the "Enemies of Inviolables" in INVIOLABLES.txt:
The Enemies
===========
No Short Cuts
-------------
o Doing things the easy way instead of the correct way.
o Reducing effort at the expense of Quality, Portability, or
Consistency.
o Focus on the values of the organization, not the values of the Open
Source project. Need to support both.
o It takes work to support the Inviolables. There are no shortcuts.
That is unacceptable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me it is a totally unacceptable partitioning of user space and kernel interfaces. I am very opposed to supported this. It is wrong. It is a serious architectural mistake. Let's not go that way... it is very sickening design concept. Let's keep logic in place to prohibit the use of any pthread APIs from kernel threads.
Ok, we can make the change like this:
DEBUGASSERT((tcb->flags & TCB_FLAG_TTYPE_MASK) !=
TCB_FLAG_TTYPE_KTHREAD);
or
if ((rtcb->flags & TCB_FLAG_TTYPE_MASK) != TCB_FLAG_TTYPE_KTHREAD)
{
...
}
So the kernel thread will fail or no-op for pthread API, but main/pthread can work correctly.
Please. Don't shit on the OS! We are entrusted you to maintain good functional partitioning and not to trash all of the interfaces.
I am super opposed to the concept. I am very disappointed in what you are doing to the OS. Very disappointed.
My long term goals include:
- To separate kernel threads from all groups. To eliminate streams and file descriptors from strings,
- Move all pthread logic possible into user space as will all other Unix system.
You are ruining that.
This change doesn't ruining your goal, but a path to your final target:
1.Unify the main and thread behaviour:
a.main thread can call pthread_cleanup_[push|pop]
b.robust mutex work with main thread too
c.pthread_exit in main doesn't terminate the whole process
b.cancellation point in main doesn't terminate the whole process
2.The next step is reorg tcb_s maybe like this:
tcb_s
|
+-------+--------+
| |
tcb_kernel_s tcb_user_s
|
+-------+--------+
| |
tcb_task_s tcb_pthread_s
3.Finally merge tls_info_s into tcb_user_s and place userspace tcb at the beginning of stack
You can see the first step help this path because it make main thread same as pthread which is always required.
Yes, it's better to review the patch one by one in commit tab.
It's too difficult because quick_exit depend the previous patch which clean up the exit logic. If need, I can remove quick_exit patch from the PR and send a new PR once this PR get merged. |
@xiaoxiang781216 why is this needed? Why did you take the approach of removing the union? Where are the test cases with all the permutations of the CONFIG_* that control the changes? |
In any event, this should not be merged until the 9.1 release has branched. This is far to risky to go into a release unverified. |
We are expecting to branch for 9.1 in 5 days, it should be reasonable to wait. |
+1 for the wait |
The PR summary does not does not describe the changes of all of the 11 commits. This must be broken into separately traceable PRs. |
You mean the union inside exitinfo_s, this struct is added by me in commit:
Now we just have ostest. I run the change with ostest, no problem found so far. Yes, the test case is very little. Can you improve in this area? |
I just ran into this myself, I was thinking dma preflight was lit is the some of the test cases it was not. So it did not break the build, but it also was never tested. |
Signed-off-by: Xiang Xiao <[email protected]> Change-Id: Ifefccda6cb7e2335e11976dcec74e308d64c7f5e
Signed-off-by: Xiang Xiao <[email protected]> Change-Id: I2d91154572805699237cfc028202021c8f8eee40
In general, I think these changes are pretty ill-conceived. But I have not looked at all of the details; that is just my impression. There are many good ideas mish-mashed together with some not so good ideas. I don't think it should go to master. It should be broken up into multiple PRs that can be reviewed, approved, and merged. It should never come into master in this form (and we still do not want to perturb the code base prior the branching off 9.1!). |
I already split the PR into small one, please give your comment and suggestion there. Thanks. |
uint8_t tos; | ||
struct pthread_cleanup_s stack[CONFIG_PTHREAD_CLEANUP_STACKSIZE]; | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't it be better to move these into TLS? They should only be used by user space applications and, hence, should be accessed in the same way that the errno variable is.
We should move as much as pthread logic as possible out of the operating system. Over time, the entire pthread implementation (other than a a few OS hooks) should be moved into the C library. That is a proper and standard roadmap for pthread support.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@patacongo I will provide a patch to move them to TLS when I finish upgrade libcxx to the latest version.
Summary
1.Make pthread_cleanup_[push|pop] callable from the main thread
2.Robust mutex also support in main thread now
Impact
Testing
nxstyle error can be ignored(mixed case and file rename):