-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
libc/semaphore:sem_init change defult protocol #5070
Conversation
44f2d72
to
9d14eb6
Compare
b61bc35
to
b48cac1
Compare
9638e8f
to
076bf25
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a massive breaking change. Please do not merge this until there is time after the holidays to get proper feed back from the reviews listed.
Yes, this is a partial change to show the issue and potential modification. If the community agree the fix after review, all [nx]sem_init need review and modify. |
This will also break many things for downstream projects using NuttX. We need to make sure the wider community notices this proposed change so it can be reviewed properly. I will post on the mailing list, but there are other venues, so to everyone frequenting them, please help draw attention to this there, too. To be clear, I am in favor of standards-compliance, provided this is in line with NuttX's Inviolables:
Also, if this change is implemented, we need to increment the major version number. It will have to be NuttX 11.x. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As mentioned elsewhere, there are likely other parts of our tree that need review/changes.
Regarding whether we should do this or not, I cannot offer an opinion until I re-read POSIX details regarding semaphores to see if this is correct per our Inviolable Principles.
Only when the project enable CONFIG_PRIORITY_INHERITANCE, otherwise no difference before and after apply this patch.
Sure.
As far as I know, POSIX never mention the priority inheritance in spec, so it's the special behavior implemented by NuttX to improve the real time capability. But, sem_init without nxsem_set_protocol(SEM_PRIO_NONE) activate the priority inheritance by default which make the follow simple POSIX compliant program crash after the second sem_post:
So, it's very dangerous to enable priority inheritance without explicit indicator(e.g. a function call or an initial flag).
Of course.
Sure.
Ok, let's wait your research. |
Drivers that use the semaphores for signaling should not be involved with priority inheritance (when enabled). That use case was assumed to be the non default. It seams to me this change can be done with adding an Kconfig option for the default and leave it as it was. But allow the default to be changed if someone is using Linux code semantics. |
Why non default? semaphore use as signaling is a normal practice in POSIX. On the other hand, most program use POSIX pthread mutex as lock, so semaphore seldom use as lock except NuttX kernel.
Kconfig is good to disable/enable the optional feature, but it isn't a good solution here. Do you want sem_t change it's behavior by an option? And all place which call sem_init/nxsem_init need check the option and call nxsem_set_protocol with the different value.
First, it isn't Linux semantics. Second, my demo show that isn't a minor issue because the program will crash the system. |
Seems that disabling priority inheritance by default is the right way of doing things. My arguments are that by default priority inheritance is used to struggle with resource guarding issue (mutual exclusion use case) and does not have much sense in case of event signaling use case. The POSIX semaphores however have have more wide range then mutual exclusion use case and it is more a coincidence that |
After successfully sem_wait, the task becomes the semphore's holder. Then the task exit, but not be removed from the semphore's holder list, so it will crash if visit this holder in the future.
|
@zhaoxiu-zeng how do you plan to fix this? I was looking into the code as well and was looking for a solution, but didn't found a lightweight solution. The code only idea that I had was to create a global list of semaphores that have at least one holder and iterate that list on task exit in order to find a semaphore that contain task in the holder list or create a list of semaphores in task control block and add semaphore to that list when task is added as a semaphore holder. But both seems to me quite "heavy" in terms of both memory and performance. |
@anjiahao1 has a simple fix which change tcb_s to pid, so we can detect the abnormal case and remove the orphan node easily. But, we want to fix the root cause first, and then add the safe guard in case of the wrong usage. |
@xiaoxiang781216 changing |
this is the last patchset to fix the default value of priority inheritance, please review them, @hartmannathan @davids5 @masayuki2009 @pkarashchenko . All related changes are recorded in #5184. |
d9b38f7
to
3f79ae6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before, mutual exclusion semaphores had priority inheritance enabled by default; now it is opt-in.
So I wonder if the opposite of commit 48d8383 ("sem:remove sem default protocl") should be done also: Locate all mutual exclusion semaphores and call (nx)sem_set_protocol(..., SEM_PRIO_INHERIT)?
@hartmannathan it was done by converting all mutual exclusion semaphores to mutex lock in this PR #6320. |
#6965 this patch repaced all sem use as lock with mutex |
Great! This is a much cleaner design. Thank you! |
b3a62ee
to
315ee8f
Compare
Signed-off-by: anjiahao <[email protected]>
Signed-off-by: anjiahao <[email protected]>
Signed-off-by: anjiahao <[email protected]>
Signed-off-by: anjiahao <[email protected]>
@xiaoxiang781216 |
I have documented the mitigation entry for the next release's Release Notes here: https://cwiki.apache.org/confluence/display/NUTTX/NuttX+11.1.0#NuttX11.1.0-DefaultSemaphoreProtocolChanged Please check and let me know if there are any mistakes! |
The description looks good, @hartmannathan thanks. |
Semaphores are generally used in two ways, one is to count resources or notify event, the other is to use as a mutual exclusion lock
In Nuttx, the default behavior of semaphores is mutual exclusion lock. I think it is better to change the default behavior to count resources or notify event, and it will be more general in Linux programs.
Why would I do this:
In porting some programs which confirm POSIX standard, the semaphore is used to notify event, but its default behavior is a mutual exclusion lock(nuttx): One thread post the semaphore to indicate something happen, another thread gets the semaphore and exits. when the first thread post the semaphore again which will cause a crash when CONFIG_PRIORITY_INHERITANCE is enable, because htcb field in semholder_s point to a freed memory.
so, it's better to change the default sem behavior to no priority inheritance.