Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Second parameter of pthread_mutex_timedlock should be absolute time not timeout #516

Closed
jonathan-brandenburg-metecs opened this issue Jun 18, 2020 · 2 comments · Fixed by #518
Assignees
Labels
Milestone

Comments

@jonathan-brandenburg-metecs

In src/os/posix/src/os-impl-binsem.c, there are three calls to pthread_mutex_timedlock, each time with &OS_POSIX_BINSEM_MAX_WAIT as the second parameter where OS_POSIX_BINSEM_MAX_WAIT has a value of 2 seconds. because the second parameter of pthread_mutex_timedlock should be an absolute time and is not a timeout, if the mutex is not immediately available, pthread_mutex_timedlock will return with ETIMEDOUT resulting in the OS_BinSemFlush_Impl, OS_GenericBinSemTake_Impl, or OS_BinSemGive_Impl call returning OS_SEM_FAILURE.

Here's a code snippet of the "as is":
if ( pthread_mutex_timedlock(&sem->id, &OS_POSIX_BINSEM_MAX_WAIT) != 0 )
{
return(OS_SEM_FAILURE);
}

It should look something like:

struct timespec timeoutTime;
clock_gettime(CLOCK_REALTIME, &timeoutTime);
timeoutTime.tv_sec += OS_POSIX_BINSEM_MAX_WAIT.tv_sec;
timeoutTime.tv_nsec += OS_POSIX_BINSEM_MAX_WAIT.tv_nsec;
if ( pthread_mutex_timedlock(&sem->id, &timeoutTime) != 0 )
{
return(OS_SEM_FAILURE);
}

Reproducing this issue is tricky as the cFS application has to run long enough to have a mutex not immediately available for the lock. The behavior I'm seeing in my installation is the background task terminates and Ctrl-C can no longer be used to terminate the application. I'm running in a Centos 7 Virtual Box machine.

Jonathan C. Brandenburg
METECS
[email protected]

@jphickey
Copy link
Contributor

Thanks for catching this! I somehow missed that the timeout here was absolute and not relative. Kind of annoying that POSIX isn't consistent here.

@jphickey jphickey self-assigned this Jun 19, 2020
@jphickey jphickey added the bug label Jun 19, 2020
@jonathan-brandenburg-metecs
Copy link
Author

Glad to help! It is a bit annoying. I read the documentation for that method 100 times and kept seeing "absolute" but it just didn't compute because that's not at all common. Thank you!

@skliper skliper added this to the 5.1.0 milestone Jun 22, 2020
astrogeco added a commit that referenced this issue Jun 23, 2020
Fix #516, Use absolute timeout for timedlock calls
jphickey pushed a commit to jphickey/osal that referenced this issue Aug 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants