macOS dyld4 introduces a deadlock bug into Profile code #49733

Closed
vtjnash opened this issue May 10, 2023 · 2 comments · Fixed by #49740

Comments

@vtjnash
Sponsor Member

vtjnash commented May 10, 2023

AFAIK, there is no possible workaround for this bug in dyld4, aside from forcing the linker to let us use the old dyld2.

(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00007ff81489ad2e libsystem_kernel.dylib`__ulock_wait + 10
    frame #1: 0x00007ff814903b07 libsystem_platform.dylib`_os_unfair_lock_lock_slow + 162
    frame #2: 0x00007ff8145856f2 dyld`dyld4::RuntimeState::withLoadersReadLock(void () block_pointer) + 34
    frame #3: 0x00007ff8145b04be dyld`dyld4::APIs::findImageMappedAt(void const*, dyld3::MachOLoaded const**, bool*, char const**, void const**, unsigned long long*, unsigned char*) + 784
    frame #4: 0x00007ff8145b078b dyld`dyld4::APIs::dyld_image_path_containing_address(void const*) + 51
    frame #5: 0x00007ff81465822a libsystem_trace.dylib`_os_trace_dylib_or_main_executable_was_loaded + 46
    frame #6: 0x00007ff814589df9 dyld`invocation function for block in dyld4::RuntimeState::notifyLoad(std::__1::span<dyld4::Loader const*, 18446744073709551615ul> const&) + 324
    frame #7: 0x00007ff814585d29 dyld`dyld4::RuntimeState::withNotifiersReadLock(void () block_pointer) + 45
    frame #8: 0x00007ff814589a68 dyld`dyld4::RuntimeState::notifyLoad(std::__1::span<dyld4::Loader const*, 18446744073709551615ul> const&) + 338
    frame #9: 0x00007ff8145b13ea dyld`dyld4::APIs::dlopen_from(char const*, int, void*) + 932

However, this lock order (the notifiers lock first, then the loaders lock, per frames #7 and #2) is violated by the atfork code here:
https://github.com/apple-oss-distributions/dyld/blob/c8a445f88f9fc1713db34674e79b00e30723e79d/dyld/DyldRuntimeState.cpp#L2559-L2571
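
To make the ordering problem concrete, here is a minimal sketch of a lock-rank check in C. It is not dyld's code: the lock names, the `acquire_checked` helper, and the assumption that the atfork path takes the loaders lock before the notifiers lock are all illustrative, inferred from the backtrace above and from the statement that the order is violated.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock identifiers. The rank follows the order seen in the
 * backtrace: notifiers lock (frame #7) first, loaders lock (frame #2) second. */
enum lock_id { NOTIFIERS_LOCK = 0, LOADERS_LOCK = 1, LOCK_COUNT = 2 };

/* Which locks one thread currently holds (bookkeeping only, no real locks). */
struct thread_locks {
    bool held[LOCK_COUNT];
};

/* Record an acquisition. Returns false when a higher-ranked lock is already
 * held, meaning this acquisition inverts the agreed order. */
static bool acquire_checked(struct thread_locks *t, enum lock_id id)
{
    assert(id >= 0 && id < LOCK_COUNT);
    bool ok = true;
    for (int other = id + 1; other < LOCK_COUNT; other++) {
        if (t->held[other])
            ok = false;
    }
    t->held[id] = true;
    return ok;
}

static void release_lock(struct thread_locks *t, enum lock_id id)
{
    t->held[id] = false;
}

/* dlopen-style path: notifiers, then loaders. */
static bool dlopen_path_ok(void)
{
    struct thread_locks t = { { false, false } };
    bool first = acquire_checked(&t, NOTIFIERS_LOCK);
    bool second = acquire_checked(&t, LOADERS_LOCK);
    release_lock(&t, LOADERS_LOCK);
    release_lock(&t, NOTIFIERS_LOCK);
    return first && second;
}

/* ASSUMED atfork-style path: loaders, then notifiers (the reverse order). */
static bool atfork_path_ok(void)
{
    struct thread_locks t = { { false, false } };
    bool first = acquire_checked(&t, LOADERS_LOCK);
    bool second = acquire_checked(&t, NOTIFIERS_LOCK);
    release_lock(&t, NOTIFIERS_LOCK);
    release_lock(&t, LOADERS_LOCK);
    return first && second;
}
```

Under these assumptions `dlopen_path_ok()` reports a consistent order and `atfork_path_ok()` reports an inversion. When two threads run the two paths at the same time, each can end up holding the lock the other is waiting for, which is the hang shown in the backtrace.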

@gbaraldi
Member

So it's kind of like #49446 but inside their own code?

@vtjnash
Sponsor Member Author

vtjnash commented May 10, 2023

Actually, there are some possible workarounds, though they are not especially pretty. This would be our third such workaround for similar bugs in Apple's dyld, so it seems unsurprising at this point.

Since these are recursive locks, we could theoretically call _dyld_dlopen_atfork_prepare(); _dyld_atfork_prepare(); /* dlopen */; _dyld_atfork_parent(); _dyld_dlopen_atfork_parent(); around every dlopen call to force it to acquire the locks in the correct order. This defends against use of fork. Additionally, we can add _dyld_dlopen_atfork_parent to our profiling code. This is annoying, since it makes dlopen even more of a black box to our profiling code, but it should make profiling safe against deadlock from a concurrent dlopen (identical to #43701).
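
The bracketing described in this comment could be sketched in C roughly as follows. This is an illustration, not the code from the actual fix. The `dyld_fork_hooks` struct, `guarded_open`, and the `fake_*` stand-ins are hypothetical names, and how the four dyld entry points would be resolved on macOS is left out. The sketch only shows the nesting order: the outer dlopen hooks wrap the inner atfork hooks, which wrap the open call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*hook_fn)(void);
typedef void *(*open_fn)(const char *path, int flags);

/* Hypothetical bundle of the four hooks named in the comment. Any pair may
 * be NULL on systems that do not provide it. */
struct dyld_fork_hooks {
    hook_fn dlopen_atfork_prepare; /* stands for _dyld_dlopen_atfork_prepare */
    hook_fn atfork_prepare;        /* stands for _dyld_atfork_prepare */
    hook_fn atfork_parent;         /* stands for _dyld_atfork_parent */
    hook_fn dlopen_atfork_parent;  /* stands for _dyld_dlopen_atfork_parent */
};

/* Call do_open with the hooks taken before it and released after it, in
 * reverse order. A pair is used only when both halves are present. */
static void *guarded_open(const struct dyld_fork_hooks *h, open_fn do_open,
                          const char *path, int flags)
{
    assert(do_open != NULL);
    int outer = h && h->dlopen_atfork_prepare && h->dlopen_atfork_parent;
    int inner = h && h->atfork_prepare && h->atfork_parent;

    if (outer) h->dlopen_atfork_prepare();
    if (inner) h->atfork_prepare();
    void *handle = do_open(path, flags);
    if (inner) h->atfork_parent();
    if (outer) h->dlopen_atfork_parent();
    return handle;
}

/* Stand-ins that record the call order, so the sketch runs without dyld. */
static char trace[8];
static size_t trace_len;

static void record(char c)
{
    if (trace_len + 1 < sizeof trace)
        trace[trace_len++] = c;
    trace[trace_len] = '\0';
}

static void fake_dlopen_prepare(void) { record('A'); }
static void fake_prepare(void)        { record('B'); }
static void fake_parent(void)         { record('b'); }
static void fake_dlopen_parent(void)  { record('a'); }

static void *fake_open(const char *path, int flags)
{
    (void)path;
    (void)flags;
    record('O');
    return trace; /* any non-NULL pointer serves as a dummy handle */
}

/* Returns 1 when the hooks ran strictly nested around the open call. */
static int demo_order_is_nested(void)
{
    struct dyld_fork_hooks hooks = {
        fake_dlopen_prepare, fake_prepare, fake_parent, fake_dlopen_parent
    };
    trace_len = 0;
    trace[0] = '\0';
    guarded_open(&hooks, fake_open, "libexample.dylib", 0);
    return strcmp(trace, "ABOba") == 0;
}
```

Passing the hooks as function pointers keeps the sketch portable. Real code would replace `fake_open` with `dlopen` and would need to check that the dyld symbols exist before calling them.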

vtjnash added a commit that referenced this issue May 10, 2023
Extend the fix for #43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix #49733
vtjnash added a commit that referenced this issue May 11, 2023
Extend the fix for #43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix #49733
kpamnany pushed a commit to RelationalAI/julia that referenced this issue Oct 17, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
kpamnany pushed a commit to RelationalAI/julia that referenced this issue Oct 18, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
kpamnany pushed a commit to RelationalAI/julia that referenced this issue Oct 19, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Oct 20, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
kpamnany pushed a commit to RelationalAI/julia that referenced this issue Oct 21, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Oct 23, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 1, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 2, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
udesou pushed a commit to udesou/julia that referenced this issue Nov 3, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 7, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 10, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 14, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733
DelveCI pushed a commit to RelationalAI/julia that referenced this issue Nov 15, 2023
Extend the fix for JuliaLang#43578 (2939272) to
cover the deadlock bug present internally in dyld4 inside the function
we use to avoid the previous deadlock issue.

Fix JuliaLang#49733