Skip to content

Commit

Permalink
Enable per thread register state cache on libunwind (#55049)
Browse files Browse the repository at this point in the history
Looking into a profile recently I realized that when recording
backtraces the CPU utilization is mostly dominated by lookups/updates to
libunwind's register state cache (`get_rs_cache`, `put_rs_cache`):

![Screenshot from 2024-07-05
19-29-45](https://github.com/JuliaLang/julia/assets/5301739/5e65f867-6dc8-4d55-8669-aaf1f756a2ac)

It is also worth noting that those functions are taking a lock and using
`sigprocmask` which does not scale, so by recording backtraces in
parallel we get:

![Screenshot from 2024-07-05
19-30-21](https://github.com/JuliaLang/julia/assets/5301739/ed3124dd-f340-4b52-a7f9-c0a203f935b6)

And this translates to these times on a recent laptop (Linux X86_64):
```
julia> @time for i in 1:1000000 Base.backtrace() end
  8.286924 seconds (32.00 M allocations: 8.389 GiB, 1.46% gc time)

julia> @time Threads.@sync for i in 1:16
           Threads.@Spawn for j in 1:1000000
               Base.backtrace()
           end
       end
 20.448630 seconds (160.01 M allocations: 123.740 GiB, 8.05% gc time, 0.43% compilation time: 18% of which was recompilation)
```

Good news is that libunwind already has the solution for this in the
form of the `--enable-per-thread-cache` build option which uses a thread
local cache for register state instead of the default global one
([1](https://libunwind-devel.nongnu.narkive.com/V3gtFUL9/question-about-performance-of-threaded-access-in-libunwind)).
But this is not without some hiccups due to how we `dlopen` libunwind so
we need a small patch
([2](https://libunwind-devel.nongnu.narkive.com/QG1K3Uke/tls-model-initial-exec-attribute-prevents-dynamic-loading-of-libunwind-via-dlopen)).

By applying those changes we get:
```
julia> @time for i in 1:1000000 Base.backtrace() end
  2.378070 seconds (32.00 M allocations: 8.389 GiB, 4.72% gc time)

julia> @time Threads.@sync for i in 1:16
           Threads.@Spawn for j in 1:1000000
               Base.backtrace()
           end
       end
  3.657772 seconds (160.01 M allocations: 123.740 GiB, 52.05% gc time, 2.33% compilation time: 19% of which was recompilation)
```

Single-Threaded:
![Screenshot from 2024-07-05
20-25-49](https://github.com/JuliaLang/julia/assets/5301739/ebc87952-e51f-488c-92f4-72aed5abb93a)

Multi-Threaded:
![Screenshot from 2024-07-05
20-26-32](https://github.com/JuliaLang/julia/assets/5301739/0ea2160a-60e8-49ea-af62-7d8ffc35c963)

As a companion to this PR I have created another one for applying the
same change to LibUnwind_jll [on
Yggdrasil](JuliaPackaging/Yggdrasil#9030). After
that lands we can bump the version here.
  • Loading branch information
andrebsguedes committed Jul 31, 2024
1 parent fdecc59 commit 5a904ac
Show file tree
Hide file tree
Showing 5 changed files with 81 additions and 27 deletions.
48 changes: 24 additions & 24 deletions deps/checksums/unwind
Original file line number Diff line number Diff line change
@@ -1,26 +1,26 @@
LibUnwind.v1.8.1+0.aarch64-linux-gnu.tar.gz/md5/e25a186941b2bedeb4a0fca60b1e5d1b
LibUnwind.v1.8.1+0.aarch64-linux-gnu.tar.gz/sha512/4b488ef13b1b09d37dd2d2f62647e6407404730beb8cab58263c2d8e9db3716bfdb8949eca8ebb126eb22a3fcd81deb7ea0774fe7527ba7374f76047fe03abd7
LibUnwind.v1.8.1+0.aarch64-linux-musl.tar.gz/md5/75fea80870d951a5e87d37bc67e52cfb
LibUnwind.v1.8.1+0.aarch64-linux-musl.tar.gz/sha512/efb54577cddaf5e7930b15cdd98ed88e4d60ba3a1fe0097b2a64a868f92177985c71a86cfb40475976005ab55a01401960afa9c20649b1e34ea02ef262caa046
LibUnwind.v1.8.1+0.armv6l-linux-gnueabihf.tar.gz/md5/30f3077b185f6e51b8b6ddfddcb8effb
LibUnwind.v1.8.1+0.armv6l-linux-gnueabihf.tar.gz/sha512/524810edbcfcba4938cb63c325905569b7d232dd8b02856e5f1592d7e36620c3ee166c0c788e42a14abc281c41723f49563f59d8cf5175ae1c3605ec29a97b9f
LibUnwind.v1.8.1+0.armv6l-linux-musleabihf.tar.gz/md5/087d263a8edacec1b79d4eccef03ab53
LibUnwind.v1.8.1+0.armv6l-linux-musleabihf.tar.gz/sha512/bad2bea6f98ed9e0ac293ab3cd7873d2c164616bd09103ad773300da1875e28ac51744809629d01b69744c610d93c90cc48ec4c81411b5d3f036db86e098adcd
LibUnwind.v1.8.1+0.armv7l-linux-gnueabihf.tar.gz/md5/218f8a37d910bcfaba1bbeb9f61593a1
LibUnwind.v1.8.1+0.armv7l-linux-gnueabihf.tar.gz/sha512/1912b7aa4bbcaca3facad13bf9a8a8b4bb42183b9c542c6b51f0f4a715c27b7583dcf36f49a1fac9787ba7b39728a5d1a151661a570ef637d1080c11d5426fc4
LibUnwind.v1.8.1+0.armv7l-linux-musleabihf.tar.gz/md5/c2582785ca7dc2edbc529a93ea0f4120
LibUnwind.v1.8.1+0.armv7l-linux-musleabihf.tar.gz/sha512/ae5414a274d973623070402806eb279dd2ab708c801fa7f24ba9b8066e7fc13ae9ebe1f331f76dd54a4eba572e87117c57d502190b63978af87d7fa35a011632
LibUnwind.v1.8.1+0.i686-linux-gnu.tar.gz/md5/324ae0c4916a435a6746ca77a1034b58
LibUnwind.v1.8.1+0.i686-linux-gnu.tar.gz/sha512/fe5ac30e6cdda9f99c873a7af60407c5f1ca1d17396ab46679df56093fea37289e802dd53ed083a4963f7439a1887b4d401a9ab489bdeddd2d003b761af84c1c
LibUnwind.v1.8.1+0.i686-linux-musl.tar.gz/md5/0495beea1d8e5e4572f32830125cb329
LibUnwind.v1.8.1+0.i686-linux-musl.tar.gz/sha512/3db7f9241e11e139f02239826a65f40d77d968aa7dde574cf91759706dc9a5c97fb055b34ec011f9ac085eec121c3807e9c873773d1ab091a5a7180200ea73ec
LibUnwind.v1.8.1+0.powerpc64le-linux-gnu.tar.gz/md5/1f0feb7cced4b847295dff4c1cd0dde1
LibUnwind.v1.8.1+0.powerpc64le-linux-gnu.tar.gz/sha512/88707b4a45e3de2901a343f20a35d2003d24db6604a5194712a3a687299b98e7507934a1bd4d7a21f84f089e0378964334c483f10311dd1bfbaa5d8b42ab9f76
LibUnwind.v1.8.1+0.x86_64-linux-gnu.tar.gz/md5/a03c84494c04ba08fa7e314584d28945
LibUnwind.v1.8.1+0.x86_64-linux-gnu.tar.gz/sha512/eb97ec8cf03fc5cb77a6218fcc4f1ef1266e66a774dea34e1d1fb7f89c026287bb4bd09de0b61a83b42495b8b4d5be475a61b4df68c83bfb33be2145ed659627
LibUnwind.v1.8.1+0.x86_64-linux-musl.tar.gz/md5/194654cfd8d202599b7096783659c0ab
LibUnwind.v1.8.1+0.x86_64-linux-musl.tar.gz/sha512/f39f8d0488ec02d9693b4a17ca73ec683ea062cfc67400d02e1e38bfeb43c371068742379d5e17f8c8b4ab478de48f91284e17b0e1b94e09d1a64713276326c7
LibUnwind.v1.8.1+0.x86_64-unknown-freebsd.tar.gz/md5/6453d66204ba5fb941046afd85345b90
LibUnwind.v1.8.1+0.x86_64-unknown-freebsd.tar.gz/sha512/77e67c3ddda5eaee0e8b127ad8e2ad41add4410e356c4e4b9bc46eb19871b91d006a59009d9948c4cc0951c2d9e956a99c946a60ba47ceb7f827b2897d6939e5
LibUnwind.v1.8.1+1.aarch64-linux-gnu.tar.gz/md5/0f789b9e5b2604a39cc363c4c513a808
LibUnwind.v1.8.1+1.aarch64-linux-gnu.tar.gz/sha512/4c9c8250bfd84a96135a5e9ecdd4500214996c39852609d3a3983c2c5de44a728d9ce6b71bd649c1725e186db077f74df93a99f07452a31d344c17315eedb33d
LibUnwind.v1.8.1+1.aarch64-linux-musl.tar.gz/md5/356deb10e57d4c7e7bf7dbc728d6628d
LibUnwind.v1.8.1+1.aarch64-linux-musl.tar.gz/sha512/a998eebe7a4928bd417620bef0de9728c080f5d9714f15314ac190b333efa1bd7a21207156d56c132515bd3f7154d60204f1fac2dac5468560a7017682527c78
LibUnwind.v1.8.1+1.armv6l-linux-gnueabihf.tar.gz/md5/b0ff12f5f0c801e5e280a142a1b7a188
LibUnwind.v1.8.1+1.armv6l-linux-gnueabihf.tar.gz/sha512/68003f39eaf55c8742e821a228889590e8673cbafb74013a5b4f6a0c08ee372cb6b102a574e89ce9f46a38dd3d31ef75de95762f72a31a8ec9d7f495affaeb77
LibUnwind.v1.8.1+1.armv6l-linux-musleabihf.tar.gz/md5/b04c77d707875989777ecfed66bd2dad
LibUnwind.v1.8.1+1.armv6l-linux-musleabihf.tar.gz/sha512/fb20586a0cbc998a0482d4102d8b8e5b2f802af519e25c440a64f67554468b29c6999a9ec5509ba375714beb93a4b48e8dbf71e6089c25ecd63b11eead844041
LibUnwind.v1.8.1+1.armv7l-linux-gnueabihf.tar.gz/md5/e948016b4179d34727b456bc768cd8e1
LibUnwind.v1.8.1+1.armv7l-linux-gnueabihf.tar.gz/sha512/6fc64e8ac7248540b95c321103d234f2c8633087f261e368251fe2cf6ea4e0654325716ac7017ae966edc4ddbb004a0f808d6e25cca766faaf505ca1f8f4aee7
LibUnwind.v1.8.1+1.armv7l-linux-musleabihf.tar.gz/md5/660cf49c34a2ead1afbdcb44491e174a
LibUnwind.v1.8.1+1.armv7l-linux-musleabihf.tar.gz/sha512/edf337d176440c210f5860e90771758335256fe9d2f179d506656bccf92a9f9aa478d176d4b0db2213945ae847dad5bb88265110c92cfcd538d5740858b6a3f0
LibUnwind.v1.8.1+1.i686-linux-gnu.tar.gz/md5/7032a70cfecb88cdd49cc3a4879456c6
LibUnwind.v1.8.1+1.i686-linux-gnu.tar.gz/sha512/e34acc8f270c5156ede3ac3377d0f428c672daed869570734351c6b5a8946d65b5c0c041b713dddefedef81e55c65f5683aed0fec0d366e2d0207d8b902b0e33
LibUnwind.v1.8.1+1.i686-linux-musl.tar.gz/md5/0541c3419020334173d299cf3482ff85
LibUnwind.v1.8.1+1.i686-linux-musl.tar.gz/sha512/0b57745d280fb9893772936cd4872b0e04f41d86379e772b889e75baffe9324ef8dd168bb4c9761c1b8372f387ce99721dd6086b1d52b9a91215f40e8113968d
LibUnwind.v1.8.1+1.powerpc64le-linux-gnu.tar.gz/md5/fee37734fe95d1e96ebc77316df64192
LibUnwind.v1.8.1+1.powerpc64le-linux-gnu.tar.gz/sha512/953ef70fb203db73764eeab0a37521b94e79ce70644ae16fe3157ca8d1011a0319d1928d094a3e2ed1e0489fdc0ca7dda33722095fd3aa40ed1fde150cf44c2a
LibUnwind.v1.8.1+1.x86_64-linux-gnu.tar.gz/md5/bbb201e7455fd13b805b0a96dc16183b
LibUnwind.v1.8.1+1.x86_64-linux-gnu.tar.gz/sha512/b1e21f7d772bd15bada17d287e1876ae586a97c6a8669e714347e7bf8a9b202fe53e8559cf19358f88bc458b2fe15ccbd616b64163cc715ce253f43f5133a8cd
LibUnwind.v1.8.1+1.x86_64-linux-musl.tar.gz/md5/72156f9d6da9a2742d9152822e5525f5
LibUnwind.v1.8.1+1.x86_64-linux-musl.tar.gz/sha512/53a3f1985c5ae4816693f292604810cbe948e6332aeb227fb900ba3730f4379e863b144ae87af2c0651c2b9633b35c45c7a0a6fa34958dc9f58e0f8baa2ea701
LibUnwind.v1.8.1+1.x86_64-unknown-freebsd.tar.gz/md5/e4346df03246d847f2867df3ab5ac624
LibUnwind.v1.8.1+1.x86_64-unknown-freebsd.tar.gz/sha512/ee01bc12726288ae091476c1bed44de224a9ef5355687fd6fd64742da6628450434d7f33d4daf81029263aa6d23549a0aa5c5ae656599c132051255d1d742d5d
libunwind-1.8.1.tar.gz/md5/10c96118ff30b88c9eeb6eac8e75599d
libunwind-1.8.1.tar.gz/sha512/aba7b578c1b8cbe78f05b64e154f3530525f8a34668b2a9f1ee6acb4b22c857befe34ad4e9e8cca99dbb66689d41bc72060a8f191bd8be232725d342809431b3
44 changes: 44 additions & 0 deletions deps/patches/libunwind-disable-initial-exec-tls.patch
Original file line number Diff line number Diff line change
@@ -0,0 +1,44 @@
diff --git a/include/libunwind-common.h.in b/include/libunwind-common.h.in
index 893fdd69..80ab9648 100644
--- a/include/libunwind-common.h.in
+++ b/include/libunwind-common.h.in
@@ -340,5 +340,6 @@ extern int unw_get_elf_filename_by_ip (unw_addr_space_t, unw_word_t, char *,
extern const char *unw_strerror (int);
extern int unw_backtrace (void **, int);
extern int unw_backtrace2 (void **, int, unw_context_t*, int);
+extern int unw_ensure_tls (void);

extern unw_addr_space_t unw_local_addr_space;
diff --git a/src/dwarf/Gparser.c b/src/dwarf/Gparser.c
index 7a5d7e1f..8453ffb0 100644
--- a/src/dwarf/Gparser.c
+++ b/src/dwarf/Gparser.c
@@ -623,7 +623,7 @@ get_rs_cache (unw_addr_space_t as, intrmask_t *saved_maskp)
#if defined(HAVE___CACHE_PER_THREAD) && HAVE___CACHE_PER_THREAD
if (likely (caching == UNW_CACHE_PER_THREAD))
{
- static _Thread_local struct dwarf_rs_cache tls_cache __attribute__((tls_model("initial-exec")));
+ static _Thread_local struct dwarf_rs_cache tls_cache;
Debug (16, "using TLS cache\n");
cache = &tls_cache;
}
diff --git a/src/mi/init.c b/src/mi/init.c
index e4431eeb..07cae852 100644
--- a/src/mi/init.c
+++ b/src/mi/init.c
@@ -82,3 +82,15 @@ mi_init (void)
unw_init_page_size();
assert(sizeof(struct cursor) <= sizeof(unw_cursor_t));
}
+
+int
+unw_ensure_tls (void)
+{
+#if defined(HAVE___CACHE_PER_THREAD) && HAVE___CACHE_PER_THREAD
+ static _Thread_local int alloc_trigger;
+ alloc_trigger = 1;
+ return alloc_trigger;
+#else
+ return 0;
+#endif
+}
8 changes: 6 additions & 2 deletions deps/unwind.mk
Original file line number Diff line number Diff line change
Expand Up @@ -38,13 +38,17 @@ $(SRCCACHE)/libunwind-$(UNWIND_VER)/libunwind-aarch64-inline-asm.patch-applied:
cd $(SRCCACHE)/libunwind-$(UNWIND_VER) && patch -p1 -f -u -l < $(SRCDIR)/patches/libunwind-aarch64-inline-asm.patch
echo 1 > $@

$(SRCCACHE)/libunwind-$(UNWIND_VER)/libunwind-disable-initial-exec-tls.patch-applied: $(SRCCACHE)/libunwind-$(UNWIND_VER)/libunwind-aarch64-inline-asm.patch-applied
cd $(SRCCACHE)/libunwind-$(UNWIND_VER) && patch -p1 -f -u -l < $(SRCDIR)/patches/libunwind-disable-initial-exec-tls.patch
echo 1 > $@

# note minidebuginfo requires liblzma, which we do not have a source build for
# (it will be enabled in BinaryBuilder-based downloads however)
# since https://github.com/JuliaPackaging/Yggdrasil/commit/0149e021be9badcb331007c62442a4f554f3003c
$(BUILDDIR)/libunwind-$(UNWIND_VER)/build-configured: $(SRCCACHE)/libunwind-$(UNWIND_VER)/source-extracted $(SRCCACHE)/libunwind-$(UNWIND_VER)/libunwind-aarch64-inline-asm.patch-applied
$(BUILDDIR)/libunwind-$(UNWIND_VER)/build-configured: $(SRCCACHE)/libunwind-$(UNWIND_VER)/source-extracted $(SRCCACHE)/libunwind-$(UNWIND_VER)/libunwind-disable-initial-exec-tls.patch-applied
mkdir -p $(dir $@)
cd $(dir $@) && \
$(dir $<)/configure $(CONFIGURE_COMMON) CPPFLAGS="$(CPPFLAGS) $(LIBUNWIND_CPPFLAGS)" CFLAGS="$(CFLAGS) $(LIBUNWIND_CFLAGS)" --enable-shared --disable-minidebuginfo --disable-tests --enable-zlibdebuginfo --disable-conservative-checks
$(dir $<)/configure $(CONFIGURE_COMMON) CPPFLAGS="$(CPPFLAGS) $(LIBUNWIND_CPPFLAGS)" CFLAGS="$(CFLAGS) $(LIBUNWIND_CFLAGS)" --enable-shared --disable-minidebuginfo --disable-tests --enable-zlibdebuginfo --disable-conservative-checks --enable-per-thread-cache
echo 1 > $@

$(BUILDDIR)/libunwind-$(UNWIND_VER)/build-compiled: $(BUILDDIR)/libunwind-$(UNWIND_VER)/build-configured
Expand Down
6 changes: 6 additions & 0 deletions src/threading.c
Original file line number Diff line number Diff line change
Expand Up @@ -404,6 +404,12 @@ jl_ptls_t jl_init_threadtls(int16_t tid)
jl_fence();
uv_mutex_unlock(&tls_lock);

#if !defined(_OS_WINDOWS_) && !defined(JL_DISABLE_LIBUNWIND) && !defined(LLVMLIBUNWIND)
// ensures libunwind TLS space for this thread is allocated eagerly
// to make unwinding async-signal-safe even when using thread local caches.
unw_ensure_tls();
#endif

return ptls;
}

Expand Down
2 changes: 1 addition & 1 deletion stdlib/LibUnwind_jll/Project.toml
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
name = "LibUnwind_jll"
uuid = "745a5e78-f969-53e9-954f-d19f2f74f4e3"
version = "1.8.1+0"
version = "1.8.1+1"

[deps]
Libdl = "8f399da3-3557-5675-b5ff-fb832c97cbdb"
Expand Down

2 comments on commit 5a904ac

@vtjnash
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nanosoldier runbenchmarks(ALL, isdaily = true)

@nanosoldier
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your benchmark job has completed - possible performance regressions were detected. A full report can be found here.

Please sign in to comment.