Bug 439897 - Heaptrack produces impossible/incorrect stack traces
Summary: Heaptrack produces impossible/incorrect stack traces
Status: RESOLVED FIXED
Alias: None
Product: Heaptrack
Classification: Applications
Component: general (show other bugs)
Version: unspecified
Platform: Gentoo Packages Linux
: NOR normal
Target Milestone: ---
Assignee: Milian Wolff
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2021-07-15 18:20 UTC by Matt Whitlock
Modified: 2021-08-17 00:00 UTC (History)
0 users

See Also:
Latest Commit:
Version Fixed In:
Sentry Crash Report:


Attachments

Note You need to log in before you can comment on or make changes to this bug.
Description Matt Whitlock 2021-07-15 18:20:48 UTC
SUMMARY

I'm trying to use Heaptrack to find a memory leak in plasmashell, but Heaptrack is producing some nonsensical stack traces that purport to contain impossible calls.

STEPS TO REPRODUCE
1. heaptrack plasmashell
2. (use plasmashell for a bit to exercise a suspected leak)
3. kquitapp5 plasmashell
4. heaptrack --analyze heaptrack.plasmashell.*.zst
or heaptrack_print heaptrack.plasmashell.*.zst

OBSERVED RESULT

Analysis contains "impossible" stack traces like (for example):

94095 calls with 0B peak consumption from:
    QString::reallocData(unsigned int, bool)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/text/qstring.cpp:2372
      in /usr/lib64/libQt5Core.so.5
    QString::resize(int)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/text/qstring.cpp:2289
      in /usr/lib64/libQt5Core.so.5
    std::__atomic_base<>::load(std::memory_order) const
      at /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/g++-v11/bits/atomic_base.h:481
      in /usr/lib64/libQt5Core.so.5
    int QAtomicOps<>::loadRelaxed<>(std::atomic<> const&)
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/thread/qatomic_cxx11.h:239
    QBasicAtomicInteger<>::loadRelaxed() const
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/thread/qbasicatomic.h:107
    QtPrivate::RefCount::isShared() const
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/tools/qrefcount.h:101
    QString::detach()
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/text/qstring.h:1088
    QString::data()
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/text/qstring.h:1084
    operator>>(QDataStream&, QString&)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/text/qstring.cpp:10418
    QDataStream& QtPrivate::readArrayBasedContainer<>(QDataStream&, QVector<>&)
      at /usr/include/qt5/QtCore/qdatastream.h:257
      in /usr/lib64/libKF5Service.so.5
    KServicePrivate::load(QDataStream&)
      at ../kservice-5.84.0/src/services/kservice.cpp:324
      in /usr/lib64/libKF5Service.so.5
    KService::KService(QDataStream&, int)
      at ../kservice-5.84.0/src/services/kservice.cpp:384
      in /usr/lib64/libKF5Service.so.5
    KServiceFactory::createEntry(int) const
      at ../kservice-5.84.0/src/services/kservicefactory.cpp:207
      in /usr/lib64/libKF5Service.so.5
    KServiceFactory::serviceOffers(int, int)
      at ../kservice-5.84.0/src/services/kservicefactory.cpp:301
      in /usr/lib64/libKF5Service.so.5
    KServiceTypeTrader::defaultOffers(QString const&, QString const&) const
      at ../kservice-5.84.0/src/services/kservicetypetrader.cpp:114
      in /usr/lib64/libKF5Service.so.5
    KServiceTypeTrader::query(QString const&, QString const&) const
      at ../kservice-5.84.0/src/services/kservicetypetrader.cpp:144
      in /usr/lib64/libKF5Service.so.5
    QString::~QString()
      at /usr/include/qt5/QtCore/qstring.h:1307
      in /usr/lib64/libnotificationmanager.so.1
    NotificationManager::Notification::Private::serviceForDesktopEntry(QString const&)
      at /usr/include/qt5/QtCore/qstring.h:1307
    QExplicitlySharedDataPointer<>::operator bool() const
      at /usr/include/qt5/QtCore/qshareddata.h:176
      in /usr/lib64/libnotificationmanager.so.1
    NotificationManager::Notification::Private::setDesktopEntry(QString const&)
      at ../plasma-workspace-5.22.3/libnotificationmanager/notification.cpp:303
    QString::~QString()
      at /usr/include/qt5/QtCore/qstring.h:1307
      in /usr/lib64/libnotificationmanager.so.1
    NotificationManager::Notification::Private::processHints(QMap<> const&)
      at ../plasma-workspace-5.22.3/libnotificationmanager/notification.cpp:345
    NotificationManager::ServerPrivate::Notify(QString const&, unsigned int, QString const&, QString const&, QString const&, QStringList const&, QMap<> const&, int)
      at ../plasma-workspace-5.22.3/libnotificationmanager/server_p.cpp:184
      in /usr/lib64/libnotificationmanager.so.1
    NotificationsAdaptor::Notify(QString const&, unsigned int, QString const&, QString const&, QString const&, QStringList const&, QMap<> const&, int)
      at libnotificationmanager/notificationsadaptor.cpp:70
      in /usr/lib64/libnotificationmanager.so.1
    NotificationsAdaptor::qt_static_metacall(QObject*, QMetaObject::Call, int, void**)
      at libnotificationmanager/notificationsadaptor.moc:185
      in /usr/lib64/libnotificationmanager.so.1
    NotificationsAdaptor::qt_metacall(QMetaObject::Call, int, void**)
      at libnotificationmanager/notificationsadaptor.moc:258
      in /usr/lib64/libnotificationmanager.so.1
    QDBusConnectionPrivate::deliverCall(QObject*, int, QDBusMessage const&, QVector<> const&, int)
      at ../../../qtbase-everywhere-src-5.15.2/src/dbus/qdbusintegrator.cpp:1001
      in /usr/lib64/libQt5DBus.so.5
    QDBusConnectionPrivate::activateCall(QObject*, int, QDBusMessage const&)
      at ../../../qtbase-everywhere-src-5.15.2/src/dbus/qdbusintegrator.cpp:912
      in /usr/lib64/libQt5DBus.so.5
    QDBusConnectionPrivate::activateObject(QDBusConnectionPrivate::ObjectTreeNode&, QDBusMessage const&, int)
      at ../../../qtbase-everywhere-src-5.15.2/src/dbus/qdbusintegrator.cpp:1497
      in /usr/lib64/libQt5DBus.so.5
    QDBusConnectionPrivate::activateObject(QDBusConnectionPrivate::ObjectTreeNode&, QDBusMessage const&, int)
      at ../../../qtbase-everywhere-src-5.15.2/src/dbus/qdbusintegrator.cpp:1461
      in /usr/lib64/libQt5DBus.so.5
    QDBusActivateObjectEvent::placeMetaCall(QObject*)
      at ../../../qtbase-everywhere-src-5.15.2/src/dbus/qdbusintegrator.cpp:1617
    QObjectPrivate::Sender::~Sender()
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qobject_p.h:212
      in /usr/lib64/libQt5Core.so.5
    QObject::event(QEvent*)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qobject.cpp:1316
    QCoreApplicationPrivate::setEventSpontaneous(QEvent*, bool)
      at /usr/include/qt5/QtCore/5.15.2/QtCore/private/qcoreapplication_p.h:119
      in /usr/lib64/libQt5Widgets.so.5
    QApplicationPrivate::notify_helper(QObject*, QEvent*)
      at ../../../qtbase-everywhere-src-5.15.2/src/widgets/kernel/qapplication.cpp:3634
    QCoreApplication::notifyInternal2(QObject*, QEvent*)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qcoreapplication.cpp:1063
      in /usr/lib64/libQt5Core.so.5
    QScopedPointerDeleter<>::cleanup(QEvent*)
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/tools/qscopedpointer.h:60
      in /usr/lib64/libQt5Core.so.5
    QScopedPointer<>::~QScopedPointer()
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/tools/qscopedpointer.h:107
    QCoreApplicationPrivate::sendPostedEvents(QObject*, int, QThreadData*)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qcoreapplication.cpp:1822
    postEventSourceDispatch
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qeventdispatcher_glib.cpp:278
      in /usr/lib64/libQt5Core.so.5
    g_main_dispatch
      at ../glib-2.68.3/glib/gmain.c:3347
      in /usr/lib64/libglib-2.0.so.0
    g_main_context_dispatch
      at ../glib-2.68.3/glib/gmain.c:4055
    g_main_context_iterate
      at ../glib-2.68.3/glib/gmain.c:4131
      in /usr/lib64/libglib-2.0.so.0
    g_main_context_iteration
      at ../glib-2.68.3/glib/gmain.c:4197
      in /usr/lib64/libglib-2.0.so.0
    QEventDispatcherGlib::processEvents(QFlags<>)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qeventdispatcher_glib.cpp:424
      in /usr/lib64/libQt5Core.so.5
    std::__atomic_base<>::load(std::memory_order) const
      at /usr/lib/gcc/x86_64-pc-linux-gnu/11.1.0/include/g++-v11/bits/atomic_base.h:481
      in /usr/lib64/libQt5Core.so.5
    int QAtomicOps<>::loadAcquire<>(std::atomic<> const&)
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/thread/qatomic_cxx11.h:251
    QBasicAtomicInteger<>::loadAcquire() const
      at ../../../qtbase-everywhere-src-5.15.2/include/QtCore/../../src/corelib/thread/qbasicatomic.h:110
    QEventLoop::exec(QFlags<>)
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qeventloop.cpp:231
    QCoreApplication::exec()
      at ../../../qtbase-everywhere-src-5.15.2/src/corelib/kernel/qcoreapplication.cpp:1371
      in /usr/lib64/libQt5Core.so.5
    main
      at ../plasma-workspace-5.22.3/shell/main.cpp:254
      in /usr/bin/plasmashell

Notice that QString::~QString() allegedly calls NotificationManager::Notification::Private::setDesktopEntry(QString const&), which leads to another call to QString::~QString(), which allegedly calls KServiceTypeTrader::query(QString const&, QString const&) const.

If we look at the actual implementation of QString::~QString(), we find:

inline QString::~QString() { if (!d->ref.deref()) Data::deallocate(d); }

The deref() call is just an atomic decrement of an integer, and Data::deallocate(d) just calls ::free(d) after making some assertions. There is no way that QString::~QString() is calling into those other functions.

You might think that there's some weird interaction between the compiler's optimizer and the debug info since QString::~QString() is an inlined function, but /usr/lib64/libnotificationmanager.so.1 was compiled with '-Og -ggdb', which is supposed to produce sensible debug info.

Another example: std::__atomic_base<>::load(std::memory_order) const supposedly makes calls to both QEventDispatcherGlib::processEvents(QFlags<>) and QString::resize(int). That's preposterous.

EXPECTED RESULT

Stack traces should not purport to contain function calls that do not appear in the source code.


SOFTWARE/OS VERSIONS
Heaptrack Version: 1.2.80 (66a7ba44c)
Linux/KDE Plasma: Gentoo Linux
KDE Plasma Version: 5.22.3
KDE Frameworks Version: 5.84.0
Qt Version: 5.15.2
GCC Version: Gentoo 11.1.0-r1 p2
Comment 1 Milian Wolff 2021-07-18 19:59:25 UTC
Hey there, thanks for your report.

Since you are using gentoo: Are the debug symbols you are using for plasmashell and qt and so forth matching the build that you profiled?

Is this easily reproducible for me somehow?

I totally agree that the traces are garbage - the question now "just" becomes figuring out why this happens for you. It could be a bug in libunwind, in the debug symbol matching, in libbacktrace or in heaptrack itself - so plenty of room for things to break apart :)
Comment 2 Matt Whitlock 2021-07-26 21:16:12 UTC
(In reply to Milian Wolff from comment #1)
> Since you are using gentoo: Are the debug symbols you are using for
> plasmashell and qt and so forth matching the build that you profiled?

Yes. The debug symbols get installed (beneath /usr/lib/debug) at the same time as the binaries are installed to the live file system by the package manager. The debug symbols are produced in the same build process that produces the executable/library binaries and are then split apart from the binaries to be filed separately. Gentoo doesn't use separate packages for debug symbols like some distros do.

> Is this easily reproducible for me somehow?

I'm not sure. Have you tried Heaptrack with GCC 11 yet? I am thinking there may be an issue with the debug info that the compiler is generating, as I am seeing impossible stack traces in GDB while interactively debugging my own programs too. What I observe is that a stack frame will show as belonging to a leaf function that has already returned, and in fact the frame that called the leaf function is not represented separately in the stack trace, but rather the frame that shows as the leaf function *is* the frame that called the leaf function, and the information about the caller of the leaf function is absent from the stack trace. It is as though GCC is *overwriting* the source location information in the current stack frame when it invokes the (presumably inlined) leaf function and then forgets to restore it when it's finished with the leaf function. That would explain why my Heaptrack output is making no sense to me and has been utterly useless for tracking down a memory leak.
Comment 3 Milian Wolff 2021-08-16 10:08:47 UTC
I think I now found a case where I get bogus backtraces too. I'll investigate and see if there's a way to fix this within heaptrack's libbacktrace

thanks
Comment 4 Milian Wolff 2021-08-16 19:22:25 UTC
Git commit 7531b8807642f4e265d57f2dd6bdcbafca85fccf by Milian Wolff.
Committed on 16/08/2021 at 18:42.
Pushed by mwolff into branch 'master'.

Fix backtrace symbolication for optimized code

For some reason, we actually get an IP address that points to the
instruction _after_ the one we need to use for symbolication. I'm
quite frankly totally flabbergasted by this breakage, as it seems
to be around for years, yet no-one ever complained? And testing this
on some of my code, I believe it used to work in the past without
this workaround... If anyone knows more about this, please chime
in over at: https://github.com/libunwind/libunwind/issues/287

For now, manually subtracting one from the addresses we get from
unwinding is enough to workaround this and get sane backtraces once
more.

M  +9    -2    src/track/libheaptrack.cpp

https://invent.kde.org/sdk/heaptrack/commit/7531b8807642f4e265d57f2dd6bdcbafca85fccf
Comment 5 Milian Wolff 2021-08-16 19:29:56 UTC
Hey Matt,

can you please try the latest master and see if you are still seeing this breakage?

Quite frankly, I'm super stumped by this breakage as I mentioned in the commit message for the workaround I implemented now... But in my tests it finally works again. I just wonder how it could ever have worked at this stage. Considering how many people have used heaptrack in the past...
Comment 6 Matt Whitlock 2021-08-17 00:00:29 UTC
(In reply to Milian Wolff from comment #5)
> can you please try the latest master and see if you are still seeing this
> breakage?

Yes, that fixed it. I no longer get nonsense traces in Heaptrack, and I actually managed to find and patch that leak in plasmashell!

> Quite frankly, I'm super stumped by this breakage as I mentioned in the
> commit message for the workaround I implemented now...

I think it should be expected that the instruction pointer addresses that are on the stack should point at the instruction just after the call instruction, as that's how the hardware works. (The call instruction pushes the address of the *next* instruction before branching to the target of the call. That saves having to re-decode the call instruction upon later return from the subroutine to determine how large the call instruction's operand is and thus where the next instruction begins.)

> I just wonder how it could ever have worked at this
> stage. Considering how many people have used heaptrack in the past...

Maybe Heaptrack's stack traces have been "good enough," or maybe libunwind used to adjust the IPs in the stack trace to point at the call instructions but stopped doing that at some point.

Technically, I am not sure that subtracting a constant 1 from the addresses is quite valid, as then you'll end up pointing at part of the call instruction's operand rather than at the call instruction's opcode, but maybe that's good enough to determine an accurate source code location for the function call.