Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Symbol resolution with multiple executable regions per module #1022

Merged
merged 2 commits into from
Mar 4, 2017

Conversation

goldshtn
Copy link
Collaborator

@goldshtn goldshtn commented Mar 3, 2017

The symbol resolution code used to assume for most purposes that
there is a single executable region per module. When there were
several, there was no crash, but symbols were not resolved correctly.
The reason is that the symbol offsets are relative to the first
executable region's start address, but bcc would resolve them
relative to the region in which they appeared. For example, given
the following regions and spans for a module libfoo.so loaded into
some process:

  1000-2000 r-xp libfoo.so
  2000-3000 rw-p libfoo.so
  3000-4000 r-xp libfoo.so
  4000-5000 r--- libfoo.so

Now, suppose there is a symbol bar() loaded at address 3500. In
the binary on disk, bar() is at offset 2500 from the beginning of
the module (but not the beginning of the 3000-4000 region!). When
we look at the candidate regions, we find 3000-4000, and discover
that 3500 lies within it. Then we subtract 3500-3000 to find the
offset from the beginning of the region, get 500, and now look
for a symbol that contains the relative address 500. As a result,
we might find some random symbol in the region 1000-2000, and
report that address 3500 corresponds to that random symbol rather
than to bar().

This commit fixes the situation by keeping only a single Module
instance for each module, even if that module spans multiple
executable regions. We remember all executable region start and
end ranges so we can determine whether an address (like 3500 in
the above example) lies within the module. But for the purpose of
finding the actual symbol, we need only the offset from the start
of the first executable region, and then need to look up a symbol
based on that.

This was discovered and fixed while tracing .NET Core processes on
Linux, where libcoreclr.so (the main CLR binary) has several
executable regions. Resolving symbols from any but the first region
would produce totally bogus results.

Also fixed an assertion error that would trigger in debug builds.

The symbol resolution code used to assume for most purposes that
there is a single executable region per module. When there were
several, there was no crash, but symbols were not resolved correctly.
The reason is that the symbol offsets are relative to the first
executable region's start address, but bcc would resolve them
relative to the region in which they appeared. For example, given
the following regions and spans for a module libfoo.so loaded into
some process:

  1000-2000 r-xp libfoo.so
  2000-3000 rw-p libfoo.so
  3000-4000 r-xp libfoo.so
  4000-5000 r--- libfoo.so

Now, suppose there is a symbol bar() loaded at address 3500. In
the binary on disk, bar() is at offset 2500 from the beginning of
the module (but not the beginning of the 3000-4000 region!). When
we look at the candidate regions, we find 3000-4000, and discover
that 3500 lies within it. Then we subtract 3500-3000 to find the
offset from the beginning of the region, get 500, and now look
for a symbol that contains the relative address 500. As a result,
we might find some random symbol in the region 1000-2000, and
report that address 3500 corresponds to that random symbol rather
than to bar().

This commit fixes the situation by keeping only a single `Module`
instance for each module, even if that module spans multiple
executable regions. We remember all executable region start and
end ranges so we can determine whether an address (like 3500 in
the above example) lies within the module. But for the purpose of
finding the actual symbol, we need only the offset from the start
of the _first_ executable region, and then need to look up a symbol
based on that.

This was discovered and fixed while tracing .NET Core processes on
Linux, where libcoreclr.so (the main CLR binary) has several
executable regions. Resolving symbols from any but the first region
would produce totally bogus results.
@4ast 4ast merged commit c924193 into iovisor:master Mar 4, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants