Tags · DynamoRIO/dynamorio

cronbuild-10.92.19896

i#6822 unscheduled: Add start-unscheduled support (#6851)

Adds support for threads starting out in an "unscheduled" state. This is
accomplished by always reading ahead in each input and looking for a
TRACE_MARKER_TYPE_SYSCALL_UNSCHEDULE marker *before* the first
instruction. Normally such a marker indicates the invocation of a system
call and is after the system call instruction; for start-unscheduled
threads it is present at the system call exit at the start of the trace.

Changes the scheduler's virtual method process_next_initial_record() to
make the booleans on finding certain markers input-and-output parameters
and moves filetype marker handling and timestamp recording into the
function. This also fixes a problem where an input's initial
next_timestamp was replaced with the 2nd timestamp if a subclass read
ahead.

The extra readahead causes complexities elsewhere which are addressed:
+ The reader caches the last cpuid to use for synthetic records on
skipping.
+ Generalizes the existing scheduler handling of readahead (the
"recorded_in_schedule" field in input_info_t) to store a count of
pre-read instructions, which will generally be either 0 or 1. Adds a new
internal interface get_instr_ordinal() to get the input reader's
instruction ordinal minus the pre-read count.
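
The pre-read accounting described above can be illustrated with a minimal sketch. The struct and function names below are hypothetical stand-ins, not the scheduler's actual definitions; only the idea (reader ordinal minus pre-read count) comes from the description.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the scheduler's per-input state; the real
// input_info_t has many more fields and may use different names.
struct input_sketch_t {
    uint64_t reader_instr_ordinal = 0; // Instructions the reader has consumed.
    uint64_t pre_read_instrs = 0;      // Read ahead but not yet handed out (0 or 1).
};

// Mirrors the idea behind get_instr_ordinal(): the ordinal visible to the
// rest of the scheduler excludes instructions that were only pre-read.
inline uint64_t
get_instr_ordinal_sketch(const input_sketch_t &input)
{
    assert(input.pre_read_instrs <= input.reader_instr_ordinal);
    return input.reader_instr_ordinal - input.pre_read_instrs;
}
```

With one instruction pre-read while looking for an initial marker, the reader's ordinal is 1 but the sketch reports 0, so the input still appears to be at its start.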

Changes raw2trace's virtual function process_marker_additionally() to
process_marker() and moves all marker processing (including timestamps,
which are not markers in the raw format) there, to better support
subclasses inserting start-unscheduled markers and deciding whether to
insert new markers either before or after pre-existing markers.

Adds a scheduler test for the new feature.

Issue: #6822

Jun 22, 2024
304faf9

cronbuild-10.92.19888

i#6745: Fix timestamp gap in delayed tracing (#6746)

Fixes a timestamp gap between the first and second timestamps in the
trace caused when -trace_after_instrs is used.

Adds a unit test that reuses the existing window_test.cpp to reproduce
the bug and fails without the fix.

Fixes: #6745

Jun 14, 2024
3af401b

cronbuild-10.92.19881

i#6662 public traces, part 5: func_id_filter_t (#6820)

Adds a new filter, `func_id_filter_t`, to record_filter, which filters
TRACE_MARKER_TYPE_FUNC_ markers based on the function ID.

The filter is enabled by `-filter_keep_func_ids` followed by a list of
integers that represent the function IDs bound to
TRACE_MARKER_TYPE_FUNC_ markers to keep in the trace.
Specifically, whenever we encounter a TRACE_MARKER_TYPE_FUNC_ID marker
whose marker value is in the list we set a per-shard flag to indicate
that all TRACE_MARKER_TYPE_FUNC_[ID | ARG | RETVAL | RETADDR] markers
related to that function ID need to be preserved. We remove the
TRACE_MARKER_TYPE_FUNC_ markers related to functions whose ID is not in
the list.
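
The per-shard flag logic described above can be sketched as follows. This is an illustration under assumptions, not the actual `func_id_filter_t` code: the enum, structs, and `keep_marker` function are hypothetical, and the sketch assumes that markers unrelated to functions pass through unchanged.

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_set>

// Hypothetical marker kinds standing in for TRACE_MARKER_TYPE_FUNC_ID,
// _ARG, _RETVAL, and _RETADDR; OTHER covers every unrelated marker.
enum class marker_kind_t { FUNC_ID, FUNC_ARG, FUNC_RETVAL, FUNC_RETADDR, OTHER };

struct marker_t {
    marker_kind_t kind;
    uint64_t value;
};

// Per-shard state: whether the most recent FUNC_ID marker was in the keep list.
struct shard_state_t {
    bool keep_current_func = false;
};

// Returns true if the marker should stay in the output trace.
inline bool
keep_marker(const std::unordered_set<uint64_t> &keep_ids, shard_state_t &shard,
            const marker_t &marker)
{
    switch (marker.kind) {
    case marker_kind_t::FUNC_ID:
        // A new function ID decides the fate of the markers that follow it.
        shard.keep_current_func = keep_ids.count(marker.value) > 0;
        return shard.keep_current_func;
    case marker_kind_t::FUNC_ARG:
    case marker_kind_t::FUNC_RETVAL:
    case marker_kind_t::FUNC_RETADDR:
        return shard.keep_current_func;
    default:
        // Assumption: non-function markers are not touched by this filter.
        return true;
    }
}
```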

This filter can be invoked with:
```
drrun -t drmemtrace -tool record_filter -filter_keep_func_ids 1,2,3,4 -indir path/to/input/trace -outdir path/to/output/trace
```
This preserves the TRACE_MARKER_TYPE_FUNC_ markers related to functions
with IDs 1, 2, 3, and 4, and removes the TRACE_MARKER_TYPE_FUNC_ markers
for all other ID values.

We use this filter to preserve markers related to SYS_futex functions in
the public release of traces.

Issue: #6662

Jun 7, 2024
d25d160

cronbuild-10.92.19874

Support subclassing drmemtrace syscall_mix data (#6834)

Adds a virtual destructor to the drmemtrace tool
syscall_mix_t::shard_data_t, to support subclassing that struct
for extended usage such as tracking callstacks for each syscall.
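
A minimal sketch of why the virtual destructor matters: when a subclass object is owned through a base-class pointer, its destructor only runs if the base destructor is virtual. The type names below are hypothetical and do not reproduce the real `syscall_mix_t::shard_data_t` layout.

```cpp
#include <cassert>
#include <memory>

// Counts subclass destructor calls so the effect is observable.
static int destroyed_callstacks = 0;

// Hypothetical stand-in for the tool's per-shard data struct.
struct shard_data_sketch_t {
    virtual ~shard_data_sketch_t() = default; // Enables safe subclassing.
    long syscall_count = 0;
};

// Hypothetical subclass that would hold extra per-syscall callstack data.
struct callstack_shard_data_t : shard_data_sketch_t {
    ~callstack_shard_data_t() override { ++destroyed_callstacks; }
};

// The owner only knows about the base type, as a generic tool would.
inline std::unique_ptr<shard_data_sketch_t>
make_shard_data()
{
    return std::unique_ptr<shard_data_sketch_t>(new callstack_shard_data_t());
}
```

Without the `virtual` keyword on the base destructor, deleting the object through the base pointer would be undefined behavior and the subclass cleanup might never run.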

May 31, 2024
8419c47

cronbuild-10.92.19865

i#6814: Fix stack overflow on signal delivery to mid-detach thread (#6815)

Fixes two stack overflow scenarios that occur when DR delivers an app
signal to the native signal handler for a thread that is mid-detach.

First case: a thread is handling the suspend signal and is waiting for
the detacher thread to wake it up and tell it to continue detaching.
Currently, DR unblocks signals before starting the wait. If a signal is
delivered at this point, execute_native_handler() incorrectly delivers
it to the native handler on DR's own signal stack. To fix this, we no
longer unblock signals during this wait, as doing so complicates native
signal delivery, including in the second case described below.

Additionally, for a detaching thread, we now do not explicitly
restore the app's sigblocked mask; DR already restores the mask on the
signal frame, which would be restored automatically when the thread
returns from the DR detach signal handler. This avoids another case
where the app may be on DR's signal stack when the native signal is
delivered.

Second case: when the thread has been woken up by the detacher thread,
executed sig_detach, and reinstated the app signal stack (if available). If
the signal is delivered at this point, execute_native_handler() adds a new
signal frame on top of DR's own signal frame on the app stack and invokes
the native signal handler. This sometimes ends up taking too much stack
space which causes a stack overflow, as observed on an internal app with
frequent profiling signals that use the stack-intensive libunwind to get a
stack trace for all threads. To fix this: we reuse the same signal frame
for delivering the signal to the native signal handler, when the app
doesn't need a non-RT frame.

The new code is exercised by the existing detach_signal test. Also
modified the test to have some threads that have a very small sigstack,
which helps reproduce the crash originally seen on a real app. (There
was already a note in detach_signal test about using a large sigstack to
avoid this stack overflow.)

Tested on an internal app, where the failure rate dropped from ~136/4000
to ~1/4000.

Issue: #6814

May 22, 2024
427e33e

cronbuild-10.91.19860

i#6662 public traces, part 4: view tool (#6816)

Modifies the view tool to handle OFFLINE_FILE_TYPE_ARCH_REGDEPS traces,
leveraging the disassembly of DR_ISA_REGDEPS instructions.
When visualizing DR_ISA_REGDEPS instructions, the view tool still prints
the instruction length and PC, which for OFFLINE_FILE_TYPE_ARCH_REGDEPS
traces are the same as those in the original trace.
Then, after the PC, the instruction encoding, categories, operation
size, and registers are printed following the disassembly format of
DR_ISA_REGDEPS instructions (xref: #6799).

DR_ISA_REGDEPS instructions printed by the view tool look as follows:
```
[...] ifetch      10 byte(s) @ 0x00007f86ef03d107 00001931 04020204 load store [4byte]       %rv0 %rv2 %rv36 -> %rv0
[...]                                             00000026
```

We also fix a formatting bug in DR_ISA_REGDEPS instruction disassembly,
where we were missing a new line when the instruction encoding spills
into a second line.

Issue: #6662

May 17, 2024
7db4ca9

cronbuild-10.91.19853

i#3544 RV64: Optimize private memcpy and memset (#6800)

1. Optimize private memcpy and memset for RV64.
2. Add test to compare private and libc memset.
3. Compare private memcpy with libc memcpy on more small sizes.
4. Fix a bug in core/CMakeLists.txt. For unit_tests to compare private
and libc memcpy, unit_tests should link to drmemfuncs and not to libc.

The results below compare the original private memcpy and memset, the
optimized private memcpy and memset, and the glibc memcpy and memset.

Test command:
```
./bin64/unit_tests
```

With the original memcpy and memset, the output is:
```
our_memcpy_time: size=1 time=0
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=2
libc_memcpy_time: size=4 time=2
our_memcpy_time: size=128 time=16
libc_memcpy_time: size=128 time=4
our_memcpy_time: size=512 time=57
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=824
libc_memcpy_time: size=8192 time=79
our_memcpy_time: size=20480 time=2080
libc_memcpy_time: size=20480 time=183
our_memset_time: 4129
libc_memset_time: 292
io all done
testing string
done testing string
```

With the optimized memcpy and memset, the output is:
```
our_memcpy_time: size=1 time=1
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=1
libc_memcpy_time: size=4 time=3
our_memcpy_time: size=128 time=2
libc_memcpy_time: size=128 time=3
our_memcpy_time: size=512 time=7
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=72
libc_memcpy_time: size=8192 time=69
our_memcpy_time: size=20480 time=184
libc_memcpy_time: size=20480 time=175
our_memset_time: 307
libc_memset_time: 302
io all done
testing string
done testing string
```
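
To put the two runs above side by side, the improvement can be expressed as a ratio of the private routine's time before and after the optimization. The helper below is a hypothetical illustration, not part of the unit test. The time units are not stated in the output, and each figure comes from a single run, so the ratios are only rough indications.

```cpp
#include <cassert>

// Ratio of the time before the optimization to the time after it.
// Values above 1.0 mean the optimized routine was faster.
inline double
speedup(double time_before, double time_after)
{
    assert(time_after > 0.0);
    return time_before / time_after;
}
```

For the 20480-byte memcpy case, `speedup(2080, 184)` is roughly 11.3, and for memset, `speedup(4129, 307)` is roughly 13.4, which brings the private routines close to the libc timings reported in the same run.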

Issue: #3544

May 10, 2024
ef1cd6f

cronbuild-10.90.19845

i#5036 Update scatter/gather docs with AArch64 details (#6795)

Adds details of the AArch64 scatter/gather expansion to the
scatter/gather expansion developer documentation.

Issue: #5036

May 2, 2024
3e1ec2f

cronbuild-10.90.19838

i#6426 sched stats: Print list of threads (#6791)

Adds the list of threads per cpu to the schedule_stats output. Updates
the schedule_stats tests to confirm some tids are printed.

Issue: #6426

Apr 25, 2024
3c22f53

cronbuild-10.90.19831

Update incorrect docs about scheduler window ids (#6781)

Updates the drmemtrace scheduler regions_of_interest docs which
incorrectly stated the window id markers were not inserted between
back-to-back regions: they are inserted, as the code confirms (with an
explicit comment) and the unit tests check.

Apr 18, 2024
138a781
