Tags: DynamoRIO/dynamorio

cronbuild-10.92.19896

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
i#6822 unscheduled: Add start-unscheduled support (#6851)

Adds support for threads starting out in an "unscheduled" state. This is
accomplished by always reading ahead in each input and looking for a
TRACE_MARKER_TYPE_SYSCALL_UNSCHEDULE marker *before* the first
instruction. Normally such a marker indicates the invocation of a system
call and is after the system call instruction; for start-unscheduled
threads it is present at the system call exit at the start of the trace.
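
The readahead check described above can be sketched roughly as follows. This is a minimal illustration only, not the scheduler's actual code: `record_kind_t` and `starts_unscheduled()` are hypothetical names, and the record stream is reduced to a vector of record kinds.

```cpp
#include <vector>

// Hypothetical, simplified record kinds; the real trace format has many more.
enum class record_kind_t { TIMESTAMP, OTHER_MARKER, SYSCALL_UNSCHEDULE, INSTRUCTION };

// Returns true if an unschedule marker appears before the first instruction.
// Records past the first instruction are never examined, so an unschedule
// marker that follows a system call instruction later in the trace is ignored.
bool
starts_unscheduled(const std::vector<record_kind_t> &records)
{
    for (record_kind_t kind : records) {
        if (kind == record_kind_t::INSTRUCTION)
            return false;
        if (kind == record_kind_t::SYSCALL_UNSCHEDULE)
            return true;
    }
    return false; // Neither an instruction nor an unschedule marker was found.
}
```

The early return on the first instruction is what separates a start-unscheduled input from the normal case where the marker follows a system call instruction.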

Changes the scheduler's virtual method process_next_initial_record() to
make the booleans on finding certain markers input-and-output parameters
and moves filetype marker handling and timestamp recording into the
function. This also fixes a problem where an input's initial
next_timestamp was replaced with the 2nd timestamp if a subclass read
ahead.

The extra readahead causes complexities elsewhere which are addressed:
+ The reader caches the last cpuid to use for synthetic records on
skipping.
+ Generalizes the existing scheduler handling of readahead (the
"recorded_in_schedule" field in input_info_t) to store a count of
pre-read instructions, which will generally be either 0 or 1. Adds a new
internal interface get_instr_ordinal() to get the input reader's
instruction ordinal minus the pre-read count.
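
As a rough sketch of the ordinal adjustment, assuming a per-input struct with the two counters shown here (`input_state_t` and its field names are hypothetical; only get_instr_ordinal() is named in the commit message):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the scheduler's per-input state.
struct input_state_t {
    uint64_t reader_instr_ordinal = 0; // Instructions the reader has consumed.
    uint64_t pre_read_instrs = 0;      // Instructions read ahead but not yet passed on.

    // Ordinal as seen by the consumer: excludes instructions that were only pre-read.
    uint64_t
    get_instr_ordinal() const
    {
        assert(pre_read_instrs <= reader_instr_ordinal);
        return reader_instr_ordinal - pre_read_instrs;
    }
};
```

With a reader ordinal of 5 and one pre-read instruction, this reports 4, so readahead does not shift the position that the rest of the scheduler observes.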

Changes raw2trace's virtual function process_marker_additionally() to
process_marker() and moves all marker processing (including timestamps,
which are not markers in the raw format) there, to better support
subclasses inserting start-unscheduled markers and deciding whether to
insert new markers either before or after pre-existing markers.

Adds a scheduler test for the new feature.

Issue: #6822

cronbuild-10.92.19888

i#6745: Fix timestamp gap in delayed tracing (#6746)

Fixes a timestamp gap between the first and second timestamps in the
trace caused when -trace_after_instrs is used.

Adds a unit test that reuses the existing window_test.cpp to reproduce
the bug and fails without the fix.

Fixes: #6745

cronbuild-10.92.19881

i#6662 public traces, part 5: func_id_filter_t (#6820)

Adds a new filter: `func_id_filter_t` to record_filter, which filters
TRACE_MARKER_TYPE_FUNC_ markers based on the function ID.

The filter is enabled by `-filter_keep_func_ids` followed by a list of
integers that represent the function IDs bound to
TRACE_MARKER_TYPE_FUNC_ markers to keep in the trace.
Specifically, whenever we encounter a TRACE_MARKER_TYPE_FUNC_ID marker
whose marker value is in the list we set a per-shard flag to indicate
that all TRACE_MARKER_TYPE_FUNC_[ID | ARG | RETVAL | RETADDR] markers
related to that function ID need to be preserved. We remove the
TRACE_MARKER_TYPE_FUNC_ markers related to functions whose ID is not in
the list.
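
The per-shard flag logic described above might look roughly like the following. This is a simplified sketch, not the record_filter implementation: `marker_t`, `shard_state_t`, and `func_id_filter_sketch_t` are hypothetical names, and it assumes the flag is cleared whenever a FUNC_ID marker with an ID outside the list is seen.

```cpp
#include <cstdint>
#include <unordered_set>
#include <utility>

// Hypothetical, simplified marker kinds; only the FUNC_ family is filtered.
enum class marker_t { FUNC_ID, FUNC_ARG, FUNC_RETVAL, FUNC_RETADDR, OTHER };

// Per-shard state: whether the most recent FUNC_ID marker was in the keep list.
struct shard_state_t {
    bool keep_current_func = false;
};

class func_id_filter_sketch_t {
public:
    explicit func_id_filter_sketch_t(std::unordered_set<uint64_t> keep_ids)
        : keep_ids_(std::move(keep_ids))
    {
    }

    // Returns true if the marker should remain in the output trace.
    bool
    keep_marker(shard_state_t &shard, marker_t type, uint64_t value) const
    {
        switch (type) {
        case marker_t::FUNC_ID:
            // The value of a FUNC_ID marker is the function ID itself.
            shard.keep_current_func = keep_ids_.count(value) > 0;
            return shard.keep_current_func;
        case marker_t::FUNC_ARG:
        case marker_t::FUNC_RETVAL:
        case marker_t::FUNC_RETADDR:
            // These follow the decision made at the preceding FUNC_ID marker.
            return shard.keep_current_func;
        default: return true; // Other records are not touched by this filter.
        }
    }

private:
    std::unordered_set<uint64_t> keep_ids_;
};
```

Keeping the flag in per-shard state rather than in the filter object lets each shard track its own current function independently.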

This filter can be invoked with:
```
drrun -t drmemtrace -tool record_filter -filter_keep_func_ids 1,2,3,4 -indir path/to/input/trace -outdir path/to/output/trace
```
This preserves the TRACE_MARKER_TYPE_FUNC_ markers related to functions
with IDs 1, 2, 3, and 4, and removes the TRACE_MARKER_TYPE_FUNC_ markers
for all other ID values.

We use this filter to preserve markers related to SYS_futex functions in
the public release of traces.

Issue: #6662

cronbuild-10.92.19874

Support subclassing drmemtrace syscall_mix data (#6834)

Adds a virtual destructor to the drmemtrace tool
syscall_mix_t::shard_data_t, to support subclassing that struct
for extended usage such as tracking callstacks for each syscall.
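
To illustrate why the virtual destructor matters for subclassing, here is a minimal sketch. The types are hypothetical stand-ins rather than the real syscall_mix_t::shard_data_t, and the `destroyed_` pointer exists only to make the destructor call observable.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the base per-shard struct.
struct shard_data_sketch_t {
    virtual ~shard_data_sketch_t() = default; // Enables safe deletion via a base pointer.
    int syscall_count = 0;
};

// A subclass that owns extra per-shard data, such as callstacks per syscall.
struct callstack_shard_data_t : public shard_data_sketch_t {
    explicit callstack_shard_data_t(bool *destroyed)
        : destroyed_(destroyed)
    {
    }
    ~callstack_shard_data_t() override
    {
        *destroyed_ = true;
    }
    std::vector<std::string> callstacks;
    bool *destroyed_; // Records that the subclass destructor ran.
};

// Destroys shard data through a base pointer, as generic tool code would.
inline void
release_shard(std::unique_ptr<shard_data_sketch_t> shard)
{
    shard.reset();
}
```

Without the virtual destructor in the base struct, deleting a subclass object through a base pointer is undefined behavior, and members such as `callstacks` might never be freed.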

cronbuild-10.92.19865

i#6814: Fix stack overflow on signal delivery to mid-detach thread (#6815)

Fixes two stack overflow scenarios that occur when DR delivers an app
signal to the native signal handler for a thread that is mid-detach.

First case: when a thread is handling the suspend signal and is waiting
for the detacher thread to wake it up and tell it to continue detaching.
Currently, DR unblocks signals before starting the wait. If a signal is
delivered at this point, execute_native_handler() incorrectly delivers
it to the native handler on DR's own signal stack. To fix this, we no
longer unblock signals during this wait, since doing so complicates
native signal delivery, both here and in the second case described
below.

Additionally, for a detaching thread, we now do not explicitly
restore the app's sigblocked mask; DR already restores the mask on the
signal frame, which would be restored automatically when the thread
returns from the DR detach signal handler. This avoids another case
where the app may be on DR's signal stack when the native signal is
delivered.

Second case: when the thread has been woken up by the detacher thread,
executed sig_detach, and reinstated the app signal stack (if available). If
the signal is delivered at this point, execute_native_handler() adds a new
signal frame on top of DR's own signal frame on the app stack and invokes
the native signal handler. This sometimes ends up taking too much stack
space which causes a stack overflow, as observed on an internal app with
frequent profiling signals that use the stack-intensive libunwind to get a
stack trace for all threads. To fix this: we reuse the same signal frame
for delivering the signal to the native signal handler, when the app
doesn't need a non-RT frame.

The new code is exercised by the existing detach_signal test. Also
modified the test to have some threads that have a very small sigstack,
which helps reproduce the crash originally seen on a real app. (There
was already a note in detach_signal test about using a large sigstack to
avoid this stack overflow.)

Tested on an internal app where failures reduce from ~136/4000 to
~1/4000.

Issue: #6814

cronbuild-10.91.19860

i#6662 public traces, part 4: view tool (#6816)

Modifies the view tool to handle OFFLINE_FILE_TYPE_ARCH_REGDEPS traces,
leveraging the disassembly of DR_ISA_REGDEPS instructions.
When visualizing DR_ISA_REGDEPS instructions, the view tool still prints
the instruction length and PC, which for OFFLINE_FILE_TYPE_ARCH_REGDEPS
traces are the same as those in the original trace.
Then, after the PC, the instruction encoding, categories, operation
size, and registers are printed following the disassembly format of
DR_ISA_REGDEPS instructions (xref: #6799).

DR_ISA_REGDEPS instructions printed by the view tool look as follows:
```
[...] ifetch      10 byte(s) @ 0x00007f86ef03d107 00001931 04020204 load store [4byte]       %rv0 %rv2 %rv36 -> %rv0
[...]                                             00000026
```

We also fix a formatting bug in DR_ISA_REGDEPS instruction disassembly,
where we were missing a new line when the instruction encoding spills
into a second line.

Issue: #6662

cronbuild-10.91.19853

i#3544 RV64: Optimize private memcpy and memset (#6800)

1. Optimize private memcpy and memset for RV64.
2. Add test to compare private and libc memset.
3. Compare private memcpy with libc memcpy on more small sizes.
4. Fix a bug in core/CMakeLists.txt: to compare private and libc memcpy,
unit_tests should link to drmemfuncs rather than rely on libc for the
private versions.

The results below compare the original private memcpy and memset, the
optimized private memcpy and memset, and the glibc memcpy and memset.

Test command:
```
./bin64/unit_tests
```

With the original memcpy and memset, the output is:
```
our_memcpy_time: size=1 time=0
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=2
libc_memcpy_time: size=4 time=2
our_memcpy_time: size=128 time=16
libc_memcpy_time: size=128 time=4
our_memcpy_time: size=512 time=57
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=824
libc_memcpy_time: size=8192 time=79
our_memcpy_time: size=20480 time=2080
libc_memcpy_time: size=20480 time=183
our_memset_time: 4129
libc_memset_time: 292
io all done
testing string
done testing string
```

With the optimized memcpy and memset, the output is:
```
our_memcpy_time: size=1 time=1
libc_memcpy_time: size=1 time=2
our_memcpy_time: size=4 time=1
libc_memcpy_time: size=4 time=3
our_memcpy_time: size=128 time=2
libc_memcpy_time: size=128 time=3
our_memcpy_time: size=512 time=7
libc_memcpy_time: size=512 time=7
our_memcpy_time: size=8192 time=72
libc_memcpy_time: size=8192 time=69
our_memcpy_time: size=20480 time=184
libc_memcpy_time: size=20480 time=175
our_memset_time: 307
libc_memset_time: 302
io all done
testing string
done testing string
```

Issue: #3544

cronbuild-10.90.19845

i#5036 Update scatter/gather docs with AArch64 details (#6795)

Adds details of the AArch64 scatter/gather expansion to the
scatter/gather expansion developer documentation.

Issue: #5036

cronbuild-10.90.19838

i#6426 sched stats: Print list of threads (#6791)

Adds the list of threads per cpu to the schedule_stats output. Updates
the schedule_stats tests to confirm some tids are printed.

Issue: #6426

cronbuild-10.90.19831

Update incorrect docs about scheduler window ids (#6781)

Updates the drmemtrace scheduler regions_of_interest docs which
incorrectly stated the window id markers were not inserted between
back-to-back regions: they are inserted, as the code confirms (with an
explicit comment) and the unit tests check.