This repository has been archived by the owner on Nov 17, 2023. It is now read-only.
[OpPerf] Fixed native output ordering, added warmup & runs command line args #17571
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
Here, I fix the ordering issue when operator perf results are generated with the native profiler. Previously, the ordering of keys in the output dict generated was arbitrary and illogical. There were cases where
avg_time_forward_
perf results were displayed afteravg_time_backward_
perf results (when running withopperf.py
), and where memory consumption was displayed between forward and backward perf results (when running categories of operators individually). Furthermore,inputs
were never displayed first in the output.To solve these problems, I traced the construction of the results dictionary back to the saving of forward/backward perf results and memory consumption data in
profiler_utils.py
. I found that the entry foravg_time_backward_
was consistently being added to the dictionary before the entry foravg_time_forward_
(since it appeared first in the raw profile results, and was consequently encountered first in the iteration). I implemented a fix that saved each result to a variable during the iteration, and added the variable values to the dict in the correct order after the iteration had finished.After making this change, I found another issue:
merge_map_list()
constructed the output map in a way that was insensitive to the order of the list of dictionaries (and the keys within those dictionaries) that it was passed. I rewrotemerge_map_list()
to be sensitive to the order of its input list, as wells as the order of the keys within the dictionaries of each list element. Then, inbenchmark_utils.py
, I passed theinput
map tomerge_map_list()
before the other entries to ensure its position as the first key in the resulting dictionary.Even after making these changes, I found that the output of
opperf.py
was still ordered incorrectly. I found the root cause of the issue: in thejson.dump
call incommon_utils.py
, thesort_keys
parameter was set toTrue
, which would sort the keys of all dictionaries in the output JSON alphabetically (overwriting my ordering of the output dictionary). I saw that the purpose of sorting the output dictionary was to display the operator keys alphabetically in the resulting JSON. To retain this alphabetical operator sorting while preserving the ordering I had established for the output of each operator, I disabled thesort_keys
parameter and sorted only the highest-level keys in the final output map inopperf.py
.When I finished making these changes, I wanted to quickly test them with
opperf.py
, but found that there was no way to specifywarmup
andruns
from the command line when calling opperf. I thought this would be a very useful feature for users testing operator performance, so I added it.Checklist
Essentials
Please feel free to remove inapplicable items for your PR.
Changes
Comments
Tested on Mac OS and Ubuntu 16.04 (on p2.16xl) with:
run_mx_misc_operators_benchmarks
)opperf.py
(full run of all ops) with JSON output.Performance Results
Full OpPerf test (CPU) - JSON format
Full OpPerf test (GPU) - JSON format
Full OpPerf test (CPU) - MD format
Full OpPerf test (GPU) - MD format
@apeforest @access2rohit @ChaiBapchya