This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[OpPerf] Implement remaining random sampling ops #17502

Merged

Conversation

connorgoggins
Contributor

Description

This PR implements the remaining operators from the Random Sampling category (BilinearSampler, GridGenerator, and sample_multinomial) in opperf. To achieve this, I added a list in op_registry_utils.py to track ops requiring custom_data (the same strategy @ChaiBapchya used) and added the new random sampling ops to that list, which allowed me to add custom data dimensions for each new op in default_params.py. I then added these new ops to the comments in random_sampling_operators.py.
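
A minimal sketch of the pattern, for reference (CUSTOM_DATA_OPS and the shape values are illustrative placeholders; only DEFAULT_GRID and DEFAULT_DATA_BILINEAR appear verbatim in the diff below):

# op_registry_utils.py: flag ops whose inputs need custom shapes rather
# than the standard default data shape.
CUSTOM_DATA_OPS = ["BilinearSampler", "GridGenerator", "sample_multinomial"]

# default_params.py: per-op default inputs, keyed by parameter name.
DEFAULT_DATA_BILINEAR = [(32, 2, 256, 256)]  # (batch, channel, height, width)
DEFAULT_GRID = [(32, 2, 256, 256)]           # (batch, 2, out_height, out_width)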

Furthermore, I investigated the sample_multinomial op, a Random Sampling op that had previously been added to the unique_ops tuple in op_registry_utils.py; including it there effectively removed the op from opperf. While the documentation claimed the op only works when all values on the last axis sum to 1, I tested it extensively and found that it does in fact work on random data. With this in mind, I added the op to opperf and assigned its data parameter lower-dimensional data than the standard data shape to facilitate quick testing (on the standard data shape, the op took several minutes to complete a single forward pass).
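
A minimal repro of that check (the shapes here are illustrative, not the exact dimensions used in the PR):

import mxnet as mx

# Per the testing described above, sample_multinomial runs on random data
# even though the docs say values on the last axis must sum to 1.
data = mx.nd.random.uniform(0, 1, shape=(32, 64))
samples = mx.nd.sample_multinomial(data)
print(samples.shape)  # one sampled index per row: (32,)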

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage
  • Code is well-documented
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • M benchmark/opperf/nd_operations/random_sampling_operators.py
  • M benchmark/opperf/utils/op_registry_utils.py
  • M benchmark/opperf/rules/default_params.py

Comments

Tested on a p2.16xlarge (Ubuntu 16.04) and on macOS with:

  1. Checked out the branch and called run_mx_random_sampling_operators_benchmarks, which runs all Random Sampling ops on relevant data (see the sketch after this list)
  2. Checked out the branch and ran opperf.py (a full run of all ops)
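
A rough sketch of step 1 (the module path follows the files changed above; the ctx keyword is an assumption, not confirmed by this PR):

import mxnet as mx
from benchmark.opperf.nd_operations.random_sampling_operators import run_mx_random_sampling_operators_benchmarks

# Benchmark every Random Sampling op on its default (or custom) input data
# and print the collected timings.
results = run_mx_random_sampling_operators_benchmarks(ctx=mx.cpu())
print(results)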

Performance Results

Group of operator test - all Random Sampling ops (CPU)
Full OpPerf test (CPU)

Group of operator test - all Random Sampling ops (GPU)
Full OpPerf test (GPU)

@apeforest @access2rohit @ChaiBapchya

@connorgoggins
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]

@lanking520 added the pr-awaiting-review label on Feb 1, 2020
"args": DEFAULT_ARGS}
"args": DEFAULT_ARGS,
"grid": DEFAULT_GRID,
"data_bilinearsampler": DEFAULT_DATA_BILINEAR,
Contributor

These seem very operator-specific. If these arguments change in MXNet 2.0, does that break this opperf utility? Is there a way to generalize these op-specific params, like passing them in as a general args param?

Contributor Author

If these arguments change (in terms of data type or arg name), the opperf utility will fail, as is the case for all of the more general args used in this file. I could use a general args param to pass in the op-specific arguments and establish a mapping between keywords and values on the fly, which might be more resilient to potential changes in arg structure (such as arg order or name changes); a rough sketch follows below. @ChaiBapchya, any thoughts?
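
A rough sketch of that idea (every name below is hypothetical, not the actual opperf API):

# Hypothetical layout: op-specific inputs live in one generic mapping and
# are resolved per op at call time, so an arg rename only touches this dict.
OP_SPECIFIC_ARGS = {
    "BilinearSampler": {"data": (32, 2, 256, 256), "grid": (32, 2, 256, 256)},
    "GridGenerator": {"data": (32, 6), "transform_type": "affine"},
}

def resolve_args(op_name, general_args):
    # Merge the general defaults with any op-specific overrides.
    args = dict(general_args)
    args.update(OP_SPECIFIC_ARGS.get(op_name, {}))
    return args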

Contributor

As discussed offline, @apeforest: this should be covered as part of #17515, where changes made to ops shouldn't break OpPerf.

@ChaiBapchya
Contributor

That's the incorrect way to rebase, @connorgoggins.

First, get your master branch updated:

git remote add upstream https://github.com/apache/incubator-mxnet
git checkout master
git fetch upstream master
git merge upstream/master
git push origin master

Once your fork's master is synced with the upstream master, rebase your branch on master:

git checkout <branchname>
git rebase master
git push origin <branchname>

Let me know if this works!

@connorgoggins force-pushed the implement_random_sampling_ops branch 3 times, most recently from 0df02c2 to 6e1a0f6, on February 5, 2020 19:50
  for op_name, op_params in mx_operators.items():
-     if op_name in optimizer_ops and op_name not in unique_ops:
+     if op_name in optimizer_ops:
          optimizer_mx_operators[op_name] = mx_operators[op_name]
Contributor

Wait, doesn't this give an indentation error?

Contributor Author

Didn't throw an error, but fixed regardless.

Contributor

@ChaiBapchya left a comment

LGTM!

@apeforest merged commit 0392514 into apache:master on Feb 6, 2020
zheyuye pushed a commit to zheyuye/incubator-mxnet that referenced this pull request Feb 19, 2020
* Added support for remaining random sampling ops, removed exception for sample_multinomial as it does work with random data

* Dropped unused unique_ops variable

* Added affine transform, dropped parentheses, changed 19 to 18

* Dropped unique_ops condition - no longer in use

* Fixed indentation

* Dropped unique_ops
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 29, 2020
* Added support for remaining random sampling ops, removed exception for sample_multinomial as it does work with random data

* Dropped unused unique_ops variable

* Added affine transform, dropped parentheses, changed 19 to 18

* Dropped unique_ops condition - no longer in use

* Fixed indentation

* Dropped unique_ops