Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Enhancements for MXTensor for custom operators #17204

Merged
merged 16 commits into from
Jan 8, 2020
Merged

Conversation

samskalicky
Copy link
Contributor

@samskalicky samskalicky commented Jan 2, 2020

Description

Enhancements to MXTensor for custom operators. Adds the following features:

  • version IDs to uniquely identify when a tensor has been modified or is unchanged between calls to operator (can save on data transfer to accelerator if tensor is unchanged)
  • function to compare MXTensors like we have for MXNet NDarray isSame API

Uprevs MX_LIBRARY_VERSION to 2 since MXTensor struct is changing

Removes opCallBkwd_t since we use opCallFCompute for both forward/backward functions (was dead code)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain the what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

@samskalicky samskalicky changed the title [WIP] enhancements for MXTensor for custom operators Enhancements for MXTensor for custom operators Jan 3, 2020
@samskalicky
Copy link
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]

@lanking520 lanking520 added the pr-awaiting-review PR is waiting for code review label Jan 3, 2020
@samskalicky
Copy link
Contributor Author

samskalicky commented Jan 3, 2020

@rondogency @mseth10 @wkcn @junrushao1994 for review

size_t ID)
: data_ptr(data_ptr), shape(shape), dtype(dtype), version(ID) {}

void update(void *dptr, MXDType type, size_t ver) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we really need this function? it doesn't have any checks, only copy pointers. I think we can copy them line by line in lib_api.h and keep MXTensor as simple as possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

per our discussion, lets move the the for loop to copy shape and call to setDLTensor inside this function. Change name to "setTensor"

: data_ptr(data_ptr), shape(shape), dtype(dtype) {}
MXTensor(void *data_ptr, const std::vector<int64_t> &shape, MXDType dtype,
size_t ID)
: data_ptr(data_ptr), shape(shape), dtype(dtype), version(ID) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it will be better to unify the naming across all places, like using verID in lib_api.h and here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@@ -277,6 +283,14 @@ struct MXTensor {
return size;
}

/*! \brief helper function to compare two MXTensors */
inline bool isSame(const MXTensor &oth) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we override operator==? since we won't support C anyway

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think operator== is confusing. For a tensor object, == usually means value comparison.
In the future, we may add other operators !=, <, >, etc.
It may be better and more consistent with MXNet NDArray API to use ‘isSame’.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comparing object is how c++ is doing for vector and usually for struct, and in NDArray we don't have operators !=, <, > either, so I don't think it is going to be confusing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I agree with @wkcn here, == should compare the values of the tensor not the "state" of the tensor (data_ptr, versionID, etc)

@mseth10 @eric-haibin-lin @haojin2 what do you guys think?

Copy link
Member

@wkcn wkcn Jan 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For C++ vector container, operator== compares the values.

#include <iostream>
#include <vector>
using namespace std;

int main() {
  vector<int> a{1,2,3};
  vector<int> b{1,2,3};
  vector<int> c{1,2,4};
  cout << (a == b) << endl; // 1
  cout << (a == c) << endl; // 0
  cout << (a.data() == b.data()) << endl; // 0
  return 0;
}

Although (a==b) is 1, a.data() is not equal to b.data().

@@ -277,6 +289,14 @@ struct MXTensor {
return size;
}

/*! \brief helper function to compare two MXTensors */
inline bool operator==(const MXTensor &oth) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will be more consistent with MXNet NDArray API to use ‘IsSame’, since operator== is confusing between same object and same tensor content.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry that I did not read the previous review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

writing a response together with @rondogency:
For clarity, here were checking the following items: dtype, version ID, shape, and the pointer address of the data in memory. If two tensors have the same data pointer then they will have the same data values also. So here the two tensors will have the same content and will have the same tensor attributes (shape, type, version ID) also.

I dont think we need to be consistent with NDarray API here, since users writing custom ops wont necessarily be familiar with NDarray API. And here we're trying to make the right API for this use-case (custom operators). Our comparison API also checks the version number, and the NDarray API does not. So we're already diverging from being consistent with NDArray API.

For the record, here is the similar NDarray API:
https://github.com/apache/incubator-mxnet/blob/55e222b8c97f99193f832f785cf210f876632add/include/mxnet/ndarray.h#L212-L217

Copy link
Member

@wkcn wkcn Jan 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank @rondogency and you for the explanation in detail!

  1. If two tensors have the same data pointer then they will have the same data values also.
    It is right. However, if two tensors have the same data values, they may not have the same data pointer. I did a simple test on C++ vector.
#include <iostream>
#include <vector>
using namespace std;

int main() {
  vector<int> a{1,2,3};
  vector<int> b{1,2,3};
  cout << (a == b) << endl; // 1
  cout << (a.data() == b.data()) << endl; // 0
  return 0;
}
  1. we don't need to be consistent with NDarray API here.
    Agree : )

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @wkcn, if @rondogency agrees ill change it back to 'isSame'

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @wkcn on this. isSame is less confusing.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok I agree with changing it back to isSame, thanks for the inspection on c++ vector!

Copy link
Member

@wkcn wkcn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thank you!

@samskalicky
Copy link
Contributor Author

@mxnet-label-bot update [pr-awaiting-merge]

@lanking520 lanking520 added pr-awaiting-merge Review and CI is complete. Ready to Merge and removed pr-awaiting-review PR is waiting for code review labels Jan 7, 2020
@samskalicky
Copy link
Contributor Author

@wkcn we're ready to merge!

Copy link
Contributor

@mseth10 mseth10 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@samskalicky
Copy link
Contributor Author

@wkcn will you merge this?

@wkcn wkcn merged commit c1f6d64 into apache:master Jan 8, 2020
@wkcn
Copy link
Member

wkcn commented Jan 8, 2020

Merged : ) Thank you!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
pr-awaiting-merge Review and CI is complete. Ready to Merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants