This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[numpy] add op matmul #16990

Merged
merged 6 commits into from
Feb 13, 2020
Merged

[numpy] add op matmul #16990

merged 6 commits into from
Feb 13, 2020

Conversation

@JiangZhaoh (Contributor) commented Dec 6, 2019

Description

new op: matmul
Because matmul has a special signature (i.e. (n?,k),(k,m?)->(n?,m?)) in official NumPy, I commented out the following code segment in python/mxnet/numpy/multiarray.py:
if ufunc.signature is not None:
    # we don't support generalised-ufuncs (gufuncs)
    return NotImplemented
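
For reference, a minimal plain-NumPy sketch of what that gufunc signature means (illustrative shapes only; this is not code from the PR):

import numpy as np

# (n?,k),(k,m?)->(n?,m?): the last two axes are the matrix axes,
# all leading axes broadcast like an ordinary elementwise ufunc.
a = np.ones((2, 3, 4))          # a stack of two (3, 4) matrices
b = np.ones((4, 5))             # a single (4, 5) matrix
print(np.matmul(a, b).shape)    # (2, 3, 5)

# the '?' marks optional axes: 1-D operands are promoted to
# matrices and the temporary axis is dropped from the result.
v = np.ones(4)
print(np.matmul(a, v).shape)    # (2, 3)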

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

Review thread on src/operator/numpy/np_matmul_op-inl.h (outdated, resolved)
@haojin2 added this to In progress in numpy via automation Dec 6, 2019
@JiangZhaoh marked this pull request as ready for review December 9, 2019 02:08
@JiangZhaoh requested a review from szha as a code owner December 9, 2019 02:08
@sxjscience (Member) commented Dec 9, 2019 via email:

I think you may call the BatchedGEMMStrided kernel directly: https://devblogs.nvidia.com/cublas-strided-batched-matrix-multiply/

> In src/operator/numpy/np_matmul_op-inl.h:
>
> + * It is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.
> + * \param out - output: insert 'value' to 'arr' according to 'index'.
> + * \param a - input: the first argument.
> + * \param b - input: the second argument.
> + * \param ndim - ndim of a, b and output. Because of broadcast, regard their ndim as equal.
> + */
> + template<typename DType>
> + MSHADOW_XINLINE static void Map(int i, DType* out,
> +                                 const DType* a, const DType* b,
> +                                 const mshadow::Shape<10> a_stride,
> +                                 const mshadow::Shape<10> b_stride,
> +                                 const mshadow::Shape<10> out_stride,
> +                                 const mshadow::Shape<10> a_shape,
> +                                 const mshadow::Shape<10> b_shape,
> +                                 const mshadow::Shape<10> out_shape,
> +                                 const size_t ndim) {
>
> [sxjscience] You may refer to the implementation of batch_dot. There is no need to store the strides and shapes as Shape<10>. You can reshape the array to a 3D array and just dot the last two dimensions: https://github.com/apache/incubator-mxnet/blob/1aa1b5a9ab53bb57a3c653793fb824d01f2d5e81/src/operator/tensor/dot-inl.h#L1348-L1409
>
> [JiangZhaoh] Thanks for your advice. But may I ask how I could broadcast the shapes if I don't store strides and shapes? E.g. for matrix A of shape (2, 1, 3, 4, 5) and matrix B of shape (3, 1, 5, 2), C = np.matmul(A, B) would broadcast A and B to shapes (2, 3, 3, 4, 5) and (2, 3, 3, 5, 2) respectively, and C's shape would be (2, 3, 3, 4, 2). If I didn't store shape and stride, I think I would first have to copy the contents of each array to get a consistent shape, and then use your method. Is this what you mean?

@JiangZhaoh (Contributor, Author) commented:

May I confirm what you mean?
First, I broadcast the input arrays to a consistent shape manually by copying their contents.
Then, I use the kernel you mentioned above to implement the forward computation.
Right?
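
(For readers following along, here is a minimal plain-NumPy sketch of the approach being discussed: broadcast the leading batch dimensions, flatten them, and run one batched matrix product. It only illustrates the idea; the PR itself implements this in C++ in np_matmul_op-inl.h.)

import numpy as np

a = np.random.rand(2, 1, 3, 4, 5)
b = np.random.rand(3, 1, 5, 2)

# 1. broadcast the leading (batch) dims to a common shape
batch = np.broadcast(a[..., 0, 0], b[..., 0, 0]).shape     # (2, 3, 3)
a_b = np.broadcast_to(a, batch + a.shape[-2:])             # (2, 3, 3, 4, 5)
b_b = np.broadcast_to(b, batch + b.shape[-2:])             # (2, 3, 3, 5, 2)

# 2. flatten the batch dims and run a batched matrix product
#    (this is the step a strided/batched GEMM kernel would do)
a3 = a_b.reshape((-1,) + a.shape[-2:])                     # (18, 4, 5)
b3 = b_b.reshape((-1,) + b.shape[-2:])                     # (18, 5, 2)
c3 = np.einsum('bij,bjk->bik', a3, b3)                     # (18, 4, 2)

c = c3.reshape(batch + (a.shape[-2], b.shape[-1]))         # (2, 3, 3, 4, 2)
assert np.allclose(c, np.matmul(a, b))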

@sxjscience (Member) left a comment:

After an offline discussion with @haojin2, we think we could add transpose flags (like in batch_dot) to matmul and make it an internal operator, so that we can implement the backward pass of matmul by constructing new matmul nodes.
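
(For context, the reason this construction works: the gradients of a matrix product are themselves matrix products, so backward can be expressed with matmul nodes carrying transpose flags, much like batch_dot's transpose_a/transpose_b. A minimal NumPy sketch of the 2-D case, illustrative only:)

import numpy as np

# For C = matmul(A, B):
#   dA = matmul(dC, B^T)    i.e. a matmul node with transpose_b=True
#   dB = matmul(A^T, dC)    i.e. a matmul node with transpose_a=True
A  = np.random.rand(4, 5)
B  = np.random.rand(5, 2)
dC = np.random.rand(4, 2)       # incoming gradient w.r.t. C

dA = np.matmul(dC, B.T)         # shape (4, 5), matches A
dB = np.matmul(A.T, dC)         # shape (5, 2), matches B

# In the broadcast (stacked) case, the batch axes of dA and dB
# additionally need to be summed over any broadcast dimensions.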

numpy automation moved this from In progress to Needs review Dec 16, 2019
@haojin2 (Contributor) left a comment:

Please fix the sanity check failures and address the open review comments.

@JiangZhaoh force-pushed the add_op_matmul branch 2 times, most recently from e2e90f4 to b4bf7b4, on December 30, 2019 04:09
@haojin2 merged commit f5a1014 into apache:master Feb 13, 2020
numpy automation moved this from Needs review to Done Feb 13, 2020
zheyuye pushed a commit to zheyuye/incubator-mxnet that referenced this pull request Feb 19, 2020
* add op matmul - for interoperability error

fix interoperability error and some unimportant details

remove needless addition

* fix website error

* use BatchedGEMM

* check req type / sanity / logic etc.

* resolve indentation mistakes

* fix doc mistakes

resolve comment
anirudh2290 pushed a commit to anirudh2290/mxnet that referenced this pull request May 29, 2020
Projects: numpy (Done)
Linked issues: none yet
Participants: 3