This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

[MXNET-1401] adding more operators to test support for Large Tensor #14944

Merged: 1 commit merged into apache:master on May 22, 2019

Conversation

access2rohit (Contributor) commented May 14, 2019

Description

Tests to check large tensor support for the following operators: tostype, split, argmin, tile, take, and diag.
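
For context, the new tests follow the same pattern as the existing large-array tests: build an NDArray sized with the LARGE_X and SMALL_Y constants, apply the operator, and assert on the shape and a few element values. Below is a minimal sketch of that pattern for the tile operator; the constant values and assertions are illustrative, and the tests actually added in this PR may differ.

import mxnet as mx
from mxnet import nd

LARGE_X = 100000000  # illustrative value; the test file defines its own constant
SMALL_Y = 50

def test_tile_sketch():
    # column vector 0 .. LARGE_X-1, tiled out to a large 2-D tensor
    a = nd.arange(0, LARGE_X).reshape(LARGE_X, 1)
    b = nd.tile(a, reps=(1, SMALL_Y))
    assert b.shape == (LARGE_X, SMALL_Y)
    assert b[0][-1] == 0  # the first row tiles the value 0 across all columns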

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-1401], where MXNET-1401 refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on the test set, and a reference to the original paper if applicable
  • Check the API doc at http://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

access2rohit (Contributor Author) commented:

@mxnet-label-bot add [pr-work-in-progress]

@marcoabreu marcoabreu added the pr-work-in-progress PR is still work in progress label May 14, 2019
@access2rohit access2rohit changed the title [WIP]adding more operators to test support for Large Tensor [MXNET-1401] adding more operators to test support for Large Tensor May 14, 2019
access2rohit (Contributor Author) commented:

@mxnet-label-bot add [pr-awaiting-review]

@marcoabreu marcoabreu added the pr-awaiting-review PR is waiting for code review label May 14, 2019
access2rohit (Contributor Author) commented:

@apeforest please review

@@ -28,7 +28,6 @@
SMALL_Y = 50
LARGE_SIZE = LARGE_X * SMALL_Y


Contributor commented:

keep it. It's Pep8 style

access2rohit (Contributor Author) commented May 14, 2019:

Sure! Are you referring to this: https://www.python.org/dev/peps/pep-0008/#blank-lines
Please verify.

Contributor commented:

yes

@@ -37,26 +36,31 @@ def test_gluon_embedding():
assert b.shape == (MEDIUM_X, SMALL_Y, MEDIUM_X)
assert b.asnumpy().size == LARGE_SIZE


Contributor commented:

keep it. It's Pep8 style

def test_ndarray_zeros():
a = nd.zeros(shape=(LARGE_X, SMALL_Y))
assert a[-1][0] == 0
assert a.shape == (LARGE_X, SMALL_Y)
assert a.size == LARGE_SIZE


Contributor commented:

keep it. It's Pep8 style



def test_argmin():
a = nd.arange(0, LARGE_X).reshape(LARGE_X, 1)

Contributor commented:

Why do we need to broadcast? Can we just create a super long array and perform argmin?
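
One reading of this question: argmin could be exercised on a single long 1-D array with no broadcast step. A hedged sketch of that alternative (not necessarily the code that was merged), assuming LARGE_X and the mxnet.ndarray import from the test module:

def test_argmin_1d_sketch():
    a = nd.arange(0, LARGE_X)        # 1-D array with values 0 .. LARGE_X-1
    idx = nd.argmin(a, axis=0)
    assert idx.asscalar() == 0       # the minimum sits at index 0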


def test_take():
a = nd.ones(shape=(LARGE_X, SMALL_Y))
idx = nd.arange(LARGE_X-1000, LARGE_X)

apeforest (Contributor) commented May 15, 2019:

add space between LARGE_X and -
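
For reference, a possible completed form of the take test with the spacing nit applied; the assertions here are illustrative, not necessarily the ones that were merged:

def test_take_sketch():
    a = nd.ones(shape=(LARGE_X, SMALL_Y))
    idx = nd.arange(LARGE_X - 1000, LARGE_X)   # last 1000 row indices
    res = nd.take(a, idx)                      # gathers rows along axis 0
    assert res.shape == (1000, SMALL_Y)
    assert res[-1][-1] == 1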

@@ -217,6 +260,33 @@ def numpy_space_to_depth(x, blocksize):
output = mx.nd.space_to_depth(data, 2)
assert_almost_equal(output.asnumpy(), expected, atol=1e-3, rtol=1e-3)


def test_diag():
h = np.random.randint(2,9)

Contributor commented:

why use randint? Can we just use deterministic values to test diag?

access2rohit (Contributor Author) commented:

@apeforest: using random values is better since we are checking the diagonal. Having different values ensures the operation is actually performed correctly; checking just the shape would be fine, but comparing the values gives full confidence in correctness.
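
A hedged sketch of the random-value check under discussion: extract a diagonal with mx.nd.diag and compare it element-wise against np.diag on the same data. The names and the k range here are illustrative, not the exact merged test:

import numpy as np
import mxnet as mx
from mxnet.test_utils import assert_almost_equal

def test_diag_sketch():
    a_np = np.random.random((LARGE_X, 64)).astype(np.float32)
    a = mx.nd.array(a_np)
    k = np.random.randint(-60, 60)   # a random, valid diagonal offset
    r = mx.nd.diag(a, k=k)
    assert_almost_equal(r.asnumpy(), np.diag(a_np, k=k))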

@access2rohit access2rohit force-pushed the master branch 3 times, most recently from 26f85d3 to 41df5d6 on May 21, 2019 18:03
a = nd.arange(0, LARGE_X * SMALL_Y).reshape(LARGE_X, SMALL_Y)
outs = nd.split(a, num_outputs=SMALL_Y, axis=1)
sum = 0
for i, out in enumerate(outs):

apeforest (Contributor) commented May 21, 2019:

You may make this more pythonic:

Suggested change:
-    for i, out in enumerate(outs):
+    result = sum(1 for i, v in enumerate(outs) if i == v)

You also need to convert outs into outs.asnumpy() first in the line above.

access2rohit (Contributor Author) commented:

ok
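
A hedged sketch of one way the split check could end up looking; the merged version may instead use the reviewer's enumerate-based one-liner. The sketch also avoids shadowing the built-in sum with a variable:

def test_split_sketch():
    a = nd.arange(0, LARGE_X * SMALL_Y).reshape(LARGE_X, SMALL_Y)
    outs = nd.split(a, num_outputs=SMALL_Y, axis=1)
    # count the output slices that have the expected column shape
    result = sum(1 for out in outs if out.shape == (LARGE_X, 1))
    assert result == SMALL_Y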

def test_diag():
h = np.random.randint(2,9)
w = np.random.randint(2,9)
a_np = np.random.random((LARGE_X, 64)).astype(np.float32)

Contributor commented:

Why do you need to convert to np.float32 explicitly here?

access2rohit (Contributor Author) commented:

@apeforest: np.random.random((LARGE_X, 64)) produces float64 by default:

x = np.random.random((50000000, 64))
x
array([[0.77259927, 0.17861939, 0.73446367, ..., 0.61712618, 0.00411382,
0.80232583],
[0.99131563, 0.29269588, 0.34078989, ..., 0.14693316, 0.68376766,
0.13507782],
[0.26176515, 0.54208053, 0.42594753, ..., 0.76032471, 0.30179728,
0.83745653],
...,
[0.45731445, 0.84679834, 0.02738181, ..., 0.27567623, 0.94721731,
0.77850831],
[0.65551258, 0.92503922, 0.43253718, ..., 0.23874736, 0.55155928,
0.33177961],
[0.95565209, 0.86454407, 0.75257631, ..., 0.31738383, 0.90281116,
0.93569039]])
x.dtype
dtype('float64')

I am trying to keep it consistent with: https://github.com/apache/incubator-mxnet/blob/f680255fbff25818ed3eee0d4541d80b3c7b9d9d/tests/python/unittest/test_operator.py#L8029
So, I think float32 is required here.

access2rohit (Contributor Author) commented May 21, 2019:

ok. mx.nd.array(a_np) is float32 by default:

y = mx.nd.array(x)
y.dtype
<class 'numpy.float32'>

I will remove astype('float32') from the next line.
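
A small sketch of the point being made: mx.nd.array casts a NumPy float64 input to float32 by default, so the explicit astype is not needed for the MXNet side of the comparison (the shape here is small purely for illustration):

import numpy as np
import mxnet as mx

a_np = np.random.random((1000, 64))   # float64 on the NumPy side
a = mx.nd.array(a_np)                 # default dtype for a NumPy input is float32
assert a_np.dtype == np.float64
assert a.dtype == np.float32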

assert_almost_equal(r.asnumpy(), np.diag(a_np, k=k))

# random k
k = np.random.randint(-min(LARGE_X,64) + 1, min(h,w))

apeforest (Contributor) commented May 21, 2019:

nit: add space after comma

access2rohit (Contributor Author) commented:

ok

assert_almost_equal(r.asnumpy(), np.diag(a_np, k=k))

# random k
k = np.random.randint(-min(LARGE_X, 64) + 1, min(h,w))

Contributor commented:

nit: same here, between h and w

apeforest (Contributor) left a review:

LGTM

@apeforest apeforest merged commit e5316b1 into apache:master May 22, 2019
haohuanw pushed a commit to haohuanw/incubator-mxnet that referenced this pull request Jun 23, 2019
Labels: pr-awaiting-review (PR is waiting for code review), pr-work-in-progress (PR is still work in progress)
3 participants