Skip to content
This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

lstm_bucketing example has evenly divide problem with self create dataset #11430

Closed
liyujiel opened this issue Jun 27, 2018 · 1 comment
Closed

Comments

@liyujiel
Copy link
Contributor

liyujiel commented Jun 27, 2018

Description

problem happened with a self-created dataset for lstm_bucketing example on CPU

But dataset with rnn-time-major works good

Environment info (Required)

----------Python Info----------
Version      : 3.6.4
Compiler     : GCC 7.2.0
Build        : ('default', 'Jan 16 2018 18:10:19')
Arch         : ('64bit', '')
------------Pip Info-----------
Version      : 9.0.1
Directory    : /home/ubuntu/anaconda3/lib/python3.6/site-packages/pip
----------MXNet Info-----------
/home/ubuntu/anaconda3/lib/python3.6/site-packages/h5py/__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Version      : 1.3.0
Directory    : /home/ubuntu/anaconda3/lib/python3.6/site-packages/mxnet
Commit Hash   : 5550c0afe4b202c573a1cc0e2387447c8a888769
----------System Info----------
Platform     : Linux-4.4.0-1061-aws-x86_64-with-debian-stretch-sid
system       : Linux
node         : ip-172-31-25-194
release      : 4.4.0-1061-aws
version      : #70-Ubuntu SMP Fri May 25 21:47:34 UTC 2018
----------Hardware Info----------
machine      : x86_64
processor    : x86_64
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                32
On-line CPU(s) list:   0-31
Thread(s) per core:    2
Core(s) per socket:    16
Socket(s):             1
NUMA node(s):          1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 79
Model name:            Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:              1
CPU MHz:               2699.625
CPU max MHz:           3000.0000
CPU min MHz:           1200.0000
BogoMIPS:              4600.08
Hypervisor vendor:     Xen
Virtualization type:   full
L1d cache:             32K
L1i cache:             32K
L2 cache:              256K
L3 cache:              46080K
NUMA node0 CPU(s):     0-31
Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0021 sec, LOAD: 0.3551 sec.
Timing for Gluon Tutorial(en): http:https://gluon.mxnet.io, DNS: 0.0739 sec, LOAD: 0.0567 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0908 sec, LOAD: 0.4070 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 0.0037 sec, LOAD: 0.9360 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0038 sec, LOAD: 0.1142 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.0030 sec, LOAD: 0.0294 sec.

Package used (Python/R/Scala/Julia):

Python

MXNet commit hash:
7c1acb4

Build config:
(Paste the content of config.mk, or the build command.)

Error Message:

/anaconda3/lib/python3.6/site-packages/h5py/init.py:36: FutureWarning: Conversion of the second argument of issubdtype from float to np.floating is deprecated. In future, it will be treated as np.float64 == np.dtype(float).type.
from ._conv import register_converters as _register_converters
WARNING: discarded 0 sentences longer than the largest bucket.
WARNING: discarded 0 sentences longer than the largest bucket.
Traceback (most recent call last):
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1521, in simple_bind
ctypes.byref(exe_handle)))
File "/anaconda3/lib/python3.6/site-packages/mxnet/base.py", line 210, in check_call
raise MXNetError(py_str(LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param
.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30

Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "lstm_bucketing.py", line 126, in
batch_end_callback = mx.callback.Speedometer(args.batch_size, args.disp_batches, auto_reset=False))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 515, in fit
self.forward_backward(data_batch)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/base_module.py", line 194, in forward_backward
self.forward(data_batch, is_train=True)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 455, in forward
data_batch.provide_label)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/bucketing_module.py", line 376, in switch_bucket
force_rebind=False, shared_module=self._buckets[self._default_bucket_key])
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/module.py", line 430, in bind
state_names=self._state_names)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 279, in init
self.bind_exec(data_shapes, label_shapes, shared_group)
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 375, in bind_exec
shared_group))
File "/anaconda3/lib/python3.6/site-packages/mxnet/module/executor_group.py", line 662, in bind_ith_exec
shared_buffer=shared_data_arrays, **input_shapes)
File "/anaconda3/lib/python3.6/site-packages/mxnet/symbol/symbol.py", line 1527, in simple_bind
raise RuntimeError(error_msg)
RuntimeError: simple_bind error. Arguments:
data: (32, 30)
softmax_label: (32, 30)
Error in operator split0: [13:51:38] src/operator/./slice_channel-inl.h:208: Check failed: dshape[real_axis] % param
.num_outputs == 0U (10 vs. 0) You are trying to split the 1-th axis of input tensor with shape [32,30,200] into num_outputs=20 evenly sized chunks, but this is not possible because 20 does not evenly divide 30

Stack trace returned 10 entries:
[bt] (0) 0 libmxnet.so 0x000000011c0dcab4 libmxnet.so + 19124
[bt] (1) 1 libmxnet.so 0x000000011c0dc86f libmxnet.so + 18543
[bt] (2) 2 libmxnet.so 0x000000011d6873f1 MXTVMBridge + 3552369
[bt] (3) 3 libmxnet.so 0x000000011d3170ce MXNDListFree + 1644494
[bt] (4) 4 libmxnet.so 0x000000011d1d477a MXNDListFree + 323194
[bt] (5) 5 libmxnet.so 0x000000011d1cce8c MXNDListFree + 292236
[bt] (6) 6 libmxnet.so 0x000000011d1bf8d6 MXNDListFree + 237526
[bt] (7) 7 libmxnet.so 0x000000011d1c544a MXNDListFree + 260938
[bt] (8) 8 libmxnet.so 0x000000011d151e40 MXExecutorSimpleBind + 8656
[bt] (9) 9 libffi.6.dylib 0x000000010dea8884 ffi_call_unix64 + 76

Steps to reproduce

(Paste the commands you ran that produced the error.)

  1. python lstm_bucketing.py
@frankfliu
Copy link
Contributor

Hi @liyujiel thanks for submitting the issue, @sandeep-krishnamurthy requesting this be labeled.

szha pushed a commit that referenced this issue Jul 18, 2018
* fix BucketSentenceIter bug

* add test case for smallest empty bucket
@szha szha closed this as completed Jul 18, 2018
KellenSunderland pushed a commit to KellenSunderland/incubator-mxnet that referenced this issue Jul 21, 2018
* fix BucketSentenceIter bug

* add test case for smallest empty bucket
XinYao1994 pushed a commit to XinYao1994/incubator-mxnet that referenced this issue Aug 29, 2018
* fix BucketSentenceIter bug

* add test case for smallest empty bucket
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants