This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

peformance issue #4048

Open
kernel8liang opened this issue Dec 1, 2016 · 2 comments
Comments

@kernel8liang
Contributor

kernel8liang commented Dec 1, 2016

I compiled MXNet with MXNET_USE_CUDA=1. Sometimes I run without the GPU, and in that case I now have to wait almost 2 minutes before the iterations start running, compared with earlier builds compiled without MXNET_USE_CUDA=1.

After some debugging, I found this in src/kvstore/comm.h:

   Comm() {
     pinned_ctx_ = (MXNET_USE_CUDA != 0) ? Context::CPUPinned(0) : Context::CPU();
   }

Comm() uses the compile-time macro MXNET_USE_CUDA to choose the context. It is called from _initialize_kvstore in model.py and ends up activating the GPUs, which made me wait a long time (there are 8 Tesla M40 cards on the board) even though I didn't use the GPU.

I think it would be better to determine pinned_ctx_ from a runtime variable, such as a command-line argument, rather than from a compile-time macro.
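To make the suggestion concrete, here is a minimal sketch of what a runtime switch could look like. It is not the actual MXNet code and not necessarily what any later fix does: `CtxKind` is a hypothetical stand-in for the two contexts in the snippet above, and the environment variable name `EXAMPLE_KVSTORE_PINNED_CTX` is invented purely for illustration.

```cpp
#include <cassert>  // only needed for the assertions in the usage note
#include <cstdlib>
#include <string>

// Hypothetical stand-in for Context::CPU() / Context::CPUPinned(0);
// this is NOT MXNet's real Context class.
enum class CtxKind { kCPU, kCPUPinned };

// Pick the context from a raw setting value instead of only the build flag.
// - A build without CUDA always gets the plain CPU context.
// - A CUDA build keeps the pinned context by default (unset or empty value),
//   but the value "0" opts out at runtime, so the GPUs are never touched.
inline CtxKind ChoosePinnedCtx(const char* value, bool built_with_cuda) {
  if (!built_with_cuda) return CtxKind::kCPU;
  if (value == nullptr || *value == '\0') return CtxKind::kCPUPinned;
  return std::string(value) == "0" ? CtxKind::kCPU : CtxKind::kCPUPinned;
}

// Convenience wrapper that reads the (hypothetical) environment variable.
inline CtxKind ChoosePinnedCtxFromEnv(bool built_with_cuda) {
  return ChoosePinnedCtx(std::getenv("EXAMPLE_KVSTORE_PINNED_CTX"),
                         built_with_cuda);
}
```

With something like this, the constructor could call `ChoosePinnedCtxFromEnv(MXNET_USE_CUDA != 0)` and map the result to the real context. For example, `ChoosePinnedCtx("0", true)` yields `CtxKind::kCPU`, while `ChoosePinnedCtx(nullptr, true)` keeps the current behaviour and yields `CtxKind::kCPUPinned`.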

@mli mli added the enhancement label Dec 2, 2016
@mli mli self-assigned this Dec 2, 2016
@mli
Member

mli commented Jan 5, 2017

This should be fixed by #4550.

@ChaiBapchya
Contributor

@szha @mli Since #4550 has been merged, is this good to close?
Or, @nswamy, is this still pending?
