This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

Dual stream cudnn Convolution backward() with MXNET_GPU_WORKER_NSTREAMS=2. #14006

Merged: 11 commits, Feb 24, 2019
Fix cpplint.
DickJC123 committed Jan 29, 2019
commit 1cf5c67370b32d8e41b9669f03071dd8560383f6
3 changes: 2 additions & 1 deletion src/operator/nn/cudnn/cudnn_convolution-inl.h
@@ -1038,7 +1038,8 @@ class CuDNNConvolutionOp {
   // Always allocates at least one word.
   mshadow::Tensor<gpu, 1, DType> AllocateTempWorkspace(const OpContext &ctx, size_t size_bytes) {
     mshadow::Stream<gpu> *s = ctx.get_stream<gpu>();
-    size_t size_words = std::max<size_t>(1, RoundToMultiple(size_bytes, sizeof(DType)) / sizeof(DType));
+    size_t size_words =
+      std::max<size_t>(1, RoundToMultiple(size_bytes, sizeof(DType)) / sizeof(DType));
     return ctx.requested[conv::kTempSpace].get_space_typed<gpu, 1, DType>(
         mshadow::Shape1(size_words), s);
   }
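
Note on the size computation in this hunk: the change only rewraps the line for cpplint, and the arithmetic is unchanged. The sketch below reproduces that arithmetic in isolation. It is a minimal illustration, not the MXNet code: `RoundToMultipleSketch` and `WorkspaceSizeWords` are hypothetical names, and the rounding helper is assumed to round up to the next multiple, since the definition of `RoundToMultiple` is not shown in this commit.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical stand-in for the RoundToMultiple helper referenced in the diff.
// Assumption: it rounds x up to the nearest multiple of `multiple`.
inline std::size_t RoundToMultipleSketch(std::size_t x, std::size_t multiple) {
  return ((x + multiple - 1) / multiple) * multiple;
}

// Converts a byte count into a count of DType-sized words, never returning
// fewer than one word (mirrors "Always allocates at least one word").
template <typename DType>
std::size_t WorkspaceSizeWords(std::size_t size_bytes) {
  return std::max<std::size_t>(
      1, RoundToMultipleSketch(size_bytes, sizeof(DType)) / sizeof(DType));
}
```

Under these assumptions, a request of 0 bytes still yields one word, and a request of 5 bytes with a 4-byte `DType` yields two words.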