the recall_model->exec->Forward cost most time, how can I reduce the cost time? #15450

gxkevin · 2019-07-03T07:14:27Z

I have a problem with DNN prediction, where I call SyncCopyFromCPU and then Forward. The batch_size and fea_num are 40. The default BLAS is OpenBLAS (I have also tried Intel MKL, but that did not work). The CPU is Broadwell, with 58 logical cores in total.

I have 32-58 worker threads, and each thread has only 1 OpenMP thread. I worry that using too many OpenMP threads will decrease performance.

After testing, I found that a prediction costs 13.9 ms in total: SyncCopyFromCPU costs 275 us, but Forward costs 11 ms. How can I reduce the Forward time?

    dnn_model->model_data["data"].SyncCopyFromCPU(batch_data.data(), batch_size * fea_num);
    mxnet::cpp::NDArray::WaitAll();
    dnn_model->exec->Forward(false);
    mxnet::cpp::NDArray::WaitAll();
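
For reference, the per-step timings quoted above could be collected with something like the following. This is only a minimal sketch using `std::chrono`, not anything from MXNet itself: the helper names `TimeMicros`, `Benchmark`, and `Percentile` are hypothetical, and the warm-up and iteration counts are left to the caller.

```cpp
#include <algorithm>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: runs `fn` once and returns its wall-clock
// duration in microseconds, measured with a monotonic clock.
template <typename Fn>
double TimeMicros(Fn&& fn) {
  const auto start = std::chrono::steady_clock::now();
  fn();
  const auto stop = std::chrono::steady_clock::now();
  return std::chrono::duration<double, std::micro>(stop - start).count();
}

// Hypothetical helper: calls `fn` `warmup` times untimed, then `iters`
// times timed, and returns one sample (in microseconds) per timed call.
template <typename Fn>
std::vector<double> Benchmark(Fn&& fn, int warmup, int iters) {
  for (int i = 0; i < warmup; ++i) fn();
  std::vector<double> samples;
  samples.reserve(static_cast<std::size_t>(std::max(iters, 0)));
  for (int i = 0; i < iters; ++i) samples.push_back(TimeMicros(fn));
  return samples;
}

// Hypothetical helper: percentile (p in [0, 100]) picked by rounding the
// fractional index into the sorted samples. Takes the vector by value
// because it sorts its own copy. Returns 0.0 for an empty input.
double Percentile(std::vector<double> samples, double p) {
  if (samples.empty()) return 0.0;
  std::sort(samples.begin(), samples.end());
  const double rank = p / 100.0 * static_cast<double>(samples.size() - 1);
  const std::size_t idx = static_cast<std::size_t>(rank + 0.5);
  return samples[std::min(idx, samples.size() - 1)];
}
```

With this, the Forward step could be measured by passing a lambda that calls `dnn_model->exec->Forward(false)` followed by `mxnet::cpp::NDArray::WaitAll()` to `Benchmark`, then reading the median and the 99th percentile from `Percentile`. Reporting a distribution rather than a single run helps separate steady-state cost from first-call overhead and from contention between the worker threads.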


gxkevin · 2019-07-03T07:15:20Z

Correction: the CPU has 56 logical cores, not 58.

lanking520 · 2019-07-08T16:12:51Z

Hi @leleamol, could you please take a look?

lanking520 added the C++ and Performance labels on Jul 8, 2019
