Dlrm benchmark test #375

Open
wants to merge 37 commits into
base: main

ShawnXuan
Contributor

DLRM benchmark test scripts.

@ShawnXuan
Contributor Author

ShawnXuan commented Aug 5, 2022

Regarding the following options:

export CUDA_DEVICE_MAX_CONNECTIONS=32
export ONEFLOW_EP_CUDA_STREAM_FLAGS=1
export ONEFLOW_RAW_READER_PREFETCHING_QUEUE_DEPTH=512
export ONEFLOW_RAW_READER_NUM_WORKERS=1

export LD_PRELOAD=/usr/lib64/libjemalloc.so.1

numactl --interleave=all \

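I ran a set of experiments with these options on and off, recording the average latency (ms) over 74,000 iterations for each run. The results are: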

| ON | OFF |
| --- | --- |
| 1.41855692 | 1.44409019 |
| 1.42942288 | 1.43027312 |
| 1.42626776 | 1.43327031 |
| 1.43100398 | 1.43726633 |
| 1.43247646 | 1.43108837 |
| 1.43085669 | 1.4360571 |
| 1.4250376 | 1.43052549 |
| 1.4246417 | 1.44208097 |
| 1.42638928 | 1.43673026 |
| 1.43390266 | 1.43774178 |
| 1.42238418 | 1.43597748 |
| 1.43701162 | 1.43563187 |
| 1.42529816 | 1.43994857 |
| 1.42365005 | 1.43631018 |
| 1.43174504 | 1.43489774 |
| 1.42973357 | 1.43393828 |
| 1.4347752 | |
| 1.43040477 | |

Summary statistics:

| Statistic (ms) | ON | OFF |
| --- | --- | --- |
| mean | 1.4285 | 1.4360 |
| max | 1.4370 | 1.4441 |
| min | 1.4186 | 1.4303 |
| std | 0.0048 | 0.0039 |

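With all of the options enabled, mean latency improves by only about 8 µs, which is negligible, so these options will not be kept for now.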
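For anyone who wants to re-check the summary numbers, here is a minimal sketch that recomputes them from the per-run latencies listed in this comment. It assumes the reported std is the sample standard deviation (n - 1 denominator), which the comment does not state; the names `on_ms`, `off_ms`, and `summarize` are hypothetical and not part of the PR.

```python
import statistics

# Per-run average latency in ms, copied from the ON/OFF columns in this comment.
on_ms = [
    1.41855692, 1.42942288, 1.42626776, 1.43100398, 1.43247646, 1.43085669,
    1.4250376, 1.4246417, 1.42638928, 1.43390266, 1.42238418, 1.43701162,
    1.42529816, 1.42365005, 1.43174504, 1.42973357, 1.4347752, 1.43040477,
]
off_ms = [
    1.44409019, 1.43027312, 1.43327031, 1.43726633, 1.43108837, 1.4360571,
    1.43052549, 1.44208097, 1.43673026, 1.43774178, 1.43597748, 1.43563187,
    1.43994857, 1.43631018, 1.43489774, 1.43393828,
]


def summarize(samples):
    """Return mean, max, min, and sample standard deviation of a list of floats."""
    return {
        "mean": statistics.mean(samples),
        "max": max(samples),
        "min": min(samples),
        # Assumption: sample std (n - 1); the comment does not say which was used.
        "std": statistics.stdev(samples),
    }


on_stats = summarize(on_ms)
off_stats = summarize(off_ms)

for name in ("mean", "max", "min", "std"):
    print(f"{name:<4} {on_stats[name]:.4f} {off_stats[name]:.4f}")

# Difference of the means, converted from milliseconds to microseconds.
diff_us = (off_stats["mean"] - on_stats["mean"]) * 1000.0
print(f"mean improvement with options ON: {diff_us:.1f} us")
```

The difference between the two means is slightly larger than either column's standard deviation, which is consistent with treating the gain as too small to justify the extra options.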

RecommenderSystems/dlrm/tools/parquet_to_raw.py (outdated, resolved)
RecommenderSystems/dlrm/train_dlrm_benchmark_fp32.sh (outdated, resolved)
RecommenderSystems/dlrm/train_dlrm_benchmark_fp32.sh (outdated, resolved)
RecommenderSystems/dlrm/train_dlrm_benchmark_fp32.sh (outdated, resolved)
RecommenderSystems/dlrm/train_dlrm_benchmark_fp32.sh (outdated, resolved)
RecommenderSystems/dlrm/tools/parquet_to_raw.py (outdated, resolved)
RecommenderSystems/dlrm/train_dlrm_benchmark.sh (outdated, resolved)

2 participants