Caikit NLP Runtime Performance Benchmarks

Runtime performance benchmarking results for various models on various hardware configurations.

Llama2-7b

| Date Executed | Hardware | Training Set | Epoch | Precision | Batch Size | Max Source Length | Training Runtime (s) | Samples Per Second | Train Steps Per Second | Loss | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2023-09-05 | 1 x A100 80GB | Glue / RTE | 1 | bfloat16 | 6 | 4096 | 350 | 21.325 | 0.22 | 1.65 | 4096 is the context size for Llama2 |
| 2023-09-05 | 1 x A100 80GB | Glue / RTE | 1 | bfloat16 | 6 | 1024 | 350 | 21.333 | 0.22 | 1.65 | batch size of 7 fails CUDA OOM |
| 2023-09-06 | 1 x A100 80GB | Glue / RTE | 1 | bfloat16 | 6 | 512 | 348 | 21.44 | 0.22 | 1.65 | batch size of 7 fails CUDA OOM |
| 2023-09-05 | 1 x A100 80GB | Glue / RTE | 1 | bfloat16 | 8 | 256 | 356 | 20.939 | 0.16 | 1.70 | batch size of 9 fails CUDA OOM |
| 2023-09-05 | 1 x A100 80GB | Glue / RTE | 1 | bfloat16 | 19 | 128 | 254 | 29.332 | 0.09 | 1.94 | batch size of 20 fails CUDA OOM |
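
As a quick sanity check on the Llama2-7b results, the sketch below multiplies each run's training runtime by its samples per second to estimate how many samples the run processed. The values are copied from the table; the helper name `samples_processed` is hypothetical, and the runtimes are rounded to whole seconds, so the totals are approximate.

```python
# Rows copied from the Llama2-7b table:
# (max source length, training runtime in seconds, samples per second)
runs = [
    (4096, 350, 21.325),
    (1024, 350, 21.333),
    (512, 348, 21.44),
    (256, 356, 20.939),
    (128, 254, 29.332),
]


def samples_processed(runtime_s, samples_per_second):
    """Approximate number of training samples seen during a run (hypothetical helper)."""
    return runtime_s * samples_per_second


# Map each max source length to its estimated sample count.
totals = {length: samples_processed(rt, sps) for length, rt, sps in runs}

for length, total in totals.items():
    print(f"max source length {length}: ~{total:.0f} samples")
```

Every product lands between 7,450 and 7,470 samples, which suggests the five runs covered the same amount of training data and that the runtime and throughput columns are mutually consistent.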