Why is GGML so much faster than PyTorch? #382

wizardforcel · 2023-07-13T06:19:08Z

Test results on my machine (Threadripper 3970X CPU, RTX 3080 Ti GPU):

Whisper Medium+PyTorch CPU:

3 hours per hour of audio

Whisper Medium+PyTorch GPU:

10 min per hour of audio

Whisper Large V2+GGML CPU:

30 min per hour of audio
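For easier comparison, the timings above can be expressed as a real-time factor (processing time divided by audio duration; lower is faster). This is only a small arithmetic sketch over the three figures reported in this post; the function and variable names are made up for illustration.

```python
# Real-time factor (RTF) = processing time / audio duration; lower is faster.
# Values are the figures reported above, in minutes per 60 minutes of audio.
reported_minutes_per_hour = {
    "Whisper Medium + PyTorch CPU": 180,
    "Whisper Medium + PyTorch GPU": 10,
    "Whisper Large V2 + GGML CPU": 30,
}


def real_time_factor(processing_minutes, audio_minutes=60):
    """Return processing time as a fraction of the audio duration."""
    return processing_minutes / audio_minutes


rtf = {name: real_time_factor(m) for name, m in reported_minutes_per_hour.items()}
for name, value in rtf.items():
    print(f"{name}: RTF = {value:.2f}")

# Ratio between the two CPU runs (note that the GGML run uses the larger model).
cpu_speedup = rtf["Whisper Medium + PyTorch CPU"] / rtf["Whisper Large V2 + GGML CPU"]
print(f"GGML CPU vs PyTorch CPU: {cpu_speedup:.0f}x")
```

These ratios only describe the runs as reported; they say nothing about whether the two runs used the same decoding settings.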

gordicaleksa · 2023-07-13T14:40:12Z

Are you sure you're running the models with the same hyperparameters (beam size, etc.)?
GGML is bare-bones and "close to the metal" (written in C), so it is better optimized but less developer-friendly, which is a necessary tradeoff Georgi had to make.

ggerganov · 2023-07-14T08:18:17Z

Yes, it's most likely due to a different beam size, greedy vs. beam-search decoding, etc.
If you match the parameters 1-to-1, it is not much faster. Performance should be comparable.
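One way to check that two runs are matched 1-to-1 before comparing timings is to write down the decoding settings of each run and diff them. The sketch below is hypothetical: `DecodeSettings`, `mismatches`, and all the values are illustrative placeholders, not the actual defaults or API of either PyTorch Whisper or GGML.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class DecodeSettings:
    """Hypothetical record of the settings that affect decoding speed."""
    model: str
    beam_size: int  # in this sketch, 1 stands for greedy decoding
    best_of: int
    device: str


def mismatches(a, b):
    """Return {field: (a_value, b_value)} for every field that differs."""
    da, db = asdict(a), asdict(b)
    return {key: (da[key], db[key]) for key in da if da[key] != db[key]}


# Placeholder values for illustration only, not the real defaults of either tool.
pytorch_run = DecodeSettings(model="medium", beam_size=5, best_of=5, device="cpu")
ggml_run = DecodeSettings(model="large-v2", beam_size=1, best_of=1, device="cpu")

diff = mismatches(pytorch_run, ggml_run)
for field, (left, right) in diff.items():
    print(f"{field}: PyTorch run = {left}, GGML run = {right}")
```

A speed comparison is only meaningful once this diff is empty, or at least limited to the one setting being studied.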

ggerganov closed this as completed Jul 14, 2023
