
ggml : generalize quantize_fns for simpler FP16 handling #286

Closed
ggerganov opened this issue Jun 25, 2023 · 1 comment
Labels
good first issue, refactoring

Comments

@ggerganov
Owner

This task is described well in ggerganov/llama.cpp#1237

The WIP implementation in that PR might be a bit outdated by now, so one can either attempt to update it or implement it from scratch on top of the current code base.
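The idea behind that PR can be sketched roughly as follows: give F16 the same `to_float`/`from_float` slots in a per-type trait table that the quantized types use, so callers dispatch through one table instead of special-casing FP16. All names below are illustrative, not the actual ggml API, and the half-precision conversion handles normal values only (subnormals flushed to zero):

```c
#include <stdint.h>
#include <string.h>

/* hypothetical sketch -- not the real ggml type traits */
typedef uint16_t fp16_t;

/* minimal IEEE-754 binary16 conversion: normals only, subnormals flushed */
static fp16_t f32_to_f16(float f) {
    uint32_t x; memcpy(&x, &f, sizeof x);
    uint32_t sign = (x >> 16) & 0x8000u;
    int32_t  e    = (int32_t)((x >> 23) & 0xffu) - 127 + 15;
    uint32_t m    = (x >> 13) & 0x3ffu;
    if (e <= 0)  return (fp16_t)sign;             /* underflow -> signed zero */
    if (e >= 31) return (fp16_t)(sign | 0x7c00u); /* overflow  -> infinity   */
    return (fp16_t)(sign | ((uint32_t)e << 10) | m);
}

static float f16_to_f32(fp16_t h) {
    uint32_t sign = (uint32_t)(h & 0x8000u) << 16;
    uint32_t e    = (h >> 10) & 0x1fu;
    uint32_t m    = h & 0x3ffu;
    uint32_t x    = (e == 0)  ? sign                   /* zero          */
                  : (e == 31) ? (sign | 0x7f800000u)   /* infinity      */
                  : (sign | ((e - 15 + 127) << 23) | (m << 13));
    float f; memcpy(&f, &x, sizeof f);
    return f;
}

/* row converters with the same shape the quantized types would use */
static void f16_row_to_float(const void * src, float * dst, int n) {
    const fp16_t * s = src;
    for (int i = 0; i < n; i++) dst[i] = f16_to_f32(s[i]);
}
static void f16_row_from_float(const float * src, void * dst, int n) {
    fp16_t * d = dst;
    for (int i = 0; i < n; i++) d[i] = f32_to_f16(src[i]);
}

/* one trait entry per type; Q4_0, Q8_0, ... would fill the same slots */
typedef struct {
    void (*to_float)  (const void * src, float * dst, int n);
    void (*from_float)(const float * src, void * dst, int n);
} type_traits;

static const type_traits F16_TRAITS = { f16_row_to_float, f16_row_from_float };
```

With this shape, code that previously branched on `type == F16` can just look up the traits for the tensor's type and call the two function pointers.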

@ggerganov ggerganov added good first issue Good for newcomers refactoring Refactoring labels Jun 25, 2023
goerch added a commit to goerch/ggml that referenced this issue Jun 28, 2023
@goerch goerch mentioned this issue Jun 28, 2023
goerch added a commit to goerch/ggml that referenced this issue Jul 2, 2023
@ggerganov (Owner, Author) commented Jul 5, 2023

@goerch I just synced the unit tests from llama.cpp as you proposed in #317

Will close the issue as completed now.
Maybe in the future we can make similar simplifications for other ops that have quantization branches (e.g. ggml_cpy())
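As a rough illustration of how such a branch could collapse (names hypothetical, not the actual ggml code): if every type exposes `to_float`/`from_float` converters in a shared table, a copy op no longer needs a case per (src, dst) type pair; it converts through float with two table lookups. The toy `I16Q` type here is an 8.8 fixed-point stand-in for a real quantized format:

```c
#include <stdint.h>
#include <string.h>

/* hypothetical sketch -- not the real ggml_cpy() */
enum { TYPE_F32, TYPE_I16Q, TYPE_COUNT };   /* I16Q: toy 8.8 fixed point */

static void f32_to_float  (const void * s, float * d, int n) { memcpy(d, s, (size_t)n * sizeof(float)); }
static void f32_from_float(const float * s, void * d, int n) { memcpy(d, s, (size_t)n * sizeof(float)); }

static void i16q_to_float(const void * s, float * d, int n) {
    const int16_t * q = s;
    for (int i = 0; i < n; i++) d[i] = q[i] / 256.0f;
}
static void i16q_from_float(const float * s, void * d, int n) {
    int16_t * q = d;
    for (int i = 0; i < n; i++) q[i] = (int16_t)(s[i] * 256.0f);
}

typedef struct {
    void (*to_float)  (const void *, float *, int);
    void (*from_float)(const float *, void *, int);
} traits;

static const traits TRAITS[TYPE_COUNT] = {
    [TYPE_F32]  = { f32_to_float,  f32_from_float  },
    [TYPE_I16Q] = { i16q_to_float, i16q_from_float },
};

/* one code path for every (src_type, dst_type) pair */
static void cpy(int src_type, const void * src, int dst_type, void * dst, int n) {
    float tmp[64];                          /* assume n <= 64 for the sketch */
    TRAITS[src_type].to_float(src, tmp, n);
    TRAITS[dst_type].from_float(tmp, dst, n);
}
```

Adding a new quantized type then means filling in one table entry rather than touching every op that branches on type.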

Projects
Status: Done