ggml : better way to express implicit node dependencies in a graph #502

ggerganov · 2023-09-04T18:51:51Z

Operations on views can introduce implicit dependencies between the nodes in the compute graph which often lead to bugs when we forget about these dependencies.

See ggerganov/llama.cpp#3012 for more context

We need to find some better way to handle such cases.
One option is to introduce ggml_depends_on() which, while not ideal, should at least make the code more explicit and less error-prone.
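To make the idea concrete, here is a minimal toy sketch in C of what an explicit ordering edge could look like. It does not use ggml itself: `toy_node`, `toy_graph`, `toy_depends_on()`, `toy_visit()` and `toy_index_of()` are all hypothetical names invented for illustration, and the issue does not specify a signature for ggml_depends_on(). The sketch only assumes that the graph is built by walking a node's inputs, so an op that writes into a view is never scheduled unless some visited node points at it.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_NODES 16
#define TOY_MAX_DEPS   4

/* Hypothetical stand-in for a graph node; this is NOT ggml's tensor struct. */
struct toy_node {
    const char      *name;
    struct toy_node *src[2];              /* data inputs: the explicit edges */
    struct toy_node *deps[TOY_MAX_DEPS];  /* ordering-only edges, no data flow */
    int              n_deps;
    int              visited;
};

/* Hypothetical graph: just the execution order produced by the traversal. */
struct toy_graph {
    struct toy_node *order[TOY_MAX_NODES];
    int              n;
};

/* Assumed shape of the proposed call: record that `node` must run after `dep`.
 * Returns `node` so the call can be chained while building the graph. */
struct toy_node *toy_depends_on(struct toy_node *node, struct toy_node *dep) {
    assert(node->n_deps < TOY_MAX_DEPS);
    node->deps[node->n_deps++] = dep;
    return node;
}

/* Post-order depth-first walk: every input and every declared dependency
 * is appended to the order before the node itself. */
void toy_visit(struct toy_graph *g, struct toy_node *node) {
    if (node == NULL || node->visited) {
        return;
    }
    node->visited = 1;
    for (int i = 0; i < 2; i++) {
        toy_visit(g, node->src[i]);
    }
    for (int i = 0; i < node->n_deps; i++) {
        toy_visit(g, node->deps[i]);
    }
    assert(g->n < TOY_MAX_NODES);
    g->order[g->n++] = node;
}

/* Position of `node` in the execution order, or -1 if it was never scheduled. */
int toy_index_of(const struct toy_graph *g, const struct toy_node *node) {
    for (int i = 0; i < g->n; i++) {
        if (g->order[i] == node) {
            return i;
        }
    }
    return -1;
}
```

In this toy model, suppose a `cpy` node writes into a `view` of a buffer `kv`, and a later `reader` node takes `kv` itself as input. Walking the graph from `reader` visits only `kv`, so `cpy` is never scheduled: that is the implicit dependency. Calling `toy_depends_on(&reader, &cpy)` before the walk adds the missing edge, and `cpy` then lands before `reader` in the order.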


* Retire the ggml_mul_mat() for transposed src0
  - It can always be made contiguous with ggml_cpy()
  - The code is now simplified
  - The results are deterministic in respect to num threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (ggerganov#502)
* Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
* Fix dequantization - forgot to interleave the quants

ggerganov added the enhancement (New feature or request) and refactoring (Refactoring) labels Sep 4, 2023

ggerganov mentioned this issue Sep 4, 2023

metal : bug with ggml_cont ggerganov/llama.cpp#3012

Closed
