Operations on views can introduce implicit dependencies between nodes in the compute graph, which often lead to bugs when we forget about them.
We need to find a better way to handle such cases.
One option is to introduce ggml_depends_on() which, while not ideal, should at least make the code more explicit and less error-prone.
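To make the problem concrete, here is a minimal sketch in plain C. It is a toy model, not the ggml API: every name in it (`toy_node`, `toy_depends_on`, `toy_compute`, `toy_run`) is hypothetical, and the real `ggml_depends_on()` does not exist yet, so its signature and semantics are assumptions. One node writes through a view into a shared buffer, another node reads the whole buffer, and nothing in the graph links the two unless an explicit dependency is added.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a compute graph. NOT the ggml API; all names are hypothetical. */
enum toy_op { TOY_WRITE, TOY_SUM };

struct toy_node {
    enum toy_op      op;
    float           *data;   /* buffer this node touches (may be a view into a larger one) */
    size_t           n;      /* number of elements in the buffer or view */
    float            value;  /* TOY_WRITE: value to store; TOY_SUM: computed result */
    struct toy_node *dep;    /* explicit dependency, or NULL */
    int              done;   /* set once the node has been computed */
};

/* Assumed analogue of the proposed ggml_depends_on(): `dep` must run before `node`. */
static struct toy_node *toy_depends_on(struct toy_node *node, struct toy_node *dep) {
    assert(node != dep);
    node->dep = dep;
    return node;
}

/* Evaluate a node, computing its explicit dependency first. */
static void toy_compute(struct toy_node *node) {
    if (node->done) return;
    if (node->dep) toy_compute(node->dep);
    if (node->op == TOY_WRITE) {
        for (size_t i = 0; i < node->n; i++) node->data[i] = node->value;
    } else {
        node->value = 0.0f;
        for (size_t i = 0; i < node->n; i++) node->value += node->data[i];
    }
    node->done = 1;
}

/* `write` stores 1.0 into a view (elements 2..3) of a 4-element buffer;
 * `sum` reads the whole buffer. Only `sum` is evaluated as the graph output. */
static float toy_run(int use_depends_on) {
    float cache[4] = {0.0f, 0.0f, 0.0f, 0.0f};
    struct toy_node write = { TOY_WRITE, cache + 2, 2, 1.0f, NULL, 0 };
    struct toy_node sum   = { TOY_SUM,   cache,     4, 0.0f, NULL, 0 };
    if (use_depends_on) toy_depends_on(&sum, &write);
    toy_compute(&sum);
    return sum.value;
}
```

In this toy, `toy_run(0)` sums a buffer that the write never reached, while `toy_run(1)` sees the written values because the dependency is stated in the graph instead of being implied by the shared memory.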
* Retire the ggml_mul_mat() for transposed src0
- It can always be made contiguous with ggml_cpy()
- The code is now simplified
- The results are deterministic with respect to the number of threads
* SIMD-ify dequantize_row_q4_0() for ARM_NEON (ggerganov#502)
* Attempt to SIMD-ify dequantize_row_q4_0() for ARM_NEON
* Fix dequantization - forgot to interleave the quants
See ggerganov/llama.cpp#3012 for more context