add google magika inference example #748

Merged · 8 commits merged into master from sl/magika on Feb 25, 2024

Conversation

slaren (Collaborator) commented Feb 25, 2024

The model can be converted from model.h5 with convert.py.

Example:

$ build/bin/magika model.h5.gguf examples/sam/example.jpg README.md src/ggml.c
examples/sam/example.jpg      : jpeg (100.00%) pptx (0.00%) smali (0.00%) shell (0.00%) sevenzip (0.00%)
README.md                     : markdown (100.00%) txt (0.00%) yaml (0.00%) ppt (0.00%) shell (0.00%)
src/ggml.c                    : c (99.97%) asm (0.01%) txt (0.01%) javascript (0.00%) html (0.00%)

slaren linked an issue on Feb 25, 2024 that may be closed by this pull request.
ggerganov (Owner) commented:

> Requires F32 GELU, as the default F16 GELU will result in nan. Not sure how to address this.

Would this patch make it work with the F16 GELU define?

diff --git a/src/ggml.c b/src/ggml.c
index d710fe7..42a7d45 100644
--- a/src/ggml.c
+++ b/src/ggml.c
@@ -1560,9 +1560,15 @@ inline static void ggml_vec_gelu_f16(const int n, ggml_fp16_t * y, const ggml_fp
 inline static void ggml_vec_gelu_f32(const int n, float * y, const float * x) {
     uint16_t t;
     for (int i = 0; i < n; ++i) {
-        ggml_fp16_t fp16 = GGML_FP32_TO_FP16(x[i]);
-        memcpy(&t, &fp16, sizeof(uint16_t));
-        y[i] = GGML_FP16_TO_FP32(ggml_table_gelu_f16[t]);
+        if (x[i] < -10.0f) {
+            y[i] = 0.0f;
+        } else if (x[i] > 10.0f) {
+            y[i] = x[i];
+        } else {
+            ggml_fp16_t fp16 = GGML_FP32_TO_FP16(x[i]);
+            memcpy(&t, &fp16, sizeof(uint16_t));
+            y[i] = GGML_FP16_TO_FP32(ggml_table_gelu_f16[t]);
+        }
     }
 }
 #else
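
For anyone wondering why the ±10 clamp is safe: GELU has essentially saturated outside that range, being ~0 for large negative inputs and ~x for large positive ones, so the FP16 lookup table is only needed in the interior. A minimal standalone sketch (not part of the patch; it uses the exact erf-based GELU rather than ggml's internal table) comparing the exact function with the clamped variant:

#include <math.h>
#include <stdio.h>

// exact GELU: gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
static float gelu_exact(float x) {
    return 0.5f * x * (1.0f + erff(x / sqrtf(2.0f)));
}

// clamped variant mirroring the patch: outside [-10, 10] the function has saturated
static float gelu_clamped(float x) {
    if (x < -10.0f) return 0.0f; // Phi(x) ~ 0, so gelu(x) ~ 0
    if (x >  10.0f) return x;    // Phi(x) ~ 1, so gelu(x) ~ x
    return gelu_exact(x);        // interior: this is where ggml uses the FP16 table
}

int main(void) {
    const float xs[] = { -1.0e5f, -10.5f, -1.0f, 0.0f, 1.0f, 10.5f, 1.0e5f };
    for (int i = 0; i < (int)(sizeof(xs)/sizeof(xs[0])); ++i) {
        printf("x = %12.2f   exact = %14.6f   clamped = %14.6f\n",
               xs[i], gelu_exact(xs[i]), gelu_clamped(xs[i]));
    }
    return 0;
}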

slaren (Collaborator, Author) commented Feb 25, 2024

Yep, it does work. The problem was that the activations generated by this model can exceed the largest finite FP16 value (65504), which results in inf when converted to FP16.
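
To see the overflow in isolation, here is a minimal sketch assuming the ggml_fp32_to_fp16()/ggml_fp16_to_fp32() conversion helpers exposed in ggml.h; anything above the FP16 range round-trips to inf:

#include <stdio.h>
#include "ggml.h"

int main(void) {
    // 65504 is the largest finite FP16 value; larger inputs overflow to inf
    const float xs[] = { 65504.0f, 70000.0f, 1.0e9f };
    for (int i = 0; i < 3; ++i) {
        // round-trip through FP16: values beyond the FP16 range come back as inf
        const float back = ggml_fp16_to_fp32(ggml_fp32_to_fp16(xs[i]));
        printf("%12.1f -> fp16 -> fp32 = %f\n", xs[i], back);
    }
    return 0;
}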

slaren marked this pull request as ready for review on February 25, 2024 at 17:03.
Review comment on examples/magika/main.cpp (outdated, resolved).
ggerganov (Owner) left a comment:

Very cool!

Does it work with CUDA?
I just tried Metal, but it currently lacks the POOL_1D op.

slaren (Collaborator, Author) commented Feb 25, 2024

CUDA does not support POOL_1D either, so I guess it is CPU only for now.

slaren merged commit b458250 into master on Feb 25, 2024.
9 checks passed
slaren deleted the sl/magika branch on February 25, 2024 at 19:41.
Labels: none yet
Projects: none yet
Development: successfully merging this pull request may close the issue "ggml : add Magika inference"
2 participants