add google magika inference example #748

Merged · 8 commits merged into master from sl/magika on Feb 25, 2024

Conversation

slaren (Collaborator) commented Feb 25, 2024

The model can be converted from model.h5 with convert.py.

Example:

$ build/bin/magika model.h5.gguf examples/sam/example.jpg README.md src/ggml.c
examples/sam/example.jpg      : jpeg (100.00%) pptx (0.00%) smali (0.00%) shell (0.00%) sevenzip (0.00%)
README.md                     : markdown (100.00%) txt (0.00%) yaml (0.00%) ppt (0.00%) shell (0.00%)
src/ggml.c                    : c (99.97%) asm (0.01%) txt (0.01%) javascript (0.00%) html (0.00%)

slaren linked an issue on Feb 25, 2024 that may be closed by this pull request.
ggerganov (Owner) commented:

> Requires F32 GELU, as the default F16 GELU will result in nan. Not sure how to address this.

Would this patch make it work with the F16 GELU define?

diff --git a/src/ggml.c b/src/ggml.c
index d710fe7..42a7d45 100644
--- a/src/ggml.c
+++ b/src/ggml.c
@@ -1560,9 +1560,15 @@ inline static void ggml_vec_gelu_f16(const int n, ggml_fp16_t * y, const ggml_fp
 inline static void ggml_vec_gelu_f32(const int n, float * y, const float * x) {
     uint16_t t;
     for (int i = 0; i < n; ++i) {
-        ggml_fp16_t fp16 = GGML_FP32_TO_FP16(x[i]);
-        memcpy(&t, &fp16, sizeof(uint16_t));
-        y[i] = GGML_FP16_TO_FP32(ggml_table_gelu_f16[t]);
+        if (x[i] < -10.0f) {
+            y[i] = 0.0f;
+        } else if (x[i] > 10.0f) {
+            y[i] = x[i];
+        } else {
+            ggml_fp16_t fp16 = GGML_FP32_TO_FP16(x[i]);
+            memcpy(&t, &fp16, sizeof(uint16_t));
+            y[i] = GGML_FP16_TO_FP32(ggml_table_gelu_f16[t]);
+        }
     }
 }
 #else
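
For anyone wondering why the ±10 clamp is safe: GELU has essentially saturated outside that range, being ~0 for large negative inputs and ~x for large positive ones, so the FP16 lookup table is only needed in the interior. A minimal standalone sketch (not part of the patch; it uses the exact erf-based GELU rather than ggml's internal table) comparing the exact function with the clamped variant:

#include <math.h>
#include <stdio.h>

// exact GELU: gelu(x) = 0.5 * x * (1 + erf(x / sqrt(2)))
static float gelu_exact(float x) {
    return 0.5f * x * (1.0f + erff(x / sqrtf(2.0f)));
}

// clamped variant mirroring the patch: outside [-10, 10] the function has saturated
static float gelu_clamped(float x) {
    if (x < -10.0f) return 0.0f; // Phi(x) ~ 0, so gelu(x) ~ 0
    if (x >  10.0f) return x;    // Phi(x) ~ 1, so gelu(x) ~ x
    return gelu_exact(x);        // interior: this is where ggml uses the FP16 table
}

int main(void) {
    const float xs[] = { -1.0e5f, -10.5f, -1.0f, 0.0f, 1.0f, 10.5f, 1.0e5f };
    for (int i = 0; i < (int)(sizeof(xs)/sizeof(xs[0])); ++i) {
        printf("x = %12.2f   exact = %14.6f   clamped = %14.6f\n",
               xs[i], gelu_exact(xs[i]), gelu_clamped(xs[i]));
    }
    return 0;
}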

slaren (Collaborator, Author) commented Feb 25, 2024

Yep, it does work. The problem was that the activations generated by this model can exceed the largest finite FP16 value (65504), which results in inf when converted to FP16.
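
To see the overflow in isolation, here is a minimal sketch assuming the ggml_fp32_to_fp16()/ggml_fp16_to_fp32() conversion helpers exposed in ggml.h; anything above the FP16 range round-trips to inf:

#include <stdio.h>
#include "ggml.h"

int main(void) {
    // 65504 is the largest finite FP16 value; larger inputs overflow to inf
    const float xs[] = { 65504.0f, 70000.0f, 1.0e9f };
    for (int i = 0; i < 3; ++i) {
        // round-trip through FP16: values beyond the FP16 range come back as inf
        const float back = ggml_fp16_to_fp32(ggml_fp32_to_fp16(xs[i]));
        printf("%12.1f -> fp16 -> fp32 = %f\n", xs[i], back);
    }
    return 0;
}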

slaren marked this pull request as ready for review on February 25, 2024 at 17:03.
Review comment on examples/magika/main.cpp (outdated, resolved).
ggerganov (Owner) left a comment:

Very cool!

Does it work with CUDA?
I just tried Metal, but it currently lacks the POOL_1D op.

slaren (Collaborator, Author) commented Feb 25, 2024

CUDA does not support POOL_1D either, so I guess it is CPU only for now.

slaren merged commit b458250 into master on Feb 25, 2024.
9 checks passed
slaren deleted the sl/magika branch on February 25, 2024 at 19:41.
Labels: none yet
Projects: none yet
Development: successfully merging this pull request may close the issue "ggml : add Magika inference"
2 participants