
feat: add changes to handle jina v2 base code #7596

Merged: 10 commits merged into ggerganov:master on Jun 6, 2024

Conversation

@JoanFM (Contributor) commented May 28, 2024

PR to allow using jinaai/jina-embeddings-v2-base-code with llama.cpp. It has an extra normalization layer compared to the other models of the JinaV2 family, which is why it is handled independently.
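For illustration, here is a minimal PyTorch sketch of what such an extra layer amounts to (the layer type, placement, and dimensions are assumptions for illustration, not taken from the model or the PR code):

```python
import torch
import torch.nn as nn

# Stand-in hidden states: (batch, tokens, hidden_size); 768 is assumed here.
hidden = torch.randn(1, 16, 768)

# The extra normalization layer described above, modeled as a standard LayerNorm.
extra_norm = nn.LayerNorm(768)

normalized = extra_norm(hidden)
print(normalized.shape)  # torch.Size([1, 16, 768])
```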

@github-actions bot added the python (python script changes) label on May 28, 2024
@github-actions bot commented May 28, 2024

📈 llama.cpp server for bench-server-baseline on Standard_NC4as_T4_v3 for phi-2-q4_0: 527 iterations 🚀

Details (for performance-related PRs only):
  • Concurrent users: 8, duration: 10m
  • HTTP request: avg=8891.02ms p(95)=21954.48ms fails=, finish reason: stop=474 truncated=53
  • Prompt processing (pp): avg=104.81tk/s p(95)=444.45tk/s
  • Token generation (tg): avg=45.37tk/s p(95)=46.03tk/s
  • ggml-org/models/phi-2/ggml-model-q4_0.gguf parallel=8 ctx-size=16384 ngl=33 batch-size=2048 ubatch-size=256 pp=1024 pp+tg=2048 branch=feat-jina-v2-base-code commit=4c4d877d23dd27fc7e323b4a2623db825e8bd29f

[Four benchmark charts omitted: prompt_tokens_seconds, predicted_tokens_seconds, kv_cache_usage_ratio, and requests_processing, each titled "llama.cpp bench-server-baseline on Standard_NC4as_T4_v3, duration=10m, 527 iterations".]

@mofosyne added the Review Complexity : Medium label (generally requires more time to grok, but manageable by beginner-to-medium expertise levels) on May 29, 2024
@teleprint-me (Contributor) commented May 31, 2024

It's how the tokens are handled in llama.cpp. I'm in the middle of figuring out how the tokenizers library operates under the hood and seeing if there's a way to create a bridge between the two. Actually, your input would be invaluable (#7379), or that of anyone better suited who has a deeper understanding of tokenizers (e.g. BPE/WPM) in general. I'm interested in Jina because the English version uses WPM, while the Spanish and Dutch versions use BPE. I'm more focused on Llama-2 and Llama-3 for BPE.

Aside: I have no idea how the CI/CD is set up here. I have some experience with Jenkins, but all of this is outside the scope of what I'm focused on. Also, I'm just a contributor; I chime in when I think I might have something of value to add.
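For context on the WPM/BPE distinction mentioned above, a minimal sketch using transformers (the model ids are assumed to be the relevant public JinaV2 checkpoints, and loading them requires network access):

```python
from transformers import AutoTokenizer

# English JinaV2 model: WordPiece (WPM); Spanish variant: BPE (assumed checkpoints).
wpm = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-en")
bpe = AutoTokenizer.from_pretrained("jinaai/jina-embeddings-v2-base-es")

text = "tokenizers operate differently under the hood"
print(wpm.tokenize(text))  # WordPiece marks word-internal pieces with '##'
print(bpe.tokenize(text))  # BPE merges frequent symbol pairs instead
```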

@JoanFM (Contributor, Author) commented Jun 4, 2024

> It's how the tokens are handled in llama.cpp. I'm in the middle of figuring out how the tokenizers library operates under the hood and seeing if there's a way to create a bridge between the two. […]

Hey @teleprint-me ,

To be honest, I found it quite hard to work with the tokenizer logic here, but I do not quite understand what you aim to achieve in #7379. If you want, we can jump on a call to discuss and make this process more agile.

@ggerganov (Owner) commented

> could you also guide me on how to fix the CI problems?

Rebase on latest master and the CI should work

@JoanFM (Contributor, Author) commented Jun 4, 2024

> could you also guide me on how to fix the CI problems?
>
> Rebase on latest master and the CI should work

I will, thanks

@ggerganov (Owner) commented

> So `):\t` it should not be matched. Is there any logic in the code that eliminates these `\` patterns from the vocab?

Hm, not sure why this happens. We don't escape strings in the vocab - only in the prompt input:

llama.cpp/common/common.cpp

Lines 249 to 257 in 3b38d48

if (params.escape) {
    string_process_escapes(params.prompt);
    string_process_escapes(params.input_prefix);
    string_process_escapes(params.input_suffix);
    string_process_escapes(sparams.cfg_negative_prompt);
    for (auto & antiprompt : params.antiprompt) {
        string_process_escapes(antiprompt);
    }
}

@JoanFM (Contributor, Author) commented Jun 4, 2024

> So `):\t` it should not be matched. Is there any logic in the code that eliminates these `\` patterns from the vocab?
>
> Hm, not sure why this happens. We don't escape strings in the vocab - only in the prompt input: […]

I will try to investigate this

@JoanFM (Contributor, Author) commented Jun 4, 2024

@ggerganov,

I am also trying to see if I can add support for the Chinese model. I managed to get it to work for English, but not for Chinese characters. Is there a supported model in Chinese, so I can see which tokenizers they use for inspiration?

@ggerganov (Owner) commented

I believe the most recent model that we added and also supports Chinese is https://huggingface.co/deepseek-ai/DeepSeek-V2. See if @fairydreaming's PR could be of any help: #7519

@JoanFM (Contributor, Author) commented Jun 4, 2024

Hey @ggerganov ,

I am starting to think that it is not a problem with the tokenizer.

Here is my observation.

I am trying to run this command to check how the embedding behaves:

gdb --args ../build/bin/embedding -m ./jina-embeddings-v2-base-code.gguf --threads 1 --verbose-prompt -p "for idx, x in enumerate(xs):\n    print(idx, x)"

and this is what gdb is telling me:

(gdb) run
Starting program: /home/joan/workspace/ollama/llm/llama.cpp/build/bin/embedding -m ./jina-embeddings-v2-base-code.gguf --threads 1 --verbose-prompt -p for\ idx,\ x\ in\ enumerate\(xs\):\\n\ \ \ \ print\(idx,\ x\)

Look at all the \ characters that have been added. This seems to be the reason why I get a different tokenization: in Python, if I add an extra \ before \\n I get the same encoding.

I am not sure if it is a problem with how the standard input is encoded, or something else. Do you happen to have any clue about this?

If I hardcode this sentence instead of passing it on the command line:

params.prompt = "for idx, x in enumerate(xs):\n    print(idx, x)";

I get the same behavior as in Python.
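A minimal Python sketch of the escaping issue (the unicode_escape decode stands in for llama.cpp's string_process_escapes; treating the two as equivalent here is an assumption for illustration):

```python
# What argv delivers from the shell: a literal backslash followed by 'n'.
raw = "for idx, x in enumerate(xs):\\n    print(idx, x)"

# Roughly what the -e flag applies before tokenization.
escaped = raw.encode().decode("unicode_escape")

print("\\n" in raw)      # True: two separate characters
print("\n" in raw)       # False: no real newline in the raw argument
print("\n" in escaped)   # True: the escape became a single newline character
```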

@ggerganov (Owner) commented

I see, does adding -e to the command-line argument fix the issue?

 ../build/bin/embedding -m ./jina-embeddings-v2-base-code.gguf --threads 1 --verbose-prompt -e -p "for idx, x in enumerate(xs):\n    print(idx, x)"

@JoanFM (Contributor, Author) commented Jun 4, 2024

> I see, does adding -e to the command-line argument fix the issue?
>
> ../build/bin/embedding -m ./jina-embeddings-v2-base-code.gguf --threads 1 --verbose-prompt -e -p "for idx, x in enumerate(xs):\n    print(idx, x)"

Oh, it does!

@JoanFM marked this pull request as ready for review on June 4, 2024 at 15:02
convert-hf-to-gguf.py: 4 review threads (outdated, resolved)
@JoanFM (Contributor, Author) commented Jun 5, 2024

> I see, does adding -e to the command-line argument fix the issue?
>
> ../build/bin/embedding -m ./jina-embeddings-v2-base-code.gguf --threads 1 --verbose-prompt -e -p "for idx, x in enumerate(xs):\n    print(idx, x)"

@ggerganov ,

how can we then be sure this behavior is available in the server? I see this escape option is only available in the example itself.

@ggerganov (Owner) commented

I believe the server already escapes these through the JSON parsing library. Btw, all examples now escape by default since #7675, so there is no need to even add -e explicitly.
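This is standard JSON semantics: "\n" inside a JSON string is defined by the format itself, so any compliant parser hands the server a real newline. A minimal sketch (the "content" field name mirrors the llama.cpp embedding request shape and should be treated as an assumption here):

```python
import json

# What an HTTP client would put on the wire in the request body.
payload = '{"content": "for idx, x in enumerate(xs):\\n    print(idx, x)"}'

# What the server-side JSON parser delivers to the tokenizer.
decoded = json.loads(payload)
print("\n" in decoded["content"])  # True: the newline survives, no -e needed
```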

llama.cpp: 3 review threads (outdated, resolved)
@JoanFM (Contributor, Author) commented Jun 5, 2024

Hey @ggerganov,

Is there something in my code that may have caused this CI run to fail?

@ggerganov (Owner) commented

Probably just a fluke, will restart the workflows now

@JoanFM (Contributor, Author) commented Jun 6, 2024

@ggerganov I tested the behavior in the server and it works; I consider this ready for review.

@ggerganov merged commit f5d7b26 into ggerganov:master on Jun 6, 2024
62 of 69 checks passed