model qwen2-7b-instruct #2553

Open · cesinsingapore opened this issue Jun 12, 2024 · 8 comments
Labels: bug (Something isn't working), unconfirmed

@cesinsingapore

[Screenshot: garbled model output]

The AI reply doesn't make sense.

cesinsingapore added the bug (Something isn't working) and unconfirmed labels on Jun 12, 2024
@cesinsingapore (Author)

It's working fine with another model.

@AlexM4H commented Jun 13, 2024

Same behaviour for me.

Temporary workaround: GPU_LAYERS: 0 (sketch below).
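For anyone else hitting this, a minimal sketch of where that override could live, assuming the AIO image honors a GPU_LAYERS environment variable as this comment implies (the per-model gpu_layers option in the model YAML should have the same effect):

```yaml
# Sketch only: force CPU inference by offloading zero layers to the GPU.
# Assumes GPU_LAYERS is read from the container environment, as suggested above.
services:
  api:
    image: localai/localai:latest-aio-gpu-nvidia-cuda-12
    environment:
      - DEBUG=true
      - GPU_LAYERS=0   # workaround: no GPU offload until the Qwen2 issue is fixed
```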

Further info:

https://huggingface.co/bartowski/Qwen2-7B-Instruct-GGUF/discussions/1

"You can also enable flash attention for llamacpp which should be able to work around the issue"

Is flash attention already enabled in the current docker images?

@cesinsingapore (Author)

I'm using docker-compose with the latest LocalAI image directly.

docker-compose.yml:

```yaml
version: "3.9"
services:
  api:
    image: localai/localai:latest-aio-gpu-nvidia-cuda-12
    healthcheck:
      # LocalAI serves plain HTTP on port 8080
      test: ["CMD", "curl", "-f", "http://localhost:8080/readyz"]
      interval: 1m
      timeout: 20m
      retries: 5
    ports:
      - 8080:8080
    environment:
      - DEBUG=true
      # ...
    volumes:
      - ./models:/build/models:cached
    # uncomment the following piece if running with Nvidia GPUs
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
```

@AlexM4H commented Jun 14, 2024

Have you entered flash_attention: true in your model yaml file?
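For reference, a minimal sketch of where that flag sits in a LocalAI model file; the full config follows in the next comment, so treat this as illustrative placement only:

```yaml
# Illustrative: flash_attention is a top-level key in the model config.
name: qwen2-7b-instruct
flash_attention: true   # enable llama.cpp flash attention for this model
parameters:
  model: Qwen2-7B-Instruct-Q4_K_M.gguf
```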

@cesinsingapore (Author) commented Jun 14, 2024

Do you mean like this? It still generates output like that after I restarted.

models/qwen.yaml:

root@a4681b4b3146:/build/models# cat qwen2-7b-instruct.yaml

```yaml
context_size: 4096
f16: true
mmap: true
name: qwen2-7b-instruct
flash_attention: true
parameters:
  model: Qwen2-7B-Instruct-Q4_K_M.gguf
stopwords:
- <|im_end|>
template:
  chat: |
    {{.Input -}}
    <|im_start|>assistant
  chat_message: |
    <|im_start|>{{ .RoleName }}
    {{ if .FunctionCall -}}
    Function call:
    {{ else if eq .RoleName "tool" -}}
    Function response:
    {{ end -}}
    {{ if .Content -}}
    {{.Content }}
    {{ end -}}
    {{ if .FunctionCall -}}
    {{toJson .FunctionCall}}
    {{ end -}}<|im_end|>
  completion: |
    {{.Input}}
  function: |
    <|im_start|>system
    You are a function calling AI model. You are provided with functions to execute. You may call one or more functions to assist with the user query. Don't make assumptions about what values to plug into functions. Here are the available tools:
    {{range .Functions}}
    {'type': 'function', 'function': {'name': '{{.Name}}', 'description': '{{.Description}}', 'parameters': {{toJson .Parameters}} }}
    {{end}}
    For each function call return a json object with function name and arguments
    <|im_end|>
    {{.Input -}}
    <|im_start|>assistant
```
@AlexM4H commented Jun 14, 2024

Yes, exactly like that; it works for me with that setting.

@AlexM4H commented Jun 17, 2024

@cesinsingapore did you solve your problem?

@cesinsingapore (Author)

Nope, it's still not solved.
