
Openai_api_server Error: 400 status code (no body) #3439

Open
GianlucaDeStefano opened this issue Jul 7, 2024 · 0 comments
GianlucaDeStefano commented Jul 7, 2024

I spawn an OpenAI-compatible server using the following docker-compose file:

version: "3"
services:
  fastchat-controller:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "21001:21001"
    entrypoint: ["python3.9", "-m", "fastchat.serve.controller", "--host", "0.0.0.0", "--port", "21001"]
  fastchat-model-worker:
    build:
      context: .
      dockerfile: Dockerfile
    volumes:
      - /home/gianluca/.cache/huggingface:/root/.cache/huggingface
    image: fastchat:latest
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 2
              capabilities: [gpu]
    ipc: host
    entrypoint: ["python3.9", "-m", "fastchat.serve.vllm_worker", "--model-path", "microsoft/Phi-3-mini-4k-instruct", "--worker-address", "https://fastchat-model-worker:21002", "--controller-address", "https://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "21002", "--num-gpus","2"]
  fastchat-api-server:
    build:
      context: .
      dockerfile: Dockerfile
    image: fastchat:latest
    ports:
      - "8000:8000"
    entrypoint: ["python3.9", "-m", "fastchat.serve.openai_api_server", "--controller-address", "https://fastchat-controller:21001", "--host", "0.0.0.0", "--port", "8000"]

The Dockerfile is:

FROM nvidia/cuda:12.2.0-runtime-ubuntu20.04

RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils curl
RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
RUN python3.9 get-pip.py
RUN pip3 install fschat vllm
RUN pip3 install fschat[model_worker,webui]

Everything works until the prompt length gets close to 4000 tokens, i.e. the size of the model's context window.
As I approach that limit I keep getting the following error back: Error: 400 status code (no body).
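One plausible cause (an assumption based on how vLLM enforces length limits, not something the error message confirms): the server validates prompt tokens *plus* the requested max_tokens against the context window, so a prompt that fits on its own can still be rejected with a 400 once the completion budget is added. A minimal sketch of that kind of check, with illustrative numbers:

```python
def request_fits(prompt_tokens: int, max_tokens: int, context_window: int = 4096) -> bool:
    """Mimic the length check a vLLM-backed server may apply:
    the prompt plus the completion budget must fit in the context window."""
    return prompt_tokens + max_tokens <= context_window

# A 3900-token prompt alone fits in a 4096-token window...
print(request_fits(3900, 0))    # True
# ...but with a 512-token completion budget the request exceeds the window.
print(request_fits(3900, 512))  # False
```

If this is what is happening, lowering max_tokens in the request (or truncating the prompt further) should make the 400 disappear.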
Could someone help me debug this? The prompt is still shorter than the context window, and the error message is not useful. Is there a debug mode I can enable to gather more information about what is happening in the backend?
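One way to see past "400 status code (no body)" is to bypass the OpenAI client and make the HTTP call directly: the server typically does return a JSON body explaining the rejection, which the client error hides. A hedged sketch, assuming the api-server from the compose file above is reachable on localhost:8000:

```python
import json
import urllib.error
import urllib.request

def post_chat(url: str, payload: dict) -> tuple[int, str]:
    """POST a chat-completion request and return (status, body),
    including the body of a 4xx response that urllib raises as HTTPError."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return resp.status, resp.read().decode()
    except urllib.error.HTTPError as e:
        # HTTPError carries the error body that the client message discards.
        return e.code, e.read().decode()

payload = {
    "model": "microsoft/Phi-3-mini-4k-instruct",
    "messages": [{"role": "user", "content": "long prompt here..."}],
    "max_tokens": 512,
}
# With the server running, uncomment to inspect the real error body:
# status, body = post_chat("http://localhost:8000/v1/chat/completions", payload)
# print(status, body)
```

The returned body should state the actual reason for the 400, which is more actionable than the client-side message.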

@GianlucaDeStefano GianlucaDeStefano changed the title Openai_api_server error 400 (no body) Openai_api_server Error: 400 status code (no body) Jul 7, 2024