
OpenAI support for response_format: json_object #373

Closed
simonw opened this issue Dec 9, 2023 · 7 comments
Labels
enhancement New feature or request
simonw commented Dec 9, 2023

New feature released at DevDay - you can now pass "response_format": {"type": "json_object"} to most of the OpenAI models (not GPT-4 Vision yet) to force the result to be returned as valid JSON:

https://platform.openai.com/docs/api-reference/chat/create#chat-create-response_format

response_format: object, Optional

An object specifying the format that the model must output.

Setting to { "type": "json_object" } enables JSON mode, which guarantees the message the model generates is valid JSON.

Important: when using JSON mode, you must also instruct the model to produce JSON yourself via a system or user message. Without this, the model may generate an unending stream of whitespace until the generation reaches the token limit, resulting in a long-running and seemingly "stuck" request. Also note that the message content may be partially cut off if finish_reason="length", which indicates the generation exceeded max_tokens or the conversation exceeded the max context length.
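As a minimal sketch of what the docs above describe from Python: building the `chat.completions.create` keyword arguments with JSON mode enabled, and checking `finish_reason` for the truncation case they warn about. The helper names here are illustrative, not part of the `openai` library.

```python
# Sketch: request parameters for JSON mode, plus a truncation check.
# json_mode_params and is_truncated are illustrative helper names.

def json_mode_params(model, messages):
    """Return kwargs for client.chat.completions.create with JSON mode on."""
    return {
        "model": model,
        "messages": messages,
        "response_format": {"type": "json_object"},
    }

def is_truncated(finish_reason):
    """finish_reason == "length" means max_tokens or the context limit was
    hit, so the JSON content may be cut off mid-document."""
    return finish_reason == "length"

params = json_mode_params(
    "gpt-4-turbo",
    [{"role": "user", "content": "3 pet pelican names as JSON"}],
)
```

The remaining step would be `client.chat.completions.create(**params)` with an authenticated client, then checking `is_truncated(response.choices[0].finish_reason)` before trusting the JSON.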

simonw added the enhancement label Dec 9, 2023
simonw commented Dec 9, 2023

The OpenAI API actually has its own validation that checks that the word "json" was used in the system or user prompt, so I'll let that raise an error rather than adding my own validation, which might not be necessary in the future.
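For illustration, the server-side check described above can be mirrored client-side with a one-line predicate. This is a sketch of the behavior, not OpenAI's actual implementation, and the function name is hypothetical:

```python
# Minimal client-side mirror of OpenAI's validation: with
# response_format json_object, "json" (in some form) must appear
# somewhere in the message contents.

def messages_mention_json(messages):
    """True if any message content mentions "json", case-insensitively."""
    return any("json" in (m.get("content") or "").lower() for m in messages)

ok = messages_mention_json(
    [{"role": "user", "content": "3 pet pelican names in JSON"}]
)
bad = messages_mention_json(
    [{"role": "user", "content": "3 pet pelican names"}]
)
```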

simonw commented Dec 9, 2023

I'm going to add a -o json 1 option to the OpenAI Chat model - but not the Completion model, since that doesn't support this new option.

simonw commented Jan 25, 2024

I'm going to go with -o json_object 1 instead, because I ran into problems with the reserved word json. Plus, json_object is what you pass to their API, and it correctly reflects that you can only ever get back a root-level object, not a root-level list.

simonw commented Jan 25, 2024

$ llm -m gpt-4-turbo '3 names and short bios for pet pelicans' -o json_object 1
Error: 'messages' must contain the word 'json' in some form, to use 'response_format' of type 'json_object'.
$ llm -m gpt-4-turbo '3 names and short bios for pet pelicans in JSON' -o json_object 1
{
  "pelicans": [
    {
      "name": "Gus",
      "bio": "Gus is a curious young pelican with an insatiable appetite for adventure. He's known amongst the dockworkers for playfully snatching sunglasses. Gus spends his days exploring the marina and is particularly fond of performing aerial tricks for treats."
    },
    {
      "name": "Sophie",
      "bio": "Sophie is a graceful pelican with a gentle demeanor. She's become somewhat of a local celebrity at the beach, often seen meticulously preening her feathers or posing patiently for tourists' photos. Sophie has a special spot where she likes to watch the sunset each evening."
    },
    {
      "name": "Captain Beaky",
      "bio": "Captain Beaky is the unofficial overseer of the bay, with a stern yet endearing presence. As a seasoned veteran of the coastal skies, he enjoys leading his flock on fishing expeditions and is always the first to spot the fishing boats returning to the harbor. He's respected by both his pelican peers and the fishermen alike."
    }
  ]
}
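Because JSON mode guarantees valid JSON with an object at the root, output like the above can be fed straight into `json.loads`. A quick check against a trimmed version of that output:

```python
import json

# Trimmed version of the CLI output above; bios shortened to "...".
output = """
{
  "pelicans": [
    {"name": "Gus", "bio": "..."},
    {"name": "Sophie", "bio": "..."},
    {"name": "Captain Beaky", "bio": "..."}
  ]
}
"""

data = json.loads(output)
assert isinstance(data, dict)  # root is always an object, never a list
names = [p["name"] for p in data["pelicans"]]
```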

simonw closed this as completed Jan 25, 2024
simonw added this to the 0.13 milestone Jan 26, 2024
simonw added a commit that referenced this issue Jan 26, 2024
@kkukshtel

The options listed for GPT-4 when you run llm models --options state that GPT-4 can still take a json_object option as part of the request, but running this:

model = llm.get_model("gpt4")
response = model.prompt(
    "Five surprising names for a pet pelican",
    system="Answer like GlaDOS",
    seed=0,
    json_object=True
)

Results in the following error:

Error code: 400 - {'error': {'message': "Invalid parameter: 'response_format' of type 'json_object' is not supported with this model.", 'type': 'invalid_request_error', 'param': 'response_format', 'code': None}}

Is forcing the json_object parameter messing up the OpenAI request?

simonw commented May 4, 2024

That's because json_object isn't supported by GPT-4, but it is supported by GPT-4 Turbo.

Try this:

model = llm.get_model("gpt-4-turbo-preview")
response = model.prompt(
    "Five surprising names for a pet pelican as JSON",
    system="Answer like GlaDOS",
    seed=0,
    json_object=True
)
print(response)

Note that you have to include the word "JSON" in your prompt or you'll get a different error back from OpenAI.

I got this:

{
  "surprising_pet_pelican_names": [
    {
      "name": "Mr. Pockets",
      "reason": "Because who would expect a pelican to carry around more than fish in their beak pouch?"
    },
    {
      "name": "Sir Nibsalot",
      "reason": "It sounds more like a name suited for a tiny, nippy pet rather than a grand, majestic pelican."
    },
    {
      "name": "Duchess Beaky",
      "reason": "A title of nobility for a bird? How preposterously delightful!"
    },
    {
      "name": "Professor Waddles",
      "reason": "One would envision a penguin with this name, not a sleek pelican."
    },
    {
      "name": "Dr. Fishenstein",
      "reason": "Attributing a doctorate in fish science to a pelican is both absurd and genius."
    }
  ]
}

kkukshtel commented May 4, 2024

@simonw running llm models --options provides this for GPT4:

OpenAI Chat: gpt-4 (aliases: 4, gpt4)
  temperature: float
  max_tokens: int
  top_p: float
  frequency_penalty: float
  presence_penalty: float
  stop: str
  logit_bias: dict, str
  seed: int
  json_object: boolean

Seeing json_object: boolean there made me assume GPT-4 could take a bool for that field. Is that not the case? The turbo models produce the same output, so I assumed gpt-4 supported it as well. If 4 doesn't support it, is it possible to remove the field from displaying when you run llm models --options?

I'm on 0.13.1 if that makes any difference.
