vision is a simple OpenAI CLI and GPTScript Tool for interacting with vision models.
- NodeJS
- OpenAI API key
- Clone this repository or download the source code:
  git clone git@github.com:gptscript-ai/vision.git
  cd vision
- Install the npm dependencies:
  npm install
$ node index.js --help
Usage: index [options] <prompt> <images...>

Utility for processing images with the OpenAI API

Arguments:
  prompt                      Prompt to send to the vision model
  images                      List of image URIs to process. Supports file:// and https:// protocols. Images must be jpeg or png.

Options:
  --openai-api-key <key>      OpenAI API Key (env: OPENAI_API_KEY)
  --openai-base-url <string>  OpenAI base URL (env: OPENAI_BASE_URL)
  --openai-org-id <string>    OpenAI Org ID to use (env: OPENAI_ORG_ID)
  --max-tokens <number>       Max tokens to use (default: 2048, env: MAX_TOKENS)
  --model <model>             Model to process images with (choices: "gpt-4-vision-preview", default: "gpt-4-vision-preview", env: MODEL)
  --detail <detail>           Fidelity to use when processing images (choices: "low", "high", "auto", default: "auto", env: DETAIL)
  -h, --help                  display help for command
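
The `env:` annotations in the help text indicate that each option can also be set through an environment variable. The sketch below illustrates one plausible precedence (command-line flag first, then environment variable, then the documented default). `resolveOption` is a hypothetical helper written for this illustration and is not taken from the tool's source; the precedence order is an assumption.

```javascript
// Hypothetical helper, not part of the tool: pick the value for one option.
// Assumed precedence: CLI flag, then environment variable, then default.
function resolveOption(cliValue, envName, defaultValue, env = process.env) {
  if (cliValue !== undefined) return cliValue;
  const fromEnv = env[envName];
  if (fromEnv !== undefined && fromEnv !== '') return fromEnv;
  return defaultValue;
}

// Example: no --max-tokens flag given, so MAX_TOKENS from the environment wins.
const exampleEnv = { MAX_TOKENS: '512' };
const maxTokens = Number(resolveOption(undefined, 'MAX_TOKENS', 2048, exampleEnv));
console.log(maxTokens);
```

Passing the environment as a parameter keeps the helper easy to exercise without touching the real `process.env`.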
node index.js 'Describe the picture' 'file://examples/eiffel-tower.png'
node index.js 'Describe the picture' 'https://github.com/gptscript-ai/vision/blob/main/examples/eiffel-tower.png?raw=true'
node index.js 'Do you think these two portraits are by the same artist?' 'https://github.com/gptscript-ai/vision/blob/main/examples/eiffel-tower.png?raw=true' 'file://examples/eiffel-tower.png'
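
To make the examples above more concrete, here is a rough sketch of how a prompt and a mixed list of `https://` and `file://` images could be turned into a single Chat Completions request body. It is an illustration under assumptions, not the tool's actual code: `toImageUrl` and `buildRequest` are hypothetical names, the text after `file://` is assumed to be a plain file path, and local images are assumed to be sent as base64 data URLs.

```javascript
const fs = require('node:fs');
const path = require('node:path');

// Hypothetical helper: turn a CLI image URI into a URL the API can accept.
function toImageUrl(uri, readFile = fs.readFileSync) {
  if (uri.startsWith('https://')) return uri; // remote images pass through unchanged
  if (!uri.startsWith('file://')) throw new Error(`Unsupported URI: ${uri}`);
  const filePath = uri.slice('file://'.length); // assumption: the remainder is a plain path
  const ext = path.extname(filePath).toLowerCase();
  const mime = ext === '.png' ? 'image/png'
    : (ext === '.jpg' || ext === '.jpeg') ? 'image/jpeg'
    : null;
  if (!mime) throw new Error(`Images must be jpeg or png: ${uri}`);
  // Local files are embedded in the request as a base64 data URL.
  const base64 = Buffer.from(readFile(filePath)).toString('base64');
  return `data:${mime};base64,${base64}`;
}

// Hypothetical helper: one user message holding the prompt plus every image.
function buildRequest(prompt, uris, options = {}, readFile) {
  const { model = 'gpt-4-vision-preview', maxTokens = 2048, detail = 'auto' } = options;
  return {
    model,
    max_tokens: maxTokens,
    messages: [{
      role: 'user',
      content: [
        { type: 'text', text: prompt },
        ...uris.map((uri) => ({
          type: 'image_url',
          image_url: { url: toImageUrl(uri, readFile), detail },
        })),
      ],
    }],
  };
}
```

All images go into one user message so the model can compare them against each other, which is what the two-image example relies on. The `readFile` parameter exists only so the sketch can be tried without real files on disk.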