The OpenAI Compatibility APIs are now only partially available for our `vlm-agent` server. Our SDKs have evolved to support a whole range of features with multi-modal data types that make it difficult to maintain parity with the OpenAI API. If you are still interested in OpenAI compatibility, please contact us.

## Configure the default endpoint and API key
In order to use the VLM Run Agents API, you simply need to override the default endpoint and API key when using the OpenAI Python SDK. You can do this in the following ways:

### 1. Environment variables
Override the default endpoint and API key by setting the following environment variables:
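For instance, assuming the Agents endpoint is served at `https://agents.vlm.run/v1` (a hypothetical URL; check the VLM Run dashboard for the actual base URL and your API key), the OpenAI Python SDK picks these up via `OPENAI_BASE_URL` and `OPENAI_API_KEY`:

```python
import os

# The OpenAI Python SDK reads these at client construction time,
# so set them before instantiating the client.
os.environ["OPENAI_BASE_URL"] = "https://agents.vlm.run/v1"  # hypothetical endpoint
os.environ["OPENAI_API_KEY"] = "<your-vlmrun-api-key>"
```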
### 2. Client configuration

Override the default endpoint and API key by initializing the OpenAI client with the following configuration:
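A minimal sketch, again assuming the hypothetical `https://agents.vlm.run/v1` endpoint:

```python
from openai import OpenAI

# Point the client at the VLM Run Agents API instead of api.openai.com.
client = OpenAI(
    base_url="https://agents.vlm.run/v1",  # hypothetical endpoint
    api_key="<your-vlmrun-api-key>",
)
```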
## Usage: Chat Completion

Once you have set the endpoint and API key, you can use the OpenAI Python SDK as you normally would. Note that the only change required to the `client.chat.completions.create` method is the `extra_body` field, which allows you to specify the `domain` and additional request `metadata`.
For example:
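A sketch of such a call; the model name `vlm-agent` and the domain identifier are illustrative assumptions, and both fields are placed under the `vlmrun` namespace described in the sections below:

```python
response = client.chat.completions.create(
    model="vlm-agent",  # hypothetical model name
    messages=[
        {"role": "user", "content": "Extract the key fields from this invoice."},
    ],
    extra_body={
        "vlmrun": {
            "domain": "document.invoice",  # hypothetical domain identifier
            "metadata": {"session_id": "sess-1234"},
        }
    },
)
print(response.choices[0].message.content)
```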
## Extra Body

The `extra_body` field allows you to specify additional request metadata that is used by the VLM Run Agents API (outside of the OpenAI Python SDK), as indicated by the `vlmrun` field. This metadata is used to specify request options such as `allow_training`, `environment`, etc.
For example, the following code specifies the request `metadata`:
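A sketch reusing the client above; field semantics follow the Request Metadata section below:

```python
response = client.chat.completions.create(
    model="vlm-agent",  # hypothetical model name
    messages=[{"role": "user", "content": "Describe this image."}],
    extra_body={
        "vlmrun": {
            "metadata": {
                "environment": "dev",     # one of: dev, staging, prod
                "allow_training": False,  # opt this request out of training data
            }
        }
    },
)
```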
## Request Metadata

Currently, the VLM Run Agents API supports submitting request metadata along with the chat completions request via the `extra_body` keyword argument. The VLM Run Agents API accepts the request metadata fields described below.
For more details on the request metadata, please refer to the Request Metadata section of the API reference.
`vlmrun.metadata` fields:

- `environment` (`dev`, `staging`, `prod`): Specifies the environment in which the request is being made. This can be useful for tracking requests across different environments. By default, this property is set to `prod`.
- `session_id`: A string identifier for the session, which can be used to track requests across different sessions.
- `allow_training`: Flags the request as a potential candidate for our training dataset. If set to `true`, the request may be used for training our base models; if set to `false`, the request will be used for inference only. By default, this property is set to `true`.
- `allow_retention`: Flags the request as a potential candidate for our retention dataset. If set to `true`, the request data may be retained; if set to `false`, the request will be used for inference only. By default, this property is set to `true`.
- `allow_logging`: Flags the request as a potential candidate for our logging dataset. If set to `true`, the request may be logged; if set to `false`, the request will be used for inference only. By default, this property is set to `true`.
- `extra`: A dictionary of extra metadata that can be used to track the request.
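Putting it together, a request that sets every `vlmrun.metadata` field might look like the following sketch (field placement assumed as in the sections above):

```python
metadata = {
    "environment": "prod",        # dev | staging | prod (default: prod)
    "session_id": "sess-5678",    # correlate requests within a session
    "allow_training": True,       # may be used to train base models (default: true)
    "allow_retention": True,      # data may be retained (default: true)
    "allow_logging": True,        # request may be logged (default: true)
    "extra": {"team": "billing"}, # free-form tracking metadata
}

response = client.chat.completions.create(
    model="vlm-agent",  # hypothetical model name
    messages=[{"role": "user", "content": "Hello!"}],
    extra_body={"vlmrun": {"metadata": metadata}},
)
```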
## Token Usage

The OpenAI Python SDK provides usage statistics for your account on every API call. This can be useful for monitoring your usage and costs when using the API. We refer the user to the VLM Run Pricing page for more information on pricing and usage.
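For example, the standard `usage` object on each response reports token counts:

```python
response = client.chat.completions.create(
    model="vlm-agent",  # hypothetical model name
    messages=[{"role": "user", "content": "Summarize this document."}],
)

# Standard OpenAI SDK usage statistics, returned on every completion.
usage = response.usage
print(f"prompt={usage.prompt_tokens} "
      f"completion={usage.completion_tokens} "
      f"total={usage.total_tokens}")
```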
## Compatibility Differences

The VLM Run Agents API differs from the OpenAI API in the following ways:

- Messages can now contain `input_file` objects of the form `{"type": "input_file", "file_id": "<file_id>"}`, where `file_id` is the ID of a file uploaded to the VLM Run Agents API. These are especially useful for processing large files such as videos, images, etc. (see the sketch after this list).
- `max_tokens`: The `max_tokens` field in `chat.completions.create` is currently not respected by our server. This means that if the token output exceeds the limit, the server will still return the full output.
- `logprobs`, `logit_bias`, `top_logprobs`, `presence_penalty`, `frequency_penalty`, `n`, `stream`, `stop`: These fields are not currently supported by the VLM Run Agents API. We will be adding support for these features in the near future.
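As a sketch of the `input_file` extension above, assuming files are uploaded through the OpenAI-style Files API (the `purpose` value is an assumption):

```python
# Upload a large file first; we assume the Agents API mirrors the OpenAI Files API.
with open("meeting.mp4", "rb") as f:
    uploaded = client.files.create(file=f, purpose="assistants")  # purpose is assumed

response = client.chat.completions.create(
    model="vlm-agent",  # hypothetical model name
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "input_file", "file_id": uploaded.id},
                {"type": "text", "text": "Summarize this video."},
            ],
        }
    ],
)
```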