OpenAI Compatibility
Run VLM-1 through the OpenAI Python SDK by changing just two lines of code.
With our new OpenAI-compatible API, you can use the OpenAI Python SDK to interact with VLM Run. Developers can switch between the OpenAI and VLM Run APIs by changing only the endpoint and API key, leaving the rest of their code as is.
Configure the default endpoint and API key
To use the VLM Run API, override the default endpoint and API key when using the OpenAI Python SDK.
You can do this in the following ways:
1. Environment variables
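Override the default endpoint and API key by setting the following environment variables. Version 1.x of the OpenAI Python SDK (the version that provides openai.OpenAI) reads OPENAI_BASE_URL; older 0.x releases read OPENAI_API_BASE instead.
export OPENAI_BASE_URL="https://api.vlm.run/v1/openai"
export OPENAI_API_KEY="<VLMRUN_API_KEY>"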
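To confirm that the variables are visible to your Python process before creating a client, a small check like the following can help. This is a minimal sketch: resolve_client_config and VLMRUN_BASE_URL are hypothetical names introduced here, not part of either SDK. It accepts both OPENAI_BASE_URL (read by version 1.x of the OpenAI Python SDK) and the legacy OPENAI_API_BASE name.

```python
import os

# Endpoint taken from this page.
VLMRUN_BASE_URL = "https://api.vlm.run/v1/openai"


def resolve_client_config(env=None):
    """Return the base_url and api_key found in the environment.

    Hypothetical helper for illustration; raises RuntimeError when the
    variables are missing or point somewhere other than VLM Run.
    """
    env = os.environ if env is None else env
    # Prefer the 1.x variable name, then fall back to the legacy 0.x name.
    base_url = env.get("OPENAI_BASE_URL") or env.get("OPENAI_API_BASE")
    api_key = env.get("OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    if base_url != VLMRUN_BASE_URL:
        raise RuntimeError(
            f"Base URL is {base_url!r}, expected {VLMRUN_BASE_URL!r}"
        )
    return {"base_url": base_url, "api_key": api_key}
```

The returned dictionary matches the keyword arguments of the client constructor, so it can be passed along as openai.OpenAI(**resolve_client_config()).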
2. Client configuration
Override the default endpoint and API key by initializing the OpenAI client with the following configuration:
import openai

client = openai.OpenAI(
    base_url="https://api.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>",
)
Usage: Chat Completion
Once you have set the endpoint and API key, you can use the OpenAI Python SDK as you normally would. The only change required to the client.chat.completions.create method is the extra_body field, which lets you specify the domain and additional request metadata.
For example:
import openai

# Initialize the OpenAI client against the VLM Run endpoint
client = openai.OpenAI(
    base_url="https://api.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>",
)

# Example: Chat completion with an image input.
# encode_image(image) is a user-supplied helper (not part of the OpenAI SDK)
# that should return an image URL or a base64 data URL.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": [
        {"type": "text", "text": "What's in this image?"},
        {"type": "image_url", "image_url": {"url": encode_image(image), "detail": "auto"}},
    ]},
]

# Perform chat completion
chat_completion = client.chat.completions.create(
    model="vlm-1",
    messages=messages,
    temperature=0.7,
    extra_body={
        "vlmrun": {
            "domain": "document.presentation"
        }
    },
)
print(chat_completion.choices[0].message.content)
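The example above calls encode_image, which is not defined on this page. One possible implementation is sketched below, assuming the image is available as raw bytes and should be sent inline as a base64 data URL. The signature (bytes plus a MIME type) and the image_part helper are assumptions made for illustration; the original helper may take a different input, such as a PIL image.

```python
import base64


def encode_image(image_bytes: bytes, mime_type: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL.

    Assumed signature for illustration only.
    """
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_part(image_bytes: bytes, mime_type: str = "image/jpeg") -> dict:
    """Build the image_url content part used inside a user message."""
    return {
        "type": "image_url",
        "image_url": {"url": encode_image(image_bytes, mime_type)},
    }
```

With this version, the image would be read from disk first, for example with open("slide.jpg", "rb").read() (a hypothetical file name), and the result passed to encode_image.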
Request Metadata
Currently, the VLM Run API supports submitting request metadata along with the chat completions request via the extra_body keyword argument. For example, the VLM Run API accepts the following request metadata:
chat_completion = client.chat.completions.create(
    model="vlm-1",
    messages=messages,
    temperature=0.7,
    extra_body={
        "vlmrun": {
            "metadata": {
                "allow_training": False,
            }
        }
    },
)
The VLM Run API supports the following request vlmrun.metadata fields:
allow_training: This property flags the request as a potential candidate for our training dataset. If set to true, the request may be used for training our base models. If set to false, the request will be used for inference only. By default, this property is set to true.
environment (dev, staging, prod): This property specifies the environment in which the request is being made. This can be useful for tracking requests across different environments. By default, this property is set to prod.
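The fields described above can be assembled into an extra_body value with a small helper. The following is a sketch only: build_extra_body is a hypothetical function, not part of the VLM Run or OpenAI SDKs, and its defaults simply mirror the documented ones (allow_training set to true, environment set to prod).

```python
# Environment values listed in the metadata field description.
ALLOWED_ENVIRONMENTS = ("dev", "staging", "prod")


def build_extra_body(domain=None, allow_training=True, environment="prod"):
    """Build the extra_body dict for client.chat.completions.create.

    Hypothetical helper; validates environment on the client side so a
    typo is caught before the request is sent.
    """
    if environment not in ALLOWED_ENVIRONMENTS:
        raise ValueError(
            f"environment must be one of {ALLOWED_ENVIRONMENTS}, got {environment!r}"
        )
    vlmrun = {
        "metadata": {
            "allow_training": bool(allow_training),
            "environment": environment,
        }
    }
    # The domain is optional here; it is only included when provided.
    if domain is not None:
        vlmrun["domain"] = domain
    return {"vlmrun": vlmrun}
```

For instance, build_extra_body("document.presentation", allow_training=False, environment="dev") produces a dictionary that can be passed as the extra_body argument in the chat completion call.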
Token Usage
Every response returned through the OpenAI Python SDK includes token usage statistics for that request. This can be useful for monitoring your usage and costs when using the API. See the VLM Run Pricing page for more information on pricing and usage.
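As a rough illustration of reading these statistics, the sketch below pulls the token counts from a chat completion object. It assumes the response exposes the OpenAI SDK's usual usage attribute with prompt_tokens, completion_tokens, and total_tokens. summarize_usage is a hypothetical helper, and the stand-in object with made-up numbers exists only so the snippet runs without a network call.

```python
from types import SimpleNamespace


def summarize_usage(completion):
    """Return token counts from a chat completion, or None if absent."""
    usage = getattr(completion, "usage", None)
    if usage is None:
        return None
    return {
        "prompt_tokens": usage.prompt_tokens,
        "completion_tokens": usage.completion_tokens,
        "total_tokens": usage.total_tokens,
    }


# Stand-in for a real response; the numbers are illustrative, not measured.
fake_completion = SimpleNamespace(
    usage=SimpleNamespace(prompt_tokens=120, completion_tokens=30, total_tokens=150)
)
print(summarize_usage(fake_completion))
```

In real use, the object returned by client.chat.completions.create would be passed in place of fake_completion.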
Compatibility Differences
Unlike the OpenAI API, the VLM Run API does not support the following features:
detail: The detail field in image_url objects is not currently supported. We will be adding support for this feature in the near future.
max_tokens: The max_tokens field in chat.completions.create is currently not respected by our server. If the output exceeds the limit, the server will still return the full output.
logprobs, logit_bias, top_logprobs, presence_penalty, frequency_penalty, n, stream, stop: These fields are not currently supported by the VLM Run API. We will be adding support for these features in the near future.
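When sharing request-building code between OpenAI and VLM Run, one option is to drop the unsupported parameters on the client side before sending. The sketch below is an assumption about how one might do this: split_supported is a hypothetical helper, and this page does not state whether the server rejects or silently ignores these fields. max_tokens is left in place because it is accepted but not enforced.

```python
# Parameters listed on this page as not currently supported.
UNSUPPORTED_PARAMS = frozenset({
    "logprobs", "logit_bias", "top_logprobs", "presence_penalty",
    "frequency_penalty", "n", "stream", "stop",
})


def split_supported(params):
    """Split request kwargs into (supported kwargs, dropped names).

    Hypothetical helper; the input dict is not modified.
    """
    supported = {k: v for k, v in params.items() if k not in UNSUPPORTED_PARAMS}
    dropped = sorted(k for k in params if k in UNSUPPORTED_PARAMS)
    return supported, dropped
```

The first element of the returned tuple can be unpacked into client.chat.completions.create(**supported), while the second can be logged so that dropped options do not go unnoticed.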