POST /v1/openai/chat/completions
!pip install vlmrun

from vlmrun.client import VLMRun

# Initialize the VLM Run client
client = VLMRun(base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>")

# Create a chat completion
response = client.agent.completions.create(
  model="vlmrun-orion-1:auto",
  messages=[{"role": "user", "content": "Who are you and what can you do?"}],
  temperature=0.7,
)
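
Because the endpoint lives under /v1/openai/, it follows the OpenAI chat completions wire format. As a sketch, the official openai Python SDK should also work when pointed at this route; the base_url below is inferred from the POST path above and is an assumption, not confirmed by this page.

# Sketch: calling the same endpoint with the official OpenAI SDK.
# base_url is inferred from POST /v1/openai/chat/completions (assumption).
from openai import OpenAI

client = OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>",
)

response = client.chat.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[{"role": "user", "content": "Who are you and what can you do?"}],
    temperature=0.7,
)
print(response.choices[0].message.content)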
Validation error response:

{
  "detail": [
    {
      "loc": [
        "<string>"
      ],
      "msg": "<string>",
      "type": "<string>"
    }
  ]
}
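
A minimal sketch of surfacing these errors with the requests library; the 422 status code for validation failures is an assumption based on the standard loc/msg/type shape above.

import requests

# Sketch: inspect validation errors from a raw HTTP call.
# The 422 status code is an assumption, not confirmed by this page.
resp = requests.post(
    "https://agent.vlm.run/v1/openai/chat/completions",
    headers={"Authorization": "Bearer <VLMRUN_API_KEY>"},
    json={"messages": "not-a-list"},  # deliberately malformed payload
)
if resp.status_code == 422:
    for err in resp.json()["detail"]:
        print(err["loc"], err["msg"], err["type"])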

Authorizations

Authorization
string · header · required
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Headers

user-agent
string | null

Body

application/json

Request payload for the OpenAI chat completions API for vlmrun-orion-1. A consolidated request example using these fields follows the field list below.

messages
Message · object[] · required
Messages to complete.

id
string
ID of the completion.

model
string · default: vlmrun-orion-1:auto
VLM Run Agent model to use for completion.
Available options: vlmrun-orion-1, vlmrun-orion-1:auto, vlmrun-orion-1:fast, vlmrun-orion-1:pro

max_tokens
integer · default: 32768
Maximum number of tokens to generate.

n
integer | null · default: 1
Number of completions to generate.

temperature
number · default: 0
Temperature of the sampling distribution.

top_p
number · default: 1
Cumulative probability of the highest-probability vocabulary tokens to keep for nucleus sampling.

top_k
integer | null
Number of highest-probability vocabulary tokens to keep for top-k filtering.

logprobs
integer | null
Include the log probabilities of the logprobs most likely tokens, as well as the chosen tokens.

stream
boolean · default: false
Whether to stream the response (see the streaming sketch at the end of this page).

preview
boolean | null
Whether to generate previews for the response.

response_format
JSONSchemaResponseFormat · object
Response format for JSON schema mode, as per the Fireworks AI specification.

session_id
string
Session UUID for persisting the chat history.

metadata
Metadata · object
Additional metadata for the request (e.g., dataset_name, experiment_id).

skills
AgentSkill · object[] | null
List of agent skills to enable for this request.
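
To see how the body fields above compose, here is a sketch of a fuller request through the vlmrun client from the example at the top of this page. Passing these documented fields as keyword arguments is an assumption, and the response_format shape follows the Fireworks AI JSON-schema convention that the response_format field references.

from vlmrun.client import VLMRun

client = VLMRun(base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>")

# Sketch: exercising the documented body fields in one request.
# Keyword-argument names mirror the field list above; passing them this
# way is an assumption, not confirmed by this page.
response = client.agent.completions.create(
    model="vlmrun-orion-1:fast",  # one of the documented model options
    messages=[{"role": "user", "content": "Summarize what you can do as JSON."}],
    max_tokens=1024,              # default is 32768
    temperature=0,                # default sampling temperature
    top_p=1,
    n=1,
    stream=False,
    session_id="123e4567-e89b-12d3-a456-426614174000",  # hypothetical session UUID
    metadata={"dataset_name": "demo", "experiment_id": "exp-1"},
    response_format={             # Fireworks-style JSON-schema mode (assumption)
        "type": "json_object",
        "schema": {
            "type": "object",
            "properties": {"summary": {"type": "string"}},
        },
    },
)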

Response

Successful Response
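
The stream field above defaults to false. A minimal streaming sketch, assuming the client yields chunks in the OpenAI streaming style when stream=True; the iteration interface is an assumption, not confirmed by this page.

from vlmrun.client import VLMRun

client = VLMRun(base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>")

# Sketch: streaming a completion chunk by chunk.
# Assumes an OpenAI-style iterator of chunks (assumption).
stream = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[{"role": "user", "content": "Stream a short greeting."}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta
    if getattr(delta, "content", None):
        print(delta.content, end="", flush=True)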