## Documentation Index

Fetch the complete documentation index at: https://docs.vlm.run/llms.txt

Use this file to discover all available pages before exploring further.
The `client.agent` object allows you to interact with VLM Run's Orion Agents for multi-modal chat completions.

## Chat Completions

Generate responses from the agent using the chat completions API:
```python
from vlmrun.client import VLMRun

# Initialize the client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Basic text completion
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)
print(response.choices[0].message.content)
```
## Image Analysis

Analyze images using the agent:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Analyze an image
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg", "detail": "auto"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)
```
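If an image lives on disk rather than at a public URL, a common pattern with OpenAI-compatible chat APIs is to inline it as a base64 data URL. Whether VLM Run accepts data URLs is an assumption here, and the `to_data_url` helper below is hypothetical (not part of the SDK); this is a sketch of the idea, not a documented feature:

```python
import base64

# Assumption: VLM Run accepts base64 data URLs in "image_url",
# as OpenAI-compatible chat APIs commonly do.
def to_data_url(image_bytes: bytes, mime: str = "image/jpeg") -> str:
    """Encode raw image bytes as a base64 data URL (hypothetical helper)."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{encoded}"

# Build the same "image_url" content part as above, but with inlined bytes
# (placeholder bytes here; in practice, read them from a local file)
image_part = {
    "type": "image_url",
    "image_url": {"url": to_data_url(b"\xff\xd8\xff\xe0fake-jpeg-bytes")},
}
```

The resulting `image_part` dict can be dropped into the `content` list in place of the hosted-URL part.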
## Video Analysis

Analyze videos using the agent:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Analyze a video
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this video"},
                {"type": "video_url", "video_url": {"url": "https://example.com/video.mp4"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)
```
## Structured Outputs

Get structured JSON responses using Pydantic schemas:
```python
from vlmrun.client import VLMRun
from pydantic import BaseModel, Field

# Define the response schema
class ImageCaption(BaseModel):
    caption: str = Field(..., description="Detailed caption of the image")
    tags: list[str] = Field(..., description="Tags describing the image")

client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Get a structured response
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Generate a caption and tags for this image"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ],
    response_format={"type": "json_schema", "schema": ImageCaption.model_json_schema()}
)

# Validate and parse the response
result = ImageCaption.model_validate_json(response.choices[0].message.content)
print(result)
```
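Because the schema generation and validation are plain Pydantic, you can inspect the JSON schema that would be sent as `response_format["schema"]` and test parsing locally, without making any API call:

```python
from pydantic import BaseModel, Field

# Same schema as in the example above
class ImageCaption(BaseModel):
    caption: str = Field(..., description="Detailed caption of the image")
    tags: list[str] = Field(..., description="Tags describing the image")

# The dict passed as response_format["schema"]
schema = ImageCaption.model_json_schema()
print(sorted(schema["properties"]))  # ['caption', 'tags']

# Simulate a model reply and validate it, exactly as done with the
# real response.choices[0].message.content
sample = '{"caption": "A cat on a windowsill", "tags": ["cat", "window"]}'
result = ImageCaption.model_validate_json(sample)
print(result.tags)  # ['cat', 'window']
```

A reply missing a required field (e.g. no `tags`) raises a `pydantic.ValidationError`, which makes malformed model output easy to catch.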
## Document Analysis

Analyze documents and PDFs:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Analyze a PDF document
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract key information from this document"},
                {"type": "file_url", "file_url": {"url": "https://example.com/document.pdf"}}
            ]
        }
    ]
)
print(response.choices[0].message.content)
```
## SDK Reference

### `client.agent.completions.create()`

Create a chat completion with the agent.

Parameters:
| Parameter | Type | Description |
|---|---|---|
| `model` | `str` | Model to use (e.g., `vlmrun-orion-1:auto`) |
| `messages` | `list[dict]` | List of messages in the conversation |
| `response_format` | `dict` | Optional JSON schema for structured output |
| `stream` | `bool` | Enable streaming responses (default: `False`) |
Returns: `ChatCompletionResponse`

### Message Content Types

| Type | Description |
|---|---|
| `text` | Plain text content |
| `image_url` | Image URL for image analysis |
| `video_url` | Video URL for video analysis |
| `file_url` | File URL for document analysis |
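The content types above compose freely within a single user message, as the examples earlier show. A small helper (hypothetical, not part of the SDK) that assembles a mixed text-plus-media payload in the shape those examples use:

```python
def build_user_message(text, media=None):
    """Assemble a multi-modal user message (hypothetical helper).

    media is a list of (content_type, url) pairs, where content_type is
    one of "image_url", "video_url", or "file_url" from the table above.
    """
    content = [{"type": "text", "text": text}]
    for kind, url in media or []:
        # Each media part nests its URL under a key matching its type,
        # e.g. {"type": "image_url", "image_url": {"url": ...}}
        content.append({"type": kind, kind: {"url": url}})
    return {"role": "user", "content": content}

# One message mixing an image and a document
msg = build_user_message(
    "Compare these two inputs",
    [("image_url", "https://example.com/a.jpg"),
     ("file_url", "https://example.com/b.pdf")],
)
```

The resulting dict can be passed directly in the `messages` list of `client.agent.completions.create()`.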