The client.agent object allows you to interact with VLM Run’s Orion Agents for multi-modal chat completions.

Chat Completions

Generate responses from the agent using the chat completions API:
```python
from vlmrun.client import VLMRun

# Initialize the client
client = VLMRun(api_key="<VLMRUN_API_KEY>", base_url="https://agent.vlm.run/v1")

# Basic text completion
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
```

Image Analysis

Analyze images using the agent:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>", base_url="https://agent.vlm.run/v1")

# Analyze an image
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this image in detail"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg", "detail": "auto"}}
            ]
        }
    ]
)

print(response.choices[0].message.content)
```

Video Analysis

Analyze videos using the agent:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>", base_url="https://agent.vlm.run/v1")

# Analyze a video
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Summarize this video"},
                {"type": "video_url", "video_url": {"url": "https://example.com/video.mp4"}}
            ]
        }
    ]
)

print(response.choices[0].message.content)
```

Structured Outputs

Get structured JSON responses using Pydantic schemas:
```python
from vlmrun.client import VLMRun
from pydantic import BaseModel, Field

# Define response schema
class ImageCaption(BaseModel):
    caption: str = Field(..., description="Detailed caption of the image")
    tags: list[str] = Field(..., description="Tags describing the image")

client = VLMRun(api_key="<VLMRUN_API_KEY>", base_url="https://agent.vlm.run/v1")

# Get structured response
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Generate a caption and tags for this image"},
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ],
    response_format={"type": "json_schema", "schema": ImageCaption.model_json_schema()}
)

# Validate and parse the response
result = ImageCaption.model_validate_json(response.choices[0].message.content)
print(result)
```

Document Analysis

Analyze documents and PDFs:
```python
from vlmrun.client import VLMRun

client = VLMRun(api_key="<VLMRUN_API_KEY>", base_url="https://agent.vlm.run/v1")

# Analyze a PDF document
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract key information from this document"},
                {"type": "file_url", "file_url": {"url": "https://example.com/document.pdf"}}
            ]
        }
    ]
)

print(response.choices[0].message.content)
```

SDK Reference

client.agent.completions.create()

Create a chat completion with the agent. Parameters:
| Parameter | Type | Description |
| --- | --- | --- |
| `model` | `str` | Model to use (e.g., `vlmrun-orion-1:auto`) |
| `messages` | `list[dict]` | List of messages in the conversation |
| `response_format` | `dict` | Optional JSON schema for structured output |
| `stream` | `bool` | Enable streaming responses (default: `False`) |
Returns: `ChatCompletionResponse`
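When `stream=True`, the response arrives incrementally rather than as a single completion. The sketch below illustrates the typical consumption loop, assuming OpenAI-style chunks that expose `choices[0].delta.content` (an assumption about the chunk shape, not confirmed by this page); a stand-in generator takes the place of a live `client.agent.completions.create(..., stream=True)` call:

```python
from types import SimpleNamespace

# Stand-in for the chunks a streaming completion would yield; in real use,
# the stream comes from client.agent.completions.create(..., stream=True).
def fake_stream():
    for piece in ["The capital ", "of France ", "is Paris."]:
        delta = SimpleNamespace(content=piece)
        yield SimpleNamespace(choices=[SimpleNamespace(delta=delta)])

# Consume the stream chunk by chunk, accumulating the text deltas
parts = []
for chunk in fake_stream():
    delta = chunk.choices[0].delta
    if delta.content:
        parts.append(delta.content)

full_text = "".join(parts)
print(full_text)  # → The capital of France is Paris.
```

The same loop works against the real stream: iterate over the return value of `create(..., stream=True)` and print each delta as it arrives.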

Message Content Types

| Type | Description |
| --- | --- |
| `text` | Plain text content |
| `image_url` | Image URL for image analysis |
| `video_url` | Video URL for video analysis |
| `file_url` | File URL for document analysis |
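These content types can be combined in a single message. The hypothetical helper below (its name and signature are illustrative, not part of the SDK) assembles a multi-modal user message from whichever URLs are supplied, producing the same dict shape used in the examples above:

```python
# Illustrative helper: build a multi-modal user message from the
# content types above. Not part of the SDK; names are hypothetical.
def build_user_message(text, image_url=None, video_url=None, file_url=None):
    content = [{"type": "text", "text": text}]
    if image_url:
        content.append({"type": "image_url", "image_url": {"url": image_url}})
    if video_url:
        content.append({"type": "video_url", "video_url": {"url": video_url}})
    if file_url:
        content.append({"type": "file_url", "file_url": {"url": file_url}})
    return {"role": "user", "content": content}

# Mix an image and a document in one request
message = build_user_message(
    "Compare the chart in the image with the figures in the report",
    image_url="https://example.com/chart.png",
    file_url="https://example.com/report.pdf",
)
print([part["type"] for part in message["content"]])  # → ['text', 'image_url', 'file_url']
```

The resulting `message` can be passed directly in the `messages` list of `client.agent.completions.create()`.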