The agent component provides methods for interacting with VLM Run’s Orion Agents for multi-modal chat completions.

Chat Completions

Generate responses from the agent using the chat completions API:
import { VlmRun } from "vlmrun";

// Initialize the client
const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
  baseURL: "https://agent.vlm.run/v1"
});

// Basic text completion
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    { role: "user", content: "What is the capital of France?" }
  ]
});

console.log(response.choices[0].message.content);

Image Analysis

Analyze images using the agent:
import { VlmRun } from "vlmrun";

const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
  baseURL: "https://agent.vlm.run/v1"
});

// Analyze an image
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Describe this image in detail" },
        { type: "image_url", image_url: { url: "https://example.com/image.jpg", detail: "auto" } }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

Video Analysis

Analyze videos using the agent:
import { VlmRun } from "vlmrun";

const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
  baseURL: "https://agent.vlm.run/v1"
});

// Analyze a video
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Summarize this video" },
        { type: "video_url", video_url: { url: "https://example.com/video.mp4" } }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

Structured Outputs

Get structured JSON responses by passing a JSON schema in response_format and typing the parsed result with a TypeScript interface:
import { VlmRun } from "vlmrun";

// Define response schema
interface ImageCaption {
  caption: string;
  tags: string[];
}

const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
  baseURL: "https://agent.vlm.run/v1"
});

// Get structured response
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Generate a caption and tags for this image" },
        { type: "image_url", image_url: { url: "https://example.com/image.jpg" } }
      ]
    }
  ],
  response_format: {
    type: "json_schema",
    schema: {
      type: "object",
      properties: {
        caption: { type: "string", description: "Detailed caption of the image" },
        tags: { type: "array", items: { type: "string" }, description: "Tags describing the image" }
      },
      required: ["caption", "tags"]
    }
  }
});

// Parse the response
const result: ImageCaption = JSON.parse(response.choices[0].message.content);
console.log(result);
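
The type annotation on the parsed result is only checked at compile time, so a malformed response would still pass through JSON.parse unnoticed. A minimal sketch of a runtime check follows; isImageCaption and parseImageCaption are hypothetical helpers written for this example and are not part of the vlmrun SDK.

```typescript
// Hypothetical helpers for illustration; not part of the vlmrun SDK.
interface ImageCaption {
  caption: string;
  tags: string[];
}

// Type guard: true only if the value has the shape required by the schema above.
function isImageCaption(value: unknown): value is ImageCaption {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.caption === "string" &&
    Array.isArray(v.tags) &&
    v.tags.every((tag) => typeof tag === "string")
  );
}

// Parse message content and fail loudly if it is empty or has the wrong shape.
function parseImageCaption(content: string | null | undefined): ImageCaption {
  if (!content) {
    throw new Error("Response contained no message content");
  }
  const parsed: unknown = JSON.parse(content);
  if (!isImageCaption(parsed)) {
    throw new Error("Response does not match the ImageCaption schema");
  }
  return parsed;
}
```

With this in place, the parsing step could become parseImageCaption(response.choices[0].message.content), which throws instead of returning a mistyped object.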

Document Analysis

Analyze documents and PDFs:
import { VlmRun } from "vlmrun";

const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
  baseURL: "https://agent.vlm.run/v1"
});

// Analyze a PDF document
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "Extract key information from this document" },
        { type: "file_url", file_url: { url: "https://example.com/document.pdf" } }
      ]
    }
  ]
});

console.log(response.choices[0].message.content);

SDK Reference

client.agent.completions.create()

Create a chat completion with the agent.

Parameters:

| Parameter | Type | Description |
| --- | --- | --- |
| model | string | Model to use (e.g., vlmrun-orion-1:auto) |
| messages | Message[] | List of messages in the conversation |
| response_format | ResponseFormat | Optional JSON schema for structured output |
| stream | boolean | Enable streaming responses (default: false) |

Returns: Promise<ChatCompletionResponse>
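
The stream parameter is listed above, but the shape of a streamed response is not documented here. The sketch below assumes that, with stream set to true, the call yields an async iterable of OpenAI-style chunks carrying incremental text in choices[0].delta.content. StreamChunk and collectStream are hypothetical names for this example; check the SDK's actual stream type before relying on them.

```typescript
// Assumed chunk shape (OpenAI-style); not confirmed by this page.
interface StreamChunk {
  choices: Array<{ delta?: { content?: string | null } }>;
}

// Consume a stream of chunks, forwarding each text fragment to an optional
// callback and returning the full concatenated text at the end.
async function collectStream(
  stream: AsyncIterable<StreamChunk>,
  onToken?: (token: string) => void
): Promise<string> {
  let text = "";
  for await (const chunk of stream) {
    // Chunks without choices or without delta text contribute nothing.
    const token = chunk.choices[0]?.delta?.content ?? "";
    if (token) {
      onToken?.(token);
      text += token;
    }
  }
  return text;
}
```

Under that assumption, the result of a streaming create() call could be passed directly to collectStream, with onToken used to print fragments as they arrive.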

Message Content Types

| Type | Description |
| --- | --- |
| text | Plain text content |
| image_url | Image URL for image analysis |
| video_url | Video URL for video analysis |
| file_url | File URL for document analysis |
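
To make the four content types concrete, here is a sketch of a discriminated union that mirrors the content parts used in the examples on this page, plus a helper that builds a user message from a prompt and a list of URLs. ContentPart, UserMessage, partFromUrl, and userMessage are hypothetical names, not the SDK's exported types, and the extension-to-type mapping is only an illustrative assumption.

```typescript
// Hypothetical types mirroring the content parts shown in the examples;
// the SDK's own exported types may differ.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string; detail?: string } }
  | { type: "video_url"; video_url: { url: string } }
  | { type: "file_url"; file_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Choose a content type from the file extension, ignoring any query string
// or fragment. The extension lists are illustrative, not a list of supported formats.
function partFromUrl(url: string): ContentPart {
  if (/\.(jpe?g|png|gif|webp)(?:[?#].*)?$/i.test(url)) {
    return { type: "image_url", image_url: { url } };
  }
  if (/\.(mp4|mov|webm)(?:[?#].*)?$/i.test(url)) {
    return { type: "video_url", video_url: { url } };
  }
  // Anything else is sent as a generic file (e.g. a PDF).
  return { type: "file_url", file_url: { url } };
}

// Build a single user message: the text prompt first, then one part per URL.
function userMessage(prompt: string, ...urls: string[]): UserMessage {
  return {
    role: "user",
    content: [{ type: "text", text: prompt }, ...urls.map(partFromUrl)],
  };
}
```

A message built this way could be passed in the messages array, for example userMessage("Summarize this video", "https://example.com/video.mp4").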

Best Practices

  1. Structured Outputs
    • Define clear JSON schemas for predictable responses
    • Use TypeScript interfaces for type safety
  2. Multi-Modal Inputs
    • Use appropriate content types (image_url, video_url, file_url)
    • Set detail level for images based on analysis needs
  3. Error Handling
    • Always wrap API calls in try-catch blocks
    • Handle rate limits and timeouts appropriately
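
The error-handling advice above can be sketched as a small wrapper that applies a per-attempt timeout and retries transient failures with exponential backoff. Everything here is an assumption made for illustration: withRetry, withTimeout, and isRetryable are hypothetical helpers, the retry policy is arbitrary, and the check assumes that API errors expose an HTTP status code on a status property, which this page does not confirm.

```typescript
// Hypothetical helpers; the retry policy and the `status` field are assumptions.
interface RetryOptions {
  retries?: number; // retries allowed after the first attempt
  baseDelayMs?: number; // delay before the first retry; doubles each time
  timeoutMs?: number; // per-attempt timeout
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Reject if the promise does not settle within `ms`.
// Note: this stops waiting but does not cancel the underlying request.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => {
        clearTimeout(timer);
        resolve(value);
      },
      (error) => {
        clearTimeout(timer);
        reject(error);
      }
    );
  });
}

// Treat 429 and 5xx as transient; errors without a numeric status
// (network failures, timeouts) are also retried.
function isRetryable(error: unknown): boolean {
  const status = (error as { status?: unknown } | null)?.status;
  if (typeof status !== "number") return true;
  return status === 429 || status >= 500;
}

async function withRetry<T>(call: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const { retries = 3, baseDelayMs = 500, timeoutMs = 60_000 } = options;
  for (let attempt = 0; ; attempt++) {
    try {
      return await withTimeout(call(), timeoutMs);
    } catch (error) {
      // Give up on the last attempt or on errors that a retry will not fix.
      if (attempt >= retries || !isRetryable(error)) throw error;
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

A call could then be wrapped as withRetry(() => client.agent.completions.create({ ... })) inside a try-catch block, so that only errors surviving the retries reach the caller.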