Overview

The VLM Run API is a unified platform for production-ready multimodal AI. Use it to extract structured data from documents, images, videos, and audio, or to run multi-step workflows with visual agents.

Base URL: https://api.vlm.run/v1
Authentication: send your API key in the Authorization header, as Authorization: Bearer <VLMRUN_API_KEY>
Supported models:
- Requests: vlm-1
- Agent Executions / Chat Completions: vlmrun-orion-1:auto, vlmrun-orion-1:fast, vlmrun-orion-1:pro
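
As a minimal sketch of how the base URL and bearer-token header fit together, the snippet below builds an endpoint URL and the authentication header using only plain Python. The helper names build_url and auth_headers are hypothetical and are not part of the VLM Run SDK; the endpoint path in the usage note is taken from the cURL example on this page.

```python
BASE_URL = "https://api.vlm.run/v1"


def build_url(path: str) -> str:
    """Join a relative endpoint path onto the API base URL."""
    return f"{BASE_URL}/{path.lstrip('/')}"


def auth_headers(api_key: str) -> dict[str, str]:
    """Return the bearer-token header described above (hypothetical helper)."""
    if not api_key:
        raise ValueError("API key must not be empty")
    return {"Authorization": f"Bearer {api_key}"}
```

For example, build_url("openai/chat/completions") yields the URL used for chat completions, and auth_headers(key) can be passed to any HTTP client as request headers.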

See Ways to Use VLM Run for a side-by-side comparison of Requests, Executions, Chat Completions, and the Chat UI.

Access your API keys in our dashboard.

Structured Extraction

Use the Generate endpoints to extract structured JSON from images, documents, audio, and video.

from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse

# Initialize the client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Document -> JSON
response: PredictionResponse = client.document.generate(
    file=Path("path/to/document.pdf"),
    model="vlm-1",
    domain="document.invoice",
)

import { VlmRun } from "vlmrun";

// Initialize the client
const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>",
});

// Upload a document
const file = await client.files.upload({
  filePath: "path/to/invoice.pdf",
});

// Process a document using file ID
const response = await client.document.generate({
  fileId: file.id,
  model: "vlm-1",
  domain: "document.markdown",
});
console.log(response);

Chat Completions & Agent Executions

Use the Chat Completions endpoint for interactive multi-modal conversations, or the Agent Executions endpoint for batch execution workflows.

from vlmrun.client import VLMRun

# Initialize the VLM Run client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Create a chat completion
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What do you see in this image?" },
                {"type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
            ]
        }
    ],
    max_tokens=1000,
)
print(response.choices[0].message.content)

import { VlmRun } from "vlmrun";

// Initialize the VLM Run client
const client = new VlmRun({
  apiKey: "<VLMRUN_API_KEY>"
});

// Create a chat completion
const response = await client.agent.completions.create({
  model: "vlmrun-orion-1:auto",
  messages: [
    {
      role: "user",
      content: [
        { type: "text", text: "What do you see in this image?" },
        { type: "image_url", image_url: { url: "https://example.com/image.jpg" } }
      ]
    }
  ],
  max_tokens: 1000
});
console.log(response.choices[0].message.content);

curl -X POST https://api.vlm.run/v1/openai/chat/completions \
  -H "Authorization: Bearer $VLMRUN_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "vlmrun-orion-1:auto",
    "messages": [
      {
        "role": "user",
        "content": [
          { "type": "text", "text": "What do you see in this image?" },
          { "type": "image_url", "image_url": {"url": "https://example.com/image.jpg"}}
        ]
      }
    ],
    "max_tokens": 1000
  }'
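
The cURL call above can also be reproduced without an SDK. The following is a sketch using only the Python standard library, under a few assumptions: the function names build_chat_request and send_chat_request are hypothetical, and the raw JSON reply is assumed to expose choices[0].message.content in the same shape the Node.js example reads from the SDK response.

```python
import json
import urllib.request

CHAT_URL = "https://api.vlm.run/v1/openai/chat/completions"


def build_chat_request(
    api_key: str,
    prompt: str,
    image_url: str,
    model: str = "vlmrun-orion-1:auto",
    max_tokens: int = 1000,
) -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    payload = {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "max_tokens": max_tokens,
    }
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_chat_request(request: urllib.request.Request) -> str:
    """Send the request and return the first choice's text (assumed reply shape)."""
    with urllib.request.urlopen(request) as reply:
        body = json.load(reply)
    return body["choices"][0]["message"]["content"]
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call. A typical use would be send_chat_request(build_chat_request(api_key, "What do you see in this image?", "https://example.com/image.jpg")).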
