VLM Run exposes four distinct entry points into our models, each tuned for a different kind of workload: single-shot structured extraction, fully agentic multi-step pipelines, and interactive chat. This doc walks through each method, when to reach for it, and how the four compare.

## Documentation Index

Fetch the complete documentation index at: https://docs.vlm.run/llms.txt. Use this file to discover all available pages before exploring further.
## 1. Requests

Model: `vlm-1`
Requests are the simplest way to use VLM Run: provide a single file and a domain, and get structured JSON back. They’re designed for ETL-style workloads where you have a fixed prompt (the domain) and want flexibility on the schema.
- Input: a single document, image, audio file, or video
- Output: JSON
- Execution: batch. Submit a request and poll for the prediction by ID
- Best for: single-step extraction at scale (invoices, receipts, IDs, medical forms, etc.)
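The submit-then-poll pattern above can be sketched in a few lines of Python. This is a minimal illustration, not the documented SDK surface: the endpoint shape referenced in the comments and the exact status values (`"completed"`, `"failed"`) are assumptions, so check the API reference for the real names. The polling helper itself is generic and works with any status-fetching callable.

```python
import time


def poll_until_done(fetch_status, interval_s=2.0, timeout_s=120.0):
    """Poll a batch prediction until it reaches a terminal state.

    `fetch_status` is any callable returning a dict with a "status" key,
    e.g. a GET on the prediction-by-ID endpoint. The terminal status
    names below are illustrative assumptions, not the documented values.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        result = fetch_status()
        if result.get("status") in ("completed", "failed"):
            return result
        time.sleep(interval_s)
    raise TimeoutError("prediction did not finish before timeout")


# Submitting is a single call with one file plus a domain, conceptually:
#   POST <base-url>/... with the file and {"domain": "document.invoice"}
# (hypothetical shape -- see the Requests API reference for real fields),
# after which you poll the returned prediction ID with poll_until_done.
```

The same helper applies unchanged to Executions, since both surfaces are batch: submit, then poll by ID.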
## 2. Executions

Model: `vlmrun-orion-1`
Executions are for agentic, multi-step workloads. Where a Request is one model call against one file, an Execution runs an agent that can classify, extract, redact, transform, and combine outputs across multiple files, all orchestrated through a skill.
- Input: multiple documents, images, and/or videos
- Output: JSON
- Execution: batch. Submit an execution and poll for the result by ID
- Configured via: a skill, defined primarily by a SKILL.md file, with optional reference files, schemas, and examples
- Best for: anything open-ended or multi-step (document packages, cross-file reasoning, redaction pipelines, classification-then-extraction flows)
SKILL.md gives the agent its instructions, and supporting files (schemas, examples, reference docs) ground its behavior. This is the most powerful and flexible surface we offer.
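To make this concrete, here is a hypothetical SKILL.md for a classify-then-extract-then-redact pipeline. The layout, section names, and file paths are illustrative only; nothing here is a documented schema:

```markdown
# Invoice Package Redaction (hypothetical example)

## Goal
Classify each file in the package, extract invoice fields, and redact PII.

## Steps
1. Classify every input file (invoice, receipt, ID, other).
2. For invoices, extract fields following `schemas/invoice.json`.
3. Redact names and account numbers before emitting the final JSON.

## References
- `schemas/invoice.json` — output schema for extracted invoices
- `examples/redacted_sample.json` — example of the expected output
```

The supporting files referenced from SKILL.md (schemas, examples) are what ground the agent's behavior across the package.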
## 3. Chat Completions API

Model: `vlmrun-orion-1`
The Chat Completions API is a drop-in replacement for the OpenAI Chat Completions API. Point the OpenAI SDK at our base URL and you’re using Orion with full visual-tool and artifact support, with no other code changes required.
- Input: multiple files via standard chat messages (text + image/file parts)
- Output: Text, JSON
- Execution: both streaming and non-streaming
- Logging: programmatic calls are logged in the chat completions table
- Best for: interactive or conversational multimodal use cases, and any app already built against the OpenAI SDK
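Because the endpoint is OpenAI-compatible, the request body follows the standard chat-completions shape with multimodal content parts. The sketch below builds that payload with the stdlib; the `BASE_URL` value is an assumption for illustration, so use the base URL from your dashboard:

```python
import json
import urllib.request

# Assumed base URL for illustration; check your dashboard for the real one.
BASE_URL = "https://api.vlm.run/v1"


def build_chat_request(model, text, image_url=None):
    """Build an OpenAI-compatible chat.completions payload.

    Content parts use the standard OpenAI shapes ("text" and
    "image_url"), which is what makes the SDK drop-in.
    """
    parts = [{"type": "text", "text": text}]
    if image_url:
        parts.append({"type": "image_url", "image_url": {"url": image_url}})
    return {"model": model, "messages": [{"role": "user", "content": parts}]}


def send(payload, api_key):
    """POST the payload to the chat completions endpoint."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())
```

Equivalently, the OpenAI Python SDK works unchanged: construct `OpenAI(base_url=BASE_URL, api_key=...)` and call `client.chat.completions.create(...)` with the same messages.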
## 4. Chat UI

Model: `vlmrun-orion-1`
The Chat UI (chat.vlm.run) is our hosted chat interface. It’s powered by the same Chat Completions API under the hood, giving you Orion with visual tools and artifacts in a browser, with no code required.
- Input: files and messages via the web UI
- Output: Text, JSON (artifacts rendered in the browser)
- Logging: Chat UI sessions are not logged in the chat completions table (only programmatic API calls are)
- Best for: exploration, demos, one-off tasks, and iterating on prompts before writing code
## Summary Table
| | Requests | Executions | Chat Completions | Chat UI |
|---|---|---|---|---|
| Model | vlm-1 | vlmrun-orion-1 | vlmrun-orion-1 | vlmrun-orion-1 |
| Input | Single file (doc, image, audio, or video) | Multiple files | Multiple files | Multiple files |
| Output | JSON | JSON | Text, JSON | Text, JSON |
| Mode | Batch | Batch | Streaming + non-streaming | Streaming |
| Prompt model | Fixed prompt, flexible schema | Open-ended; defined by SKILL.md + reference files | Free-form messages | Free-form messages |
| Visual tool calling | ❌ | ✅ | ✅ | ✅ |
| Artifacts | ❌ | ✅ | ✅ | ✅ |
| Visual grounding (bounding boxes) | ✅ | ❌ | ❌ | ❌ |
| Workload shape | Single-step ETL | Multi-step / agentic | Conversational / multimodal | Conversational / multimodal |
| Best for | High-volume structured extraction | Complex multi-file reasoning and actions | Apps built on the OpenAI chat completions API | Playground for exploration and demos |
| Example usage | Known document types with fixed output schemas (invoices, receipts, IDs) | Custom multi-file pipelines (classify, extract, redact across a document package) | Multimodal chatbots and OpenAI-SDK apps using Orion | Exploring a new domain or iterating on a skill quickly |