# VLM Run

> Documentation for VLM Run, the unified gateway for visual intelligence. Understand, reason over, and act on images, video, and documents with a single API.

## Docs

- [FAQ](https://docs.vlm.run/FAQ.md): Frequently Asked Questions.
- [Multi-modal Artifacts](https://docs.vlm.run/agents/artifacts.md): Retrieve generated images, videos, audio, and documents from agent responses
- [Layout Detection](https://docs.vlm.run/agents/capabilities/document/layout-understanding.md): Identify and analyze document structure with visual result previews showing highlighted extractions and field overlays
- [Multi-Page Analysis](https://docs.vlm.run/agents/capabilities/document/multi-page-analysis.md): Process and analyze documents across multiple pages with context preservation and cross-document correlation
- [Visual Grounding](https://docs.vlm.run/agents/capabilities/document/visual-grounding.md): Connect text elements with their visual locations in documents for precise content understanding
- [Caption & Tag](https://docs.vlm.run/agents/capabilities/image/captioning.md): Generate detailed captions and tags for images using advanced vision models.
- [Detection](https://docs.vlm.run/agents/capabilities/image/detection.md): Detect and locate objects, faces, or people in images with bounding boxes and confidence scores
- [Generate & Edit](https://docs.vlm.run/agents/capabilities/image/generation.md): Generate and edit images from text prompts, sketches, or existing images with creative control
- [Pointing](https://docs.vlm.run/agents/capabilities/image/pointing.md): Detect and predict key anatomical points and structural features in images with sub-pixel accuracy
- [Segmentation](https://docs.vlm.run/agents/capabilities/image/segmentation.md): Create precise pixel-level segmentation masks for objects, regions, and features in images
- [Image Tools](https://docs.vlm.run/agents/capabilities/image/tools.md): Image tools for cropping, rotating, enhancing, and transforming images
- [UI Parsing](https://docs.vlm.run/agents/capabilities/image/ui-parsing.md): Analyze and understand user interface elements in screenshots and application images
- [Caption & Tag](https://docs.vlm.run/agents/capabilities/video/captioning.md): Generate detailed captions and tags for videos using advanced vision models.
- [Generate & Edit](https://docs.vlm.run/agents/capabilities/video/generation.md): Create and edit videos with AI-powered tools for content creation and manipulation
- [Video Tools](https://docs.vlm.run/agents/capabilities/video/tools.md): Video tools for trimming, sampling, and extracting segments from videos
- [Chat with Orion](https://docs.vlm.run/agents/chat.md): Bridging computer-vision tools to AI agents through language.
- [Multi-modal Inputs](https://docs.vlm.run/agents/inputs.md): Encode images, videos, documents, and other media in a consistent format for agent execution and chat completions
- [Instructor Compatibility](https://docs.vlm.run/agents/integrations/integrations-instructor.md): Run VLM Run Agents with the Instructor Python SDK with minimal code changes.
- [OpenAI Compatibility](https://docs.vlm.run/agents/integrations/integrations-openai-compatibility.md): Run VLM Run Agents with the OpenAI Python SDK with just a two-line code change (see the sketch at the end of this page).
- [Introduction](https://docs.vlm.run/agents/introduction.md): Introducing VLM Run Orion – the first visual agent that sees, reasons, and acts.
- [Pricing](https://docs.vlm.run/agents/pricing.md): Credit-based pricing for Orion agents.
- [Structured Responses](https://docs.vlm.run/agents/structured-responses.md): Agents that reliably return JSON via chat completions – with schema validation.
- [Overview](https://docs.vlm.run/api-reference/index.md)
- [Delete File](https://docs.vlm.run/api-reference/v1/files/delete-file.md): Delete a file by ID. Only available for Pro and Enterprise users.
- [Get File by ID](https://docs.vlm.run/api-reference/v1/files/get-files-by-id.md): Get a file by ID.
- [List Files](https://docs.vlm.run/api-reference/v1/files/get-files-list.md): Get all files uploaded by the user, with pagination.
- [Upload File](https://docs.vlm.run/api-reference/v1/files/post-file-upload.md): Upload a file.
- [Get Artifact](https://docs.vlm.run/api-reference/v1/get-artifact-by-id.md): Retrieve an artifact by session ID or execution ID.
- [List models](https://docs.vlm.run/api-reference/v1/get-models.md): Get the list of supported models.
- [Health](https://docs.vlm.run/api-reference/v1/health.md): Health check endpoint.
- [List domains](https://docs.vlm.run/api-reference/v1/hub/get-domains.md): Get the list of supported domains.
- [Audio → JSON](https://docs.vlm.run/api-reference/v1/post-audio-generate.md): Generate a structured prediction for the given audio file.
- [Doc → JSON](https://docs.vlm.run/api-reference/v1/post-document-generate.md): Generate a structured prediction for the given document.
- [Image → JSON](https://docs.vlm.run/api-reference/v1/post-image-generate.md): Generate a structured prediction for the given image.
- [Get Schema](https://docs.vlm.run/api-reference/v1/post-schema.md)
- [Submit Feedback](https://docs.vlm.run/api-reference/v1/post-submit-feedback.md): Submit feedback for a request, execution, or chat by its ID.
- [Video → JSON](https://docs.vlm.run/api-reference/v1/post-video-generate.md): Generate a structured prediction for the given video file.
- [Get Prediction by ID](https://docs.vlm.run/api-reference/v1/predictions/get-predictions-by-id.md): Get prediction JSON by request ID.
- [Get Predictions](https://docs.vlm.run/api-reference/v1/predictions/get-predictions-list.md): Get all predictions made by the user, with pagination.
- [Custom Schemas](https://docs.vlm.run/capabilities/custom-schemas.md): Define custom schemas for visual extraction tasks.
- [GraphQL](https://docs.vlm.run/capabilities/graphql.md): Query a subset of schema fields for more efficient querying and document ETL.
- [Long-context Outputs](https://docs.vlm.run/capabilities/long-context-outputs.md): Support for long-context outputs in domains like audio/video transcription, exceeding 8K-token limits.
- [Structured Responses](https://docs.vlm.run/capabilities/structured-responses.md): Extract JSON from images, videos, and documents with type safety.
- [Temporal Grounding](https://docs.vlm.run/capabilities/temporal-grounding.md): Ground extracted data with start/end times for audio/video segments and speaker identification.
- [Visual Grounding](https://docs.vlm.run/capabilities/visual-grounding.md): Ground extracted data with location (bounding box) coordinates and confidence scores.
- [Changelog](https://docs.vlm.run/changelog.md): Changelog for VLM Run.
- [chat](https://docs.vlm.run/cli/chat.md): Chat with Orion to process images, videos, and documents
- [files](https://docs.vlm.run/cli/files.md): Upload, list, retrieve, and delete files
- [generate](https://docs.vlm.run/cli/generate.md): Generate structured predictions from images and documents
- [Getting Started](https://docs.vlm.run/cli/getting-started.md): Install and configure the VLM Run CLI
- [hub & models](https://docs.vlm.run/cli/hub.md): Browse domains, schemas, and available models
- [predictions](https://docs.vlm.run/cli/predictions.md): List and retrieve prediction results
- [skills](https://docs.vlm.run/cli/skills.md): Create, list, look up, update, and download skills
- [Error Codes](https://docs.vlm.run/error-codes.md): List of error codes that you may encounter when using the API
- [Transcribing Audio](https://docs.vlm.run/guides/audio-ai/guide-audio-transcription.md): Learn how to transcribe and analyze long-form audio.
- [Classifying Documents](https://docs.vlm.run/guides/doc-ai/guide-classifying-documents.md): Learn how to classify documents into categories like invoices, bank statements, and utility bills.
- [Document Redaction & Edit](https://docs.vlm.run/guides/doc-ai/guide-document-redaction.md): Automatically detect and redact or replace sensitive information in documents with enterprise-grade compliance.
- [Parsing Intake Forms](https://docs.vlm.run/guides/doc-ai/guide-healthcare-parsing-intake-forms.md): Extract structured data from healthcare documents like patient referrals, intake forms, and insurance cards.
- [Parsing Documents](https://docs.vlm.run/guides/doc-ai/guide-parsing-documents.md): Extract structured data from long documents and reports.
- [Parsing Invoices](https://docs.vlm.run/guides/doc-ai/guide-parsing-invoices.md): Extract structured data from invoices.
- [Providing Feedback](https://docs.vlm.run/guides/feedback.md): Improve model performance through feedback collection and fine-tuning.
- [Cataloging Images](https://docs.vlm.run/guides/image-ai/guide-cataloging-images.md): Learn how to generate captions, tags, and descriptions for images.
- [Classifying Images](https://docs.vlm.run/guides/image-ai/guide-classifying-images.md): Learn how to classify images into categories like animals, landscapes, and objects using AI.
- [Best Practices](https://docs.vlm.run/guides/schema/schema-best-practices.md): Best practices for designing schemas for visual inputs.
- [MarkdownPage](https://docs.vlm.run/guides/schema/schema-markdown-page.md): A visual guide to the MarkdownPage schema used for document extraction and processing.
- [Transcribing Video](https://docs.vlm.run/guides/video-ai/guide-video-transcription.md): Learn how to transcribe and analyze hours-long video content using our Video Transcription API.
- [Supported Domains](https://docs.vlm.run/hub.md): Pre-built schemas and domain definitions for common data extraction tasks.
- [MongoDB](https://docs.vlm.run/integrations/integrations-mongodb.md)
- [n8n](https://docs.vlm.run/integrations/integrations-n8n.md)
- [Voxel51 FiftyOne](https://docs.vlm.run/integrations/integrations-voxel51.md)
- [Zapier](https://docs.vlm.run/integrations/integrations-zapier.md)
- [Introduction](https://docs.vlm.run/introduction.md): Extract JSON from images, videos, and documents with a unified API.
- [Chat](https://docs.vlm.run/platform/chat.md): The interactive playground for chatting with Orion, VLM Run's visual agent
- [Completions](https://docs.vlm.run/platform/observe/completions.md): Review model completions, token usage, and response quality on the VLM Run platform
- [Evaluations](https://docs.vlm.run/platform/observe/evaluations.md): Measure and track the accuracy of your skills, agents, and request domains using feedback as ground truth.
- [Executions](https://docs.vlm.run/platform/observe/executions.md): Track agent and skill executions end to end on the VLM Run platform
- [Observe](https://docs.vlm.run/platform/observe/overview.md): Full observability for your visual AI: requests, executions, completions, and usage metrics
- [Requests](https://docs.vlm.run/platform/observe/requests.md): View, filter, and inspect every API request on the VLM Run platform
- [Platform](https://docs.vlm.run/platform/overview.md): The VLM Run platform: chat with visual agents, build skills, and observe every request in one place
- [Settings](https://docs.vlm.run/platform/settings.md): Manage API keys, team members, billing, and account preferences
- [Skills](https://docs.vlm.run/platform/skills/overview.md): Create, edit, and manage reusable visual extraction skills on the VLM Run platform
- [Pricing](https://docs.vlm.run/pricing.md): Flexible pricing plans for developers and enterprises to build with VLM Run.
- [Rate Limits](https://docs.vlm.run/rate-limits.md): Rate limits to consider when using the API.
- [client.agent](https://docs.vlm.run/sdk-reference/components/agent.md): Agent Chat Completions
- [Client Reference](https://docs.vlm.run/sdk-reference/components/client.md): Detailed guide to the VLM Run Python SDK client
- [client.files](https://docs.vlm.run/sdk-reference/components/files.md): Manage files with the VLM Run Python SDK
- [client.hub](https://docs.vlm.run/sdk-reference/components/hub.md): Hub API Reference
- [client.models](https://docs.vlm.run/sdk-reference/components/models.md): Models API Reference
- [SDK Overview](https://docs.vlm.run/sdk-reference/components/overview.md): Core concepts and components of the VLM Run Python SDK
- [client.predictions](https://docs.vlm.run/sdk-reference/components/predictions.md): Manage predictions with the VLM Run Python SDK
- [Getting Started](https://docs.vlm.run/sdk-reference/getting-started.md): How to get started with the VLM Run Python SDK
- [client.agent](https://docs.vlm.run/sdk-reference/node/components/agent.md): Learn how to use Agent Chat Completions with the VLM Run Node.js SDK
- [client](https://docs.vlm.run/sdk-reference/node/components/client.md): VLM Run Node.js SDK Client Configuration and Usage
- [client.files](https://docs.vlm.run/sdk-reference/node/components/files.md): Learn how to upload and manage files with the VLM Run Node.js SDK
- [client.hub](https://docs.vlm.run/sdk-reference/node/components/hub.md): Hub API Reference for the VLM Run Node.js SDK
- [client.models](https://docs.vlm.run/sdk-reference/node/components/models.md): Learn how to work with models in the VLM Run Node.js SDK
- [client.predictions](https://docs.vlm.run/sdk-reference/node/components/predictions.md): Manage predictions with the VLM Run Node.js SDK
- [Getting Started](https://docs.vlm.run/sdk-reference/node/getting-started.md): Learn how to install and use the VLM Run Node.js SDK
- [client.audio](https://docs.vlm.run/sdk-reference/node/predictions/audio.md): Learn how to process audio files with the VLM Run Node.js SDK
- [client.document](https://docs.vlm.run/sdk-reference/node/predictions/document.md): Learn how to process documents with the VLM Run Node.js SDK
- [client.image](https://docs.vlm.run/sdk-reference/node/predictions/image.md): Learn how to process images with the VLM Run Node.js SDK
- [client.video](https://docs.vlm.run/sdk-reference/node/predictions/video.md): Video Processing API for the VLM Run Node.js SDK
- [client.audio](https://docs.vlm.run/sdk-reference/predictions/audio.md): Audio Processing API
- [client.document](https://docs.vlm.run/sdk-reference/predictions/document.md): Document Processing API
- [client.image](https://docs.vlm.run/sdk-reference/predictions/image.md): Image Processing API
- [client.video](https://docs.vlm.run/sdk-reference/predictions/video.md): Video Processing API
- [Orion Skills](https://docs.vlm.run/skills/introduction.md): Modular, reusable capabilities for visual extraction and agent workflows
- [Create Skills](https://docs.vlm.run/skills/manage/create.md): Create skills from skill folders, prompts, or chat sessions
- [List & Lookup](https://docs.vlm.run/skills/manage/list-lookup.md): List and search for available skills
- [Update Skills](https://docs.vlm.run/skills/manage/update.md): Create new versions of existing skills
- [Quickstart](https://docs.vlm.run/skills/quickstart.md): Use a skill to extract structured data in under 2 minutes
- [Reference](https://docs.vlm.run/skills/reference.md): AgentSkill object and skill specification reference
- [Skill Structure](https://docs.vlm.run/skills/spec/overview.md): How a skill directory is organized
- [schema.json](https://docs.vlm.run/skills/spec/schema-json.md): JSON Schema for validating skill output
- [SKILL.md](https://docs.vlm.run/skills/spec/skill-md.md): Skill metadata and instructions format
- [vlmrun.yaml](https://docs.vlm.run/skills/spec/vlmrun-yaml.md): Execution configuration for agent-powered skills
- [Agent Execution](https://docs.vlm.run/skills/usage/agent.md): Use skills with the agent execution endpoint
- [Chat Completions](https://docs.vlm.run/skills/usage/chat.md): Use skills with the chat completions endpoint
- [Model Request](https://docs.vlm.run/skills/usage/generation.md): Use skills with model requests
- [Version Pinning](https://docs.vlm.run/skills/usage/version-pinning.md): Pin skill versions for reproducible results
- [Supported Files](https://docs.vlm.run/supported-files.md): File formats supported by VLM Run for document, image, video, and audio processing.
- [Ways to Use VLM Run](https://docs.vlm.run/ways-to-use-vlm-run.md): The four entry points into VLM Run (Requests, Executions, Chat Completions API, and Chat UI) and when to reach for each.
- [Webhooks](https://docs.vlm.run/webhooks.md): Receive real-time notifications when your async processing jobs complete

## OpenAPI Specs

- [openapi](https://api.vlm.run/openapi.json)
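The spec linked above is a standard OpenAPI JSON document, so a quick way to survey the REST surface before opening the per-endpoint pages is to fetch it and print its paths. A minimal, self-contained sketch that assumes only the third-party `requests` package:

```python
# Fetch the published OpenAPI spec and list the available REST endpoints.
# Uses only the URL given above; requires the third-party `requests` package.
import requests

spec = requests.get("https://api.vlm.run/openapi.json", timeout=30).json()

# Print the API title/version, then each path with its HTTP methods.
info = spec.get("info", {})
print(info.get("title"), info.get("version"))
for path, methods in sorted(spec.get("paths", {}).items()):
    print(f"{', '.join(m.upper() for m in methods)}  {path}")
```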
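## Example: OpenAI-Compatible Chat Completions

Several pages above ([OpenAI Compatibility](https://docs.vlm.run/agents/integrations/integrations-openai-compatibility.md), [Structured Responses](https://docs.vlm.run/agents/structured-responses.md), and [Multi-modal Inputs](https://docs.vlm.run/agents/inputs.md)) describe an OpenAI-compatible chat completions surface where only the client construction changes. The sketch below illustrates that shape with the standard `openai` Python SDK; the base URL (`https://api.vlm.run/v1`) and the `orion` model identifier are illustrative assumptions, so consult the linked pages for the exact values.

```python
# A minimal sketch of the OpenAI-compatible flow, assuming the standard
# `openai` Python SDK. Only the two client-construction lines differ from
# stock OpenAI usage; base URL and model name are illustrative assumptions.
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.vlm.run/v1",     # assumed VLM Run endpoint
    api_key=os.environ["VLMRUN_API_KEY"],  # your VLM Run API key
)

# Chat completions accept image URLs alongside text (see "Multi-modal Inputs").
response = client.chat.completions.create(
    model="orion",  # hypothetical model identifier; see the docs for real values
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Describe this document."},
                {"type": "image_url", "image_url": {"url": "https://example.com/invoice.png"}},
            ],
        }
    ],
)
print(response.choices[0].message.content)
```

Because the request and response shapes match the OpenAI SDK, tooling built on top of it (including Instructor, per the Instructor Compatibility page) should carry over with the same two-line change.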