The VLM Run MCP server gives any MCP-compatible AI agent the ability to see and understand visual content - a capability that’s typically missing in LLMs. No complex API integrations needed - just connect your AI agent to our hosted MCP server and instantly unlock the power to process images, documents, videos, and other visual content.

Bridging computer-vision tools to AI agents through language

VLM Run MCP instantly augments your LLM agents with advanced visual-processing capabilities, no manual integration required. With VLM Run MCP tools, your AI agent can analyze images, extract data from visually complex PDFs, and even process audio and video. The LLM agent automatically selects the right tool for each task.

Let’s take a look at a few use-cases that can be automated with VLM Run MCP tools.

Installation

1. Get your API key

Head over to the VLM Run Dashboard to get your API key ($VLMRUN_API_KEY). We’ll use this to authenticate your requests to the MCP server next.

2. Add the server to your MCP client

Add our hosted server URL below to your MCP client configuration. It works with Claude Desktop, the OpenAI API, the Gemini SDK, or any MCP-compatible platform.

https://mcp.vlm.run/${VLMRUN_API_KEY}/sse

Authentication via the API key embedded in the URL is experimental and subject to change. We'll soon announce OAuth 2.1-based authentication, as specified by the MCP spec.
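For example, a Claude Desktop configuration might look like the following sketch. It assumes the `mcp-remote` proxy package (Claude Desktop configs launch a local command rather than connecting to a URL directly), and the server name `vlm-run` is an arbitrary label; substitute your actual API key for `YOUR_API_KEY`:

```json
{
  "mcpServers": {
    "vlm-run": {
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.vlm.run/YOUR_API_KEY/sse"]
    }
  }
}
```

Clients that support remote SSE servers natively can use the URL directly instead of the proxy.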

3. Ping the MCP server to test

Paste the server URL above into your browser and you should see a ping response from the MCP server.
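If you prefer the command line, here is a quick sketch using curl, assuming `VLMRUN_API_KEY` is exported in your environment:

```shell
# Build the server URL from your API key and hit the SSE endpoint.
SERVER_URL="https://mcp.vlm.run/${VLMRUN_API_KEY}/sse"
echo "Pinging ${SERVER_URL}"
# -N disables output buffering so SSE events print as they arrive;
# --max-time bounds the request, since SSE streams otherwise stay open.
curl -N --max-time 5 "$SERVER_URL" || true
```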

4. Start building your agent with VLM Run MCP

Head over to the quickstart page and the tools page to get started with VLM Run MCP tools. Our intro MCP notebook is also a great starting point.

Current Capabilities

Take a quick look at the current catalog of visual AI tools available through the VLM Run MCP server. We're constantly adding new tools and capabilities, so this list is always evolving. Join our Discord channel to stay updated on the latest features and to request new tools.

Core Processing Tools

  • I/O Tools: Load images, files, and other objects into the system for processing by other tools.
  • Document AI Tools: Extract structured data from invoices, receipts, contracts, forms, and any document type.
  • Image AI Tools: Classify images, extract text, analyze visual content, and understand scenes.
  • Video AI Tools: Transcribe videos with scene descriptions, search content, and analyze meetings.
  • Hub: Browse 50+ pre-built domains and schemas.

How it works

VLM Run MCP Server follows the Model Context Protocol standard, acting as the bridge between your AI client and powerful visual processing capabilities.

1. Configure your MCP client

Add our hosted server https://mcp.vlm.run/${VLMRUN_API_KEY}/sse to your MCP client configuration. It works with Claude Desktop, the OpenAI API, the Gemini SDK, or any MCP-compatible platform.

2. Agent discovers available tools

Your agent automatically discovers all VLM Run tools: parse_image, parse_document, put_image_url, put_file_url, and so on.
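Tool discovery can also be done programmatically. Here is a minimal sketch using the official MCP Python SDK (the `mcp` package); the tool names above come from these docs, while the function name `list_vlm_run_tools` is our own:

```python
import asyncio

async def list_vlm_run_tools(api_key: str) -> list[str]:
    """Connect to the hosted VLM Run MCP server and list its tools."""
    # Imported lazily so the sketch reads cleanly even without the SDK installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    url = f"https://mcp.vlm.run/{api_key}/sse"
    async with sse_client(url) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()          # MCP handshake
            result = await session.list_tools() # standard tools/list request
            return [tool.name for tool in result.tools]

# To run: asyncio.run(list_vlm_run_tools("your-api-key"))
# Expect names like parse_image, parse_document, put_image_url, put_file_url.
```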

3. Natural conversation triggers tools

Simply ask your agent to process visual content. Behind the scenes, it calls the appropriate VLM Run MCP tools with your files and requirements.
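For illustration, the same tool call an agent makes behind the scenes can be made explicitly. This sketch uses the MCP Python SDK's `call_tool`; the tool name `parse_document` comes from these docs, but the argument name `url` is an assumption — check the tools page for the exact input schema:

```python
import asyncio

async def extract_invoice(api_key: str, file_url: str):
    """Ask the VLM Run MCP server to parse a document at a public URL."""
    from mcp import ClientSession          # official MCP Python SDK
    from mcp.client.sse import sse_client

    server = f"https://mcp.vlm.run/{api_key}/sse"
    async with sse_client(server) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            # An agent normally chooses this call itself; here it is explicit.
            # "url" is a hypothetical argument name for this sketch.
            return await session.call_tool("parse_document", {"url": file_url})

# To run: asyncio.run(extract_invoice("your-api-key", "https://example.com/invoice.pdf"))
```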

4. Get actionable results

Your AI receives structured data and can immediately use it - extract invoice totals for accounting, create meeting summaries from videos, or generate privacy-compliant documents for sharing.

Try our MCP server today

Head over to our MCP server to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.