You can use the VLM Run Python SDK to interact with the VLM Run API.

Installation

Basic Installation

You can install the basic Python SDK using pip:
pip install vlmrun --upgrade
Need specific features? Choose an optional dependency pack:
# For video processing
pip install "vlmrun[video]"

# For document processing
pip install "vlmrun[doc]"

# For all features
pip install "vlmrun[all]"

Set Up Authentication

Grab your API key from the VLM Run dashboard and set it as an environment variable:
# On Linux/macOS
export VLMRUN_API_KEY="your-api-key"

# On Windows (Command Prompt)
set VLMRUN_API_KEY=your-api-key

# On Windows (PowerShell)
$env:VLMRUN_API_KEY="your-api-key"
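
If the environment variable is missing, the first API call can fail in ways that are hard to read. As a minimal sketch, a small check like the one below fails early with a clear message. The helper name `require_api_key` is hypothetical and not part of the SDK; it uses only the standard library.

```python
import os


def require_api_key(env=os.environ, name="VLMRUN_API_KEY"):
    """Return the API key from `env`, or raise a clear error if it is unset.

    Hypothetical helper, not part of the vlmrun package.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set; export it before creating the client."
        )
    return key
```

Calling `require_api_key()` once at startup, before `VLMRun()`, surfaces a missing key immediately.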

Your First API Call

Let’s process an image to extract structured data:
from vlmrun.client import VLMRun

# Initialize the client
client = VLMRun()

# Process an image from a URL
response = client.image.generate(
    urls=["https://storage.googleapis.com/vlm-data-public-prod/hub/examples/document.invoice/invoice_1.jpg"],
    domain="document.invoice"
)

# Check if processing completed
if response.status == "completed":
    # Access the structured data
    invoice = response.response
    print(f"Invoice #: {invoice.invoice_number}")
    print(f"Total: ${invoice.total_amount}")
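
The snippet above prints nothing when the status is not "completed". A sketch of a more defensive summary function follows. It assumes only what the example shows (a `status` attribute and a `response` object with `invoice_number` and `total_amount`); the function name `summarize_invoice` is hypothetical, and the `SimpleNamespace` objects merely stand in for a real SDK response.

```python
from types import SimpleNamespace


def summarize_invoice(response):
    """Return a one-line summary, or a note if processing has not completed."""
    status = getattr(response, "status", None)
    if status != "completed":
        # Other status values are not documented here, so report whatever we got.
        return f"Not ready (status={status!r})"
    invoice = response.response
    # Fields may be absent on some invoices; fall back to None instead of raising.
    number = getattr(invoice, "invoice_number", None)
    total = getattr(invoice, "total_amount", None)
    return f"Invoice #: {number}, Total: {total}"


# Stand-in objects for illustration only; a real call returns the SDK's own type.
done = SimpleNamespace(
    status="completed",
    response=SimpleNamespace(invoice_number="INV-1", total_amount=12.5),
)
pending = SimpleNamespace(status="pending", response=None)

print(summarize_invoice(done))
print(summarize_invoice(pending))
```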

What’s Next?

With the client initialized, you can now:
  • Process other media types (documents, audio, video)
  • Use different domains for specialized extraction
  • Upload and manage files
  • Create custom extraction schemas
Check out the SDK Overview for key concepts or jump into the Client Reference for detailed examples.

Quick Examples

Process a Document

# Extract data from a PDF (reuses the client created in the first example)
response = client.document.generate(
    url="https://storage.googleapis.com/vlm-data-public-prod/hub/examples/document.form/form_1.pdf",
    domain="document.form"
)

Transcribe Audio

# Transcribe an audio file
response = client.audio.generate(
    url="https://storage.googleapis.com/vlm-data-public-prod/examples/audio/sample.mp3",
    domain="audio.transcription"
)

Process a Local Image

# Using a local file
from PIL import Image
image = Image.open("invoice.jpg")

response = client.image.generate(
    images=[image],
    domain="document.invoice"
)
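
When several local files need processing, it can help to collect their paths first. The following is a standard-library sketch; `find_images` and the suffix set are hypothetical choices, not SDK features.

```python
from pathlib import Path

# Assumed set of extensions; adjust to the formats you actually use.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def find_images(folder):
    """Return sorted paths of image files directly inside `folder`."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
```

Each returned path can then be opened with `Image.open(path)` and passed in the `images` list as shown above. Sending one image per call is the conservative choice, since this page does not state how multiple images in one request are handled.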