You can use the VLM Run Python SDK to interact with the VLM Run API.

Installation

Basic Installation

You can install the basic Python SDK using pip:

pip install vlmrun --upgrade

Need specific features? Choose an optional dependency pack:

# For video processing
pip install "vlmrun[video]"

# For document processing
pip install "vlmrun[doc]"

# For all features
pip install "vlmrun[all]"

Set Up Authentication

Grab your API key from the VLM Run dashboard and set it as an environment variable:

# On Linux/macOS
export VLMRUN_API_KEY="your-api-key"

# On Windows
set VLMRUN_API_KEY=your-api-key

Your First API Call

Let’s process an image to extract structured data:

from vlmrun.client import VLMRun

# Initialize the client
client = VLMRun()

# Process an image from a URL
response = client.image.generate(
    urls=["https://storage.googleapis.com/vlm-data-public-prod/hub/examples/document.invoice/invoice_1.jpg"],
    domain="document.invoice"
)

# Check if processing completed
if response.status == "completed":
    # Access the structured data
    invoice = response.response
    print(f"Invoice #: {invoice.invoice_number}")
    print(f"Total: ${invoice.total_amount}")

What’s Next?

With the client initialized, you can now:

  • Process other media types (documents, audio, video)
  • Use different domains for specialized extraction
  • Upload and manage files
  • Create custom extraction schemas

Check out the SDK Overview for key concepts or jump into the Client Reference for detailed examples.

Quick Examples

Process a Document

# Extract data from a PDF
response = client.document.generate(
    url="https://storage.googleapis.com/vlm-data-public-prod/hub/examples/document.form/form_1.pdf",
    domain="document.form"
)

Transcribe Audio

# Transcribe an audio file
response = client.audio.generate(
    url="https://storage.googleapis.com/vlm-data-public-prod/examples/audio/sample.mp3",
    domain="audio.transcription"
)

Process a Local Image

# Using a local file
from PIL import Image
image = Image.open("invoice.jpg")

response = client.image.generate(
    images=[image],
    domain="document.invoice"
)