The client.document object allows you to process documents and extract structured data.

Generate Predictions

from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse

# Initialize the client
client = VLMRun()

# Process a PDF document with a predefined schema
# Note: Since the file is passed as a file path, it will be uploaded to the VLM Run server.
response: PredictionResponse = client.document.generate(
    file="path/to/document.pdf",
    domain="document.markdown",
)
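Because a file path triggers an upload, it can be worth checking the path locally before calling `client.document.generate`. The helper below is a minimal sketch of such a check; `checked_pdf_path` is a hypothetical name and is not part of the VLM Run SDK.

```python
from pathlib import Path


def checked_pdf_path(path: str) -> str:
    """Return the resolved path as a string if it points to an existing .pdf file.

    Hypothetical pre-flight helper, not part of the VLM Run SDK.
    """
    p = Path(path).expanduser()
    if not p.is_file():
        raise FileNotFoundError(f"No such file: {p}")
    if p.suffix.lower() != ".pdf":
        raise ValueError(f"Expected a .pdf file, got: {p.name}")
    return str(p.resolve())
```

The returned string can then be passed as the `file` argument in the call shown above, so a mistyped path fails locally before any upload is attempted.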

Get Usage

from vlmrun.client.types import CreditUsage

usage: CreditUsage = response.usage
print(usage)

Document Utilities

The VLM Run SDK provides several document-processing utilities for encoding and downloading documents.

from typing import Iterator

from PIL import Image

from vlmrun.common.pdf import pdf_images

# Read a PDF file and return an iterator of PIL images
images: Iterator[Image.Image] = pdf_images("path/to/document.pdf")
for image in images:
    print(image)
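A common next step is writing each image to disk instead of printing it. The sketch below assumes the objects yielded by `pdf_images` are PIL images, as the type annotation above suggests, and relies only on their `save` method. The `save_pages` helper and its file-naming scheme are illustrative and not part of the SDK.

```python
from pathlib import Path
from typing import Iterable, List


def save_pages(images: Iterable, out_dir: str, prefix: str = "page") -> List[Path]:
    """Save each image as a numbered PNG in out_dir and return the written paths.

    Hypothetical helper: works with any object exposing save(path),
    such as PIL.Image.Image.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths: List[Path] = []
    for index, image in enumerate(images, start=1):
        path = out / f"{prefix}-{index:03d}.png"
        image.save(path)  # PIL infers the PNG format from the file extension
        paths.append(path)
    return paths
```

For example, `save_pages(pdf_images("path/to/document.pdf"), "out/pages")` would consume the iterator one image at a time, so only the image currently being written needs to be held in memory.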