The client.document object allows you to process documents and extract structured data.

Generate Predictions

from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse

# Initialize the client
client = VLMRun()

# Process a PDF document with a predefined schema
# Note: Since the file is passed as a file path, it will be uploaded to the VLM Run server.
response: PredictionResponse = client.document.generate(
    file=Path("path/to/document.pdf"),
    domain="document.markdown",
)
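
To run the same call over several files, one option is a small wrapper that walks a folder. The sketch below is a hypothetical helper, not part of the SDK: the name generate_for_folder and its parameters are assumptions made for illustration. It only assumes that the callable passed in accepts file and domain keyword arguments, as client.document.generate does in the example above.

```python
from pathlib import Path
from typing import Any, Callable, Dict


def generate_for_folder(
    folder: Path,
    generate: Callable[..., Any],
    domain: str = "document.markdown",
) -> Dict[str, Any]:
    """Call generate(file=..., domain=...) for every PDF directly inside folder.

    Hypothetical helper: `generate` is expected to behave like
    client.document.generate, but any callable with the same keywords works.
    """
    results: Dict[str, Any] = {}
    # Sort so the processing order is deterministic across runs.
    for pdf_path in sorted(folder.glob("*.pdf")):
        results[pdf_path.name] = generate(file=pdf_path, domain=domain)
    return results
```

With a client initialized as above, this could be called as generate_for_folder(Path("path/to/folder"), client.document.generate). Each file is uploaded and processed one after another, so a large folder will take correspondingly longer.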

Get Usage

from vlmrun.client.types import CreditUsage

usage: CreditUsage = response.usage
print(usage)

Document Utilities

The VLM Run SDK provides several document-processing utilities for encoding and downloading documents. For example, pdf_images reads a PDF file and returns an iterator over its pages as images.

from pathlib import Path
from typing import Iterator

from PIL import Image

from vlmrun.common.pdf import pdf_images

# Read a PDF file and return an iterator of images
images: Iterator[Image.Image] = pdf_images(Path("path/to/document.pdf"))
for image in images:
    print(image)
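
Instead of printing each page, the images can be written to disk. The following is a minimal sketch, assuming each item yielded by pdf_images is a PIL image, as the type annotation above suggests. The helper name save_pages and the page-001.png naming scheme are illustrative choices, not part of the SDK.

```python
from pathlib import Path
from typing import Any, Iterable, List


def save_pages(images: Iterable[Any], out_dir: Path, stem: str = "page") -> List[Path]:
    """Save each image into out_dir as <stem>-001.png, <stem>-002.png, and so on.

    Hypothetical helper: it only assumes each item has a save(path) method,
    which PIL.Image.Image provides.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    saved: List[Path] = []
    for number, image in enumerate(images, start=1):
        target = out_dir / f"{stem}-{number:03d}.png"
        # PIL infers the output format from the file suffix.
        image.save(target)
        saved.append(target)
    return saved
```

A call such as save_pages(pdf_images(Path("path/to/document.pdf")), Path("pages")) would then consume the iterator once and return the list of written paths.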