The client.image object allows you to process images and extract structured data.

Generate Predictions

from PIL import Image

from vlmrun.client import VLMRun
from vlmrun.client.types import PredictionResponse, GenerationConfig

# Initialize the client
client = VLMRun()

# Process an image with a predefined schema
image: Image.Image = Image.open("path/to/image.jpg")
response = client.image.generate(
    images=[image],
    domain="document.invoice",
)

# Process with custom schema
image: Image.Image = Image.open("path/to/image.jpg")
response: PredictionResponse = client.image.generate(
    images=[image],
    domain="document.invoice",
    config=GenerationConfig(
        json_schema={...}
    )
)
print(response)

Generate Predictions with a custom schema

Let’s say we want to classify images into one of three categories: tv, document, or other. You can define a custom schema as follows, and pass it to the json_schema parameter:

from typing import Literal
from pydantic import BaseModel, Field
from vlmrun.client.types import GenerationConfig


class ImagePrediction(BaseModel):
    label: Literal["tv", "document", "other"] = Field(..., title="Class label for the image.")
    caption: str = Field(..., title="Caption for the image.")

# Initialize the client
client = VLMRun()

# Load the image, and process it with the custom schema
image: Image.Image = Image.open("path/to/image.jpg")
response: PredictionResponse = client.image.generate(
    images=[image],
    domain="image.classification",
    config=GenerationConfig(
        json_schema=ImagePrediction.model_json_schema()
    )
)

Get Usage

from vlmrun.client.types import CreditUsage

usage: CreditUsage = response.usage
print(usage)

Image Utilities

The VLM Run SDK provides several image-processing utilities for encoding and downloading images.

from vlmrun.common.image import encode_image
from vlmrun.common.utils import download_image
from PIL import Image

# Convert image to base64
image = Image.open("image.jpg")
base64_str = encode_image(image, format="PNG")

# Download image from URL
image: Image.Image = download_image("https://example.com/image.jpg")