In addition to the pre-defined domains, VLM-1 also supports custom schemas (coming soon). This feature lets you define your own schema for a specific domain or use case and have VLM-1 extract structured data that conforms to it.

1. Define your Custom Schema

VLM-1 has first-class support for Pydantic, which allows you to define your schema using rich, strongly-typed Pydantic models. Here’s an example of a custom schema for classifying and captioning images at the same time:

from typing import Literal
from pydantic import BaseModel, Field

class ImagePrediction(BaseModel):
    label: Literal["tv", "document", "other"] = Field(..., title="Class label for the image.")
    caption: str = Field(..., title="Caption for the image.")
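
To see what this model describes in language-neutral terms, you can export it as a JSON Schema with Pydantic's `model_json_schema()` (this assumes Pydantic v2). The sketch below only inspects the schema locally; how VLM-1 consumes a custom schema is not specified on this page.

```python
import json
from typing import Literal

from pydantic import BaseModel, Field


class ImagePrediction(BaseModel):
    label: Literal["tv", "document", "other"] = Field(..., title="Class label for the image.")
    caption: str = Field(..., title="Caption for the image.")


# Export the model as a JSON Schema dictionary (Pydantic v2 API).
schema = ImagePrediction.model_json_schema()

# The Literal type becomes an "enum" constraint on the "label" property,
# and both fields are listed under "required".
print(json.dumps(schema, indent=2))
```

Checking the exported schema is a quick way to confirm that the allowed labels and required fields match what you intended before sending any requests.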

2. Extract Structured Data

Once you have defined your custom schema, you can use VLM-1 to extract structured data from images. The extracted data is validated against the schema you defined, so it conforms to the expected structure and types.

You can query the API via RESTful endpoints, or use the OpenAI Python SDK with our OpenAI-compatible API.
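
As a rough illustration of what a request needs to carry, the sketch below bundles a base64-encoded image with the JSON Schema exported from the model. The payload keys (`image`, `json_schema`) and the helper `build_request_payload` are hypothetical and are not the documented VLM-1 request format; consult the API reference for the actual endpoint and field names. No network call is made here.

```python
import base64
import json
from typing import Literal

from pydantic import BaseModel, Field


class ImagePrediction(BaseModel):
    label: Literal["tv", "document", "other"] = Field(..., title="Class label for the image.")
    caption: str = Field(..., title="Caption for the image.")


def build_request_payload(image_bytes: bytes) -> dict:
    """Bundle an image and the custom schema into a JSON-serializable dict.

    NOTE: the keys "image" and "json_schema" are placeholders chosen for
    illustration only; they are assumptions, not the real VLM-1 API fields.
    """
    return {
        # Binary image data must be text-encoded to travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        # The schema is exported from the Pydantic model (Pydantic v2 API).
        "json_schema": ImagePrediction.model_json_schema(),
    }


# Placeholder bytes stand in for the contents of a real image file.
payload = build_request_payload(b"\x89PNG placeholder bytes")
body = json.dumps(payload)  # This string would be the HTTP request body.
```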

Once you have the JSON response, you can print it to inspect the extracted fields before validating it with Pydantic:

import json

# response_dict is the JSON response from the API, parsed into a Python dict
print(json.dumps(response_dict, indent=2))

You should see the following output:

{
  "label": "tv",
  "caption": "A TV screen showing a news broadcast with a news anchor and chyron."
}

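Since the schema is defined with Pydantic, you can call ImagePrediction.model_validate(response_dict) to validate the extracted JSON response and initialize a Pydantic model instance:

from pydantic import ValidationError

try:
    image_prediction = ImagePrediction.model_validate(response_dict)
except ValidationError as e:
    # Validation failed: report which fields did not match the schema
    print(e)
else:
    # Runs only when validation succeeds, so image_prediction is defined here.
    # repr() includes the class name; str() would print only the fields.
    print(repr(image_prediction))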

You should see the following output:

ImagePrediction(label='tv', caption='A TV screen showing a news broadcast with a news anchor and chyron.')
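
It can also help to see what happens when a response does not conform. The sketch below validates a made-up response whose label falls outside the allowed set (the `bad_response` dict is invented for illustration and is not real VLM-1 output) and inspects the structured error details that Pydantic v2 reports.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class ImagePrediction(BaseModel):
    label: Literal["tv", "document", "other"] = Field(..., title="Class label for the image.")
    caption: str = Field(..., title="Caption for the image.")


# Hypothetical non-conforming response: "cat" is not one of the allowed labels.
bad_response = {"label": "cat", "caption": "A cat sitting on a sofa."}

errors = []
try:
    ImagePrediction.model_validate(bad_response)
except ValidationError as e:
    # errors() returns one dict per failed field, including its location and error type.
    errors = e.errors()
    for err in errors:
        print(err["loc"], err["type"], err["msg"])
```

Inspecting `e.errors()` rather than printing the exception lets you handle each failing field programmatically, for example to log it or retry the request.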

Want us to support a new schema?

If you’d like to see a specific domain or use case supported, reach out to us on Discord or by email.

Get Started with our Image -> JSON API

Head over to our Image -> JSON API to start building your own image processing pipeline with VLM-1. Sign up for access to our API here.