With our new OpenAI-compatible API, you can use the popular Instructor library to interact with VLM Run. This lets developers switch between the OpenAI, Instructor, and VLM Run APIs with minimal code changes.

Configure the OpenAI-compatible Instructor Client with VLM Run

Since Instructor is built on top of the OpenAI client, you can use the same configuration described on the OpenAI Compatibility page.

Here’s an example of how to configure the Instructor client to work with VLM Run:

import instructor
import openai

# Point the standard OpenAI client at VLM Run's
# OpenAI-compatible endpoint
client = openai.OpenAI(
    base_url="https://api.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Wrap the client with Instructor; MD_JSON mode instructs the model to
# return JSON inside a markdown code block and parses it from the response
inst_client = instructor.from_openai(
    client, mode=instructor.Mode.MD_JSON
)
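
In practice you will likely want to avoid hard-coding the key. A minimal variant that reads it from an environment variable instead (the VLMRUN_API_KEY variable name here is a convention of this example, not something required by the API):

import os

# Same configuration, with the key read from the environment;
# assumes VLMRUN_API_KEY has been exported in your shell
client = openai.OpenAI(
    base_url="https://api.vlm.run/v1/openai",
    api_key=os.environ["VLMRUN_API_KEY"],
)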

Usage: Chat Completion with Instructor and VLM Run

Now that you have configured the Instructor client, you can use the inst_client.chat.completions.create method to interact with VLM Run.

Below is an example of how to use the Instructor client to create a chat completion:

import logging

from pydantic import BaseModel

# Logger used at the end of the example to inspect the response
logger = logging.getLogger(__name__)


# Define a simplified Pydantic model for the invoice. Its fields must
# match the schema of the domain specified in `extra_body` below; see the
# hub schemas for the full schema definitions.
class Invoice(BaseModel):
    invoice_id: str
    total: float
    currency: str

# Now we can use the Instructor client to create a chat completion;
# `response_model=Invoice` tells Instructor to parse and validate the
# model's output into an Invoice instance
response = inst_client.chat.completions.create(
    model="vlm-1",
    max_tokens=4096,
    response_model=Invoice,
    max_retries=0,  # don't re-ask the model if validation fails
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Please extract the invoice details from the provided image in JSON format."},
                {"type": "image_url", "image_url": {"url": encode_image(image), "detail": "auto"}},
            ],
        }
    ],
    extra_body={
        # VLM Run-specific parameters: `domain` selects the pre-defined
        # invoice schema from the hub
        "vlmrun": {
            "domain": "document.invoice",
            "metadata": {"allow_training": False},
        }
    },
)
logger.debug(f"type={type(response)}, response={response}")
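
Since response_model=Invoice was set, Instructor validates the model's JSON output and returns it as an Invoice instance rather than a raw chat completion, so the extracted fields are directly accessible:

# `response` is a validated Invoice instance, not a raw completion object
assert isinstance(response, Invoice)
print(response.invoice_id, response.total, response.currency)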

Usage: Guided Chat Completion with Instructor and VLM Run

The example above is an unconstrained chat completion request: the JSON schema is given to the VLM only as an instruction, so the output is not guaranteed to conform to it. To enforce the schema during decoding, pass the json_schema parameter in the vlmrun extra body, which guides the VLM Run model to emit structured output that matches the schema.

# Adapt the example above, adding a "json_schema" parameter to enforce
# guided decoding of JSON that conforms to the schema.
response = inst_client.chat.completions.create(
    ...
    response_model=Invoice,
    extra_body={
        "vlmrun": {
            "domain": "document.invoice",
            "json_schema": Invoice.model_json_schema(),
        }
    },
)

This ensures that the VLM Run model extracts the invoice details in a response that conforms to the specified JSON schema.
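
Guided decoding pairs well with Instructor's validation: since the response is parsed into the Invoice model, you can also raise max_retries above the 0 used earlier so Instructor automatically re-asks the model when validation fails. A minimal sketch, eliding the arguments shown above:

# Re-prompt the model up to 3 times if its output fails
# Invoice's Pydantic validation
response = inst_client.chat.completions.create(
    ...
    response_model=Invoice,
    max_retries=3,
)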