With our OpenAI-compatible API, you can use the popular Instructor library to interact with VLM Run, allowing developers to switch between the OpenAI, Instructor, and VLM Run APIs with minimal code changes.

Configure the OpenAI-compatible Instructor Client with VLM Run

Since Instructor is compatible with the OpenAI API, you can use the same configuration methods as described in the OpenAI Compatibility page. Here’s an example of how to configure the Instructor client to work with VLM Run:
import instructor
import openai

# Configure the OpenAI client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Configure the Instructor client
inst_client = instructor.from_openai(
    client, mode=instructor.Mode.MD_JSON
)

Usage: Chat Completion with Instructor and VLM Run Agents

Now that you have configured the Instructor client, you can use the inst_client.chat.completions.create method to interact with VLM Run. Below is an example of how to use the Instructor client to create a chat completion:
from pydantic import BaseModel, Field

# Define a simplified Pydantic model for the outputs. Suppose we want to
# generate an image of Big Ben at a distance, then crop in on the clock face.
class AgentResponse(BaseModel):
    image_url: str = Field(description="The pre-signed URL of the image generated.")
    clock_image_url: str = Field(description="The pre-signed URL of the image of the clock face cropped from the generated image.")
    crop_xywh: tuple[float, float, float, float] = Field(description="The (x, y, width, height) of the clock face cropped from the generated image.")

# Now we can use the Instructor client to create a chat completion
response = inst_client.chat.completions.create(
    model="vlm-agent-1",
    max_retries=0,
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Generate an image of Big Ben at a distance. Crop in on the clock face and provide a close-up of it."},
            ],
        }
    ],
    response_model=AgentResponse,
)
print(f"type={type(response)}, response={response}")
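Because Instructor returns a validated Pydantic object rather than raw JSON, you can work with the response fields directly. The sketch below shows this locally: the payload dict is a hypothetical stand-in for a live agent response, not real API output, and the same AgentResponse model validates and coerces it.

```python
from pydantic import BaseModel, Field


class AgentResponse(BaseModel):
    image_url: str = Field(description="The pre-signed URL of the image generated.")
    clock_image_url: str = Field(description="The pre-signed URL of the image of the clock face cropped from the generated image.")
    crop_xywh: tuple[float, float, float, float] = Field(description="The (x, y, width, height) of the clock face cropped from the generated image.")


# Hypothetical payload standing in for a live agent response.
payload = {
    "image_url": "https://example.com/big-ben.png",
    "clock_image_url": "https://example.com/clock-face.png",
    "crop_xywh": [0.42, 0.18, 0.12, 0.12],
}

# Pydantic validates the shape and coerces the JSON array into a tuple of floats.
response = AgentResponse.model_validate(payload)
print(response.crop_xywh)
```

Note that the JSON array for crop_xywh is coerced into a typed tuple, so downstream code can rely on the declared types rather than re-checking the raw JSON.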

Usage: Guided Chat Completion with Instructor and VLM Run Agents

The example above is an unconstrained chat completion request: it relies solely on the JSON schema provided to the VLM Run agent as an instruction. To enforce the JSON schema, you can pass a json_schema via the response_format parameter, which guides the model's decoding so that its output conforms to the schema.
# Adapt the above example with an additional "json_schema" parameter
# to enforce guided decoding of the JSON in the schema.
response = inst_client.chat.completions.create(
    ... # same as above
    response_model=AgentResponse,
    response_format={"type": "json_schema", "json_schema": AgentResponse.model_json_schema()},
)
This ensures that the VLM Run model returns the generated image and the clock-face crop in the specified JSON format.
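The json_schema passed in response_format is simply the model's standard JSON Schema, generated by Pydantic's model_json_schema(). As a local sketch (no API call involved), you can inspect exactly what gets sent alongside the request:

```python
from pydantic import BaseModel, Field


class AgentResponse(BaseModel):
    image_url: str = Field(description="The pre-signed URL of the image generated.")
    clock_image_url: str = Field(description="The pre-signed URL of the image of the clock face cropped from the generated image.")
    crop_xywh: tuple[float, float, float, float] = Field(description="The (x, y, width, height) of the clock face cropped from the generated image.")


# Build the response_format payload exactly as in the request above.
response_format = {
    "type": "json_schema",
    "json_schema": AgentResponse.model_json_schema(),
}

schema = response_format["json_schema"]
# All three fields have no defaults, so all are required; the tuple field
# serializes as a fixed-length JSON array.
print(schema["required"])
```

Inspecting the schema this way is a quick sanity check that field names, descriptions, and required constraints match what you expect the model to be guided by.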