Foundation vision models support chat over visual inputs, but automation needs reliable, machine-validated output. Agent chat completions let you define the expected structure up front and get consistently formatted JSON back: loosely with json_object, or strictly via json_schema.
Using the OpenAI SDK? See OpenAI Compatibility.

Extract Structured JSON with VLM Run’s Orion Agents

Here’s an example of using the agent chat completions endpoint to extract typed JSON directly from user prompts and files.
from vlmrun.client import VLMRun

# Initialize the VLMRun client
client = VLMRun(api_key="<VLMRUN_API_KEY>")

# Ask the agent for structured output using a loose JSON object
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract invoice number, dates, totals, and vendor in JSON."},
                {"type": "image_url", "image_url": {"url": "https://example.com/invoice.jpg"}}
            ]
        }
    ],
    response_format={"type": "json_object"},
)
print(response.choices[0].message.content)
>>> '{"invoice_number":"INV-2024-001","invoice_date":"2024-09-15","total_amount":1250.00,"vendor_name":"Acme Corporation"}'

JSON Response

{
  "invoice_number": "INV-2024-001",
  "invoice_date": "2024-09-15T00:00:00",
  "total_amount": 1250.00,
  "vendor_name": "Acme Corporation"
}
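Because json_object mode guarantees syntactically valid JSON, the returned string can be parsed and validated directly with the standard library before handing it to downstream automation. A minimal sketch using the sample response above (the variable names here are illustrative, not part of the SDK):

```python
import json

# Raw string as returned in response.choices[0].message.content (from the example above)
raw = '{"invoice_number":"INV-2024-001","invoice_date":"2024-09-15","total_amount":1250.00,"vendor_name":"Acme Corporation"}'

# json_object mode guarantees this parse succeeds
invoice = json.loads(raw)

# Lightweight downstream checks before passing the record to automation
assert invoice["invoice_number"].startswith("INV-")
total = float(invoice["total_amount"])
print(invoice["vendor_name"], total)  # Acme Corporation 1250.0
```

For stronger guarantees, prefer json_schema so the structure is enforced by the agent rather than checked after the fact.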

Response Format Types

Type          Description
json_object   Valid JSON object without a specific schema
json_schema   Strict JSON conforming to a provided schema
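With json_schema, the request carries a JSON Schema describing the expected output. A sketch of how such a payload might be constructed, assuming the endpoint accepts an OpenAI-style json_schema response_format (the wrapper field names "name", "schema", and "strict" are assumptions here, not confirmed SDK fields):

```python
# Standard JSON Schema for the invoice fields extracted in the example above
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "invoice_date": {"type": "string"},
        "total_amount": {"type": "number"},
        "vendor_name": {"type": "string"},
    },
    "required": ["invoice_number", "invoice_date", "total_amount", "vendor_name"],
    "additionalProperties": False,
}

# Hypothetical strict-mode payload wrapping the schema; pass this as the
# response_format argument to client.agent.completions.create(...) in place
# of {"type": "json_object"} from the earlier example.
response_format = {
    "type": "json_schema",
    "json_schema": {"name": "invoice", "schema": invoice_schema, "strict": True},
}
```

Marking fields as required and disallowing additional properties keeps the agent's output limited to exactly the keys your automation expects.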