Foundation vision models support chat over visual inputs, but automation needs reliable, machine-validated output. Agent chat completions let you define the expected structure up front and get consistently formatted JSON back – either loosely with json_object or strictly via json_schema.

Extract Structured JSON with VLM Run’s Orion Agents

Here’s an example of using the agent chat completions endpoint to extract typed JSON directly from user prompts and files.
from openai import OpenAI

# Initialize the OpenAI client with the custom base URL
client = OpenAI(
    api_key="<VLMRUN_API_KEY>",
    base_url="https://agent.vlm.run/v1/openai"
)

# Ask the agent for structured output using a loose JSON object
response = client.chat.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Extract invoice number, dates, totals, and vendor in JSON."},
                {"type": "image_url", "image_url": {"url": "https://example.com/invoice.jpg"}}
            ]
        }
    ],
    response_format={"type": "json_object"},
)

JSON Response

{
  "id": "chatcmpl_abc123xyz",
  "object": "chat.completion",
  "model": "vlmrun-orion-1:auto",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "{\"invoice_number\":\"INV-2024-001\",\"date\":\"2024-09-15\",\"total_amount\":1250.00,\"vendor_name\":\"Acme Corporation\"}"
      },
      "finish_reason": "stop"
    }
  ]
}
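Note that the structured data arrives as a JSON-encoded string in the message content, not as a nested object, so it has to be decoded before use. A minimal sketch of that step, using the sample response above as a plain dictionary (the helper name `extract_json` is illustrative and not part of any SDK):

```python
import json

# Sample response body, copied from the example above. With the OpenAI SDK the
# same string is available as response.choices[0].message.content.
sample_response = {
    "id": "chatcmpl_abc123xyz",
    "object": "chat.completion",
    "model": "vlmrun-orion-1:auto",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "{\"invoice_number\":\"INV-2024-001\",\"date\":\"2024-09-15\",\"total_amount\":1250.00,\"vendor_name\":\"Acme Corporation\"}",
            },
            "finish_reason": "stop",
        }
    ],
}


def extract_json(response: dict) -> dict:
    """Decode the JSON string carried in the first choice's message content."""
    content = response["choices"][0]["message"]["content"]
    data = json.loads(content)  # raises json.JSONDecodeError on malformed output
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object at the top level")
    return data


invoice = extract_json(sample_response)
print(invoice["vendor_name"], invoice["total_amount"])
```

With json_object the key names are chosen by the model, so downstream code should not assume a fixed set of fields unless the prompt or a schema pins them down.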

Response Format Types

As with OpenAI’s chat completions API, you set response_format to either json_object or json_schema to get structured JSON in the agent’s response.
Type          Description
json_object   Valid JSON object without a specific schema
json_schema   Strict JSON conforming to the provided schema
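To make the json_schema option concrete, here is a hedged sketch of a request payload. It assumes the endpoint accepts the OpenAI-style wrapper (`name`, `schema`, `strict` nested under a `json_schema` key); the exact shape VLM Run expects may differ, so check its API reference before relying on this. The schema itself and the `build_request` helper are hypothetical, with field names mirroring the example response.

```python
# Hypothetical invoice schema; the field names mirror the example response
# and are illustrative, not a schema published by VLM Run.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "date": {"type": "string"},
        "total_amount": {"type": "number"},
        "vendor_name": {"type": "string"},
    },
    "required": ["invoice_number", "date", "total_amount", "vendor_name"],
    "additionalProperties": False,
}

# Assumption: OpenAI-style json_schema wrapper (name / schema / strict).
response_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "invoice",
        "schema": invoice_schema,
        "strict": True,
    },
}


def build_request(image_url: str) -> dict:
    """Return keyword arguments for client.chat.completions.create(...)."""
    return {
        "model": "vlmrun-orion-1:auto",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Extract the invoice fields."},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "response_format": response_format,
    }


request = build_request("https://example.com/invoice.jpg")
# With a configured client, the call would be:
# response = client.chat.completions.create(**request)
```

Building the payload separately from the network call keeps the schema easy to unit-test and reuse across requests.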

Best Practices

  • Clear system prompts: Define role and output format in the system message
  • Prefer schemas for automation: Use json_schema for guaranteed structure
  • Control randomness: Use a lower temperature (0.0–0.3) for more deterministic outputs
  • Validate responses: Parse/validate JSON and handle errors gracefully
  • Keep history concise: Shorter message histories improve latency and reliability
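
The "validate responses" practice above can be sketched with the standard library alone. The expected fields and types below are assumptions based on the invoice example, and `validate_content` is an illustrative helper rather than part of any SDK; a schema library could replace the manual checks.

```python
import json

# Hypothetical expectations for the invoice example: field name -> accepted types.
EXPECTED_FIELDS = {
    "invoice_number": (str,),
    "date": (str,),
    "total_amount": (int, float),
    "vendor_name": (str,),
}


def validate_content(content):
    """Parse a message content string and return (data, errors).

    data is None when the content is not a JSON object; errors is a list of
    human-readable problems and is empty when everything checks out.
    """
    try:
        data = json.loads(content)
    except json.JSONDecodeError as exc:
        return None, [f"invalid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return None, ["top-level value is not a JSON object"]

    errors = []
    for field, types in EXPECTED_FIELDS.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], types):
            errors.append(f"wrong type for field: {field}")
    return data, errors


good, good_errors = validate_content(
    '{"invoice_number": "INV-2024-001", "date": "2024-09-15", '
    '"total_amount": 1250.0, "vendor_name": "Acme Corporation"}'
)
bad, bad_errors = validate_content("not json at all")
partial, partial_errors = validate_content('{"invoice_number": 42}')
```

Returning a list of errors instead of raising lets the caller decide whether to retry the request, fall back to a default, or route the document to manual review.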