Extract JSON from images, videos, and documents with type safety.
`vlm-1` is purpose-built for what is popularly known as JSON mode. This mode is particularly useful for developers who want to build automation workflows, data pipelines, or other software systems that require structured data as output.
For example, here is the kind of structured data `vlm-1` can extract from an invoice:
Parsing an invoice with `vlm-1`
`vlm-1` can extract a wide range of detailed information from the invoice, including vendor and customer details, line items, payment terms, and more. This structured data can be integrated into financial systems and accounting software, or used for automated invoice processing.
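To make the integration step concrete, here is a minimal sketch of loading invoice-style JSON into typed Python objects before handing it to a downstream system. The field names (`vendor_name`, `customer_name`, `payment_terms`, `line_items`) and the sample payload are assumptions made up for illustration; they are not the actual `vlm-1` invoice schema, and the call that would produce the JSON is not shown.

```python
import json
from dataclasses import dataclass
from typing import List


@dataclass
class LineItem:
    description: str
    quantity: int
    unit_price: float


@dataclass
class Invoice:
    vendor_name: str
    customer_name: str
    payment_terms: str
    line_items: List[LineItem]

    @property
    def total(self) -> float:
        # Sum of quantity * unit price across all line items.
        return sum(item.quantity * item.unit_price for item in self.line_items)


def parse_invoice(raw_json: str) -> Invoice:
    """Build an Invoice from a JSON string.

    Raises KeyError if a top-level field is missing and TypeError if a
    line item has missing or unexpected keys. Value types are not checked.
    """
    data = json.loads(raw_json)
    items = [LineItem(**item) for item in data["line_items"]]
    return Invoice(
        vendor_name=data["vendor_name"],
        customer_name=data["customer_name"],
        payment_terms=data["payment_terms"],
        line_items=items,
    )


# Hypothetical payload, invented for this example.
SAMPLE_RESPONSE = """
{
  "vendor_name": "Acme Supplies",
  "customer_name": "Example Corp",
  "payment_terms": "Net 30",
  "line_items": [
    {"description": "Widget", "quantity": 2, "unit_price": 10.0},
    {"description": "Gadget", "quantity": 1, "unit_price": 5.5}
  ]
}
"""

invoice = parse_invoice(SAMPLE_RESPONSE)
print(invoice.vendor_name, invoice.total)
```

Parsing into dataclasses at the boundary means the rest of a pipeline can rely on attribute access instead of unchecked dictionary lookups; a validation library could replace the hand-written parser if stricter type checks are needed.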
`vlm-1` also supports custom schemas, which let you define a schema for a specific domain or use case. This gives you the flexibility to extract structured data that conforms to your own requirements while still using the vision-based reasoning capabilities of `vlm-1` (see the Capabilities section). The next section, Custom Schemas, covers this in detail.
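As a rough illustration of what defining a custom schema can look like, the sketch below declares a domain-specific record as a Python dataclass, derives a flat JSON Schema dictionary from it, and checks that a payload conforms. The `ProductLabel` type and the helpers `to_json_schema` and `conforms` are hypothetical names introduced here; how a schema is actually registered with `vlm-1` is covered in the Custom Schemas section and is not assumed in this example.

```python
from dataclasses import dataclass, fields

# Map primitive Python annotations to JSON Schema type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


@dataclass
class ProductLabel:
    """Hypothetical custom schema for a product-label use case."""
    product_name: str
    net_weight_grams: float
    is_organic: bool


def to_json_schema(cls) -> dict:
    """Build a flat JSON Schema object from a dataclass with primitive fields."""
    return {
        "title": cls.__name__,
        "type": "object",
        "properties": {f.name: {"type": _JSON_TYPES[f.type]} for f in fields(cls)},
        "required": [f.name for f in fields(cls)],
    }


def conforms(payload: dict, cls) -> bool:
    """Return True if payload has exactly the declared fields with matching types."""
    expected = {f.name: f.type for f in fields(cls)}
    if set(payload) != set(expected):
        return False
    for name, tp in expected.items():
        value = payload[name]
        if tp is float:
            # JSON numbers without a fraction decode as int; accept them, but not bool.
            ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        elif tp is int:
            ok = isinstance(value, int) and not isinstance(value, bool)
        else:
            ok = isinstance(value, tp)
        if not ok:
            return False
    return True


schema = to_json_schema(ProductLabel)
print(schema)
print(conforms({"product_name": "Oats", "net_weight_grams": 500, "is_organic": True}, ProductLabel))
```

Keeping the schema next to the type that consumes it means the two cannot drift apart: the same dataclass describes what to extract and what the application code expects to receive.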