GraphQL
Query a subset of schema fields to improve efficiency for querying and document ETL.
One of the most powerful features of Vision Language Models (VLMs) is their ability to reason about complex queries and answer with only the relevant data. Unlike traditional OCR-based methods, where every single character is extracted and processed, VLMs can be asked directly for specific pieces of information. This is the basis of our GraphQL-based query mechanism.
Why GraphQL?
If you are not familiar with GraphQL, it is a query language for APIs that lets you specify exactly which fields you want to extract from a given schema. Instead of always receiving the full JSON response and post-processing it to pull out the fields you need, you request only the data points relevant to your application. Querying LLMs today has a direct analog: you can specify the exact fields you want returned in a structured JSON response.
Querying VLMs with GraphQL
We take this one step further and enable the same query mechanism for VLMs. VLM Run’s GraphQL capability lets you extract only the specific fields you need from complex schemas, improving efficiency for querying and document ETL processes. Because you control precisely what data is extracted, the server avoids the overhead of extracting unnecessary details and less data is transferred over the network. The result is a more efficient and scalable way to extract data (i.e. ETL) from complex unstructured documents.
Benefits of GraphQL
- Improved Performance: Extract only the data fields you need (unlike OCR-based methods), reducing server-side computational overhead
- Reduced Bandwidth: Minimize network traffic by receiving smaller, targeted responses
- Flexible Data Selection: Dynamically adjust which fields to extract based on your needs
- Hierarchical Queries: Select nested fields with intuitive syntax
A Concrete End-to-End Example
Let’s say you have an invoice PDF document that contains a table of data with an extensive list of fields (such as line items, tax, total, etc.). You can see the official schema we use for invoices here.
However, for your use case, you only need to extract the most important fields such as:
- Invoice Number: The number of the invoice
- Issue Date: The date of the invoice
- Due Date: The due date of the invoice
- Total Amount: The total amount of the invoice
Since we have already defined the schema for invoices, you can simply use it as a reference and select the fields you specifically need in the following GraphQL query:
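As a minimal sketch, the selection for these four fields could be written as a GraphQL selection set held in a Python string. The field names below (in particular total_amount) are assumptions based on the list above, so check them against the published invoice schema. Whether the API expects this bare selection set or a named query wrapper is also not shown in this section. The selected_fields helper is hypothetical and only there to make the selection easy to inspect.

```python
# Sketch of a GraphQL selection for the four invoice fields listed above.
# Field names are assumptions; verify them against the invoice schema.
GQL_STMT = """
{
  invoice_number
  issue_date
  due_date
  total_amount
}
"""


def selected_fields(gql: str) -> list[str]:
    """Return the field names of a flat (non-nested) selection set."""
    # Strip the outer braces, then split the remaining text on whitespace.
    body = gql.strip().removeprefix("{").removesuffix("}")
    return body.split()


print(selected_fields(GQL_STMT))
```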
Now that you have defined the GraphQL query, you can provide it via the gql_stmt parameter to the GenerationConfig object in the generate method.
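The sketch below shows only the shape of that step. The SDK import path, the client object, and the other arguments of generate are not shown in this section, so the code uses a local stand-in class (StandInGenerationConfig, a hypothetical name) that models nothing but the gql_stmt field. In real code you would construct the SDK's own GenerationConfig instead and pass it to generate. The field names in the query are assumptions as well.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StandInGenerationConfig:
    """Hypothetical stand-in for the SDK's GenerationConfig.

    Only the gql_stmt field named in the text is modeled here.
    """

    gql_stmt: Optional[str] = None


# Assumed field names; verify them against the invoice schema.
GQL_STMT = "{ invoice_number issue_date due_date total_amount }"

# Attach the query to the config; the config is what the generate
# method receives alongside the document itself.
config = StandInGenerationConfig(gql_stmt=GQL_STMT)
print(config.gql_stmt)
```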
Extracting Nested Fields with GraphQL
GraphQL’s hierarchical query structure enables precise extraction of deeply nested fields from complex document schemas. For instance, consider a scenario where you require not only top-level invoice metadata (such as invoice_number, issue_date, and due_date) but also a specific nested attribute like the postal_code within the customer_billing_address object. This can be accomplished with a single, declarative GraphQL query:
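One way to sketch such a query is to describe the selection as a nested Python dict and render it as a GraphQL selection set. The to_gql helper is hypothetical and written for this example. The field names are the ones mentioned above, and whether the API expects this bare selection set or a named query wrapper is an assumption to verify.

```python
def to_gql(selection: dict, indent: int = 1) -> str:
    """Render a nested dict of field names as a GraphQL selection set.

    A value of None marks a leaf field; a dict value marks a nested object.
    """
    pad = "  " * indent
    lines = []
    for name, sub in selection.items():
        if sub is None:
            lines.append(f"{pad}{name}")
        else:
            # Recurse one level deeper for nested objects.
            lines.append(f"{pad}{name} {to_gql(sub, indent + 1)}")
    closing = "  " * (indent - 1) + "}"
    return "{\n" + "\n".join(lines) + "\n" + closing


selection = {
    "invoice_number": None,
    "issue_date": None,
    "due_date": None,
    "customer_billing_address": {"postal_code": None},
}

gql_stmt = to_gql(selection)
print(gql_stmt)
# {
#   invoice_number
#   issue_date
#   due_date
#   customer_billing_address {
#     postal_code
#   }
# }
```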
This approach leverages GraphQL’s ability to traverse and select arbitrary subfields within a schema, ensuring that only the minimal, application-relevant data is extracted from the model’s output. The result is a significant reduction in both server-side post-processing and network payload size, which is especially impactful when dealing with high-throughput ETL pipelines or latency-sensitive applications.
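To make the payload reduction concrete, the following sketch applies a nested selection to a full invoice-like dict on the client side and compares the serialized sizes. It illustrates the semantics of field selection only; it is not how the server implements it. The project helper is hypothetical and the sample values are made up.

```python
import json


def project(data: dict, selection: dict) -> dict:
    """Keep only the keys named in selection, recursing into nested dicts."""
    out = {}
    for name, sub in selection.items():
        if name not in data:
            continue
        value = data[name]
        if isinstance(sub, dict) and isinstance(value, dict):
            out[name] = project(value, sub)
        else:
            out[name] = value
    return out


# Made-up sample standing in for a full extraction result.
full = {
    "invoice_number": "INV-001",
    "issue_date": "2024-01-15",
    "due_date": "2024-02-14",
    "customer_billing_address": {
        "street": "1 Example St",
        "city": "Springfield",
        "postal_code": "00000",
    },
    "line_items": [{"description": "Widget", "quantity": 2, "unit_price": 9.5}],
}

selection = {
    "invoice_number": None,
    "issue_date": None,
    "due_date": None,
    "customer_billing_address": {"postal_code": None},
}

small = project(full, selection)
print(small)
print(len(json.dumps(small)), "bytes selected vs", len(json.dumps(full)), "bytes full")
```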
By architecting your data extraction workflows around GQL queries, you can enforce strict data contracts, optimize resource utilization, and build robust, scalable systems on top of VLM Run’s document intelligence capabilities.
Try our Document -> JSON API today
Head over to our Document -> JSON page to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.