vlm-1 can intelligently classify documents based on their content, layout, and visual characteristics. This enables robust classification of documents like invoices, bank statements, utility bills, and other document types, even when they come in different formats or layouts.
For example, the diagram below shows how a document is classified into one of several types, and how each type can have its own custom post-processing logic.
Classifying Financial Documents
Let’s look at a financial document classification example to see how vlm-1 can be used to automatically categorize different types of documents. In this example, we’ll use vlm-1 to classify documents into categories like invoices, bank statements, utility bills, and other financial documents. This classification can then be used to route documents to the appropriate processing pipeline or storage system.

Example of different types of financial documents that need classification.
Define a custom schema for document classification
In the sections below, we’ll showcase how to use the API for document classification. vlm-1 can automatically classify documents based on their content and visual characteristics, providing both a classification and a rationale for its decision. First, let’s create a custom schema that will be used to classify the documents.
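As a minimal sketch of what such a schema could look like, the block below uses only the Python standard library. The three field names (rationale, document_type, confidence) are taken from the sample output described on this page; the category values, the "med" and "lo" confidence levels, and the helper classification_json_schema are assumptions made for illustration, not part of the documented API.

```python
from dataclasses import dataclass
from enum import Enum


class DocumentType(str, Enum):
    """Categories named in this guide; the exact string values are assumed."""
    INVOICE = "invoice"
    BANK_STATEMENT = "bank_statement"
    UTILITY_BILL = "utility_bill"
    OTHER = "other"


class Confidence(str, Enum):
    """Qualitative confidence; only "hi" appears in this guide."""
    HI = "hi"
    MED = "med"  # assumed level
    LO = "lo"    # assumed level


@dataclass(frozen=True)
class DocumentClassification:
    rationale: str               # why the document was assigned this type
    document_type: DocumentType  # the predicted category
    confidence: Confidence       # how certain the classification is


def classification_json_schema() -> dict:
    """Hypothetical helper: the same structure expressed as a JSON Schema dict."""
    return {
        "type": "object",
        "properties": {
            "rationale": {"type": "string"},
            "document_type": {
                "type": "string",
                "enum": [t.value for t in DocumentType],
            },
            "confidence": {
                "type": "string",
                "enum": [c.value for c in Confidence],
            },
        },
        "required": ["rationale", "document_type", "confidence"],
        "additionalProperties": False,
    }
```

Restricting document_type and confidence to enumerated values keeps downstream routing simple, since every valid result falls into a known, finite set of cases.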
Classify documents
Once you have defined your custom schema, you can use vlm-1 to classify documents according to this schema. The classification will be validated against the schema you defined, ensuring that it conforms to the expected structure and types. First, let’s look at an example of how to classify a single document.
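The validation step described above can be sketched as follows. This is not the service's client code: predict is a hypothetical callable standing in for the actual vlm-1 request, stub_predict is a placeholder that returns a canned result, and the allowed values (other than the "hi" confidence level) are assumptions.

```python
from pathlib import Path
from typing import Callable

# Assumed category and confidence values; only "hi" appears in this guide.
ALLOWED_TYPES = ("invoice", "bank_statement", "utility_bill", "other")
ALLOWED_CONFIDENCE = ("hi", "med", "lo")
EXPECTED_KEYS = {"rationale", "document_type", "confidence"}


def validate_classification(raw: dict) -> dict:
    """Check a raw result against the expected structure and types."""
    if set(raw) != EXPECTED_KEYS:
        raise ValueError(f"expected keys {sorted(EXPECTED_KEYS)}, got {sorted(raw)}")
    if not isinstance(raw["rationale"], str) or not raw["rationale"].strip():
        raise ValueError("rationale must be a non-empty string")
    if raw["document_type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown document_type: {raw['document_type']!r}")
    if raw["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unknown confidence: {raw['confidence']!r}")
    return raw


def classify_document(path: Path, predict: Callable[[bytes], dict]) -> dict:
    """Read one document, run the predictor on its bytes, and validate the result."""
    return validate_classification(predict(path.read_bytes()))


def stub_predict(_: bytes) -> dict:
    # Hypothetical stand-in for the real vlm-1 call; returns a fixed result.
    return {
        "rationale": "Contains an invoice number, line items, and a total due.",
        "document_type": "invoice",
        "confidence": "hi",
    }
```

Passing the predictor in as an argument keeps the validation logic testable without network access; in real use the stub would be replaced by a function that sends the document to the API.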
Sample Document Classification
Let’s take a look at the sample output for a typical invoice document.
rationale: A detailed explanation of why it classified the document as an invoice, based on both content and visual features. This allows the developer or user to inspect the classification and make any necessary adjustments downstream of the model.
document_type: The document classification type, in this case an invoice.
confidence: A qualitative confidence level of “hi”, indicating strong certainty in the classification based on the clear presence of invoice-specific features.
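One way these three fields might be consumed downstream is to route each result to a pipeline by document_type and to hold back anything that is not high-confidence. The sketch below assumes illustrative pipeline names and a manual-review fallback, none of which come from the API itself.

```python
# Hypothetical mapping from document type to a downstream pipeline name.
PIPELINES = {
    "invoice": "invoice-pipeline",
    "bank_statement": "bank-statement-pipeline",
    "utility_bill": "utility-bill-pipeline",
}

MANUAL_REVIEW = "manual-review"  # assumed fallback queue


def route(result: dict) -> str:
    """Pick a destination for a classification result.

    Results that are not high-confidence, or whose type has no dedicated
    pipeline, go to manual review so a person can read the rationale.
    """
    if result["confidence"] != "hi":
        return MANUAL_REVIEW
    return PIPELINES.get(result["document_type"], MANUAL_REVIEW)


sample = {
    "rationale": "Contains an invoice number, line items, and a total due.",
    "document_type": "invoice",
    "confidence": "hi",
}
destination = route(sample)
```

Keeping the rationale attached to the result means a reviewer in the fallback queue can see why the model chose a category before correcting it.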
Processing larger document collections with batch=True
Once you have validated the classification for a single document, you can scale this process to classify larger collections of documents. The code example below shows how to process several documents in a directory. The rationale-based approach is particularly useful when dealing with ambiguous documents or when you need to understand why a document was classified in a certain way.
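A directory-level loop can be sketched as below. Note that this is a plain sequential loop rather than the batch=True option named in the heading, whose exact signature is not shown here. The classify callable is a hypothetical stand-in for a single-document classification call, and the list of supported file suffixes is an assumption.

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, List

# Assumed set of file types worth sending for classification.
SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


def classify_directory(
    directory: Path, classify: Callable[[Path], dict]
) -> Dict[str, List[dict]]:
    """Classify every supported file in a directory and group results by type."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for path in sorted(directory.iterdir()):
        # Skip subdirectories and unsupported file types.
        if not path.is_file() or path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        result = classify(path)
        # Keep the file name next to the result so the rationale can be traced back.
        grouped[result["document_type"]].append({"file": path.name, **result})
    return dict(grouped)
```

Grouping by document_type makes it easy to hand each group to its own pipeline, while the stored rationale remains available for reviewing ambiguous files.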
Fine-tuning Document Classification
This feature is currently only available for our enterprise-tier customers. If you are interested in using this feature, please contact us.