Classifying Documents
Learn how to classify documents into categories like invoices, bank statements, and utility bills.
While traditional document processing systems often rely on template-based approaches or simple keyword matching, vlm-1 can classify documents based on their content, layout, and visual characteristics. This enables robust classification of invoices, bank statements, utility bills, and other document types, even when they arrive in different formats or layouts.
The diagram below shows how a document is classified into one of several types, and how each type can have its own custom post-processing logic.
Classifying Financial Documents
Let’s look at a financial document classification example to see how vlm-1 can be used to automatically categorize different types of documents. In this example, we’ll use vlm-1 to classify documents into categories such as invoices, bank statements, and utility bills, with a catch-all for other financial documents. The classification can then be used to route each document to the appropriate processing pipeline or storage system.
Example of different types of financial documents that need classification.
Define a custom schema for document classification
In the sections below, we’ll showcase how to use the API for document classification. vlm-1 can automatically classify documents based on their content and visual characteristics, providing both a classification and a rationale for its decision. First, let’s create a custom schema that will be used to classify the documents.
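The original schema listing is not reproduced on this page, so the following is only a minimal sketch of what such a schema might look like, written as a plain JSON Schema dictionary using the Python standard library. The three field names (rationale, document_type, confidence) come from the output described later on this page. The category labels and the "med" and "lo" confidence levels are assumptions for illustration, and the name DOCUMENT_CLASSIFICATION_SCHEMA is hypothetical.

```python
import json

# Assumed category labels for illustration; adjust to the document types you handle.
DOCUMENT_TYPES = ["invoice", "bank_statement", "utility_bill", "other"]

# "hi" appears in the sample output discussed on this page;
# "med" and "lo" are assumed companion levels.
CONFIDENCE_LEVELS = ["hi", "med", "lo"]

# Hypothetical schema: one object with a rationale, a type, and a confidence.
DOCUMENT_CLASSIFICATION_SCHEMA = {
    "type": "object",
    "properties": {
        "rationale": {
            "type": "string",
            "description": "Why the document was assigned this type.",
        },
        "document_type": {"type": "string", "enum": DOCUMENT_TYPES},
        "confidence": {"type": "string", "enum": CONFIDENCE_LEVELS},
    },
    "required": ["rationale", "document_type", "confidence"],
    "additionalProperties": False,
}

if __name__ == "__main__":
    # Print the schema as JSON, the form in which it would typically be sent.
    print(json.dumps(DOCUMENT_CLASSIFICATION_SCHEMA, indent=2))
```

The same shape could be expressed with a schema library of your choice; a plain dictionary is used here only to keep the sketch dependency-free.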
Classify documents
Once you have defined your custom schema, you can use vlm-1 to classify documents according to this schema. The classification is validated against the schema you defined, ensuring that it conforms to the expected structure and types. First, let’s look at an example of how to classify a single document.
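The exact API call is not shown on this page, so the sketch below deliberately avoids guessing at it. Instead, the model call is passed in as a function (call_model, a hypothetical stand-in that takes the document bytes and returns a JSON string), and the code illustrates the surrounding flow: read one document, obtain a response, and validate it against the expected fields. The allowed labels and the "med" and "lo" confidence levels are assumptions.

```python
import json
from pathlib import Path
from typing import Callable

# Assumed labels for illustration; "hi" is taken from this page, "med"/"lo" are assumed.
ALLOWED_TYPES = ("invoice", "bank_statement", "utility_bill", "other")
ALLOWED_CONFIDENCE = ("hi", "med", "lo")
REQUIRED_FIELDS = {"rationale", "document_type", "confidence"}


def validate_classification(result: dict) -> dict:
    """Return result unchanged, or raise ValueError if it does not fit the schema."""
    missing = REQUIRED_FIELDS - set(result)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if not isinstance(result["rationale"], str):
        raise ValueError("rationale must be a string")
    if result["document_type"] not in ALLOWED_TYPES:
        raise ValueError(f"unexpected document_type: {result['document_type']!r}")
    if result["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"unexpected confidence: {result['confidence']!r}")
    return result


def classify_document(path: Path, call_model: Callable[[bytes], str]) -> dict:
    """Classify one document.

    call_model is a hypothetical stand-in for the real API request: it
    receives the raw file bytes and returns the model's JSON response text.
    """
    raw_response = call_model(Path(path).read_bytes())
    return validate_classification(json.loads(raw_response))
```

In practice, call_model would wrap whatever client call your integration uses; keeping it injectable also makes the validation logic easy to test without network access.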
Sample Document Classification
Let’s take a look at the sample output for a typical invoice document.
Let’s break down the output into its components:
rationale: A detailed explanation of why the model classified the document as an invoice, based on both content and visual features. This lets the developer or user inspect the classification and make any necessary adjustments downstream of the model.
document_type: The document classification type, in this case an invoice.
confidence: A qualitative confidence level of “hi”, indicating strong certainty in the classification based on the clear presence of invoice-specific features.
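As a rough sketch of how these three fields might be used downstream, the example below routes a classified document to a per-type handler and sends anything below “hi” confidence to manual review. The handler functions, the review rule, and the example dictionary are all hypothetical illustrations, not output from the model.

```python
from typing import Callable, Dict


# Hypothetical per-type post-processing steps.
def process_invoice(doc_id: str) -> str:
    return f"{doc_id}: extract line items and totals"


def process_bank_statement(doc_id: str) -> str:
    return f"{doc_id}: extract transactions"


def process_utility_bill(doc_id: str) -> str:
    return f"{doc_id}: extract billing period and amount due"


def process_other(doc_id: str) -> str:
    return f"{doc_id}: archive without further processing"


HANDLERS: Dict[str, Callable[[str], str]] = {
    "invoice": process_invoice,
    "bank_statement": process_bank_statement,
    "utility_bill": process_utility_bill,
}


def route(doc_id: str, classification: dict) -> str:
    """Pick a handler from document_type; defer to a human unless confidence is 'hi'."""
    if classification["confidence"] != "hi":
        # The rationale gives the reviewer context for the model's decision.
        return f"{doc_id}: queued for manual review ({classification['rationale']})"
    handler = HANDLERS.get(classification["document_type"], process_other)
    return handler(doc_id)


if __name__ == "__main__":
    # Illustrative classification only; not real model output.
    example = {
        "rationale": "Contains an invoice number, line items, and a total due.",
        "document_type": "invoice",
        "confidence": "hi",
    }
    print(route("doc-001", example))
```

Gating on confidence before dispatch is one possible design; depending on your tolerance for errors, you may prefer to review only specific document types.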
Processing larger document collections with batch=True
Once you have validated the classification for a single document, you can scale this process to classify larger collections of documents. The code example below shows how to process several documents in a directory. The rationale-based approach is particularly useful when dealing with ambiguous documents or when you need to understand why a document was classified in a certain way.
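The original directory-processing listing is not reproduced here, and the call signature for batch=True is not shown on this page, so the sketch below uses a plain client-side loop instead. It assumes a classify_fn callable (hypothetical) that takes a file path and returns a dictionary with rationale, document_type, and confidence. The list of supported file suffixes is also an assumption.

```python
from collections import Counter
from pathlib import Path
from typing import Callable, Dict, Tuple

# Assumed set of file types worth sending for classification.
SUPPORTED_SUFFIXES = {".pdf", ".png", ".jpg", ".jpeg"}


def classify_directory(
    directory: str,
    classify_fn: Callable[[Path], dict],
) -> Tuple[Dict[str, dict], Dict[str, str]]:
    """Classify every supported file in a directory (non-recursive).

    Returns (results, errors), both keyed by file name, so that one failing
    document does not abort the rest of the collection.
    """
    results: Dict[str, dict] = {}
    errors: Dict[str, str] = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file() or path.suffix.lower() not in SUPPORTED_SUFFIXES:
            continue
        try:
            results[path.name] = classify_fn(path)
        except Exception as exc:  # record the failure and keep going
            errors[path.name] = str(exc)
    return results, errors


def summarize(results: Dict[str, dict]) -> Counter:
    """Count how many documents were assigned to each document_type."""
    return Counter(result["document_type"] for result in results.values())
```

Keeping the per-file rationale in the results makes it straightforward to audit ambiguous documents after the run.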
Fine-tuning Document Classification
For enterprise use-cases that involve custom document types or require higher accuracy, you can use our fine-tuning guides to tailor the model to your performance and scalability needs. This can include fine-tuning the model on your own document collections, customizing the classification schema, or adding new document types to the classification system. Fine-tuning can improve accuracy on your specific document types, and lightweight fine-tuned models optimized for your use-case can handle larger volumes of documents more efficiently. Contact us at support@vlm.run to learn more about how we can help with your fine-tuning needs.
Try our Document -> JSON API today
Head over to our Document -> JSON API to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.