Classifying Images
Learn how to classify images into categories like animals, landscapes, and objects using AI.
While traditional image processing systems often rely on simple feature detection or rule-based approaches, vlm-1 can classify images based on their content, composition, and visual characteristics. This enables robust classification of images into various categories, even when they come in different styles, lighting conditions, or perspectives.
For example, below is a diagram showing how an image can be classified into different types, and how each type can have its own custom post-processing logic.
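The per-type post-processing idea can be sketched as a simple dispatch table. This is a minimal illustration, not part of the VLM Run API: the handler functions, the `POST_PROCESSORS` mapping, and the assumption that a classification result is a dict with an `image_type` string are all hypothetical.

```python
from typing import Callable, Dict


# Hypothetical post-processing steps, one per image type.
def process_animal(result: dict) -> str:
    return "animal: queue for species tagging"


def process_landscape(result: dict) -> str:
    return "landscape: queue for location tagging"


def process_other(result: dict) -> str:
    return "other: archive without extra processing"


# Map each known image type to its handler (assumed labels).
POST_PROCESSORS: Dict[str, Callable[[dict], str]] = {
    "animal": process_animal,
    "landscape": process_landscape,
}


def route(result: dict) -> str:
    """Send a classification result to the handler for its image_type.

    Types without a dedicated handler fall back to process_other.
    """
    handler = POST_PROCESSORS.get(result.get("image_type"), process_other)
    return handler(result)


if __name__ == "__main__":
    print(route({"image_type": "animal"}))
```

Keeping the mapping in one dictionary means a new image type only needs a new handler and one extra entry, while unknown types still get a safe default.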
Classifying TV Images
Let’s look at a TV image classification example to see how vlm-1 can be used to automatically analyze and categorize television content. In this example, we’ll use vlm-1 to classify TV screenshots and frames into categories like news broadcasts, entertainment shows, commercials, and other programming types. This classification enables automated content monitoring, ad detection, and intelligent media archiving by identifying the type of TV content being shown.
Example image that needs classification.
Define a custom schema for image classification
In the sections below, we’ll showcase how to use the API for image classification. vlm-1 can automatically classify images based on their content and visual characteristics, providing both a classification and a rationale for its decision. First, let’s create a custom schema that will be used to classify the images.
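As a rough sketch, a schema for the TV example could be written as a JSON Schema held in a Python dict. The field names `rationale`, `image_type`, and `confidence` follow the output described in this guide. The exact category labels, the `low`/`medium`/`high` confidence levels, and the name `TV_IMAGE_SCHEMA` are assumptions made for illustration, and the format the API actually expects may differ.

```python
import json

# Assumed category labels, based on the TV programming types named in this guide.
IMAGE_TYPES = ["news", "entertainment", "commercial", "other"]

# Assumed qualitative confidence levels.
CONFIDENCE_LEVELS = ["low", "medium", "high"]

# Hypothetical JSON Schema describing one classification result.
TV_IMAGE_SCHEMA = {
    "type": "object",
    "properties": {
        "rationale": {
            "type": "string",
            "description": "Why the image was assigned this type.",
        },
        "image_type": {"type": "string", "enum": IMAGE_TYPES},
        "confidence": {"type": "string", "enum": CONFIDENCE_LEVELS},
    },
    "required": ["rationale", "image_type", "confidence"],
    "additionalProperties": False,
}

if __name__ == "__main__":
    # Print the schema as JSON so it can be inspected or shared.
    print(json.dumps(TV_IMAGE_SCHEMA, indent=2))
```

Restricting `image_type` and `confidence` to enumerated values keeps downstream code from having to handle free-form labels.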
Classify images
Once you have defined your custom schema, you can use vlm-1 to classify images according to this schema. The classification will be validated against the schema you defined, ensuring that it conforms to the expected structure and types. First, let’s look at an example of how to classify a single image.
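The classify-then-validate flow might look roughly like the sketch below. It does not call the real API: `predict` is a hypothetical stand-in for whatever function sends the image to vlm-1 and returns JSON text, `fake_predict` is a stub with made-up output, and the allowed labels are assumptions based on the TV categories in this guide.

```python
import json
from typing import Callable

# Assumed allowed values for the enumerated fields.
ALLOWED = {
    "image_type": {"news", "entertainment", "commercial", "other"},
    "confidence": {"low", "medium", "high"},
}
EXPECTED_FIELDS = {"rationale", "image_type", "confidence"}


def validate(result: dict) -> dict:
    """Raise ValueError unless the result has the expected fields and values."""
    if set(result) != EXPECTED_FIELDS:
        raise ValueError(f"field mismatch: {sorted(set(result) ^ EXPECTED_FIELDS)}")
    if not isinstance(result["rationale"], str):
        raise ValueError("rationale must be a string")
    for field, allowed in ALLOWED.items():
        value = result[field]
        if not isinstance(value, str) or value not in allowed:
            raise ValueError(f"{field}={value!r} is not one of {sorted(allowed)}")
    return result


def classify_image(image_path: str, predict: Callable[[str], str]) -> dict:
    """Classify one image and validate the parsed response.

    `predict` is a placeholder for the real API call and must return JSON text.
    """
    return validate(json.loads(predict(image_path)))


def fake_predict(image_path: str) -> str:
    # Stub for illustration only; it ignores the image and returns canned output.
    return json.dumps(
        {"rationale": "Stubbed response.", "image_type": "news", "confidence": "high"}
    )


if __name__ == "__main__":
    print(classify_image("frame.jpg", fake_predict))
```

Passing the API call in as a function keeps the validation logic testable without network access.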
Sample Image Classification
Let’s take a look at the sample output for a typical TV news image.
Let’s break down the output into its components:
rationale: A detailed explanation of why the model classified the image as news, based on visual features and content. This lets the developer or user inspect the classification and make any necessary adjustments downstream of the model.
image_type: The predicted image classification type, in this case news.
confidence: A qualitative confidence level, here “high”, indicating strong certainty in the classification based on the clear presence of financial market data and a news presenter.
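One way to act on these three fields downstream is to gate on the confidence level. The sketch below is illustrative only: the `SAMPLE` response is hand-written to match the description above rather than real model output, and the `low`/`medium`/`high` ordering and the `needs_review` helper are assumptions.

```python
import json

# Illustrative response shaped like the output described above; not real model output.
SAMPLE = """
{
  "rationale": "The frame shows a news presenter alongside financial market data.",
  "image_type": "news",
  "confidence": "high"
}
"""

# Assumed ordering of the qualitative confidence levels.
CONFIDENCE_RANK = {"low": 0, "medium": 1, "high": 2}


def needs_review(result: dict, minimum: str = "medium") -> bool:
    """Return True when the result's confidence is below the `minimum` level."""
    return CONFIDENCE_RANK[result["confidence"]] < CONFIDENCE_RANK[minimum]


result = json.loads(SAMPLE)
print(result["image_type"], needs_review(result))  # prints: news False
```

Results flagged by `needs_review` could be sent to a human reviewer, with the `rationale` shown alongside the image to speed up the check.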
Fine-tuning Image Classification
For enterprise use cases that need custom image types or improved accuracy, you can use our fine-tuning guides to adapt the model to your performance and scalability needs. This can include fine-tuning the model on your own image collections, customizing the classification schema, or adding new image types to the classification system. Fine-tuning can improve accuracy for your specific image types and help you handle larger volumes of images with more efficient, lightweight fine-tuned models optimized for your use case. Contact us at support@vlm.run to learn more about how we can help with your fine-tuning needs.
Try our Image -> JSON API today
Head over to our Image -> JSON API to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.