Find a template image within a set of reference images and locate it with bounding boxes using VLM Run MCP tools.
This example demonstrates how to use VLM Run MCP tools to perform template matching. A source “template” image is used to search for and locate all instances of that template across a batch of different “reference” images. This is ideal for tasks like brand monitoring, content verification, and visual search.
Template matching is a computer vision technique for finding where a small image patch (the template) appears within a larger image. In this workflow, the user provides an image of a logo and asks the agent to find that logo in three reference images (news articles and photos), then return those images with the logo’s location highlighted.
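For intuition, classical template matching slides the template over the larger image and scores every position. The sketch below does this with a sum of squared differences on tiny grayscale grids. It illustrates the general technique only and makes no claim about how the VLM Run find_template tool works internally; the function name `match_template_ssd` and the sample data are hypothetical.

```python
from typing import List, Tuple

Grid = List[List[int]]  # grayscale image as rows of pixel intensities


def match_template_ssd(image: Grid, template: Grid) -> Tuple[int, int, int]:
    """Return (row, col, score) of the best match; a lower score is a closer match."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    if tpl_h > img_h or tpl_w > img_w:
        raise ValueError("template must not be larger than the image")

    best = None
    # Try every position where the template fits entirely inside the image.
    for top in range(img_h - tpl_h + 1):
        for left in range(img_w - tpl_w + 1):
            # Sum of squared differences between the template and this window.
            score = sum(
                (image[top + r][left + c] - template[r][c]) ** 2
                for r in range(tpl_h)
                for c in range(tpl_w)
            )
            if best is None or score < best[2]:
                best = (top, left, score)
    return best


image = [
    [0, 0, 0, 0, 0],
    [0, 0, 9, 1, 0],
    [0, 0, 1, 9, 0],
    [0, 0, 0, 0, 0],
]
template = [
    [9, 1],
    [1, 9],
]
row, col, score = match_template_ssd(image, template)
print(row, col, score)  # 1 2 0
```

A pixel-exact comparison like this breaks down when the logo changes scale, color, or perspective, which is one reason to delegate the search to a model-backed tool for real photos and article screenshots.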
Try it out yourself with the following prompt (in Claude Desktop or Web):
Given a template image (https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcTC_ps_PWPSsQ0ZeX7Zsqvtu_30qfYpdmW-0g&s) check which of the following images contains this template in the provided 3 reference images below.
The agent breaks this complex request down into a series of tool calls:
I’ll help you find the template image in the reference images. Let me start by loading all the images and then use template matching to identify where the template appears.
Load all images
Use the put_image_url tool to load the template image and each of the three reference images into the system. This takes four calls, one per image.
Response: Each call returns a unique image ID.
Find template in reference images
Use the find_template tool, providing the ID of the template and a list of the reference image IDs. A text prompt can be included to help guide the search.
Response: The tool returns a JSON object detailing the matches, confidence scores, and bounding box coordinates for each reference image.
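As a rough illustration of consuming such a response, the sketch below keeps only the boxes whose confidence clears a threshold. The JSON layout (`results`, `image_id`, `matches`, `confidence`, `bbox`), the sample values, and the helper name `confident_matches` are all assumptions made for this example; the actual find_template schema may differ.

```python
import json
from typing import Dict, List

# Hypothetical find_template response; field names and values are assumptions.
SAMPLE_RESPONSE = """
{
  "results": [
    {"image_id": "ref-1",
     "matches": [{"confidence": 0.93, "bbox": [0.10, 0.20, 0.15, 0.10]}]},
    {"image_id": "ref-2", "matches": []},
    {"image_id": "ref-3",
     "matches": [{"confidence": 0.41, "bbox": [0.50, 0.50, 0.20, 0.20]}]}
  ]
}
"""


def confident_matches(
    response: dict, min_confidence: float = 0.5
) -> Dict[str, List[List[float]]]:
    """Map each reference image ID to the boxes that clear the threshold."""
    kept: Dict[str, List[List[float]]] = {}
    for result in response["results"]:
        boxes = [
            match["bbox"]
            for match in result["matches"]
            if match["confidence"] >= min_confidence
        ]
        if boxes:  # skip images with no sufficiently confident match
            kept[result["image_id"]] = boxes
    return kept


matches = confident_matches(json.loads(SAMPLE_RESPONSE))
print(matches)  # {'ref-1': [[0.1, 0.2, 0.15, 0.1]]}
```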
Visualize bounding boxes
For each image where a match was found, use the visualize_bboxes tool with the coordinates from the find_template response to draw red boxes on the image.
Response: Each call creates a new image with the bounding boxes rendered.
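Conceptually, rendering a bounding box means coloring the pixels along the box outline. The following minimal sketch does that on an in-memory grid of RGB tuples. It assumes the box is given as inclusive pixel coordinates, which may not match the coordinate format that visualize_bboxes expects, and the helper `draw_box` is hypothetical rather than part of any VLM Run API.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
RED: Color = (255, 0, 0)
WHITE: Color = (255, 255, 255)


def draw_box(
    pixels: List[List[Color]],
    left: int,
    top: int,
    right: int,
    bottom: int,
    color: Color = RED,
) -> None:
    """Draw a 1-pixel outline in place; all four edges are inclusive."""
    for x in range(left, right + 1):
        pixels[top][x] = color      # top edge
        pixels[bottom][x] = color   # bottom edge
    for y in range(top, bottom + 1):
        pixels[y][left] = color     # left edge
        pixels[y][right] = color    # right edge


# A blank 6x5 white canvas standing in for a reference image.
canvas = [[WHITE for _ in range(6)] for _ in range(5)]
draw_box(canvas, left=1, top=1, right=4, bottom=3)
```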
Generate preview links
Use the preview_object_ref tool on the newly created annotated images to get shareable links.
Response: Each call returns a shareable preview link for the annotated image.
The final output clearly summarizes the findings and provides links to the annotated images, showing exactly where the Apple logo was found.
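The four steps can be sketched as a single orchestration function. Only the tool names (put_image_url, find_template, visualize_bboxes, preview_object_ref) come from the walkthrough above. The argument names, the response fields, and the `call_tool` callable are assumptions made for illustration, and a stub stands in for a real MCP client so the sketch runs on its own.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: (tool_name, arguments) -> parsed JSON result.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def run_template_search(
    call_tool: ToolCaller, template_url: str, reference_urls: List[str]
) -> List[str]:
    """Run the four-step workflow and return preview links for matched images."""
    # Step 1: load the template and every reference image (one call each).
    template_id = call_tool("put_image_url", {"url": template_url})["image_id"]
    reference_ids = [
        call_tool("put_image_url", {"url": url})["image_id"]
        for url in reference_urls
    ]
    # Step 2: search all reference images for the template.
    found = call_tool(
        "find_template",
        {"template_id": template_id, "reference_ids": reference_ids},
    )
    links = []
    for result in found["results"]:
        if not result["matches"]:
            continue  # nothing to draw for this image
        boxes = [match["bbox"] for match in result["matches"]]
        # Step 3: render the boxes onto a new annotated image.
        annotated = call_tool(
            "visualize_bboxes", {"image_id": result["image_id"], "bboxes": boxes}
        )
        # Step 4: get a shareable link for the annotated image.
        preview = call_tool("preview_object_ref", {"object_id": annotated["image_id"]})
        links.append(preview["url"])
    return links


calls: List[str] = []


def fake_call_tool(name: str, args: Dict[str, Any]) -> Dict[str, Any]:
    """Stub tool caller with made-up responses, for demonstration only."""
    calls.append(name)
    if name == "put_image_url":
        return {"image_id": "img-" + str(calls.count("put_image_url"))}
    if name == "find_template":
        hit = [{"confidence": 0.9, "bbox": [0.1, 0.1, 0.2, 0.2]}]
        return {
            "results": [
                {"image_id": ref, "matches": hit if ref == "img-2" else []}
                for ref in args["reference_ids"]
            ]
        }
    if name == "visualize_bboxes":
        return {"image_id": args["image_id"] + "-annotated"}
    if name == "preview_object_ref":
        return {"url": "https://example.invalid/preview/" + args["object_id"]}
    raise ValueError("unknown tool: " + name)


links = run_template_search(
    fake_call_tool,
    "https://example.invalid/template.png",
    [
        "https://example.invalid/a.png",
        "https://example.invalid/b.png",
        "https://example.invalid/c.png",
    ],
)
```

In practice the agent issues these calls itself; a script like this is only useful for reasoning about the order of operations and about which step consumes which output.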
This template matching capability can be adapted to other applications that need to locate a specific visual element across large sets of images.
Head over to our MCP server to start building your own document processing pipeline with VLM Run. Sign up for access on our platform.