Supported Domains

The VLM Run Hub is a collection of pre-defined domains and schemas for structured data extraction.

Document Domains

Domain	Allowed Inputs	Description
document.bank-statement	document, image	Bank statement data extraction system that processes bank statements to extract structured financial transaction information including account details, balances, and transaction history.
document.classification	document, image	Classify documents into one or more categories based on their content, visual features, and metadata.
document.invoice	document, image	Comprehensive invoice data extraction system that processes invoice images to extract structured information including invoice metadata, customer details, line items, and financial totals.
document.markdown	document, image	Convert document pages into highly accurate content descriptions, including table and chart content.
document.q-and-a	document, image	Convert document pages into highly accurate content descriptions, including table and chart content.
document.receipt	document, image	Receipt data extraction system that processes receipt images to extract structured information including transaction details, merchant information, and financial totals.
document.resume	document, image	Resume data extraction system that processes resume images to extract structured information including contact details, education, work experience, skills, and additional sections.
document.us-drivers-license	document, image	Driver’s license information extraction system that processes driver’s license images to extract structured information including name, address, date of birth, and license details.
document.utility-bill	document, image	Utility bill data extraction system that processes utility bill images to extract structured information including account details, billing period, charges, and payment information.

Image Domains

Domain	Allowed Inputs	Description
image.classification	image	Classify the image into one of the following categories: [category1, category2, category3, …]
image.caption	image	Extract the caption from the image.
image.tv-news	image	Extract the TV news segment from the image.
image.q-and-a	image	Answer questions about an image.

Audio Domains

Domain	Allowed Inputs	Description
audio.transcription	audio	Gain a competitive edge with real-time analytics from audio, helping you track trending topics, public sentiment, and influential quotes that shape audience perspectives.

Video Domains

Video domains analyze video content, for example by transcribing or summarizing it. They fall into three types:

Whole-video Summary (summary): Analyze the whole video content.
Segmented Summary (segmented-summary): Analyze the entire video content and summarize it into multiple segments (key moments, scenes, highlights, etc.). You can provide prompts and cues to guide the segmentation and specify the number of segments (e.g. “Find 5 key moments in this video where the CEO mentions ‘AI’”).
Segmented Analysis (segmented-analysis): Analyze the video content per segment, extracting detailed information from each segment as prompted via the custom video segment model (e.g. json_schema). Segments are detected automatically from audio and visual cues (e.g. silence, a new scene).

Domain	Allowed Inputs	Type	Description
video.transcription	video	`segmented-analysis`	Transcribe video content, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.transcription-summary	video	`segmented-summary`	Analyze video content by breaking it into 5-minute segments with detailed visual and audio analysis, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.product-demo-summary	video	`segmented-summary`	Analyze video content by breaking it into 5-minute segments with detailed visual and audio analysis, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.conferencing-summary	video	`segmented-summary`	Analyze conferencing videos to extract structured information including segments, topics, presenters, and key events for comprehensive conferencing monitoring and analysis.
video.podcast-summary	video	`segmented-summary`	Analyze podcast videos to extract structured information including segments, topics, presenters, and key events for comprehensive podcast monitoring and analysis.
video.summary	video	`summary`	Summarize the whole video in a concise 2-3 sentence summary and list at most 5 topics discussed in the video. Use the provided topics enum if possible.
video.dashcam-analytics	image, video	`summary`	Analyze dashcam footage to identify and classify events, objects, and situations relevant to road safety and vehicle monitoring, including traffic conditions, incidents, and environmental factors.
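
The Type column above can be read as a mapping from domain name to one of the three video types. The following is a minimal sketch of that mapping in plain Python. The names `VIDEO_DOMAIN_TYPES` and `domains_by_type` are hypothetical helpers for illustration and are not part of any VLM Run SDK; the entries are copied from the table.

```python
from collections import defaultdict

# Hypothetical local copy of the Type column from the video domains table.
VIDEO_DOMAIN_TYPES = {
    "video.transcription": "segmented-analysis",
    "video.transcription-summary": "segmented-summary",
    "video.product-demo-summary": "segmented-summary",
    "video.conferencing-summary": "segmented-summary",
    "video.podcast-summary": "segmented-summary",
    "video.summary": "summary",
    "video.dashcam-analytics": "summary",
}


def domains_by_type(mapping: dict[str, str]) -> dict[str, list[str]]:
    """Group domain names by their video type, preserving table order."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for domain, video_type in mapping.items():
        grouped[video_type].append(domain)
    return dict(grouped)


if __name__ == "__main__":
    for video_type, domains in domains_by_type(VIDEO_DOMAIN_TYPES).items():
        print(f"{video_type}: {', '.join(domains)}")
```

A grouping like this can help when choosing a domain: pick the type that matches the output you need (one summary, a list of summarized segments, or per-segment structured data), then select among the domains of that type.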

Industry-specific Domains

Domain	Allowed Inputs	Description
aerospace.remote-sensing	image, video	Satellite image analysis system for identifying and categorizing geographical features, infrastructure, and environmental elements from aerial imagery.
healthcare.patient-consent	document	Extract the information from the provided document and fill in the consent model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-identification	document, image	Extract the information from the provided document and fill in the identification model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-insurance-card	document, image	Extract the information from the provided document and fill in the insurance card model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-intake	document	Extract the information from the provided document and fill in the intake form model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-medical-history	document	Extract the information from the provided document and fill in the medical history model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-referral	document	Extract the information from the provided document and fill in the patient referral record model accordingly. If you aren’t sure, leave it blank.
retail.ecommerce-product-caption	image	Product data extraction system that processes product images to extract structured information including visual description, product details, and delivery information.
retail.product-catalog	image	Gain a competitive edge with real-time analytics from product catalog, helping you track trending topics, public sentiment, and influential quotes that shape audience perspectives.
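
Because each domain accepts only certain input types, it can be useful to check a file's type against the Allowed Inputs column before submitting a request. The sketch below does this with a small local lookup table. It assumes that an Allowed Inputs cell naming more than one type means any of those types is accepted. `ALLOWED_INPUTS`, `category_of`, and `is_allowed` are hypothetical names for illustration, not VLM Run API calls, and only a few rows from the tables are included.

```python
# Hypothetical local lookup table; entries are copied from the tables above.
ALLOWED_INPUTS = {
    "document.invoice": {"document", "image"},
    "image.caption": {"image"},
    "audio.transcription": {"audio"},
    "video.summary": {"video"},
    "aerospace.remote-sensing": {"image", "video"},
    "healthcare.patient-intake": {"document"},
}


def category_of(domain: str) -> str:
    """Return the part of the domain name before the first dot."""
    category, _, _ = domain.partition(".")
    return category


def is_allowed(domain: str, input_type: str) -> bool:
    """Return True if `input_type` is listed for `domain` in the lookup table."""
    if domain not in ALLOWED_INPUTS:
        raise ValueError(f"unknown domain: {domain!r}")
    return input_type in ALLOWED_INPUTS[domain]


if __name__ == "__main__":
    print(category_of("healthcare.patient-intake"))          # healthcare
    print(is_allowed("document.invoice", "image"))           # True
    print(is_allowed("healthcare.patient-intake", "image"))  # False
```

Raising on an unknown domain, rather than returning False, keeps a misspelled domain name from being mistaken for an unsupported input type.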
