The VLM Run Hub is a collection of pre-defined domains and schemas for structured data extraction.

Document Domains

Domain | Allowed Inputs | Description
document.bank-statement | document, image | Bank statement data extraction system that processes bank statements to extract structured financial transaction information including account details, balances, and transaction history.
document.classification | document, image | Classify documents into one or more categories based on their content, visual features, and metadata.
document.invoice | document, image | Comprehensive invoice data extraction system that processes invoice images to extract structured information including invoice metadata, customer details, line items, and financial totals.
document.markdown | document, image | Convert document pages into highly accurate content descriptions, including table and chart content.
document.q-and-a | document, image | Convert document pages into highly accurate content descriptions, including table and chart content.
document.receipt | document, image | Receipt data extraction system that processes receipt images to extract structured information including transaction details, merchant information, and financial totals.
document.resume | document, image | Resume data extraction system that processes resume images to extract structured information including contact details, education, work experience, skills, and additional sections.
document.us-drivers-license | document, image | Driver’s license information extraction system that processes driver’s license images to extract structured information including name, address, date of birth, and license details.
document.utility-bill | document, image | Utility bill data extraction system that processes utility bill images to extract structured information including account details, billing period, charges, and payment information.
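
As a rough illustration of how the Allowed Inputs column might be used on the client side, the sketch below keeps a small lookup table with a few rows from the tables on this page and checks a file's type against it before submitting. The `ALLOWED_INPUTS` mapping, the `input_kind` and `check_input` helpers, and the assumption that PDF files count as `document` inputs are all hypothetical and not part of the VLM Run API.

```python
import mimetypes

# Hypothetical client-side copy of a few rows from the domain tables.
ALLOWED_INPUTS = {
    "document.invoice": {"document", "image"},
    "document.receipt": {"document", "image"},
    "healthcare.patient-consent": {"document"},
}


def input_kind(filename: str) -> str:
    """Guess the input kind from a filename.

    Assumption: PDF files are treated as 'document' inputs.
    """
    mime, _ = mimetypes.guess_type(filename)
    if mime == "application/pdf":
        return "document"
    if mime is not None and mime.startswith("image/"):
        return "image"
    raise ValueError(f"Unsupported file type for {filename!r}: {mime}")


def check_input(domain: str, filename: str) -> bool:
    """Return True if the file's kind is an allowed input for the domain."""
    if domain not in ALLOWED_INPUTS:
        raise KeyError(f"Unknown domain: {domain}")
    return input_kind(filename) in ALLOWED_INPUTS[domain]


if __name__ == "__main__":
    print(check_input("document.invoice", "invoice.pdf"))
    print(check_input("healthcare.patient-consent", "scan.png"))
```

A check like this only catches obvious mismatches early; the service itself remains the authority on which inputs a domain accepts.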

Image Domains

Domain | Allowed Inputs | Description
image.classification | image | Classify the image into one of the following categories: [category1, category2, category3, …]
image.caption | image | Extract the caption from the image.
image.tv-news | image | Extract the TV news segment from the image.
image.q-and-a | image | Answer questions about an image.

Audio Domains

Domain | Allowed Inputs | Description
audio.transcription | audio | Gain a competitive edge with real-time analytics from audio, helping you track trending topics, public sentiment, and influential quotes that shape audience perspectives.

Video Domains

Video domains analyze video content, including transcription, summarization, and detailed analysis. They fall into three types:

  • Whole-video Summary (summary): Analyze the whole video content.
  • Segmented Summary (segmented-summary): Analyze the entire video content and summarize it into multiple segments (key moments, scenes, highlights, etc.). You can provide prompts and cues to guide the segmentation and specify the number of segments (e.g. “Find 5 key moments in this video where the CEO mentions ‘AI’”).
  • Segmented Analysis (segmented-analysis): Analyze the video content segment by segment, extracting detailed information from each segment according to a custom video segment model (e.g. json_schema). Segments are detected automatically from audio and visual cues (e.g. silence, new scene).
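
To make the three types concrete, here is a minimal sketch of how they differ. The `VideoType` values come from the list above; the `SEGMENT_SCHEMA` fields and the `traits` helper are hypothetical illustrations and do not describe the actual request or response format.

```python
import json
from enum import Enum


class VideoType(str, Enum):
    """The three video domain types named in the list above."""

    SUMMARY = "summary"
    SEGMENTED_SUMMARY = "segmented-summary"
    SEGMENTED_ANALYSIS = "segmented-analysis"


# Hypothetical JSON Schema for one segment of a segmented-analysis run.
# The field names are illustrative only.
SEGMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "start_time": {"type": "number"},
        "end_time": {"type": "number"},
        "speaker": {"type": "string"},
        "topics": {"type": "array", "items": {"type": "string"}},
    },
    "required": ["start_time", "end_time"],
}


def traits(video_type: VideoType) -> dict:
    """Summarize how a type treats the video, per the descriptions above."""
    return {
        # Only the whole-video summary produces a single result.
        "per_segment": video_type is not VideoType.SUMMARY,
        # Only segmented analysis applies a custom per-segment model.
        "custom_segment_schema": video_type is VideoType.SEGMENTED_ANALYSIS,
    }


if __name__ == "__main__":
    for vt in VideoType:
        print(vt.value, traits(vt))
    print(json.dumps(SEGMENT_SCHEMA, indent=2))
```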

Domain | Allowed Inputs | Type | Description
video.transcription | video | segmented-analysis | Transcribe video content, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.transcription-summary | video | segmented-summary | Analyze video content by breaking it into 5-minute segments with detailed visual and audio analysis, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.product-demo-summary | video | segmented-summary | Analyze video content by breaking it into 5-minute segments with detailed visual and audio analysis, including timestamps, visual descriptions, audio transcriptions, summaries, and topic identification.
video.conferencing-summary | video | segmented-summary | Analyze conferencing videos to extract structured information including segments, topics, presenters, and key events for comprehensive conferencing monitoring and analysis.
video.podcast-summary | video | segmented-summary | Analyze podcast videos to extract structured information including segments, topics, presenters, and key events for comprehensive podcast monitoring and analysis.
video.summary | video | summary | Summarize the whole video in a concise 2-3 sentence summary and list up to 5 topics discussed in the video. Use the provided topics enum if possible.
video.dashcam-analytics | image, video | summary | Analyze dashcam footage to identify and classify events, objects, and situations relevant to road safety and vehicle monitoring, including traffic conditions, incidents, and environmental factors.

Industry-specific Domains

Domain | Allowed Inputs | Description
aerospace.remote-sensing | image, video | Satellite image analysis system for identifying and categorizing geographical features, infrastructure, and environmental elements from aerial imagery.
healthcare.patient-consent | document | Extract the information from the provided document and fill in the consent model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-identification | document, image | Extract the information from the provided document and fill in the identification model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-insurance-card | document, image | Extract the information from the provided document and fill in the insurance card model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-intake | document | Extract the information from the provided document and fill in the intake form model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-medical-history | document | Extract the information from the provided document and fill in the medical history model accordingly. If you aren’t sure, leave it blank.
healthcare.patient-referral | document | Extract the information from the provided document and fill in the patient referral record model accordingly. If you aren’t sure, leave it blank.
retail.ecommerce-product-caption | image | Product data extraction system that processes product images to extract structured information including visual description, product details, and delivery information.
retail.product-catalog | image | Gain a competitive edge with real-time analytics from product catalog, helping you track trending topics, public sentiment, and influential quotes that shape audience perspectives.
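
Every domain identifier on this page follows a `<category>.<name>` pattern, which makes it straightforward to group or filter domains by category. The sketch below shows one way to do that; `split_domain` and `group_by_category` are hypothetical helper names, and the `DOMAINS` list holds only a sample of the identifiers above.

```python
from collections import defaultdict

# A sample of domain identifiers taken from the tables on this page.
DOMAINS = [
    "document.invoice",
    "image.caption",
    "audio.transcription",
    "video.summary",
    "healthcare.patient-intake",
    "healthcare.patient-referral",
    "retail.product-catalog",
]


def split_domain(domain: str) -> tuple[str, str]:
    """Split 'category.name' at the first dot into (category, name)."""
    category, sep, name = domain.partition(".")
    if not sep or not category or not name:
        raise ValueError(f"Not a '<category>.<name>' identifier: {domain!r}")
    return category, name


def group_by_category(domains: list[str]) -> dict[str, list[str]]:
    """Group domain names under their category prefix, preserving order."""
    groups: dict[str, list[str]] = defaultdict(list)
    for domain in domains:
        category, name = split_domain(domain)
        groups[category].append(name)
    return dict(groups)


if __name__ == "__main__":
    for category, names in group_by_category(DOMAINS).items():
        print(category, names)
```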