SDK Overview
Core concepts and components of the VLM Run Python SDK
The VLM Run SDK lets you extract structured data from unstructured content using vision-language models (VLMs). Whether you’re processing invoices, analyzing images, transcribing audio, or extracting insights from video, the SDK provides a unified interface for turning raw media into structured, actionable data.
Core Concepts
Domains & Schemas
In VLM Run, domains represent different types of content analysis:
- document.invoice - Extract data from invoices
- image.caption - Extract caption from the image
- audio.transcription - Transcribe spoken content
- video.dashcam-analytics - Analyze dashcam footage
Each domain has an associated schema that defines the structured output format.
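The domain identifiers listed above appear to follow a "media type, dot, task" naming pattern. The sketch below illustrates that pattern without calling the SDK. The helper split_domain and the DomainParts type are hypothetical names introduced here, and the assumption that every domain follows this pattern comes only from the four examples above.

```python
from typing import NamedTuple


class DomainParts(NamedTuple):
    media_type: str  # e.g. "document", "image", "audio", "video"
    task: str        # e.g. "invoice", "caption"


def split_domain(domain: str) -> DomainParts:
    """Split a domain identifier at its first dot (hypothetical helper)."""
    media_type, sep, task = domain.partition(".")
    if not sep or not media_type or not task:
        raise ValueError(f"expected '<media>.<task>', got {domain!r}")
    return DomainParts(media_type, task)


# The four example domains named in this overview.
DOMAINS = [
    "document.invoice",
    "image.caption",
    "audio.transcription",
    "video.dashcam-analytics",
]

for name in DOMAINS:
    parts = split_domain(name)
    print(f"{parts.media_type:<9} {parts.task}")
```

Grouping domains by their media type prefix is one way to decide which media-specific client a given domain belongs to.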
Content Processing Flow
The typical flow for processing content follows these steps:
- Prepare content - File, URL, or in-memory data
- Choose domain - Select appropriate domain for your task
- Generate prediction - Process the content
- Handle results - Work with the structured response
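The four steps above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the SDK's API: the Prediction class, run_flow, and fake_generate are hypothetical stand-ins, and the stub returns canned data instead of contacting any service.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class Prediction:
    """Hypothetical stand-in for the SDK's response object."""
    domain: str
    status: str
    response: Dict[str, Any]


def run_flow(content: bytes, domain: str,
             generate: Callable[[bytes, str], Prediction]) -> Dict[str, Any]:
    # Step 1: the caller has prepared the content (file, URL, or bytes).
    if not content:
        raise ValueError("content is empty")
    # Step 2: the caller has chosen a domain for the task.
    # Step 3: generate a prediction for that content and domain.
    prediction = generate(content, domain)
    # Step 4: handle the result, surfacing anything other than success.
    if prediction.status != "completed":
        raise RuntimeError(f"prediction ended with status {prediction.status!r}")
    return prediction.response


def fake_generate(content: bytes, domain: str) -> Prediction:
    """Stub that pretends to process content and returns canned data."""
    return Prediction(domain, "completed", {"bytes_seen": len(content)})


result = run_flow(b"%PDF-1.7 ...", "document.invoice", fake_generate)
print(result)
```

Passing the generate callable in as an argument keeps the flow testable: a real client call can be swapped in for the stub without changing the surrounding logic.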
SDK Structure
The SDK is organized around a central VLMRun client that gives you access to all functionality:
Working with Media Types
Each media type has its own specialized client with consistent patterns.
Images
Documents
Audio
Video
Working with Predictions
All content processing methods return a PredictionResponse with a consistent structure:
Prediction Statuses
A prediction will have one of these statuses:
- enqueued - Waiting to be processed
- pending - Ready to start processing
- running - Currently being processed
- completed - Processing finished successfully
- failed - Processing encountered an error
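When branching on status in application code, it can help to model the five values above as an enumeration. The sketch below is an assumption-laden illustration: the PredictionStatus class and is_terminal helper are hypothetical names, and treating completed and failed as the only terminal states is inferred from the descriptions above rather than confirmed by the SDK.

```python
from enum import Enum


class PredictionStatus(str, Enum):
    """Hypothetical enum mirroring the five statuses listed above."""
    ENQUEUED = "enqueued"
    PENDING = "pending"
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


# Assumption: only these two statuses mean processing has stopped.
TERMINAL = {PredictionStatus.COMPLETED, PredictionStatus.FAILED}


def is_terminal(status: str) -> bool:
    """Return True once a prediction is assumed to be finished."""
    # Raises ValueError for a status string that is not in the enum.
    return PredictionStatus(status) in TERMINAL


print(is_terminal("running"))
print(is_terminal("completed"))
```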
Handling Async Processing
For content that takes time to process, you can wait for completion:
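Waiting for completion usually comes down to polling the prediction's status until it reaches a final state or a timeout expires. The sketch below shows that general technique with a simulated backend. The function wait_for_completion and its parameters are hypothetical, and the SDK may provide its own helper for this with a different signature.

```python
import time
from typing import Callable


def wait_for_completion(fetch_status: Callable[[], str],
                        timeout: float = 60.0,
                        poll_interval: float = 1.0) -> str:
    """Poll fetch_status until it reports completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        status = fetch_status()
        if status in ("completed", "failed"):
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout} seconds")
        time.sleep(poll_interval)


# Simulated backend that advances one status per poll.
_statuses = iter(["enqueued", "pending", "running", "completed"])
final = wait_for_completion(lambda: next(_statuses),
                            timeout=5.0, poll_interval=0.01)
print(final)
```

Using time.monotonic for the deadline keeps the timeout correct even if the system clock is adjusted while waiting.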
Using Schemas
Schemas define the structure of prediction responses, providing type-safe access to extracted data.
Working with Standard Schemas
Every domain has a predefined schema:
Using Custom Schemas
You can define your own schema for custom extraction:
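To illustrate what type-safe access to a custom schema looks like, the sketch below defines a small typed class and validates a raw dictionary against it. Everything here is hypothetical: ReceiptSummary, its fields, and parse_response are illustrative names, the sample data is made up, and a standard-library dataclass is used only so the example is self-contained. How the SDK itself expects custom schemas to be declared is not shown in this overview.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict


@dataclass
class ReceiptSummary:
    """Hypothetical custom schema: the fields we want extracted."""
    merchant: str
    total: float
    currency: str


def parse_response(raw: Dict[str, Any]) -> ReceiptSummary:
    """Validate a raw response dict and coerce it into ReceiptSummary."""
    missing = [f.name for f in fields(ReceiptSummary) if f.name not in raw]
    if missing:
        raise KeyError(f"missing fields: {missing}")
    return ReceiptSummary(
        merchant=str(raw["merchant"]),
        total=float(raw["total"]),
        currency=str(raw["currency"]),
    )


# Made-up example payload, standing in for an extracted response.
raw = {"merchant": "Acme Hardware", "total": "42.50", "currency": "USD"}
summary = parse_response(raw)
print(summary.merchant, summary.total, summary.currency)
```

Once parsed, fields are accessed as attributes with known types, so a typo such as summary.totl fails immediately instead of silently returning nothing.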
Key Resources
Files
Manage files for processing:
Hub
Access domains and schemas:
Models
Get information about available models:
Common Patterns
Process & Extract
The most common pattern is processing content and extracting structured data:
Upload & Process
Another common pattern is uploading files first, then processing them:
Batch Processing
For processing multiple files:
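One common way to process many files is to fan the work out over a thread pool, since each request spends most of its time waiting on the network. The sketch below shows that pattern with a stub in place of a real SDK call. The names process_batch and fake_process are hypothetical, and the file names are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List


def process_batch(paths: List[str],
                  process_one: Callable[[str], dict],
                  max_workers: int = 4) -> Dict[str, dict]:
    """Run process_one over every path concurrently, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as the inputs.
        results = list(pool.map(process_one, paths))
    return dict(zip(paths, results))


def fake_process(path: str) -> dict:
    """Stub standing in for a real per-file prediction call."""
    return {"file": path, "status": "completed"}


batch = process_batch(["a.pdf", "b.pdf", "c.pdf"], fake_process)
for path, result in batch.items():
    print(path, result["status"])
```

Note that pool.map re-raises the first exception it encounters when results are collected, so a production version would likely catch errors per file to avoid losing the rest of the batch.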
Next Steps
Now that you understand the core concepts, you can: