Video Tools

VLM Run’s Orion agents can use video-editing tools for trimming, frame sampling, and segment extraction. These tools help you pull key moments out of a video, trim it to a specific segment, and sample frames for analysis.

Full video used to demonstrate video tools such as trimming, sampling, and keyframe detection

Example Usage

For most of the examples below, you can use the Structured Outputs API so that the response follows a schema you define, with fields for the video URLs, timestamps, and frame data.

1. Video Frame Sampling

Extract frames at regular intervals or specific timestamps for analysis.

Extract at least 3 frames from the video for thumbnail generation.

Example of 3 frames extracted from the video for thumbnail generation.

from vlmrun.client import VLMRun
from pydantic import BaseModel, Field
from typing import List

class VideoFrame(BaseModel):
    url: str = Field(..., description="The URL of the extracted frame")
    timestamp: str = Field(..., description="The timestamp of the extracted frame, in HH:MM:SS.MS format")

class VideoSamplingResponse(BaseModel):
    frames: List[VideoFrame] = Field(..., description="List of extracted frames")

# Initialize the VLM Run client
client = VLMRun(
    base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>"
)

# Extract keyframes for thumbnails
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": "Extract keyframes from this video for thumbnail generation, sampling every 5 seconds"
        },
        {
            "role": "video_url",
            "video_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/video.transcription/bakery.mp4"}
        }
    ],
    response_format={"type": "json_schema", "schema": VideoSamplingResponse.model_json_schema()}
)

# Print the response
print(response.choices[0].message.content)
>>> {"frames": [{"url": "https://.../frame-1.jpg", "timestamp": "..."}, {"url": "https://.../frame-2.jpg", "timestamp": "00:00:05.000"}, ...]}

# Validate the response
print(VideoSamplingResponse.model_validate_json(response.choices[0].message.content))
>>> VideoSamplingResponse(frames=[VideoFrame(url="https://.../frame-1.jpg", timestamp="..."), VideoFrame(url="https://.../frame-2.jpg", timestamp="00:00:05.000"), ...])
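Once you have the response, it can be useful to check that the frames really are spaced the way the prompt asked. The following is a minimal sketch using only the standard library; the helper names `timestamp_to_seconds` and `frame_gaps` and the `example.com` URLs are illustrative assumptions, not part of the VLM Run SDK.

```python
import json
from typing import List


def timestamp_to_seconds(ts: str) -> float:
    """Convert an HH:MM:SS.MS string such as "00:01:05.250" to seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def frame_gaps(content: str) -> List[float]:
    """Return the gap, in seconds, between consecutive frames in a response body."""
    frames = json.loads(content)["frames"]
    times = sorted(timestamp_to_seconds(f["timestamp"]) for f in frames)
    return [round(later - earlier, 3) for earlier, later in zip(times, times[1:])]


# Hypothetical response body shaped like VideoSamplingResponse
sample = json.dumps({"frames": [
    {"url": "https://example.com/frame-1.jpg", "timestamp": "00:00:00.000"},
    {"url": "https://example.com/frame-2.jpg", "timestamp": "00:00:05.000"},
    {"url": "https://example.com/frame-3.jpg", "timestamp": "00:00:10.000"},
]})

print(frame_gaps(sample))  # [5.0, 5.0]
```

In practice you would pass `response.choices[0].message.content` instead of `sample`.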

2. Video Highlight Extraction

Our video agents can extract the best moments from a video, such as scoring plays and other key actions.

Extract the 3 best moments from this video, including the start and end times of each moment.

Example of 3 extracted video highlights.

from pydantic import BaseModel, Field
from typing import List
from vlmrun.client import VLMRun

class HighlightVideo(BaseModel):
    start_time: str = Field(..., description="Start time of the segment, in HH:MM:SS.MS format")
    end_time: str = Field(..., description="End time of the segment, in HH:MM:SS.MS format")
    url: str = Field(..., description="The URL of the extracted segment")

class HighlightExtractionResponse(BaseModel):
    segments: List[HighlightVideo] = Field(..., description="List of extracted segments")

# Initialize the VLM Run client
client = VLMRun(
    base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>"
)

# Extract multiple segments
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": "Extract the 3 best moments from this video, including the start and end times of each moment."
        },
        {
            "role": "video_url",
            "video_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/video.transcription/bakery.mp4"}
        }
    ],
    response_format={"type": "json_schema", "schema": HighlightExtractionResponse.model_json_schema()}
)

# Print the response
print(response.choices[0].message.content)
>>> {"segments": [...]}

# Validate the response
print(HighlightExtractionResponse.model_validate_json(response.choices[0].message.content))
>>> HighlightExtractionResponse(segments=[...])
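Schema validation confirms that each segment has the expected fields, but not that the times make sense. As a sketch of one possible sanity check, the snippet below computes each segment's duration and rejects segments whose end does not come after their start. The `Segment` dataclass mirrors the fields of `HighlightVideo`; the helper names and sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Segment:
    """Stand-in with the same fields as HighlightVideo."""
    start_time: str
    end_time: str
    url: str


def to_seconds(ts: str) -> float:
    """Convert an HH:MM:SS.MS string to seconds."""
    hours, minutes, seconds = ts.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def segment_durations(segments: List[Segment]) -> List[float]:
    """Return each segment's length in seconds, raising if a segment is empty or reversed."""
    durations = []
    for seg in segments:
        start, end = to_seconds(seg.start_time), to_seconds(seg.end_time)
        if end <= start:
            raise ValueError(f"end_time {seg.end_time} is not after start_time {seg.start_time}")
        durations.append(round(end - start, 3))
    return durations


# Hypothetical segments; real values come from the agent's response
highlights = [
    Segment("00:00:12.500", "00:00:20.000", "https://example.com/highlight-1.mp4"),
    Segment("00:01:03.000", "00:01:10.250", "https://example.com/highlight-2.mp4"),
]

print(segment_durations(highlights))  # [7.5, 7.25]
```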

3. Time-Based Trimming

Extract specific segments from videos with precise start and end timestamps.

Trim the video from 10 seconds to 30 seconds

Example of time-based trimming that produces a 20-second clip.

from pydantic import BaseModel, Field
from vlmrun.client import VLMRun

class VideoTrimmingResponse(BaseModel):
    start_time: str = Field(..., description="The start time of the trimmed video (HH:MM:SS.MS format)")
    end_time: str = Field(..., description="The end time of the trimmed video (HH:MM:SS.MS format)")
    url: str = Field(..., description="The URL of the trimmed video")

# Initialize the VLM Run client
client = VLMRun(
    base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>"
)

# Trim the video to the requested time range
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": "Trim the video from 10 seconds to 30 seconds"
        },
        {
            "role": "video_url",
            "video_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/video.transcription/bakery.mp4"}
        }
    ],
    ],
    response_format={"type": "json_schema", "schema": VideoTrimmingResponse.model_json_schema()}
)

# Print the response
print(response.choices[0].message.content)
>>> {"start_time": "00:00:10.000", "end_time": "00:00:30.000", "url": "https://.../trimmed.mp4"}

# Validate the response
print(VideoTrimmingResponse.model_validate_json(response.choices[0].message.content))
>>> VideoTrimmingResponse(start_time="00:00:10.000", end_time="00:00:30.000", url="https://.../trimmed.mp4")
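To compare the returned range against what you asked for, you can convert the requested seconds into the same HH:MM:SS.MS format the response uses. This is a minimal standard-library sketch; `seconds_to_timestamp` and `matches_request` are hypothetical helper names, and the sample body simply mirrors the output shown above.

```python
import json


def seconds_to_timestamp(seconds: float) -> str:
    """Format a non-negative number of seconds as HH:MM:SS.MS (millisecond resolution)."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def matches_request(content: str, start_s: float, end_s: float) -> bool:
    """Check whether a trimming response reports exactly the requested range."""
    data = json.loads(content)
    return (
        data["start_time"] == seconds_to_timestamp(start_s)
        and data["end_time"] == seconds_to_timestamp(end_s)
    )


# Sample body mirroring the response shown above
sample = '{"start_time": "00:00:10.000", "end_time": "00:00:30.000", "url": "https://.../trimmed.mp4"}'

print(seconds_to_timestamp(3725.5))     # 01:02:05.500
print(matches_request(sample, 10, 30))  # True
```

An exact string comparison is deliberately strict. If cuts are aligned to keyframes, the returned times may differ slightly from the request, in which case comparing within a tolerance would be more appropriate.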

FAQ

What video formats are supported for trimming?

MP4: Most common format with excellent compatibility
MOV: Apple QuickTime format
AVI: Windows video format
MKV: Matroska video format
WebM: Web-optimized format
Quality Preservation: Maintains original video quality in trimmed segments
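Before sending a video URL, you may want a quick client-side check that its extension is one of the formats listed above. The sketch below is an assumption about how such a check could look, not an SDK feature; `has_supported_extension` is a hypothetical helper, and a file extension is only a hint about the actual container format.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions for the formats listed above
SUPPORTED_EXTENSIONS = {".mp4", ".mov", ".avi", ".mkv", ".webm"}


def has_supported_extension(url: str) -> bool:
    """Return True if the URL path ends in one of the listed video extensions."""
    path = urlparse(url).path  # drops any query string or fragment
    return PurePosixPath(path).suffix.lower() in SUPPORTED_EXTENSIONS


print(has_supported_extension(
    "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/video.transcription/bakery.mp4"
))  # True
print(has_supported_extension("https://example.com/clip.gif"))  # False
```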

What are the best practices for frame sampling?

Uniform Sampling: Extract frames at regular intervals (e.g., every 1-5 seconds)
Keyframe Sampling: Extract only keyframes for efficient analysis
Scene-Based: Sample based on scene changes for better content analysis
Quality Balance: Choose appropriate sampling rate based on analysis needs
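To make uniform sampling concrete, the sketch below computes the timestamps that a fixed interval would produce for a video of known duration. It only illustrates the arithmetic and says nothing about how the agent samples internally; `uniform_timestamps` is a hypothetical helper.

```python
from typing import List


def uniform_timestamps(duration_s: float, interval_s: float) -> List[float]:
    """Return sample times (in seconds) every interval_s, starting at 0 and stopping before duration_s."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    # Work in integer milliseconds to avoid floating-point drift
    duration_ms = round(duration_s * 1000)
    interval_ms = round(interval_s * 1000)
    if interval_ms == 0:
        raise ValueError("interval_s must be at least one millisecond")
    return [ms / 1000 for ms in range(0, duration_ms, interval_ms)]


print(uniform_timestamps(12, 5))        # [0.0, 5.0, 10.0]
print(len(uniform_timestamps(60, 1)))   # 60
```

A shorter interval gives denser coverage at the cost of more frames to store and analyze, which is the trade-off the "Quality Balance" point describes.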

How precise is the time-based trimming?

Millisecond Precision: Cut videos to exact time ranges with millisecond accuracy
Keyframe Alignment: Align cuts to nearest keyframes for clean edits
Smart Boundaries: Automatically detect optimal cut points
Quality Preservation: Maintain video quality without re-encoding when possible
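Keyframe alignment means that a requested cut time can move to the closest keyframe. The sketch below illustrates that idea on a hypothetical list of keyframe times; it is not a description of how VLM Run implements trimming, and `snap_to_keyframe` is an illustrative name.

```python
import bisect
from typing import List


def snap_to_keyframe(t: float, keyframes: List[float]) -> float:
    """Return the keyframe time closest to t; keyframes must be sorted and non-empty."""
    i = bisect.bisect_left(keyframes, t)
    if i == 0:
        return keyframes[0]
    if i == len(keyframes):
        return keyframes[-1]
    before, after = keyframes[i - 1], keyframes[i]
    # Ties go to the earlier keyframe
    return before if t - before <= after - t else after


# Hypothetical keyframes every 2 seconds
keyframes = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0]

print(snap_to_keyframe(9.3, keyframes))   # 10.0
print(snap_to_keyframe(4.9, keyframes))   # 4.0
```

With keyframes 2 seconds apart, a snapped cut can land up to 1 second away from the requested time, which is why keyframe-aligned cuts and millisecond-exact cuts are generally different trade-offs.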
