Skills work with all VLM Run API generation endpoints (api.vlm.run). Pass skills in the config.skills parameter to automatically apply the skill’s prompt and JSON schema to your request.
Image → JSON
Extract structured JSON from images:
from PIL import Image
from vlmrun.client import VLMRun
from vlmrun.client.types import GenerationConfig, AgentSkill

client = VLMRun(api_key="<VLMRUN_API_KEY>")
response = client.image.generate(
    images=[Image.open("photo.jpg")],
    model="vlm-1",
    config=GenerationConfig(
        skills=[AgentSkill(skill_name="invoice-extraction", version="latest")]
    ),
)
Document → JSON
Extract structured JSON from documents:
from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import GenerationConfig, AgentSkill

client = VLMRun(api_key="<VLMRUN_API_KEY>")
response = client.document.generate(
    file=Path("invoice.pdf"),
    model="vlm-1",
    config=GenerationConfig(
        skills=[AgentSkill(skill_name="invoice-extraction", version="latest")]
    ),
)
Video → JSON
Extract structured JSON from videos:
from pathlib import Path
from vlmrun.client import VLMRun
from vlmrun.client.types import GenerationConfig, AgentSkill

client = VLMRun(api_key="<VLMRUN_API_KEY>")
response = client.video.generate(
    file=Path("recording.mp4"),
    model="vlm-1",
    config=GenerationConfig(
        skills=[AgentSkill(skill_name="meeting-notes", version="latest")]
    ),
    batch=True,
)
When skills are provided and domain is omitted, the platform creates a dynamic application from the skill’s prompt and JSON schema, so you do not need to specify a domain.
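Because the same skill configuration applies to the image, document, and video endpoints, a script that handles mixed inputs only has to decide which endpoint a file belongs to. The sketch below is one way to do that. The helper name `endpoint_for` and the extension table are illustrative assumptions, not part of the VLM Run SDK, and the list of file types is not exhaustive.

```python
from pathlib import Path

# Assumed mapping from file extension to endpoint name; extend it for the
# file types you actually handle. This table is not defined by the SDK.
_ENDPOINT_BY_SUFFIX = {
    ".jpg": "image",
    ".jpeg": "image",
    ".png": "image",
    ".pdf": "document",
    ".mp4": "video",
    ".mov": "video",
}


def endpoint_for(path):
    """Return "image", "document", or "video" for a file path (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    try:
        return _ENDPOINT_BY_SUFFIX[suffix]
    except KeyError:
        # Fail loudly instead of guessing an endpoint for an unknown type.
        raise ValueError(f"No endpoint mapped for file type {suffix!r}") from None
```

The returned name matches the client attributes used in the examples above (`client.image`, `client.document`, `client.video`). Note that those examples pass `images=[...]` to the image endpoint and `file=...` to the other two, so the call itself still has to be written per endpoint.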