Segmentation

Create pixel-level segmentation masks for objects, regions, and features in images. Typical applications include medical imaging, autonomous driving, photo editing, and augmented reality.

Segmented car mask visualized.

Object Segmentation

Segmentation of cars with their masks overlaid

Face Segmentation

Segmentation of individual faces with their masks overlaid

Example segmentations of objects and faces.

Usage Example

For segmentation, we highly recommend using the Structured Outputs API so that the segmentation masks are returned in a structured, validated data format. Each object instance gets its own mask in PNG format, retrievable through a pre-signed URL.

from pydantic import BaseModel, Field
from vlmrun.client import VLMRun

class SegmentationMask(BaseModel):
    label: str = Field(..., description="Name of the segmented object or region")
    mask_url: str = Field(..., description="Pre-signed URL to the PNG segmentation mask for this object instance")

class Segmentations(BaseModel):
    segmentations: list[SegmentationMask] = Field(..., description="List of segmented objects or regions with their mask URLs")

# Initialize the VLM Run client
client = VLMRun(
    base_url="https://agent.vlm.run/v1", api_key="<VLMRUN_API_KEY>"
)

# Segment objects in the image
response = client.agent.completions.create(
    model="vlmrun-orion-1:auto",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Segment all the cars in this image"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.object-detection/nascar.jpg",
                        "detail": "auto",
                    },
                },
            ],
        }
    ],
    response_format={"type": "json_schema", "schema": Segmentations.model_json_schema()},
)

# Print the response
print(response.choices[0].message.content)

# Validate the response against the schema
print(Segmentations.model_validate_json(response.choices[0].message.content))
# Example output (mask URLs abbreviated):
# Segmentations(segmentations=[SegmentationMask(label="car", mask_url="https://.../mask1.png"), SegmentationMask(label="car", mask_url="https://.../mask2.png")])
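
Since each mask is delivered as a pre-signed URL, and such URLs are typically time-limited, it can be convenient to save the PNG files locally soon after the call. The sketch below is one way to do that with only the Python standard library. It assumes the response content is a JSON string matching the `Segmentations` schema above; `fetch_bytes`, `save_masks`, and the file-naming scheme are hypothetical helpers introduced for illustration, not part of the VLM Run SDK.

```python
import json
import re
import urllib.request
from pathlib import Path
from typing import Callable


def fetch_bytes(url: str) -> bytes:
    """Download the raw bytes behind a URL (hypothetical helper)."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def save_masks(
    content: str,
    out_dir: str,
    fetch: Callable[[str], bytes] = fetch_bytes,
) -> list[Path]:
    """Save every mask listed in a `Segmentations`-shaped JSON string.

    Files are named `<index>_<label>.png` (an assumed convention).
    `fetch` can be swapped out, e.g. for a stub in tests.
    """
    payload = json.loads(content)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths: list[Path] = []
    for index, seg in enumerate(payload["segmentations"]):
        # Keep file names safe: replace anything unusual with "_".
        safe_label = re.sub(r"[^A-Za-z0-9_-]+", "_", seg["label"])
        path = out / f"{index:03d}_{safe_label}.png"
        path.write_bytes(fetch(seg["mask_url"]))
        paths.append(path)
    return paths
```

With the response from the example above, a call might look like `save_masks(response.choices[0].message.content, "masks")`.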

FAQ

What different types of segmentation are supported?

Instance Segmentation: Segment individual objects with unique masks
Semantic Segmentation: Classify pixels by category or class
Panoptic Segmentation: Combine instance and semantic segmentation
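
To make the relationship between the three types concrete, here is a small NumPy toy example. It is a simplified illustration rather than a description of how the service computes anything: the function name, the `class_ids` mapping, and the convention that 0 means "unlabeled" are all assumptions made for this sketch.

```python
import numpy as np


def to_semantic_and_panoptic(instances, class_ids, shape):
    """Derive semantic and panoptic maps from per-instance boolean masks.

    instances: list of (label, mask) pairs, each mask a boolean (H, W) array
    class_ids: dict mapping label -> positive integer class ID (assumed)
    Returns (semantic, panoptic), where semantic is (H, W) and
    panoptic is (H, W, 2) holding [class ID, instance ID] per pixel.
    """
    semantic = np.zeros(shape, dtype=np.int32)      # 0 = unlabeled
    instance_ids = np.zeros(shape, dtype=np.int32)  # 0 = no instance
    for idx, (label, mask) in enumerate(instances, start=1):
        # Where instances overlap, the later one overwrites the earlier one.
        semantic[mask] = class_ids[label]
        instance_ids[mask] = idx
    panoptic = np.stack([semantic, instance_ids], axis=-1)
    return semantic, panoptic


# Two separate "car" instances in a tiny 2x4 image.
car_a = np.zeros((2, 4), dtype=bool)
car_a[:, :2] = True
car_b = np.zeros((2, 4), dtype=bool)
car_b[:, 3] = True

semantic, panoptic = to_semantic_and_panoptic(
    [("car", car_a), ("car", car_b)], {"car": 1}, (2, 4)
)
print(semantic)          # both cars share class ID 1
print(panoptic[..., 1])  # the instance channel tells them apart (1 vs 2)
```

In the semantic map both cars collapse into the same class, while the panoptic map keeps the class and additionally separates the two instances.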

What format do the segmentation masks come in?

The response contains a list of segmented objects, each with its own mask. Every mask is a PNG image, one per object instance, retrievable through a pre-signed URL.
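
Once a mask PNG has been downloaded, it can be decoded into an array for further processing. The following is a minimal sketch using Pillow and NumPy (both third-party packages). It assumes that foreground pixels are non-zero and background pixels are zero, which is not confirmed by this page; `load_binary_mask` and `mask_stats` are hypothetical helper names.

```python
import io

import numpy as np
from PIL import Image


def load_binary_mask(png_bytes: bytes, threshold: int = 0) -> np.ndarray:
    """Decode PNG bytes into a boolean (H, W) array.

    Pixels brighter than `threshold` count as foreground (an assumption).
    """
    image = Image.open(io.BytesIO(png_bytes)).convert("L")  # 8-bit grayscale
    return np.asarray(image) > threshold


def mask_stats(mask: np.ndarray) -> dict:
    """Return the pixel area and the inclusive bounding box of a mask.

    The bounding box is (x_min, y_min, x_max, y_max), or None if empty.
    """
    ys, xs = np.nonzero(mask)
    if ys.size == 0:
        return {"area": 0, "bbox": None}
    return {
        "area": int(mask.sum()),
        "bbox": (int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())),
    }
```

The boolean array can then be used directly for indexing, for example `image_array[mask]` to select the pixels of one object in an image of the same size.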

What types of objects and categories can be segmented?

Common Objects

People: person, face, hand, foot
Vehicles: car, truck, bus, motorcycle, bicycle
Animals: dog, cat, bird, horse, cow, sheep
Furniture: chair, table, bed, sofa, desk
Electronics: laptop, phone, tv, keyboard, mouse

Specialized Categories

Medical: organ, tissue, lesion, bone
Nature: tree, grass, sky, water, mountain
Indoor: wall, floor, ceiling, door, window
Outdoor: road, sidewalk, building, sign, traffic_light

What mask formats are supported?

PNG Masks

Binary or grayscale images where each pixel value represents a segment ID
Compatible with most image editing software
Small file size for simple segmentations
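
When a mask encodes several segments as distinct pixel values, it can be split into one boolean mask per segment ID. The sketch below shows one way to do this with NumPy; treating 0 as the background value is an assumption, and `split_by_segment_id` is a hypothetical helper name.

```python
import numpy as np


def split_by_segment_id(id_mask: np.ndarray, background: int = 0) -> dict[int, np.ndarray]:
    """Split an integer ID mask into {segment_id: boolean mask}.

    Pixels equal to `background` (assumed to be 0) are skipped.
    """
    return {
        int(segment_id): id_mask == segment_id
        for segment_id in np.unique(id_mask)
        if segment_id != background
    }


# Tiny example: two segments (IDs 1 and 2) on a background of zeros.
id_mask = np.array([[0, 1, 1],
                    [2, 2, 0]])
per_segment = split_by_segment_id(id_mask)
print(sorted(per_segment))  # [1, 2]
```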

Other Formats (Coming soon!): JSON Polygons and COCO Format
