Create precise pixel-level segmentation masks for objects, regions, and features in images. Perfect for medical imaging, autonomous driving, photo editing, and augmented reality applications.

Figure: Object Segmentation. Cars with their masks overlaid.

Figure: Face Segmentation. Individual faces with their masks overlaid.

Example detections of objects, people and faces.

Usage Example

For segmentation, we highly recommend using the Structured Outputs API so that the masks are returned in a structured, validated format. Each object instance gets its own PNG mask, which is retrieved through a pre-signed URL.

import openai
from pydantic import BaseModel, Field

class SegmentationMask(BaseModel):
    label: str = Field(..., description="Name of the segmented object or region")
    mask_url: str = Field(..., description="Pre-signed URL to the PNG segmentation mask for this object instance")

class Segmentations(BaseModel):
    segmentations: list[SegmentationMask] = Field(..., description="List of segmented objects or regions with their mask URLs")

# Initialize the client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Segment objects in the image
response = client.chat.completions.create(
    model="vlm-agent-1",
    messages=[
        {
          "role": "user",
          "content": [
            {"type": "text", "text": "Segment all the cars in this image"},
            {"type": "image_url", "image_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.object-detection/nascar.jpg", "detail": "auto"}}
          ]
        }
    ],
    response_format={"type": "json_schema", "schema": Segmentations.model_json_schema()},
)

# Print the response
print(response.choices[0].message.content)

# Validate the response
print(Segmentations.model_validate_json(response.choices[0].message.content))
# >>> Segmentations(segmentations=[SegmentationMask(label="car", mask_url="https://.../mask1.png"), SegmentationMask(label="car", mask_url="https://.../mask2.png")])
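
Since each mask arrives as a URL rather than as image data, a natural next step is to download the PNG files. The following is a minimal sketch using only the standard library. The helper names (`fetch_bytes`, `save_masks`), the injectable `fetch` parameter, and the `<index>_<label>.png` file naming are illustrative assumptions, not part of the API.

```python
import json
import urllib.request
from pathlib import Path


def fetch_bytes(url: str) -> bytes:
    """Download the raw bytes behind a URL (here: a pre-signed mask URL)."""
    with urllib.request.urlopen(url) as resp:
        return resp.read()


def save_masks(content: str, out_dir: str, fetch=fetch_bytes) -> list[Path]:
    """Save every mask listed in the JSON response content as a PNG file.

    `content` is the JSON string from response.choices[0].message.content.
    The file naming scheme is a hypothetical convention chosen for this sketch.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, seg in enumerate(json.loads(content)["segmentations"]):
        # Keep only filesystem-friendly characters from the label.
        safe_label = "".join(c if c.isalnum() else "_" for c in seg["label"])
        path = out / f"{i:03d}_{safe_label}.png"
        path.write_bytes(fetch(seg["mask_url"]))
        paths.append(path)
    return paths
```

With the response from the example above, this could be called as `save_masks(response.choices[0].message.content, "masks")`. Pre-signed URLs are usually time-limited, so it is probably safest to download the masks soon after receiving the response.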

FAQ

  • Instance Segmentation: Segment individual objects with unique masks
  • Semantic Segmentation: Classify pixels by category or class
  • Panoptic Segmentation: Combine instance and semantic segmentation
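
The difference between the three types can be sketched on a toy example. The code below starts from per-instance binary masks (the instance view) and derives a semantic map and a panoptic map from them. The tiny grids, the `road` background class, and the function names are made up for illustration and do not reflect the API's actual output format.

```python
# Toy 3x4 scene with two cars. Each instance mask is a binary grid
# (1 = pixel belongs to that object instance).
instances = [
    ("car", [[1, 1, 0, 0],
             [1, 1, 0, 0],
             [0, 0, 0, 0]]),
    ("car", [[0, 0, 0, 1],
             [0, 0, 0, 1],
             [0, 0, 0, 0]]),
]
background = "road"  # hypothetical class for pixels not covered by any instance


def to_semantic(instances, background):
    """Semantic view: one class name per pixel; instances of a class are merged."""
    height, width = len(instances[0][1]), len(instances[0][1][0])
    semantic = [[background] * width for _ in range(height)]
    for label, mask in instances:
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    semantic[y][x] = label
    return semantic


def to_panoptic(instances, background):
    """Panoptic view: (class, instance id) per pixel; background gets id 0."""
    height, width = len(instances[0][1]), len(instances[0][1][0])
    panoptic = [[(background, 0)] * width for _ in range(height)]
    for instance_id, (label, mask) in enumerate(instances, start=1):
        for y in range(height):
            for x in range(width):
                if mask[y][x]:
                    panoptic[y][x] = (label, instance_id)
    return panoptic


semantic = to_semantic(instances, background)
panoptic = to_panoptic(instances, background)
```

In the semantic map both cars collapse into the single class `car`, while the panoptic map keeps them apart as `("car", 1)` and `("car", 2)` and still assigns a class to every remaining pixel.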
Segmentation masks are returned as a list of objects, each with its own mask. Each mask is a PNG file retrieved through a pre-signed URL, one per object instance.
Common Objects
  • People: person, face, hand, foot
  • Vehicles: car, truck, bus, motorcycle, bicycle
  • Animals: dog, cat, bird, horse, cow, sheep
  • Furniture: chair, table, bed, sofa, desk
  • Electronics: laptop, phone, tv, keyboard, mouse
Specialized Categories
  • Medical: organ, tissue, lesion, bone
  • Nature: tree, grass, sky, water, mountain
  • Indoor: wall, floor, ceiling, door, window
  • Outdoor: road, sidewalk, building, sign, traffic_light
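
To target one of the categories above, the text prompt from the usage example can be parameterized. This is a small sketch: the helper name `build_segmentation_messages` is hypothetical, and it assumes that the phrasing "Segment all the ... in this image" works for categories other than cars, which the example above does not confirm.

```python
def build_segmentation_messages(category_phrase: str, image_url: str) -> list[dict]:
    """Build a chat `messages` list asking for all instances of one category.

    `category_phrase` is a plain-language plural such as "cars" or
    "traffic lights" (assumed wording, mirroring the usage example).
    """
    prompt = f"Segment all the {category_phrase} in this image"
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url", "image_url": {"url": image_url, "detail": "auto"}},
            ],
        }
    ]
```

The returned list can be passed as the `messages` argument of `client.chat.completions.create(...)` in place of the hand-written one.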
PNG Masks
  • Binary or grayscale images where each pixel value represents a segment ID
  • Compatible with most image editing software
  • Small file size for simple segmentations
Other Formats (Coming soon!): JSON Polygons and COCO Format
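
Once a PNG mask has been decoded into rows of pixel values (for example with an imaging library of your choice), the pixel values can be read as segment IDs. The sketch below computes the pixel count and bounding box per segment in pure Python. It assumes that the value 0 marks background and that the mask is already available as a list of rows of integers; neither assumption is specified by the documentation above.

```python
from collections import Counter


def segment_pixel_counts(rows, background=0):
    """Count pixels per segment ID, ignoring the assumed background value."""
    counts = Counter(value for row in rows for value in row)
    counts.pop(background, None)
    return dict(counts)


def bounding_box(rows, segment_id):
    """Return (x_min, y_min, x_max, y_max) for a segment, or None if absent."""
    xs, ys = [], []
    for y, row in enumerate(rows):
        for x, value in enumerate(row):
            if value == segment_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# A toy 3x4 binary mask: 255 marks the object, 0 the background.
mask_rows = [
    [0,   0,   0, 0],
    [0, 255, 255, 0],
    [0, 255, 255, 0],
]
counts = segment_pixel_counts(mask_rows)
box = bounding_box(mask_rows, 255)
```

For a binary mask there is a single non-background value, so `counts` has one entry; for a grayscale mask with several segment IDs, the same functions report each ID separately.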