vlm-agent-1 can use several image-editing tools, including cropping, rotation, super-resolution, and de-oldify. These tools help you enhance image quality, extract specific regions, correct orientation, and restore historical photos with modern AI techniques.
For most image-editing examples, you can use the Structured Outputs API so that the response comes back as structured data with valid image URLs and transformation details.
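To make the schema argument concrete, the sketch below builds a `response_format` dictionary from a Pydantic model and prints the fields the schema marks as required. The `ImageResult` model and the `to_response_format` helper are hypothetical names used only for illustration, and the `{"type": "json_schema", "schema": ...}` shape is taken from the examples on this page rather than from a separate API reference.

```python
from pydantic import BaseModel, Field


class ImageResult(BaseModel):
    """Illustrative model only; not part of the API."""
    url: str = Field(..., description="URL of the edited image")


def to_response_format(model: type[BaseModel]) -> dict:
    """Wrap a Pydantic model's JSON schema in the response_format shape used on this page."""
    return {"type": "json_schema", "schema": model.model_json_schema()}


response_format = to_response_format(ImageResult)
# The generated schema lists every field declared without a default as required.
print(response_format["schema"]["required"])
```

Defining the schema once as a Pydantic model means the same class can both describe the expected output and validate the returned JSON.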

1. Image Cropping

Extract specific regions or focus on particular subjects within an image.
Crop the clock to tell the time more clearly.

Example of intelligent cropping to focus on the clock.

import openai
from pydantic import BaseModel, Field

class ImageCropResponse(BaseModel):
    url: str = Field(..., description="URL of the cropped image")
    label: str = Field(..., description="Object label of the cropped image")
    xywh: tuple[float, float, float, float] = Field(..., description="Bounding box of the cropped object in normalized coordinates (x, y, width, height)")

# Initialize the client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Crop image to focus on main subject
response = client.chat.completions.create(
    model="vlm-agent-1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Crop the clock to tell the time more clearly"},
                {"type": "image_url", "image_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.agent/clock.jpg", "detail": "auto"}},
            ],
        }
    ],
    response_format={"type": "json_schema", "schema": ImageCropResponse.model_json_schema()}
)

# Print the raw JSON response
print(response.choices[0].message.content)
# {"url": "https://.../cropped.jpg", "label": "clock", "xywh": [0.2, 0.2, 0.6, 0.6]}

# Validate the response against the schema
crop = ImageCropResponse.model_validate_json(response.choices[0].message.content)
print(repr(crop))
# ImageCropResponse(url='https://.../cropped.jpg', label='clock', xywh=(0.2, 0.2, 0.6, 0.6))
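Because the `xywh` box is normalized, applying it to the original image means scaling it by the image size. The following is a minimal sketch, assuming `x` and `y` mark the top-left corner of the box (the schema description does not say whether they denote the corner or the center). The helper name `xywh_to_pixel_box` is hypothetical.

```python
def xywh_to_pixel_box(xywh, image_width, image_height):
    """Convert a normalized (x, y, width, height) box to pixel (left, top, right, bottom).

    Assumes x and y refer to the top-left corner of the box.
    """
    x, y, w, h = xywh
    left = round(x * image_width)
    top = round(y * image_height)
    right = round((x + w) * image_width)
    bottom = round((y + h) * image_height)
    # Clamp to the image bounds in case the box slightly overshoots the edges.
    return (
        max(0, left),
        max(0, top),
        min(image_width, right),
        min(image_height, bottom),
    )


print(xywh_to_pixel_box((0.2, 0.2, 0.6, 0.6), 1000, 500))
# (200, 100, 800, 400)
```

The returned tuple follows the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so it can be used to reproduce the crop locally.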
This example produces a single crop, but multiple regions can also be cropped in one request.
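One way to ask for several regions at once is to wrap the single-crop schema in a list. This is only a sketch: the `ImageCropsResponse` wrapper and its `crops` field are hypothetical names, and this page does not document the exact schema the agent expects for multi-region output.

```python
from pydantic import BaseModel, Field


class ImageCropResponse(BaseModel):
    url: str = Field(..., description="URL of the cropped image")
    label: str = Field(..., description="Object label of the cropped image")
    xywh: tuple[float, float, float, float] = Field(
        ..., description="Normalized bounding box (x, y, width, height)"
    )


class ImageCropsResponse(BaseModel):
    """Hypothetical wrapper holding one entry per cropped region."""
    crops: list[ImageCropResponse] = Field(..., description="One entry per cropped region")


# Validate a hand-written payload shaped like the wrapper (not real API output).
sample = '{"crops": [{"url": "https://example.com/a.jpg", "label": "clock", "xywh": [0.2, 0.2, 0.6, 0.6]}]}'
parsed = ImageCropsResponse.model_validate_json(sample)
print(parsed.crops[0].label, parsed.crops[0].xywh)
```

Under this assumption, `ImageCropsResponse.model_json_schema()` would be passed as the `schema` value in `response_format`, with a prompt that names each region to crop.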

2. Image Rotation

Correct image orientation or apply creative rotations for better composition.
Rotate the image 90 degrees clockwise to correct the orientation.
Original image before rotation
Image after 90-degree rotation in the clockwise direction

Example of image rotation by 90 degrees clockwise.

import openai
from pydantic import BaseModel, Field

class ImageRotationResponse(BaseModel):
    url: str = Field(..., description="URL of the rotated image")
    angle: int = Field(..., description="Rotation angle applied (0, 90, 180, 270) degrees in the clockwise direction")


# Initialize the client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Rotate image to correct orientation
response = client.chat.completions.create(
    model="vlm-agent-1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Rotate this image 90 degrees clockwise"},
                {"type": "image_url", "image_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.object-detection/cats.jpg"}},
            ],
        }
    ],
    response_format={"type": "json_schema", "schema": ImageRotationResponse.model_json_schema()}
)

# Print the raw JSON response (fields match ImageRotationResponse)
print(response.choices[0].message.content)
# {"url": "https://.../rotated.jpg", "angle": 90}

# Validate the response against the schema
rotation = ImageRotationResponse.model_validate_json(response.choices[0].message.content)
print(repr(rotation))
# ImageRotationResponse(url='https://.../rotated.jpg', angle=90)
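A quick local check on a rotation result is that the output dimensions agree with the reported angle: rotations of 90 and 270 degrees swap width and height, while 0 and 180 degrees keep them. The sketch below uses a hypothetical helper, `rotated_size`, and assumes the angle is one of the four values listed in the schema description.

```python
def rotated_size(width: int, height: int, angle: int) -> tuple[int, int]:
    """Return the (width, height) of an image after a clockwise rotation by `angle` degrees."""
    angle %= 360
    if angle not in (0, 90, 180, 270):
        raise ValueError(f"unsupported rotation angle: {angle}")
    # Quarter turns swap the two dimensions; half and full turns keep them.
    return (height, width) if angle in (90, 270) else (width, height)


print(rotated_size(1920, 1080, 90))
# (1080, 1920)
```

Comparing this expected size with the size of the downloaded result is a cheap way to confirm that the rotation was actually applied.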

3. Super-Resolution Enhancement

Upscale images while maintaining quality and adding realistic details.
Enhance this image using super-resolution to increase its resolution while preserving quality.
Original low-resolution image
Enhanced high-resolution image

Example of super-resolution enhancement showing 4x upscaling with quality preservation.

import openai
from pydantic import BaseModel, Field

class SuperResolutionResponse(BaseModel):
    url: str = Field(..., description="URL of the super-resolution enhanced image")

# Initialize the client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Apply super-resolution enhancement
response = client.chat.completions.create(
    model="vlm-agent-1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Enhance this image using super-resolution to increase its resolution while preserving quality"},
                {"type": "image_url", "image_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.agent/vegetables-lo.jpg"}},
            ],
        }
    ],
    response_format={"type": "json_schema", "schema": SuperResolutionResponse.model_json_schema()}
)

# Print the raw JSON response
print(response.choices[0].message.content)
# {"url": "https://.../enhanced.jpg"}

# Validate the response against the schema
enhanced = SuperResolutionResponse.model_validate_json(response.choices[0].message.content)
print(repr(enhanced))
# SuperResolutionResponse(url='https://.../enhanced.jpg')

4. De-Oldify (Colorization)

Transform black and white or sepia images into vibrant color photos using AI.
De-oldify this image so that it's colorized and upsampled.
Original black and white image
Colorized image with realistic colors

Example of AI-powered colorization transforming a vintage black and white photo.

import openai
from pydantic import BaseModel, Field

class DeOldifyResponse(BaseModel):
    url: str = Field(..., description="URL of the colorized image")

# Initialize the client
client = openai.OpenAI(
    base_url="https://agent.vlm.run/v1/openai",
    api_key="<VLMRUN_API_KEY>"
)

# Colorize black and white image
response = client.chat.completions.create(
    model="vlm-agent-1",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "De-oldify this image so that it's colorized and upsampled"},
                {"type": "image_url", "image_url": {"url": "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.agent/lunch-skyscraper.jpg"}},
            ],
        }
    ],
    response_format={"type": "json_schema", "schema": DeOldifyResponse.model_json_schema()}
)

# Print the raw JSON response
print(response.choices[0].message.content)
# {"url": "https://.../colorized.jpg"}

# Validate the response against the schema
colorized = DeOldifyResponse.model_validate_json(response.choices[0].message.content)
print(repr(colorized))
# DeOldifyResponse(url='https://.../colorized.jpg')
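
The four examples above differ only in the prompt, the image URL, and the schema, so the request can be assembled by one small function. This is a sketch under assumptions: `build_request` is a hypothetical helper, the message layout assumes the endpoint accepts OpenAI-style content parts (a `text` part followed by an `image_url` part), and the `response_format` shape is copied from the examples on this page.

```python
def build_request(prompt: str, image_url: str, schema: dict, model: str = "vlm-agent-1") -> dict:
    """Assemble keyword arguments for client.chat.completions.create()."""
    return {
        "model": model,
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": prompt},
                    {"type": "image_url", "image_url": {"url": image_url}},
                ],
            }
        ],
        "response_format": {"type": "json_schema", "schema": schema},
    }


# A hand-written JSON schema stands in for SomeModel.model_json_schema().
url_only_schema = {
    "type": "object",
    "properties": {"url": {"type": "string"}},
    "required": ["url"],
}

kwargs = build_request(
    "De-oldify this image so that it's colorized and upsampled",
    "https://storage.googleapis.com/vlm-data-public-prod/hub/examples/image.agent/lunch-skyscraper.jpg",
    url_only_schema,
)
print(kwargs["messages"][0]["content"][0]["text"])
```

With a configured client, the call would then be `client.chat.completions.create(**kwargs)`.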

FAQ

Supported image formats
  • JPEG/JPG: Most common format with excellent compatibility
  • PNG: Lossless format with transparency support
  • TIFF: High-quality format for professional editing
  • WebP: Modern format with superior compression
  • BMP: Uncompressed bitmap format
  • Quality Preservation: Maintains original image quality in all transformations

Cropping
  • Rule of Thirds: Align subjects with intersection points for better composition
  • Aspect Ratio: Maintain consistent aspect ratios for professional results
  • Subject Focus: Keep the main subject centered or following composition rules
  • Background Removal: Remove distracting elements while preserving context

Super-resolution
  • AI-Powered: Uses advanced neural networks for realistic detail generation
  • Multiple Scales: Supports 2x, 4x, and 8x upscaling with quality preservation
  • Detail Enhancement: Intelligently adds realistic textures and patterns
  • Quality Metrics: Provides confidence scores for enhancement quality

De-oldify (colorization)
  • Historical Accuracy: Uses context-aware AI to suggest period-appropriate colors
  • Natural Colors: Generates realistic skin tones, clothing, and environmental colors
  • Confidence Scoring: Provides confidence levels for color accuracy
  • Region Analysis: Identifies and colors different regions with appropriate palettes

Try Image Tools

Experience image cropping, rotation, super-resolution, and de-oldify with live examples in our interactive chat interface