Image -> JSON
Generate a structured prediction for the given image.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
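As a minimal sketch of the header format described above, the helper below builds the header dictionary for an HTTP client. The function name `bearer_headers` is hypothetical, and the use of the standard `Authorization` header name is an assumption based on the usual Bearer scheme; this page only states the `Bearer <token>` form.

```python
def bearer_headers(token: str) -> dict[str, str]:
    """Return HTTP headers carrying the auth token in Bearer form.

    Assumption: the token is sent in the standard `Authorization` header.
    """
    token = token.strip()
    if not token:
        raise ValueError("auth token must be non-empty")
    return {"Authorization": f"Bearer {token}"}
```

The returned dictionary can be passed as the headers argument of whichever HTTP client you use.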
Body
Request to the VLM API (i.e. structured prediction).
List of base64-encoded images or image URLs.
The domain identifier (e.g. image.caption).
One of: aerospace.remote-sensing, document.ocr, document.generative, document.invoice, document.markdown, document.presentation, document.receipt, document.resume, document.utility-bill, image.caption, retail.product-catalog, retail.ecommerce-product-caption, video.tv-news, video.caption, video.commentary
Optional metadata to pass to the model.
The VLM generation config to be used for /<dtype>/generate.
Unique identifier of the request.
Date and time when the request was created (in UTC).
The URL to call when the request is completed.
1
The model to use for generating the response.
"vlm-1"
Whether to process the image in batch mode (async).
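The body fields above could be assembled roughly as follows. This is a sketch under stated assumptions: the JSON key names (`images`, `domain`, `model`, `batch`, `callback_url`, `metadata`) are hypothetical, because this page lists the field descriptions without their names, and `build_request` is an illustrative helper rather than part of any SDK. Only the domain values and the "vlm-1" model string come from this page.

```python
import base64

# Domain identifiers as listed on this page.
DOMAINS = {
    "aerospace.remote-sensing", "document.ocr", "document.generative",
    "document.invoice", "document.markdown", "document.presentation",
    "document.receipt", "document.resume", "document.utility-bill",
    "image.caption", "retail.product-catalog",
    "retail.ecommerce-product-caption", "video.tv-news", "video.caption",
    "video.commentary",
}


def build_request(image, domain, model="vlm-1", batch=False,
                  callback_url=None, metadata=None):
    """Assemble a request body as a plain dict.

    `image` is either raw bytes (base64-encoded here) or a string that is
    already a URL or a base64 payload. All key names are assumptions.
    """
    if domain not in DOMAINS:
        raise ValueError(f"unknown domain: {domain!r}")
    if isinstance(image, bytes):
        image = base64.b64encode(image).decode("ascii")
    payload = {
        "images": [image],   # list of base64 strings or URLs
        "domain": domain,
        "model": model,
        "batch": batch,      # True requests async (batch) processing
    }
    # Optional fields are only included when provided.
    if callback_url is not None:
        payload["callback_url"] = callback_url
    if metadata is not None:
        payload["metadata"] = metadata
    return payload
```

Validating the domain on the client side is a design choice in this sketch; it catches typos before a request is sent, at the cost of needing an update whenever the list of domains changes.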
Response
Base prediction response for all API responses.
The usage metrics for the request.
Unique identifier of the response.
Date and time when the request was created (in UTC).
Date and time when the response was completed (in UTC).
The response from the model.
The status of the job.
One of: enqueued, pending, running, completed, failed, paused
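A client that submits requests in batch mode will typically need to decide whether a job is still in progress. The sketch below shows one way to do that with the status values listed above. Treating `completed` and `failed` as the only terminal states is an assumption; this page does not say how `paused` jobs resume or end. The name `is_finished` is hypothetical.

```python
# Status values as listed on this page.
KNOWN_STATUSES = ("enqueued", "pending", "running",
                  "completed", "failed", "paused")

# Assumption: only these two statuses mean the job will not change again.
TERMINAL_STATUSES = frozenset({"completed", "failed"})


def is_finished(status: str) -> bool:
    """Return True when a job status is terminal, False when it may still change."""
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unexpected job status: {status!r}")
    return status in TERMINAL_STATUSES
```

A polling loop would call `is_finished` on each response and stop once it returns True, then check whether the status is `completed` before reading the model response.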