
Object detections visualized.
Usage Example
For object detection, we highly recommend using the Structured Outputs API to get the bounding boxes and confidence scores in a structured and validated data format.
The following examples can also be used for face or person detection. The response schema is identical to the object detection example.
FAQ
What objects are supported?
- Common Objects: person, car, truck, bus, bicycle, motorcycle
- Animals: dog, cat, bird, horse, cow, sheep
- Food Items: apple, banana, sandwich, pizza, donut, cake
- Electronics: laptop, phone, tv, keyboard, mouse
- Furniture: chair, table, bed, sofa, desk
- And 80+ more COCO dataset classes
What format do the bounding boxes come in?
The bounding boxes come in normalized xywh format, where x and y are the coordinates of the top-left corner of the bounding box, and w and h are its width and height. All values are between 0 and 1: x and w are normalized by the image width, and y and h are normalized by the image height.
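To draw or crop a detection, the normalized values usually need to be scaled back to pixel coordinates. The following is a minimal sketch of that conversion, not part of the API itself: the function name `xywh_to_pixels`, the `PixelBox` type, and the choice to round and clamp to the image bounds are all assumptions made for illustration.

```python
from typing import NamedTuple


class PixelBox(NamedTuple):
    """Hypothetical helper type: box corners in pixel coordinates."""
    left: int
    top: int
    right: int
    bottom: int


def xywh_to_pixels(x: float, y: float, w: float, h: float,
                   image_width: int, image_height: int) -> PixelBox:
    """Convert a normalized top-left xywh box to pixel corner coordinates."""

    def clamp(value: int, upper: int) -> int:
        # Keep the coordinate inside the image (assumed behavior).
        return max(0, min(value, upper))

    # x and w scale with the image width; y and h scale with the image height.
    left = clamp(round(x * image_width), image_width)
    top = clamp(round(y * image_height), image_height)
    right = clamp(round((x + w) * image_width), image_width)
    bottom = clamp(round((y + h) * image_height), image_height)
    return PixelBox(left, top, right, bottom)


# A box starting a quarter of the way across and halfway down a 640x480 image.
box = xywh_to_pixels(0.25, 0.5, 0.5, 0.25, image_width=640, image_height=480)
print(box)  # PixelBox(left=160, top=240, right=480, bottom=360)
```

Corner coordinates are returned rather than pixel width and height because most drawing and cropping routines expect corners. The width and height are simply `right - left` and `bottom - top`.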
What is the confidence score?
The confidence score is a value between 0 and 1 that indicates how confident the model is in a detection. Higher values mean greater confidence.
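A common way to use the confidence score is to discard detections below a threshold. The sketch below assumes each detection is a dictionary with hypothetical `label` and `confidence` keys; the actual field names depend on the response schema you request, and the 0.5 default is an arbitrary starting point rather than a recommended value.

```python
from typing import Any


def filter_by_confidence(detections: list[dict[str, Any]],
                         threshold: float = 0.5) -> list[dict[str, Any]]:
    """Keep detections at or above the threshold, most confident first."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    kept = [d for d in detections if d["confidence"] >= threshold]
    return sorted(kept, key=lambda d: d["confidence"], reverse=True)


# Illustrative data only; "label" and "confidence" are assumed field names.
detections = [
    {"label": "dog", "confidence": 0.62},
    {"label": "person", "confidence": 0.91},
    {"label": "bicycle", "confidence": 0.18},
]

for detection in filter_by_confidence(detections, threshold=0.5):
    print(detection["label"], detection["confidence"])
```

Raising the threshold yields fewer false positives at the cost of missing more true objects, so the right value depends on the application.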