COCO

class rpycocotools.COCO(annotation_path: str, image_folder_path: str) None

Create the COCO dataset object from the annotation file and the image folder.

Parameters:
  • annotation_path (str) – The path to the json annotation file.

  • image_folder_path (str) – The path to the image folder.

Raises:
  • ValueError – If the json file does not exist/cannot be read or if an error happens when deserializing and parsing it.

  • ValueError – If there is an annotation with an image id X, but no image entry has this id.
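
The second error condition can be checked before constructing the dataset object. The following is a minimal sketch using only the standard library; it assumes the usual COCO JSON layout with top-level `images` and `annotations` lists, and the helper name `find_orphan_annotations` is hypothetical, not part of rpycocotools.

```python
import json


def find_orphan_annotations(annotation_path: str) -> list[int]:
    """Return the ids of annotations whose image_id matches no image entry.

    Hypothetical helper: assumes the standard COCO layout with top-level
    "images" and "annotations" lists.
    """
    with open(annotation_path, encoding="utf-8") as f:
        data = json.load(f)

    # Collect every image id declared in the file.
    image_ids = {img["id"] for img in data.get("images", [])}

    # Keep the annotations that point at an undeclared image.
    return [
        ann["id"]
        for ann in data.get("annotations", [])
        if ann["image_id"] not in image_ids
    ]
```

An empty result suggests the file would not trigger the second ValueError, although the constructor may still reject it for other reasons.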

get_ann(ann_id: int) Annotation

Return the Annotation corresponding to the given annotation id.

Parameters:

ann_id (int) – The id of the Annotation to retrieve.

Returns:

The annotation.

Return type:

Annotation

Raises:

KeyError – If there is no entry in the dataset corresponding to ann_id.

get_anns() list[Annotation]

Return all the annotations of the dataset.

Returns:

The annotations.

Return type:

list[Annotation]

get_cat(cat_id: int) Category

Return the Category corresponding to the given category id.

Parameters:

cat_id (int) – The id of the Category to retrieve.

Returns:

The category.

Return type:

Category

Raises:

KeyError – If there is no entry in the dataset corresponding to cat_id.

get_cats() list[Category]

Return all the categories in the dataset.

Returns:

The categories.

Return type:

list[Category]
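
Since lookups go through category ids, a name-to-id mapping is often handy. A small sketch, assuming `coco` is an already loaded dataset object exposing `get_cats()` as documented above; the function name is hypothetical.

```python
def category_name_to_id(coco) -> dict[str, int]:
    """Map each category name to its id.

    `coco` is any object with a get_cats() method returning items that
    expose `id` and `name` attributes, as the Category class does.
    """
    return {cat.name: cat.id for cat in coco.get_cats()}
```

The resulting id can then be passed to `get_cat`. If two categories share a name, the later one wins in this sketch.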

get_img(img_id: int) Image

Return the Image corresponding to the given image id.

Parameters:

img_id (int) – The id of the Image to retrieve.

Returns:

The image.

Return type:

Image

Raises:

KeyError – If there is no entry in the dataset corresponding to img_id.

get_imgs() list[Image]

Return all the image entries in the dataset.

Returns:

The images.

Return type:

list[Image]

get_img_anns(img_id: int) list[Annotation]

Return the annotations for the given image id.

Parameters:

img_id (int) – The id of the Image whose annotations should be retrieved.

Returns:

The annotations for the image.

Return type:

list[Annotation]

Raises:

KeyError – If there is no entry in the dataset corresponding to img_id.
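
Because an unknown id raises KeyError, callers iterating over externally supplied ids may want to handle that case explicitly. A minimal sketch using only the documented `get_img_anns` method; the function name and the choice of `None` as a marker for unknown ids are assumptions made for illustration.

```python
from typing import Optional


def count_anns_per_image(coco, img_ids: list[int]) -> dict[int, Optional[int]]:
    """Count annotations for each requested image id.

    Ids that are not in the dataset map to None instead of raising.
    """
    counts: dict[int, Optional[int]] = {}
    for img_id in img_ids:
        try:
            counts[img_id] = len(coco.get_img_anns(img_id))
        except KeyError:
            # The dataset has no image entry with this id.
            counts[img_id] = None
    return counts
```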

visualize_img(img_id: int) None

Visualize an image and its annotations.

Parameters:

img_id (int) – The id of the Image whose annotations should be visualized.

Raises:

ValueError – If the image cannot be drawn (potentially due to it not being in the dataset) or cannot be displayed.

draw_anns(img_id: int, draw_bboxes: bool) npt.NDArray[np.uint8]

Draw the annotations on the image and return it as an RGB numpy array.

Parameters:
  • img_id (int) – The id of the Image whose annotations should be visualized.

  • draw_bboxes (bool) – Whether to display bounding boxes or not (if False, only the masks will be drawn).

Raises:

ValueError – If the image cannot be drawn (potentially due to it not being in the dataset) or cannot be displayed.

json() str

Return the dataset as a json string.

Returns:

The dataset as a json string.

Return type:

str
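
The returned string can be parsed with the standard `json` module. The sketch below counts the entries per section; it assumes the string follows the usual COCO layout with top-level `images`, `annotations` and `categories` lists, and the function name is hypothetical.

```python
import json


def summarize_dataset_json(dataset_json: str) -> dict[str, int]:
    """Count the entries in each top-level COCO section of a json string."""
    data = json.loads(dataset_json)
    # Missing sections are reported as empty rather than raising.
    return {
        key: len(data.get(key, []))
        for key in ("images", "annotations", "categories")
    }
```

With a loaded dataset this would be called as `summarize_dataset_json(coco.json())`.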

__len__() int

Return the number of images in the dataset.

Returns:

The number of images in the dataset.

Return type:

int

class rpycocotools.anns.Annotation(id: int, image_id: int, category_id: int, segmentation: Polygons | PolygonsRS | RLE | COCO_RLE, area: float, bbox: BBox, iscrowd: int) None

Create an annotation used for object detection tasks.

Each object instance annotation contains a series of fields, including the category id and segmentation mask of the object. In [the original COCO dataset](https://cocodataset.org/#home), the segmentation format depends on whether the instance represents a single object (iscrowd=0, in which case polygons are used) or a collection of objects (iscrowd=1, in which case RLE is used). Note that a single object (iscrowd=0) may require multiple polygons, for example if occluded. Crowd annotations (iscrowd=1) are used to label large groups of objects (e.g. a crowd of people). In addition, an enclosing bounding box is provided for each object (box coordinates are measured from the top left image corner and are 0-indexed). Finally, the categories field of the annotation structure stores the mapping of category id to category and supercategory names.

Parameters:
  • id (int) – The id of the annotation.

  • image_id (int) – The id of the image corresponding to this annotation.

  • category_id (int) – The id of the category corresponding to this annotation.

  • segmentation (Polygons | PolygonsRS | RLE | COCO_RLE) – The segmentation data for the annotation, which can be of type Polygons, PolygonsRS, RLE or COCO_RLE.

  • area (float) – The area of the annotation bounding box.

  • bbox (BBox) – The bounding box of the annotation.

  • iscrowd (int) – The iscrowd flag for the annotation, which indicates if the annotation represents a group of objects or not.

class rpycocotools.anns.Category(id: int, name: str, supercategory: str) None

Create a category used for COCO object detection tasks.

Parameters:
  • id (int) – The id of the category.

  • name (str) – The name of the category.

  • supercategory (str) – The supercategory of the category.

id

The id of the category.

Type:

int

name

The name of the category.

Type:

str

supercategory

The supercategory of the category.

Type:

str

class rpycocotools.anns.BBox(left: float, top: float, width: float, height: float) None

A bounding box used for object detection tasks.

Parameters:
  • left (float) – The top-left x coordinate of the bounding box.

  • top (float) – The top-left y coordinate of the bounding box.

  • width (float) – The width of the bounding box.

  • height (float) – The height of the bounding box.

left

The top-left x coordinate of the bounding box.

Type:

float

top

The top-left y coordinate of the bounding box.

Type:

float

width

The width of the bounding box.

Type:

float

height

The height of the bounding box.

Type:

float
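
The box is stored as a top-left corner plus a size, while many tools expect two corners. A minimal conversion sketch follows; the `Box` dataclass is a hypothetical stand-in that mirrors the four BBox fields so the example runs without the library.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Hypothetical stand-in with the same four fields as BBox."""
    left: float
    top: float
    width: float
    height: float


def to_corners(box) -> tuple[float, float, float, float]:
    """Convert (left, top, width, height) to (x_min, y_min, x_max, y_max)."""
    return (box.left, box.top, box.left + box.width, box.top + box.height)


print(to_corners(Box(left=10.0, top=20.0, width=30.0, height=40.0)))
# prints (10.0, 20.0, 40.0, 60.0)
```

`to_corners` only reads the four attributes, so it should accept any object that exposes them.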

class rpycocotools.anns.Image(id: int, width: int, height: int, file_name: str) None

A COCO image entry.

Parameters:
  • id (int) – The id of the image.

  • width (int) – The width of the image.

  • height (int) – The height of the image.

  • file_name (str) – The file name of the image.

id

The id of the image.

Type:

int

width

The width of the image.

Type:

int

height

The height of the image.

Type:

int

file_name

The file name of the image.

Type:

str

class rpycocotools.anns.PolygonsRS(size: list[int], counts: list[list[float]]) None

Polygon(s) representing a segmentation mask. A segmentation mask might require multiple polygons if the mask is in multiple parts (for example, in case of partial occlusion).

Parameters:
  • size (list[int]) – List with two elements, the height and width of the image associated with the segmentation mask.

  • counts (list[list[float]]) – Each list[float] represents an enclosed area belonging to the segmentation mask. The length of each list must be even. The value at index 2*n is the x coordinate of the nth point, and the value at index 2*n+1 is its y coordinate.

size

List with two elements, the height and width of the image associated with the segmentation mask.

Type:

list[int]

counts

The polygons that constitute the mask.

Type:

list[list[float]]
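
The interleaved x/y layout of each polygon can be unpacked with plain slicing. The sketch below turns one flat list into (x, y) pairs and computes the enclosed area with the shoelace formula; both function names are hypothetical and nothing here calls into rpycocotools.

```python
def polygon_points(flat: list[float]) -> list[tuple[float, float]]:
    """Split [x0, y0, x1, y1, ...] into [(x0, y0), (x1, y1), ...]."""
    if len(flat) % 2 != 0:
        raise ValueError("a polygon needs an even number of values")
    # Even indices hold x coordinates, odd indices hold y coordinates.
    return list(zip(flat[0::2], flat[1::2]))


def polygon_area(flat: list[float]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    pts = polygon_points(flat)
    # Pair every vertex with the next one, wrapping around to the first.
    twice_area = sum(
        x1 * y2 - x2 * y1
        for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1])
    )
    return abs(twice_area) / 2.0


rectangle = [0.0, 0.0, 4.0, 0.0, 4.0, 3.0, 0.0, 3.0]
print(polygon_points(rectangle))  # [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
print(polygon_area(rectangle))    # 12.0
```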

class rpycocotools.anns.RLE(size: list[int], counts: list[int]) None

Segmentation mask compressed as a [Run-Length Encoding](https://en.wikipedia.org/wiki/Run-length_encoding).

Parameters:
  • size (list[int]) – List with two elements, the height and width of the image corresponding to the segmentation mask.

  • counts (list[int]) – The RLE representation of the mask.

size

List with two elements, the height and width of the image corresponding to the segmentation mask.

Type:

list[int]

counts

The RLE representation of the mask.

Type:

list[int]
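
To make the encoding concrete, here is a pure-Python sketch that expands run lengths into a binary mask. It assumes the COCO convention (runs alternate between 0 and 1 starting with 0, and pixels are laid out in column-major order); whether rpycocotools exposes its own decoding function is not covered here, and `decode_rle` is a hypothetical name.

```python
def decode_rle(size: list[int], counts: list[int]) -> list[list[int]]:
    """Expand run lengths into a height x width mask of 0s and 1s.

    Assumes the COCO convention: runs start with zeros and pixels are
    stored column by column.
    """
    height, width = size
    if sum(counts) != height * width:
        raise ValueError("run lengths do not cover the whole mask")

    flat: list[int] = []
    value = 0  # the first run always counts zeros
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value

    # Column-major: the pixel at (row, col) sits at index col * height + row.
    return [
        [flat[col * height + row] for col in range(width)]
        for row in range(height)
    ]


print(decode_rle([2, 3], [1, 2, 3]))  # [[0, 1, 0], [1, 0, 0]]
```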

class rpycocotools.anns.COCO_RLE(size: list[int], counts: str) None

Segmentation mask compressed as a [Run-Length Encoding](https://en.wikipedia.org/wiki/Run-length_encoding) and then further encoded into a string.

Parameters:
  • size (list[int]) – List with two elements, the height and width of the image corresponding to the segmentation mask.

  • counts (str) – The COCO RLE representation of the mask.

size

List with two elements, the height and width of the image corresponding to the segmentation mask.

Type:

list[int]

counts

The COCO RLE representation of the mask.

Type:

str
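
For illustration, the string form can be sketched in pure Python. The code below follows the scheme used by the reference pycocotools C implementation (5 data bits per character, a continuation bit, an offset of 48, and delta coding against the value two positions back from the fourth run onward). It is an assumption that rpycocotools stores `counts` in exactly this form, and both function names are hypothetical.

```python
def encode_counts(counts: list[int]) -> str:
    """Encode run lengths into a COCO-style compressed string."""
    chars = []
    for i, count in enumerate(counts):
        # From the fourth run on, store the difference to the run two back.
        x = count - counts[i - 2] if i > 2 else count
        more = True
        while more:
            c = x & 0x1F          # low 5 bits
            x >>= 5               # arithmetic shift keeps the sign
            # Stop once the remaining bits are pure sign extension.
            more = (x != -1) if (c & 0x10) else (x != 0)
            if more:
                c |= 0x20         # continuation bit
            chars.append(chr(c + 48))
    return "".join(chars)


def decode_counts(encoded: str) -> list[int]:
    """Invert encode_counts."""
    counts: list[int] = []
    pos = 0
    while pos < len(encoded):
        x, k, more = 0, 0, True
        while more:
            c = ord(encoded[pos]) - 48
            x |= (c & 0x1F) << (5 * k)
            more = bool(c & 0x20)
            pos += 1
            k += 1
            if not more and (c & 0x10):
                x |= -1 << (5 * k)  # sign-extend negative deltas
        if len(counts) > 2:
            x += counts[-2]         # undo the delta coding
        counts.append(x)
    return counts
```

A round trip such as `decode_counts(encode_counts([5, 3, 10, 2, 100, 7]))` should give back the original list.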