rikai.types package¶

Submodules¶

rikai.types.geometry module¶

Geometry types

class rikai.types.geometry.Box2d(xmin: float, ymin: float, xmax: float, ymax: float)¶

Bases: ToNumpy, Sequence, ToDict, Drawable

2-D Bounding Box, defined by (xmin, ymin, xmax, ymax)

xmin¶

X-coordinate of the top-left point of the box.

Type: float

ymin¶

Y-coordinate of the top-left point of the box.

Type: float

xmax¶

X-coordinate of the bottom-right point of the box.

Type: float

ymax¶

Y-coordinate of the bottom-right point of the box.

Type: float

Example

>>> box = Box2d(1, 2, 3, 4)
>>> box / 2
Box2d(xmin=0.5, ymin=1.0, xmax=1.5, ymax=2.0)
>>> box * (3.5, 5)
Box2d(xmin=3.5, ymin=10.0, xmax=10.5, ymax=20.0)
>>> # Box2d can be used directly with PIL.ImageDraw
>>> draw = PIL.ImageDraw.Draw(img)
>>> draw.rectangle(box, fill="green", width=2)

property area: float¶: Area of the bounding box

classmethod from_center(center_x: float, center_y: float, width: float, height: float) → Box2d¶

Factory method to construct a Box2d from the center point coordinates: {center_x, center_y, width, height}.

Parameters

center_x (float) – X-coordinate of the center point of the box.
center_y (float) – Y-coordinate of the center point of the box.
width (float) – The width of the box.
height (float) – The height of the box.

Return type

Box2d
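The conversion behind a center-based constructor can be sketched in plain Python. This is a minimal illustration, not the library's implementation: the helper name center_to_corners is hypothetical, and it assumes the box extends half of its width and height on each side of the center.

```python
def center_to_corners(center_x, center_y, width, height):
    """Convert a center-based box to (xmin, ymin, xmax, ymax).

    Hypothetical helper for illustration; assumes the box is
    symmetric around its center point.
    """
    half_w = width / 2.0
    half_h = height / 2.0
    return (
        center_x - half_w,
        center_y - half_h,
        center_x + half_w,
        center_y + half_h,
    )


# A 2 x 2 box centered at (2, 3).
corners = center_to_corners(2.0, 3.0, 2.0, 2.0)
print(corners)  # (1.0, 2.0, 3.0, 4.0)
```

Under that assumption, the resulting tuple has the same field order as the Box2d constructor.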

classmethod from_top_left(xmin: float, ymin: float, width: float, height: float) → Box2d¶

Construct a Box2d from the top-left based coordinates: {x0, y0, width, height}.

Top-left corner of an image / bbox is (0, 0).

Several public datasets, including the COCO dataset, use this coordinate convention.

Parameters

xmin (float) – X-coordinate of the top-left point of the box.
ymin (float) – Y-coordinate of the top-left point of the box.
width (float) – The width of the box.
height (float) – The height of the box.

References

Coco Dataset
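As a rough sketch of how a COCO-style [x, y, width, height] bounding box maps to corner coordinates, consider the plain-Python helper below. The name top_left_to_corners is hypothetical and the sample numbers are made up for illustration; the library's own conversion may differ in detail.

```python
def top_left_to_corners(xmin, ymin, width, height):
    """Convert a top-left based box to (xmin, ymin, xmax, ymax).

    Hypothetical helper for illustration only.
    """
    return (xmin, ymin, xmin + width, ymin + height)


# A COCO-style bbox: [x, y, width, height] (made-up values).
coco_bbox = [10.0, 20.0, 30.0, 40.0]
corners = top_left_to_corners(*coco_bbox)
print(corners)  # (10.0, 20.0, 40.0, 60.0)
```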

property height: float¶

iou(other: Union[Box2d, List[Box2d], ndarray]) → Union[float, ndarray]¶: Compute intersection over union (IoU).

static ious(boxes1: Union[List[Box2d], ndarray], boxes2: Union[List[Box2d], ndarray]) → Optional[ndarray]¶

Compute pairwise intersection over union (IoU).

Parameters

boxes1 (list of Box2d or numpy.ndarray) – A list of Box2d of length N.
boxes2 (list of Box2d or numpy.ndarray) – A list of Box2d of length M.

Returns

Given two lists of Box2d of length N and M respectively, returns an N×M matrix in which each element is an IoU value (a float in [0, 1]). Returns None if either input is empty.

Return type

numpy.ndarray, optional

Example

>>> import random
>>> from rikai.types.geometry import Box2d
>>>
>>> def a_random_box2d():
...     x_min = random.uniform(0, 1)
...     y_min = random.uniform(0, 1)
...     x_max = random.uniform(x_min, 1)
...     y_max = random.uniform(y_min, 1)
...     return Box2d(x_min, y_min, x_max, y_max)
...
>>> list1 = [a_random_box2d() for _ in range(2)]
>>> list2 = [a_random_box2d() for _ in range(3)]
>>> Box2d.ious(list1, list2)  # 2 x 3 matrix of IoU values
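To make the N×M result concrete, here is a dependency-free sketch of pairwise IoU over (xmin, ymin, xmax, ymax) tuples. It is not the library's implementation, which works with ndarray; the helpers iou and pairwise_ious are hypothetical and return nested lists. Returning 0.0 for a zero-area union is an assumption of this sketch.

```python
def iou(a, b):
    """IoU of two boxes given as (xmin, ymin, xmax, ymax) tuples."""
    # Overlap along each axis, clamped at zero when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    # Assumption of this sketch: degenerate boxes yield 0.0.
    return inter / union if union > 0 else 0.0


def pairwise_ious(boxes1, boxes2):
    """Return an N x M nested list of IoU values, or None if an input is empty."""
    if not boxes1 or not boxes2:
        return None
    return [[iou(a, b) for b in boxes2] for a in boxes1]


boxes1 = [(0.0, 0.0, 2.0, 2.0)]
boxes2 = [(1.0, 1.0, 3.0, 3.0), (0.0, 0.0, 2.0, 2.0), (5.0, 5.0, 6.0, 6.0)]
matrix = pairwise_ious(boxes1, boxes2)
# One row (N = 1) with three columns (M = 3):
# partial overlap (1/7), identical boxes (1.0), disjoint boxes (0.0).
print(matrix)
```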

to_dict() → dict¶

to_numpy() → ndarray¶: Convert a Box2d to numpy ndarray: array([xmin, ymin, xmax, ymax])

property width: float¶

with_label(text: str, color: str = 'red') → rikai.viz.Draw¶

class rikai.types.geometry.Box3d(center: Point, length: float, width: float, height: float, heading: float)¶

Bases: ToNumpy, ToDict

A 3-D bounding box

center¶

Center Point of the bounding box

Type: Point

length¶

The x dimension of the box

Type: float

width¶

The y dimension of the box

Type: float

height¶

The z dimension of the box

Type: float

heading¶

The heading of the bounding box (in radians). The heading is the angle required to rotate +x to the surface normal of the box front face. It is normalized to [-pi, pi).

Type: float

References

Waymo Dataset Spec https://github.com/waymo-research/waymo-open-dataset/blob/master/waymo_open_dataset/label.proto
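The heading is described as normalized to [-pi, pi). One common way to wrap an arbitrary angle into that range is sketched below. The helper normalize_heading is hypothetical, and whether Box3d normalizes its input itself is not stated here.

```python
import math


def normalize_heading(theta):
    """Wrap an angle in radians into the range [-pi, pi).

    Hypothetical helper: shift by pi, take the remainder modulo 2*pi,
    then shift back.
    """
    return (theta + math.pi) % (2.0 * math.pi) - math.pi


print(normalize_heading(0.0))                # 0.0
print(normalize_heading(math.pi))            # -pi, since the range is half-open
print(normalize_heading(3.0 * math.pi / 2))  # approximately -pi/2
```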

to_dict() → dict¶

to_numpy() → ndarray¶: Returns the content as a numpy ndarray.

class rikai.types.geometry.Mask(data: Union[list, ndarray], width: Optional[int] = None, height: Optional[int] = None, mask_type: Type = Type.POLYGON)¶

Bases: ToNumpy, ToDict, Drawable

2-d Mask over an image

This 2D mask can be built from:

A binary-valued (0 or 1) 2D-numpy matrix.
Run-length encoded (RLE) data. Both row-based RLE (Mask.Type.RLE) and column-based RLE (Mask.Type.COCO_RLE), which is used in the COCO dataset, are supported.
A Polygon [x0, y0, x1, y1, ..., xn, yn]

Parameters

data (list or np.ndarray) – The mask data. Can be a numpy array or a list.
width (int, optional) – The width of the image this mask applies to.
height (int, optional) – The height of the image this mask applies to.
mask_type (Mask.Type) – The type of the mask.

Examples

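from pycocotools.coco import COCO
from rikai.types import Mask

coco = COCO("instance_train2017.json")
# ann_id is the id of an annotation in this dataset.
# loadAnns / loadImgs return lists, so take the first element.
ann = coco.loadAnns(ann_id)[0]
image = coco.loadImgs(ann["image_id"])[0]
if ann["iscrowd"] == 0:
    # Non-crowd annotations store the segmentation as polygons.
    mask = Mask.from_polygon(
        ann["segmentation"],
        width=image["width"],
        height=image["height"],
    )
else:
    # Crowd annotations store the segmentation as COCO RLE.
    mask = Mask.from_coco_rle(
        ann["segmentation"]["counts"],
        width=image["width"],
        height=image["height"],
    )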

class Type(value)¶

Bases: Enum

Mask type.

COCO_RLE = 3¶

POLYGON = 1¶

RLE = 2¶

static from_coco_rle(data: list[int], width: int, height: int) → Mask¶

Convert a COCO RLE mask (segmentation) into Mask

Parameters

data (list[int]) – the RLE data
height (int) – The height of the image which the mask applies to.
width (int) – The width of the image which the mask applies to.

static from_mask(mask: ndarray) → Mask¶

Build mask from a numpy array.

Parameters: mask (np.ndarray) – A binary-valued (0/1) numpy array

static from_polygon(data: list[list[float]], width: int, height: int) → Mask¶

Build mask from a Polygon

Parameters

data (list[list[float]]) – Segmentation data for one or more polygons, i.e., [[x0, y0, x1, y1, ...], [x0, y0, x1, y1, ...]].
width (int) – The width of the image which the mask applies to.
height (int) – The height of the image which the mask applies to.

static from_rle(data: list[int], width: int, height: int) → Mask¶

Convert a (row-based) RLE mask (segmentation) into Mask

Parameters

data (list[int]) – the RLE data
width (int) – The width of the image which the mask applies to.
height (int) – The height of the image which the mask applies to.

iou(other: Mask) → float¶

to_dict() → dict¶

to_mask() → ndarray¶: Convert this mask to a numpy array.

to_numpy() → ndarray¶: Returns the content as a numpy ndarray.

class rikai.types.geometry.Point(x: float, y: float, z: float)¶

Bases: ToNumpy, ToDict

Point in a 3-D space, specified by (x, y, z) coordinates.

x¶

The X coordinate.

Type: float

y¶

The Y coordinate.

Type: float

z¶

The Z coordinate.

Type: float

to_dict() → dict¶

to_numpy() → ndarray¶: Returns the content as a numpy ndarray.

rikai.types.rle module¶

rikai.types.rle.decode(rle: np.array, shape: Tuple[int] | Tuple[int, int], order: str = 'C') → np.ndarray¶

Decode (COCO) RLE encoding into a numpy mask.

Parameters

rle (np.array) – A 1-D array of RLE encoded data.
shape (tuple of ints) – (height, width)
order (str) – Numpy array order. For COCO-style (column-major) RLE, set order to 'F'.

rikai.types.rle.encode(arr: ndarray) → array¶

Run-length encode a matrix.

Parameters: arr (np.ndarray) – A data array or n-D matrix/tensor.
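The effect of the order argument is easiest to see on a tiny mask. The sketch below is a plain-Python illustration of run-length encoding and decoding, not the rikai.types.rle implementation. The helpers rle_encode and rle_decode are hypothetical, and the sketch assumes counts alternate between runs of 0 and runs of 1, starting with zeros, as in COCO's uncompressed RLE.

```python
def rle_encode(flat):
    """Run lengths of a flat 0/1 sequence, starting with the run of zeros."""
    counts = []
    current = 0  # assumption: the first run always counts zeros
    run = 0
    for value in flat:
        if value == current:
            run += 1
        else:
            counts.append(run)
            current = value
            run = 1
    counts.append(run)
    return counts


def rle_decode(counts, height, width, order="C"):
    """Expand run lengths into a height x width nested list."""
    flat = []
    value = 0
    for count in counts:
        flat.extend([value] * count)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not match the requested shape")
    if order == "C":  # row-major: consecutive values fill a row
        return [flat[r * width:(r + 1) * width] for r in range(height)]
    # "F", column-major: consecutive values fill a column
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


mask = [[0, 1, 1],
        [1, 0, 0]]
row_major = [v for row in mask for v in row]
col_major = [mask[r][c] for c in range(3) for r in range(2)]

print(rle_encode(row_major))  # [1, 3, 2]
print(rle_encode(col_major))  # [1, 2, 1, 1, 1]
```

The same mask produces different counts depending on the traversal order, which is why decoding has to use the order that matches the encoding.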

rikai.types.video module¶

Video related types and utils

class rikai.types.video.Segment(start_fno: int, end_fno: int)¶

Bases: object

A video segment bounded by frame numbers

class rikai.types.video.SingleFrameSampler(stream: VideoStream, sample_rate: int = 1, start_frame: int = 0, max_samples: int = -1)¶

Bases: VideoSampler

A simple sampler that just returns one out of every sample_rate frames
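The sampling pattern can be sketched as index arithmetic. This is only an illustration of "one out of every sample_rate frames": the function sample_frame_indices and its total_frames argument are hypothetical, and treating a negative max_samples as "no limit" is an assumption based on the default of -1.

```python
def sample_frame_indices(total_frames, sample_rate=1, start_frame=0, max_samples=-1):
    """Return the frame numbers a fixed-rate sampler would visit.

    Hypothetical helper; total_frames stands in for the stream length.
    """
    indices = list(range(start_frame, total_frames, sample_rate))
    if max_samples >= 0:  # assumption: a negative value means no limit
        indices = indices[:max_samples]
    return indices


print(sample_frame_indices(10, sample_rate=3, start_frame=1))                 # [1, 4, 7]
print(sample_frame_indices(10, sample_rate=3, start_frame=1, max_samples=2))  # [1, 4]
```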

class rikai.types.video.VideoSampler(stream: VideoStream)¶

Bases: ABC

Subclasses will implement different ways to retrieve samples from a given VideoStream.

class rikai.types.video.VideoStream(uri: str)¶

Bases: Displayable, ToDict

Represents a particular video stream at a given uri

display(width: Optional[int] = None, height: Optional[int] = None, **kwargs)¶

Customize visualization in jupyter notebook

Parameters

width (int, default None) – Width in pixels. Defaults to the original video width
height (int, default None) – Height in pixels. Defaults to the original video height
kwargs (dict) – See IPython.display.Video doc for other kwargs

Returns

v

Return type

IPython.display.Video

to_dict() → dict¶

class rikai.types.video.YouTubeVideo(vid: str)¶

Bases: Displayable

Represents a YouTube video, the basis of many open-source video datasets. This class uses the IPython display library to integrate with Jupyter notebook display, and uses youtube-dl to download the video and create a VideoStream instance that represents a particular video stream file object.

display(width: int = 400, height: int = 300, **kwargs)¶

Visualization in jupyter notebook with custom options

Parameters

width (int, default 400) – Width in pixels
height (int, default 300) – Height in pixels
kwargs (dict) – See IPython.display.YouTubeVideo for other kwargs

Returns

v

Return type

IPython.display.YouTubeVideo

get_stream(ext: str = 'mp4', quality: str = 'worst') → VideoStream¶

Get a reference to a particular stream

Parameters

ext (str, default 'mp4') – The preferred extension type to get. One of [‘ogg’, ‘m4a’, ‘mp4’, ‘flv’, ‘webm’, ‘3gp’]. See: https://pythonhosted.org/Pafy/#Pafy.Stream.extension
quality (str, default 'worst') – Either ‘worst’ (lowest bitrate) or ‘best’ (highest bitrate). See: https://pythonhosted.org/Pafy/index.html#Pafy.Pafy.getbest

Returns

v – VideoStream referencing an actual video resource

Return type

VideoStream

rikai.types.vision module¶

Vision Related User-defined Types:

Image

class rikai.types.vision.Image(image: Union[bytes, bytearray, IOBase, str, Path])¶

Bases: ToNumpy, ToPIL, Asset, Displayable, ToDict

An external Image Asset.

It contains a reference URI to an image stored on the remote system.

Parameters: image (bytes, file-like object, str or Path) – It can be the content of image, or a URI / Path of an image.

crop(box: Union[Box2d, List[Box2d]], format: Optional[str] = None) → Union[Image, List[Image]]¶

Crop the image to the given bounding box(es) and return the cropped image(s).

Supports cropping in batch, which saves the I/O overhead of downloading the original image for each crop.

Parameters

box (Box2d or List[Box2d]) – The bounding box(es) to crop out of this image.
format (str, optional) – The image format to save as

Return type

Image or a list of Image

display(**kwargs)¶

Custom visualizer for this image in jupyter notebook

Parameters: kwargs (dict) – Optional display arguments
Returns: img
Return type: IPython.display.Image

draw(drawable: Union[Drawable, list[Drawable], Draw]) → Draw¶

classmethod from_array(array: ndarray, uri: Optional[Union[str, Path]] = None, mode: Optional[str] = None, format: Optional[str] = None, **kwargs) → Image¶

Create an image in memory from numpy array.

Parameters

array (np.ndarray) – Array data
uri (str or Path) – The external URI to store the data.
mode (str, optional) – The mode which PIL used to create image. See supported modes on PIL document.
format (str, optional) – The image format to save as. See supported formats for details.
kwargs (dict, optional) – Optional arguments to pass to PIL.Image.save.

Module contents¶

Semantic types
