rikai.types package

Submodules

rikai.types.geometry module

Geometry types

class rikai.types.geometry.Box2d(xmin: float, ymin: float, xmax: float, ymax: float)

Bases: ToNumpy, Sequence, ToDict, Drawable

2-D Bounding Box, defined by (xmin, ymin, xmax, ymax)

xmin

X-coordinate of the top-left point of the box.

Type

float

ymin

Y-coordinate of the top-left point of the box.

Type

float

xmax

X-coordinate of the bottom-right point of the box.

Type

float

ymax

Y-coordinate of the bottom-right point of the box.

Type

float

Example

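>>> import PIL.Image
>>> import PIL.ImageDraw
>>> from rikai.types.geometry import Box2d
>>> box = Box2d(1, 2, 3, 4)
>>> box / 2
Box2d(xmin=0.5, ymin=1.0, xmax=1.5, ymax=2.0)
>>> box * (3.5, 5)
Box2d(xmin=3.5, ymin=10.0, xmax=10.5, ymax=20.0)
>>> # Box2d can be used directly with PIL.ImageDraw
>>> img = PIL.Image.new("RGB", (32, 32))  # any PIL image works here
>>> draw = PIL.ImageDraw.Draw(img)
>>> draw.rectangle(box, fill="green", width=2)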
property area: float

Area of the bounding box

classmethod from_center(center_x: float, center_y: float, width: float, height: float) Box2d

Factory method to construct a Box2d from the center point coordinates: {center_x, center_y, width, height}.

Parameters
  • center_x (float) – X-coordinate of the center point of the box.

  • center_y (float) – Y-coordinate of the center point of the box.

  • width (float) – The width of the box.

  • height (float) – The height of the box.

Return type

Box2d
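
The center-based form maps to corner coordinates by subtracting and adding half of the width and height. The following is a minimal stand-alone sketch of that arithmetic; the helper name center_to_corners is hypothetical, and this is not necessarily how rikai implements from_center.

```python
from typing import Tuple


def center_to_corners(
    center_x: float, center_y: float, width: float, height: float
) -> Tuple[float, float, float, float]:
    """Return (xmin, ymin, xmax, ymax) for a box given by its center and size."""
    half_w = width / 2
    half_h = height / 2
    return (
        center_x - half_w,
        center_y - half_h,
        center_x + half_w,
        center_y + half_h,
    )


# A 4 x 6 box centered at (10, 20).
print(center_to_corners(10.0, 20.0, 4.0, 6.0))  # (8.0, 17.0, 12.0, 23.0)
```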

classmethod from_top_left(xmin: float, ymin: float, width: float, height: float) Box2d

Construct a Box2d from the top-left based coordinates: {x0, y0, width, height}.

Top-left corner of an image / bbox is (0, 0).

Several public datasets, including the COCO dataset, use this coordinate convention.

Parameters
  • xmin (float) – X-coordinate of the top-left point of the box.

  • ymin (float) – Y-coordinate of the top-left point of the box.

  • width (float) – The width of the box.

  • height (float) – The height of the box.
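
As an illustration of the top-left convention, the sketch below converts a [xmin, ymin, width, height] bounding box into corner coordinates. It is a stand-alone example with a hypothetical helper name (top_left_to_corners) and made-up values, not rikai's own code.

```python
from typing import Sequence, Tuple


def top_left_to_corners(bbox: Sequence[float]) -> Tuple[float, float, float, float]:
    """Convert [xmin, ymin, width, height] to (xmin, ymin, xmax, ymax)."""
    xmin, ymin, width, height = bbox
    return (xmin, ymin, xmin + width, ymin + height)


# Made-up values laid out as [x, y, width, height].
bbox = [10.0, 20.0, 30.0, 40.0]
print(top_left_to_corners(bbox))  # (10.0, 20.0, 40.0, 60.0)
```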

property height: float
iou(other: Union[Box2d, List[Box2d], ndarray]) Union[float, ndarray]

Compute intersection over union (IoU).
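
For reference, IoU is the area of the overlap between two boxes divided by the area of their union. The sketch below computes it for plain (xmin, ymin, xmax, ymax) tuples; the function name and the choice to return 0.0 for a zero-area union are assumptions of this example, not a description of rikai's implementation.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Width and height of the overlap, clamped to 0 when the boxes are disjoint.
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Two 2 x 2 boxes overlapping in a 1 x 1 square: 1 / (4 + 4 - 1) = 1/7.
print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```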

static ious(boxes1: Union[List[Box2d], ndarray], boxes2: Union[List[Box2d], ndarray]) Optional[ndarray]

Compute intersection over union (IoU).

Parameters
  • boxes1 (List[Box2d] or ndarray) – The first list of boxes, of length N.

  • boxes2 (List[Box2d] or ndarray) – The second list of boxes, of length M.

Returns

An N × M matrix in which element (i, j) is the IoU (a float in [0, 1]) of boxes1[i] and boxes2[j]. Returns None if either input is empty.

Return type

numpy.ndarray, optional

Example

>>> import random
>>> from rikai.types.geometry import Box2d
>>>
>>> def a_random_box2d():
...     x_min = random.uniform(0, 1)
...     y_min = random.uniform(0, 1)
...     x_max = random.uniform(x_min, 1)
...     y_max = random.uniform(y_min, 1)
...     return Box2d(x_min, y_min, x_max, y_max)
...
>>> list1 = [a_random_box2d() for _ in range(2)]
>>> list2 = [a_random_box2d() for _ in range(3)]
>>>
>>> Box2d.ious(list1, list2)  # a 2 x 3 matrix of IoU values
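
The shape of the result can be illustrated without rikai. The sketch below builds the N × M matrix as nested lists from plain (xmin, ymin, xmax, ymax) tuples and returns None for an empty input, mirroring the behavior described above. The names pair_iou and pairwise_ious are hypothetical, and the real method may well be vectorized with numpy.

```python
from typing import List, Optional, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def pair_iou(a: Box, b: Box) -> float:
    """IoU of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def pairwise_ious(
    boxes1: Sequence[Box], boxes2: Sequence[Box]
) -> Optional[List[List[float]]]:
    """Return an N x M matrix of IoU values, or None if either input is empty."""
    if len(boxes1) == 0 or len(boxes2) == 0:
        return None
    # Row i holds the IoU of boxes1[i] against every box in boxes2.
    return [[pair_iou(a, b) for b in boxes2] for a in boxes1]


matrix = pairwise_ious(
    [(0, 0, 2, 2), (0, 0, 1, 1)],
    [(0, 0, 2, 2), (1, 1, 3, 3), (5, 5, 6, 6)],
)
print(len(matrix), len(matrix[0]))  # 2 3
print(matrix[1][0])  # 0.25
```
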
to_dict() dict
to_numpy() ndarray

Convert a Box2d to numpy ndarray: array([xmin, ymin, xmax, ymax])

property width: float
with_label(text: str, color: str = 'red') rikai.viz.Draw
class rikai.types.geometry.Box3d(center: Point, length: float, width: float, height: float, heading: float)

Bases: ToNumpy, ToDict

A 3-D bounding box

center

Center Point of the bounding box

Type

Point

length

The x dimension of the box

Type

float

width

The y dimension of the box

Type

float

height

The z dimension of the box

Type

float

heading

The heading of the bounding box (in radians). The heading is the angle required to rotate +x to the surface normal of the box front face. It is normalized to [-pi, pi).

Type

float
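
Wrapping an arbitrary angle into the half-open interval [-pi, pi) can be done with a single modulo operation. This is a generic sketch with a hypothetical helper name (normalize_heading); it does not claim that Box3d performs the normalization itself.

```python
import math


def normalize_heading(angle: float) -> float:
    """Wrap an angle in radians into the half-open interval [-pi, pi)."""
    # Shift by pi, wrap into [0, 2*pi), then shift back.
    return (angle + math.pi) % (2 * math.pi) - math.pi


print(normalize_heading(3 * math.pi / 2))  # approximately -pi/2
print(normalize_heading(math.pi))          # -pi, because pi itself is excluded
```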

to_dict() dict
to_numpy() ndarray

Returns the content as a numpy ndarray.

class rikai.types.geometry.Mask(data: Union[list, ndarray], width: Optional[int] = None, height: Optional[int] = None, mask_type: Type = Type.POLYGON)

Bases: ToNumpy, ToDict, Drawable

2-d Mask over an image

This 2D mask can be built from:

  • A binary-valued (0 or 1) 2D-numpy matrix.

  • Run-length encoded (RLE) data. Both row-based RLE (Mask.Type.RLE) and column-based RLE (Mask.Type.COCO_RLE), which is used in the COCO dataset, are supported.

  • A Polygon [x0, y0, x1, y1, ..., xn, yn]

Parameters
  • data (list or np.ndarray) – The mask data. Can be a numpy array or a list.

  • width (int, optional) – The width of the image this mask applies to.

  • height (int, optional) – The height of the image this mask applies to.

  • mask_type (Mask.Type) – The type of the mask.

Examples

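from pycocotools.coco import COCO
from rikai.types import Mask

coco = COCO("instances_train2017.json")
# loadAnns / loadImgs return lists, so take the first element.
# ann_id is an annotation id from this dataset.
ann = coco.loadAnns(ann_id)[0]
image = coco.loadImgs(ann["image_id"])[0]
if ann["iscrowd"] == 0:
    # Non-crowd annotations store the segmentation as polygons.
    mask = Mask.from_polygon(
        ann["segmentation"],
        width=image["width"],
        height=image["height"],
    )
else:
    # Crowd annotations store the segmentation as COCO RLE.
    mask = Mask.from_coco_rle(
        ann["segmentation"]["counts"],
        width=image["width"],
        height=image["height"],
    )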
class Type(value)

Bases: Enum

Mask type.

COCO_RLE = 3
POLYGON = 1
RLE = 2
static from_coco_rle(data: list[int], width: int, height: int) Mask

Convert a COCO RLE mask (segmentation) into Mask

Parameters
  • data (list[int]) – The RLE data.

  • height (int) – The height of the image which the mask applies to.

  • width (int) – The width of the image which the mask applies to.

static from_mask(mask: ndarray) Mask

Build mask from a numpy array.

Parameters

mask (np.ndarray) – A binary-valued (0/1) numpy array

static from_polygon(data: list[list[float]], width: int, height: int) Mask

Build mask from a Polygon

Parameters
  • data (list[list[float]]) – Segmentation data for one or more polygons, i.e., [[x0, y0, x1, y1, ...], [x0, y0, x1, y1, ...]].

  • width (int) – The width of the image which the mask applies to.

  • height (int) – The height of the image which the mask applies to.

static from_rle(data: list[int], width: int, height: int) Mask

Convert a (row-based) RLE mask (segmentation) into Mask

Parameters
  • data (list[int]) – The RLE data.

  • width (int) – The width of the image which the mask applies to.

  • height (int) – The height of the image which the mask applies to.

iou(other: Mask) float
to_dict() dict
to_mask() ndarray

Convert this mask to a numpy array.

to_numpy() ndarray

Returns the content as a numpy ndarray.

class rikai.types.geometry.Point(x: float, y: float, z: float)

Bases: ToNumpy, ToDict

Point in a 3-D space, specified by (x, y, z) coordinates.

x

The X coordinate.

Type

float

y

The Y coordinate.

Type

float

z

The Z coordinate.

Type

float

to_dict() dict
to_numpy() ndarray

Returns the content as a numpy ndarray.

rikai.types.rle module

rikai.types.rle.decode(rle: np.array, shape: Tuple[int] | Tuple[int, int], order: str = 'C') np.ndarray

Decode (COCO) RLE encoding into a numpy mask.

Parameters
  • rle (np.array) – A 1-D array of RLE encoded data.

  • shape (tuple of ints) – (height, width)

  • order (str) – NumPy array order. For COCO-style RLE, set order to 'F'.
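
The effect of order is easiest to see on a small grid. The pure-Python sketch below expands run lengths into a height × width grid, filling rows first for 'C' and columns first for 'F'. It assumes runs alternate between 0s and 1s starting with 0s; the name decode_rle is hypothetical, and this is an illustration rather than the library's implementation.

```python
from typing import List, Sequence, Tuple


def decode_rle(
    counts: Sequence[int], shape: Tuple[int, int], order: str = "C"
) -> List[List[int]]:
    """Expand run lengths into a height x width grid of 0/1 values.

    Assumes runs alternate between 0s and 1s, starting with 0s.
    """
    height, width = shape
    flat: List[int] = []
    value = 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not match the requested shape")
    if order == "C":  # row-major: consecutive values fill a row
        return [flat[r * width:(r + 1) * width] for r in range(height)]
    if order == "F":  # column-major: consecutive values fill a column
        return [[flat[c * height + r] for c in range(width)] for r in range(height)]
    raise ValueError("order must be 'C' or 'F'")


counts = [2, 3, 1]  # two 0s, three 1s, one 0
print(decode_rle(counts, (2, 3), order="C"))  # [[0, 0, 1], [1, 1, 0]]
print(decode_rle(counts, (2, 3), order="F"))  # [[0, 1, 1], [0, 1, 0]]
```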

rikai.types.rle.encode(arr: ndarray) array

Run-length encode a matrix.

Parameters

arr (ndarray) – A data array or n-D matrix/tensor.
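
As a rough illustration of run-length encoding, the sketch below encodes a flattened 0/1 grid into alternating run lengths. The convention that runs start with 0s (so input beginning with 1 gets a leading run of 0) is an assumption borrowed from COCO-style RLE, and encode_rle is a hypothetical name, not the library function.

```python
from typing import List, Sequence


def encode_rle(values: Sequence[int]) -> List[int]:
    """Run-length encode a flat sequence of 0/1 values.

    Assumed convention: runs alternate starting with 0s, so a sequence
    that begins with 1 gets a leading run of length 0.
    """
    counts: List[int] = []
    current = 0
    run = 0
    for v in values:
        if v == current:
            run += 1
        else:
            counts.append(run)  # close the previous run
            current = v
            run = 1
    if values:
        counts.append(run)  # close the final run
    return counts


grid = [[0, 0, 1], [1, 1, 0]]
flat = [v for row in grid for v in row]  # row-major flattening
print(encode_rle(flat))       # [2, 3, 1]
print(encode_rle([1, 1, 0]))  # [0, 2, 1]
```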

rikai.types.video module

Video related types and utils

class rikai.types.video.Segment(start_fno: int, end_fno: int)

Bases: object

A video segment bounded by frame numbers

class rikai.types.video.SingleFrameSampler(stream: VideoStream, sample_rate: int = 1, start_frame: int = 0, max_samples: int = -1)

Bases: VideoSampler

A simple sampler that just returns one out of every sample_rate frames
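
The sampling pattern can be sketched as a generator over frame numbers. This stand-alone example does not use VideoStream; the total_frames parameter and the function name are hypothetical, and treating a negative max_samples as "no limit" is an assumption based on the -1 default in the signature.

```python
from typing import Iterator


def sample_frame_numbers(
    total_frames: int,
    sample_rate: int = 1,
    start_frame: int = 0,
    max_samples: int = -1,
) -> Iterator[int]:
    """Yield every sample_rate-th frame number, starting at start_frame."""
    emitted = 0
    for fno in range(start_frame, total_frames, sample_rate):
        # A negative max_samples never triggers this check (assumed: no limit).
        if 0 <= max_samples <= emitted:
            break
        yield fno
        emitted += 1


print(list(sample_frame_numbers(10, sample_rate=3)))  # [0, 3, 6, 9]
print(list(sample_frame_numbers(10, sample_rate=2, start_frame=1, max_samples=3)))  # [1, 3, 5]
```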

class rikai.types.video.VideoSampler(stream: VideoStream)

Bases: ABC

Subclasses will implement different ways to retrieve samples from a given VideoStream.

class rikai.types.video.VideoStream(uri: str)

Bases: Displayable, ToDict

Represents a particular video stream at a given uri

display(width: Optional[int] = None, height: Optional[int] = None, **kwargs)

Customize visualization in jupyter notebook

Parameters
  • width (int, default None) – Width in pixels. Defaults to the original video width

  • height (int, default None) – Height in pixels. Defaults to the original video height

  • kwargs (dict) – See IPython.display.Video doc for other kwargs

Returns

v

Return type

IPython.display.Video

to_dict() dict
class rikai.types.video.YouTubeVideo(vid: str)

Bases: Displayable

Represents a YouTube video, the basis of many open-source video datasets. This class uses the IPython display library to integrate with Jupyter notebook display, and uses youtube-dl to download the video and create a VideoStream instance that represents a particular video stream file object.

display(width: int = 400, height: int = 300, **kwargs)

Visualization in jupyter notebook with custom options

Parameters
  • width (int, default 400) – Width in pixels

  • height (int, default 300) – Height in pixels

  • kwargs (dict) – See IPython.display.YouTubeVideo for other kwargs

Returns

v

Return type

IPython.display.YouTubeVideo

get_stream(ext: str = 'mp4', quality: str = 'worst') VideoStream

Get a reference to a particular stream

Parameters
Returns

v – VideoStream referencing an actual video resource

Return type

VideoStream

rikai.types.vision module

Vision Related User-defined Types:

class rikai.types.vision.Image(image: Union[bytes, bytearray, IOBase, str, Path])

Bases: ToNumpy, ToPIL, Asset, Displayable, ToDict

An external Image Asset.

It contains a reference URI to an image stored on the remote system.

Parameters

image (bytes, file-like object, str or Path) – It can be the content of image, or a URI / Path of an image.

crop(box: Union[Box2d, List[Box2d]], format: Optional[str] = None) Union[Image, List[Image]]

Crop the image to the given bounding box(es) and return the cropped image(s).

Cropping in batch is supported, which saves the I/O overhead of downloading the original image more than once.

Parameters
  • box (Box2d or List[Box2d]) – The bounding box(es) to crop out of this image.

  • format (str, optional) – The image format to save as

Return type

Image or a list of Image
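
The benefit of batch cropping is that the pixels are loaded once and every box is cut from the same in-memory data. The sketch below shows the idea on a plain grid of pixel values; crop_many is a hypothetical helper, and treating xmax and ymax as exclusive integer pixel bounds is an assumption of this example, not a statement about Image.crop.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (xmin, ymin, xmax, ymax) in pixels
Grid = List[List[int]]


def crop_many(pixels: Grid, boxes: Sequence[Box]) -> List[Grid]:
    """Cut every box out of one already-loaded grid of pixel rows."""
    crops = []
    for xmin, ymin, xmax, ymax in boxes:
        # Rows are indexed by y and columns by x; xmax/ymax are exclusive here.
        crops.append([row[xmin:xmax] for row in pixels[ymin:ymax]])
    return crops


# 3 rows x 4 columns; each value encodes its position as 10 * y + x.
pixels = [[10 * y + x for x in range(4)] for y in range(3)]
crops = crop_many(pixels, [(0, 0, 2, 2), (1, 1, 4, 3)])
print(crops[0])  # [[0, 1], [10, 11]]
print(crops[1])  # [[11, 12, 13], [21, 22, 23]]
```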

display(**kwargs)

Custom visualizer for this image in jupyter notebook

Parameters

kwargs (dict) – Optional display arguments

Returns

img

Return type

IPython.display.Image

draw(drawable: Union[Drawable, list[Drawable], Draw]) Draw
classmethod from_array(array: ndarray, uri: Optional[Union[str, Path]] = None, mode: Optional[str] = None, format: Optional[str] = None, **kwargs) Image

Create an image in memory from numpy array.

Parameters
  • array (np.ndarray) – Array data

  • uri (str or Path) – The external URI to store the data.

  • mode (str, optional) – The mode PIL uses to create the image. See the supported modes in the PIL documentation.

  • format (str, optional) – The image format to save as. See supported formats for details.

  • kwargs (dict, optional) – Optional arguments to pass to PIL.Image.save.

See also

PIL.Image.fromarray, numpy_to_image()

static from_pil(img: PILImage, uri: Optional[Union[str, Path]] = None, format: Optional[str] = None, **kwargs) Image

Create an image in memory from a PIL.Image.

Parameters
  • img (PIL.Image) – A PIL Image instance

  • uri (str or Path) – The URI to store the image externally.

  • format (str, optional) –

    The image format to save as. See supported formats for details.

  • kwargs (dict, optional) –

    Optional arguments to pass to PIL.Image.save.

static read(uri: Union[str, Path]) Image

Create an embedded image from external URI

Parameters

uri (str or Path) – The URI pointing to an image.

save(uri: Union[str, Path]) Image

Save the image into a file, specified by the file path or URI.

Parameters

uri (str or Path) – The external URI to store the image to.

Returns

A new image with the new URI / path

Return type

Image

scale(factor: Union[int, float, Tuple]) Image
to_dict() dict
to_embedded() Image

Convert this image into an embedded image.

to_numpy() ndarray

Convert this image into a numpy.ndarray.

to_pil() PILImage

Return a PIL image.

Module contents

Semantic types
