rikai.spark.sql package



rikai.spark.sql.exceptions module

exception rikai.spark.sql.exceptions.SpecError(message)

Bases: Exception

rikai.spark.sql.model module

class rikai.spark.sql.model.ModelSpec(spec: Dict[str, Any], validate: bool = True)

Bases: ABC

Model Spec Payload

Parameters:

  • spec (dict) – Dictionary representation of an input spec.

  • validate (bool, default True) – Validate the spec during construction.

property flavor: str

Model flavor

load_label_fn() Optional[Callable]

Load the function that maps label id to human-readable string

abstract load_model() Any

Load the model artifact specified in this spec

property model_type: ModelType

Return model type

property model_uri: str

Return Model artifact URI

property name: str

Return model name

property options: Dict[str, Any]

Model options

property schema: str

Return the output schema of the model.

validate(schema={'properties': {'model': {'description': 'model description', 'properties': {'flavor': {'type': 'string'}, 'type': {'type': 'string'}, 'uri': {'type': 'string'}}, 'required': ['uri'], 'type': 'object'}, 'name': {'description': 'Model name', 'type': 'string'}, 'schema': {'type': 'string'}, 'version': {'description': 'Model SPEC format version', 'type': 'string'}}, 'required': ['version', 'model'], 'type': 'object'})

Validate model spec


Raises: SpecError – If the spec is not well-formed.
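
To make the default schema above concrete, the sketch below builds a payload with the two required top-level fields (version and model) and the required model.uri, then checks it with a small hand-written function. This is an illustration only: check_spec is a hypothetical stand-in and not Rikai's validation code, and the URI, name, flavor and type values are made up.

```python
from typing import Any, Dict, List

# Hypothetical payload; every value below is illustrative.
spec: Dict[str, Any] = {
    "version": "1.0",
    "name": "my_detector",
    "schema": "array<struct<box:box2d, score:float, label_id:int>>",
    "model": {
        "uri": "s3://bucket/path/to/model.pt",
        "flavor": "pytorch",
        "type": "ssd",
    },
}


def check_spec(spec: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the spec passes.

    Hand-written stand-in for the JSON Schema check, not Rikai's code.
    """
    problems = []
    # Top-level required fields, per the schema's 'required' list.
    for key in ("version", "model"):
        if key not in spec:
            problems.append(f"missing required field: {key}")
    # Top-level string-typed fields.
    for key in ("version", "name", "schema"):
        if key in spec and not isinstance(spec[key], str):
            problems.append(f"{key} must be a string")
    model = spec.get("model")
    if model is not None:
        if not isinstance(model, dict):
            problems.append("model must be an object")
        else:
            # 'uri' is the only required field inside 'model'.
            if "uri" not in model:
                problems.append("missing required field: model.uri")
            for key in ("uri", "flavor", "type"):
                if key in model and not isinstance(model[key], str):
                    problems.append(f"model.{key} must be a string")
    return problems
```

A caller that wants exception behavior similar to the documented one could raise its own error when the returned list is non-empty.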

property version: str

Returns spec version.

class rikai.spark.sql.model.ModelType

Bases: ABC

Base-class for a Model Type.

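A Model Type defines the functionality required to run an arbitrary ML model in SQL ML, including the result schema, the pre-processing transform, and inference with conversion of the results, as covered by the methods below.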

dataType() pyspark.sql.types.DataType

Returns schema as pyspark.sql.types.DataType.

abstract load_model(spec: ModelSpec, **kwargs)

Lazily load the model from a ModelSpec.

abstract predict(*args, **kwargs) Any

Run model inference and convert return types into Rikai-compatible types.


Release underlying resources, if applicable.

It will be called after a model runner finishes a partition in Spark.

abstract schema() str

Return the string value of model schema.


>>> model_type.schema()
'array<struct<box:box2d, score:float, label_id:int>>'

abstract transform() Callable

A callable to pre-process the data before calling inference.

It will be fed into torch.utils.data.DataLoader or tensorflow.data.Dataset.map().
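
The abstract methods documented above (load_model, predict, schema, transform) can be pictured with a small self-contained sketch. ModelTypeSketch is a hypothetical stand-in that only mirrors the documented method names; it is not rikai.spark.sql.model.ModelType. ThresholdClassifier, its schema string, and the plain dict used in place of a ModelSpec are likewise assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable


class ModelTypeSketch(ABC):
    """Stand-in mirroring the documented abstract methods (not Rikai's class)."""

    @abstractmethod
    def load_model(self, spec: Any, **kwargs) -> None: ...

    @abstractmethod
    def schema(self) -> str: ...

    @abstractmethod
    def transform(self) -> Callable: ...

    @abstractmethod
    def predict(self, *args, **kwargs) -> Any: ...


class ThresholdClassifier(ModelTypeSketch):
    """Hypothetical model type: label 1 if the input exceeds a threshold."""

    def __init__(self) -> None:
        self.threshold = None  # nothing is loaded at construction time

    def load_model(self, spec: Any, **kwargs) -> None:
        # Lazy loading: the "model" is only read when this method is called.
        # A plain dict stands in for a ModelSpec here (assumption).
        self.threshold = float(spec["options"]["threshold"])

    def schema(self) -> str:
        return "struct<label_id:int, score:float>"

    def transform(self) -> Callable:
        # Pre-processing applied to each raw input before inference.
        return lambda raw: float(raw)

    def predict(self, *args, **kwargs) -> Any:
        batch = args[0]
        # Convert raw outputs into rows that match schema().
        return [
            {"label_id": int(x > self.threshold), "score": x} for x in batch
        ]


model_type = ThresholdClassifier()
model_type.load_model({"options": {"threshold": 0.5}})
pre = model_type.transform()
batch = [pre(v) for v in ["0.2", "0.9"]]
result = model_type.predict(batch)
```

In this sketch the pre-processing callable is applied by hand; per the description above, a real runner would pass it to a data loader instead.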

rikai.spark.sql.schema module

class rikai.spark.sql.schema.CaseChangingStream(stream, upper=False)

Bases: object

exception rikai.spark.sql.schema.SchemaError(message: str)

Bases: Exception

class rikai.spark.sql.schema.SparkDataTypeVisitor

Bases: RikaiModelSchemaVisitor

visitArrayType(ctx: ArrayTypeContext) ArrayType

visitPlainFieldType(ctx: PlainFieldTypeContext) DataType

visitStructField(ctx: StructFieldContext) StructField

visitStructType(ctx: StructTypeContext) StructType

visitUnquotedIdentifier(ctx: UnquotedIdentifierContext) str

rikai.spark.sql.schema.parse_schema(schema_str: str, visitor: RikaiModelSchemaVisitor = <rikai.spark.sql.schema.SparkDataTypeVisitor object>)

Parse the schema string and return the data type for the runtime.
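
To illustrate the kind of nesting such a schema string encodes, here is a toy recursive-descent parser that turns a string like the schema() example into nested tuples. It is a sketch under stated assumptions: parse_schema_sketch is a hypothetical name, it returns plain Python tuples rather than Spark data types, it is not the generated parser behind parse_schema, and the case-insensitive keyword match is an assumption.

```python
def parse_schema_sketch(schema_str: str):
    """Parse e.g. 'array<struct<a:int>>' into nested tuples (toy parser)."""
    text = schema_str.replace(" ", "")
    pos = 0

    def ident() -> str:
        # Read an unquoted identifier: letters, digits, underscores.
        nonlocal pos
        start = pos
        while pos < len(text) and (text[pos].isalnum() or text[pos] == "_"):
            pos += 1
        if start == pos:
            raise ValueError(f"expected identifier at position {start}")
        return text[start:pos]

    def expect(ch: str) -> None:
        nonlocal pos
        if pos >= len(text) or text[pos] != ch:
            raise ValueError(f"expected {ch!r} at position {pos}")
        pos += 1

    def field_type():
        nonlocal pos
        name = ident()
        has_args = pos < len(text) and text[pos] == "<"
        # Keyword match is case-insensitive here (assumption).
        if name.lower() == "array" and has_args:
            expect("<")
            element = field_type()
            expect(">")
            return ("array", element)
        if name.lower() == "struct" and has_args:
            expect("<")
            fields = []
            while True:
                field_name = ident()
                expect(":")
                fields.append((field_name, field_type()))
                if pos < len(text) and text[pos] == ",":
                    pos += 1
                    continue
                break
            expect(">")
            return ("struct", fields)
        return name  # plain field type such as int, float, box2d

    result = field_type()
    if pos != len(text):
        raise ValueError(f"unexpected trailing input at position {pos}")
    return result


parsed = parse_schema_sketch(
    "array<struct<box:box2d, score:float, label_id:int>>"
)
```

The visitor methods listed above (array type, struct type, struct field, plain field type) map naturally onto the branches of field_type in this sketch.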

Module contents