FrameworkInfo Class

The following API can be used to pass framework-related information that MCT uses when optimizing the network.

class model_compression_toolkit.core.FrameworkInfo(activation_quantizer_mapping, kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping, kernel_ops_attributes_mapping, out_channel_axis_mapping)

A class that wraps all the information about a specific framework that the library needs in order to quantize a model. Specifically, FrameworkInfo holds lists of layers grouped by how they should be quantized, as well as several mappings, such as a layer to its kernel channel indices and a layer to its min/max output values. The layer lists are divided into three groups:

  • kernel_ops – Layers that have coefficients and should be quantized (e.g., Conv2D, Dense, etc.).

  • activation_ops – Layers whose outputs should be quantized (e.g., Add, ReLU, etc.).

  • no_quantization_ops – Layers that should not be quantized (e.g., Reshape, Transpose, etc.).
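
For a Keras model, such a grouping could look as follows (an illustrative sketch only; these lists are placeholders rather than MCT's built-in defaults):

>>> import tensorflow as tf
>>> kernel_ops = [tf.keras.layers.Conv2D, tf.keras.layers.Dense]  # layers with coefficients to quantize
>>> activation_ops = [tf.keras.layers.Add, tf.keras.layers.ReLU]  # layers whose outputs are quantized
>>> no_quantization_ops = [tf.keras.layers.Reshape, tf.keras.layers.Permute]  # layers left unquantized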

Parameters:
  • activation_quantizer_mapping (Dict[QuantizationMethod, Callable]) – A dictionary mapping from QuantizationMethod to a quantization function (see the sketch after this parameter list).

  • kernel_channels_mapping (DefaultDict) – Dictionary from a layer to a tuple of its kernel in/out channel indices.

  • activation_min_max_mapping (Dict[str, tuple]) – Dictionary from an activation function to its min/max output values.

  • layer_min_max_mapping (Dict[Any, tuple]) – Dictionary from a layer to its min/max output values.

  • kernel_ops_attributes_mapping (DefaultDict) – Dictionary from a framework operator to a list of its weight attributes to quantize.

  • out_channel_axis_mapping (DefaultDict) – Dictionary from a layer to the index of its output channel axis (used for computing statistics per channel).
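
The activation_quantizer_mapping parameter is the only one not shown in the examples below, so here is a minimal sketch of its shape. It assumes QuantizationMethod is exposed under mct.target_platform (the exact import path may differ between MCT versions) and uses a hypothetical placeholder, my_power_of_two_quantizer, as the mapped callable:

>>> import model_compression_toolkit as mct
>>> def my_power_of_two_quantizer(*args, **kwargs):  # hypothetical placeholder, not a real MCT quantizer
...     pass
>>> activation_quantizer_mapping = {mct.target_platform.QuantizationMethod.POWER_OF_TWO: my_power_of_two_quantizer}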

Examples

When quantizing a Keras model, if we want to quantize only the kernels of Conv2D layers, and we know their kernel out/in channel indices are (3, 2) respectively, we can set:

>>> import tensorflow as tf
>>> kernel_ops = [tf.keras.layers.Conv2D]
>>> kernel_channels_mapping = DefaultDict({tf.keras.layers.Conv2D: (3,2)})

Then, we can create a FrameworkInfo object (for brevity, the examples in this section pass only the mappings under discussion; a real FrameworkInfo is constructed with all of the arguments listed above):

>>> FrameworkInfo(kernel_channels_mapping, {}, {})

If an activation layer (tf.keras.layers.Activation) should be quantized and we know its min/max output range in advance, we can add it to activation_min_max_mapping to save statistics collection time. For example:

>>> activation_min_max_mapping = {'softmax': (0, 1)}
>>> FrameworkInfo(kernel_channels_mapping, activation_min_max_mapping, {})

If a layer’s activations should be quantized and we know its min/max output range in advance, we can add it to layer_min_max_mapping to save statistics collection time. For example:

>>> layer_min_max_mapping = {tf.keras.layers.Softmax: (0, 1)}
>>> FrameworkInfo(kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping)
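
The remaining mappings, kernel_ops_attributes_mapping and out_channel_axis_mapping, can be built in the same way. The following is a minimal sketch (reusing the DefaultDict and tf objects from the snippets above) that assumes Keras Conv2D layers, whose weights are stored in the 'kernel' attribute and whose channels-last outputs place the output channel axis at index 3:

>>> kernel_ops_attributes_mapping = DefaultDict({tf.keras.layers.Conv2D: ['kernel']})
>>> out_channel_axis_mapping = DefaultDict({tf.keras.layers.Conv2D: 3})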

ChannelAxis

Enum for selecting the output channel format of the model:

class model_compression_toolkit.core.ChannelAxis(value)

Index of the output channel axis:

  • NHWC – the output channel index is last.

  • NCHW – the output channel index is 1.
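
For example, the enum members can be referenced directly (per the class path documented above, ChannelAxis is importable from model_compression_toolkit.core):

>>> from model_compression_toolkit.core import ChannelAxis
>>> nhwc_axis = ChannelAxis.NHWC  # output channels on the last axis (e.g., Keras channels_last layers)
>>> nchw_axis = ChannelAxis.NCHW  # output channels on axis 1 (e.g., channels_first layers)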