FrameworkInfo Class¶
The following API can be used to pass framework-related information that MCT uses when optimizing the network.
- class model_compression_toolkit.core.FrameworkInfo(activation_quantizer_mapping, kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping, kernel_ops_attributes_mapping, out_channel_axis_mapping)¶
A class to wrap all information about a specific framework that the library needs in order to quantize a model. Specifically, FrameworkInfo holds lists of layers grouped by how they should be quantized, and multiple mappings, such as a layer to its kernel channels indices and a layer to its min/max output values. The layer lists are divided into three groups:
kernel_ops: Layers that have coefficients and need to get quantized (e.g., Conv2D, Dense, etc.)
activation_ops: Layers whose outputs should get quantized (e.g., Add, ReLU, etc.)
no_quantization_ops: Layers that should not get quantized (e.g., Reshape, Transpose, etc.)
- Parameters:
activation_quantizer_mapping (Dict[QuantizationMethod, Callable]) – A dictionary mapping from QuantizationMethod to a quantization function.
kernel_channels_mapping (DefaultDict) – Dictionary from a layer to a tuple of its kernel in/out channels indices.
activation_min_max_mapping (Dict[str, tuple]) – Dictionary from an activation function to its min/max output values.
layer_min_max_mapping (Dict[Any, tuple]) – Dictionary from a layer to its min/max output values.
kernel_ops_attributes_mapping (DefaultDict) – Dictionary from a framework operator to a list of its weights attributes to quantize.
out_channel_axis_mapping (DefaultDict) – Dictionary from a layer to the index of its output channels axis (used for computing statistics per-channel); see the sketch after this list.
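The examples below do not cover the last two mappings, so here is a minimal sketch of how they might be built for a Keras model. It assumes the quantizable weight of Conv2D and Dense layers is an attribute named 'kernel' and that Keras layers are channels-last; the DefaultDict helper is the same one used in the examples below, and the exact value types expected by a given MCT version may differ.

>>> import tensorflow as tf
>>> # Assumption: the quantizable weight of Conv2D/Dense is the attribute named 'kernel'
>>> kernel_ops_attributes_mapping = DefaultDict({tf.keras.layers.Conv2D: ['kernel'],
...                                              tf.keras.layers.Dense: ['kernel']})
>>> # Assumption: Keras layers are channels-last, so the output channels axis is the last one
>>> out_channel_axis_mapping = DefaultDict({tf.keras.layers.Conv2D: -1,
...                                         tf.keras.layers.Dense: -1})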
Examples
When quantizing a Keras model, if we want to quantize only the kernels of Conv2D layers, and we know their kernel out/in channel indices are (3, 2) respectively, we can set:
>>> import tensorflow as tf
>>> kernel_ops = [tf.keras.layers.Conv2D]
>>> kernel_channels_mapping = DefaultDict({tf.keras.layers.Conv2D: (3, 2)})
Then, we can create a FrameworkInfo object:
>>> FrameworkInfo(kernel_channels_mapping, {}, {})
If an activation layer (tf.keras.layers.Activation) should be quantized and we know its min/max output range in advance, we can add it to activation_min_max_mapping to save statistics collection time. For example:
>>> activation_min_max_mapping = {'softmax': (0, 1)}
>>> FrameworkInfo(kernel_channels_mapping, activation_min_max_mapping, {})
If a layer’s activations should be quantized and we know its min/max output range in advance, we can add it to layer_min_max_mapping to save statistics collection time. For example:
>>> layer_min_max_mapping = {tf.keras.layers.Softmax: (0, 1)}
>>> FrameworkInfo(kernel_channels_mapping, activation_min_max_mapping, layer_min_max_mapping)
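Putting the pieces together, a sketch of a call matching the full signature above might look as follows. The activation_quantizer_mapping argument is assumed to be a Dict[QuantizationMethod, Callable] built elsewhere, and the remaining arguments are the mappings constructed in the snippets above; this illustrates the argument order only, not the library’s default configuration.

>>> fw_info = FrameworkInfo(activation_quantizer_mapping,  # assumed Dict[QuantizationMethod, Callable]
...                         kernel_channels_mapping,
...                         activation_min_max_mapping,
...                         layer_min_max_mapping,
...                         kernel_ops_attributes_mapping,
...                         out_channel_axis_mapping)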
ChannelAxis¶
Enum to select the output channels format in the model:
- class model_compression_toolkit.core.ChannelAxis(value)¶
Index of output channels axis:
NHWC - Output channels index is last.
NCHW - Output channels index is 1.
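As an illustration, assuming ChannelAxis is importable from model_compression_toolkit.core as documented above, its members can be used to express a framework’s channel layout (Keras convolutions are channels-last, PyTorch convolutions are channels-first):

>>> from model_compression_toolkit.core import ChannelAxis
>>> keras_axis = ChannelAxis.NHWC  # output channels index is last
>>> torch_axis = ChannelAxis.NCHW  # output channels index is 1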