Keras Quantization Aware Training Model Finalize

model_compression_toolkit.qat.keras_quantization_aware_training_finalize_experimental(in_model)

Convert a model that was fine-tuned by the user (holding Trainable quantizers) into a model with Inferable quantizers.

Parameters:

in_model (Model) – Keras model in which to replace each TrainableQuantizer with an InferableQuantizer

Returns:

A quantized model with Inferable quantizers

Examples

Import MCT:

>>> import model_compression_toolkit as mct

Import a Keras model:

>>> from tensorflow.keras.applications.mobilenet_v2 import MobileNetV2
>>> model = MobileNetV2()

Create a random dataset generator:

>>> import numpy as np
>>> def repr_datagen(): yield [np.random.random((1, 224, 224, 3))]
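
In practice, the representative dataset should be drawn from real inputs that match the deployment distribution. A minimal sketch reading images from disk (hedged: image_paths is a hypothetical list of calibration image file paths, and the preprocessing should match your model):

>>> import tensorflow as tf
>>> from tensorflow.keras.applications.mobilenet_v2 import preprocess_input
>>> def repr_datagen():
...     for path in image_paths[:20]:   # image_paths: hypothetical list of calibration image paths
...         img = tf.keras.utils.load_img(path, target_size=(224, 224))
...         yield [preprocess_input(np.expand_dims(tf.keras.utils.img_to_array(img), 0))]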

Create an MCT core config containing the quantization configuration:

>>> config = mct.core.CoreConfig()
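
The core config can also carry a customized quantization configuration, for example to change the error metric used for threshold selection. A sketch, assuming the QuantizationConfig and QuantizationErrorMethod names exported under mct.core:

>>> q_config = mct.core.QuantizationConfig(activation_error_method=mct.core.QuantizationErrorMethod.MSE)
>>> config = mct.core.CoreConfig(quantization_config=q_config)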

If mixed precision is desired, create an MCT core config with a mixed-precision configuration to quantize the model with different bit-widths for different layers. The candidate bit-widths for quantization should be defined in the target platform model:

>>> config = mct.core.CoreConfig(mixed_precision_config=mct.core.MixedPrecisionQuantizationConfig())
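
The mixed-precision search can be tuned through this configuration, for instance by setting how many representative images are used for sensitivity evaluation (a sketch; num_of_images is assumed here to be a MixedPrecisionQuantizationConfig parameter):

>>> mp_config = mct.core.MixedPrecisionQuantizationConfig(num_of_images=32)   # assumed parameter: images used to evaluate layer sensitivity
>>> config = mct.core.CoreConfig(mixed_precision_config=mp_config)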

For mixed precision, set a target ResourceUtilization object to limit the returned model's size. Note that this value affects only coefficients that should be quantized (for example, the kernel of a Keras Conv2D layer is affected by this value, while the bias is not):

>>> ru = mct.core.ResourceUtilization(model.count_params() * 0.75)  # Target ~75% of the weight memory of a uniform 8-bit model (one byte per weight).
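
To make the target concrete (a rough sketch; the exact parameter count depends on the MobileNetV2 variant):

>>> n = model.count_params()   # ~3.5M weights for the default MobileNetV2
>>> n * 0.75                   # the budget above: ~2.6 MB versus ~3.5 MB at a uniform 8 bits, so some layers must drop below 8 bits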

Pass the model, the representative dataset generator, the configuration, and the target resource utilization to get a quantized model:

>>> quantized_model, quantization_info, custom_objects = mct.qat.keras_quantization_aware_training_init_experimental(model, repr_datagen, ru, core_config=config)
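
A minimal fine-tuning and saving sketch (hedged: train_ds, the loss, the learning rate, and the number of epochs are placeholders to be replaced by your own training setup):

>>> import tensorflow as tf
>>> quantized_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
...                         loss='sparse_categorical_crossentropy',
...                         metrics=['accuracy'])
>>> quantized_model.fit(train_ds, epochs=2)   # train_ds: hypothetical tf.data.Dataset of (image, label) batches
>>> model_file = 'qat_model.keras'            # hypothetical path, used below when reloading
>>> quantized_model.save(model_file)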

After fine-tuning, the model can be loaded back from a file using the custom_objects dictionary:

>>> quantized_model = tf.keras.models.load_model(model_file, custom_objects=custom_objects)
>>> quantized_model = mct.qat.keras_quantization_aware_training_finalize_experimental(quantized_model)
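
The finalized model is a standard Keras model with inference-only quantizers and can be used directly, e.g. for a quick sanity check:

>>> preds = quantized_model.predict(np.random.random((1, 224, 224, 3)))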

Return type:

Model