PyTorch Quantization Aware Training Model Finalize

model_compression_toolkit.qat.pytorch_quantization_aware_training_finalize_experimental(in_model)

Convert a model fine-tuned by the user into a network whose QuantizeWrappers contain InferableQuantizers, quantizing both the layers' weights and outputs.

Parameters:

in_model (Model) – Pytorch model whose QuantizeWrappers are to be finalized for inference.

Returns:

A quantized model with QuantizeWrappers and InferableQuantizers.

Examples

Import MCT:

>>> import model_compression_toolkit as mct

Import a Pytorch model:

>>> from torchvision.models import mobilenet_v2
>>> model = mobilenet_v2(pretrained=True)

Create a random dataset generator:

>>> import numpy as np
>>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]

Create a MCT core config, containing the quantization configuration:

>>> config = mct.core.CoreConfig()

Pass the model, the representative dataset generator and the configuration to get a QAT-ready quantized model:

>>> quantized_model, quantization_info = mct.qat.pytorch_quantization_aware_training_init_experimental(model, repr_datagen, core_config=config)

Use the quantized model for fine-tuning. Finally, remove the quantizer wrappers to obtain a quantized model ready for inference:
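The fine-tuning step is a standard PyTorch training loop; the QAT-ready model returned by the init call is trained like any other module. Below is a minimal sketch using a small stand-in model and random data for illustration; the optimizer, learning rate, and dataset are assumptions, not part of the MCT API.

```python
import torch
import torch.nn as nn

def fine_tune(model, data_loader, epochs=1, lr=1e-4):
    """Run a few epochs of standard supervised fine-tuning."""
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for inputs, labels in data_loader:
            optimizer.zero_grad()
            loss = loss_fn(model(inputs), labels)
            loss.backward()
            optimizer.step()
    return model

# Stand-in model and random batches; replace with the QAT model and a real dataset.
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 10))
batches = [(torch.randn(4, 3, 8, 8), torch.randint(0, 10, (4,))) for _ in range(2)]
fine_tune(toy_model, batches)
```

In practice, `toy_model` would be the `quantized_model` returned by `pytorch_quantization_aware_training_init_experimental`, trained on the user's task data before calling finalize.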

>>> quantized_model = mct.qat.pytorch_quantization_aware_training_finalize_experimental(quantized_model)