PyTorch Quantization Aware Training Model Finalize
- model_compression_toolkit.qat.pytorch_quantization_aware_training_finalize_experimental(in_model)
Convert a model fine-tuned by the user into a network with QuantizeWrappers containing InferableQuantizers that quantize both the layers' weights and outputs.
- Parameters:
in_model (Model) – Pytorch model, fine-tuned with QAT, whose trainable quantizers are to be replaced with InferableQuantizers.
- Returns:
A quantized model with QuantizeWrappers and InferableQuantizers.
Examples
Import MCT:
>>> import model_compression_toolkit as mct
Import a Pytorch model:
>>> from torchvision.models import mobilenet_v2
>>> model = mobilenet_v2(pretrained=True)
Create a random dataset generator:
>>> import numpy as np
>>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
Create a MCT core config, containing the quantization configuration:
>>> config = mct.core.CoreConfig()
Pass the model, the representative dataset generator and the configuration to get a model prepared for quantization aware training:
>>> quantized_model, quantization_info = mct.qat.pytorch_quantization_aware_training_init_experimental(model, repr_datagen, core_config=config)
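The model returned by the init call can be fine-tuned with an ordinary PyTorch training loop. A minimal sketch of such a loop, using a plain `nn.Sequential` module as a stand-in for the QAT-wrapped model and hypothetical loss, optimizer and data choices:

```python
import torch
import torch.nn as nn

# Stand-in for the QAT-wrapped model returned by the init call (hypothetical);
# in practice this would be the `quantized_model` from the step above.
qat_model = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

optimizer = torch.optim.SGD(qat_model.parameters(), lr=0.01)
criterion = nn.CrossEntropyLoss()

qat_model.train()
for step in range(5):  # a few fine-tuning steps on random data
    inputs = torch.randn(32, 8)
    labels = torch.randint(0, 4, (32,))
    optimizer.zero_grad()
    loss = criterion(qat_model(inputs), labels)
    loss.backward()
    optimizer.step()
```

Any standard training setup works here; the wrapped quantizers are trained alongside the model weights during these steps.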
Use the quantized model for fine-tuning. Finally, remove the quantizer wrappers and keep a quantized model ready for inference.
>>> quantized_model = mct.qat.pytorch_quantization_aware_training_finalize_experimental(quantized_model)
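After finalization the model can be used like any other PyTorch module. A minimal inference sketch, again with a stand-in module in place of the finalized model (note the channel-first NCHW input layout PyTorch expects):

```python
import torch
import torch.nn as nn

# Stand-in for the finalized quantized model (hypothetical); in practice
# this would be the output of the finalize call above.
quantized_model = nn.Sequential(
    nn.Conv2d(3, 8, 3),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 10),
)

quantized_model.eval()  # inference mode
with torch.no_grad():   # no gradients needed after training
    logits = quantized_model(torch.randn(1, 3, 224, 224))
print(logits.shape)  # torch.Size([1, 10])
```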