Model Compression Toolkit User Guide
Overview
Model Compression Toolkit (MCT) is an open-source project for neural network optimization that enables users to compress and quantize models. It gives researchers, developers, and engineers an easy way to optimize and quantize state-of-the-art neural networks.
The MCT project is developed by researchers and engineers working at Sony Semiconductor Israel.
Install
See the MCT install guide for installing the pip package or building from source.
From Source:
git clone https://github.com/sony/model_optimization.git
python setup.py install
From PyPI - latest stable release:
pip install model-compression-toolkit
A nightly version is also available (unstable):
pip install mct-nightly
To use MCT with TensorFlow, please install the tensorflow package.
To use MCT with PyTorch, please install the torch package.
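To verify the installation, a quick sanity check (this assumes the package exposes __version__, as recent releases do):
python -c "import model_compression_toolkit as mct; print(mct.__version__)"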
Supported Features
Keras:
Gradient-based post-training quantization using knowledge distillation [2] (see the sketch following this list)
Init model for Quantization Aware Training (Experimental)
Finalize model after Quantization Aware Training (Experimental)
Structured pruning (Experimental)
Data generation (Experimental)
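A minimal post-training quantization sketch with the Keras API follows, covering both plain PTQ and the gradient-based variant [2]. MobileNetV2, the random calibration data, and the iteration counts are placeholders for illustration only; the mct.ptq and mct.gptq entry points match recent MCT releases, but consult the API documentation for the exact signatures in your installed version.

import numpy as np
import model_compression_toolkit as mct
from tensorflow.keras.applications import MobileNetV2

model = MobileNetV2()  # placeholder: any trained Keras model

# Placeholder representative dataset: yields lists of input batches.
# In practice, yield real calibration images from your data domain.
def representative_data_gen():
    for _ in range(10):
        yield [np.random.randn(1, 224, 224, 3).astype(np.float32)]

# Plain post-training quantization.
quantized_model, quantization_info = mct.ptq.keras_post_training_quantization(
    model, representative_data_gen)

# Gradient-based PTQ: fine-tunes the quantized weights against the
# float model using knowledge distillation [2].
gptq_config = mct.gptq.get_keras_gptq_config(n_epochs=5)
gptq_model, gptq_info = mct.gptq.keras_gradient_post_training_quantization(
    model, representative_data_gen, gptq_config=gptq_config)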
PyTorch:
Gradient-based post-training quantization using knowledge distillation [2] (see the sketch following this list)
Init model for Quantization Aware Training (Experimental)
Finalize model after Quantization Aware Training (Experimental)
Structured pruning (Experimental)
Data generation (Experimental)
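An equivalent sketch for the PyTorch API (again, the model and the random calibration data are placeholders; verify the entry point name against the API documentation for your MCT version):

import torch
import model_compression_toolkit as mct
from torchvision.models import mobilenet_v2

model = mobilenet_v2()  # placeholder: any trained torch.nn.Module

# Placeholder representative dataset: replace with real calibration data.
def representative_data_gen():
    for _ in range(10):
        yield [torch.randn(1, 3, 224, 224)]

quantized_model, quantization_info = mct.ptq.pytorch_post_training_quantization(
    model, representative_data_gen)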
Quickstart
Take a look at how you can start using MCT in just a few minutes!
Visit our notebooks
API Documentation
Please visit the MCT API documentation.
Technical Constraints
MCT does not preserve the structure of the model's output. For example, if a model's output is a nested list of tensors [[out1, out2], out3], the optimized model's output will be the flattened list [out1, out2, out3].
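A hypothetical Keras sketch of this constraint (the model and layer names below are placeholders, not part of MCT):

import tensorflow as tf

# Float model whose declared output is the nested list [[out1, out2], out3].
inp = tf.keras.Input(shape=(16,))
out1 = tf.keras.layers.Dense(4)(inp)
out2 = tf.keras.layers.Dense(4)(inp)
out3 = tf.keras.layers.Dense(4)(inp)
model = tf.keras.Model(inputs=inp, outputs=[[out1, out2], out3])

# After quantizing this model with MCT, the optimized model returns the
# flattened list [out1, out2, out3]; any post-processing that indexes into
# the nested structure must be adapted accordingly.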
References
[1] Habi, H.V., Peretz, R., Cohen, E., Dikstein, L., Dror, O., Diamant, I., Jennings, R.H. and Netzer, A., 2021. HPTQ: Hardware-Friendly Post Training Quantization. arXiv preprint.
[2] Gordon, O., Habi, H.V., and Netzer, A., 2023. EPTQ: Enhanced Post-Training Quantization via Label-Free Hessian. arXiv preprint.