Model Compression Toolkit User Guide


Model Compression Toolkit (MCT) is an open-source project for neural network optimization that enables users to compress and quantize models. It gives researchers, developers, and engineers an easy way to optimize and quantize state-of-the-art neural networks.

The MCT project is developed by researchers and engineers at Sony Semiconductor Israel.


See the MCT install guide for instructions on installing the pip package or building from source.

From Source:

git clone
python install

From PyPI - latest stable release:

pip install model-compression-toolkit

A nightly version is also available (unstable):

pip install mct-nightly

To use MCT with TensorFlow, install the package: tensorflow

To use MCT with PyTorch, install the package: torch
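
After installing, a quick way to confirm which packages are visible to the current Python environment is to look them up with the standard library. The following is a minimal sketch, not part of MCT: the helper name `installed_packages` is hypothetical, and the import names `model_compression_toolkit`, `tensorflow`, and `torch` are assumed to match the pip packages listed above.

```python
from importlib.util import find_spec


def installed_packages(names):
    """Map each import name to True if it can be found in this environment."""
    # find_spec returns None for a top-level module that is not installed.
    return {name: find_spec(name) is not None for name in names}


# Assumed import names for MCT and the two supported frameworks.
status = installed_packages(["model_compression_toolkit", "tensorflow", "torch"])
for name, found in status.items():
    print(f"{name}: {'installed' if found else 'missing'}")
```

Only one of tensorflow or torch needs to be present, depending on the framework you intend to use.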

Supported Features





Take a look at how you can start using MCT in just a few minutes!

Visit our notebooks and MCT quick start.

API Documentation

For details, see the MCT API documentation.

Technical Constraints

  • MCT doesn’t preserve the structure of the model’s output. For example, if a model outputs a nested list of tensors [[out1, out2], out3], the optimized model outputs the flat list [out1, out2, out3].
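
To make the flattening constraint concrete, the sketch below reproduces the effect on a nested output using plain strings as stand-ins for tensors. It is only an illustration of the resulting ordering, not MCT's actual implementation, and the helper name `flatten_outputs` is hypothetical.

```python
def flatten_outputs(outputs):
    """Recursively flatten nested lists/tuples into one flat list, preserving order."""
    flat = []
    for item in outputs:
        if isinstance(item, (list, tuple)):
            # Descend into nested containers and keep their left-to-right order.
            flat.extend(flatten_outputs(item))
        else:
            flat.append(item)
    return flat


# Strings stand in for tensors here.
original_output = [["out1", "out2"], "out3"]
optimized_output = flatten_outputs(original_output)
print(optimized_output)  # ['out1', 'out2', 'out3']
```

Code that consumes the optimized model should therefore index outputs by their flat position, or regroup them explicitly if the original nesting is needed.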

