API Docs¶
Init module for the MCT API. Import the package as:
import model_compression_toolkit as mct
ptq¶
pytorch_post_training_quantization: A function for post-training quantization of PyTorch models.
keras_post_training_quantization: A function for post-training quantization of Keras models.
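Both PTQ entry points take a float model and a representative data generator that yields batches shaped like the model's input. A minimal sketch of such a generator (the batch count, input shape, and the PTQ call shown in the comment are illustrative assumptions, not part of this index):

```python
import numpy as np

def representative_data_gen():
    # Yield a list of input arrays per call; random data stands in here
    # for a small slice of the real training/calibration set.
    for _ in range(10):
        yield [np.random.randn(1, 3, 224, 224).astype(np.float32)]

# With a generator in hand, PTQ is a single call (PyTorch shown; the
# Keras variant mirrors it):
# quantized_model, quantization_info = mct.ptq.pytorch_post_training_quantization(
#     float_model, representative_data_gen)

first_batch = next(representative_data_gen())
```

In practice the generator should draw from real data so the collected activation statistics match deployment inputs.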
gptq¶
pytorch_gradient_post_training_quantization: A function for gradient-based post-training quantization of PyTorch models.
get_pytorch_gptq_config: A function to create a GradientPTQConfig instance for PyTorch models when using GPTQ.
keras_gradient_post_training_quantization: A function for gradient-based post-training quantization of Keras models.
get_keras_gptq_config: A function to create a GradientPTQConfig instance for Keras models when using GPTQ.
GradientPTQConfig: Class to configure GradientPTQ options for gradient-based post-training quantization.
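The two GPTQ functions are typically used together: build a config, then run the gradient-based flow. A minimal sketch (the import is deferred so the sketch parses without MCT installed; the `n_epochs` keyword and the returned tuple are assumptions to verify against the full API reference):

```python
def run_gptq(float_model, representative_data_gen):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Build a GradientPTQConfig controlling the fine-tuning rounds.
    gptq_config = mct.gptq.get_pytorch_gptq_config(n_epochs=5)

    # Run gradient-based PTQ: quantize, then fine-tune quantized weights
    # against the float model's outputs on the representative data.
    quantized_model, quantization_info = (
        mct.gptq.pytorch_gradient_post_training_quantization(
            float_model, representative_data_gen, gptq_config=gptq_config))
    return quantized_model
```

GPTQ costs more compute than plain PTQ but usually recovers accuracy without requiring labels.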
qat¶
pytorch_quantization_aware_training_init_experimental: A function to prepare a PyTorch model for Quantization Aware Training (experimental).
pytorch_quantization_aware_training_finalize_experimental: A function to finalize a PyTorch model after Quantization Aware Training into a model without QuantizeWrappers (experimental).
keras_quantization_aware_training_init_experimental: A function to prepare a Keras model for Quantization Aware Training (experimental).
keras_quantization_aware_training_finalize_experimental: A function to finalize a Keras model after Quantization Aware Training into a model without QuantizeWrappers (experimental).
qat_config: Module to create quantization configuration for Quantization-aware Training (experimental).
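The init/finalize pair brackets the user's own training loop: init wraps layers in QuantizeWrappers, the user fine-tunes, and finalize strips the wrappers. A hedged sketch (the exact return value of the init call is an assumption; only its first element is used here):

```python
def run_qat_flow(float_model, representative_data_gen, train_fn):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # 1) Wrap layers with trainable quantizers (QuantizeWrappers).
    init_result = mct.qat.keras_quantization_aware_training_init_experimental(
        float_model, representative_data_gen)
    qat_model = init_result[0]  # assumed: first element is the wrapped model

    # 2) Fine-tune with the caller's own training loop.
    train_fn(qat_model)

    # 3) Remove the wrappers, leaving a plain quantized Keras model.
    return mct.qat.keras_quantization_aware_training_finalize_experimental(qat_model)
```

The PyTorch pair follows the same init → train → finalize shape.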
core¶
CoreConfig: Module to contain configurations of the entire optimization process.
QuantizationConfig: Module to configure the quantization process.
QuantizationErrorMethod: An enum to select the error metric used when searching for quantization parameters.
MixedPrecisionQuantizationConfig: Module to configure the quantization process when using mixed-precision PTQ.
BitWidthConfig: Module to configure the bit-width manually.
ResourceUtilization: Module to configure resources to use when searching for a configuration for the optimized model.
MpDistanceWeighting: Mixed precision distance metric weighting methods.
network_editor: Module to modify the optimization process for troubleshooting.
pytorch_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for PyTorch models.
keras_resource_utilization_data: A function to compute Resource Utilization data that can be used to calculate the desired target resource utilization for Keras models.
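CoreConfig aggregates the other core classes and is passed to the PTQ/GPTQ/QAT entry points. A sketch of wiring them together (the keyword-argument names are assumptions to verify against the class references):

```python
def build_core_config():
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Error metric used when searching for quantization parameters.
    quant_config = mct.core.QuantizationConfig(
        activation_error_method=mct.core.QuantizationErrorMethod.MSE,
        weights_error_method=mct.core.QuantizationErrorMethod.MSE)

    # Default mixed-precision settings; a ResourceUtilization target passed
    # to the quantization call is what drives the bit-width search.
    mp_config = mct.core.MixedPrecisionQuantizationConfig()

    return mct.core.CoreConfig(quantization_config=quant_config,
                               mixed_precision_config=mp_config)
```

The resource-utilization-data functions help pick a sensible ResourceUtilization target by reporting what the float model consumes.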
data_generation¶
pytorch_data_generation_experimental: A function to generate data for a PyTorch model (experimental).
get_pytorch_data_generation_config: A function to create a DataGenerationConfig for PyTorch data generation (experimental).
keras_data_generation_experimental: A function to generate data for a Keras model (experimental).
get_keras_data_generation_config: A function to create a DataGenerationConfig for TensorFlow data generation (experimental).
DataGenerationConfig: A configuration class for the data generation process (experimental).
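Data generation synthesizes calibration images from the model itself, for use when real data is unavailable. A hedged sketch of the PyTorch pair (the `n_images`, `output_image_size`, and `data_generation_config` parameter names are assumptions to verify against the function reference):

```python
def synthesize_calibration_images(float_model, n_images=32, image_size=224):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Default DataGenerationConfig; tune it for image statistics/iterations.
    config = mct.data_generation.get_pytorch_data_generation_config()

    # Generate synthetic images matched to the model's learned statistics,
    # usable as a representative dataset for PTQ/GPTQ.
    return mct.data_generation.pytorch_data_generation_experimental(
        float_model,
        n_images=n_images,
        output_image_size=image_size,
        data_generation_config=config)
```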
pruning¶
pytorch_pruning_experimental: A function to apply structured pruning to PyTorch models (experimental).
keras_pruning_experimental: A function to apply structured pruning to Keras models (experimental).
PruningConfig: Configuration for the pruning process (experimental).
PruningInfo: Information about the pruned model such as pruned channel indices, etc. (experimental).
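Structured pruning is driven by a ResourceUtilization target: the pruner removes channels until the model fits the budget. A hedged sketch (the argument order and the `weights_memory` keyword are assumptions to verify against the function reference):

```python
def prune_to_budget(float_model, representative_data_gen, weights_memory_bytes):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Target budget for the pruned model's weights, in bytes.
    target = mct.core.ResourceUtilization(weights_memory=weights_memory_bytes)

    # Returns the pruned model plus PruningInfo (e.g. pruned channel indices).
    pruned_model, pruning_info = mct.pruning.pytorch_pruning_experimental(
        float_model, target, representative_data_gen,
        pruning_config=mct.pruning.PruningConfig())
    return pruned_model, pruning_info
```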
xquant¶
xquant_report_pytorch_experimental: A function to generate an explainable quantization report for a quantized PyTorch model (experimental).
xquant_report_keras_experimental: A function to generate an explainable quantization report for a quantized Keras model (experimental).
XQuantConfig: Configuration for the XQuant report (experimental).
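An XQuant report compares the float and quantized models on representative and validation data and writes the result to a report directory. A hedged sketch (the parameter list of the report function and the `report_dir` keyword are assumptions to verify against the API reference):

```python
def build_xquant_report(float_model, quantized_model,
                        repr_data_gen, validation_data_gen, report_dir):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Where the explainability report artifacts are written.
    xquant_config = mct.xquant.XQuantConfig(report_dir=report_dir)

    return mct.xquant.xquant_report_pytorch_experimental(
        float_model, quantized_model,
        repr_data_gen, validation_data_gen, xquant_config)
```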
exporter¶
exporter: Module that enables exporting a quantized model in different serialization formats.
trainable_infrastructure¶
trainable_infrastructure: Module that contains quantization abstraction and quantizers for hardware-oriented model optimization tools.
set_log_folder¶
set_log_folder: Function to set the logger path directory and to enable logging.
keras_load_quantized_model¶
keras_load_quantized_model: A function to load a quantized Keras model.
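A quantized Keras model saved to disk contains custom quantized layers and wrappers that a plain `keras.models.load_model` call would not recognize without the matching custom objects; this loader restores them. A minimal sketch:

```python
def load_quantized(path):
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Restores MCT's custom quantized layers/wrappers alongside the weights.
    return mct.keras_load_quantized_model(path)
```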
target_platform¶
target_platform: Module to create and model the hardware-related settings according to which the model is optimized, based on the hardware the optimized model will run on during inference.
get_target_platform_capabilities: A function to get a target platform model for TensorFlow and PyTorch.
DefaultDict: A utility class for creating a TargetPlatformCapabilities.
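A target platform model is fetched by framework name plus platform name and then passed to the quantization entry points. A hedged sketch (the `'default'` platform name is an assumption; available platform names are listed in the target_platform reference):

```python
def default_pytorch_tpc():
    import model_compression_toolkit as mct  # deferred: sketch parses without MCT

    # Framework ('pytorch' or 'tensorflow') plus a target-platform name;
    # device-specific platforms can be selected instead of 'default'.
    return mct.get_target_platform_capabilities('pytorch', 'default')
```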
Note
This documentation is auto-generated using Sphinx