MixedPrecisionQuantizationConfig

Class to configure the quantization process of the model when quantizing in mixed-precision:

class model_compression_toolkit.core.MixedPrecisionQuantizationConfig(compute_distance_fn=None, distance_weighting_method=MpDistanceWeighting.AVG, num_of_images=MP_DEFAULT_NUM_SAMPLES, configuration_overwrite=None, num_interest_points_factor=1.0, use_hessian_based_scores=False, norm_scores=True, refine_mp_solution=True, metric_normalization_threshold=1e10, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE)

Class with mixed precision parameters to quantize the input model.

Parameters:
  • compute_distance_fn (Callable) – Function to compute a distance between two tensors. If None, pre-defined distance methods are used, selected for each layer according to its type.

  • distance_weighting_method (MpDistanceWeighting) – MpDistanceWeighting enum value that selects the function used to weight the distances of different layers when computing the sensitivity metric.

  • num_of_images (int) – Number of images used to evaluate the sensitivity of a mixed-precision model compared to the float model.

  • configuration_overwrite (List[int]) – A list of integers that, if provided, overwrites the mixed-precision configuration with a predefined one.

  • num_interest_points_factor (float) – A multiplication factor between zero and one (a fraction of the interest points) used to reduce the number of interest points on which the distance metric is calculated.

  • use_hessian_based_scores (bool) – Whether to use Hessian-based scores for weighted average distance metric computation.

  • norm_scores (bool) – Whether to normalize the returned scores for the weighted distance metric (to get values between 0 and 1).

  • refine_mp_solution (bool) – Whether to try to improve the final mixed-precision configuration using a greedy algorithm that searches for layers whose bit-width can be increased.

  • metric_normalization_threshold (float) – A threshold for checking the mixed-precision distance metric values. If values exceed this threshold, the metric is scaled to prevent numerical issues.

  • hessian_batch_size (int) – The batch size for Hessian computation. Used only when running mixed precision with a Hessian-based objective.
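
Example:

The parameters above can be passed as keyword arguments, as in the minimal sketch below. The specific values (32 images, a factor of 0.5) are illustrative only, and check_mp_kwargs is a hypothetical helper that is not part of the library. It assumes that num_interest_points_factor must be strictly greater than zero, since a factor of zero would leave no interest points. The import is guarded so that the sanity checks still run when the library is not installed.

```python
# Keyword arguments named after the documented constructor parameters.
# The values are illustrative, not recommendations.
mp_kwargs = {
    "num_of_images": 32,
    "num_interest_points_factor": 0.5,
    "use_hessian_based_scores": False,
    "refine_mp_solution": True,
}


def check_mp_kwargs(kwargs):
    """Hypothetical helper (not part of the library): sanity-check the values."""
    factor = kwargs.get("num_interest_points_factor", 1.0)
    # Assumption: zero is excluded, as it would leave no interest points.
    if not 0.0 < factor <= 1.0:
        raise ValueError(
            f"num_interest_points_factor must be in (0, 1], got {factor}"
        )
    num_images = kwargs.get("num_of_images")
    if num_images is not None and num_images <= 0:
        raise ValueError(f"num_of_images must be positive, got {num_images}")
    return kwargs


try:
    import model_compression_toolkit as mct
except ImportError:
    mct = None  # Library not installed; the checks above can still be used.

if mct is not None:
    # Parameters that are not passed keep the defaults shown in the signature.
    mp_config = mct.core.MixedPrecisionQuantizationConfig(
        **check_mp_kwargs(mp_kwargs)
    )
```

Validating the values before constructing the configuration reports an out-of-range factor with a clear message instead of failing later during the mixed-precision search.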