MixedPrecisionQuantizationConfig
Class for configuring the quantization process when quantizing a model in mixed precision:
- class model_compression_toolkit.core.MixedPrecisionQuantizationConfig(compute_distance_fn=None, distance_weighting_method=MpDistanceWeighting.AVG, num_of_images=MP_DEFAULT_NUM_SAMPLES, configuration_overwrite=None, num_interest_points_factor=1.0, use_hessian_based_scores=False, norm_scores=True, refine_mp_solution=True, metric_normalization_threshold=1e10, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE)
Class with mixed precision parameters to quantize the input model.
- Parameters:
compute_distance_fn (Callable) – Function to compute a distance between two tensors. If None, pre-defined distance methods are used, chosen for each layer based on its type.
distance_weighting_method (MpDistanceWeighting) – MpDistanceWeighting enum value that provides a function to use when weighting the distances among different layers when computing the sensitivity metric.
num_of_images (int) – Number of images used to evaluate the sensitivity of a mixed-precision model compared to the float model.
configuration_overwrite (List[int]) – A list of integers that overrides the mixed-precision configuration with a predefined one.
num_interest_points_factor (float) – A multiplicative factor between zero and one (a fraction of the interest points) that reduces the number of interest points used to calculate the distance metric.
use_hessian_based_scores (bool) – Whether to use Hessian-based scores for weighted average distance metric computation.
norm_scores (bool) – Whether to normalize the returned scores for the weighted distance metric (to get values between 0 and 1).
refine_mp_solution (bool) – Whether to try to improve the final mixed-precision configuration using a greedy algorithm that searches for layers whose bit-width can be increased.
metric_normalization_threshold (float) – A threshold for checking the mixed-precision distance metric values. If values exceed this threshold, the metric is scaled to prevent numerical issues.
hessian_batch_size (int) – The batch size for the Hessian computation. Used only when running mixed precision with a Hessian-based objective.