GradientPTQConfig Class

The following API can be used to create a GradientPTQConfig instance, which configures post-training quantization using knowledge distillation from a teacher (the float Keras model) to a student (the quantized Keras model).

class model_compression_toolkit.gptq.GradientPTQConfig(n_epochs, optimizer, optimizer_rest=None, loss=None, log_function=None, train_bias=True, rounding_type=RoundingType.SoftQuantizer, use_hessian_based_weights=True, optimizer_quantization_parameter=None, optimizer_bias=None, regularization_factor=REG_DEFAULT, hessian_weights_config=GPTQHessianScoresConfig(), gptq_quantizer_params_override=None)

Configuration to use for quantization with GradientPTQ.

Initialize a GradientPTQConfig.

Parameters:
  • n_epochs (int) – Number of representative dataset epochs to train.

  • optimizer (Any) – Optimizer to use.

  • optimizer_rest (Any) – Optimizer to use for bias and quantizer parameters.

  • loss (Callable) – The loss function to use. It should accept six lists of tensors: the quantized tensors, the float tensors, the quantized weights, the float weights, and the mean and std of the tensors, respectively. See multiple_tensors_mse_loss for an example.

  • log_function (Callable) – Function to log information about the GPTQ process.

  • train_bias (bool) – Whether to update the bias during the training or not.

  • rounding_type (RoundingType) – An enum that defines the rounding type.

  • use_hessian_based_weights (bool) – Whether to use Hessian-based weights for weighted average loss.

  • optimizer_quantization_parameter (Any) – Optimizer to override the rest optimizer for quantizer parameters.

  • optimizer_bias (Any) – Optimizer to override the rest optimizer for bias.

  • regularization_factor (float) – A floating point number that defines the regularization factor.

  • hessian_weights_config (GPTQHessianScoresConfig) – A configuration that include all necessary arguments to run a computation of Hessian scores for the GPTQ loss.

  • gptq_quantizer_params_override (dict) – A dictionary of parameters to override in GPTQ quantizer instantiation. Defaults to None (no parameters).
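The six-list contract of the loss argument can be sketched as follows. This is a minimal illustration using plain NumPy arrays in place of framework tensors; the function name and the MSE reduction are illustrative, not the library's multiple_tensors_mse_loss.

```python
import numpy as np

def multiple_tensors_mse_loss_sketch(fxp_act_list, flp_act_list,
                                     fxp_w_list, flp_w_list,
                                     act_bn_mean, act_bn_std):
    """Illustrative loss with the six-list signature described above.

    Returns the mean of per-tensor MSE between quantized (fxp) and
    float (flp) activation tensors; the weight and statistics lists
    are accepted but unused in this minimal sketch.
    """
    per_tensor_mse = [np.mean((q - f) ** 2)
                      for q, f in zip(fxp_act_list, flp_act_list)]
    return float(np.mean(per_tensor_mse))

# Two activation tensors: the first matches exactly, the second is off by 1.
quant = [np.array([1.0, 2.0]), np.array([0.0])]
flt = [np.array([1.0, 2.0]), np.array([1.0])]
print(multiple_tensors_mse_loss_sketch(quant, flt, [], [], [], []))  # 0.5
```

A callable with this shape can be passed as the loss argument of GradientPTQConfig; in a real setup the bodies would operate on framework tensors rather than NumPy arrays.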

GPTQHessianScoresConfig Class

The following API can be used to create a GPTQHessianScoresConfig instance which can be used to define necessary parameters for computing Hessian scores for the GPTQ loss function.

class model_compression_toolkit.gptq.GPTQHessianScoresConfig(hessians_num_samples=GPTQ_HESSIAN_NUM_SAMPLES, norm_scores=True, log_norm=True, scale_log_norm=False, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE)

Configuration to use for computing the Hessian-based scores for GPTQ loss metric.

Initialize a GPTQHessianScoresConfig.

Parameters:
  • hessians_num_samples (int) – Number of samples to use for computing the Hessian-based scores.

  • norm_scores (bool) – Whether to normalize the returned scores of the weighted loss function (to get values between 0 and 1).

  • log_norm (bool) – Whether to use log normalization for the GPTQ Hessian-based scores.

  • scale_log_norm (bool) – Whether to scale the final vector of the Hessian-based scores.

  • hessian_batch_size (int) – The Hessian computation batch size. Used only when running GPTQ with a Hessian-based objective.
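As a minimal sketch, assuming model_compression_toolkit is installed, a customized Hessian-scores configuration might look like the following; the sample count and batch size are illustrative values, not recommendations.

```python
import model_compression_toolkit as mct

# Illustrative values only; the defaults usually suffice.
hessian_cfg = mct.gptq.GPTQHessianScoresConfig(
    hessians_num_samples=16,   # samples drawn for Hessian-score estimation
    norm_scores=True,          # normalize scores to values between 0 and 1
    log_norm=True,             # apply log normalization to the scores
    scale_log_norm=False,      # leave the final score vector unscaled
    hessian_batch_size=32,     # batch size for the Hessian computation
)
```

The resulting object can be passed as the hessian_weights_config argument of GradientPTQConfig.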

RoundingType

class model_compression_toolkit.gptq.RoundingType(value)

An enum for choosing the GPTQ rounding method:

STE - Straight-Through Estimator rounding

SoftQuantizer - Soft quantizer rounding
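For example, the rounding method is selected via the rounding_type argument when building the GPTQ configuration. This is a minimal sketch assuming TensorFlow and model_compression_toolkit are installed; the epoch count and optimizer settings are placeholder values.

```python
import tensorflow as tf
import model_compression_toolkit as mct

# Placeholder settings; choose values appropriate to your model.
gptq_config = mct.gptq.GradientPTQConfig(
    n_epochs=5,
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),
    rounding_type=mct.gptq.RoundingType.STE,  # use straight-through estimator rounding
)
```

Leaving rounding_type unset keeps the default, RoundingType.SoftQuantizer.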