GradientPTQConfig Class

The following API creates a GradientPTQConfig instance, which configures post-training quantization using knowledge distillation from a teacher (the float Keras model) to a student (the quantized Keras model).

class model_compression_toolkit.gptq.GradientPTQConfig(n_epochs, optimizer, optimizer_rest=None, loss=None, log_function=None, train_bias=True, rounding_type=RoundingType.SoftQuantizer, use_hessian_based_weights=True, optimizer_quantization_parameter=None, optimizer_bias=None, regularization_factor=REG_DEFAULT, hessian_weights_config=GPTQHessianScoresConfig(), gptq_quantizer_params_override=None)

Configuration to use for quantization with GradientPTQ.

Initialize a GradientPTQConfig.

Parameters:

n_epochs (int) – Number of representative dataset epochs to train.
optimizer (Any) – Optimizer to use.
optimizer_rest (Any) – Optimizer to use for bias and quantizer parameters.
loss (Callable) – The loss to use. It should accept six lists of tensors: the 1st is a list of quantized tensors, the 2nd a list of float tensors, the 3rd a list of quantized weights, the 4th a list of float weights, and the 5th and 6th are the means and standard deviations of the tensors, respectively. See the example in multiple_tensors_mse_loss.
log_function (Callable) – Function to log information about the GPTQ process.
train_bias (bool) – Whether to update the bias during the training or not.
rounding_type (RoundingType) – An enum that defines the rounding type.
use_hessian_based_weights (bool) – Whether to use Hessian-based weights for weighted average loss.
optimizer_quantization_parameter (Any) – Optimizer to override the rest optimizer for quantizer parameters.
optimizer_bias (Any) – Optimizer to override the rest optimizer for bias.
regularization_factor (float) – A floating point number that defines the regularization factor.
hessian_weights_config (GPTQHessianScoresConfig) – A configuration that includes all arguments needed to compute Hessian scores for the GPTQ loss.
gptq_quantizer_params_override (dict) – A dictionary of parameters to override in GPTQ quantizer instantiation. Defaults to None (no parameters).
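
The calling convention described for the loss parameter can be sketched as follows. This is a minimal illustration, not the library's multiple_tensors_mse_loss: the function name sketch_multi_tensor_mse and the Tensor alias are hypothetical, and plain Python lists of floats stand in for framework tensors. A loss passed to GradientPTQConfig would need to operate on real tensors so that gradients can flow through it.

```python
from typing import List, Sequence

# Assumption: a flat sequence of floats stands in for a framework tensor.
Tensor = Sequence[float]


def _mse(a: Tensor, b: Tensor) -> float:
    """Mean squared error between two equally sized flat tensors."""
    if len(a) != len(b):
        raise ValueError("tensor sizes differ")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def sketch_multi_tensor_mse(
    quant_tensors: List[Tensor],  # 1st list: quantized tensors
    float_tensors: List[Tensor],  # 2nd list: float tensors
    quant_weights: List[Tensor],  # 3rd list: quantized weights (unused here)
    float_weights: List[Tensor],  # 4th list: float weights (unused here)
    tensor_means: List[Tensor],   # 5th list: tensor means (unused here)
    tensor_stds: List[Tensor],    # 6th list: tensor stds (unused here)
) -> float:
    """Average the per-pair MSE over all compared tensor pairs."""
    per_pair = [_mse(q, f) for q, f in zip(quant_tensors, float_tensors)]
    return sum(per_pair) / len(per_pair)


# Two compared tensor pairs; only the first pair differs.
quant = [[1.0, 2.0], [0.5, 0.5, 0.5]]
flt = [[1.0, 4.0], [0.5, 0.5, 0.5]]
loss_value = sketch_multi_tensor_mse(quant, flt, [], [], [], [])
print(loss_value)
```

The sketch accepts all six lists so that it matches the documented signature, even though it only uses the first two.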

GPTQHessianScoresConfig Class

The following API creates a GPTQHessianScoresConfig instance, which defines the parameters for computing Hessian scores for the GPTQ loss function.

class model_compression_toolkit.gptq.GPTQHessianScoresConfig(hessians_num_samples=GPTQ_HESSIAN_NUM_SAMPLES, norm_scores=True, log_norm=True, scale_log_norm=False, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE)

Configuration to use for computing the Hessian-based scores for GPTQ loss metric.

Initialize a GPTQHessianScoresConfig.

Parameters:

hessians_num_samples (int) – Number of samples to use for computing the Hessian-based scores.
norm_scores (bool) – Whether to normalize the returned scores of the weighted loss function (to get values between 0 and 1).
log_norm (bool) – Whether to use log normalization for the GPTQ Hessian-based scores.
scale_log_norm (bool) – Whether to scale the final vector of the Hessian-based scores.
hessian_batch_size (int) – The Hessian computation batch size. Used only when running GPTQ with a Hessian-based objective.
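
To give a feel for what score normalization can look like, the sketch below applies an optional log transform followed by min-max scaling to the range 0 to 1. It is an assumption about one plausible reading of norm_scores and log_norm, not the toolkit's actual computation: the function sketch_normalize_scores and its eps parameter are hypothetical, and scale_log_norm is deliberately left out because its exact behavior is not specified here.

```python
import math
from typing import List


def sketch_normalize_scores(scores: List[float],
                            norm_scores: bool = True,
                            log_norm: bool = True,
                            eps: float = 1e-12) -> List[float]:
    """Illustrative only: optional log10 transform, then min-max scaling."""
    out = list(scores)
    if log_norm:
        # Compress the dynamic range; eps guards against log10(0).
        out = [math.log10(s + eps) for s in out]
    if norm_scores:
        lo, hi = min(out), max(out)
        span = hi - lo
        if span == 0:
            # All scores equal: weight every entry the same.
            out = [1.0 for _ in out]
        else:
            out = [(s - lo) / span for s in out]
    return out


# Scores spanning several orders of magnitude.
raw = [1e-6, 1e-4, 1e-2]
normalized = sketch_normalize_scores(raw)
print(normalized)
```

With the log transform enabled, scores that differ by orders of magnitude end up roughly evenly spaced, so the largest score does not dominate a weighted loss.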

RoundingType

class model_compression_toolkit.gptq.RoundingType(value)

An enum for choosing the GPTQ rounding method:

STE - Straight-Through Estimator

SoftQuantizer - SoftQuantizer
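
The idea behind a straight-through estimator can be shown on a toy problem. The sketch below is a generic illustration of the technique, not the toolkit's STE quantizer: the helper names quantize and ste_grad_step, the single scalar weight, and the grid step of 0.25 are all assumptions chosen for the example. The forward pass uses the rounded weight, while the backward pass treats rounding as the identity so that a gradient can still reach the underlying float weight.

```python
def quantize(w: float, step: float) -> float:
    """Round w to the nearest multiple of step (uniform grid)."""
    return round(w / step) * step


def ste_grad_step(w: float, x: float, y: float, step: float, lr: float) -> float:
    """One gradient step on (quantize(w) * x - y) ** 2 using the STE."""
    q = quantize(w, step)
    err = q * x - y
    # The true derivative of quantize() w.r.t. w is 0 almost everywhere;
    # the straight-through estimator substitutes 1 for it.
    grad_w = 2.0 * err * x * 1.0
    return w - lr * grad_w


# Train a single latent float weight so its quantized value maps x to y.
w = 0.30
for _ in range(50):
    w = ste_grad_step(w, x=1.0, y=1.0, step=0.25, lr=0.05)
final_q = quantize(w, 0.25)
print(final_q)
```

Once the quantized weight reproduces the target, the error and therefore the gradient become zero and the latent weight stops moving. The SoftQuantizer option is not covered by this sketch.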