Get GradientPTQConfig for PyTorch Models

model_compression_toolkit.gptq.get_pytorch_gptq_config(n_epochs, optimizer=None, optimizer_rest=None, loss=None, log_function=None, use_hessian_based_weights=True, regularization_factor=None, hessian_batch_size=ACT_HESSIAN_DEFAULT_BATCH_SIZE, use_hessian_sample_attention=True, gradual_activation_quantization=True)

Create a GradientPTQConfig instance for PyTorch models.

Parameters:
  • n_epochs (int) – Number of epochs for running the representative dataset for fine-tuning.

  • optimizer (Optimizer) – PyTorch optimizer to use for fine-tuning the auxiliary variables.

  • optimizer_rest (Optimizer) – PyTorch optimizer to use for fine-tuning the bias variables.

  • loss (Callable) – Loss function to use during fine-tuning. See the default loss function for the exact interface.

  • log_function (Callable) – Function to log information about the GPTQ process.

  • use_hessian_based_weights (bool) – Whether to use Hessian-based weights for weighted average loss.

  • regularization_factor (float) – A floating point number that defines the regularization factor.

  • hessian_batch_size (int) – Batch size for Hessian computation in Hessian-based weights GPTQ.

  • use_hessian_sample_attention (bool) – Whether to use the Sample-Layer Attention score for the weighted loss.

  • gradual_activation_quantization (bool, GradualActivationQuantizationConfig) – If False, gradual activation quantization is disabled. If True, it is enabled with the default settings. A GradualActivationQuantizationConfig object can be passed to use non-default settings.

Returns:

A GradientPTQConfig object to use when fine-tuning the quantized model using GPTQ.

Examples

Import MCT and create a GradientPTQConfig to run for 5 epochs:

>>> import model_compression_toolkit as mct
>>> gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=5)

Other PyTorch optimizers can be passed with dummy parameters:

>>> import torch
>>> gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=3, optimizer=torch.optim.Adam([torch.Tensor(1)]))
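Non-default gradual activation quantization settings can be supplied by passing a config object instead of a boolean. This sketch assumes GradualActivationQuantizationConfig is exposed under mct.gptq in your MCT version:

>>> gaq_conf = mct.gptq.GradualActivationQuantizationConfig()
>>> gptq_conf = mct.gptq.get_pytorch_gptq_config(n_epochs=5, gradual_activation_quantization=gaq_conf)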

The configuration can be passed to pytorch_gradient_post_training_quantization() in order to quantize a PyTorch model using GPTQ.
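For instance, a minimal sketch of that call, assuming a PyTorch model named model is already defined and using an illustrative random-data representative dataset generator:

>>> import numpy as np
>>> def repr_datagen(): yield [np.random.random((1, 3, 224, 224))]
>>> quantized_model, quantization_info = mct.gptq.pytorch_gradient_post_training_quantization(model, repr_datagen, gptq_config=gptq_conf)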

Return type:

GradientPTQConfig