Mixed Precision Training
DynamicLossScalingUpdater
class nnabla.experimental.mixed_precision_training.DynamicLossScalingUpdater(solver, loss, data_feeder=<function DynamicLossScalingUpdater.<lambda>>, scale=8.0, scaling_factor=2.0, N=2000, clear_buffer=True, accum_grad=1, weight_decay=None, comm=None, grads=[])

Dynamic Loss Scaling Updater for mixed precision training.
Parameters:

- solver (nnabla.solvers.Solver) – Solver object, e.g., Momentum or Adam.
- loss (nnabla.Variable) – Loss variable from which the forward and the backward passes are called.
- data_feeder (callable object, function, or lambda) – Data feeder.
- scale (float) – Loss scale constant. This changes dynamically during training (see the sketch after this list).
- scaling_factor (float) – Scaling factor for the dynamic loss scaling.
- N (int) – Interval, the number of training iterations after which the loss scale is increased by scaling_factor.
- clear_buffer (bool) – Clears the no longer referenced variables during backpropagation to save memory.
- accum_grad (int) – Number of gradient accumulation steps. The update method of the solver is called after the forward and backward passes have run accum_grad times.
- weight_decay (float) – Decay constant. Default is None, i.e., weight decay is not applied.
- comm (nnabla.communicators.Communicator) – Communicator used for distributed training. Default is None.
- grads (list of nnabla.NdArray) – The list of gradients to be exchanged during distributed training. Default is the empty list.
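The scale, scaling_factor, and N parameters drive the dynamic loss scaling loop. The sketch below illustrates the standard technique only; it is not the library implementation, and the helper dynamic_loss_scaling_step with its counter bookkeeping is hypothetical. It assumes a plain nnabla Solver and a scalar loss Variable, and uses only public nnabla calls (Variable.forward/backward with a scalar grad, Solver.get_parameters, Solver.scale_grad, Solver.update).

    import numpy as np

    def dynamic_loss_scaling_step(solver, loss, scale, scaling_factor, N, good_iters):
        # One illustrative training step with dynamic loss scaling.
        solver.zero_grad()
        loss.forward(clear_no_need_grad=True)
        # Scale the loss gradient so small FP16 gradients do not underflow.
        loss.backward(grad=scale, clear_buffer=True)

        # Overflow check: if any gradient is inf/NaN, shrink the scale
        # and skip this update.
        overflow = any(np.isnan(p.g).any() or np.isinf(p.g).any()
                       for p in solver.get_parameters().values())
        if overflow:
            return scale / scaling_factor, 0

        # Un-scale the gradients before the solver update.
        solver.scale_grad(1.0 / scale)
        solver.update()

        # After N successful iterations in a row, try a larger scale again.
        good_iters += 1
        if good_iters >= N:
            scale *= scaling_factor
            good_iters = 0
        return scale, good_iters

The actual updater also covers the remaining parameters above: data feeding, accum_grad gradient accumulation, optional weight_decay, and the comm/grads exchange for distributed training.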
Attributes:

- solver (nnabla.solvers.Solver) – Solver object, e.g., Momentum or Adam.
- loss (nnabla.Variable) – Loss variable from which the forward and the backward passes are called.
- N (int) – Interval, the number of training iterations after which the loss scale is increased by scaling_factor.
- clear_buffer (bool) – Clears the no longer referenced variables during backpropagation to save memory.
- accum_grad (int) – Number of gradient accumulation steps. The update method of the solver is called after the forward and backward passes have run accum_grad times.
- comm (nnabla.communicators.Communicator) – Communicator used for distributed training.
- grads (list of nnabla.NdArray) – The list of gradients to be exchanged during distributed training.
Example
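A minimal usage sketch follows, assuming the updater exposes an update() method that performs one full training step (zero_grad, data feeding, forward and backward with the current loss scale, the overflow check, and the solver update). The toy network, Adam solver, and random data feeder are placeholders for your own training setup.

    import numpy as np
    import nnabla as nn
    import nnabla.functions as F
    import nnabla.parametric_functions as PF
    import nnabla.solvers as S
    from nnabla.experimental.mixed_precision_training import DynamicLossScalingUpdater

    # Toy network; in a real mixed precision setup the computation would run
    # in FP16 (e.g. the cudnn extension context with type_config='half').
    x = nn.Variable((32, 100))
    t = nn.Variable((32, 1))
    h = F.relu(PF.affine(x, 64, name='fc1'))
    y = PF.affine(h, 10, name='fc2')
    loss = F.mean(F.softmax_cross_entropy(y, t))

    solver = S.Adam()
    solver.set_parameters(nn.get_parameters())

    def data_feeder():
        # Copy the next minibatch into the input Variables.
        x.d = np.random.randn(*x.shape)
        t.d = np.random.randint(0, 10, size=t.shape)

    updater = DynamicLossScalingUpdater(solver, loss, data_feeder=data_feeder,
                                        scale=8.0, scaling_factor=2.0, N=2000)

    # Training loop: one updater.update() call per iteration.
    for itr in range(1000):
        updater.update()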
Reference: