LibMTL.weighting
- class AbsWeighting
Bases:
torch.nn.Module
An abstract class for weighting strategies.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
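For illustration, a custom strategy subclasses AbsWeighting and overrides these two hooks. The class below is a hypothetical sketch, not part of LibMTL, and assumes (as the built-in strategies do) that the framework sets self.task_num before training:

    import torch
    from LibMTL.weighting.abstract_weighting import AbsWeighting

    class ConstantWeighting(AbsWeighting):  # hypothetical example strategy
        def init_param(self):
            pass  # no trainable parameters needed for this strategy

        def backward(self, losses, **kwargs):
            # weight every task loss by 1/task_num and backpropagate once
            loss = torch.stack(losses).sum() / self.task_num
            loss.backward()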
- class EW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Equal Weighting (EW).
The loss weight for each task is always 1/T in every iteration, where T denotes the number of tasks.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
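In plain PyTorch terms, EW reduces to averaging the task losses before a single backward pass. A minimal standalone sketch, not LibMTL's internal code:

    import torch

    def ew_backward(losses):
        # losses: a list of scalar task losses; each receives weight 1/T
        torch.stack(losses).mean().backward()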
- class GradNorm
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Normalization (GradNorm).
This method is proposed in GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks (ICML 2018) and implemented by us.
- Parameters
alpha (float, default=1.5) – The strength of the restoring force which pulls tasks back to a common training rate.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
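A condensed standalone sketch of the GradNorm objective, assuming a single shared weight tensor, a trainable per-task weights vector, and the task losses recorded at step 0; LibMTL's actual implementation differs in its bookkeeping:

    import torch

    def gradnorm_objective(losses, init_losses, weights, shared_param, alpha=1.5):
        T = len(losses)
        # gradient norm of each weighted task loss w.r.t. the shared parameters
        norms = torch.stack([
            torch.autograd.grad(weights[i] * losses[i], shared_param,
                                retain_graph=True, create_graph=True)[0].norm()
            for i in range(T)])
        # relative inverse training rate r_i = (L_i / L_i(0)) / mean_j(L_j / L_j(0))
        ratios = torch.stack([losses[i].detach() / init_losses[i] for i in range(T)])
        r = ratios / ratios.mean()
        target = (norms.mean() * r ** alpha).detach()
        # minimizing |G_i - target_i| pulls all tasks toward a common training rate
        return (norms - target).abs().sum()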
- class MGDA
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Multiple Gradient Descent Algorithm (MGDA).
This method is proposed in Multi-Task Learning as Multi-Objective Optimization (NeurIPS 2018) and implemented by adapting the official PyTorch implementation.
- Parameters
mgda_gn ({'none', 'l2', 'loss', 'loss+'}, default='none') – The type of gradient normalization.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
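The mgda_gn options control how per-task gradients are normalized before the min-norm (Frank-Wolfe) solver. A sketch of the normalizers, following the scheme used in the official implementation:

    import torch

    def normalize_grads(grads, losses, mgda_gn='none'):
        # grads: (T, dim) per-task gradients; losses: (T,) task loss values
        if mgda_gn == 'l2':
            gn = grads.norm(dim=1)
        elif mgda_gn == 'loss':
            gn = losses
        elif mgda_gn == 'loss+':
            gn = losses * grads.norm(dim=1)
        else:  # 'none': leave the gradients unscaled
            gn = torch.ones_like(losses)
        return grads / (gn.unsqueeze(1) + 1e-8)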
- class UW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Uncertainty Weights (UW).
This method is proposed in Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics (CVPR 2018) and implemented by us.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
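UW learns one uncertainty parameter per task: with s_t = log(sigma_t^2), the combined loss is sum_t (exp(-s_t) * L_t / 2 + s_t / 2). A generic sketch; LibMTL's exact parameterization may differ:

    import torch
    import torch.nn as nn

    class UncertaintyWeighter(nn.Module):  # illustrative, not LibMTL's class
        def __init__(self, task_num):
            super().__init__()
            self.log_vars = nn.Parameter(torch.zeros(task_num))  # s_t = log(sigma_t^2)

        def forward(self, losses):
            losses = torch.stack(losses)
            # exp(-s_t) weights each loss; + s_t/2 penalizes large uncertainty
            return (torch.exp(-self.log_vars) * losses / 2 + self.log_vars / 2).sum()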
- class DWA
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Dynamic Weight Average (DWA).
This method is proposed in End-To-End Multi-Task Learning With Attention (CVPR 2019) and implemented by adapting the official PyTorch implementation.
- Parameters
T (float, default=2.0) – The softmax temperature.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
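DWA sets w_k(t) proportional to exp(r_k / T), where r_k = L_k(t-1) / L_k(t-2) is the recent descending rate of task k's loss, and rescales the weights to sum to the number of tasks. A standalone sketch of the weight computation:

    import torch

    def dwa_weights(loss_t1, loss_t2, T=2.0):
        # loss_t1, loss_t2: per-task average losses from the two previous epochs
        r = torch.tensor(loss_t1) / torch.tensor(loss_t2)  # descending rate r_k
        return len(loss_t1) * torch.softmax(r / T, dim=0)  # weights sum to K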
- class GLS
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Geometric Loss Strategy (GLS).
This method is proposed in MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning (CVPR 2019 workshop) and implemented by us.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
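GLS replaces the weighted sum with the geometric mean of the task losses, (prod_t L_t)^(1/T), which is invariant to the scale of each individual loss. A minimal sketch, computed in log-space for numerical stability:

    import torch

    def gls_loss(losses):
        losses = torch.stack(losses)
        # geometric mean: exp(mean(log L_t)) == (prod L_t) ** (1/T)
        return torch.exp(torch.log(losses).mean())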
- class GradDrop
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Sign Dropout (GradDrop).
This method is proposed in Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout (NeurIPS 2020) and implemented by us.
- Parameters
leak (float, default=0.0) – The leak parameter for the weighting matrix.
Warning
GradDrop does not support parameter gradients, i.e., rep_grad must be True.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
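GradDrop computes an element-wise gradient sign purity P = 0.5 * (1 + sum_t g_t / sum_t |g_t|) over the shared representation's gradients, then keeps positive entries with probability P and negative entries with probability 1 - P; leak mixes the unmasked gradients back in. A standalone sketch:

    import torch

    def graddrop(grads, leak=0.0):
        # grads: (T, dim) per-task gradients w.r.t. the shared representation
        P = 0.5 * (1.0 + grads.sum(0) / (grads.abs().sum(0) + 1e-8))  # sign purity
        U = torch.rand_like(P)
        # keep positive entries where P > U, negative entries where P < U
        mask = ((P > U) & (grads > 0)) | ((P < U) & (grads < 0))
        return ((leak + (1.0 - leak) * mask.float()) * grads).sum(0)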
- class PCGrad
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Project Conflicting Gradients (PCGrad).
This method is proposed in Gradient Surgery for Multi-Task Learning (NeurIPS 2020) and implemented by us.
Warning
PCGrad does not support representation gradients, i.e., rep_grad must be False.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
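PCGrad removes pairwise conflicts: whenever g_i · g_j < 0, it projects g_i onto the normal plane of g_j, visiting the other tasks in random order. A standalone sketch over flattened gradient vectors:

    import random
    import torch

    def pcgrad(grads):
        # grads: list of flattened per-task gradient vectors
        projected = [g.clone() for g in grads]
        for i, g_i in enumerate(projected):
            order = [j for j in range(len(grads)) if j != i]
            random.shuffle(order)
            for j in order:
                dot = torch.dot(g_i, grads[j])
                if dot < 0:  # conflict: remove the component along g_j
                    g_i -= dot / (grads[j].norm() ** 2 + 1e-8) * grads[j]
        return torch.stack(projected).sum(0)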
- class GradVac
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Vaccine (GradVac).
This method is proposed in Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models (ICLR 2021 Spotlight) and implemented by us.
- Parameters
GradVac_beta (float, default=0.5) – The exponential moving average (EMA) decay parameter.
GradVac_group_type (int, default=0) – The parameter granularity (0: whole_model; 1: all_layer; 2: all_matrix).
Warning
GradVac does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
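Where PCGrad only corrects negative cosine similarities, GradVac pulls each task pair toward a target similarity phi_ij tracked by an exponential moving average with decay GradVac_beta. A standalone sketch for one pair of flattened gradients:

    import math
    import torch

    def gradvac_align(g_i, g_j, phi_ij, beta=0.5):
        # phi_ij: EMA estimate of the cosine-similarity target for this task pair
        rho = torch.dot(g_i, g_j) / (g_i.norm() * g_j.norm() + 1e-8)
        if rho.item() < phi_ij:
            # rotate g_i toward g_j until cos(g_i', g_j) reaches phi_ij
            coef = (g_i.norm() * (phi_ij * torch.sqrt(1 - rho ** 2)
                                  - rho * math.sqrt(1 - phi_ij ** 2))
                    / (g_j.norm() * math.sqrt(1 - phi_ij ** 2) + 1e-8))
            g_i = g_i + coef * g_j
        phi_ij = (1 - beta) * phi_ij + beta * rho.item()  # EMA update of the target
        return g_i, phi_ij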
- class IMTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Impartial Multi-task Learning (IMTL).
This method is proposed in Towards Impartial Multi-task Learning (ICLR 2021) and implemented by us.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
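IMTL combines a loss part (IMTL-L), which learns a scale s_t per task and optimizes exp(s_t) * L_t - s_t, with a gradient part (IMTL-G) that solves for weights making the aggregated gradient equally aligned with every task gradient. A sketch of the IMTL-L part only:

    import torch
    import torch.nn as nn

    class IMTLLoss(nn.Module):  # IMTL-L only; the IMTL-G step is omitted
        def __init__(self, task_num):
            super().__init__()
            self.s = nn.Parameter(torch.zeros(task_num))  # learnable scales s_t

        def forward(self, losses):
            losses = torch.stack(losses)
            # e^{s_t} * L_t rescales each loss; the -s_t term keeps s_t bounded
            return (self.s.exp() * losses - self.s).sum()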
- class CAGrad
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Conflict-Averse Gradient descent (CAGrad).
This method is proposed in Conflict-Averse Gradient Descent for Multi-task learning (NeurIPS 2021) and implemented by adapting the official PyTorch implementation.
- Parameters
calpha (float, default=0.5) – A hyperparameter that controls the convergence rate.
rescale ({0, 1, 2}, default=1) – The type of the gradient rescaling.
Warning
CAGrad does not support representation gradients, i.e., rep_grad must be False.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
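CAGrad seeks an update direction near the average gradient g_0 that also improves the worst conflicting task, by minimizing w^T G g_0 + c * sqrt(w^T G w) over the simplex, with G the Gram matrix of task gradients and c = calpha * ||g_0||. A sketch adapted from the official implementation, using scipy for the inner problem:

    import numpy as np
    import torch
    from scipy.optimize import minimize

    def cagrad(grads, calpha=0.5, rescale=1):
        # grads: (T, dim) per-task gradients; returns the merged update direction
        T = grads.size(0)
        GG = (grads @ grads.t()).cpu().numpy()   # Gram matrix of task gradients
        g0 = np.ones(T) / T                      # weights of the average gradient
        c = calpha * np.sqrt(GG.mean() + 1e-8)   # c = calpha * ||g_0||

        def objfn(w):
            return float(w @ GG @ g0 + c * np.sqrt(w @ GG @ w + 1e-8))

        res = minimize(objfn, np.ones(T) / T, bounds=[(0, 1)] * T,
                       constraints={'type': 'eq', 'fun': lambda w: 1 - w.sum()})
        ww = torch.tensor(res.x, dtype=grads.dtype, device=grads.device)
        gw = (grads * ww.view(-1, 1)).sum(0)
        d = grads.mean(0) + (c / (gw.norm() + 1e-8)) * gw
        if rescale == 0:
            return d
        return d / (1 + calpha ** 2) if rescale == 1 else d / (1 + calpha)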
- class Nash_MTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Nash-MTL.
This method is proposed in Multi-Task Learning as a Bargaining Game (ICML 2022) and implemented by adapting the official PyTorch implementation.
- Parameters
update_weights_every (int, default=1) – The number of iterations between weight updates.
optim_niter (int, default=20) – The maximum number of iterations for the optimization solver.
max_norm (float, default=1.0) – The max norm of the gradients.
Warning
Nash_MTL does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- solve_optimization(self, gtg: numpy.array)
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
- class RLW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Random Loss Weighting (RLW).
This method is proposed in Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning (TMLR 2022) and implemented by us.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
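RLW draws a fresh weight vector at every iteration, for example a softmax over standard normal samples. A minimal standalone sketch:

    import torch

    def rlw_backward(losses):
        losses = torch.stack(losses)
        # sample random weights each iteration: softmax of N(0, 1) draws
        w = torch.softmax(torch.randn(losses.numel(), device=losses.device), dim=0)
        (w * losses).sum().backward()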
- class MoCo
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
MoCo.
This method is proposed in Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach (ICLR 2023) and implemented based on the author's shared code (Heshan Fernando: fernah@rpi.edu).
- Parameters
MoCo_beta (float, default=0.5) – The learning rate of y.
MoCo_beta_sigma (float, default=0.5) – The decay rate of MoCo_beta.
MoCo_gamma (float, default=0.1) – The learning rate of lambd.
MoCo_gamma_sigma (float, default=0.5) – The decay rate of MoCo_gamma.
MoCo_rho (float, default=0) – The ℓ2 regularization parameter for the update of lambd.
Warning
MoCo does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
- class Aligned_MTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Aligned-MTL.
This method is proposed in Independent Component Alignment for Multi-Task Learning (CVPR 2023) and implemented by adapting the official PyTorch implementation.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.