LibMTL.weighting
- class AbsWeighting
Bases:
torch.nn.Module
An abstract class for weighting strategies.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
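For illustration, a custom strategy subclasses AbsWeighting and overrides these two hooks. The class below is a hypothetical sketch, not part of LibMTL, and assumes (as the built-in strategies do) that the framework sets self.task_num before training:

    import torch
    from LibMTL.weighting.abstract_weighting import AbsWeighting

    class ConstantWeighting(AbsWeighting):  # hypothetical example strategy
        def init_param(self):
            pass  # no trainable parameters needed for this strategy

        def backward(self, losses, **kwargs):
            # weight every task loss by 1/task_num and backpropagate once
            loss = torch.stack(losses).sum() / self.task_num
            loss.backward()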
- class EW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Equal Weighting (EW).
The loss weight for each task is always 1/T in every iteration, where T denotes the number of tasks.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
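In plain PyTorch terms, EW reduces to averaging the task losses before a single backward pass. A minimal standalone sketch, not LibMTL's internal code:

    import torch

    def ew_backward(losses):
        # losses: a list of scalar task losses; each receives weight 1/T
        torch.stack(losses).mean().backward()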
- class GradNorm
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Normalization (GradNorm).
This method is proposed in GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks (ICML 2018) and implemented by us.
- Parameters
alpha (float, default=1.5) – The strength of the restoring force which pulls tasks back to a common training rate.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
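A condensed standalone sketch of the GradNorm objective, assuming a single shared weight tensor, a trainable per-task weights vector, and the task losses recorded at step 0; LibMTL's actual implementation differs in its bookkeeping:

    import torch

    def gradnorm_objective(losses, init_losses, weights, shared_param, alpha=1.5):
        T = len(losses)
        # gradient norm of each weighted task loss w.r.t. the shared parameters
        norms = torch.stack([
            torch.autograd.grad(weights[i] * losses[i], shared_param,
                                retain_graph=True, create_graph=True)[0].norm()
            for i in range(T)])
        # relative inverse training rate r_i = (L_i / L_i(0)) / mean_j(L_j / L_j(0))
        ratios = torch.stack([losses[i].detach() / init_losses[i] for i in range(T)])
        r = ratios / ratios.mean()
        target = (norms.mean() * r ** alpha).detach()
        # minimizing |G_i - target_i| pulls all tasks toward a common training rate
        return (norms - target).abs().sum()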
- class MGDA
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Multiple Gradient Descent Algorithm (MGDA).
This method is proposed in Multi-Task Learning as Multi-Objective Optimization (NeurIPS 2018) and implemented by adapting the official PyTorch implementation.
- Parameters
mgda_gn ({'none', 'l2', 'loss', 'loss+'}, default='none') – The type of gradient normalization.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
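The mgda_gn options control how per-task gradients are normalized before the min-norm (Frank-Wolfe) solver. A sketch of the normalizers, following the scheme used in the official implementation:

    import torch

    def normalize_grads(grads, losses, mgda_gn='none'):
        # grads: (T, dim) per-task gradients; losses: (T,) task loss values
        if mgda_gn == 'l2':
            gn = grads.norm(dim=1)
        elif mgda_gn == 'loss':
            gn = losses
        elif mgda_gn == 'loss+':
            gn = losses * grads.norm(dim=1)
        else:  # 'none': leave the gradients unscaled
            gn = torch.ones_like(losses)
        return grads / (gn.unsqueeze(1) + 1e-8)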
- class UW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Uncertainty Weights (UW).
This method is proposed in Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics (CVPR 2018) and implemented by us.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
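UW learns one uncertainty parameter per task: with s_t = log(sigma_t^2), the combined loss is sum_t (exp(-s_t) * L_t / 2 + s_t / 2). A generic sketch; LibMTL's exact parameterization may differ:

    import torch
    import torch.nn as nn

    class UncertaintyWeighter(nn.Module):  # illustrative, not LibMTL's class
        def __init__(self, task_num):
            super().__init__()
            self.log_vars = nn.Parameter(torch.zeros(task_num))  # s_t = log(sigma_t^2)

        def forward(self, losses):
            losses = torch.stack(losses)
            # exp(-s_t) weights each loss; + s_t/2 penalizes large uncertainty
            return (torch.exp(-self.log_vars) * losses / 2 + self.log_vars / 2).sum()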
- class DWA
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Dynamic Weight Average (DWA).
This method is proposed in End-To-End Multi-Task Learning With Attention (CVPR 2019) and implemented by adapting the official PyTorch implementation.
- Parameters
T (float, default=2.0) – The softmax temperature.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
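DWA sets w_k(t) proportional to exp(r_k / T), where r_k = L_k(t-1) / L_k(t-2) is the recent descending rate of task k's loss, and rescales the weights to sum to the number of tasks. A standalone sketch of the weight computation:

    import torch

    def dwa_weights(loss_t1, loss_t2, T=2.0):
        # loss_t1, loss_t2: per-task average losses from the two previous epochs
        r = torch.tensor(loss_t1) / torch.tensor(loss_t2)  # descending rate r_k
        return len(loss_t1) * torch.softmax(r / T, dim=0)  # weights sum to K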
- class GLS
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Geometric Loss Strategy (GLS).
This method is proposed in MultiNet++: Multi-Stream Feature Aggregation and Geometric Loss Strategy for Multi-Task Learning (CVPR 2019 workshop) and implemented by us.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
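GLS replaces the weighted sum with the geometric mean of the task losses, (prod_t L_t)^(1/T), which is invariant to the scale of each individual loss. A minimal sketch, computed in log-space for numerical stability:

    import torch

    def gls_loss(losses):
        losses = torch.stack(losses)
        # geometric mean: exp(mean(log L_t)) == (prod L_t) ** (1/T)
        return torch.exp(torch.log(losses).mean())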
- class GradDrop
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Sign Dropout (GradDrop).
This method is proposed in Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout (NeurIPS 2020) and implemented by us.
- Parameters
leak (float, default=0.0) – The leak parameter for the weighting matrix.
Warning
GradDrop does not support parameter gradients, i.e., rep_grad must be True.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
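GradDrop computes an element-wise gradient sign purity P = 0.5 * (1 + sum_t g_t / sum_t |g_t|) over the shared representation's gradients, then keeps positive entries with probability P and negative entries with probability 1 - P; leak mixes the unmasked gradients back in. A standalone sketch:

    import torch

    def graddrop(grads, leak=0.0):
        # grads: (T, dim) per-task gradients w.r.t. the shared representation
        P = 0.5 * (1.0 + grads.sum(0) / (grads.abs().sum(0) + 1e-8))  # sign purity
        U = torch.rand_like(P)
        # keep positive entries where P > U, negative entries where P < U
        mask = ((P > U) & (grads > 0)) | ((P < U) & (grads < 0))
        return ((leak + (1.0 - leak) * mask.float()) * grads).sum(0)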
- class PCGrad
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Project Conflicting Gradients (PCGrad).
This method is proposed in Gradient Surgery for Multi-Task Learning (NeurIPS 2020) and implemented by us.
Warning
PCGrad does not support representation gradients, i.e., rep_grad must be False.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
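PCGrad removes pairwise conflicts: whenever g_i · g_j < 0, it projects g_i onto the normal plane of g_j, visiting the other tasks in random order. A standalone sketch over flattened gradient vectors:

    import random
    import torch

    def pcgrad(grads):
        # grads: list of flattened per-task gradient vectors
        projected = [g.clone() for g in grads]
        for i, g_i in enumerate(projected):
            order = [j for j in range(len(grads)) if j != i]
            random.shuffle(order)
            for j in order:
                dot = torch.dot(g_i, grads[j])
                if dot < 0:  # conflict: remove the component along g_j
                    g_i -= dot / (grads[j].norm() ** 2 + 1e-8) * grads[j]
        return torch.stack(projected).sum(0)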
- class GradVac
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Gradient Vaccine (GradVac).
This method is proposed in Gradient Vaccine: Investigating and Improving Multi-task Optimization in Massively Multilingual Models (ICLR 2021 Spotlight) and implemented by us.
- Parameters
GradVac_beta (float, default=0.5) – The exponential moving average (EMA) decay parameter.
GradVac_group_type (int, default=0) – The parameter granularity (0: whole_model; 1: all_layer; 2: all_matrix).
Warning
GradVac does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
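Where PCGrad only corrects negative cosine similarities, GradVac pulls each task pair toward a target similarity phi_ij tracked by an exponential moving average with decay GradVac_beta. A standalone sketch for one pair of flattened gradients:

    import math
    import torch

    def gradvac_align(g_i, g_j, phi_ij, beta=0.5):
        # phi_ij: EMA estimate of the cosine-similarity target for this task pair
        rho = torch.dot(g_i, g_j) / (g_i.norm() * g_j.norm() + 1e-8)
        if rho.item() < phi_ij:
            # rotate g_i toward g_j until cos(g_i', g_j) reaches phi_ij
            coef = (g_i.norm() * (phi_ij * torch.sqrt(1 - rho ** 2)
                                  - rho * math.sqrt(1 - phi_ij ** 2))
                    / (g_j.norm() * math.sqrt(1 - phi_ij ** 2) + 1e-8))
            g_i = g_i + coef * g_j
        phi_ij = (1 - beta) * phi_ij + beta * rho.item()  # EMA update of the target
        return g_i, phi_ij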
- class IMTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Impartial Multi-task Learning (IMTL).
This method is proposed in Towards Impartial Multi-task Learning (ICLR 2021) and implemented by us.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
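IMTL combines a loss part (IMTL-L), which learns a scale s_t per task and optimizes exp(s_t) * L_t - s_t, with a gradient part (IMTL-G) that solves for weights making the aggregated gradient equally aligned with every task gradient. A sketch of the IMTL-L part only:

    import torch
    import torch.nn as nn

    class IMTLLoss(nn.Module):  # IMTL-L only; the IMTL-G step is omitted
        def __init__(self, task_num):
            super().__init__()
            self.s = nn.Parameter(torch.zeros(task_num))  # learnable scales s_t

        def forward(self, losses):
            losses = torch.stack(losses)
            # e^{s_t} * L_t rescales each loss; the -s_t term keeps s_t bounded
            return (self.s.exp() * losses - self.s).sum()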
- class CAGrad
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Conflict-Averse Gradient descent (CAGrad).
This method is proposed in Conflict-Averse Gradient Descent for Multi-task learning (NeurIPS 2021) and implemented by adapting the official PyTorch implementation.
- Parameters
calpha (float, default=0.5) – A hyperparameter that controls the convergence rate.
rescale ({0, 1, 2}, default=1) – The type of the gradient rescaling.
Warning
CAGrad does not support representation gradients, i.e., rep_grad must be False.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
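CAGrad seeks an update direction near the average gradient g_0 that also improves the worst conflicting task, by minimizing w^T G g_0 + c * sqrt(w^T G w) over the simplex, with G the Gram matrix of task gradients and c = calpha * ||g_0||. A sketch adapted from the official implementation, using scipy for the inner problem:

    import numpy as np
    import torch
    from scipy.optimize import minimize

    def cagrad(grads, calpha=0.5, rescale=1):
        # grads: (T, dim) per-task gradients; returns the merged update direction
        T = grads.size(0)
        GG = (grads @ grads.t()).cpu().numpy()   # Gram matrix of task gradients
        g0 = np.ones(T) / T                      # weights of the average gradient
        c = calpha * np.sqrt(GG.mean() + 1e-8)   # c = calpha * ||g_0||

        def objfn(w):
            return float(w @ GG @ g0 + c * np.sqrt(w @ GG @ w + 1e-8))

        res = minimize(objfn, np.ones(T) / T, bounds=[(0, 1)] * T,
                       constraints={'type': 'eq', 'fun': lambda w: 1 - w.sum()})
        ww = torch.tensor(res.x, dtype=grads.dtype, device=grads.device)
        gw = (grads * ww.view(-1, 1)).sum(0)
        d = grads.mean(0) + (c / (gw.norm() + 1e-8)) * gw
        if rescale == 0:
            return d
        return d / (1 + calpha ** 2) if rescale == 1 else d / (1 + calpha)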
- class Nash_MTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Nash-MTL.
This method is proposed in Multi-Task Learning as a Bargaining Game (ICML 2022) and implemented by adapting the official PyTorch implementation.
- Parameters
update_weights_every (int, default=1) – The number of iterations between weight updates.
optim_niter (int, default=20) – The maximum number of iterations for the optimization solver.
max_norm (float, default=1.0) – The max norm of the gradients.
Warning
Nash_MTL does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- solve_optimization(self, gtg: numpy.array)
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
- class RLW
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Random Loss Weighting (RLW).
This method is proposed in Reasonable Effectiveness of Random Weighting: A Litmus Test for Multi-Task Learning (TMLR 2022) and implemented by us.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
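RLW draws a fresh weight vector at every iteration, for example a softmax over standard normal samples. A minimal standalone sketch:

    import torch

    def rlw_backward(losses):
        losses = torch.stack(losses)
        # sample random weights each iteration: softmax of N(0, 1) draws
        w = torch.softmax(torch.randn(losses.numel(), device=losses.device), dim=0)
        (w * losses).sum().backward()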
- class MoCo
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
MoCo.
This method is proposed in Mitigating Gradient Bias in Multi-objective Learning: A Provably Convergent Approach (ICLR 2023) and implemented based on the author's shared code (Heshan Fernando: fernah@rpi.edu).
- Parameters
MoCo_beta (float, default=0.5) – The learning rate of y.
MoCo_beta_sigma (float, default=0.5) – The decay rate of MoCo_beta.
MoCo_gamma (float, default=0.1) – The learning rate of lambd.
MoCo_gamma_sigma (float, default=0.5) – The decay rate of MoCo_gamma.
MoCo_rho (float, default=0) – The ℓ2 regularization parameter for the update of lambd.
Warning
MoCo does not support representation gradients, i.e., rep_grad must be False.
- init_param(self)
Define and initialize some trainable parameters required by specific weighting methods.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.
- class Aligned_MTL
Bases:
LibMTL.weighting.abstract_weighting.AbsWeighting
Aligned-MTL.
This method is proposed in Independent Component Alignment for Multi-Task Learning (CVPR 2023) and implemented by adapting the official PyTorch implementation.
- backward(self, losses, **kwargs)
- Parameters
losses (list) – A list of losses of each task.
kwargs (dict) – A dictionary of hyperparameters of weighting methods.