LibMTL.architecture.MMoE

class MMoE(task_name, encoder_class, decoders, rep_grad, multi_input, device, **kwargs)[source]

Bases: LibMTL.architecture.abstract_arch.AbsArchitecture

Multi-gate Mixture-of-Experts (MMoE).

This method was proposed in Modeling Task Relationships in Multi-task Learning with Multi-gate Mixture-of-Experts (KDD 2018) and implemented by us.
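At a high level, MMoE replaces the single shared encoder with num_experts shared expert encoders and gives each task its own softmax gate; the gate mixes the expert outputs into a task-specific representation that is fed to that task's decoder. The following self-contained PyTorch sketch illustrates this computation in simplified form; the module names, dimensions, and toy heads are illustrative assumptions, not the LibMTL internals.

    import torch
    import torch.nn as nn

    class TinyMMoE(nn.Module):
        """Simplified multi-gate mixture-of-experts over a flat input (illustration only)."""
        def __init__(self, in_dim, rep_dim, num_experts, task_names):
            super().__init__()
            # Experts are shared by all tasks; each maps the input to a representation.
            self.experts = nn.ModuleList(
                [nn.Sequential(nn.Linear(in_dim, rep_dim), nn.ReLU()) for _ in range(num_experts)])
            # One softmax gate per task, producing mixture weights over the experts.
            self.gates = nn.ModuleDict(
                {t: nn.Sequential(nn.Linear(in_dim, num_experts), nn.Softmax(dim=-1)) for t in task_names})
            # One toy decoder (task head) per task.
            self.heads = nn.ModuleDict({t: nn.Linear(rep_dim, 1) for t in task_names})

        def forward(self, x):
            expert_reps = torch.stack([e(x) for e in self.experts])   # [experts, batch, rep_dim]
            out = {}
            for task, gate in self.gates.items():
                weights = gate(x)                                      # [batch, experts]
                mixed = torch.einsum('ebr,be->br', expert_reps, weights)  # task-specific mixture
                out[task] = self.heads[task](mixed)
            return out

    toy = TinyMMoE(in_dim=16, rep_dim=32, num_experts=4, task_names=['task1', 'task2'])
    print({t: p.shape for t, p in toy(torch.randn(8, 16)).items()})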

Parameters
  • img_size (list) – The size of the input data. For example, [3, 244, 244] denotes input images of size 3x244x244.

  • num_experts (int) – The number of experts shared by all tasks. Each expert is an encoder network; a hedged construction sketch follows this parameter list.
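A minimal construction sketch is shown below. In practice this class is usually instantiated for you by LibMTL's Trainer, so treat the direct call, the placeholder Encoder and decoders, and the chosen dimensions as assumptions for illustration. Note that some LibMTL versions expect num_experts wrapped in a single-element list (as parsed from the command line); if your version takes the plain integer documented above, pass 4 instead.

    import torch
    import torch.nn as nn
    from LibMTL.architecture import MMoE

    task_name = ['task1', 'task2']                 # two toy tasks

    class Encoder(nn.Module):
        """Placeholder expert encoder: flattens the image and projects it to 128 features."""
        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 244 * 244, 128), nn.ReLU())
        def forward(self, x):
            return self.net(x)

    decoders = nn.ModuleDict({task: nn.Linear(128, 1) for task in task_name})  # toy task heads
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    arch = MMoE(task_name=task_name,
                encoder_class=Encoder,             # zero-argument callable returning an nn.Module
                decoders=decoders,
                rep_grad=False,
                multi_input=False,
                device=device,
                img_size=[3, 244, 244],            # used to size each task-specific gate
                num_experts=[4]).to(device)        # 4 shared experts (see the note on list vs. int above)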

forward(self, inputs, task_name=None)[source]
Parameters
  • inputs (torch.Tensor) – The input data.

  • task_name (str, default=None) – The task name corresponding to inputs if multi_input is True.

Returns

A dictionary of name-prediction pairs of type (str, torch.Tensor).

Return type

dict
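Continuing the sketch above, forward takes a batch of inputs and returns the task-to-prediction dictionary; when multi_input is True, pass task_name to compute only that task's prediction. The shapes below depend on the toy decoders and are illustrative.

    x = torch.randn(8, 3, 244, 244, device=device)     # toy batch matching img_size

    # multi_input=False: one shared batch, a prediction for every task
    preds = arch(x)
    for task, pred in preds.items():
        print(task, pred.shape)                        # e.g. task1 torch.Size([8, 1])

    # multi_input=True: one task at a time, e.g. arch(x, task_name='task1'),
    # which returns a dictionary containing only that task's prediction.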

get_share_params(self)[source]

Return the shared parameters of the model.

zero_grad_share_params(self)[source]

Set gradients of the shared parameters to zero.
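These two methods are the hooks that gradient-based weighting strategies use to read and reset gradients of the shared experts. The sketch below (reusing the toy objects from the earlier sketches, with placeholder losses) computes a per-task gradient norm over the shared parameters.

    # Placeholder per-task losses derived from the predictions above.
    losses = {task: preds[task].pow(2).mean() for task in task_name}

    grad_norms = {}
    for task, loss in losses.items():
        arch.zero_grad_share_params()                # clear gradients on the shared experts only
        loss.backward(retain_graph=True)             # repopulate .grad on the shared parameters
        flat = [p.grad.flatten() for p in arch.get_share_params() if p.grad is not None]
        grad_norms[task] = torch.cat(flat).norm()
    print(grad_norms)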