network.activation

GLU

class ding.torch_utils.network.activation.GLU(input_dim: int, output_dim: int, context_dim: int, input_type: str = 'fc')[source]
Overview:

Gating Linear Unit (GLU): a layer that gates the input tensor with values computed from the context tensor.

Interfaces:

forward

Tip

This module also supports 2D convolution, in which case, the input and context must have the same shape.

forward(x: torch.Tensor, context: torch.Tensor) → torch.Tensor[source]
Overview:

Return the tensor computed by the GLU

Arguments:
  • x (torch.Tensor) : the input tensor

  • context (torch.Tensor) : the context tensor

Returns:
  • x (torch.Tensor): the computed tensor
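The gating idea can be sketched in plain Python, without PyTorch, so that the arithmetic is visible. This is an illustration under assumptions, not the module's actual code: it assumes the gate is a sigmoid of a linear map of the context, that the gate multiplies the input elementwise, and that a second linear map produces the output. The helper names `sigmoid`, `linear`, and `glu_forward_sketch` and all weights are hypothetical.

```python
import math


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def linear(vec, weight, bias):
    # weight holds one row per output feature
    return [sum(w * v for w, v in zip(row, vec)) + b for row, b in zip(weight, bias)]


def glu_forward_sketch(x, context, gate_w, gate_b, out_w, out_b):
    # Assumed formulation: one gate value per input feature, computed from the context.
    gate = [sigmoid(g) for g in linear(context, gate_w, gate_b)]
    gated = [g * xi for g, xi in zip(gate, x)]
    # A second linear map brings the gated input to the output size.
    return linear(gated, out_w, out_b)


x = [1.0, 2.0]                                 # input_dim = 2
context = [0.0, 0.0, 0.0]                      # context_dim = 3
gate_w = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]    # zero weights, so every gate is sigmoid(0) = 0.5
gate_b = [0.0, 0.0]
out_w = [[1.0, 1.0]]                           # output_dim = 1
out_b = [0.0]

result = glu_forward_sketch(x, context, gate_w, gate_b, out_w, out_b)
print(result)  # [1.5]
```

With all gates at 0.5, the input `[1.0, 2.0]` becomes `[0.5, 1.0]` before the output map sums it.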

build_activation

Overview:

Return the activation module according to the given type.

Arguments:
  • activation (str): the type of activation module, now supports [‘relu’, ‘glu’, ‘prelu’]

  • inplace (bool): whether to do the operation in place; only used by relu. Default: None

Returns:
  • act_func (nn.Module): the corresponding activation module

network.nn_module

weight_init

Overview:

Init weight according to the specified type.

Arguments:
  • weight (torch.Tensor): the weight to initialize

  • init_type (str): the type of initialization to apply, supports [“xavier”, “kaiming”, “orthogonal”]

  • activation (str): the activation function name; recommended for use only with [‘relu’, ‘leaky_relu’]

sequential_pack

Overview:

Pack the layers in the input list into an nn.Sequential module. If the list contains a convolutional layer, an extra attribute out_channels is added to the module and set to the out_channels of that conv layer.

Arguments:
  • layers (list): the input list

Returns:
  • seq (nn.Sequential): packed sequential container

conv1d_block

Overview:

Create a 1-dim convolution layer with activation and normalization.

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • out_channels (int): Number of channels in the output tensor

  • kernel_size (int): Size of the convolving kernel

  • stride (int): Stride of the convolution

  • padding (int): Zero-padding added to both sides of the input

  • dilation (int): Spacing between kernel elements

  • groups (int): Number of blocked connections from input channels to output channels

  • activation (nn.Module): the optional activation function

  • norm_type (str): type of the normalization

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the 1-dim convolution layer

conv2d_block

Overview:

Create a 2-dim convolution layer with activation and normalization.

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • out_channels (int): Number of channels in the output tensor

  • kernel_size (int): Size of the convolving kernel

  • stride (int): Stride of the convolution

  • padding (int): Zero-padding added to both sides of the input

  • dilation (int): Spacing between kernel elements

  • groups (int): Number of blocked connections from input channels to output channels

  • pad_type (str): the way to add padding, include [‘zero’, ‘reflect’, ‘replicate’], default: None

  • activation (nn.Module): the optional activation function

  • norm_type (str): type of the normalization, default set to None, now support [‘BN’, ‘IN’, ‘SyncBN’]

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the 2-dim convolution layer

deconv2d_block

Overview:

Create a 2-dim transpose convolution layer with activation and normalization.

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • out_channels (int): Number of channels in the output tensor

  • kernel_size (int): Size of the convolving kernel

  • stride (int): Stride of the convolution

  • padding (int): Zero-padding added to both sides of the input

  • pad_type (str): the way to add padding, include [‘zero’, ‘reflect’, ‘replicate’]

  • activation (nn.Module): the optional activation function

  • norm_type (str): type of the normalization

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the 2-dim transpose convolution layer

fc_block

Overview:

Create a fully-connected block with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).

x -> fc -> norm -> act -> dropout -> out

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • out_channels (int): Number of channels in the output tensor

  • activation (nn.Module): the optional activation function

  • norm_type (str): type of the normalization

  • use_dropout (bool) : whether to use dropout in the fully-connected block

  • dropout_probability (float) : probability of an element to be zeroed in the dropout. Default: 0.5

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block

MLP

Overview:

Create a multi-layer perceptron using fully-connected blocks with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).

x -> fc -> norm -> act -> dropout -> out

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • hidden_channels (int): Number of channels in the hidden tensor

  • out_channels (int): Number of channels in the output tensor

  • layer_num (int): Number of layers

  • layer_fn (Callable): layer function

  • activation (nn.Module): the optional activation function

  • norm_type (str): type of the normalization

  • use_dropout (bool): whether to use dropout in the fully-connected block

  • dropout_probability (float): probability of an element to be zeroed in the dropout. Default: 0.5

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block

one_hot

Overview:

Convert a torch.LongTensor to one hot encoding. This implementation can be slightly faster than torch.nn.functional.one_hot

Arguments:
  • val (torch.LongTensor): each element contains the state to be encoded, the range should be [0, num-1]

  • num (int): number of states of the one hot encoding

  • num_first (bool): if False, the one-hot encoding is added as the last dimension; otherwise it is added as the first dimension.

Returns:
  • one_hot (torch.FloatTensor)

Example:
>>> one_hot(2*torch.ones([2,2]).long(),3)
tensor([[[0., 0., 1.],
         [0., 0., 1.]],
        [[0., 0., 1.],
         [0., 0., 1.]]])
>>> one_hot(2*torch.ones([2,2]).long(),3,num_first=True)
tensor([[[0., 0.], [0., 0.]],
        [[0., 0.], [0., 0.]],
        [[1., 1.], [1., 1.]]])

binary_encode

Overview:

Convert the elements of a tensor to their binary representation

Arguments:
  • y (torch.Tensor): the tensor to be converted into its binary representation

  • max_val (torch.Tensor): the max value of the elements in tensor

Returns:
  • binary (torch.Tensor): the input tensor in its binary representation

Example:
>>> binary_encode(torch.tensor([3,2]),torch.tensor(8))
tensor([[0, 0, 1, 1],[0, 0, 1, 0]])
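The bit layout in the example above can be reproduced with a plain-Python sketch. It is not the library's implementation: the function name `binary_encode_sketch` is hypothetical, clamping values into [0, max_val] is an assumption, and the bit width is inferred from the example (max_val of 8 gives 4 bits, most significant bit first).

```python
def binary_encode_sketch(value, max_val):
    # Assumption: values outside [0, max_val] are clamped before encoding.
    value = min(max(value, 0), max_val)
    # Number of bits needed to represent max_val (8 -> 4 bits).
    width = max_val.bit_length()
    # Emit bits from most significant to least significant.
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


encoded = [binary_encode_sketch(v, 8) for v in [3, 2]]
print(encoded)  # [[0, 0, 1, 1], [0, 0, 1, 0]]
```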

noise_block

Overview:

Create a noisy fully-connected block (built on NoiseLinearLayer) with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).

x -> fc -> norm -> act -> dropout -> out

Arguments:
  • in_channels (int): Number of channels in the input tensor

  • out_channels (int): Number of channels in the output tensor

  • activation (str): the optional activation function

  • norm_type (str): type of the normalization

  • use_dropout (bool) : whether to use dropout in the fully-connected block

  • dropout_probability (float) : probability of an element to be zeroed in the dropout. Default: 0.5

  • sigma0 (float): the default noise volume used when initializing NoiseLinearLayer

Returns:
  • block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block

ChannelShuffle

class ding.torch_utils.network.nn_module.ChannelShuffle(group_num: int)[source]
Overview:

Apply channel shuffle to the input tensor

Interface:

forward

Note

See the original ShuffleNet paper: https://arxiv.org/abs/1707.01083

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Return the shuffled input

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • x (torch.Tensor): the shuffled input tensor
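The channel reordering can be illustrated with a plain-Python sketch that works on a list of channel labels instead of a tensor. It follows the usual ShuffleNet recipe (view the channels as group_num groups, transpose, flatten); that the module uses exactly this order is an assumption, and the function name `channel_shuffle_order` is hypothetical.

```python
def channel_shuffle_order(channels, group_num):
    # The number of channels must be divisible by the number of groups.
    if len(channels) % group_num != 0:
        raise ValueError("channel count must be divisible by group_num")
    per_group = len(channels) // group_num
    # View as (group_num, per_group), transpose to (per_group, group_num), then flatten.
    return [channels[g * per_group + k] for k in range(per_group) for g in range(group_num)]


shuffled = channel_shuffle_order([0, 1, 2, 3, 4, 5], group_num=2)
print(shuffled)  # [0, 3, 1, 4, 2, 5]
```

Channels from the two groups `[0, 1, 2]` and `[3, 4, 5]` end up interleaved, which is what lets information flow across groups in later grouped convolutions.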

NearestUpsample

class ding.torch_utils.network.nn_module.NearestUpsample(scale_factor: Union[float, List[float]])[source]
Overview:

Upsamples the input by the member variable scale_factor using nearest mode

Interface:

forward

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Return the upsampled input

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • upsample (torch.Tensor): the upsampled input tensor
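For intuition, nearest-neighbour upsampling with an integer scale factor can be sketched in plain Python on a single 2D map. This is only an illustration: the function name `nearest_upsample_sketch` is hypothetical, and it handles neither batch and channel dimensions nor fractional scale factors.

```python
def nearest_upsample_sketch(image, scale):
    # image: 2D list (H x W); scale: positive integer scale factor.
    height, width = len(image), len(image[0])
    # Each output pixel copies the source pixel at (row // scale, col // scale).
    return [
        [image[i // scale][j // scale] for j in range(width * scale)]
        for i in range(height * scale)
    ]


upsampled = nearest_upsample_sketch([[1, 2], [3, 4]], 2)
for row in upsampled:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```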

BilinearUpsample

class ding.torch_utils.network.nn_module.BilinearUpsample(scale_factor: Union[float, List[float]])[source]
Overview:

Upsamples the input by the member variable scale_factor using bilinear mode

Interface:

forward

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Return the upsampled input

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • upsample (torch.Tensor): the upsampled input tensor

NoiseLinearLayer

class ding.torch_utils.network.nn_module.NoiseLinearLayer(in_channels: int, out_channels: int, sigma0: int = 0.4)[source]
Overview:

Linear layer with random noise.

Interface:

reset_noise, reset_parameters, forward

forward(x: torch.Tensor)[source]
Overview:

Layer forward with noise.

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • output (torch.Tensor): the output with noise

reset_noise()[source]
Overview:

Reset noise settings in the layer.

reset_parameters()[source]
Overview:

Reset parameters in the layer.
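A linear layer with random noise can be sketched in plain Python as follows. The sketch assumes the factorized Gaussian noise scheme from the NoisyNet paper, where the effective weight is a mean plus a scaled noise term; whether NoiseLinearLayer uses exactly this scheme is an assumption, and the names `scale_noise` and `noisy_linear_sketch` are hypothetical.

```python
import math
import random


def scale_noise(v):
    # f(x) = sign(x) * sqrt(|x|), the transform used for factorized noise in NoisyNet.
    return math.copysign(math.sqrt(abs(v)), v)


def noisy_linear_sketch(x, weight_mu, weight_sigma, bias_mu, bias_sigma, rng):
    in_dim, out_dim = len(x), len(weight_mu)
    # One noise sample per input feature and one per output feature.
    eps_in = [scale_noise(rng.gauss(0.0, 1.0)) for _ in range(in_dim)]
    eps_out = [scale_noise(rng.gauss(0.0, 1.0)) for _ in range(out_dim)]
    out = []
    for o in range(out_dim):
        acc = bias_mu[o] + bias_sigma[o] * eps_out[o]
        for i in range(in_dim):
            # Effective weight = mean + sigma * (eps_out[o] * eps_in[i]).
            w = weight_mu[o][i] + weight_sigma[o][i] * eps_out[o] * eps_in[i]
            acc += w * x[i]
        out.append(acc)
    return out


rng = random.Random(0)
x = [1.0, 2.0]
mu = [[0.5, -0.5]]
# With sigma set to zero the layer reduces to an ordinary linear layer.
clean = noisy_linear_sketch(x, mu, [[0.0, 0.0]], [0.1], [0.0], rng)
# With non-zero sigma the output is perturbed by the sampled noise.
noisy = noisy_linear_sketch(x, mu, [[0.4, 0.4]], [0.1], [0.4], rng)
print(clean, noisy)
```

In this picture, resetting the noise corresponds to drawing fresh `eps_in` and `eps_out` samples, while the mean and sigma parameters stay as they are.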

network.normalization

build_normalization

Overview:

Build the corresponding normalization module

Arguments:
  • norm_type (str): type of the normalization, now supports [‘BN’, ‘IN’, ‘SyncBN’, ‘AdaptiveIN’]

  • dim (int): dimension of the normalization, used when norm_type is in [‘BN’, ‘IN’]

Returns:
  • norm_func (nn.Module): the corresponding normalization module

Note

For beginners, you can refer to <https://zhuanlan.zhihu.com/p/34879333> to learn more about batch normalization.

network.res_block

ResBlock

class ding.torch_utils.network.res_block.ResBlock(in_channels: int, activation: torch.nn.modules.module.Module = ReLU(), norm_type: str = 'BN', res_type: str = 'basic')[source]
Overview:
Residual block with 2D convolution layers. Two types are supported:

basic block (input channel: C):

    x -> 3*3*C -> norm -> act -> 3*3*C -> norm -> act -> out
    \__________________________________________/+

bottleneck block:

    x -> 1*1*(1/4*C) -> norm -> act -> 3*3*(1/4*C) -> norm -> act -> 1*1*C -> norm -> act -> out
    \_____________________________________________________________________________/+

Interfaces:

forward

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Return the residual block output

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • x (torch.Tensor): the resblock output tensor

ResFCBlock

class ding.torch_utils.network.res_block.ResFCBlock(in_channels: int, activation: torch.nn.modules.module.Module = ReLU(), norm_type: str = 'BN')[source]
Overview:

Residual block with 2 fully-connected blocks:

    x -> fc1 -> norm -> act -> fc2 -> norm -> act -> out
    \_____________________________________/+

Interfaces:

forward

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Return the residual block output

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • x (torch.Tensor): the resblock output tensor

network.rnn

LSTMForwardWrapper

class ding.torch_utils.network.rnn.LSTMForwardWrapper[source]
Overview:

A class which provides methods to use before and after forward, in order to wrap the LSTM forward method.

Interfaces:

_before_forward, _after_forward

_after_forward(next_state: List[Tuple[torch.Tensor]], list_next_state: bool = False) → Union[torch.Tensor, list][source]
Overview:

Post-process the next_state, return list or tensor type next_states

Arguments:
  • next_state (List[Tuple[torch.Tensor]]): List of tuple which contains the next (h, c)

  • list_next_state (bool): whether to return next_state in list format, default set to False

Returns:
  • next_state(Union[torch.Tensor, list]): the formatted next_state

_before_forward(inputs: torch.Tensor, prev_state: Union[torch.Tensor, list]) → torch.Tensor[source]
Overview:

Preprocess the inputs and previous states

Arguments:
  • inputs (torch.Tensor): input vector of cell, tensor of size [seq_len, batch_size, input_size]

  • prev_state (Union[torch.Tensor, list]): None or tensor of size [num_directions*num_layers, batch_size, hidden_size]. If None, prev_state will be initialized to all zeros.

Returns:
  • prev_state (torch.Tensor): batch previous state in lstm

LSTM

class ding.torch_utils.network.rnn.LSTM(input_size: int, hidden_size: int, num_layers: int, norm_type: Optional[str] = None, dropout: float = 0.0)[source]
Overview:

Implementation of LSTM cell

Interface:

forward

Note

For beginners, you can refer to <https://zhuanlan.zhihu.com/p/32085405> to learn the basics of LSTM.

forward(inputs: torch.Tensor, prev_state: torch.Tensor, list_next_state: bool = True) → Tuple[torch.Tensor, Union[torch.Tensor, list]][source]
Overview:

Take the previous state and the input, and calculate the output and the next state

Arguments:
  • inputs (torch.Tensor): input vector of cell, tensor of size [seq_len, batch_size, input_size]

  • prev_state (torch.Tensor): None or tensor of size

    [num_directions*num_layers, batch_size, hidden_size]

  • list_next_state (bool): whether to return next_state in list format, default set to True

Returns:
  • x (torch.Tensor): output from lstm

  • next_state (Union[torch.Tensor, list]): hidden state from lstm

PytorchLSTM

class ding.torch_utils.network.rnn.PytorchLSTM(*args, **kwargs)[source]
Overview:

Wrap the PyTorch nn.LSTM, format the input and output

Interface:

forward

forward(inputs: torch.Tensor, prev_state: torch.Tensor, list_next_state: bool = True) → Tuple[torch.Tensor, Union[torch.Tensor, list]][source]
Overview:

Wrapped nn.LSTM.forward

Arguments:
  • inputs (torch.Tensor): input vector of cell, tensor of size

    [seq_len, batch_size, input_size]

  • prev_state (torch.Tensor): None or tensor of size

    [num_directions*num_layers, batch_size, hidden_size]

  • list_next_state (bool): whether to return next_state in list format, default set to True

Returns:
  • output (torch.Tensor): output from lstm

  • next_state (Union[torch.Tensor, list]): hidden state from lstm

get_lstm

Overview:

Build and return the corresponding LSTM cell

Arguments:
  • lstm_type (str): version of lstm cell, now support [‘normal’, ‘pytorch’]

  • input_size (int): size of the input vector

  • hidden_size (int): size of the hidden state vector

  • num_layers (int): number of lstm layers

  • norm_type (str): type of the normalization (default: None)

  • dropout (float): dropout rate, default set to 0.0

  • seq_len (Optional[int]): sequence length, default set to None

  • batch_size (Optional[int]): batch size, default set to None

Returns:
  • lstm (Union[LSTM, PytorchLSTM]): the corresponding lstm cell

network.scatter_connection

ScatterConnection

class ding.torch_utils.network.scatter_connection.ScatterConnection(scatter_type: str)[source]
Overview:

Scatter each feature to its corresponding location. In AlphaStar, each entity is embedded into a tensor, and these tensors are scattered into a feature map of the map size.

forward(x: torch.Tensor, spatial_size: Tuple[int, int], location: torch.Tensor) → torch.Tensor[source]
Overview:

Scatter x into a spatial feature map

Arguments:
  • x (tensor): input tensor of shape \((B, M, N)\), where M is the number of entities and N is the dimension of entity attributes

  • spatial_size (tuple): Tuple[H, W], the size of the spatial feature map that x will be scattered into

  • location (tensor): torch.LongTensor of shape \((B, M, 2)\); each location should be (y, x)

Returns:
  • output (tensor): the scattered feature map of shape \((B, N, H, W)\), where H and W come from spatial_size

Shapes:
  • Input: \((B, M, N)\), where M is the number of entities and N is the dimension of entity attributes

  • Size: Tuple type \([H, W]\)

  • Location: \((B, M, 2)\) torch.LongTensor; each location should be (y, x)

  • Output: \((B, N, H, W)\), where H and W come from spatial_size

Note

When several entities share the same location, cover mode loses information because later values overwrite earlier ones; addition is used as a temporary substitute.
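The difference between the two behaviours for overlapping locations can be shown with a plain-Python sketch for a single sample (no batch dimension). This is an illustration, not the module's code: the function name `scatter_sketch` is hypothetical, and the mode strings "add" and "cover" are assumed names for the scatter_type values.

```python
def scatter_sketch(x, spatial_size, location, scatter_type="add"):
    # x: M entities, each a list of N features; location: M (y, x) pairs.
    height, width = spatial_size
    n_features = len(x[0])
    # Output layout is (N, H, W), the per-sample part of (B, N, H, W).
    out = [[[0.0] * width for _ in range(height)] for _ in range(n_features)]
    for entity, (row, col) in zip(x, location):
        for n, value in enumerate(entity):
            if scatter_type == "add":
                out[n][row][col] += value   # overlapping entities are summed
            else:
                out[n][row][col] = value    # "cover": later entities overwrite earlier ones
    return out


entities = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
locations = [(0, 1), (0, 1), (1, 0)]  # the first two entities overlap
added = scatter_sketch(entities, (2, 2), locations, "add")
covered = scatter_sketch(entities, (2, 2), locations, "cover")
print(added[0])    # [[0.0, 3.0], [3.0, 0.0]]
print(covered[0])  # [[0.0, 2.0], [3.0, 0.0]]
```

In cover mode the first entity's value at (0, 1) is lost, while add mode keeps a contribution from both entities.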

network.soft_argmax

SoftArgmax

class ding.torch_utils.network.soft_argmax.SoftArgmax[source]
Overview:

An nn.Module that computes SoftArgmax

Interface:

__init__, forward

forward(x: torch.Tensor) → torch.Tensor[source]
Overview:

Soft-argmax for location regression

Arguments:
  • x (torch.Tensor): predicted heat map

Returns:
  • location (torch.Tensor): predicted location

Shapes:
  • x: \((B, C, H, W)\), where B is the batch size, C is the number of channels, and H and W stand for height and width

  • location: \((B, 2)\), where B is the batch size
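Soft-argmax replaces the hard argmax over a heat map with the expected coordinate under a softmax distribution, which keeps the operation differentiable. A plain-Python sketch for one sample and one channel is given below. The function name `soft_argmax_sketch` is hypothetical, and returning the coordinates in (row, column) order is an assumption about the output layout.

```python
import math


def soft_argmax_sketch(heatmap):
    # heatmap: 2D list (H x W) of scores for one sample and one channel.
    flat = [v for row in heatmap for v in row]
    peak = max(flat)
    # Subtract the maximum before exponentiating for numerical stability.
    exps = [math.exp(v - peak) for v in flat]
    total = sum(exps)
    width = len(heatmap[0])
    # Expected row and column index under the softmax distribution.
    exp_row = sum((k // width) * e for k, e in enumerate(exps)) / total
    exp_col = sum((k % width) * e for k, e in enumerate(exps)) / total
    return exp_row, exp_col


uniform = soft_argmax_sketch([[0.0, 0.0], [0.0, 0.0]])
peaked = soft_argmax_sketch([[0.0, 0.0, 0.0], [0.0, 0.0, 50.0]])
print(uniform)  # (0.5, 0.5)
print(peaked)   # very close to (1.0, 2.0)
```

A flat heat map yields the centre of the grid, while a sharply peaked one yields a location close to the hard argmax.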

network.transformer

Attention

class ding.torch_utils.network.transformer.Attention(input_dim: int, head_dim: int, output_dim: int, head_num: int, dropout: torch.nn.modules.module.Module)[source]
Overview:

For each entry embedding, compute individual attention across all entries, add them up to get output attention

Interfaces:

split, forward

forward(x: torch.Tensor, mask: Optional[torch.Tensor] = None) → torch.Tensor[source]
Overview:

Compute attention

Arguments:
  • x (torch.Tensor): input tensor

  • mask (Optional[torch.Tensor]): mask out invalid entries

Returns:
  • attention (torch.Tensor): attention tensor

split(x: torch.Tensor, T: bool = False) → List[torch.Tensor][source]
Overview:

Split input to get multihead queries, keys, values

Arguments:
  • x (torch.Tensor): query or key or value

  • T (bool): whether to transpose output

Returns:
  • x (List[torch.Tensor]): list of output tensors for each head
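How a mask removes invalid entries from attention can be sketched with single-head scaled dot-product attention in plain Python. This is a simplified illustration, not the module's code: it omits the learned projections, the multi-head split and dropout, it assumes at least one entry is valid, and the function name `masked_attention_sketch` is hypothetical.

```python
import math


def masked_attention_sketch(queries, keys, values, mask):
    # queries/keys/values: N entries, each a list of floats.
    # mask: N booleans, True for valid entries. At least one must be True.
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        # Invalid entries get -inf, so their softmax weight is exactly zero.
        scores = [s if valid else float("-inf") for s, valid in zip(scores, mask)]
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the value vectors.
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, values))
            for d in range(len(values[0]))
        ])
    return outputs


entries = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
mask = [True, True, False]  # the third entry is invalid
attended = masked_attention_sketch(entries, entries, entries, mask)
print(attended[2])  # [0.5, 0.5]
```

The third entry never contributes to any output, so each output row is a convex combination of the first two entries only.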

TransformerLayer

class ding.torch_utils.network.transformer.TransformerLayer(input_dim: int, head_dim: int, hidden_dim: int, output_dim: int, head_num: int, mlp_num: int, dropout: torch.nn.modules.module.Module, activation: torch.nn.modules.module.Module)[source]
Overview:

A transformer layer first computes attention over the entries and then applies a feedforward layer

forward(inputs: Tuple[torch.Tensor, torch.Tensor]) → Tuple[torch.Tensor, torch.Tensor][source]
Overview:

Transformer layer forward

Arguments:
  • inputs (Tuple[torch.Tensor, torch.Tensor]): x and mask

Returns:
  • output (Tuple[torch.Tensor, torch.Tensor]): predicted value and mask

Transformer

class ding.torch_utils.network.transformer.Transformer(input_dim: int, head_dim: int = 128, hidden_dim: int = 1024, output_dim: int = 256, head_num: int = 2, mlp_num: int = 2, layer_num: int = 3, dropout_ratio: float = 0.0, activation: torch.nn.modules.module.Module = ReLU())[source]
Overview:

Transformer implementation

Note

For details refer to Attention is all you need: http://arxiv.org/abs/1706.03762

forward(x: torch.Tensor, mask: Optional[torch.Tensor] = None) → torch.Tensor[source]
Overview:

Transformer forward

Arguments:
  • x (torch.Tensor): input tensor. Shape (B, N, C), B is batch size,

    N is number of entries, C is feature dimension

  • mask (Optional[torch.Tensor]): bool tensor, can be used to mask out invalid entries in attention.

    Shape (B, N), B is batch size, N is number of entries

Returns:
  • x (torch.Tensor): transformer output