network.activation¶
GLU¶
- class ding.torch_utils.network.activation.GLU(input_dim: int, output_dim: int, context_dim: int, input_type: str = 'fc')[source]¶
- Overview:
Gated Linear Unit (GLU). The input is gated using a context tensor.
- Interfaces:
forward
Tip
This module also supports 2D convolution, in which case, the input and context must have the same shape.
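The gating idea can be sketched for the fully-connected case as follows. This is a minimal illustration, not the library's implementation: the class name `GLUSketch` and the layer layout (a sigmoid gate computed from the context, followed by a linear projection) are assumptions.

```python
import torch
import torch.nn as nn


class GLUSketch(nn.Module):
    """Illustrative gated linear unit for the fully-connected case (hypothetical class)."""

    def __init__(self, input_dim: int, output_dim: int, context_dim: int) -> None:
        super().__init__()
        # Maps the context to one gate value per input feature.
        self.gate_layer = nn.Linear(context_dim, input_dim)
        # Projects the gated input to the desired output size.
        self.out_layer = nn.Linear(input_dim, output_dim)

    def forward(self, x: torch.Tensor, context: torch.Tensor) -> torch.Tensor:
        gate = torch.sigmoid(self.gate_layer(context))  # values in (0, 1)
        return self.out_layer(gate * x)


glu = GLUSketch(input_dim=8, output_dim=4, context_dim=6)
out = glu(torch.randn(2, 8), torch.randn(2, 6))
```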
build_activation¶
- Overview:
Return the activation module according to the given type.
- Arguments:
activation (str): the type of activation module; currently supports [‘relu’, ‘glu’, ‘prelu’]
inplace (bool): whether to perform the operation in-place (ReLU only). Default: None
- Returns:
act_func (nn.Module): the corresponding activation module
network.nn_module¶
weight_init¶
- Overview:
Initialize the weight according to the specified type.
- Arguments:
weight (torch.Tensor): the weight to be initialized
init_type (str): the type of initialization to apply; supports [“xavier”, “kaiming”, “orthogonal”]
activation (str): the activation function name; recommended for use only with [‘relu’, ‘leaky_relu’]
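The three initialization types correspond to standard routines in `torch.nn.init`. The sketch below shows one way such a dispatch could look; the function name `weight_init_sketch` and the choice of the uniform variants are assumptions, not the library's code.

```python
import torch
import torch.nn as nn


def weight_init_sketch(weight: torch.Tensor, init_type: str = "xavier", activation: str = "relu") -> None:
    """Initialize `weight` in place with a standard torch.nn.init routine (hypothetical helper)."""
    if init_type == "xavier":
        nn.init.xavier_uniform_(weight)
    elif init_type == "kaiming":
        # Kaiming initialization scales by a gain that depends on the nonlinearity.
        nn.init.kaiming_uniform_(weight, nonlinearity=activation)
    elif init_type == "orthogonal":
        nn.init.orthogonal_(weight)
    else:
        raise ValueError(f"unsupported init_type: {init_type}")


w = torch.empty(4, 4)
weight_init_sketch(w, init_type="orthogonal")
```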
sequential_pack¶
- Overview:
Pack the layers in the input list into an nn.Sequential module. If the list contains a convolutional layer, an extra attribute out_channels is added to the module and set to the out_channels of that conv layer.
- Arguments:
layers (list): the input list
- Returns:
seq (nn.Sequential): packed sequential container
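The packing behaviour can be illustrated with plain PyTorch. This is a sketch under the assumption that the last convolution in the list determines `out_channels`; the helper name `sequential_pack_sketch` is hypothetical.

```python
import torch.nn as nn


def sequential_pack_sketch(layers: list) -> nn.Sequential:
    """Pack layers into nn.Sequential and expose the conv layer's out_channels."""
    seq = nn.Sequential(*layers)
    for layer in reversed(layers):
        # Assumption: the last convolution found determines out_channels.
        if isinstance(layer, (nn.Conv1d, nn.Conv2d)):
            seq.out_channels = layer.out_channels
            break
    return seq


block = sequential_pack_sketch([nn.Conv2d(3, 16, kernel_size=3), nn.ReLU()])
```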
conv1d_block¶
- Overview:
Create a 1-dim convolution layer with activation and normalization.
- Arguments:
in_channels (int): Number of channels in the input tensor
out_channels (int): Number of channels in the output tensor
kernel_size (int): Size of the convolving kernel
stride (int): Stride of the convolution
padding (int): Zero-padding added to both sides of the input
dilation (int): Spacing between kernel elements
groups (int): Number of blocked connections from input channels to output channels
activation (nn.Module): the optional activation function
norm_type (str): type of the normalization
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the 1-dim convolution layer
conv2d_block¶
- Overview:
Create a 2-dim convolution layer with activation and normalization.
- Arguments:
in_channels (int): Number of channels in the input tensor
out_channels (int): Number of channels in the output tensor
kernel_size (int): Size of the convolving kernel
stride (int): Stride of the convolution
padding (int): Zero-padding added to both sides of the input
dilation (int): Spacing between kernel elements
groups (int): Number of blocked connections from input channels to output channels
pad_type (str): the way to add padding, one of [‘zero’, ‘reflect’, ‘replicate’]. Default: None
activation (nn.Module): the optional activation function
norm_type (str): type of the normalization; currently supports [‘BN’, ‘IN’, ‘SyncBN’]. Default: None
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the 2-dim convolution layer
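A block of this kind can be assembled from standard PyTorch layers. The sketch below assumes the order conv -> norm -> act and handles only the ‘BN’ and ‘IN’ normalization types; the name `conv2d_block_sketch` is hypothetical and the code is not the library's implementation.

```python
import torch
import torch.nn as nn


def conv2d_block_sketch(in_channels, out_channels, kernel_size, stride=1, padding=0,
                        activation=None, norm_type=None) -> nn.Sequential:
    """Build conv -> (norm) -> (act) as an nn.Sequential (assumed ordering)."""
    layers = [nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding)]
    if norm_type == "BN":
        layers.append(nn.BatchNorm2d(out_channels))
    elif norm_type == "IN":
        layers.append(nn.InstanceNorm2d(out_channels))
    if activation is not None:
        layers.append(activation)
    return nn.Sequential(*layers)


block = conv2d_block_sketch(3, 8, kernel_size=3, padding=1, activation=nn.ReLU(), norm_type="BN")
y = block(torch.randn(2, 3, 16, 16))
```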
deconv2d_block¶
- Overview:
Create a 2-dim transpose convolution layer with activation and normalization
- Arguments:
in_channels (int): Number of channels in the input tensor
out_channels (int): Number of channels in the output tensor
kernel_size (int): Size of the convolving kernel
stride (int): Stride of the convolution
padding (int): Zero-padding added to both sides of the input
pad_type (str): the way to add padding, one of [‘zero’, ‘reflect’, ‘replicate’]
activation (nn.Module): the optional activation function
norm_type (str): type of the normalization
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the 2-dim transpose convolution layer
Note
See ConvTranspose2d (https://pytorch.org/docs/master/generated/torch.nn.ConvTranspose2d.html)
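As a quick illustration of the underlying PyTorch layer, the following snippet uses `nn.ConvTranspose2d` directly to double the spatial resolution. The channel counts and kernel settings are arbitrary example values.

```python
import torch
import torch.nn as nn

# Output size per spatial dim: (H - 1) * stride - 2 * padding + (kernel_size - 1) + 1
deconv = nn.ConvTranspose2d(in_channels=16, out_channels=8, kernel_size=4, stride=2, padding=1)
x = torch.randn(1, 16, 8, 8)
y = deconv(x)  # (8 - 1) * 2 - 2 * 1 + 3 + 1 = 16 per spatial dim
```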
fc_block¶
- Overview:
Create a fully-connected block with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).
x -> fc -> norm -> act -> dropout -> out
- Arguments:
in_channels (int): Number of channels in the input tensor
out_channels (int): Number of channels in the output tensor
activation (nn.Module): the optional activation function
norm_type (str): type of the normalization
use_dropout (bool): whether to use dropout in the fully-connected block
dropout_probability (float): probability of an element to be zeroed in the dropout. Default: 0.5
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block
Note
You can refer to nn.Linear (https://pytorch.org/docs/master/generated/torch.nn.Linear.html)
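The pipeline x -> fc -> norm -> act -> dropout -> out can be sketched with standard layers. The helper name `fc_block_sketch` is hypothetical, and only the ‘BN’ normalization type is handled here for brevity.

```python
import torch
import torch.nn as nn


def fc_block_sketch(in_channels, out_channels, activation=None, norm_type=None,
                    use_dropout=False, dropout_probability=0.5) -> nn.Sequential:
    """Build fc -> (norm) -> (act) -> (dropout) as an nn.Sequential."""
    layers = [nn.Linear(in_channels, out_channels)]
    if norm_type == "BN":
        # Normalizes over dim 1 (the channel dimension) of a (B, C) input.
        layers.append(nn.BatchNorm1d(out_channels))
    if activation is not None:
        layers.append(activation)
    if use_dropout:
        layers.append(nn.Dropout(dropout_probability))
    return nn.Sequential(*layers)


fc = fc_block_sketch(4, 8, activation=nn.ReLU(), norm_type="BN", use_dropout=True)
fc_out = fc(torch.randn(4, 4))
```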
MLP¶
- Overview:
Create a multi-layer perceptron using fully-connected blocks with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).
x -> fc -> norm -> act -> dropout -> out
- Arguments:
in_channels (int): Number of channels in the input tensor
hidden_channels (int): Number of channels in the hidden tensor
out_channels (int): Number of channels in the output tensor
layer_num (int): Number of layers
layer_fn (Callable): layer function
activation (nn.Module): the optional activation function
norm_type (str): type of the normalization
use_dropout (bool): whether to use dropout in the fully-connected block
dropout_probability (float): probability of an element to be zeroed in the dropout. Default: 0.5
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block
Note
You can refer to nn.Linear (https://pytorch.org/docs/master/generated/torch.nn.Linear.html)
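Stacking several fully-connected layers gives the multi-layer perceptron. The sketch below assumes the channel layout [in, hidden, ..., hidden, out] with `layer_num` linear layers in total; that layout and the name `mlp_sketch` are assumptions, and normalization and dropout are omitted.

```python
import torch
import torch.nn as nn


def mlp_sketch(in_channels, hidden_channels, out_channels, layer_num, activation=None) -> nn.Sequential:
    """Stack `layer_num` linear layers, each optionally followed by the activation."""
    channels = [in_channels] + [hidden_channels] * (layer_num - 1) + [out_channels]
    layers = []
    for c_in, c_out in zip(channels[:-1], channels[1:]):
        layers.append(nn.Linear(c_in, c_out))
        if activation is not None:
            layers.append(activation)
    return nn.Sequential(*layers)


mlp = mlp_sketch(4, 32, 2, layer_num=3, activation=nn.ReLU())
mlp_out = mlp(torch.randn(5, 4))
```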
one_hot¶
- Overview:
Convert a torch.LongTensor to one-hot encoding. This implementation can be slightly faster than torch.nn.functional.one_hot
- Arguments:
val (torch.LongTensor): each element contains the state to be encoded; the range should be [0, num-1]
num (int): number of states of the one-hot encoding
num_first (bool): If num_first is False, the one-hot encoding is added as the last dimension; otherwise, as the first dimension.
- Returns:
one_hot (torch.FloatTensor)
- Example:
>>> one_hot(2*torch.ones([2,2]).long(),3)
tensor([[[0., 0., 1.],
         [0., 0., 1.]],
        [[0., 0., 1.],
         [0., 0., 1.]]])
>>> one_hot(2*torch.ones([2,2]).long(),3,num_first=True)
tensor([[[0., 0.],
         [0., 0.]],
        [[0., 0.],
         [0., 0.]],
        [[1., 1.],
         [1., 1.]]])
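A scatter-based one-hot encoding can be sketched as follows and checked against `torch.nn.functional.one_hot`. The function name `one_hot_sketch` is hypothetical and the code only illustrates the technique; it is not the library's implementation.

```python
import torch
import torch.nn.functional as F


def one_hot_sketch(val: torch.Tensor, num: int, num_first: bool = False) -> torch.Tensor:
    """One-hot encode a LongTensor by scattering ones into a zero tensor."""
    flat = val.reshape(-1, 1)                 # (K, 1), K = number of elements
    ret = torch.zeros(flat.shape[0], num)     # (K, num)
    ret.scatter_(1, flat, 1.0)                # write a 1 at each class index
    if num_first:
        # Move the class dimension to the front: (num, *val.shape)
        return ret.permute(1, 0).reshape(num, *val.shape)
    return ret.reshape(*val.shape, num)       # (*val.shape, num)


val = torch.tensor([[0, 2], [1, 2]])
last = one_hot_sketch(val, 3)
first = one_hot_sketch(val, 3, num_first=True)
```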
binary_encode¶
- Overview:
Convert the elements of a tensor to their binary representation
- Arguments:
y (torch.Tensor): the tensor to be converted to its binary representation
max_val (torch.Tensor): the maximum value of the elements in the tensor
- Returns:
binary (torch.Tensor): the input tensor in its binary representation
- Example:
>>> binary_encode(torch.tensor([3,2]),torch.tensor(8))
tensor([[0, 0, 1, 1],[0, 0, 1, 0]])
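The encoding in the example can be reproduced with bit masks. In this sketch the name `binary_encode_sketch` is hypothetical, `max_val` is taken as a plain Python int rather than a tensor, and inputs are assumed to already lie in [0, max_val]; these are simplifying assumptions.

```python
import torch


def binary_encode_sketch(y: torch.Tensor, max_val: int) -> torch.Tensor:
    """Encode each non-negative integer in `y` as a row of bits, most significant first."""
    num_bits = max_val.bit_length()  # e.g. 8 needs 4 bits
    masks = torch.tensor([1 << s for s in range(num_bits - 1, -1, -1)])
    # Broadcast (..., 1) against (num_bits,) and test each bit.
    return ((y.unsqueeze(-1) & masks) != 0).long()


bits = binary_encode_sketch(torch.tensor([3, 2]), 8)
```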
noise_block¶
- Overview:
Create a fully-connected block with activation, normalization and dropout. Optional normalization can be applied to dim 1 (across the channels).
x -> fc -> norm -> act -> dropout -> out
- Arguments:
in_channels (int): Number of channels in the input tensor
out_channels (int): Number of channels in the output tensor
activation (str): the optional activation function
norm_type (str): type of the normalization
use_dropout (bool): whether to use dropout in the fully-connected block
dropout_probability (float): probability of an element to be zeroed in the dropout. Default: 0.5
sigma0 (float): the default noise volume used when initializing NoiseLinearLayer
- Returns:
block (nn.Sequential): a sequential list containing the torch layers of the fully-connected block
Note
You can refer to nn.Linear (https://pytorch.org/docs/master/generated/torch.nn.Linear.html)
ChannelShuffle¶
- class ding.torch_utils.network.nn_module.ChannelShuffle(group_num: int)[source]¶
- Overview:
Apply channel shuffle to the input tensor
- Interface:
forward
Note
See the original ShuffleNet paper: https://arxiv.org/abs/1707.01083
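Channel shuffle, as described in the ShuffleNet paper, is a reshape, transpose and flatten over the channel dimension. The following is a minimal functional sketch; the name `channel_shuffle_sketch` is hypothetical.

```python
import torch


def channel_shuffle_sketch(x: torch.Tensor, group_num: int) -> torch.Tensor:
    """Interleave channels across groups for a (B, C, H, W) tensor."""
    b, c, h, w = x.shape
    assert c % group_num == 0, "channels must be divisible by group_num"
    x = x.view(b, group_num, c // group_num, h, w)
    x = x.transpose(1, 2).contiguous()  # swap the group and per-group channel axes
    return x.view(b, c, h, w)


x = torch.arange(6).view(1, 6, 1, 1)
shuffled = channel_shuffle_sketch(x, group_num=2)
```

With two groups [0, 1, 2] and [3, 4, 5], the shuffled channel order is 0, 3, 1, 4, 2, 5.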
NearestUpsample¶
BilinearUpsample¶
NoiseLinearLayer¶
- class ding.torch_utils.network.nn_module.NoiseLinearLayer(in_channels: int, out_channels: int, sigma0: int = 0.4)[source]¶
- Overview:
Linear layer with random noise.
- Interface:
reset_noise, reset_parameters, forward
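A linear layer with random noise can be sketched in the style of factorised-Gaussian NoisyNet layers. Everything below is an assumption made for illustration: the class name `NoisyLinearSketch`, the parameter names, the initialization constants, and the choice to apply noise only in training mode. The actual layer may differ.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F


class NoisyLinearSketch(nn.Module):
    """Linear layer whose weights are perturbed by learnable, factorised Gaussian noise."""

    def __init__(self, in_channels: int, out_channels: int, sigma0: float = 0.4) -> None:
        super().__init__()
        self.in_channels, self.out_channels, self.sigma0 = in_channels, out_channels, sigma0
        self.weight_mu = nn.Parameter(torch.empty(out_channels, in_channels))
        self.weight_sigma = nn.Parameter(torch.empty(out_channels, in_channels))
        self.bias_mu = nn.Parameter(torch.empty(out_channels))
        self.bias_sigma = nn.Parameter(torch.empty(out_channels))
        # Noise samples are buffers: saved with the module but not trained.
        self.register_buffer("weight_eps", torch.zeros(out_channels, in_channels))
        self.register_buffer("bias_eps", torch.zeros(out_channels))
        self.reset_parameters()
        self.reset_noise()

    def reset_parameters(self) -> None:
        bound = 1.0 / math.sqrt(self.in_channels)
        nn.init.uniform_(self.weight_mu, -bound, bound)
        nn.init.uniform_(self.bias_mu, -bound, bound)
        nn.init.constant_(self.weight_sigma, self.sigma0 / math.sqrt(self.in_channels))
        nn.init.constant_(self.bias_sigma, self.sigma0 / math.sqrt(self.in_channels))

    @staticmethod
    def _scaled_noise(size: int) -> torch.Tensor:
        x = torch.randn(size)
        return x.sign() * x.abs().sqrt()

    def reset_noise(self) -> None:
        eps_in = self._scaled_noise(self.in_channels)
        eps_out = self._scaled_noise(self.out_channels)
        # The outer product gives one noise value per weight from in + out samples.
        self.weight_eps.copy_(eps_out.unsqueeze(1) * eps_in.unsqueeze(0))
        self.bias_eps.copy_(eps_out)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        if self.training:
            weight = self.weight_mu + self.weight_sigma * self.weight_eps
            bias = self.bias_mu + self.bias_sigma * self.bias_eps
            return F.linear(x, weight, bias)
        # Assumption: evaluation uses the mean parameters only.
        return F.linear(x, self.weight_mu, self.bias_mu)


noisy = NoisyLinearSketch(4, 3)
noisy_in = torch.randn(2, 4)
noisy_train_out = noisy(noisy_in)
```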
network.normalization¶
build_normalization¶
- Overview:
Build the corresponding normalization module
- Arguments:
norm_type (str): type of the normalization; currently supports [‘BN’, ‘IN’, ‘SyncBN’, ‘AdaptiveIN’]
dim (int): dimension of the normalization; used when norm_type is in [‘BN’, ‘IN’]
- Returns:
norm_func (nn.Module): the corresponding normalization module
Note
For beginners, you can refer to <https://zhuanlan.zhihu.com/p/34879333> to learn more about batch normalization.
network.res_block¶
ResBlock¶
- class ding.torch_utils.network.res_block.ResBlock(in_channels: int, activation: torch.nn.modules.module.Module = ReLU(), norm_type: str = 'BN', res_type: str = 'basic')[source]¶
- Overview:
- Residual Block with 2D convolution layers, including 2 types:
- basic block:
input channel: C
x -> 3*3*C -> norm -> act -> 3*3*C -> norm -> act -> out
\__________________________________________/+
- bottleneck block:
x -> 1*1*(1/4*C) -> norm -> act -> 3*3*(1/4*C) -> norm -> act -> 1*1*C -> norm -> act -> out
\_____________________________________________________________________________/+
- Interfaces:
forward
ResFCBlock¶
- class ding.torch_utils.network.res_block.ResFCBlock(in_channels: int, activation: torch.nn.modules.module.Module = ReLU(), norm_type: str = 'BN')[source]¶
- Overview:
Residual Block with 2 fully connected blocks
x -> fc1 -> norm -> act -> fc2 -> norm -> act -> out
\_____________________________________/+
- Interfaces:
forward
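The residual connection in the fully-connected variant can be sketched as below. The placement of the addition (before the final activation) is an assumption, normalization is omitted for brevity, and the class name `ResFCSketch` is hypothetical.

```python
import torch
import torch.nn as nn


class ResFCSketch(nn.Module):
    """Two linear layers with a skip connection added before the last activation."""

    def __init__(self, in_channels: int, activation: nn.Module = None) -> None:
        super().__init__()
        self.act = activation if activation is not None else nn.ReLU()
        self.fc1 = nn.Linear(in_channels, in_channels)
        self.fc2 = nn.Linear(in_channels, in_channels)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        residual = x
        x = self.act(self.fc1(x))
        x = self.fc2(x)
        return self.act(x + residual)  # skip connection


res_block = ResFCSketch(4)
res_out = res_block(torch.randn(3, 4))
```

The input and output share the same channel count, which is why the class takes only `in_channels`.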
network.rnn¶
LSTMForwardWrapper¶
- class ding.torch_utils.network.rnn.LSTMForwardWrapper[source]¶
- Overview:
A class which provides methods to use before and after forward, in order to wrap the LSTM forward method.
- Interfaces:
_before_forward, _after_forward
- _after_forward(next_state: List[Tuple[torch.Tensor]], list_next_state: bool = False) → Union[torch.Tensor, list][source]¶
- Overview:
Post-process the next_state and return it as a list or a tensor
- Arguments:
next_state (List[Tuple[torch.Tensor]]): list of tuples, each containing the next (h, c)
list_next_state (bool): whether to return next_state in list format. Default: False
- Returns:
next_state (Union[torch.Tensor, list]): the formatted next_state
- _before_forward(inputs: torch.Tensor, prev_state: Union[torch.Tensor, list]) → torch.Tensor[source]¶
- Overview:
Preprocess the inputs and previous states
- Arguments:
inputs (torch.Tensor): input vector of the cell, tensor of size [seq_len, batch_size, input_size]
prev_state (Union[torch.Tensor, list]): None or tensor of size [num_directions*num_layers, batch_size, hidden_size]. If None, prev_state will be initialized to all zeros.
- Returns:
prev_state (torch.Tensor): batched previous state of the LSTM
LSTM¶
- class ding.torch_utils.network.rnn.LSTM(input_size: int, hidden_size: int, num_layers: int, norm_type: Optional[str] = None, dropout: float = 0.0)[source]¶
- Overview:
Implementation of LSTM cell
- Interface:
forward
Note
For beginners, you can refer to <https://zhuanlan.zhihu.com/p/32085405> to learn the basics of LSTM
- forward(inputs: torch.Tensor, prev_state: torch.Tensor, list_next_state: bool = True) → Tuple[torch.Tensor, Union[torch.Tensor, list]][source]¶
- Overview:
Take the previous state and the input, and compute the output and the next state
- Arguments:
inputs (torch.Tensor): input vector of the cell, tensor of size [seq_len, batch_size, input_size]
prev_state (torch.Tensor): None or tensor of size [num_directions*num_layers, batch_size, hidden_size]
list_next_state (bool): whether to return next_state in list format. Default: True
- Returns:
x (torch.Tensor): output from the LSTM
next_state (Union[torch.Tensor, list]): hidden state from the LSTM
PytorchLSTM¶
- class ding.torch_utils.network.rnn.PytorchLSTM(*args, **kwargs)[source]¶
- Overview:
Wrap the PyTorch nn.LSTM, format the input and output
- Interface:
forward
Note
You can refer to <https://pytorch.org/docs/stable/generated/torch.nn.LSTM.html#torch.nn.LSTM>
- forward(inputs: torch.Tensor, prev_state: torch.Tensor, list_next_state: bool = True) → Tuple[torch.Tensor, Union[torch.Tensor, list]][source]¶
- Overview:
Wrapped nn.LSTM.forward
- Arguments:
inputs (torch.Tensor): input vector of the cell, tensor of size [seq_len, batch_size, input_size]
prev_state (torch.Tensor): None or tensor of size [num_directions*num_layers, batch_size, hidden_size]
list_next_state (bool): whether to return next_state in list format. Default: True
- Returns:
output (torch.Tensor): output from the LSTM
next_state (Union[torch.Tensor, list]): hidden state from the LSTM
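The tensor shapes listed above follow those of `torch.nn.LSTM`. The snippet below uses plain `nn.LSTM` (not the wrapper class) to show the expected sizes and that omitting the previous state is equivalent to starting from zeros. The sizes are arbitrary example values.

```python
import torch
import torch.nn as nn

seq_len, batch_size, input_size, hidden_size, num_layers = 5, 3, 4, 8, 2
lstm = nn.LSTM(input_size, hidden_size, num_layers)
inputs = torch.randn(seq_len, batch_size, input_size)

# No previous state: nn.LSTM starts from all-zero (h, c).
output, (h, c) = lstm(inputs)

# An explicit zero state gives the same result.
h0 = torch.zeros(num_layers, batch_size, hidden_size)
c0 = torch.zeros(num_layers, batch_size, hidden_size)
output_zero, _ = lstm(inputs, (h0, c0))
```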
get_lstm¶
- Overview:
Build and return the corresponding LSTM cell
- Arguments:
lstm_type (str): version of the LSTM cell; currently supports [‘normal’, ‘pytorch’]
input_size (int): size of the input vector
hidden_size (int): size of the hidden state vector
num_layers (int): number of LSTM layers
norm_type (str): type of the normalization. Default: None
dropout (float): dropout rate. Default: 0.0
seq_len (Optional[int]): sequence length. Default: None
batch_size (Optional[int]): batch size. Default: None
- Returns:
lstm (Union[LSTM, PytorchLSTM]): the corresponding LSTM cell
network.scatter_connection¶
ScatterConnection¶
- class ding.torch_utils.network.scatter_connection.ScatterConnection(scatter_type: str)[source]¶
- Overview:
Scatter features to their corresponding locations. In AlphaStar, each entity is embedded into a tensor, and these tensors are scattered into a feature map with the map size.
- forward(x: torch.Tensor, spatial_size: Tuple[int, int], location: torch.Tensor) → torch.Tensor[source]¶
- Overview:
Scatter x into a spatial feature map
- Arguments:
x (tensor): input tensor of shape \((B, M, N)\), where M is the number of entities and N is the dimension of the entity attributes
spatial_size (tuple): Tuple[H, W], the size of the spatial feature map that x will be scattered into
location (tensor): torch.LongTensor of shape \((B, M, 2)\); each location should be (y, x)
- Returns:
output (tensor): the scattered feature map of shape \((B, N, H, W)\), where H and W are given by spatial_size
- Shapes:
Input: \((B, M, N)\), where M is the number of entities and N is the dimension of the entity attributes
Size: Tuple type \([H, W]\)
Location: \((B, M, 2)\) torch.LongTensor; each location should be (y, x)
Output: \((B, N, H, W)\), where H and W are given by spatial_size
Note
When some locations overlap, cover mode results in loss of information, so addition is used as a temporary substitute.
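The additive scatter with the shapes listed above can be sketched with `scatter_add_`. The function name `scatter_add_sketch` is hypothetical and this is only an illustration of the addition mode, not the library's implementation.

```python
import torch


def scatter_add_sketch(x: torch.Tensor, spatial_size, location: torch.Tensor) -> torch.Tensor:
    """Scatter (B, M, N) entity features into a (B, N, H, W) map, summing overlaps."""
    B, M, N = x.shape
    H, W = spatial_size
    # Flatten each (y, x) location into a single index y * W + x.
    index = location[..., 0] * W + location[..., 1]       # (B, M)
    index = index.unsqueeze(1).repeat(1, N, 1)            # (B, N, M)
    out = torch.zeros(B, N, H * W, dtype=x.dtype)
    out.scatter_add_(2, index, x.permute(0, 2, 1).contiguous())
    return out.view(B, N, H, W)


feat = torch.tensor([[[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]])   # B=1, M=3, N=2
loc = torch.tensor([[[0, 0], [1, 1], [0, 0]]])                # entities 0 and 2 overlap
fmap = scatter_add_sketch(feat, (2, 2), loc)
```

In this example the two entities placed at (0, 0) are summed, so no information is overwritten.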
network.soft_argmax¶
SoftArgmax¶
- class ding.torch_utils.network.soft_argmax.SoftArgmax[source]¶
- Overview:
An nn.Module that computes SoftArgmax
- Interface:
__init__, forward
- forward(x: torch.Tensor) → torch.Tensor[source]¶
- Overview:
Soft-argmax for location regression
- Arguments:
x (torch.Tensor): predicted heat map
- Returns:
location (torch.Tensor): predicted location
- Shapes:
x: \((B, C, H, W)\), where B is the batch size, C is the number of channels, and H and W stand for height and width
location: \((B, 2)\), where B is the batch size
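Soft-argmax takes a softmax over the heat map and returns the expected (row, column) coordinate. The sketch below assumes a single-channel heat map (C = 1), which is consistent with the \((B, 2)\) output shape; the function name `soft_argmax_sketch` is hypothetical.

```python
import torch
import torch.nn.functional as F


def soft_argmax_sketch(x: torch.Tensor) -> torch.Tensor:
    """Return the expected (h, w) location under a softmax over a (B, 1, H, W) heat map."""
    B, C, H, W = x.shape
    assert C == 1, "this sketch assumes a single-channel heat map"
    prob = F.softmax(x.reshape(B, -1), dim=-1).reshape(B, H, W)
    rows = torch.arange(H, dtype=x.dtype).view(1, H, 1)
    cols = torch.arange(W, dtype=x.dtype).view(1, 1, W)
    h = (prob * rows).sum(dim=(1, 2))   # expected row index
    w = (prob * cols).sum(dim=(1, 2))   # expected column index
    return torch.stack([h, w], dim=1)   # (B, 2)


heat = torch.zeros(1, 1, 4, 5)
heat[0, 0, 2, 3] = 50.0                 # sharp peak at row 2, column 3
peak = soft_argmax_sketch(heat)
```

Unlike a hard argmax, this result is differentiable with respect to the heat map.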
network.transformer¶
Attention¶
- class ding.torch_utils.network.transformer.Attention(input_dim: int, head_dim: int, output_dim: int, head_num: int, dropout: torch.nn.modules.module.Module)[source]¶
- Overview:
For each entry embedding, compute individual attention across all entries, add them up to get output attention
- Interfaces:
split, forward
TransformerLayer¶
- class ding.torch_utils.network.transformer.TransformerLayer(input_dim: int, head_dim: int, hidden_dim: int, output_dim: int, head_num: int, mlp_num: int, dropout: torch.nn.modules.module.Module, activation: torch.nn.modules.module.Module)[source]¶
- Overview:
A transformer layer first computes attention over the entries and then applies a feedforward layer
Transformer¶
- class ding.torch_utils.network.transformer.Transformer(input_dim: int, head_dim: int = 128, hidden_dim: int = 1024, output_dim: int = 256, head_num: int = 2, mlp_num: int = 2, layer_num: int = 3, dropout_ratio: float = 0.0, activation: torch.nn.modules.module.Module = ReLU())[source]¶
- Overview:
Transformer implementation
Note
For details, refer to “Attention Is All You Need”: http://arxiv.org/abs/1706.03762
- forward(x: torch.Tensor, mask: Optional[torch.Tensor] = None) → torch.Tensor[source]¶
- Overview:
Transformer forward
- Arguments:
x (torch.Tensor): input tensor. Shape (B, N, C), where B is the batch size, N is the number of entries, and C is the feature dimension
mask (Optional[torch.Tensor]): bool tensor that can be used to mask out invalid entries in attention. Shape (B, N), where B is the batch size and N is the number of entries
- Returns:
x (torch.Tensor): transformer output
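How a (B, N) bool mask removes invalid entries from attention can be sketched with single-head scaled dot-product attention. This illustration omits the projections, multiple heads and dropout of the real modules, and the function name `masked_attention_sketch` is hypothetical.

```python
import math
import torch


def masked_attention_sketch(q, k, v, mask=None):
    """Scaled dot-product attention over (B, N, C) tensors with an optional (B, N) key mask."""
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])    # (B, N, N)
    if mask is not None:
        # False entries are invalid: no query may attend to them.
        scores = scores.masked_fill(~mask.unsqueeze(1), float("-inf"))
    weights = torch.softmax(scores, dim=-1)
    return weights @ v, weights


entries = torch.randn(2, 4, 8)
valid = torch.tensor([[True, True, False, False],
                      [True, True, True, True]])
attn_out, attn_weights = masked_attention_sketch(entries, entries, entries, valid)
```

Each sample needs at least one valid entry; otherwise the softmax over a row of -inf values yields NaN.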