template.VAC
Please refer to ding/model/template/vac.py for usage.
VAC
- class ding.model.template.VAC(obs_shape: Union[int, ding.utils.type_helper.SequenceType], action_shape: Union[int, ding.utils.type_helper.SequenceType], share_encoder: bool = True, continuous: bool = False, encoder_hidden_size_list: ding.utils.type_helper.SequenceType = [128, 128, 64], actor_head_hidden_size: int = 64, actor_head_layer_num: int = 2, critic_head_hidden_size: int = 64, critic_head_layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None)
- Overview:
The VAC (Value Actor-Critic) model.
- Interfaces:
__init__, forward, compute_actor, compute_critic, compute_actor_critic
- __init__(obs_shape: Union[int, ding.utils.type_helper.SequenceType], action_shape: Union[int, ding.utils.type_helper.SequenceType], share_encoder: bool = True, continuous: bool = False, encoder_hidden_size_list: ding.utils.type_helper.SequenceType = [128, 128, 64], actor_head_hidden_size: int = 64, actor_head_layer_num: int = 2, critic_head_hidden_size: int = 64, critic_head_layer_num: int = 1, activation: Optional[torch.nn.modules.module.Module] = ReLU(), norm_type: Optional[str] = None) → None
- Overview:
Initialize the VAC model according to the arguments.
- Arguments:
  - obs_shape (Union[int, SequenceType]): Observation space shape.
  - action_shape (Union[int, SequenceType]): Action space shape.
  - share_encoder (bool): Whether the actor and critic share one encoder.
  - continuous (bool): Whether the action space is continuous.
  - encoder_hidden_size_list (SequenceType): Collection of hidden_size values to pass to the Encoder.
  - actor_head_hidden_size (Optional[int]): The hidden_size to pass to the actor network's Head.
  - actor_head_layer_num (int): The number of layers used in the actor network's head to compute its output.
  - critic_head_hidden_size (Optional[int]): The hidden_size to pass to the critic network's Head.
  - critic_head_layer_num (int): The number of layers used in the critic network's head to compute the value output.
  - activation (Optional[nn.Module]): The activation function to use in the MLP after layer_fn; if None, it defaults to nn.ReLU().
  - norm_type (Optional[str]): The type of normalization to use; see ding.torch_utils.fc_block for more details.
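To make the roles of these arguments concrete, here is a minimal sketch of an actor-critic network with an optionally shared encoder, written in plain PyTorch. It is not DI-engine's implementation: the class name TinyActorCritic, the helper mlp, and the exact layer layout are assumptions made for illustration, and the sketch only covers a flat observation with a discrete action space.

```python
import torch
import torch.nn as nn


def mlp(sizes, activation=nn.ReLU):
    """Stack Linear layers with an activation after each one (hypothetical helper)."""
    layers = []
    for in_dim, out_dim in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(in_dim, out_dim), activation()]
    return nn.Sequential(*layers)


class TinyActorCritic(nn.Module):
    """Illustrative stand-in, NOT ding.model.template.VAC."""

    def __init__(self, obs_shape, action_shape, share_encoder=True,
                 encoder_hidden_size_list=(128, 128, 64),
                 actor_head_hidden_size=64, critic_head_hidden_size=64):
        super().__init__()
        sizes = [obs_shape, *encoder_hidden_size_list]
        self.share_encoder = share_encoder
        if share_encoder:
            # One encoder feeds both heads.
            self.encoder = mlp(sizes)
        else:
            # Each head gets its own encoder.
            self.actor_encoder = mlp(sizes)
            self.critic_encoder = mlp(sizes)
        feat = encoder_hidden_size_list[-1]
        self.actor_head = nn.Sequential(
            mlp([feat, actor_head_hidden_size]),
            nn.Linear(actor_head_hidden_size, action_shape),
        )
        self.critic_head = nn.Sequential(
            mlp([feat, critic_head_hidden_size]),
            nn.Linear(critic_head_hidden_size, 1),
        )

    def compute_actor_critic(self, x):
        if self.share_encoder:
            actor_feat = critic_feat = self.encoder(x)
        else:
            actor_feat = self.actor_encoder(x)
            critic_feat = self.critic_encoder(x)
        logit = self.actor_head(actor_feat)                # (B, action_shape)
        value = self.critic_head(critic_feat).squeeze(-1)  # (B, 1) -> (B,)
        return {'logit': logit, 'value': value}


model = TinyActorCritic(obs_shape=64, action_shape=6)
out = model.compute_actor_critic(torch.randn(4, 64))
print(out['logit'].shape, out['value'].shape)
```

With share_encoder=False the sketch builds two separate encoders, so the actor and critic no longer share any parameters.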
- compute_actor(x: torch.Tensor) → Dict
- Overview:
Execute the forward computation in 'compute_actor' mode. Use the encoded embedding tensor to predict the output.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor, determined by the given hidden_size, i.e. (B, N=hidden_size), where hidden_size = actor_head_hidden_size.
- Returns:
  - outputs (Dict): The result of running the encoder and the actor head.
- ReturnsKeys:
  - logit (torch.Tensor): Logit tensor of shape (B, N), where N is action_shape.
- Shapes:
  - logit (torch.FloatTensor): \((B, N)\), where B is batch size and N is action_shape.
- Examples:
>>> import torch
>>> from ding.model.template import VAC
>>> model = VAC(64, 64)
>>> inputs = torch.randn(4, 64)
>>> actor_outputs = model(inputs, 'compute_actor')
>>> assert actor_outputs['logit'].shape == torch.Size([4, 64])
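As a rough illustration of how one row of the returned logit might be consumed for a discrete action space, the sketch below converts logits into probabilities with a numerically stable softmax and samples an action index. It uses only the standard library; the helper names softmax and sample_action are hypothetical, and the page above does not say how logits are post-processed, so treat this as an assumption about typical usage.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_action(logits, rng):
    """Sample one action index in proportion to the softmax probabilities."""
    probs = softmax(logits)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


row = [2.0, 0.5, -1.0]  # stands in for one row of a logit tensor, e.g. logit[i].tolist()
probs = softmax(row)
action = sample_action(row, random.Random(0))
print(probs, action)
```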
- compute_actor_critic(x: torch.Tensor) → Dict
- Overview:
Execute the forward computation in 'compute_actor_critic' mode. Use the encoded embedding tensor to predict the output.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor.
- Returns:
  - outputs (Dict): The result of running the encoder and both heads.
- ReturnsKeys:
  - logit (torch.Tensor): Logit tensor of shape (B, N), where N is action_shape.
  - value (torch.Tensor): Value tensor with one entry per sample in the batch.
- Shapes:
  - logit (torch.FloatTensor): \((B, N)\), where B is batch size and N is action_shape.
  - value (torch.FloatTensor): \((B, )\), where B is batch size.
- Examples:
>>> import torch
>>> from ding.model.template import VAC
>>> model = VAC(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = model(inputs, 'compute_actor_critic')
>>> outputs['value']
tensor([0.0252, 0.0235, 0.0201, 0.0072], grad_fn=<SqueezeBackward1>)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
Note
The compute_actor_critic interface aims to save computation when the encoder is shared. It returns the combined dictionary of actor and critic outputs.
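The saving mentioned in the note can be sketched without any deep-learning library: when one encoder is shared, a combined call encodes the input once, whereas calling the actor and critic paths separately encodes it twice. Everything below (CountingEncoder, the placeholder heads, and the module-level functions) is hypothetical and only mirrors the idea, not VAC's code.

```python
class CountingEncoder:
    """Hypothetical encoder stand-in that counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def __call__(self, x):
        self.calls += 1
        return [2.0 * v for v in x]  # placeholder "encoding"


def actor_head(feat):
    return {'logit': feat}  # placeholder actor output


def critic_head(feat):
    return {'value': sum(feat)}  # placeholder critic output


def compute_actor(encoder, x):
    return actor_head(encoder(x))


def compute_critic(encoder, x):
    return critic_head(encoder(x))


def compute_actor_critic(encoder, x):
    feat = encoder(x)  # encode once, reuse for both heads
    return {**actor_head(feat), **critic_head(feat)}


separate = CountingEncoder()
compute_actor(separate, [1.0, 2.0])
compute_critic(separate, [1.0, 2.0])

combined = CountingEncoder()
both = compute_actor_critic(combined, [1.0, 2.0])

print(separate.calls, combined.calls)  # 2 1
```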
- compute_critic(x: torch.Tensor) → Dict
- Overview:
Execute the forward computation in 'compute_critic' mode. Use the encoded embedding tensor to predict the output.
- Arguments:
  - x (torch.Tensor): The encoded embedding tensor, determined by the given hidden_size, i.e. (B, N=hidden_size), where hidden_size = critic_head_hidden_size.
- Returns:
  - outputs (Dict): The result of running the encoder and the critic head.
    - Necessary Keys:
      - value (torch.Tensor): Value tensor with one entry per sample in the batch.
- Shapes:
  - value (torch.FloatTensor): \((B, )\), where B is batch size.
- Examples:
>>> import torch
>>> from ding.model.template import VAC
>>> model = VAC(64, 64)
>>> inputs = torch.randn(4, 64)
>>> critic_outputs = model(inputs, 'compute_critic')
>>> critic_outputs['value']
tensor([0.0252, 0.0235, 0.0201, 0.0072], grad_fn=<SqueezeBackward1>)
- forward(inputs: Union[torch.Tensor, Dict], mode: str) → Dict
- Overview:
Use the encoded embedding tensor to predict the output by running the forward pass of VAC's MLPs in the given mode.
- Arguments:
  - Forward with 'compute_actor' or 'compute_critic':
    - inputs (torch.Tensor): The encoded embedding tensor, determined by the given hidden_size, i.e. (B, N=hidden_size). Whether hidden_size is actor_head_hidden_size or critic_head_hidden_size depends on mode.
  - mode (str): The name of the forward mode; the examples below use 'compute_actor', 'compute_critic', and 'compute_actor_critic'.
- Returns:
  - outputs (Dict): The result of running the encoder and the head.
    - Forward with 'compute_actor', Necessary Keys:
      - logit (torch.Tensor): Logit tensor of shape (B, N), where N is action_shape.
    - Forward with 'compute_critic', Necessary Keys:
      - value (torch.Tensor): Value tensor with one entry per sample in the batch.
- Shapes:
  - inputs (torch.Tensor): \((B, N)\), where B is batch size and N is the corresponding hidden_size.
  - logit (torch.FloatTensor): \((B, N)\), where B is batch size and N is action_shape.
  - value (torch.FloatTensor): \((B, )\), where B is batch size.
- Actor Examples:
>>> import torch
>>> from ding.model.template import VAC
>>> model = VAC(64, 128)
>>> inputs = torch.randn(4, 64)
>>> actor_outputs = model(inputs, 'compute_actor')
>>> assert actor_outputs['logit'].shape == torch.Size([4, 128])
- Critic Examples:
>>> model = VAC(64, 64)
>>> inputs = torch.randn(4, 64)
>>> critic_outputs = model(inputs, 'compute_critic')
>>> critic_outputs['value']
tensor([0.0252, 0.0235, 0.0201, 0.0072], grad_fn=<SqueezeBackward1>)
- Actor-Critic Examples:
>>> model = VAC(64, 64)
>>> inputs = torch.randn(4, 64)
>>> outputs = model(inputs, 'compute_actor_critic')
>>> outputs['value']
tensor([0.0252, 0.0235, 0.0201, 0.0072], grad_fn=<SqueezeBackward1>)
>>> assert outputs['logit'].shape == torch.Size([4, 64])
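The mode argument selects which computation runs. One common way to implement such an interface is to look up a method by the mode name; the sketch below shows that pattern on a toy class. ModeDispatchModel and its placeholder methods are hypothetical, and whether VAC dispatches in exactly this way is an assumption, not something stated on this page.

```python
class ModeDispatchModel:
    """Hypothetical model that routes forward(inputs, mode) to the method named by mode."""

    modes = ('compute_actor', 'compute_critic', 'compute_actor_critic')

    def forward(self, inputs, mode):
        # Reject unknown modes early with a readable error.
        if mode not in self.modes:
            raise ValueError(f"unsupported mode {mode!r}; expected one of {self.modes}")
        return getattr(self, mode)(inputs)

    def compute_actor(self, inputs):
        return {'logit': list(inputs)}  # placeholder actor output

    def compute_critic(self, inputs):
        return {'value': sum(inputs)}  # placeholder critic output

    def compute_actor_critic(self, inputs):
        return {**self.compute_actor(inputs), **self.compute_critic(inputs)}


model = ModeDispatchModel()
result = model.forward([1.0, 2.0], 'compute_actor_critic')
print(sorted(result))  # ['logit', 'value']
```

Validating the mode before the lookup keeps a typo such as 'compute_actr' from surfacing as an opaque AttributeError.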