distribution¶
Pd¶
- class ding.torch_utils.distribution.Pd[source]¶
- Overview:
Abstract class for parameterizable probability distributions and sampling functions.
- Interface:
neglogp, entropy, noise_mode, mode, sample
Tip
In derived classes, logits should be stored as an attribute of the class.
- entropy() → torch.Tensor[source]¶
- Overview:
Calculate the softmax entropy of logits
- Arguments:
reduction (str): supports [None, ‘mean’]; defaults to ‘mean’
- Returns:
entropy (torch.Tensor): the calculated entropy
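As a rough illustration of the interface listed for Pd, the sketch below uses Python's abc module to declare the five methods as abstract and keeps logits as an attribute, as the tip suggests. PdSketch and GreedyPd are hypothetical names invented for this example and work on plain Python lists; the actual ding class may be organised differently.

```python
from abc import ABC, abstractmethod


class PdSketch(ABC):
    """Hypothetical stand-in for the Pd interface (not the ding source)."""

    def __init__(self, logits):
        # Derived classes keep the logits as an attribute member.
        self.logits = logits

    @abstractmethod
    def neglogp(self, x):
        ...

    @abstractmethod
    def entropy(self):
        ...

    @abstractmethod
    def noise_mode(self):
        ...

    @abstractmethod
    def mode(self):
        ...

    @abstractmethod
    def sample(self):
        ...


class GreedyPd(PdSketch):
    """Toy subclass: only mode() does real work, the rest are placeholders."""

    def mode(self):
        # Index of the largest logit.
        return max(range(len(self.logits)), key=self.logits.__getitem__)

    def neglogp(self, x):
        raise NotImplementedError

    def entropy(self):
        raise NotImplementedError

    def noise_mode(self):
        raise NotImplementedError

    def sample(self):
        raise NotImplementedError


print(GreedyPd([0.1, 2.0, -1.0]).mode())  # prints 1
```

Instantiating PdSketch directly raises TypeError, which is how the abc module enforces that a subclass supplies every method in the interface.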
CategoricalPd¶
- class ding.torch_utils.distribution.CategoricalPd(logits: Optional[torch.Tensor] = None)[source]¶
- Overview:
Categorical probability distribution sampler
- Interface:
update_logits, neglogp, entropy, noise_mode, mode, sample
- entropy(reduction: str = 'mean') → torch.Tensor[source]¶
- Overview:
Calculate the softmax entropy of logits
- Arguments:
reduction (str): supports [None, ‘mean’]; defaults to ‘mean’
- Returns:
entropy (torch.Tensor): the calculated entropy
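To make "softmax entropy of logits" concrete, here is a minimal plain-Python sketch of the quantity, H = -Σ p·log p with p = softmax(logits), including the two reduction modes. The function names and the list-of-rows input are assumptions made for illustration; the library itself operates on torch.Tensor and its code is not reproduced here.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_entropy(batch_logits, reduction="mean"):
    """Entropy of softmax(logits) per row, optionally averaged over rows."""
    per_row = []
    for logits in batch_logits:
        probs = softmax(logits)
        # Skip zero probabilities: p * log(p) tends to 0 as p tends to 0.
        per_row.append(-sum(p * math.log(p) for p in probs if p > 0.0))
    if reduction == "mean":
        return sum(per_row) / len(per_row)
    if reduction is None:
        return per_row
    raise ValueError(f"unsupported reduction: {reduction!r}")


batch = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0]]
print(softmax_entropy(batch, reduction=None))  # one entropy per row
print(softmax_entropy(batch))                  # mean over the batch
```

Uniform logits over four classes give ln 4 (about 1.386), the largest entropy possible for four outcomes.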
- mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]][source]¶
- Overview:
Return the argmax of the logits
- Arguments:
viz (bool): Whether to also return the numpy form of logits, noise and noise_logits; short for “visualize”. (Tensor types cannot be visualized in TensorBoard or a text log.)
- Returns:
result (torch.Tensor): the argmax of the logits
viz_feature (Dict[str, np.ndarray]): ndarray data for visualization
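Because softmax is monotonic, the most probable class is simply the index of the largest logit, so no normalisation is needed. The sketch below shows this on plain Python lists; argmax_mode is a hypothetical helper name, and the viz output is left out.

```python
def argmax_mode(batch_logits):
    """Index of the largest logit in each row (ties go to the first index)."""
    return [max(range(len(row)), key=row.__getitem__) for row in batch_logits]


print(argmax_mode([[0.1, 2.0, -1.0], [3.0, 3.0, 0.0]]))  # prints [1, 0]
```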
- neglogp(x, reduction: str = 'mean') → torch.Tensor[source]¶
- Overview:
Calculate cross_entropy between input x and logits
- Arguments:
x (torch.Tensor): the input tensor
reduction (str): supports [None, ‘mean’]; defaults to ‘mean’
- Returns:
cross_entropy (torch.Tensor): the returned cross_entropy loss
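The negative log-probability can be read as a cross-entropy: for each row, take log-softmax of the logits and negate the entry selected by x. The sketch below assumes that x holds integer class indices, one per row, and uses plain Python lists; both the assumption and the helper names are illustrative, not taken from the ding source.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one row of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def neglogp(batch_logits, x, reduction="mean"):
    """Cross-entropy between class indices x and softmax(logits)."""
    losses = [-log_softmax(row)[idx] for row, idx in zip(batch_logits, x)]
    if reduction == "mean":
        return sum(losses) / len(losses)
    if reduction is None:
        return losses
    raise ValueError(f"unsupported reduction: {reduction!r}")


logits = [[0.0, 0.0], [0.0, 5.0]]
print(neglogp(logits, [1, 1], reduction=None))  # per-sample losses
print(neglogp(logits, [1, 1]))                  # batch mean
```

The first row is uniform over two classes, so its loss is ln 2; the second row strongly favours class 1, so its loss is close to zero.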
- noise_mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]][source]¶
- Overview:
Add noise to the logits
- Arguments:
viz (bool): Whether to also return the numpy form of logits, noise and noise_logits; short for “visualize”. (Tensor types cannot be visualized in TensorBoard or a text log.)
- Returns:
result (torch.Tensor): noised logits
viz_feature (Dict[str, np.ndarray]): ndarray data for visualization
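The text does not say which noise is added. One common choice, shown here purely as an assumption, is Gumbel noise: adding g = -log(-log(u)) with u drawn uniformly from (0, 1) to each logit and taking the argmax yields a sample from softmax(logits) (the Gumbel-max trick). The helper names below are hypothetical and the sketch works on a single list of logits.

```python
import math
import random


def gumbel_noise(rng):
    """Draw one standard Gumbel sample from a uniform variate."""
    u = rng.random()
    while u == 0.0:  # random() may return exactly 0.0; log(0) is undefined
        u = rng.random()
    return -math.log(-math.log(u))


def noise_mode(logits, rng):
    """Return (argmax of the noisy logits, the noisy logits themselves)."""
    noisy = [v + gumbel_noise(rng) for v in logits]
    return max(range(len(noisy)), key=noisy.__getitem__), noisy


rng = random.Random(0)
index, noisy_logits = noise_mode([0.0, math.log(3.0)], rng)
print(index, noisy_logits)
```

With logits [0, ln 3] the softmax probabilities are 0.25 and 0.75, so over many draws index 1 should come up about three times in four.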
- sample(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]][source]¶
- Overview:
Sample from the distribution obtained by applying softmax to the logits
- Arguments:
viz (bool): Whether to also return the numpy form of logits, noise and noise_logits; short for “visualize”. (Tensor types cannot be visualized in TensorBoard or a text log.)
- Returns:
result (torch.Tensor): the result sampled from the logits
viz_feature (Dict[str, np.ndarray]): ndarray data for visualization
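Sampling "by using softmax" amounts to drawing a class index with probability proportional to exp(logit). A minimal standard-library sketch for a single list of logits follows; sample_from_logits is a hypothetical name, and the library's own method works on tensors rather than lists.

```python
import math
import random


def sample_from_logits(logits, rng):
    """Draw one class index with probability softmax(logits)[index]."""
    m = max(logits)
    # Unnormalised softmax weights; random.choices normalises them itself.
    weights = [math.exp(v - m) for v in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)
draws = [sample_from_logits([0.0, math.log(3.0)], rng) for _ in range(10)]
print(draws)
```

Unlike mode, which always returns the same index for given logits, repeated calls here return different indices in proportion to their softmax probabilities.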
CategoricalPdPytorch¶
- class ding.torch_utils.distribution.CategoricalPdPytorch(probs: Optional[torch.Tensor] = None)[source]¶
- Overview:
Wrapper around torch.distributions.Categorical
- Notes:
Please refer to the torch.distributions.Categorical doc
- Interface:
update_logits, update_probs, sample, neglogp, mode, entropy
- entropy(reduction: Optional[str] = None) → torch.Tensor[source]¶
- Overview:
Calculate the softmax entropy of logits
- Arguments:
reduction (str): supports [None, ‘mean’]; defaults to None
- Returns:
entropy (torch.Tensor): the calculated entropy
- mode() → torch.Tensor[source]¶
- Overview:
Return logits argmax result
- Returns:
result (torch.Tensor): the logits argmax result
- neglogp(actions: torch.Tensor, reduction: str = 'mean') → torch.Tensor[source]¶
- Overview:
Calculate cross_entropy between the input actions and logits
- Arguments:
actions (torch.Tensor): the input action tensor
reduction (str): supports [None, ‘mean’]; defaults to ‘mean’
- Returns:
cross_entropy (torch.Tensor): the returned cross_entropy loss
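This class is constructed from probs, while its interface also accepts logits. The two parameterisations describe the same distribution: logits are unnormalised log-probabilities, so softmax(log(probs)) recovers probs, and adding a constant to every logit changes nothing. The plain-Python sketch below illustrates that relationship only; the helper names are hypothetical and it does not call the wrapper itself.

```python
import math


def probs_from_logits(logits):
    """Softmax: turn unnormalised log-probabilities into probabilities."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_from_probs(probs):
    """One valid set of logits for strictly positive probabilities."""
    return [math.log(p) for p in probs]


probs = [0.2, 0.3, 0.5]
logits = logits_from_probs(probs)
round_trip = probs_from_logits(logits)
# Shifting every logit by the same constant leaves the distribution unchanged.
shifted = probs_from_logits([v + 7.0 for v in logits])
print(round_trip, shifted)
```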