distribution

Pd

class ding.torch_utils.distribution.Pd

Overview: Abstract class for parameterizable probability distributions and sampling functions.
Interface: neglogp, entropy, noise_mode, mode, sample

Tip

In derived classes, logits should be stored as an attribute of the class.

entropy() → torch.Tensor

Overview:

Calculate the softmax entropy of logits

Arguments:

reduction (str): reduction mode, one of [None, ‘mean’]; defaults to ‘mean’

Returns:

entropy (torch.Tensor): the calculated entropy

mode()

Overview: Return the argmax of the logits. This method is intended for deterministic selection.

neglogp(x: torch.Tensor) → torch.Tensor

Overview:

Calculate cross_entropy between input x and logits

Arguments:

x (torch.Tensor): the input tensor

Return:

cross_entropy (torch.Tensor): the returned cross_entropy loss

noise_mode()

Overview: Add noise to the logits. This method is intended to introduce randomness.

sample()

Overview: Sample from the distribution defined by the logits, using softmax. This method is intended for multinomial sampling.

CategoricalPd

class ding.torch_utils.distribution.CategoricalPd(logits: Optional[torch.Tensor] = None)

Overview: Categorical probability distribution sampler
Interface: update_logits, neglogp, entropy, noise_mode, mode, sample

entropy(reduction: str = 'mean') → torch.Tensor

Overview:

Calculate the softmax entropy of logits

Arguments:

reduction (str): reduction mode, one of [None, ‘mean’]; defaults to ‘mean’

Returns:

entropy (torch.Tensor): the calculated entropy
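As a rough illustration of what "softmax entropy of logits" means, the sketch below computes it with plain PyTorch. The helper name `softmax_entropy` is hypothetical and not part of ding, and the handling of `reduction` is an assumption based on the argument description above.

```python
from typing import Optional

import torch
import torch.nn.functional as F


def softmax_entropy(logits: torch.Tensor, reduction: Optional[str] = 'mean') -> torch.Tensor:
    """Hypothetical helper: entropy of softmax(logits) over the last dimension."""
    log_p = F.log_softmax(logits, dim=-1)
    p = log_p.exp()
    entropy = -(p * log_p).sum(dim=-1)  # one value per row of logits
    if reduction is None:
        return entropy
    if reduction == 'mean':
        return entropy.mean()
    raise ValueError(f"unsupported reduction: {reduction}")


logits = torch.zeros(4, 3)  # uniform distribution over 3 categories
per_sample = softmax_entropy(logits, reduction=None)  # shape (4,)
mean_entropy = softmax_entropy(logits)                # scalar tensor
```

For uniform logits over three categories, each per-sample entropy equals log(3), the maximum possible for three outcomes.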

mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]

Overview:

Return the argmax of the logits

Arguments:

viz (bool): Whether to also return NumPy versions of logits, noise and noise_logits.
Short for “visualize”, since tensors cannot be displayed directly in TensorBoard or text logs.

Returns:

result (torch.Tensor): the logits argmax result
viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
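A minimal sketch of the deterministic selection described here, written with plain PyTorch rather than the ding class. The key used in `viz_feature` is an assumption for illustration only.

```python
import torch

logits = torch.tensor([[0.1, 2.0, -1.0],
                       [1.5, 0.3, 0.2]])

# Index of the largest logit in each row.
result = logits.argmax(dim=-1)

# Assumed shape of the visualization dict: NumPy copies of the tensors involved.
viz_feature = {'logits': logits.detach().cpu().numpy()}
```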

neglogp(x, reduction: str = 'mean') → torch.Tensor

Overview:

Calculate cross_entropy between input x and logits

Arguments:

x (torch.Tensor): the input tensor
reduction (str): reduction mode, one of [None, ‘mean’]; defaults to ‘mean’

Return:

cross_entropy (torch.Tensor): the returned cross_entropy loss
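The relationship between cross-entropy and negative log-probability can be sketched with `torch.nn.functional.cross_entropy`. This is an illustration under the assumption that `x` holds one category index per row; mapping the documented `None` reduction to PyTorch's `'none'` is also an assumption.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, 0.1],
                       [0.2, 0.2, 3.0]])
x = torch.tensor([0, 2])  # one category index per row

per_sample = F.cross_entropy(logits, x, reduction='none')  # shape (2,)
mean_loss = F.cross_entropy(logits, x, reduction='mean')   # scalar tensor

# Cross-entropy against a class index equals the negative
# log-probability of that class under softmax(logits).
log_p = F.log_softmax(logits, dim=-1)
manual = -log_p.gather(-1, x.unsqueeze(-1)).squeeze(-1)
```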

noise_mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]

Overview:

Add noise to the logits

Arguments:

viz (bool): Whether to also return NumPy versions of logits, noise and noise_logits.
Short for “visualize”, since tensors cannot be displayed directly in TensorBoard or text logs.

Returns:

result (torch.Tensor): noised logits
viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
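The documentation does not say which noise is added. One common choice is Gumbel noise followed by an argmax (the Gumbel-max trick), sketched below with plain PyTorch. The noise type and the final argmax step are assumptions, not a statement about the ding implementation.

```python
import torch

torch.manual_seed(0)
logits = torch.tensor([[1.0, 1.0, 1.0]])

# Uniform samples in [0, 1), clamped away from 0 to keep the logs finite.
u = torch.rand_like(logits).clamp(min=1e-10)
noise = -torch.log(-torch.log(u))  # Gumbel(0, 1) noise (assumed noise type)
noise_logits = logits + noise

# Taking the argmax of Gumbel-perturbed logits draws a sample
# from the softmax distribution over the original logits.
result = noise_logits.argmax(dim=-1)
```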

sample(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]

Overview:

Sample from the distribution defined by the logits, using softmax

Arguments:

viz (bool): Whether to also return NumPy versions of logits, noise and noise_logits.
Short for “visualize”, since tensors cannot be displayed directly in TensorBoard or text logs.

Returns:

result (torch.Tensor): the logits sampled result
viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
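Sampling from a softmax distribution can be sketched with `torch.multinomial`. This is a plain PyTorch illustration of the idea, not necessarily how ding implements `sample`.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([[0.0, 0.0, 10.0],
                       [10.0, 0.0, 0.0]])

# Convert logits to probabilities, then draw one category index per row.
probs = F.softmax(logits, dim=-1)
result = torch.multinomial(probs, num_samples=1).squeeze(-1)  # shape (2,)
```

Unlike the argmax used for deterministic selection, repeated calls can return different indices, with frequencies that follow `probs`.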

update_logits(logits: torch.Tensor) → None

Overview:

Update logits

Arguments:

logits (torch.Tensor): logits to update

CategoricalPdPytorch

class ding.torch_utils.distribution.CategoricalPdPytorch(probs: Optional[torch.Tensor] = None)

Overview:

Wrapped torch.distributions.Categorical

Notes:

Please refer to the torch.distributions.Categorical documentation:

https://pytorch.org/docs/stable/distributions.html?highlight=torch%20distributions#module-torch.distributions

Interface:

update_logits, update_probs, sample, neglogp, mode, entropy
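Because this class wraps `torch.distributions.Categorical`, the operations in the interface list can be illustrated with the underlying PyTorch class directly. The sketch below does not use the ding wrapper. How each wrapper method maps onto these calls is an assumption.

```python
import torch
from torch.distributions import Categorical

probs = torch.tensor([[0.7, 0.2, 0.1],
                      [0.1, 0.1, 0.8]])
dist = Categorical(probs=probs)

torch.manual_seed(0)
actions = dist.sample()            # one category index per row, shape (2,)
neglogp = -dist.log_prob(actions)  # per-sample negative log-probability
entropy = dist.entropy()           # per-sample entropy, shape (2,)
mode = dist.probs.argmax(dim=-1)   # most likely category per row
```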

entropy(reduction: Optional[str] = None) → torch.Tensor

Overview:

Calculate the softmax entropy of logits

Arguments:

reduction (str): reduction mode, one of [None, ‘mean’]; defaults to None

Returns:

entropy (torch.Tensor): the calculated entropy

mode() → torch.Tensor

Overview:

Return the argmax of the logits

Return:

result (torch.Tensor): the logits argmax result

neglogp(actions: torch.Tensor, reduction: str = 'mean') → torch.Tensor

Overview:

Calculate cross_entropy between input actions and logits

Arguments:

actions (torch.Tensor): the input action tensor
reduction (str): reduction mode, one of [None, ‘mean’]; defaults to ‘mean’

Return:

cross_entropy (torch.Tensor): the returned cross_entropy loss

sample() → torch.Tensor

Overview:

Sample from the distribution defined by the logits, using softmax

Return:

result (torch.Tensor): the logits sampled result

update_logits(logits: torch.Tensor) → None

Overview:

Update logits

Arguments:

logits (torch.Tensor): logits to update
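The class can be parameterized by either probabilities (the constructor argument) or logits (this method). In `torch.distributions.Categorical` itself the two are related by softmax, which the sketch below checks. It uses the PyTorch class directly rather than the ding wrapper.

```python
import torch
from torch.distributions import Categorical

logits = torch.tensor([[1.0, 2.0, 3.0]])

# Two parameterizations of the same distribution.
from_logits = Categorical(logits=logits)
from_probs = Categorical(probs=torch.softmax(logits, dim=-1))

same_probs = torch.allclose(from_logits.probs, from_probs.probs)
same_entropy = torch.allclose(from_logits.entropy(), from_probs.entropy())
```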