distribution

Pd

class ding.torch_utils.distribution.Pd
Overview:

Abstract class for parameterizable probability distributions and sampling functions.

Interface:

neglogp, entropy, noise_mode, mode, sample

Tip

In derived classes, logits should be stored as an attribute of the class.

entropy() → torch.Tensor
Overview:

Calculate the softmax entropy of logits

Arguments:
  • reduction (str): support [None, ‘mean’], default set to ‘mean’

Returns:
  • entropy (torch.Tensor): the calculated entropy

mode()
Overview:

Return the argmax of logits. This method is intended for deterministic selection.

neglogp(x: torch.Tensor) → torch.Tensor
Overview:

Calculate cross_entropy between input x and logits

Arguments:
  • x (torch.Tensor): the input tensor

Returns:
  • cross_entropy (torch.Tensor): the returned cross_entropy loss

noise_mode()
Overview:

Add noise to logits. This method is intended for stochastic selection.

sample()
Overview:

Sample from the distribution obtained by applying softmax to logits. This method is intended for multinomial sampling.

CategoricalPd

class ding.torch_utils.distribution.CategoricalPd(logits: Optional[torch.Tensor] = None)
Overview:

Categorical probability distribution sampler

Interface:

update_logits, neglogp, entropy, noise_mode, mode, sample

entropy(reduction: str = 'mean') → torch.Tensor
Overview:

Calculate the softmax entropy of logits

Arguments:
  • reduction (str): support [None, ‘mean’], default set to ‘mean’

Returns:
  • entropy (torch.Tensor): the calculated entropy
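
As a rough illustration of what a softmax entropy with a reduction switch can look like, here is a minimal sketch in plain PyTorch. It is not the library's implementation; the helper name `softmax_entropy` and the choice to sum over the last dimension are assumptions made for this example.

```python
import math
from typing import Optional

import torch
import torch.nn.functional as F


def softmax_entropy(logits: torch.Tensor, reduction: Optional[str] = 'mean') -> torch.Tensor:
    """Hypothetical helper: entropy of softmax(logits) along the last dimension."""
    log_p = F.log_softmax(logits, dim=-1)    # numerically stable log-probabilities
    p = log_p.exp()
    entropy = -(p * log_p).sum(dim=-1)       # one entropy value per row
    if reduction is None:
        return entropy
    if reduction == 'mean':
        return entropy.mean()
    raise ValueError("reduction must be None or 'mean'")


# Uniform logits over 3 classes: each row has entropy log(3).
logits = torch.zeros(4, 3)
per_sample = softmax_entropy(logits, reduction=None)   # shape (4,)
mean_entropy = softmax_entropy(logits)                 # scalar tensor
```

With reduction set to None the result keeps one value per batch element; with ‘mean’ it collapses to a scalar.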

mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]
Overview:

Return the argmax of logits

Arguments:
  • viz (bool): Whether to also return logits, noise and noise_logits in numpy form. Short for “visualize”; tensors cannot be displayed directly in TensorBoard (tb) or a text log.

Returns:
  • result (torch.Tensor): the logits argmax result

  • viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
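
The pairing of a tensor result with a dictionary of numpy arrays can be sketched as follows. This is only an illustration in plain PyTorch: the helper name `argmax_with_viz`, the dictionary key `'logits'`, and the choice to return an empty dictionary when `viz` is False are all assumptions, not behaviour confirmed by this page.

```python
from typing import Dict, Tuple

import numpy as np
import torch


def argmax_with_viz(logits: torch.Tensor, viz: bool = False) -> Tuple[torch.Tensor, Dict[str, np.ndarray]]:
    """Hypothetical helper: pick the highest-logit index, optionally exporting logits as numpy."""
    result = logits.argmax(dim=-1)
    viz_feature: Dict[str, np.ndarray] = {}
    if viz:
        # Detach and move to CPU so the values can be logged or plotted.
        viz_feature['logits'] = logits.detach().cpu().numpy()
    return result, viz_feature


logits = torch.tensor([[0.1, 2.0, -1.0], [3.0, 0.0, 0.5]])
result, viz_feature = argmax_with_viz(logits, viz=True)
```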

neglogp(x, reduction: str = 'mean') → torch.Tensor
Overview:

Calculate cross_entropy between input x and logits

Arguments:
  • x (torch.Tensor): the input tensor

  • reduction (str): support [None, ‘mean’], default set to ‘mean’

Returns:
  • cross_entropy (torch.Tensor): the returned cross_entropy loss
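
For a categorical distribution, the cross entropy between class indices and logits equals the negative log-probability of those indices, which is where the name neglogp comes from. The sketch below checks this with plain PyTorch; it assumes `x` holds integer class indices, and it uses PyTorch's own `reduction='none'` string as a stand-in for the None option documented here.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 0.5, -1.0], [0.0, 0.0, 0.0]])
x = torch.tensor([0, 2])   # assumed: integer class indices, one per row

# Per-sample cross entropy (no reduction).
per_sample = F.cross_entropy(logits, x, reduction='none')

# The same quantity written out as -log softmax(logits)[x].
manual = -F.log_softmax(logits, dim=-1).gather(1, x.unsqueeze(1)).squeeze(1)

# Mean reduction averages the per-sample values.
mean_loss = F.cross_entropy(logits, x, reduction='mean')
```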

noise_mode(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]
Overview:

Add noise to logits

Arguments:
  • viz (bool): Whether to also return logits, noise and noise_logits in numpy form. Short for “visualize”; tensors cannot be displayed directly in TensorBoard (tb) or a text log.

Returns:
  • result (torch.Tensor): noised logits

  • viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
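
One common way to add noise to logits is the Gumbel-max trick: add Gumbel(0, 1) noise and take the argmax, which is equivalent to sampling from the softmax distribution. This page does not state which noise is used, so the Gumbel choice, the helper name `gumbel_noise_like`, and the final argmax step are assumptions made purely for illustration.

```python
import torch


def gumbel_noise_like(logits: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical helper: Gumbel(0, 1) noise with the same shape as logits."""
    # Clamp the uniform samples away from 0 and 1 so both logs stay finite.
    u = torch.rand_like(logits).clamp(min=eps, max=1.0 - eps)
    return -torch.log(-torch.log(u))


torch.manual_seed(0)
logits = torch.tensor([[2.0, 0.5, -1.0]])
noise = gumbel_noise_like(logits)
noise_logits = logits + noise            # the noised logits
action = noise_logits.argmax(dim=-1)     # argmax over noised logits
```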

sample(viz: bool = False) → Tuple[torch.Tensor, Dict[str, numpy.ndarray]]
Overview:

Sample from the distribution obtained by applying softmax to logits

Arguments:
  • viz (bool): Whether to also return logits, noise and noise_logits in numpy form. Short for “visualize”; tensors cannot be displayed directly in TensorBoard (tb) or a text log.

Returns:
  • result (torch.Tensor): the logits sampled result

  • viz_feature (Dict[str, np.ndarray]): ndarray type data for visualization.
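
Sampling from logits via softmax can be sketched with `torch.multinomial`, as below. This is a plain PyTorch illustration under the assumption that logits has shape (batch, num_classes); it is not taken from the library's source.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.tensor([[10.0, -10.0, -10.0], [0.0, 0.0, 0.0]])

# Turn logits into a probability vector per row.
probs = F.softmax(logits, dim=-1)

# Draw one class index per row according to those probabilities.
samples = torch.multinomial(probs, num_samples=1).squeeze(-1)
```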

update_logits(logits: torch.Tensor) → None
Overview:

Update logits

Arguments:
  • logits (torch.Tensor): logits to update

CategoricalPdPytorch

class ding.torch_utils.distribution.CategoricalPdPytorch(probs: Optional[torch.Tensor] = None)
Overview:

Wrapped torch.distributions.Categorical

Notes:
Please refer to the torch.distributions.Categorical documentation:
https://pytorch.org/docs/stable/distributions.html?highlight=torch%20distributions#module-torch.distributions

Interface:

update_logits, update_probs, sample, neglogp, mode, entropy
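
Since this class wraps torch.distributions.Categorical, the underlying operations can be illustrated directly with the PyTorch class. The sketch below uses only torch.distributions.Categorical itself; how each wrapper method maps onto these calls is an assumption rather than something stated on this page.

```python
import torch
from torch.distributions import Categorical

probs = torch.tensor([[0.7, 0.2, 0.1], [0.25, 0.25, 0.5]])
dist = Categorical(probs=probs)

actions = dist.sample()               # one class index per row
neglogp = -dist.log_prob(actions)     # negative log-probability of the sampled actions
entropy = dist.entropy()              # per-row entropy, no reduction
mode = dist.probs.argmax(dim=-1)      # most likely class per row

# The same distribution can also be built from logits instead of probabilities.
dist_from_logits = Categorical(logits=torch.log(probs))
```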

entropy(reduction: Optional[str] = None) → torch.Tensor
Overview:

Calculate the softmax entropy of logits

Arguments:
  • reduction (str): support [None, ‘mean’], default set to None

Returns:
  • entropy (torch.Tensor): the calculated entropy

mode() → torch.Tensor
Overview:

Return logits argmax result

Returns:
  • result (torch.Tensor): the logits argmax result

neglogp(actions: torch.Tensor, reduction: str = 'mean') → torch.Tensor
Overview:

Calculate cross_entropy between the input actions and logits

Arguments:
  • actions (torch.Tensor): the input action tensor

  • reduction (str): support [None, ‘mean’], default set to ‘mean’

Returns:
  • cross_entropy (torch.Tensor): the returned cross_entropy loss

sample() → torch.Tensor
Overview:

Sample from the distribution obtained by applying softmax to logits

Returns:
  • result (torch.Tensor): the logits sampled result

update_logits(logits: torch.Tensor) → None
Overview:

Update logits

Arguments:
  • logits (torch.Tensor): logits to update