rl_utils.exploration

exploration

get_epsilon_greedy_fn

Overview:

Generate an epsilon-greedy schedule function with decay, which takes the current timestep as input and returns the current epsilon.

Arguments:
  • start (float): Epsilon start value. For ‘linear’, it should be 1.0.

  • end (float): Epsilon end value.

  • decay (int): Controls how quickly epsilon decreases from start to end. We recommend decaying epsilon by env step rather than by training iteration.

  • type (str): How epsilon decays; currently supports [‘linear’, ‘exp’ (exponential)].

Returns:
  • eps_fn (function): The epsilon greedy function with decay
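
The two decay types described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the name make_epsilon_fn, the default values, and both formulas (a clamped straight line for ‘linear’, an exponential with time constant decay for ‘exp’) are assumptions chosen to match the argument descriptions.

```python
import math
from typing import Callable


def make_epsilon_fn(start: float = 0.95, end: float = 0.05,
                    decay: int = 10000, type_: str = 'exp') -> Callable[[int], float]:
    """Hypothetical stand-in for get_epsilon_greedy_fn; the formulas are assumptions."""
    if type_ == 'linear':
        # Assumed: straight-line drop from start to end over `decay` steps, then hold at end.
        return lambda step: max(end, start - (start - end) * step / decay)
    if type_ == 'exp':
        # Assumed: exponential approach to end with time constant `decay`.
        return lambda step: end + (start - end) * math.exp(-step / decay)
    raise ValueError(f"unsupported decay type: {type_}")


eps_fn = make_epsilon_fn(start=1.0, end=0.1, decay=1000, type_='linear')
eps_at_start = eps_fn(0)     # equals start
eps_halfway = eps_fn(500)    # midway between start and end
eps_after = eps_fn(5000)     # clamped at end once the decay window has passed
```

Passing the env step count as the argument, as recommended for decay, keeps the schedule tied to the amount of experience collected rather than to the number of training iterations.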

BaseNoise

class ding.rl_utils.exploration.BaseNoise
Overview:

Base class for action noise

Interface:

__init__, __call__

Examples:
>>> noise_generator = OUNoise()  # init one type of noise
>>> noise = noise_generator(action.shape, action.device)  # generate noise
abstract __call__(shape: tuple, device: str) → torch.Tensor
Overview:

Generate noise that matches the action tensor’s shape and device

Arguments:
  • shape (tuple): size of the action tensor; the output noise has the same size

  • device (str): device of the action tensor; the output noise is placed on the same device

Returns:
  • noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__() → None
Overview:

Initialization method

GaussianNoise

class ding.rl_utils.exploration.GaussianNoise(mu: float = 0.0, sigma: float = 1.0)
Overview:

Derived class for generating gaussian noise, which satisfies \(X \sim N(\mu, \sigma^2)\)

Interface:

__init__, __call__

__call__(shape: tuple, device: str) → torch.Tensor
Overview:

Generate Gaussian noise that matches the action tensor’s shape and device

Arguments:
  • shape (tuple): size of the action tensor; the output noise has the same size

  • device (str): device of the action tensor; the output noise is placed on the same device

Returns:
  • noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__(mu: float = 0.0, sigma: float = 1.0) → None
Overview:

Initialize \(\mu\) and \(\sigma\) of the Gaussian distribution

Arguments:
  • mu (float): \(\mu\) , mean value

  • sigma (float): \(\sigma\) , standard deviation, should be positive

OUNoise

class ding.rl_utils.exploration.OUNoise(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0)
Overview:

Derived class for generating Ornstein-Uhlenbeck process noise, which satisfies \(dx_t=\theta(\mu-x_t)dt + \sigma dW_t\), where \(W_t\) denotes a Wiener process acting as a random perturbation term.

Interface:

__init__, reset, __call__

__call__(shape: tuple, device: str, mu: Optional[float] = None) → torch.Tensor
Overview:

Generate Ornstein-Uhlenbeck noise that matches the action tensor’s shape and device

Arguments:
  • shape (tuple): size of the action tensor; the output noise has the same size

  • device (str): device of the action tensor; the output noise is placed on the same device

  • mu (float): new mean value \(\mu\); leave it as None to keep the current mean

Returns:
  • noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0) → None
Overview:

Initialize _alpha \(= \theta \cdot dt\) and _beta \(= \sigma \cdot \sqrt{dt}\) for the Ornstein-Uhlenbeck process

Arguments:
  • mu (float): \(\mu\) , mean value

  • sigma (float): \(\sigma\) , standard deviation of the perturbation noise

  • theta (float): how strongly the noise reacts to perturbations, greater value means stronger reaction

  • dt (float): time increment used for each step of the process

  • x0 (float or torch.Tensor): initial state of the noise process, restored by reset

reset() → None
Overview:

Reset _x to the initial state _x0
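
To make the role of theta, dt, and reset concrete, here is a scalar sketch of one way to discretize the stochastic differential equation given in the class overview, using an Euler-Maruyama step. The class name ScalarOUNoise, the seed argument, and the attribute names are hypothetical; the real class works on tensors of a given shape and device and may differ in detail.

```python
import math
import random


class ScalarOUNoise:
    """Hypothetical scalar illustration; not the library's tensor-based OUNoise."""

    def __init__(self, mu=0.0, sigma=0.3, theta=0.15, dt=0.01, x0=0.0, seed=None):
        self.mu = mu
        self.alpha = theta * dt             # mean-reversion factor applied per step
        self.beta = sigma * math.sqrt(dt)   # scale of the random increment per step
        self.x0 = x0
        self.rng = random.Random(seed)
        self.reset()

    def reset(self):
        # Return the process state to its initial value.
        self.x = self.x0

    def __call__(self):
        # Euler-Maruyama step: x <- x + theta*(mu - x)*dt + sigma*sqrt(dt)*N(0, 1)
        self.x += self.alpha * (self.mu - self.x) + self.beta * self.rng.gauss(0.0, 1.0)
        return self.x


noise = ScalarOUNoise(mu=1.0, sigma=0.0, theta=0.5, dt=0.1, x0=0.0)
first = noise()    # with sigma = 0 the step is deterministic: 0 + 0.05 * (1 - 0)
second = noise()   # 0.05 + 0.05 * (1 - 0.05)
noise.reset()      # state goes back to x0
```

Because each sample depends on the previous state, successive values are temporally correlated, unlike independent Gaussian samples. Calling reset at the start of an episode discards that accumulated state.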

create_noise_generator

Overview:

Given the key (noise_type), create a new noise generator instance if the key is registered in noise_mapping; otherwise raise a KeyError. In other words, a derived noise generator must be registered first, after which create_noise_generator can be called to obtain an instance.

Arguments:
  • noise_type (str): the type of noise generator to be created

Returns:
  • noise (BaseNoise): the created new noise generator, should be an instance of one of noise_mapping’s values
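
The register-then-create pattern described above can be sketched with a plain dictionary. Everything here is illustrative: register_noise, ToyGaussianNoise, create_toy_noise_generator, and the keyword-argument forwarding are hypothetical names and choices, and the library's own noise_mapping and registration mechanism may look different.

```python
import random

# Hypothetical registry mapping a noise-type key to a generator class.
noise_mapping = {}


def register_noise(name):
    """Hypothetical decorator that records a class under `name`."""
    def decorator(cls):
        noise_mapping[name] = cls
        return cls
    return decorator


@register_noise('gauss')
class ToyGaussianNoise:
    """Toy scalar generator used only to exercise the registry."""

    def __init__(self, mu=0.0, sigma=1.0):
        self.mu, self.sigma = mu, sigma

    def __call__(self):
        return random.gauss(self.mu, self.sigma)


def create_toy_noise_generator(noise_type, **kwargs):
    # An unregistered key raises KeyError from the dictionary lookup.
    return noise_mapping[noise_type](**kwargs)


generator = create_toy_noise_generator('gauss', sigma=0.5)
try:
    create_toy_noise_generator('unregistered')
    lookup_failed = False
except KeyError:
    lookup_failed = True
```

Keeping the lookup in one place means that supporting a new noise type only requires registering its class, with no change to the code that requests generators by name.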