rl_utils.exploration

exploration

get_epsilon_greedy_fn

Overview:

Generate an epsilon-greedy schedule function with decay, which takes the current timestep as input and returns the current epsilon.

Arguments:

start (float): Epsilon start value. For ‘linear’, it should be 1.0.
end (float): Epsilon end value.
decay (int): Controls the speed at which epsilon decreases from start to end. We recommend decaying epsilon by env step rather than by iteration.
type (str): How epsilon decays; currently supports ‘linear’ and ‘exp’ (exponential).

Returns:

eps_fn (function): The epsilon greedy function with decay
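
The two decay types can be illustrated with a minimal sketch. The function name `epsilon_schedule_sketch` and the exact decay formulas below are assumptions made for illustration; the library's own implementation may differ in detail.

```python
import math


def epsilon_schedule_sketch(start, end, decay, type_="exp"):
    """Hypothetical stand-in for an epsilon schedule factory (not the library code)."""
    if type_ == "linear":
        def eps_fn(step):
            # Assumed form: interpolate from start to end over `decay` steps, then hold at end.
            frac = min(step / decay, 1.0)
            return start + (end - start) * frac
        return eps_fn
    if type_ == "exp":
        def eps_fn(step):
            # Assumed form: the gap to `end` shrinks by a factor of e every `decay` steps.
            return end + (start - end) * math.exp(-step / decay)
        return eps_fn
    raise ValueError(f"unsupported decay type: {type_!r}")


linear_eps = epsilon_schedule_sketch(1.0, 0.1, decay=100, type_="linear")
exp_eps = epsilon_schedule_sketch(0.95, 0.05, decay=100, type_="exp")
for step in (0, 50, 100, 200):
    print(step, linear_eps(step), exp_eps(step))
```

Under these assumptions the linear schedule reaches end after exactly decay steps, while the exponential schedule only approaches end asymptotically.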

BaseNoise

class ding.rl_utils.exploration.BaseNoise

Overview:

Base class for action noise

Interface:

__init__, __call__

Examples:

>>> from ding.rl_utils.exploration import OUNoise
>>> noise_generator = OUNoise()  # init one type of noise
>>> noise = noise_generator(action.shape, action.device)  # `action` is an existing torch.Tensor

abstract __call__(shape: tuple, device: str) → torch.Tensor

Overview:

Generate noise that matches the action tensor’s shape and device

Arguments:

shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device

Returns:

noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__() → None

Overview:

Initialization method

GaussianNoise

class ding.rl_utils.exploration.GaussianNoise(mu: float = 0.0, sigma: float = 1.0)

Overview:

Derived class for generating Gaussian noise, which satisfies \(X \sim N(\mu, \sigma^2)\)

Interface:

__init__, __call__

__call__(shape: tuple, device: str) → torch.Tensor

Overview:

Generate Gaussian noise that matches the action tensor’s shape and device

Arguments:

shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device

Returns:

noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__(mu: float = 0.0, sigma: float = 1.0) → None

Overview:

Initialize \(\mu\) and \(\sigma\) of the Gaussian distribution

Arguments:

mu (float): \(\mu\) , mean value
sigma (float): \(\sigma\) , standard deviation, should be positive

OUNoise

class ding.rl_utils.exploration.OUNoise(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0)

Overview:

Derived class for generating Ornstein-Uhlenbeck process noise. It satisfies \(dx_t=\theta(\mu-x_t)dt + \sigma dW_t\), where \(W_t\) denotes a Wiener process, acting as a random perturbation term.

Interface:

__init__, reset, __call__

__call__(shape: tuple, device: str, mu: Optional[float] = None) → torch.Tensor

Overview:

Generate Ornstein-Uhlenbeck noise that matches the action tensor’s shape and device

Arguments:

shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device
mu (float): new mean value \(\mu\); pass None if no new mean is needed

Returns:

noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor

__init__(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0) → None

Overview:

Initialize _alpha \(= \theta * dt\) and _beta \(= \sigma * \sqrt{dt}\) in the Ornstein-Uhlenbeck process

Arguments:

mu (float): \(\mu\) , mean value
sigma (float): \(\sigma\) , standard deviation of the perturbation noise
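theta (float): how strongly the noise reacts to perturbations; a greater value means a stronger reaction
dt (float): time step, i.e. the increment of time t between successive noise samples
x0 (float or torch.Tensor): initial action, used as the initial state of the noise process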

reset() → None

Overview:

Reset _x to the initial state _x0
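
The stochastic differential equation \(dx_t=\theta(\mu-x_t)dt + \sigma dW_t\) can be discretized with an Euler-Maruyama step, \(x_{t+1} = x_t + \theta\,dt\,(\mu - x_t) + \sigma\sqrt{dt}\,\varepsilon\) with \(\varepsilon \sim N(0, 1)\). The following scalar sketch illustrates that update and the role of reset. The class name `ScalarOUSketch`, the `seed` argument, and the use of Python floats instead of tensors are assumptions made for illustration; this is not the library's OUNoise implementation.

```python
import math
import random


class ScalarOUSketch:
    """Hypothetical scalar illustration of an Ornstein-Uhlenbeck noise process."""

    def __init__(self, mu=0.0, sigma=0.3, theta=0.15, dt=0.01, x0=0.0, seed=None):
        self._mu = mu
        self._alpha = theta * dt             # drift coefficient per step
        self._beta = sigma * math.sqrt(dt)   # diffusion coefficient per step (named _beta here)
        self._x0 = x0
        self._rng = random.Random(seed)      # `seed` is an illustrative addition
        self.reset()

    def reset(self):
        # Return the process state to its initial value.
        self._x = self._x0

    def __call__(self, mu=None):
        # Use the configured mean unless a new one is passed for this call.
        mean = self._mu if mu is None else mu
        # Euler-Maruyama step: pull toward the mean plus a scaled Gaussian shock.
        self._x += self._alpha * (mean - self._x) + self._beta * self._rng.gauss(0.0, 1.0)
        return self._x


noise = ScalarOUSketch(seed=0)
samples = [noise() for _ in range(5)]   # successive samples are temporally correlated
noise.reset()                           # typically done at the start of an episode
```

Unlike independent Gaussian noise, each sample depends on the previous state, which is why a reset between episodes matters.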

create_noise_generator

Overview:

Given the key (noise_type), create a new noise generator instance if that type is registered in noise_mapping; otherwise raise a KeyError. In other words, a derived noise generator must first be registered, and only then can create_noise_generator be called to get an instance.

Arguments:

noise_type (str): the type of noise generator to be created

Returns:

noise (BaseNoise): the newly created noise generator, an instance of one of the classes in noise_mapping’s values
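
The register-then-create pattern described above can be sketched as follows. All names here (`noise_registry_sketch`, `register_noise_sketch`, `ConstantNoiseSketch`, `create_noise_sketch`) are hypothetical and exist only for illustration; how the library populates noise_mapping and which arguments create_noise_generator forwards are not specified in this section.

```python
# Hypothetical registry mapping a string key to a noise class.
noise_registry_sketch = {}


def register_noise_sketch(name):
    """Decorator that records a class under `name` (illustrative only)."""
    def decorator(cls):
        noise_registry_sketch[name] = cls
        return cls
    return decorator


@register_noise_sketch("constant")
class ConstantNoiseSketch:
    """Toy generator used only to demonstrate registration."""

    def __init__(self, value=0.0):
        self.value = value


def create_noise_sketch(noise_type, **kwargs):
    # Look up the registered class; unknown keys raise KeyError.
    if noise_type not in noise_registry_sketch:
        raise KeyError(f"unregistered noise type: {noise_type!r}")
    return noise_registry_sketch[noise_type](**kwargs)


generator = create_noise_sketch("constant", value=0.5)
```

Requesting a key that was never registered fails fast with a KeyError, which surfaces configuration typos at construction time rather than during training.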