rl_utils.exploration¶
exploration¶
get_epsilon_greedy_fn¶
- Overview:
Generate an epsilon-greedy function with decay, which takes the current timestep as input and returns the current epsilon.
- Arguments:
start (float): Epsilon start value. For ‘linear’, it should be 1.0.
end (float): Epsilon end value.
decay (int): Controls how quickly epsilon decreases from start to end. We recommend decaying epsilon according to env step rather than training iteration.
type (str): How epsilon decays. Currently supports ‘linear’ and ‘exp’ (exponential).
- Returns:
eps_fn (function): The epsilon-greedy function with decay
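As a rough illustration of the two decay types described above, the sketch below builds a step-to-epsilon function. The name `make_epsilon_fn`, the `kind` parameter, and the exact formulas are assumptions chosen for illustration; the library's own implementation may differ in detail.

```python
import math
from typing import Callable


def make_epsilon_fn(start: float, end: float, decay: int, kind: str = "exp") -> Callable[[int], float]:
    """Return a function mapping the current step to epsilon (illustrative only)."""
    if kind == "linear":
        def eps_fn(step: int) -> float:
            # Interpolate from start to end over `decay` steps, then hold at end.
            frac = min(step / decay, 1.0)
            return start + (end - start) * frac
        return eps_fn
    if kind == "exp":
        # Epsilon approaches end asymptotically; decay acts as the time constant.
        return lambda step: end + (start - end) * math.exp(-step / decay)
    raise ValueError(f"unsupported decay type: {kind}")


linear_eps = make_epsilon_fn(1.0, 0.1, 1000, kind="linear")
exp_eps = make_epsilon_fn(0.9, 0.05, 1000, kind="exp")
```

Passing the env step count (rather than the training iteration) as `step` follows the recommendation given for `decay`.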
BaseNoise¶
- class ding.rl_utils.exploration.BaseNoise[source]¶
- Overview:
Base class for action noise
- Interface:
__init__, __call__
- Examples:
>>> noise_generator = OUNoise()  # init one type of noise
>>> noise = noise_generator(action.shape, action.device)  # generate noise
- abstract __call__(shape: tuple, device: str) → torch.Tensor[source]¶
- Overview:
Generate noise according to action tensor’s shape, device
- Arguments:
shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device
- Returns:
noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor
GaussianNoise¶
- class ding.rl_utils.exploration.GaussianNoise(mu: float = 0.0, sigma: float = 1.0)[source]¶
- Overview:
Derived class for generating gaussian noise, which satisfies \(X \sim N(\mu, \sigma^2)\)
- Interface:
__init__, __call__
- __call__(shape: tuple, device: str) → torch.Tensor[source]¶
- Overview:
Generate gaussian noise according to action tensor’s shape, device
- Arguments:
shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device
- Returns:
noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor
OUNoise¶
- class ding.rl_utils.exploration.OUNoise(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0)[source]¶
- Overview:
Derived class for generating Ornstein-Uhlenbeck process noise. Satisfies \(dx_t=\theta(\mu-x_t)dt + \sigma dW_t\), where \(W_t\) denotes a Wiener process, acting as a random perturbation term.
- Interface:
__init__, reset, __call__
- __call__(shape: tuple, device: str, mu: Optional[float] = None) → torch.Tensor[source]¶
- Overview:
Generate Ornstein-Uhlenbeck process noise according to action tensor’s shape, device
- Arguments:
shape (tuple): size of the action tensor; the output noise should have the same size
device (str): device of the action tensor; the output noise should be on the same device
mu (float): new mean value \(\mu\); set it to None if you don’t need it
- Returns:
noise (torch.Tensor): generated action noise, with the same shape and device as the input action tensor
- __init__(mu: float = 0.0, sigma: float = 0.3, theta: float = 0.15, dt: float = 0.01, x0: Optional[Union[float, torch.Tensor]] = 0.0) → None[source]¶
- Overview:
Initialize _alpha \(= \theta \cdot dt\) and the noise coefficient \(\sigma \sqrt{dt}\) used in the Ornstein-Uhlenbeck process.
- Arguments:
mu (float): \(\mu\), mean value
sigma (float): \(\sigma\), standard deviation of the perturbation noise
theta (float): how strongly the noise is pulled back toward \(\mu\); a greater value means a stronger reaction
dt (float): time step \(dt\) of the discretized process
x0 (float or torch.Tensor): initial action
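The two precomputed quantities, \(\theta \cdot dt\) and \(\sigma \sqrt{dt}\), are what a standard Euler-Maruyama discretization of \(dx_t=\theta(\mu-x_t)dt + \sigma dW_t\) needs at each step. The scalar sketch below shows that update using only the standard library. `ScalarOUProcess`, `step`, and the attribute names `alpha` and `beta` are hypothetical and chosen for illustration; the sketch is not the library class and ignores tensor shapes and devices.

```python
import math
import random


class ScalarOUProcess:
    """Toy scalar Ornstein-Uhlenbeck process (hypothetical helper, not the library class)."""

    def __init__(self, mu=0.0, sigma=0.3, theta=0.15, dt=0.01, x0=0.0):
        self.mu = mu
        self.alpha = theta * dt            # mean-reversion strength per step
        self.beta = sigma * math.sqrt(dt)  # scale of the random increment per step
        self.x0 = x0
        self.reset()

    def reset(self):
        # Return the process state to its initial value.
        self.x = self.x0

    def step(self, rng=random):
        # Euler-Maruyama update:
        # x <- x + theta * (mu - x) * dt + sigma * sqrt(dt) * N(0, 1)
        self.x += self.alpha * (self.mu - self.x) + self.beta * rng.gauss(0.0, 1.0)
        return self.x
```

Because each sample depends on the previous state, successive values are temporally correlated, which is the usual reason for preferring this process over independent Gaussian noise in continuous control. Calling `reset` at the start of an episode discards that accumulated state.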
create_noise_generator¶
- Overview:
Given the key (noise_type), create a new noise generator instance if the key is registered in noise_mapping, or raise a KeyError otherwise. In other words, a derived noise generator must first be registered; then call create_noise_generator to get the instance object.
- Arguments:
noise_type (str): the type of noise generator to be created
- Returns:
noise (BaseNoise): the newly created noise generator, which should be an instance of one of noise_mapping’s values
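The register-then-create pattern described above can be sketched with a plain dictionary. Everything here (`toy_noise_mapping`, `register_toy_noise`, `create_toy_noise`, `ConstantNoise`) is a hypothetical stand-in written for illustration; it does not reproduce the library's actual registry or the signature of create_noise_generator.

```python
from typing import Callable, Dict

# Maps a string key to a noise class (toy registry, assumed for illustration).
toy_noise_mapping: Dict[str, type] = {}


def register_toy_noise(name: str) -> Callable[[type], type]:
    """Class decorator that registers a noise class under `name`."""
    def decorator(cls: type) -> type:
        toy_noise_mapping[name] = cls
        return cls
    return decorator


@register_toy_noise("constant")
class ConstantNoise:
    def __init__(self, value: float = 0.0) -> None:
        self.value = value


def create_toy_noise(noise_type: str, **kwargs):
    # An unregistered key raises KeyError from the dictionary lookup.
    return toy_noise_mapping[noise_type](**kwargs)


noise = create_toy_noise("constant", value=0.5)
```

Requesting a key that was never registered, such as `create_toy_noise("unknown")`, raises KeyError, mirroring the behavior described in the overview.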