torchrl.envs
The environment is the world the agent interacts with: it could be a game, a physics engine, or anything else you like. It receives an action, executes it, and returns the next observation and a reward to the agent.
BaseEnv

class torchrl.envs.BaseEnv(env_name)

Bases: abc.ABC

Abstract base class used for implementing new environments. Includes some basic functionality, such as the option to use a running mean and standard deviation for normalizing states.

Parameters:
- env_name (str) – The environment name.
- fixed_normalize_states (bool) – If True, use the state min and max values to normalize the states (default: False).
- running_normalize_states (bool) – If True, use the running mean and std to normalize the states (default: False).
- scale_reward (bool) – If True, use the running std to scale the rewards (default: False).
get_state_info()

Returns a dict containing information about the state space. The dict should contain two keys: shape, indicating the state shape, and dtype, indicating the state type.

Example: for a state space of 4 continuous values:

return dict(shape=(4,), dtype='continuous')
get_action_info()

Returns a dict containing information about the action space. The dict should contain two keys: shape, indicating the action shape, and dtype, indicating the action type. If dtype is int, the action space is assumed to be discrete.

Example: for an action space of 4 float numbers:

return dict(shape=(4,), dtype='float')
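In a concrete subclass, the two info methods above might look as follows. This is an illustrative sketch only: the class name and the chosen shapes are hypothetical, but the dict layout follows the shape/dtype convention documented here.

```python
# Hypothetical subclass sketch: a 4-dimensional continuous state space
# and a discrete action space with 2 actions. Only the two method names
# come from the documented interface; everything else is illustrative.
class CartPoleLikeEnv:
    def get_state_info(self):
        # 4 continuous state values
        return dict(shape=(4,), dtype='continuous')

    def get_action_info(self):
        # dtype 'int' signals a discrete action space
        return dict(shape=(2,), dtype='int')
```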
simulator

Returns the name of the simulator being used, as a string.
reset()

Resets the environment to an initial state.

Returns: A numpy array with the state information.
Return type: numpy.ndarray
step(action)

Receives an action and executes it on the environment.

Parameters: action (int or float or numpy.ndarray) – The action to be executed in the environment; it should be an int for discrete environments and a float for continuous ones. It is also possible to execute multiple actions (if the environment supports it), in which case it should be a numpy.ndarray.

Returns:
- next_state (numpy.ndarray) – A numpy array with the state information.
- reward (float) – The reward.
- done (bool) – Flag indicating the termination of the episode.
- info (dict) – Dict containing additional information about the state.
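Putting the pieces together, a minimal environment following this interface could be sketched as below. The sketch is standalone (it does not import torchrl), and the random-walk dynamics, class name, and reward shape are all illustrative, not part of the library.

```python
import numpy as np

# Minimal standalone sketch of the BaseEnv interface: reset() returns the
# initial state, step() returns (next_state, reward, done, info).
class RandomWalkEnv:
    """1-D walk: the state drifts by the chosen action each step."""

    def __init__(self):
        self.state = np.zeros(1, dtype=np.float32)
        self.steps = 0

    def get_state_info(self):
        return dict(shape=(1,), dtype='continuous')

    def get_action_info(self):
        return dict(shape=(1,), dtype='float')

    def reset(self):
        self.state = np.zeros(1, dtype=np.float32)
        self.steps = 0
        return self.state

    def step(self, action):
        self.state = self.state + action
        self.steps += 1
        reward = float(-abs(self.state[0]))  # reward for staying near 0
        done = self.steps >= 10              # fixed-length episodes
        info = {'steps': self.steps}
        return self.state, reward, done, info
```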
GymEnv

class torchrl.envs.GymEnv(env_name, **kwargs)

Bases: torchrl.envs.base_env.BaseEnv

Creates and wraps a gym environment.

Parameters: env_name (str) – The environment name.
simulator

Returns the name of the simulator being used, as a string.

reset()

Calls the reset method on the gym environment.

Returns: state – A numpy array with the state information.
Return type: numpy.ndarray
step(action)

Calls the step method on the gym environment.

Parameters: action (int or float or numpy.ndarray) – The action to be executed in the environment; it should be an int for discrete environments and a float for continuous ones. It is also possible to execute multiple actions (if the environment supports it), in which case it should be a numpy.ndarray.

Returns:
- next_state (numpy.ndarray) – A numpy array with the state information.
- reward (float) – The reward.
- done (bool) – Flag indicating the termination of the episode.

get_state_info()

Returns a dictionary containing the shape and type of the state space. If it is continuous, it also contains the minimum and maximum values.

get_action_info()

Returns a dictionary containing the shape and type of the action space. If it is continuous, it also contains the minimum and maximum values.
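A typical interaction loop built on the reset/step interface above might look like this. The run_episode helper and the DummyEnv stub are illustrative, not part of the library; with torchrl installed you would pass a GymEnv instance (e.g. GymEnv('CartPole-v1')) instead of the stub.

```python
import numpy as np

def run_episode(env, policy, max_steps=1000):
    """Run one episode against the reset/step interface; return total reward."""
    state = env.reset()
    total_reward = 0.0
    for _ in range(max_steps):
        action = policy(state)
        # *rest absorbs the optional info dict, so this works for both the
        # 3-tuple (GymEnv) and 4-tuple (BaseEnv) step() return conventions.
        state, reward, done, *rest = env.step(action)
        total_reward += reward
        if done:
            break
    return total_reward

class DummyEnv:
    """Stand-in env: reward 1 per step, episode ends after 5 steps."""
    def __init__(self):
        self.t = 0
    def reset(self):
        self.t = 0
        return np.zeros(2)
    def step(self, action):
        self.t += 1
        return np.zeros(2), 1.0, self.t >= 5, {}

total = run_episode(DummyEnv(), policy=lambda s: 0)
```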
RoboschoolEnv

class torchrl.envs.RoboschoolEnv(*args, **kwargs)

Bases: torchrl.envs.gym_env.GymEnv

Support for gym Roboschool.

get_action_info()

Returns a dictionary containing the shape and type of the action space. If it is continuous, it also contains the minimum and maximum values.

static get_space_info(space)

Gets the shape of the possible types of states in gym.

Parameters: space (gym.spaces) – Space object that describes the valid actions and observations.
Returns: Dictionary containing the space shape and type.
Return type: dict
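A translation from gym-style spaces to the shape/dtype dicts used throughout this module could be sketched as follows. The Box/Discrete stand-ins mimic gym.spaces field names so the sketch stays self-contained without gym installed, and the exact keys the library returns (here low_bound/high_bound) are an assumption.

```python
from dataclasses import dataclass

# Minimal stand-ins for gym.spaces.Box / gym.spaces.Discrete; field names
# mirror gym's (low, high, shape, n) but these are not the real classes.
@dataclass
class Box:
    low: tuple
    high: tuple
    shape: tuple

@dataclass
class Discrete:
    n: int

def space_info(space):
    """Translate a space object into the shape/dtype dict convention."""
    if isinstance(space, Discrete):
        # discrete spaces report the number of actions and an int dtype
        return dict(shape=(space.n,), dtype='int')
    if isinstance(space, Box):
        # continuous spaces also carry their bounds (key names illustrative)
        return dict(shape=space.shape, dtype='continuous',
                    low_bound=space.low, high_bound=space.high)
    raise ValueError('Unsupported space type: {}'.format(type(space)))
```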
get_state_info()

Returns a dictionary containing the shape and type of the state space. If it is continuous, it also contains the minimum and maximum values.

reset()

Calls the reset method on the gym environment.

Returns: state – A numpy array with the state information.
Return type: numpy.ndarray

simulator

Returns the name of the simulator being used, as a string.

step(action)

Calls the step method on the gym environment.

Parameters: action (int or float or numpy.ndarray) – The action to be executed in the environment; it should be an int for discrete environments and a float for continuous ones. It is also possible to execute multiple actions (if the environment supports it), in which case it should be a numpy.ndarray.

Returns:
- next_state (numpy.ndarray) – A numpy array with the state information.
- reward (float) – The reward.
- done (bool) – Flag indicating the termination of the episode.