a2rl.TransitionRecorder

class a2rl.TransitionRecorder(env, recording=True)

Bases: Wrapper[Any, ndarray]

Record the transitions of an OpenAI Gym environment (gym.Env) into a Whatif data frame.

Parameters:
  • env (Env) – a gym environment.

  • recording (bool) – When True, start capturing steps immediately. When False, nothing is captured until the caller calls start().

Examples

>>> import gym
>>> import a2rl as wi

>>> def do_steps(env):
...     env.reset()
...     for _ in range(5):
...         env.step(0)

>>> env = wi.TransitionRecorder(env=gym.make("Taxi-v3"))
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 5 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
dtypes: float64(3)
memory usage: ...

>>> env.stop()
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 5 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
dtypes: float64(3)
memory usage: ...

>>> env.start()
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 10 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       10 non-null     float64
 1   1       10 non-null     float64
 2   2       10 non-null     float64
dtypes: float64(3)
memory usage: ...

Wraps an environment to allow a modular transformation of the step() and reset() methods.

Parameters:
  • env (Env) – The environment to wrap

  • new_step_api – Whether the wrapper’s step method will output in new or old step API
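The idea of a wrapper that transforms step() and reset() can be illustrated without gym itself. The following is only a minimal sketch, not gym's or a2rl's implementation: CountdownEnv, WrapperSketch, and ScaledReward are hypothetical names, and the four-element tuple returned by step() is assumed to follow the old step API.

```python
class CountdownEnv:
    """Hypothetical toy environment: the observation counts down to zero."""

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self):
        self.state = self.start
        return self.state

    def step(self, action):
        # Assumed old-style step API: (observation, reward, done, info).
        self.state -= 1
        return self.state, 1.0, self.state <= 0, {}


class WrapperSketch:
    """Forwards reset() and step() to the wrapped environment unchanged."""

    def __init__(self, env):
        self.env = env

    def reset(self, **kwargs):
        return self.env.reset(**kwargs)

    def step(self, action):
        return self.env.step(action)


class ScaledReward(WrapperSketch):
    """Overrides only step() to rescale the reward; reset() is inherited."""

    def __init__(self, env, scale):
        super().__init__(env)
        self.scale = scale

    def step(self, action):
        obs, reward, done, info = super().step(action)
        return obs, reward * self.scale, done, info


env = ScaledReward(CountdownEnv(start=2), scale=10.0)
first_obs = env.reset()
transition = env.step(0)
```

Because a subclass overrides only the method it wants to change, several such wrappers can be stacked around one environment, each adding a single behaviour.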

Methods

reset(**kwargs)

Wraps gym.Wrapper.reset(), which resets the environment to an initial state and returns an initial observation.

start()

Start capturing subsequent steps.

step(action)

Wraps gym.Wrapper.step() and records one timestep of the environment's dynamics.

stop()

Stop capturing steps.
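How start(), stop(), step() and reset() interact can be sketched in plain Python. This is an illustration under stated assumptions, not the a2rl source: ToyEnv and RecorderSketch are hypothetical, a list of tuples stands in for the Whatif data frame, and the row layout (observation before the step, action, reward) is an assumption chosen to match the three columns in the example output.

```python
class ToyEnv:
    """Hypothetical environment: the observation is a step counter."""

    def __init__(self):
        self.state = 0

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        # Assumed old-style step API: (observation, reward, done, info).
        self.state += 1
        return self.state, float(action), False, {}


class RecorderSketch:
    """Records transitions only while the recording flag is set."""

    def __init__(self, env, recording=True):
        self.env = env
        self.recording = recording
        self.rows = []  # stand-in for the data frame
        self._last_obs = None

    def start(self):
        self.recording = True

    def stop(self):
        self.recording = False

    def reset(self, **kwargs):
        # Remember the initial observation so the first step can be recorded.
        self._last_obs = self.env.reset(**kwargs)
        return self._last_obs

    def step(self, action):
        obs, reward, done, info = self.env.step(action)
        if self.recording:
            # Assumed row layout: (observation before the step, action, reward).
            self.rows.append((self._last_obs, action, reward))
        self._last_obs = obs
        return obs, reward, done, info


def do_steps(env):
    env.reset()
    for _ in range(5):
        env.step(0)


env = RecorderSketch(ToyEnv())
do_steps(env)
rows_while_recording = len(env.rows)

env.stop()
do_steps(env)
rows_after_stop = len(env.rows)

env.start()
do_steps(env)
rows_after_restart = len(env.rows)
```

The three counters follow the same pattern as the Examples section: steps taken after stop() add no rows, and recording resumes after start().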