a2rl.TransitionRecorder

class a2rl.TransitionRecorder(env, recording=True)

Record the transitions of an OpenAI Gym environment (gym.Env) into a Whatif data frame.

Parameters:

env (Env) – a gym environment.
recording (bool) – When True, start capturing steps immediately. When False, no steps are captured until the caller calls start().

Examples

>>> import gym
>>> import a2rl as wi

>>> def do_steps(env):
...     env.reset()
...     for _ in range(5):
...         env.step(0)

>>> env = wi.TransitionRecorder(env=gym.make("Taxi-v3"))
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 5 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
dtypes: float64(3)
memory usage: ...

>>> env.stop()
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 5 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       5 non-null      float64
 1   1       5 non-null      float64
 2   2       5 non-null      float64
dtypes: float64(3)
memory usage: ...

>>> env.start()
>>> do_steps(env)
>>> env.df.info()  
<class 'a2rl._dataframe.WiDataFrame'>
Int64Index: 10 entries, 0 to 0
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   0       10 non-null     float64
 1   1       10 non-null     float64
 2   2       10 non-null     float64
dtypes: float64(3)
memory usage: ...

Wraps an environment to allow a modular transformation of the step() and reset() methods.

Parameters:

env (Env) – The environment to wrap.
new_step_api – Whether the wrapper’s step method returns its result in the new or the old step API format.

Methods

`reset`(**kwargs) – Wrapper around `gym.Wrapper.reset()`, which resets the environment to an initial state and returns an initial observation.
`start`() – Start capturing subsequent steps.
`step`(action) – Wrapper around `gym.Wrapper.step()`, which runs one timestep of the environment's dynamics and records it.
`stop`() – Stop capturing steps.
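
The interplay of `start`(), `stop`(), `step`() and `reset`() can be sketched without gym or a2rl installed. The block below is a minimal, illustrative stand-in and not a2rl's actual implementation: `CountingEnv`, `SketchRecorder` and the `rows` list are hypothetical names, the four-value return of `step`() follows the older gym convention, and storing each transition as an (observation, action, reward) tuple is an assumption made only for this sketch.

```python
from typing import Any, Dict, List, Tuple


class CountingEnv:
    """Hypothetical toy environment; not part of gym or a2rl."""

    def __init__(self) -> None:
        self.state = 0

    def reset(self, **kwargs: Any) -> int:
        self.state = 0
        return self.state

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        self.state += 1
        reward = float(action)
        done = self.state >= 5
        return self.state, reward, done, {}


class SketchRecorder:
    """Illustrative recording wrapper (assumed design, not a2rl's code)."""

    def __init__(self, env: CountingEnv, recording: bool = True) -> None:
        self.env = env
        self.recording = recording
        # Assumption: one (observation, action, reward) tuple per captured step.
        self.rows: List[Tuple[Any, Any, float]] = []
        self._last_obs: Any = None

    def start(self) -> None:
        self.recording = True

    def stop(self) -> None:
        self.recording = False

    def reset(self, **kwargs: Any) -> Any:
        # Remember the initial observation so the first step has a "before" state.
        self._last_obs = self.env.reset(**kwargs)
        return self._last_obs

    def step(self, action: Any) -> Tuple[Any, float, bool, Dict[str, Any]]:
        obs, reward, done, info = self.env.step(action)
        if self.recording:
            self.rows.append((self._last_obs, action, reward))
        self._last_obs = obs
        return obs, reward, done, info


def do_steps(env: SketchRecorder, n: int = 5) -> None:
    env.reset()
    for _ in range(n):
        env.step(0)


rec = SketchRecorder(CountingEnv())
do_steps(rec)
rows_while_recording = len(rec.rows)

rec.stop()
do_steps(rec)  # these steps still run, but are not captured
rows_after_stop = len(rec.rows)

rec.start()
do_steps(rec)
rows_after_restart = len(rec.rows)

print(rows_while_recording, rows_after_stop, rows_after_restart)  # 5 5 10
```

The counts mirror the entry counts in the example above: steps taken after `stop`() still advance the environment but add no rows, and capturing resumes after `start`().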