a2rl.WiDataFrame.add_value

WiDataFrame.add_value(alpha=0.1, gamma=0.6, sarsa=True, value_col='value', override='replace')

Append a column named value_col to this dataframe. Restriction: the dataframe must NOT contain columns named _state, _action, _reward, or value_col.

Parameters:
  • alpha (float) – Learning rate in Q-Learning and SARSA. Must be within 0 and 1.

  • gamma (float) – Discount factor of future reward in Q-Learning and SARSA. Must be within 0 and 1.

  • sarsa (bool) – When True, compute the value using the SARSA Bellman equation, which is a conservative, on-policy temporal difference update. When False, use the Q-Learning Bellman equation, which is an off-policy temporal difference update.

  • value_col (str) – The column name for the computed values.

  • override (Literal['replace', 'warn', 'error']) – What to do when this dataframe already has a column named value_col. Valid values are replace to silently override, warn to show a warning, and error to raise a ValueError.
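
To make the roles of alpha, gamma, and sarsa concrete, here is a minimal sketch of the textbook temporal difference update for a single transition. It is not the library's implementation: the helper names td_target and td_update, the next_q dictionary, and the toy numbers are all hypothetical and exist only for illustration.

```python
def td_target(reward, next_q, next_action, gamma, sarsa):
    """Return the temporal difference target for one transition.

    next_q maps each action available in the next state to its current
    value estimate (hypothetical structure, for illustration only).
    """
    if sarsa:
        # On-policy (SARSA): bootstrap from the action actually taken next.
        bootstrap = next_q[next_action]
    else:
        # Off-policy (Q-Learning): bootstrap from the best next action.
        bootstrap = max(next_q.values())
    return reward + gamma * bootstrap


def td_update(q, reward, next_q, next_action, alpha=0.1, gamma=0.6, sarsa=True):
    """Move the current estimate q a fraction alpha towards the TD target."""
    target = td_target(reward, next_q, next_action, gamma, sarsa)
    return q + alpha * (target - q)


# Toy transition: the action taken next ("a0") is not the greedy one ("a1").
next_q = {"a0": 1.0, "a1": 3.0}
sarsa_q = td_update(q=2.0, reward=1.0, next_q=next_q, next_action="a0", sarsa=True)
qlearn_q = td_update(q=2.0, reward=1.0, next_q=next_q, next_action="a0", sarsa=False)
print(sarsa_q, qlearn_q)
```

In this toy transition the SARSA estimate ends up below the Q-Learning estimate, because SARSA bootstraps from the lower-valued action that was actually taken, while Q-Learning bootstraps from the maximum over next actions. This is one way to read the "conservative" label above.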

Return type:

WiDataFrame

Returns:

This dataframe, modified with an additional value_col column. The return value is provided to facilitate method chaining, as per the functional programming style.