a2rl.utils.action_reward
- a2rl.utils.action_reward(df, lag, mask=False)
Test for the effect of the action on the reward in the data, measured as the conditional entropy H(reward | prev_action).
- Parameters:
df (WiDataFrame) – a discretized dataframe.
lag (int) – int for the lag.
- Return type:
- Returns:
The conditional entropy of future reward given various lags. The result is masked if the information gain is better than random.
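To make the quantity H(reward | prev_action) concrete, here is a minimal sketch that estimates it from empirical frequencies using plain pandas. It is not the a2rl implementation: the helper name `conditional_entropy`, the column names `action` and `reward`, and the choice of bits (log base 2) are all assumptions made for illustration, and the masking behaviour is not reproduced.

```python
import numpy as np
import pandas as pd


def conditional_entropy(df, action_col, reward_col, lag):
    """Estimate H(reward_t | action_{t-lag}) in bits from empirical counts.

    Hypothetical helper for illustration only; not part of a2rl.
    """
    # Pair each reward with the action taken `lag` rows earlier.
    pairs = pd.DataFrame(
        {
            "prev_action": df[action_col].shift(lag),
            "reward": df[reward_col],
        }
    ).dropna()

    # Joint distribution p(prev_action, reward) as a normalized contingency table.
    joint = pd.crosstab(pairs["prev_action"], pairs["reward"], normalize=True)

    # Marginal p(prev_action), then conditional p(reward | prev_action).
    p_action = joint.sum(axis=1)
    cond = joint.div(p_action, axis=0).to_numpy()

    # Treat 0 * log(0) as 0 for cells that never occur.
    with np.errstate(divide="ignore", invalid="ignore"):
        log_cond = np.where(cond > 0, np.log2(cond), 0.0)

    return float(-(joint.to_numpy() * log_cond).sum())


# Toy discretized data: each reward copies the action from the previous row.
toy = pd.DataFrame(
    {
        "action": [0, 1, 0, 1, 0, 1, 0, 1],
        "reward": [0, 0, 1, 0, 1, 0, 1, 0],
    }
)
h = conditional_entropy(toy, "action", "reward", lag=1)
print(f"H(reward | prev_action) = {h:.3f} bits")
```

A value near zero means the lagged action almost fully determines the reward, while a value close to the unconditional entropy of the reward suggests that the action carries little information about it.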
See also