a2rl.utils.reward_function
- a2rl.utils.reward_function(df, lag, mask=False)
Test for a reward function in the data, based on the conditional entropy of the reward given state and action, H(r|state,action).
- Parameters:
df (WiDataFrame) – a discretized dataframe.
lag (int) – int for the lag.
- Return type:
- Returns:
The conditional entropy of reward given various lags. It is masked if the information gain is better than random.
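
To make the quantity H(r|state,action) concrete, here is a minimal standard-library sketch of a conditional entropy computed from discrete samples. It is an illustration only and not the a2rl implementation: the helper name `conditional_entropy` and the toy data are hypothetical, and the lag handling and masking described above are not reproduced.

```python
from collections import Counter
from math import log2


def conditional_entropy(rewards, contexts):
    """Return H(R | C) in bits for two equal-length sequences of discrete values.

    Hypothetical helper for illustration; not part of a2rl.
    """
    if len(rewards) != len(contexts):
        raise ValueError("rewards and contexts must have the same length")
    n = len(rewards)
    if n == 0:
        raise ValueError("need at least one sample")

    joint = Counter(zip(contexts, rewards))  # counts of (context, reward) pairs
    marginal = Counter(contexts)             # counts of each context

    # H(R | C) = -sum over (c, r) of p(c, r) * log2(p(r | c))
    return -sum(
        (count / n) * log2(count / marginal[c])
        for (c, _r), count in joint.items()
    )


# Toy discretized data: the reward is fully determined by (state, action).
states = [0, 0, 1, 1]
actions = [0, 1, 0, 1]
rewards = [0, 1, 1, 0]
h_det = conditional_entropy(rewards, list(zip(states, actions)))

# Toy data where the reward is a fair coin within each context.
h_rand = conditional_entropy([0, 1, 0, 1], [0, 0, 1, 1])
```

In this sketch the deterministic case gives 0 bits and the coin-flip case gives 1 bit. A low conditional entropy suggests that the reward is largely explained by state and action.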