a2rl.Simulator.get_valid_actions

Simulator.get_valid_actions(seq, max_size)

Return a dataframe of sampled action tokens, given the input context.

Parameters:
  • seq (ndarray) – Input context sequence (1-dim), where context = (s, a, r, …, s). The sequence must end with state dataframe tokens.

  • max_size (int) – Number of samples to draw.

Return type:

WiDataFrame

Returns:

Whatif dataframe in which each row is one sample and the columns are the action columns. The actions are in tokenized form.
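
Example:

The following is a minimal, library-free sketch of the input and output contract described above. It does not call a2rl and does not reflect how the simulator actually chooses actions. The token layout (2 state tokens, 1 action token, 1 reward token per step), the token values, the action vocabulary, and the helper names `ends_with_state` and `toy_get_valid_actions` are all hypothetical, and uniform random sampling is used only as a placeholder.

```python
import random

# Hypothetical layout (an assumption, not taken from a2rl):
# each (s, a, r) step is 2 state tokens, 1 action token, 1 reward token.
STATE_DIM, ACTION_DIM, REWARD_DIM = 2, 1, 1
STEP_LEN = STATE_DIM + ACTION_DIM + REWARD_DIM


def ends_with_state(seq):
    """Return True if a flat context (s, a, r, ..., s) stops right after a state block."""
    return len(seq) % STEP_LEN == STATE_DIM


def toy_get_valid_actions(seq, max_size, action_vocab, seed=0):
    """Toy stand-in for the contract: return max_size rows of action tokens.

    Sampling is uniform over a made-up vocabulary; this is a placeholder only.
    """
    if not ends_with_state(seq):
        raise ValueError("context must end with state tokens")
    rng = random.Random(seed)
    return [
        [rng.choice(action_vocab) for _ in range(ACTION_DIM)]
        for _ in range(max_size)
    ]


# Context laid out as (s, a, r, s): it ends with state tokens, as required.
context = [10, 11, 50, 90, 12, 13]
rows = toy_get_valid_actions(context, max_size=3, action_vocab=[50, 51, 52])

# One row per sample, one column per action dimension, values are tokens.
for row in rows:
    print(row)
```

In this sketch, a context that stops after an action or reward token is rejected, which mirrors the requirement that seq ends with state tokens. The returned rows play the role of the rows of the Whatif dataframe.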