a2rl.GPTBuilder
- class a2rl.GPTBuilder(tokenizer, model_dir=None, config=None, kw_args=<factory>)[source]

Bases: BaseBuilder[GPT, Trainer]

Provides high-level APIs to train and evaluate a GPT model based on the data loaded in an AutoTokenizer. It has no knowledge of the dataframe shape, nor of which values belong to actions, states, or rewards.
- Parameters:
  tokenizer (AutoTokenizer) – An AutoTokenizer instance.
  model_dir (Union[str, Path, None]) – Model directory for saving and loading. When set to None, a directory name is generated automatically.
  config (Union[dict, str, Path, None]) – Custom configuration file or dictionary. When set to None, the built-in configuration in a2rl/config.yaml is used.
Note

For configuration, precedence starts with the config parameter, followed by a custom file indicated by config_dir and config_name. If none are specified, the default configuration located in src/a2rl/config.yaml is used.

The configuration file must follow this yaml format:

train_config:
  epochs: 5
  batch_size: 512
  embedding_dim: 512
  gpt_n_layer: 1
  gpt_n_head: 1
  learning_rate: 6e-4
  num_workers: 0
  lr_decay: True
Examples
Train a model and save it to a temporary directory.
>>> import tempfile
>>> import a2rl as wi
>>> from a2rl.simulator import AutoTokenizer, GPTBuilder
>>> wi_df = wi.read_csv_dataset(wi.sample_dataset_path("chiller"))
>>> tokenizer = AutoTokenizer(wi_df, block_size_row=2)
>>> with tempfile.TemporaryDirectory() as model_dir:
...     builder = GPTBuilder(tokenizer, model_dir)
...     model = builder.fit()
Load a pretrained model.
>>> wi_df = wi.read_csv_dataset(wi.sample_dataset_path("chiller"))
>>> tokenizer = AutoTokenizer(wi_df, block_size_row=2)
>>> with tempfile.TemporaryDirectory() as model_dir:
...     builder = GPTBuilder(tokenizer, model_dir)
...     model = builder.fit()
...     model = builder.load_model()
Pass in a custom configuration dictionary via the config parameter.
>>> custom_config = {
...     "train_config": {
...         "epochs": 1,
...         "batch_size": 512,
...         "embedding_dim": 512,
...         "gpt_n_layer": 1,
...         "gpt_n_head": 1,
...         "learning_rate": 0.0006,
...         "num_workers": 0,
...         "lr_decay": True,
...     }
... }
>>> wi_df = wi.read_csv_dataset(wi.sample_dataset_path("chiller"))
>>> tokenizer = AutoTokenizer(wi_df, block_size_row=2)
>>> with tempfile.TemporaryDirectory() as model_dir:
...     builder = GPTBuilder(tokenizer, model_dir, custom_config)
...     model = builder.fit()
Methods

__post_init__()
    Load training configuration.

evaluate([context_len, sample, horizon])
    Evaluate the raw GPT model.

fit([validate])
    Start training the model.

load_model()
    Load a trained model.

sample(seq, n_steps[, temperature, sample, ...])
    Sample the next n_steps tokens.

save_model()
    Save the trained pytorch model, training config, and associated tokenizer.
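As a usage sketch for sample (not part of the original docstring), the snippet below continues from the fitted builder and tokenizer in the Examples above. It assumes the tokenized dataframe is exposed as tokenizer.df_tokenized and that one of its rows is a valid context sequence; the step count and temperature are arbitrary.

>>> seq = tokenizer.df_tokenized.iloc[0].values  # one tokenized row as the context (assumed layout)
>>> next_tokens = builder.sample(seq, n_steps=3, temperature=1.0, sample=True)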
Attributes