a2rl.experimental.lightgpt.LightGPTBuilder
- class a2rl.experimental.lightgpt.LightGPTBuilder(tokenizer, model_dir=None, config=None, kw_args=<factory>)
Bases: BaseBuilder[LightGPT, Trainer]

High-level APIs to train and evaluate a Lightning-based GPT model based on the data loaded in AutoTokenizer. It has no knowledge of the dataframe shape, nor of which values belong to actions, states, or rewards.
- Parameters:
  - model_dir (str | Path | None) – Model directory.
  - tokenizer (AutoTokenizer) – An AutoTokenizer instance.
  - config (dict | str | Path | None) – Custom configuration file or dictionary. When set to None, use the built-in configuration in a2rl/experimental/lightgpt/config.yaml. A configuration file must follow this yaml format (an equivalent dictionary is sketched below):

    train_config:
      epochs: 5
      batch_size: 512
      embedding_dim: 512
      gpt_n_layer: 1
      gpt_n_head: 1
      learning_rate: 6e-4
      num_workers: 1
      lr_decay: True
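For illustration only, here is a hedged sketch of passing the same settings as a dictionary via config. The nesting mirrors the yaml above and is an assumption, as are the tokenizer, model_dir, and trainer objects (created as in the examples below).

>>> custom_config = {
...     "train_config": {
...         "epochs": 5,
...         "batch_size": 512,
...         "embedding_dim": 512,
...         "gpt_n_layer": 1,
...         "gpt_n_head": 1,
...         "learning_rate": 6e-4,
...         "num_workers": 1,
...         "lr_decay": True,
...     }
... }
>>> # Assumed objects: tokenizer, model_dir, and trainer as in the Examples section.
>>> builder = LightGPTBuilder(tokenizer, model_dir, config=custom_config, kw_args={"trainer": trainer})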
Examples
Train a model, and save to a temporary directory.
>>> import tempfile
>>> import pytorch_lightning as pl
>>> import a2rl as wi
>>> from a2rl import AutoTokenizer
>>> from a2rl.experimental.lightgpt import LightGPTBuilder, WarmupCosineLearningRateDecay

>>> wi_df = wi.read_csv_dataset(wi.sample_dataset_path("chiller"))
>>> tokenizer = AutoTokenizer(wi_df, block_size_row=2)
>>> with tempfile.TemporaryDirectory() as model_dir:
...     # NOTE: trainer must be a pytorch_lightning.Trainer; the next example shows how to build one.
...     builder = LightGPTBuilder(tokenizer, model_dir, kw_args={"trainer": trainer})
...     model = builder.fit()
Train a model with a custom pytorch_lightning.Trainer.

>>> import tempfile
>>> import pytorch_lightning as pl
>>> import a2rl as wi
>>> from a2rl import AutoTokenizer
>>> from a2rl.experimental.lightgpt import LightGPTBuilder, WarmupCosineLearningRateDecay

>>> wi_df = wi.read_csv_dataset(wi.sample_dataset_path("chiller"))
>>> tokenizer = AutoTokenizer(wi_df, block_size_row=2)
>>> with tempfile.TemporaryDirectory() as model_dir:
...     # PyTorch Lightning setup. See the PyTorch Lightning docs for more details.
...     max_epochs = 5  # Will ignore config.yaml
...     epoch_tokens = len(tokenizer.train_dataset)
...     lr_decay = WarmupCosineLearningRateDecay(
...         learning_rate=6e-4,
...         warmup_tokens=epoch_tokens // 2,
...         final_tokens=max_epochs * epoch_tokens,
...     )
...     trainer = pl.Trainer(
...         accelerator="auto",
...         devices="auto",
...         benchmark=False,
...         max_epochs=max_epochs,
...         gradient_clip_val=1.0,
...         callbacks=[lr_decay, pl.callbacks.ModelSummary(max_depth=2)],
...         default_root_dir=model_dir,
...     )
...
...     # Bread-and-butter (i.e., business-as-usual) steps with the A2RL model builder.
...     builder = LightGPTBuilder(tokenizer, model_dir, kw_args={"trainer": trainer})
...     model = builder.fit()
The remaining examples follow a similar structure to a2rl.GPTBuilder, but remember to create and pass a PyTorch Lightning Trainer accordingly.

Methods
- Load training configuration.
- evaluate([context_len, sample, horizon]) – Evaluate the raw GPT model.
- fit([validate]) – Start training the model.
- Load a trained model.
- sample(seq, n_steps[, temperature, sample, ...]) – Sample the next n_steps tokens (see the sketch after this list).
- Save the trained PyTorch model, training config, and associated tokenizer.
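As a rough post-training sketch (not part of the original examples): the way the token context is obtained below, via tokenizer.df_tokenized.sequence, is borrowed from a2rl.GPTBuilder usage and is an assumption here, and the argument values are purely illustrative.

>>> # Continuing from a fitted builder (see the Examples above).
>>> # ASSUMPTION: tokenizer.df_tokenized.sequence exposes the flattened token sequence.
>>> context = tokenizer.df_tokenized.sequence[:8]
>>> next_tokens = builder.sample(context, n_steps=4, temperature=1.0, sample=True)
>>> result = builder.evaluate(context_len=8, sample=False, horizon=4)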
Attributes