train – Trainer for the language models
train.py
usage: train.py [-h] [--model MODEL] [--max_epochs MAX_EPOCHS]
                [--batch_size BATCH_SIZE] [--updates UPDATES] [--profile]
                [--dbg] [--reset_cache] [--subset SUBSET]
                [--conv_ckpt CONV_CKPT] [--tf32 TF32] [--layers LAYERS]
                [--heads HEADS] [--hidden_size HIDDEN_SIZE]
                [--continue_from CONTINUE_FROM]
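For orientation, the usage line above could come from an argparse parser along the following lines. This is only a sketch, not the actual source of train.py: the function name `build_parser`, the argument types, and the absence of defaults are all assumptions, and `my_model` is a placeholder value.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical reconstruction of the train.py parser from its usage line."""
    parser = argparse.ArgumentParser(
        prog="train.py", description="Trainer for the language models"
    )
    # Types are guesses; the real defaults are unknown, so none are set here.
    parser.add_argument("--model", type=str, help="Model to use for pretraining.")
    parser.add_argument("--max_epochs", type=int, help="Number of epochs to pretrain for.")
    parser.add_argument("--batch_size", type=int, help="Batch size to use for pretraining.")
    parser.add_argument("--updates", type=int, help="Batches to wait before logging.")
    # Options shown without a value in the usage line are modelled as switches.
    parser.add_argument("--profile", action="store_true")
    parser.add_argument("--dbg", action="store_true")
    parser.add_argument("--reset_cache", action="store_true")
    parser.add_argument("--subset", type=float, help="Fraction of the dataset to use.")
    parser.add_argument("--conv_ckpt", type=str)
    # --tf32 takes a value per the usage line; its accepted values are not
    # documented, so it is kept as a raw string here.
    parser.add_argument("--tf32", type=str)
    parser.add_argument("--layers", type=int)
    parser.add_argument("--heads", type=int)
    parser.add_argument("--hidden_size", type=int)
    parser.add_argument("--continue_from", type=str)
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(
    ["--model", "my_model", "--batch_size", "32", "--subset", "0.5", "--dbg"]
)
print(args.model, args.batch_size, args.subset, args.dbg)
```

Options that are not passed come back as `None` (or `False` for the switches) in this sketch; the real script may well define its own defaults.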
- -h, --help
Show this help message and exit.
- --model <model>
Model to use for pretraining.
- --max_epochs <max_epochs>
Number of epochs to pretrain for.
- --batch_size <batch_size>
Batch size to use for pretraining.
- --updates <updates>
Batches to wait before logging training progress.
- --profile
Whether to profile the training process.
- --dbg
Whether to run with a single thread.
- --reset_cache
Whether to reset the cache before training.
- --subset <subset>
Fraction of the dataset to use across train/val/test.
- --conv_ckpt <conv_ckpt>
Converts a Lightning checkpoint to a HuggingFace checkpoint.
- --tf32 <tf32>
Whether to use TF32 precision on Ampere GPUs.
- --layers <layers>
Number of layers to use for the model.
- --heads <heads>
Number of heads to use for the model.
- --hidden_size <hidden_size>
Size of the model's hidden state, if applicable.
- --continue_from <continue_from>
Path to a checkpoint to continue training from.
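As an illustration of how `--subset` and `--updates` might be interpreted, the sketch below applies one fraction to each of the train/val/test splits and logs every `updates` batches. The helper names `take_subset` and `should_log` are hypothetical, and keeping a prefix of each split (rather than a random sample) is an assumption; the actual script may behave differently.

```python
import math
from typing import Sequence, TypeVar

T = TypeVar("T")


def take_subset(split: Sequence[T], fraction: float) -> Sequence[T]:
    """Keep the leading `fraction` of a split (assumed behaviour of --subset)."""
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    # Round up so a non-empty split never shrinks to zero items.
    keep = math.ceil(len(split) * fraction)
    return split[:keep]


def should_log(batch_idx: int, updates: int) -> bool:
    """True on every `updates`-th batch (assumed behaviour of --updates)."""
    return (batch_idx + 1) % updates == 0


# Toy splits standing in for real datasets.
splits = {"train": list(range(100)), "val": list(range(20)), "test": list(range(10))}

# The same fraction is applied to each split.
subset = {name: take_subset(data, 0.5) for name, data in splits.items()}
sizes = {name: len(data) for name, data in subset.items()}

# With updates=4, batches with index 3 and 7 trigger a log message.
logged = [i for i in range(10) if should_log(i, 4)]
print(sizes, logged)
```

Applying the fraction per split keeps the relative sizes of train, val, and test unchanged, which is one plausible reading of "across train/val/test".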