SageMaker HyperPod Recipes Overview
Amazon SageMaker HyperPod recipes help you get started with training and fine-tuning publicly available foundation models. The recipes provide pre-packaged training stack configurations that deliver state-of-the-art training performance on SageMaker HyperPod. You can also switch between GPU-based instances and AWS Trainium-based (Trn) instances with a simple recipe change.
The recipes are pre-configured training configurations for the following model families:
- Llama 3.1
- Llama 3.2
- Mistral
- Mixtral
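To illustrate the recipe change mentioned above, here is a minimal sketch (not taken from the published recipes) that uses OmegaConf, the configuration library underlying NeMo-style recipes, to merge a hypothetical GPU recipe with a small override that targets a Trainium instance. All keys and values (`cluster`, `instance_type`, and so on) are illustrative placeholders, not the actual recipe schema.

```python
from omegaconf import OmegaConf

# Hypothetical base recipe targeting GPU instances (illustrative keys only).
base_recipe = OmegaConf.create({
    "run": {"name": "llama3-1-8b-pretrain", "results_dir": "./results"},
    "cluster": {"instance_type": "p5.48xlarge", "instance_count": 4},
    "trainer": {"max_steps": 1000, "precision": "bf16"},
})

# Switching to Trainium-based instances could then be expressed as a small
# override merged onto the base recipe; only the cluster section changes.
trainium_override = OmegaConf.create({
    "cluster": {"instance_type": "trn1.32xlarge", "instance_count": 8},
})

recipe = OmegaConf.merge(base_recipe, trainium_override)
print(OmegaConf.to_yaml(recipe))
```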
To run the recipes within SageMaker HyperPod, you use the Amazon SageMaker HyperPod training adapter as the framework for end-to-end training workflows. The training adapter is built on the NVIDIA NeMo framework and the Neuronx Distributed Training package, so if you are familiar with NeMo, the process of using the training adapter is the same. The training adapter runs the recipe on your cluster.
You can also train your own model by defining your own custom recipe.
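As one possible way to submit a recipe (built-in or custom), the following sketch assumes a recent SageMaker Python SDK whose PyTorch estimator accepts `training_recipe` and `recipe_overrides` arguments. The IAM role, recipe name, override keys, and S3 path below are hypothetical placeholders; a custom recipe could be referenced by a local YAML path instead of a built-in recipe name.

```python
from sagemaker.pytorch import PyTorch

# Placeholder values throughout; substitute your own role, recipe, and data.
estimator = PyTorch(
    role="arn:aws:iam::111122223333:role/SageMakerExecutionRole",  # hypothetical role ARN
    instance_type="ml.p5.48xlarge",
    instance_count=4,
    training_recipe="training/llama/llama3_8b_pretrain",  # example recipe name, or a path to a custom recipe YAML
    recipe_overrides={"trainer": {"max_steps": 50}},      # hypothetical override keys
)

# Hypothetical S3 location for the training data channel.
estimator.fit(inputs={"train": "s3://my-bucket/train"})
```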
Verified instance types and instance counts
- P5.48xlarge
Supported Models
Pre-Training
Refer to the link here for the list of supported model configurations for pre-training.
Fine-Tuning
Refer to the link here for the list of supported model configurations for fine-tuning.