AI on SageMaker HyperPod
Inference
Inference examples on different architectures.
Inference Operator (2 items)
Load Balancer Inference (1 item)
Ray Service (1 item)