Containers

What are containers?

Containers are a great way to package software: they bundle an application's code together with the runtime it needs. This lets you pull down optimized software containers and run them out of the box, without the complications of compiling them for a new system. In this blog we'll focus on the NVIDIA container registry (NGC), since it hosts optimized containers for applications like PyTorch, NeMo, and BERT.
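For example, you can grab one of these NGC images with plain Docker. A minimal sketch (assuming Docker is installed and nvcr.io is reachable; the PyTorch tag is simply the one used later in this post):

# Pull an optimized PyTorch container from NGC
docker pull nvcr.io/nvidia/pytorch:21.12-py3

# Print the PyTorch version baked into the image
docker run --rm nvcr.io/nvidia/pytorch:21.12-py3 python -c 'import torch; print(torch.__version__)'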

Containers in Slurm

What are the issues with using Docker containers in traditional Slurm clusters?

Containers typically require a privileged runtime, i.e. the person invoking the container needs sudo access. This is a problem for Slurm clusters, which are typically multi-user environments where POSIX file permissions grant access to files and directories based on users and groups.

To solve this, NVIDIA published enroot, which uses the Linux chroot(1) mechanism to create an isolated runtime environment for the container. Think of it like creating a mount point /tmp/container in which the container can only see its local directory, i.e. container/. This separates the host OS from the container's runtime. Here's an example of enroot in action:

# Import and start an Amazon Linux image from DockerHub
enroot import docker://amazonlinux:latest
enroot create amazonlinux+latest.sqsh
enroot start amazonlinux+latest

In the example above we imported a container image from Docker Hub, converted it into an enroot container with enroot create, and then started it with enroot start.
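enroot start can also run a single command inside the container instead of dropping you into an interactive shell. A quick sketch, continuing with the image imported above (the command run inside it is just an example):

# List the containers enroot knows about
enroot list

# Run a one-off command inside the container
enroot start amazonlinux+latest cat /etc/os-release

# Remove the container when you're done with it
enroot remove amazonlinux+latest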

So how do you schedule and run containers with Slurm?

Slurm provides a container capability for OCI containers, but it's half-baked: users have to pull down their container images, convert them to an OCI bundle, and then point Slurm at that bundle. To solve this, NVIDIA introduced Pyxis, a Slurm plugin that integrates with enroot and lets you run containers by specifying just the container URI, e.g. amazonlinux:latest. For example:

#!/bin/bash
#SBATCH --container-image nvcr.io\#nvidia/pytorch:21.12-py3

python -c 'import torch ; print(torch.__version__)'

Pretty cool, right?
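Pyxis adds a few other useful flags as well, such as --container-mounts for bind-mounting host directories into the container. A hedged example (the /fsx path and the script name are assumptions; substitute whatever shared filesystem and script your cluster actually uses):

# Mount the shared /fsx filesystem into the container and run a script from it
srun --container-image=nvcr.io#nvidia/pytorch:21.12-py3 \
     --container-mounts=/fsx:/fsx \
     python /fsx/scripts/train.py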

Containers on SageMaker HyperPod

So how do we set this all up with SageMaker HyperPod?

  1. We've already set up Docker, Pyxis, and enroot as part of the lifecycle scripts. You should have both docker and enroot on your PATH:

    docker --help
    enroot --help

    If you see the error ERROR: permission denied while trying to connect to the Docker daemon socket at... when trying to run docker, you'll need to add your user to the docker group by running:

    sudo usermod -aG docker ${USER}

    Then log out with exit and log back in.

  2. We can test that Pyxis and enroot are installed correctly by running nvidia-smi in the cuda:11.6.2 Ubuntu 20.04 image:

    srun --container-image=nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi

    If you see the warning [WARN] Kernel module nvidia_uvm is not loaded. Make sure the NVIDIA device driver is installed and loaded., check that you're on an NVIDIA GPU cluster; this won't work on trn1.32xlarge instances.

    [ec2-user@ip-172-31-28-27 ~]$ srun --container-image=nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
    pyxis: importing docker image: nvidia/cuda:11.6.2-base-ubuntu20.04
    pyxis: imported docker image: nvidia/cuda:11.6.2-base-ubuntu20.04
    Tue May 16 22:07:21 2023
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 470.141.03   Driver Version: 470.141.03   CUDA Version: 11.6     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  Tesla T4            Off  | 00000000:00:1E.0 Off |                    0 |
    | N/A   40C    P0    27W /  70W |      0MiB / 15109MiB |      0%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+

    +-----------------------------------------------------------------------------+
    | Processes:                                                                  |
    |  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
    |        ID   ID                                                   Usage      |
    |=============================================================================|
    |  No running processes found                                                 |
    +-----------------------------------------------------------------------------+
  3. This can also be run in an sbatch script, like so. Think of it as the equivalent of running docker run nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi on the compute node (a submission example follows this list).

    #!/bin/bash
    #SBATCH --container-image=nvidia/cuda:11.6.2-base-ubuntu20.04

    nvidia-smi
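To submit the script from step 3 and check its output, a minimal sequence looks like this (the filename container-test.sbatch is just an assumption; by default Slurm writes the job's output to slurm-<jobid>.out in the submission directory):

# Submit the batch script; sbatch prints the job ID
sbatch container-test.sbatch

# Once the job finishes, the nvidia-smi output is in the job's output file
cat slurm-<jobid>.out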

It is possible that you get an error that looks like srun: unrecognized option '--container-image'. This usually means Slurm hasn't picked up the Pyxis plugin yet; to fix it, run:

NUM_NODES=<replace with the number of compute nodes>
srun -N $NUM_NODES sudo scontrol reconfigure
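After the reconfigure, a quick way to confirm the plugin is active is to check whether srun now advertises the Pyxis options (when Pyxis is loaded, its --container-* flags show up in srun's help output):

# The grep should list --container-image, --container-mounts, etc. if Pyxis is loaded
srun --help | grep -i container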