General Recommendations
You can modify the bootstrap sequence of any SOCA node using Jinja2 templates, and extend these scripts with your own customizations.
Guidelines & Recommendations
- Avoid using echo to print log messages. Instead, use the log wrappers:
    log_info "This will be printed with <timestamp> [INFO] prefix"
    log_debug "This will be printed with <timestamp> [DEBUG] prefix"
    log_error "This will be printed with <timestamp> [ERROR] prefix"
    log_warning "This will be printed with <timestamp> [WARNING] prefix"
- Use the exit_fail wrapper to exit your script after an error:
    if ! mycmd; then
      exit_fail "mycmd was unsuccessful"
    fi
- Do not invoke the aws CLI directly; use the aws_cli wrapper instead. This wrapper automatically adds the correct --region if none is specified (e.g. aws_cli ec2 describe-instance-types --instance-types m6i.xlarge).
- Do not install packages using a distro-specific package manager (yum, dnf, apt-get). Use the distro-agnostic wrappers defined in cluster_node_bootstrap/templates/linux/packages_management.sh.j2:
    # Install packages: packages_install <list_of_packages>
    packages_install kmod-lustre-client lustre-client
    # Remove packages: packages_remove <list_of_packages>
    packages_remove ntp
    # Other commands: packages_generic_command <command>
    packages_generic_command groupinstall "Server with GUI" -y --skip-broken
- Use verify_package_installed to quickly check whether a package is installed on your system:
    if ! verify_package_installed chrony; then
      log_info "chrony is not installed, installing it ... "
      packages_install chrony
    fi
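The guidelines above can be combined in a single customization snippet. The sketch below is illustrative only: the ensure_packages helper is a hypothetical name, and the stub definitions at the top are stand-ins (assumptions, not SOCA's implementations) so the snippet can be dry-run outside a SOCA node. On a real node the wrappers already exist and the stubs are skipped.

```shell
#!/bin/bash
# Stand-in stubs, defined only when the real SOCA wrappers are absent.
# Their bodies are assumptions made for this sketch, not SOCA's code.
if ! command -v log_info > /dev/null 2>&1; then
  log_info()  { echo "$(date '+%Y-%m-%d %H:%M:%S') [INFO] $*"; }
  log_error() { echo "$(date '+%Y-%m-%d %H:%M:%S') [ERROR] $*" >&2; }
  exit_fail() { log_error "$*"; exit 1; }
  # Dry-run behavior: pretend nothing is installed and installs succeed.
  verify_package_installed() { return 1; }
  packages_install() { log_info "(dry-run) would install: $*"; }
fi

# Hypothetical helper: install each package only when it is missing,
# and stop the script with exit_fail if an install fails.
ensure_packages() {
  for pkg in "$@"; do
    if verify_package_installed "${pkg}"; then
      log_info "${pkg} is already installed, skipping"
      continue
    fi
    log_info "${pkg} is not installed, installing it ..."
    if ! packages_install "${pkg}"; then
      exit_fail "Unable to install ${pkg}"
    fi
  done
}

ensure_packages chrony
```

Routing every message through the log wrappers and every fatal error through exit_fail keeps custom steps consistent with the rest of the bootstrap output.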
Templates Location
All Jinja2 templates are located under /apps/soca/<CLUSTER_ID>/cluster_node_bootstrap/:
.
├── compute_node # Compute Node Bootstrap Sequence
├── controller # Controller Node Bootstrap Sequence
├── login_node # Login Node Bootstrap Sequence
├── templates # All Templates (Windows & Linux)
└── windows_virtual_desktop # Windows VDI Bootstrap Sequence
Parent templates include the Linux and Windows child templates available under the templates/ directory:
.
├── linux
│ ├── aws_cloudwatch_agent.sh.j2
│ ├── aws_ssm_agent.sh.j2
│ ├── awscli.sh.j2
│ ├── efa.sh.j2
│ ├── epel.sh.j2
│ ├── gpu
│ │ ├── amd_drivers.sh.j2
│ │ ├── disable_nouveau_driver.sh.j2
│ │ ├── install_drivers.sh.j2
│ │ ├── nvidia_drivers.sh.j2
│ │ └── optimize_gpu.sh.j2
│ ├── shared_storage
│ │ ├── fsx
│ │ │ ├── lustre_client_tuning_postmount.sh.j2
│ │ │ └── lustre_client_tuning_prereboot.sh.j2
│ │ ├── mount_efs.sh.j2
│ │ ├── mount_fsx_lustre.sh.j2
│ │ ├── mount_fsx_ontap.sh.j2
│ │ ├── mount_fsx_openzfs.sh.j2
│ │ ├── mount_standalone_nfs.sh.j2
│ │ └── nfs_wrapper.sh.j2
# ... TRUNCATED - NOT ALL TEMPLATES ARE SHOWN ...
└── windows
  ├── auto_user_logon.ps.j2
  ├── awscli.ps.j2
  ├── dcv
  │ └── session_storage.ps.j2
  ├── disable_internetexplorer_enhanced_security.ps.j2
  ├── disable_user_access_control.ps.j2
  ├── join_activedirectory.ps.j2
  ├── log.ps.j2
  ├── tag_ebs.ps.j2
  └── wrapper_secretsmanager.ps.j2
Customize the bootstrap sequence
You can customize the entire bootstrap sequence by editing the general script templates (cluster_node_bootstrap/compute_node) or by editing the child templates directly in the cluster_node_bootstrap/templates folder.
Info
The compute_node templates (for HPC, login, and virtual desktop nodes) include a script called 99_setup_user_customization.sh.j2 where you can add any Linux commands you want to execute on the machine.
Danger - Read me
Modifying a file is effective immediately and will not require any service restart.
Any error in your script may prevent SOCA from successfully provisioning capacity. If you suspect an error, check the logs in the path mentioned above.
Always create a backup of the file before modifying it.
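One way to follow the backup advice is to keep a timestamped copy next to the original. This is a minimal sketch meant to be run by an administrator from an interactive shell, not from within a bootstrap template; backup_template is a hypothetical helper name and is not part of SOCA.

```shell
#!/bin/bash
# Hypothetical helper (not part of SOCA): copy a template to
# <file>.bak.<timestamp> before editing it, preserving mode and timestamps.
backup_template() {
  src="$1"
  if [ ! -f "${src}" ]; then
    echo "backup_template: ${src} not found" >&2
    return 1
  fi
  backup="${src}.bak.$(date '+%Y%m%d%H%M%S')"
  cp -p "${src}" "${backup}" || return 1
  # Print the backup path so the caller can restore from it later.
  echo "${backup}"
}

# Usage (replace <CLUSTER_ID> with your own cluster ID):
# backup_template "/apps/soca/<CLUSTER_ID>/cluster_node_bootstrap/compute_node/setup.sh.j2"
```

To roll back, copy the backup over the modified file. Because template changes take effect immediately, the restored version applies to the next node that is provisioned.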
Example file: cluster_node_bootstrap/compute_node/setup.sh.j2
This file contains the entire bootstrap sequence. It has minimal code and simply references other templates, such as:
# Disable SELINUX & firewalld
{% include "templates/linux/disable_selinux.sh.j2" %}
To understand what code will be generated, open cluster_node_bootstrap/templates/linux/disable_selinux.sh.j2:
# Begin: Disable SE Linux
{% if context.get("/configuration/BaseOS") in ("amazonlinux2", "amazonlinux2023", "centos7", "rhel7", "rhel8", "rhel9", "rocky8", "rocky9") %}
function disable_selinux () {
  log_info "Disable SELinux"
  if ! sestatus | grep -q "disabled"; then
    # switch SELinux to permissive mode for the current session
    setenforce 0
    # reboot is required to apply this change permanently. ensure reboot is the last line called from userdata.
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
    set_reboot_required "Disable SE Linux"
  fi
}
disable_selinux
{% endif %}
# End: Disable SE Linux
Supported SOCA variables
You may have noticed this specific condition:
{% if context.get("/configuration/BaseOS") in ("amazonlinux2", "amazonlinux2023", "centos7", "rhel7", "rhel8", "rhel9", "rocky8", "rocky9") %}
# your code
{% endif %}
context.get("/configuration/BaseOS") points to your actual SOCA environment configuration. Some examples include:
- "/configuration/BaseOS": The OS specified at install time (e.g. amazonlinux2)
- "/configuration/Region": The region where you deployed SOCA (e.g. us-east-2)
- "/configuration/PrivateSubnets": List of private subnets deployed with SOCA that you can use
- "/configuration/Analytics/enabled": Returns true if you have the AWS OpenSearch integration
- "/packages/system": Lists all Linux packages installed
- and more ...
There are over 200 supported variables. Refer to this page to learn more about all the variables you can use in your templates.