General Recommendations

You can modify the bootstrap sequence of any SOCA node using Jinja2 templates, and extend these scripts with your own customizations.

Guidelines & Recommendations

Avoid running echo commands to print logs. Instead, use the log wrappers:

log_info "This will be printed with <timestamp> [INFO] prefix"
log_debug "This will be printed with <timestamp> [DEBUG] prefix"
log_error "This will be printed with <timestamp> [ERROR] prefix"
log_warning "This will be printed with <timestamp> [WARNING] prefix"

Use the exit_fail wrapper to exit your script after an error:

if ! mycmd; then
  exit_fail "mycmd was unsuccessful"
fi
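
To make the behavior of these wrappers concrete, here is a minimal sketch of how timestamped log helpers and an exit-on-error helper could be written. This is not SOCA's actual implementation: the timestamp format, the log_wrapper helper name and the exit status 1 are assumptions made for illustration only.

```shell
#!/usr/bin/env bash
# Illustrative stand-ins only: SOCA ships its own log_* and exit_fail wrappers.
# The timestamp format and the exit status used here are assumptions.

log_wrapper() {
  # $1 = log level, remaining arguments = message
  local level="$1"
  shift
  printf '%s [%s] %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "${level}" "$*"
}

log_info()  { log_wrapper "INFO" "$@"; }
log_error() { log_wrapper "ERROR" "$@"; }

exit_fail() {
  # Log the message as an error, then stop the script with a non-zero status
  log_error "$@"
  exit 1
}
```

Keeping the formatting in a single helper means every message shares the same prefix layout, which makes bootstrap logs easier to search.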

Do not invoke the AWS CLI directly; use the aws_cli wrapper instead. This wrapper automatically adds the correct --region if you do not specify one.

aws_cli ec2 describe-instance-types --instance-types m6i.xlarge
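
The "add --region only when it is missing" behavior can be sketched as follows. This is a hypothetical illustration, not the wrapper shipped with SOCA: the function name aws_cli_sketch and the SKETCH_REGION variable are assumptions introduced for this example.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: SOCA's real aws_cli wrapper may be implemented differently.
# SKETCH_REGION is an assumed variable holding the region of the cluster.

aws_cli_sketch() {
  local arg has_region="false"
  # Look for an explicit --region flag (both "--region x" and "--region=x" forms)
  for arg in "$@"; do
    case "${arg}" in
      --region|--region=*) has_region="true" ;;
    esac
  done

  if [[ "${has_region}" == "true" ]]; then
    # The caller chose a region: pass the arguments through unchanged
    aws "$@"
  else
    # No region given: inject the cluster region as a global option
    aws --region "${SKETCH_REGION:?SKETCH_REGION must be set}" "$@"
  fi
}
```

With SKETCH_REGION set to us-east-2, calling aws_cli_sketch ec2 describe-instance-types --instance-types m6i.xlarge would run the command against us-east-2, while a call that already contains --region is left untouched.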

Do not install packages with a distro-specific package manager (yum, dnf, apt-get). Use the distro-agnostic wrappers defined in cluster_node_bootstrap/templates/linux/packages_management.sh.j2:

# Install packages: packages_install <list_of_packages>
packages_install kmod-lustre-client lustre-client

# Remove packages: packages_remove <list_of_packages>
packages_remove ntp

# Other commands: packages_generic_command <command>
packages_generic_command groupinstall "Server with GUI" -y --skip-broken

Use verify_package_installed to check whether a package is installed on your system:

if ! verify_package_installed chrony; then
  log_info "chrony is not installed, installing it ..."
  packages_install chrony
fi

Use file_download to download a file from an HTTPS or S3 location:

# Simple download and save target as a specific file name 
file_download --download-url "https://example.com/test.txt" --save-as "test.txt"

# Detect S3 URI and download from S3 via awscli automatically 
file_download --download-url "s3://mybucket/folder/test.txt" --save-as "test.txt"

# Download from HTTPS or S3 and also verify data integrity with a SHA-256 checksum
file_download --download-url "https://example.com/test.txt" --save-as "test.txt" --sha256-checksum "XXXX"
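
For readers unfamiliar with this kind of integrity check, the sketch below shows one way a downloaded file can be compared against an expected SHA-256 digest. It is a hypothetical helper, not the code behind file_download: the function name verify_sha256_sketch is an assumption, and it relies on the sha256sum utility from GNU coreutils.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of SOCA: returns 0 when the file matches
# the expected SHA-256 digest and a non-zero status otherwise.

verify_sha256_sketch() {
  local file_path="$1"
  local expected="$2"
  local actual

  # sha256sum prints "<digest>  <file>"; keep only the digest
  actual="$(sha256sum "${file_path}" | awk '{print $1}')"

  # Compare case-insensitively, since digests are sometimes published in uppercase
  [[ "${actual,,}" == "${expected,,}" ]]
}
```

A caller could then stop the bootstrap when the digests differ, for example with if ! verify_sha256_sketch "test.txt" "XXXX"; then exit_fail "checksum mismatch"; fi.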

Templates Location

All Jinja2 templates are located in /opt/soca/<CLUSTER_ID>/cluster_node_bootstrap/:

.
├── compute_node # Compute Node Bootstrap Sequence
├── controller # Controller Node Bootstrap Sequence
├── login_node # Login Node Bootstrap Sequence
├── templates # All Templates (Windows & Linux)
└── windows_virtual_desktop # Windows VDI Bootstrap Sequence

Parent templates include the Linux and Windows child templates available under the templates/ directory:

.
├── linux
│   ├── aws_cloudwatch_agent.sh.j2
│   ├── aws_ssm_agent.sh.j2
│   ├── awscli.sh.j2
│   ├── efa.sh.j2
│   ├── epel.sh.j2
│   ├── gpu
│   │   ├── amd_drivers.sh.j2
│   │   ├── disable_nouveau_driver.sh.j2
│   │   ├── install_drivers.sh.j2
│   │   ├── nvidia_drivers.sh.j2
│   │   └── optimize_gpu.sh.j2
│   ├── shared_storage
│   │   ├── fsx
│   │   │   ├── ontap
│   │   │   │   └── first_setup.sh.j2
│   │   │   └── lustre
│   │   │       ├── lustre_client_tuning_postmount.sh.j2
│   │   │       └── lustre_client_tuning_prereboot.sh.j2
│   │   ├── mount_efs.sh.j2
│   │   ├── mount_fsx_lustre.sh.j2
│   │   ├── mount_fsx_ontap.sh.j2
│   │   ├── mount_fsx_openzfs.sh.j2
│   │   ├── mount_s3.sh.j2
│   │   ├── mount_standalone_nfs.sh.j2
│   │   └── fstab_wrapper.sh.j2
# ... TRUNCATED - NOT ALL TEMPLATES ARE SHOWN ...
└── windows
    ├── auto_user_logon.ps.j2
    ├── awscli.ps.j2
    ├── dcv
    │   └── session_storage.ps.j2
    ├── disable_internetexplorer_enhanced_security.ps.j2
    ├── disable_user_access_control.ps.j2
    ├── join_activedirectory.ps.j2
    ├── log.ps.j2
    ├── tag_ebs.ps.j2
    └── wrapper_secretsmanager.ps.j2

Customize the bootstrap sequence

You can customize the entire bootstrap sequence by editing the general script templates (for example cluster_node_bootstrap/compute_node) or by editing the child templates directly in the cluster_node_bootstrap/templates folder.

Info

The compute_node templates (for HPC, Login and Virtual Desktop nodes) include a script called 99_setup_user_customization.sh.j2 where you can add any Linux commands you want to execute on the machine.
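
As an illustration, a customization added to that script could look like the sketch below. The function name install_site_tools and the package names are placeholders. The three fallback definitions at the top are hypothetical stand-ins that are only created when the real SOCA wrappers are not already defined, so that the snippet can be dry-run outside a SOCA node.

```shell
#!/usr/bin/env bash
# Fallback stand-ins (hypothetical) so this sketch can be dry-run outside SOCA.
# On a SOCA node the real wrappers are expected to be defined already.
declare -F log_info >/dev/null || log_info() { printf '[INFO] %s\n' "$*"; }
declare -F verify_package_installed >/dev/null || verify_package_installed() { command -v "$1" >/dev/null 2>&1; }
declare -F packages_install >/dev/null || packages_install() { log_info "(dry run) would install: $*"; }

# Example customization: install each requested package only if it is missing.
install_site_tools() {
  local pkg
  for pkg in "$@"; do
    if ! verify_package_installed "${pkg}"; then
      log_info "${pkg} is not installed, installing it ..."
      packages_install "${pkg}"
    fi
  done
}

# Placeholder package names: replace with the tools your site needs
install_site_tools htop tmux
```

Going through the wrappers rather than calling yum, dnf or apt-get keeps the customization portable across the base operating systems supported by the cluster.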

Danger - Read me

Modifying a file is effective immediately and will not require any service restart.

Any error in your script may prevent SOCA from successfully provisioning capacity. If you suspect an error, check the logs in the path mentioned above.

Always create a backup of the file before modifying it.

Example file: cluster_node_bootstrap/compute_node/setup.sh.j2

This file contains the entire bootstrap sequence. It contains minimal code and simply references other templates, such as:

# Disable SELINUX & firewalld
{% include "templates/linux/disable_selinux.sh.j2" %}

To understand what code will be generated, open cluster_node_bootstrap/templates/linux/disable_selinux.sh.j2:

# Begin: Disable SE Linux
{% if context.get("/configuration/BaseOS") in ("amazonlinux2", "amazonlinux2023", "centos7", "rhel7", "rhel8", "rhel9", "rocky8", "rocky9") %}
function disable_selinux () {
  log_info "Disable SELinux"
  if ! sestatus | grep -q "disabled"; then
    # switch SELinux to permissive mode for the current session
    setenforce 0
    # reboot is required to apply this change permanently. ensure reboot is the last line called from userdata.
    sed -i 's/SELINUX=enforcing/SELINUX=disabled/g' /etc/selinux/config
    set_reboot_required "Disable SE Linux"
  fi
}
disable_selinux
{% endif %}
# End: Disable SE Linux

This is the content that will be rendered into the shell script, which is generated from the Jinja2 templates.

Supported SOCA variables

You may have noticed this specific condition:

{% if context.get("/configuration/BaseOS") in ("amazonlinux2", "amazonlinux2023", "centos7", "rhel7", "rhel8", "rhel9", "rocky8", "rocky9") %}
# code
{% endif %}

context.get("/configuration/BaseOS") points to your actual SOCA environment configuration. Some examples include:

"/configuration/BaseOS": The OS specified at install time (e.g. amazonlinux2)
"/configuration/Region": The region where you deployed SOCA (e.g. us-east-2)
"/configuration/PrivateSubnets": List of private subnets deployed with SOCA that you can use
"/configuration/Analytics/enabled": Returns true if the OpenSearch integration is enabled
and more ...

There are over 200 supported variables. Refer to this page to learn more about the variables you can use in your templates.