Launch AlwaysOn nodes
Why AlwaysOn instances?
By default, Engineering Development Hub provisions on-demand capacity only when jobs are waiting in the normal queue. As a result, every submitted job waits in the queue for 5 to 8 minutes until EC2 capacity is ready.
To avoid this delay, you can provision "AlwaysOn instances" (or pre-bake an on-demand AMI). Note that you are charged for AlwaysOn instances until you terminate them manually or specify the --terminate_when_idle option.
How to launch AlwaysOn instances?
```bash
./edhctl nodes create-always-on --scheduler-identifier openpbs-default-soca-te61 \
  --job-owner mcrozes \
  --job-queue alwayson \
  --instance-type c6i.xlarge \
  --nodes 2 --base-os amazonlinux2023 \
  --instance-ami ami-025ca978d4c1d9825
Capacity tagged alwayson-5d4a6513-fffc-4c24-bc69-ee7b1d762390 has been created. This capacity will run 24/7 until you delete the associated Cloudformation stack.
```
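Because the capacity keeps running until its CloudFormation stack is deleted, it can help to record the capacity tag printed by the command. The following is a minimal sketch, not part of `edhctl`: the helper name `extract_capacity_tag` is hypothetical, and it assumes the confirmation message is written to standard output and that the tag always has the `alwayson-<UUID>` shape seen in the example above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of edhctl): read the create-always-on
# confirmation message on stdin and print the first capacity tag found.
# Assumption: the tag looks like "alwayson-" followed by a lowercase UUID,
# as in the single example output shown above.
extract_capacity_tag() {
  grep -oE 'alwayson-[0-9a-f]{8}(-[0-9a-f]{4}){3}-[0-9a-f]{12}' | head -n 1
}
```

For example, piping the output of `./edhctl nodes create-always-on ...` into `extract_capacity_tag` prints only the tag, which you can then save in a file or shell variable for later cleanup.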
```bash
# List the configured schedulers to find the value for --scheduler-identifier
./edhctl config get --key "/configuration/Schedulers/" --output json
{
  "/configuration/Schedulers/openpbs-default-soca-te61": {
    "enabled": true,
    "provider": "openpbs",
    "endpoint": "ip-11-0-64-70.us-east-2.compute.internal",
    "binary_folder_paths": "/opt/edh/soca-te61/schedulers/default/pbs/bin",
    "soca_managed_nodes_provisioning": true,
    "identifier": "openpbs-default-soca-te61",
    "pbs_configuration": {
      "install_prefix_path": "/opt/edh/soca-te61/schedulers/default/pbs",
      "pbs_home": "/opt/edh/soca-te61/schedulers/default/pbs/var/spool/pbs"
    }
  }
}
```
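The `identifier` field in this output is the value passed to `--scheduler-identifier` in the launch example. As a rough sketch, assuming `jq` is installed and that the JSON keeps the shape shown above (a top-level object whose values are scheduler objects), the identifiers can be pulled out as follows. The function name `list_scheduler_identifiers` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of edhctl): read the JSON produced by
# "edhctl config get --key /configuration/Schedulers/ --output json" on stdin
# and print one scheduler identifier per line.
# Assumption: every top-level value is an object with an "identifier" field.
list_scheduler_identifiers() {
  jq -r '.[] | .identifier'
}
```

A typical use would be `./edhctl config get --key "/configuration/Schedulers/" --output json | list_scheduler_identifiers`.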
```bash
./edhctl nodes create-always-on --help
Usage: app.py nodes create-always-on [OPTIONS]

Options:
  --scheduler-identifier TEXT     Create a new always on for a specific
                                  [required]
  --job-owner TEXT                Job owner for the new always on node(s)
                                  [required]
  --job-queue TEXT                Job Queue  [required]
  --instance-type TEXT            Instance Type to provision  [required]
  --nodes INTEGER                 Number of nodes to create  [required]
  --instance-ami TEXT             EC2 Image ID for the node(s)  [required]
  --base-os [amazonlinux2|amazonlinux2023|rhel8|rhel9|rocky8|rocky9|ubuntu2204|ubuntu2404]
                                  Operating system to use  [required]
  --root-size INTEGER             Size of the root partition. Will use the
                                  default one configured for your AMI if not
                                  set
  --instance-profile TEXT         IAM Instance Profile for the node(s)
  --security-groups TEXT          Security group(s) to assign
  --subnet-id TEXT                Subnet(s) to deploy capacity on
  --capacity-reservation-id TEXT  Capacity Reservation ID to use
  --anonymous-metrics / --no-anonymous-metrics
                                  Enable or disable anonymous data tracking
  --fsx-lustre TEXT               FSx for Lustre association: True for new
                                  FSxL or provide fs-id
  --fsx-lustre-size INTEGER       FSx for Lustre size if used
  --fsx-lustre-deployment-type TEXT
                                  FSx for Lustre deployment type
  --fsx-lustre-per-unit-throughput INTEGER
                                  FSxL throughput per unit
  --fsx-lustre-storage-type TEXT  FSxL storage type
  --scratch-iops INTEGER          Use io2 for scratch instead of gp3 (IOPS)
  --scratch-size INTEGER          Custom /scratch size in GiB
  --spot-price TEXT               Spot price (float, int, or 'auto')
  --spot-allocation-count INTEGER
                                  Spot allocation count
  --spot-allocation-strategy [capacity-optimized|lowest-price|diversified]
                                  Spot allocation strategy
  --keep-ebs / --no-keep-ebs      Preserve EBS after capacity deletion
  --placement-group / --no-placement-group
                                  Enable placement group for the node(s)
  --efa-support / --no-efa-support
                                  Enable Elastic Fabric Adapter (EFA)
  --force-ri / --no-force-ri      Require Reserved Instance only
  --ht-support / --no-ht-support  Enable or disable Hyper-Threading
  --help                          Show this message and exit.
```
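Since seven options are marked `[required]` and the rest are optional, it can be convenient to assemble the command from variables and review it before launching billable capacity. The sketch below is one possible way to do that, not an official wrapper: the function name `build_always_on_cmd` and its argument order are hypothetical, and it only prints the command (a dry run) rather than executing it.

```bash
#!/usr/bin/env bash
# Hypothetical dry-run helper (not part of edhctl): print a create-always-on
# command built from the seven required values, followed by any optional
# flags from the help output (for example --root-size 50 or --keep-ebs).
# Argument order is an assumption made for this sketch:
#   scheduler owner queue instance_type nodes base_os ami [extra flags...]
build_always_on_cmd() {
  if [ "$#" -lt 7 ]; then
    echo "usage: build_always_on_cmd SCHEDULER OWNER QUEUE TYPE NODES BASE_OS AMI [FLAGS...]" >&2
    return 1
  fi
  local scheduler="$1" owner="$2" queue="$3" instance_type="$4"
  local nodes="$5" base_os="$6" ami="$7"
  shift 7

  # --nodes is documented as INTEGER, so reject anything else early.
  if ! [[ "$nodes" =~ ^[1-9][0-9]*$ ]]; then
    echo "nodes must be a positive integer, got: $nodes" >&2
    return 1
  fi

  local cmd=(./edhctl nodes create-always-on
    --scheduler-identifier "$scheduler"
    --job-owner "$owner"
    --job-queue "$queue"
    --instance-type "$instance_type"
    --nodes "$nodes"
    --base-os "$base_os"
    --instance-ami "$ami"
    "$@")  # optional flags are passed through unchanged

  # %q quotes each word so the printed line can be pasted back into a shell.
  printf '%q ' "${cmd[@]}"
  printf '\n'
}
```

Calling `build_always_on_cmd openpbs-default-soca-te61 mcrozes alwayson c6i.xlarge 2 amazonlinux2023 ami-025ca978d4c1d9825 --keep-ebs` prints the full command line with the values from the launch example, so you can check it before running it yourself.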
List all AlwaysOn instances
```bash
# This command will return all EC2 instance information
./edhctl nodes list --node-lifecycles alwayson
# You can filter with jq to only get specific info such as the Private DNS Name of all your Always On instances
./edhctl nodes list --node-lifecycles alwayson | jq -r '.[].PrivateDnsName'
ip-11-0-64-70.us-east-2.compute.internal
ip-11-0-250-76.us-east-2.compute.internal
ip-11-0-243-250.us-east-2.compute.internal
ip-11-0-253-157.us-east-2.compute.internal