Your SOCA environment configuration is stored in AWS Systems Manager Parameter Store under the following prefixes:
- /soca/CLUSTER_ID/configuration/
: Stores SOCA environment-specific parameters such as BaseOS, ClusterId, AuthProvider, IAM roles, Security Groups, etc.
- /soca/CLUSTER_ID/system/
: Stores SOCA system-specific variables such as the list of packages to install and the download links for Python, OpenMPI, CloudWatch Log Agent, EFA, etc.
SSM keys are organized by hierarchy:
- Query the parent folder (e.g. /soca/CLUSTER_ID/configuration/Cache/) to get all child keys
- Query a specific child key, for example:
  - /soca/CLUSTER_ID/configuration/Cache/port
  - /soca/CLUSTER_ID/configuration/Cache/kms_key_id
  - /soca/CLUSTER_ID/configuration/Cache/enabled
  - etc.
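The hierarchy behaves like a path-prefix lookup. Here is a minimal sketch in Python, using an in-memory dictionary as a stand-in for Parameter Store. The cluster name `soca-demo`, the helper names, and all values are made up for illustration; a real lookup would go through the SSM API or `socactl`.

```python
# Hypothetical in-memory stand-in for Parameter Store. Keys follow the
# /soca/CLUSTER_ID/... layout described above; the values are made up.
PARAMETERS = {
    "/soca/soca-demo/configuration/Cache/enabled": "true",
    "/soca/soca-demo/configuration/Cache/port": "6379",
    "/soca/soca-demo/configuration/BaseOS": "amazonlinux2",
}


def get_children(parameters: dict[str, str], parent: str) -> dict[str, str]:
    """Return every key stored under the parent folder."""
    # Normalize so ".../Cache" and ".../Cache/" select the same subtree,
    # and so ".../Cache" does not also match ".../CacheOther".
    prefix = parent.rstrip("/") + "/"
    return {k: v for k, v in parameters.items() if k.startswith(prefix)}


def get_child(parameters: dict[str, str], key: str) -> str:
    """Return the value of one specific child key."""
    return parameters[key]


cache_settings = get_children(PARAMETERS, "/soca/soca-demo/configuration/Cache/")
print(sorted(cache_settings))
print(get_child(PARAMETERS, "/soca/soca-demo/configuration/Cache/port"))
```

Against the real service, the parent-folder case corresponds to the SSM `GetParametersByPath` operation and the single-key case to `GetParameter`.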
How to update your configuration
It's NOT RECOMMENDED to update your configuration directly in the AWS Parameter Store console, because configuration values are cached in Amazon ElastiCache to improve performance. To update your configuration successfully, you must also invalidate the cached value.
To make this easier, we have developed a new socactl command-line (CLI) utility that performs both actions (updating the SSM parameter value and updating ElastiCache) at the same time.
Info
The socactl utility must be executed from the controller host.
Invoke socactl via /apps/soca/<CLUSTER_ID>/cluster_manager/socactl
SSM keys are prefixed with /soca/<CLUSTER_ID>/. socactl automatically adds this prefix if you query just the key name (e.g. /configuration/BaseOS).
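The prefix rule in the note above can be pictured with a short Python sketch. This only mirrors the documented behavior; it is not socactl's actual implementation, and the function name `full_ssm_key` and the cluster name `soca-demo` are hypothetical.

```python
def full_ssm_key(cluster_id: str, key: str) -> str:
    """Expand a short key name to its full Parameter Store path.

    Illustration only: mirrors the behavior described in the note,
    not socactl's real code.
    """
    prefix = f"/soca/{cluster_id}"
    if key == prefix or key.startswith(prefix + "/"):
        return key  # already fully qualified, leave untouched
    return f"{prefix}/{key.lstrip('/')}"


print(full_ssm_key("soca-demo", "/configuration/BaseOS"))
print(full_ssm_key("soca-demo", "/soca/soca-demo/configuration/BaseOS"))
```

Both calls resolve to the same full path, which is why the examples that follow can pass short keys such as `/configuration/PrivateSubnets`.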
Here is a simple example assuming you want to update the list of private subnets configured on your SOCA environment:
Retrieve Configuration Key
First, check the current private subnets associated with your cluster using the config get command:
./socactl config get \
--key "/configuration/PrivateSubnets"
['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']
This is the value stored in AWS SSM. You can also check that the cached value is the same using the cache get command:
./socactl cache get \
--key "/configuration/PrivateSubnets"
['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']
Update Configuration Key
Now let's assume you no longer want to use subnet-0bc3f861ffb2322fe. You can update the value via:
./socactl config set \
--key "/configuration/PrivateSubnets" \
--value "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']"
Cache updated
Success: Key has been updated successfully
You can verify that SSM and the cache were both updated correctly and are in sync:
./socactl config get --key "/configuration/PrivateSubnets"
['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']
./socactl cache get --key "/configuration/PrivateSubnets"
['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']
Retrieve Configuration Key History
AWS Parameter Store supports versioning. You can get the entire history of a specific configuration parameter via config history:
./socactl config history --key "/configuration/PrivateSubnets" --output json
{
"2": {
"Version": 2,
"Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']",
"LastModifiedDate": "2024-09-26 09:56:50.202000+00:00"
},
"1": {
"Version": 1,
"Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
"LastModifiedDate": "2024-09-23 12:56:02.767000+00:00"
}
}
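If you want to process this history in a script, the following Python sketch parses the JSON shown above and reports what changed between the two most recent versions. It assumes the output has exactly the shape shown in the example; the variable names are illustrative.

```python
import ast
import json

# Output of `config history --output json`, copied from the example above.
history_json = """
{
  "2": {
    "Version": 2,
    "Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']",
    "LastModifiedDate": "2024-09-26 09:56:50.202000+00:00"
  },
  "1": {
    "Version": 1,
    "Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
    "LastModifiedDate": "2024-09-23 12:56:02.767000+00:00"
  }
}
"""

history = json.loads(history_json)

# Keys are version numbers stored as strings, so sort them numerically.
versions = sorted(history, key=int)
current, previous = history[versions[-1]], history[versions[-2]]

# Each Value is a string holding a Python-style list literal.
current_subnets = set(ast.literal_eval(current["Value"]))
previous_subnets = set(ast.literal_eval(previous["Value"]))

removed = previous_subnets - current_subnets
print(f"v{previous['Version']} -> v{current['Version']}: removed {sorted(removed)}")
```

Sorting the keys with `key=int` avoids the string-ordering problem where version "10" would sort before version "2".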
Rollback Configuration Key to a previous version
Additionally, you can easily roll back to a previous configuration if needed (note: use --version to roll back to a specific version):
./socactl config rollback --key "/configuration/PrivateSubnets"
No --version specified, Rollback to the previous version
RollBack to: {'Version': 1, 'Value': "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']", 'LastModifiedDate': datetime.datetime(2024, 9, 23, 12, 56, 2, 767000, tzinfo=tzlocal())}
Confirm (use --force to skip)? (Yes, No): yes
Cache updated
Success: Key has been updated successfully
Let's verify the config has been updated successfully:
./socactl config get \
--key "/configuration/PrivateSubnets"
['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']
# See updated history with the new entry
./socactl config history --key "/configuration/PrivateSubnets" --output json
{
"3": {
"Version": 3,
"Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
"LastModifiedDate": "2024-09-26 11:34:52.816000+00:00"
},
"2": {
"Version": 2,
"Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef']",
"LastModifiedDate": "2024-09-26 09:56:50.202000+00:00"
},
"1": {
"Version": 1,
"Value": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
"LastModifiedDate": "2024-09-23 12:56:02.767000+00:00"
}
}
Retrieve the entire configuration
Run ./socactl config snapshot --output json to quickly visualize your entire SOCA environment configuration:
{
"/cdk_completed": "true",
"/configuration/ClusterId": "soca-mcrozes",
"/configuration/ControllerInstanceId": "i-0f59375e417e71f4e",
"/configuration/DCVAllowedInstances": "['m7i-flex.*', 'm7i.*', 'm6i.*', 'm6g.*', 'm5.*', 'g4dn.*', 'g4ad.*', 'g5.*', 'g6.*', 'gr6.*', 'c6i.*', 'c6a.*', 'r6i.*', 'r6a.*']",
"/configuration/HPC": "{'scheduler_engine': 'openpbs', 'deployment_type': 'tgz'}",
"/configuration/PrivateSubnets": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
"/configuration/PublicSubnets": "['subnet-01df1d82e8e125daf', 'subnet-02747c0e81532b9b9', 'subnet-0ac8c9c3cf4fdb82a']",
"/configuration/SchedulerDeploymentType": "tgz",
"/configuration/BaseOS": "amazonlinux2",
"/configuration/ClusterId": "soca-mcrozes",
"/configuration/ComputeNodeIAMRole": "soca--ComputeNodeRole7A9ECFBB-jFIV23XPETZq",
"/configuration/ControllerSecurityGroup": "sg-0e8a210e8d91a8fb4",
...... TRUNCATED
Please note that all data stored on SSM is cast as a string.
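Because every value comes back as a string, scripts that consume the snapshot need to convert lists, dictionaries, and booleans themselves. The sketch below shows one possible approach in Python using values copied from the snapshot above; `parse_value` is a hypothetical helper, not part of SOCA.

```python
import ast

# Values copied from the snapshot above: everything arrives as a string.
snapshot = {
    "/cdk_completed": "true",
    "/configuration/BaseOS": "amazonlinux2",
    "/configuration/HPC": "{'scheduler_engine': 'openpbs', 'deployment_type': 'tgz'}",
    "/configuration/PrivateSubnets": "['subnet-0145161ba2ffb677a', 'subnet-0102a5b1b21cb1aef', 'subnet-0bc3f861ffb2322fe']",
}


def parse_value(raw: str):
    """Best-effort conversion of a stored string into a Python object."""
    # "true"/"false" are not Python literals, so handle them explicitly.
    if raw.lower() in ("true", "false"):
        return raw.lower() == "true"
    try:
        # Safely evaluates literals (lists, dicts, numbers); never runs code.
        return ast.literal_eval(raw)
    except (ValueError, SyntaxError):
        # Plain strings such as "amazonlinux2" are returned unchanged.
        return raw


subnets = parse_value(snapshot["/configuration/PrivateSubnets"])
hpc = parse_value(snapshot["/configuration/HPC"])
print(len(subnets), hpc["scheduler_engine"], parse_value(snapshot["/cdk_completed"]))
```

The list and dictionary values use single quotes, so they are Python-style literals rather than valid JSON; that is why this sketch uses `ast.literal_eval` instead of `json.loads`. Note that a purely numeric string would be converted to a number, which may or may not be what you want.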
Integration with Bootstrap Nodes Setup
You can use all these variables in your template scripts (except in UserData, where only a subset of these variables is available).
To retrieve a variable, simply use context.get("<VARIABLE_NAME>"). See the examples below:
Example: Configure OpenLDAP or Microsoft AD
# Configure OpenLDAP or Microsoft AD
{% if context.get("/configuration/UserDirectory/provider") in ["existing_openldap", "openldap"] %}
# OpenLDAP configuration
{% include "templates/linux/openldap_client.sh.j2" %}
{% elif context.get("/configuration/UserDirectory/provider") in ["existing_activedirectory","aws_ds_managed_activedirectory", "aws_ds_simple_activedirectory"] %}
# Active Directory configuration
{% include "templates/linux/join_activedirectory.sh.j2" %}
{% else %}
log_error "UserDirectory/provider must be existing_activedirectory, existing_openldap, openldap, aws_ds_simple_activedirectory, aws_ds_managed_activedirectory, detected {{ context.get("/configuration/UserDirectory/provider") }}"
exit 1
{% endif %}
Example: Mount /data based on the FileSystem configured:
# /data
{% if context.get("/configuration/FileSystemDataProvider") == "efs" %}
mount_efs "{{ context.get("/configuration/FileSystemData") }}:/" "/data"
{% elif context.get("/configuration/FileSystemDataProvider") == "fsx_openzfs" %}
mount_fsx_openzfs "{{ context.get("/configuration/FileSystemData") }}:/fsx" "/data"
{% elif context.get("/configuration/FileSystemDataProvider") == "fsx_lustre" %}
mount_fsx_lustre "{{ context.get("/configuration/FileSystemData") }}" "/data"
{% elif context.get("/configuration/FileSystemDataProvider") == "fsx_ontap" %}
mount_fsx_ontap "{{ context.get("/configuration/FileSystemData") }}:/vol1" "/data"
{% endif %}
Additionally, we expose an ephemeral /job/* hierarchy tree for job-specific information (JobID, JobOwner).
This tree is not stored in AWS Systems Manager Parameter Store but is made available temporarily during the bootstrap sequence on all EC2 nodes provisioned for the given job.
Example: Add EFA configuration if efa_support=True is specified during job submission
{% if context.get("/job/Efa", False) %}
{% include "templates/linux/efa.sh.j2" %}
{% endif %}
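The second argument in `context.get("/job/Efa", False)` acts as a default for jobs that do not define the key. The Python sketch below illustrates that fallback, assuming `context` behaves like a Python mapping (this page does not state its exact type). The sample contexts and the `/job/JobId` key are hypothetical.

```python
# Hypothetical contexts, modeled as plain dicts for illustration only.
job_with_efa = {"/job/JobId": "1234", "/job/Efa": True}
job_without_efa = {"/job/JobId": "1235"}


def efa_requested(context: dict) -> bool:
    """Mirror the template test: a missing key falls back to False."""
    return bool(context.get("/job/Efa", False))


print(efa_requested(job_with_efa))
print(efa_requested(job_without_efa))
```

Under this assumption, the EFA template is included only for the first job; the second job has no `/job/Efa` entry, so the default applies and the block is skipped.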