Install your Scale-Out Computing on AWS cluster
Step 1 - Prerequisites
Installation of SOCA is fully automated by the AWS Cloud Development Kit (CDK). Prior to installing SOCA:
- Make sure you have an AWS account with an EC2 SSH key pair configured.
- You must create an Amazon S3 bucket. This bucket will be used to store the installer artifacts.
- You should have a basic understanding of how the AWS Cloud Development Kit (CDK) works.
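The first two prerequisites can be prepared from the AWS CLI. The following is a minimal sketch, assuming the AWS CLI is installed and configured; the bucket name, key pair name and region are hypothetical placeholders. Commands are only printed by default (`DRY_RUN=1`) so you can review them before running anything.

```bash
#!/usr/bin/env bash
# Sketch only: bucket name, key pair name and region are placeholders.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = S3 bucket name, $2 = EC2 key pair name, $3 = AWS region
prepare_prerequisites() {
  # Create the bucket that will store the installer artifacts.
  run aws s3 mb "s3://$1" --region "$3"
  # Confirm the key pair exists in that region (fails if it does not).
  run aws ec2 describe-key-pairs --key-names "$2" --region "$3"
}
```

For example, `prepare_prerequisites my-soca-artifacts my-soca-key us-east-1` prints the two commands, and `DRY_RUN=0 prepare_prerequisites ...` executes them.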
Step 2 - Minimal IAM permissions required
IAM policies required to install SOCA
This step is optional if:
- You already have an IAM user/role with Admin privileges configured.
- You are planning to install SOCA via CloudShell while signed in to the console with Admin permissions (CloudShell uses the credentials of the signed-in user).
A list of the minimal IAM permissions required to install SOCA is provided in `installer/SOCAInstallerIamPolicy.json`.
Minimal IAM permissions
Depending on your deployment type, you may not need all of them. For example, if you disable the AWS Backup integration, the actions starting with `backup:` are not needed.
First, create an IAM policy that contains the minimal API calls required to install SOCA (SOCAInstallerIamPolicy):
- 1) Go to the IAM console, select Policies in the left sidebar menu, then click Create Policy.
- 2) Select JSON and copy/paste the content of `installer/SOCAInstallerIamPolicy.json`.
[Optional - not needed if you already have a policy/user that allows the `cdk bootstrap` command] Create another IAM policy that will be used by CDK to install SOCA (SOCAInstallerCDKIamPolicy):
- 1) Click Create Policy again to create a new policy for CDK.
- 2) Select JSON and copy/paste the content of `installer/SOCAInstallerCDKIamPolicy.json`. Make sure to replace `<AWS_ACCOUNT_ID_REPLACE_ME>` with your AWS account ID and `<POLICY_NAME_REPLACE_ME>` with the name of the policy you have just created (`SOCAInstallerIamPolicy` if you are using the same name mentioned in this doc).
Finally, create an IAM user or IAM role and attach the policies created above.
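If you prefer not to edit the CDK policy by hand, the two placeholders can be substituted with `sed`. This is a minimal sketch: `render_cdk_policy` is a hypothetical helper, and the account ID and output path in the usage comments are placeholders.

```bash
#!/usr/bin/env bash
# Sketch only: the helper name, account ID and output path are illustrative.

# Print the CDK policy template with both placeholders filled in.
# $1 = template path, $2 = 12-digit AWS account ID, $3 = installer policy name
render_cdk_policy() {
  sed -e "s/<AWS_ACCOUNT_ID_REPLACE_ME>/$2/g" \
      -e "s/<POLICY_NAME_REPLACE_ME>/$3/g" \
      "$1"
}

# Usage (requires AWS credentials), left as comments so nothing runs on paste:
#   render_cdk_policy installer/SOCAInstallerCDKIamPolicy.json \
#     123456789012 SOCAInstallerIamPolicy > /tmp/cdk-policy.json
#   aws iam create-policy --policy-name SOCAInstallerCDKIamPolicy \
#     --policy-document file:///tmp/cdk-policy.json
```

The rendered file can also be pasted into the JSON editor of the IAM console, exactly as described in the steps above.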
Step 2 - Download SOCA
You can choose to install SOCA via AWS CloudShell or using your own Linux/Mac/Windows computer:
=== "Install SOCA via AWS CloudShell"
The easiest option is to install SOCA via [AWS CloudShell](https://aws.amazon.com/cloudshell/).
Launch AWS CloudShell in the region where you plan to install SOCA by clicking the icon in the top right, as indicated in the picture below.
![](../../imgs/cloudshell.png)
This will open a shell where you can copy/paste commands. Run the following commands to prepare your CloudShell environment:
```bash
sudo yum install -y gcc python3.11
```
This command ensures your system has all required dependencies installed. Once done, update your system's default Python to python3.11.
```bash
sudo update-alternatives --install /usr/bin/python3 python3 /usr/bin/python3.11 2
# Verify default system python is now 3.11.x. Output of this command should be "Python 3.11.x"
python3 --version
```
AWS CloudShell's $HOME directory has a storage limitation. For the purpose of this workshop, we will create a new folder called /soca_installer and launch the installer from there.
Go back to the AWS CloudShell terminal and run this command to create the new folder:
```bash
sudo mkdir /soca_installer
```
Run this command to give your cloudshell-user required permissions for this folder:
```bash
sudo chown cloudshell-user /soca_installer
```
**Download SOCA**:
Scale-Out Computing on AWS is open-source and available on GitHub ([https://github.com/awslabs/scale-out-computing-on-aws](https://github.com/awslabs/scale-out-computing-on-aws)).
To get started, simply clone the repository:
~~~bash
# First, change into the installer directory
cd /soca_installer
# (Option 1) Clone the repo using HTTPS
git clone https://github.com/awslabs/scale-out-computing-on-aws .
# (Option 2) Clone the repo using SSH (use this instead of Option 1, not in addition)
git clone git@github.com:awslabs/scale-out-computing-on-aws.git .
~~~
Then move to Step 3 below.
=== "Install SOCA on your local system"
Make sure you have configured your IAM user/role (see "Minimal IAM permissions required" above) in your `awscli` and ensure `git` is installed and configured on your computer.
**Download SOCA**:
Scale-Out Computing on AWS is open-source and available on GitHub ([https://github.com/awslabs/scale-out-computing-on-aws](https://github.com/awslabs/scale-out-computing-on-aws)).
To get started, simply clone the repository:
~~~bash
# (Option 1) Clone the repo using HTTPS
git clone https://github.com/awslabs/scale-out-computing-on-aws .
# (Option 2) Clone the repo using SSH (use this instead of Option 1, not in addition)
git clone git@github.com:awslabs/scale-out-computing-on-aws.git .
~~~
Then move to Step 3 below.
Step 3 - Run the installer
Once you have cloned the repository, execute the `installer/soca_installer.sh` script. The installer will perform the following tasks:
- Check if Python3 is available on your system
- Create a custom Python virtual-environment and install required libraries
- Install NodeJS, NPM, CDK and AWS CLI if needed
- Set up your SOCA cluster
The installer is built with the AWS Cloud Development Kit (CDK). See the AWS CDK documentation for more information.
Execute the `soca_installer.sh` script located in the `installer` folder. Assuming your current working directory is the root level of SOCA, run `./installer/soca_installer.sh`.
Use a custom IAM policy
If you created a custom IAM policy for CDK (see "Minimal IAM permissions required" above), you must pass it via `./installer/soca_installer.sh --cdk-cloudformation-execution-policies <policy_arn>`.
Please note you must specify the policy ARN, not just the policy name.
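The ARN of a customer-managed policy can be derived from the account ID and the policy name. The sketch below assumes the standard `aws` partition and a policy created without a path; `policy_arn` is a hypothetical helper, and `SOCAInstallerCDKIamPolicy` is the policy name used earlier in this doc.

```bash
#!/usr/bin/env bash
# Sketch only: assumes the "aws" partition and a policy without a path prefix.

# Build the ARN of a customer-managed IAM policy.
# $1 = 12-digit AWS account ID, $2 = policy name
policy_arn() {
  echo "arn:aws:iam::$1:policy/$2"
}

# Usage (requires AWS credentials), left as comments so nothing runs on paste:
#   ACCOUNT_ID=$(aws sts get-caller-identity --query Account --output text)
#   ./installer/soca_installer.sh \
#     --cdk-cloudformation-execution-policies \
#     "$(policy_arn "${ACCOUNT_ID}" SOCAInstallerCDKIamPolicy)"
```

You can also copy the ARN directly from the policy's page in the IAM console.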
Use a non-default IAM profile
If you are planning to use a local IAM profile configured in your `~/.aws/credentials` other than `default`, you must pass it via `./installer/soca_installer.sh --profile <profile_name>`.
You will then be prompted for your cluster parameters. Follow the instructions and choose an S3 bucket you own, the name of your cluster, the SSH key pair to use, and other cluster parameters.
Silent Installation
You can pass all parameters via arguments to automate the installation process.
Run `./soca_installer.sh --help` to see all available options.
Once all the parameters are specified, the installer will run `cdk bootstrap`.
This action will create a staging S3 bucket and store all assets generated by CDK. No actions will be performed if you already have your environment enabled for CDK.
SOCA will then upload the scripts required to configure the scheduler (less than 100 MB) to the S3 bucket you specified during installation.
Finally, the installer will trigger a `cdk deploy` command and the deployment will start. This will create a new CloudFormation stack in your AWS account.
Once the CloudFormation stack is created, the installer will verify that your SOCA cluster is configured correctly. The installer will exit once your SOCA cluster is fully configured and reachable.
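If you want to follow the deployment from another terminal, the stack status can be queried with the AWS CLI. This is a minimal sketch, not what the installer itself runs: both helpers are hypothetical, and the stack name is a placeholder to replace with the name shown in your CloudFormation console.

```bash
#!/usr/bin/env bash
# Sketch only: helper names and the stack name are illustrative.

# Return 0 when a CloudFormation status string means the stack deployed successfully.
stack_is_ready() {
  case "$1" in
    CREATE_COMPLETE|UPDATE_COMPLETE) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the current status of a stack (requires AWS credentials).
# $1 = CloudFormation stack name
stack_status() {
  aws cloudformation describe-stacks --stack-name "$1" \
    --query 'Stacks[0].StackStatus' --output text
}

# Usage, left as comments so nothing runs on paste:
#   if stack_is_ready "$(stack_status my-soca-stack)"; then echo "deployed"; fi
```

Statuses such as `CREATE_IN_PROGRESS` or `ROLLBACK_COMPLETE` are treated as not ready, so the check can be repeated until the stack settles.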
Use existing AWS resources
If needed, you can tell SOCA to re-use existing AWS resources running on your AWS account. This is particularly useful for blue/green deployment or when you want to upgrade your SOCA cluster without affecting your production workflows.
Here is a list of all existing AWS resources you can specify when installing a new SOCA cluster:
- VPC
- Subnets
- Security Groups
- IAM roles
- AWS Directory Service
- Amazon OpenSearch Service (formerly Elasticsearch)
To re-use existing resources, enter "existing" when asked. The SOCA installer will automatically scan your AWS resources and provide you with options.
Configuration checks
The installer will verify your security group configuration and provide recommendations if your security groups are missing key rules.
The installer will automatically append the required SOCA policies when re-using IAM roles.
(Optional) Customize default values
SOCA gives you the ability to customize all resources created during the installation. For example, you can choose how many NAT Gateways to deploy (defaults to 1), the KMS key used to encrypt your filesystems (defaults to aws/key), the instance type to provision for the scheduler (defaults to m5.large), and more.
Edit `installer/default_config.yml` if you want to change the default values.
Uninstall SOCA
SOCA is managed by CloudFormation. To uninstall SOCA, simply delete the stack associated with your cluster. As a safety measure, SOCA backups (EFS, Scheduler) are not deleted by default; you will have to remove them manually from AWS Backup.
Using a custom CDK policy
Make sure that you add all required Delete API permissions if you are using a custom CDK policy.
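The stack can be deleted from the CloudFormation console or from the AWS CLI. The following is a minimal sketch of the CLI route; the stack name and region are placeholders, and commands are only printed by default (`DRY_RUN=1`) because deletion is destructive.

```bash
#!/usr/bin/env bash
# Sketch only: stack name and region are placeholders.

# Print the command when DRY_RUN=1 (the default), otherwise execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# $1 = CloudFormation stack name of your cluster, $2 = AWS region
uninstall_soca() {
  # Request deletion of the stack.
  run aws cloudformation delete-stack --stack-name "$1" --region "$2"
  # Block until CloudFormation reports the stack as deleted.
  run aws cloudformation wait stack-delete-complete --stack-name "$1" --region "$2"
}
```

Run `uninstall_soca my-soca-stack us-east-1` to review the commands, then repeat with `DRY_RUN=0` to execute them. Backups kept in AWS Backup are not touched by these commands.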
Post Install
What if SSH port (22) is blocked by your IT?
Scale-Out Computing on AWS supports AWS Systems Manager Session Manager in case your corporate firewall blocks the SSH port (22). Session Manager lets you open a secure shell on your EC2 instance through a web-based session.
First, access your AWS EC2 console, select your Scheduler instance, then click the "Connect" button.
Select "Session Manager" and click Connect.
You now have access to a secure shell directly within your browser.
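Session Manager is also reachable from a terminal through the AWS CLI, which avoids the browser. This is a minimal sketch, assuming the Session Manager plugin for the AWS CLI is installed locally; both helper names are hypothetical, and the instance ID is the one shown for your Scheduler instance in the EC2 console.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative.

# Return 0 if the argument looks like an EC2 instance ID
# ("i-" followed by 8 or 17 lowercase hexadecimal characters).
is_instance_id() {
  printf '%s\n' "$1" | grep -Eq '^i-([0-9a-f]{8}|[0-9a-f]{17})$'
}

# Open a Session Manager shell on the given instance (requires AWS credentials).
# $1 = EC2 instance ID of the scheduler
connect_via_ssm() {
  if ! is_instance_id "$1"; then
    echo "not an EC2 instance ID: $1" >&2
    return 1
  fi
  aws ssm start-session --target "$1"
}
```

For example, `connect_via_ssm i-0123456789abcdef0` opens the session; the format check only catches obvious mistakes such as passing the instance name instead of its ID.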
Operational Metrics
This solution includes an option to send anonymous operational metrics to AWS. We use this data to better understand how customers use this solution and related services and products. Note that AWS will own the data gathered via this survey. Data collection will be subject to the AWS Privacy Policy.
To opt out of this feature, modify `/apps/soca/$SOCA_CONFIGURATION/cluster_manager/cloudformation_builder` and set the `allow_anonymous_data_collection` variable to `False`.
When enabled, the following information is collected and sent to AWS:
- Solution ID: The AWS solution identifier
- Base Operating System: The operating system selected for the solution deployment
- Unique ID (UUID): Randomly generated, unique identifier for each solution deployment
- Timestamp: Data-collection timestamp
- Instance Data: Type or count of the state and type of instances that are provided for by the Amazon EC2 scheduler instance for each job in each AWS Region
- Keep Forever: If instances are running when no job is running
- EFA Support: If EFA support was selected
- Spot Support: If Spot support was invoked for new auto-scaling stacks
- Stack Creation Version: The version of the stack that is created or deleted
- Status: The status of the stack (stack_created or stack_deleted)
- Scratch Disk Size: The size of the scratch disk selected for each solution deployment
- Region: The region where the stack is deployed
- FSxLustre: If the job is using FSx for Lustre
What's next?
Learn how to access your cluster, how to submit your first job, or even how to change your Scale-Out Computing on AWS DNS to match your personal domain name.