Manifest Deployment Guide

Deploying Kubeflow with RDS and S3 using Kustomize or Helm

Note: Helm installation option is still in preview.

This guide can be used to deploy Kubeflow Pipelines (KFP) and Katib with RDS and S3.

RDS

Amazon Relational Database Service (RDS) is a managed relational database service that facilitates several database management tasks such as database scaling, database backups, database software patching, OS patching, and more.

In the default Kubeflow installation, the KFP and Katib components both use their own MySQL pod to persist KFP data (such as experiments, pipelines, jobs, etc.) and Katib experiment observation logs, respectively.

Compared to the MySQL setup in the default installation, using RDS provides the following advantages:

Availability: RDS provides high availability and failover support for DB instances using Multi Availability Zone (Mulit-AZ) deployments with a single standby DB instance, increasing the availability of KFP and Katib services during unexpected network events.
Scalability: RDS can be configured to handle availability and scaling needs. The default Kubeflow installation uses an EBS-hosted Persistent Volume Claim that is AZ-bound and does not support automatic online resizing.
Persistent data: KFP and Katib data can persist beyond single Kubeflow installations. Using RDS decouples the KFP and Katib datastores from the Kubeflow deployment, allowing multiple Kubeflow installations to reuse the same RDS instance provided that the KFP component versions store data in a format that is compatible.
Customization and management: RDS provides management features to facilitate changing database instance types, updating SQL versions, and more.

S3

Amazon Simple Storage Service (S3) is an object storage service that is highly scalable, available, secure, and performant.

In the default Kubeflow installation, the KFP component uses the MinIO object storage service that can be configured to store objects in S3. However, by default the installation hosts the object store service locally in the cluster. KFP stores data such as pipeline architectures and pipeline run artifacts in MinIO.

Configuring MinIO to read and write to S3 provides the following advantages:

Scalability and availability: S3 offers industry-leading scalability and availability and is more durable than the default MinIO object storage solution provided by Kubeflow.
Persistent artifacts: KFP artifacts can persist beyond single Kubeflow installations. Using S3 decouples the KFP artifact store from the Kubeflow deployment, allowing multiple Kubeflow installations to access the same artifacts provided that the KFP component versions store data in a format that is compatible.
Customization and management: S3 provides management features to help optimize, organize, and configure access to your data to meet your specific business, organizational, and compliance requirements.

To get started with configuring and installing your Kubeflow installation with RDS and S3 follow the install steps below to configure and deploy the Kustomize manifest.

Install

The following steps show how to configure and deploy Kubeflow with supported AWS services.

Using only RDS or only S3

Steps relevant only to the RDS installation are prefixed with [RDS].

Steps relevant only to the S3 installation are prefixed with [S3].

Steps without any prefixing are necessary for all installations.

To install for only RDS or only S3, complete the steps relevant to your installation choice.

To install for both RDS and S3, complete all the steps below.

1.0 Prerequisites

Follow the steps in Prerequisites to make sure that you have everything you need to get started.

Make sure you are starting from the repository root directory. Export the below variable:

export REPO_ROOT=$(pwd)

2.0 Set up RDS, S3, and configure Secrets

There are two ways to create RDS and S3 resources before you deploy the Kubeflow manifests. Either use the automated setup Python script that is mentioned in the following step, or follow the manual setup instructions.

As of Kubeflow 1.7, there are two options to configure Amazon S3 as an artifact store for pipelines. Choose one of the following options:

Note: IRSA is only supported in KFPv1, if you plan to use KFPv2, choose the IAM User option. IRSA support for KFPv2 will be added in the next release.

Option 1 - IRSA (Recommended): IAM Role for Service Account (IRSA) which allows the use of AWS IAM permission boundaries at the Kubernetes pod level. A Kubernetes service account (SA) is associated with an IAM role with a role policy that scopes the IAM permissions (e.g. S3 read/write access, etc.). When a pod in the SA namespace is annotated with the SA name, EKS injects the IAM role ARN and a token is used to get the credentials so that the pod can make requests to AWS services within the scope of the role policy associated with the IRSA. For more information, see Amazon EKS IAM roles for service accounts.
Option 2 - IAM User (Deprecated): Create an IAM user with permissions to get bucket locations and allow read and write access to objects in an S3 bucket where you want to store the Kubeflow artifacts. Take note of the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of the IAM user that you created to use in the following step, which will be referenced as minio_aws_access_key_id and minio_aws_secret_access_key respectively.

Export your desired PIPELINE_S3_CREDENTIAL_OPTION:

IRSA
IAM User

export PIPELINE_S3_CREDENTIAL_OPTION=irsa

export PIPELINE_S3_CREDENTIAL_OPTION=static

2.1 Option 1: Automated Setup

Note: Automated Setup is only supported for RDS AND S3 Deployments, for RDS/S3 only use the manual steps.

This setup performs all the manual steps in an automated fashion.

The script takes care of creating the S3 bucket, setting up IRSA to access to S3 or creating the S3 Secrets if using static credentials, setting up the RDS database, and creating the RDS Secret using the Secrets manager. The script also edits the required configuration files for Kubeflow Pipelines to be properly configured for the RDS database during Kubeflow installation. The script also handles cases where the resources already exist. In this case, the script will simply skip the step.

Note: The script will not delete any resource. Therefore, if a resource already exists (eg: Secret, database with the same name, or S3 bucket), it will skip the creation of those resources and use the existing resources instead. This is by design in order to prevent unwanted results, such as accidental deletion. For example, if a database with the same name already exists, the script will skip the database creation setup. If you forgot to change the database name used for creation, then this gives you the chance to retry the script with the proper value. See python auto-rds-s3-setup.py --help for the list of parameters, as well as their default values.

Navigate to the tests/e2e directory.
```
cd tests/e2e
```

Export values for CLUSTER_REGION, CLUSTER_NAME, S3_BUCKET.

export CLUSTER_REGION=<>
export CLUSTER_NAME=<>
export S3_BUCKET=<>
export DB_INSTANCE_NAME=<>
export DB_SUBNET_GROUP_NAME=<>
export RDS_SECRET_NAME=<>

Export your desired PIPELINE_S3_CREDENTIAL_OPTION specific values

IRSA
IAM User

export PIPELINE_S3_CREDENTIAL_OPTION=irsa

export S3_SECRET_NAME=<>
export MINIO_AWS_ACCESS_KEY_ID=<>
export MINIO_AWS_SECRET_ACCESS_KEY=<>
export PIPELINE_S3_CREDENTIAL_OPTION=static

Run the auto-rds-s3-setup.py script

IRSA
IAM User

PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --region $CLUSTER_REGION --cluster $CLUSTER_NAME --bucket $S3_BUCKET --db_instance_name $DB_INSTANCE_NAME  --rds_secret_name $RDS_SECRET_NAME --db_subnet_group_name $DB_SUBNET_GROUP_NAME --pipeline_s3_credential_option $PIPELINE_S3_CREDENTIAL_OPTION

PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --region $CLUSTER_REGION --cluster $CLUSTER_NAME --bucket $S3_BUCKET --s3_aws_access_key_id $MINIO_AWS_ACCESS_KEY_ID --s3_aws_secret_access_key $MINIO_AWS_SECRET_ACCESS_KEY --db_instance_name $DB_INSTANCE_NAME --s3_secret_name $S3_SECRET_NAME  --rds_secret_name $RDS_SECRET_NAME --db_subnet_group_name $DB_SUBNET_GROUP_NAME --pipeline_s3_credential_option $PIPELINE_S3_CREDENTIAL_OPTION

Advanced customization

The auto-rds-s3-setup.py script applies default values for the user password, max storage, storage type, instance type, and more. You can customize those preferences by specifying different values.

Learn more about the different parameters with the following command:

PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-setup.py --help

2.2 Option 2: Manual Setup

Follow this step if you prefer to manually set up each component.

[S3] Create an S3 Bucket

Refer to the S3 documentation for steps on creating an S3 bucket. Take note of your S3 bucket name to use in the following steps.
[RDS] Create an RDS Instance

Refer to the RDS documentation for steps on creating an RDS MySQL instance.

When creating the RDS instance for security and connectivity reasons, we recommend that:
- The RDS instance is in the same VPC as the cluster
- The RDS instance subnets must belong to at least two private subnets within the VPC
- The RDS instance security group is the same security group used by the EKS node instances
To complete the following steps you will need to keep track of the following:
- RDS database name (not to be confused with the DB identifier)
- RDS database admin username
- RDS database admin password
- RDS database endpoint URL
- RDS database port

2.2.1 RDS Setup

Export values:

export RDS_SECRET="<your rds secret name>"
export DB_HOST="<your rds db host>"
export MLMD_DB=metadata_db

Create Secrets in AWS Secrets Manager
1. [RDS] Create the RDS Secret and configure the Secret provider:
  1. Configure a Secret (e.g rds-secret), with the RDS DB name, RDS endpoint URL, RDS DB port, and RDS DB credentials that were configured when creating your RDS instance.
    - For example, if your database name is kubeflow, your endpoint URL is rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com, your DB port is 3306, your DB username is admin, and your DB password is Kubefl0w your secret should look similar to the following:
    - aws secretsmanager create-secret --name $RDS_SECRET --secret-string '{"username":"admin","password":"Kubefl0w","database":"kubeflow","host":"rm12abc4krxxxxx.xxxxxxxxxxxx.us-west-2.rds.amazonaws.com","port":"3306"}' --region $CLUSTER_REGION
  2. Rename the parameters.objects.objectName field in the RDS Secret provider configuration to the name of the Secret.
    - Rename the field with the following command: Select the package manager of your choice.
      - Kustomize
      - Helm
      yq e -i '.spec.parameters.objects |= sub("rds-secret",env(RDS_SECRET))' awsconfigs/common/aws-secrets-manager/rds/secret-provider.yaml
      
      yq e '.rds.secretName = env(RDS_SECRET)' -i charts/common/aws-secrets-manager/rds-only/values.yaml yq e '.rds.secretName = env(RDS_SECRET)' -i charts/common/aws-secrets-manager/rds-s3/values.yaml

2.2.2 S3 Setup

Export values:

export S3_BUCKET="<your s3 bucket name>"
export MINIO_SERVICE_HOST=s3.amazonaws.com

As of Kubeflow 1.7, there are two options to configure Amazon S3 as an artifact store for pipelines. Choose one of the following options:

Note: IRSA is only supported in KFPv1, if you plan to use KFPv2, choose the IAM User option. IRSA support for KFPv2 will be added in the next release.

Option 1 - IRSA (Recommended): Follow Configure using IRSA
Option 2 - IAM User (Deprecated): Follow Configure using IAM User

2.2.2.1 Configure using IRSA

IAM Role for Service Account (IRSA) which allows the use of AWS IAM permission boundaries at the Kubernetes pod level. A Kubernetes service account (SA) is associated with an IAM role with a role policy that scopes the IAM permissions (e.g. S3 read/write access, etc.). When a pod in the SA namespace is annotated with the SA name, EKS injects the IAM role ARN and a token is used to get the credentials so that the pod can make requests to AWS services within the scope of the role policy associated with the IRSA. For more information, see Amazon EKS IAM roles for service accounts.

Create and Configure IAM Roles:

An OIDC provider must exist for your cluster to use IRSA. Create an OIDC provider and associate it with your EKS cluster by running the following command if your cluster doesn’t already have one:
```
eksctl utils associate-iam-oidc-provider --cluster ${CLUSTER_NAME} \
--region ${CLUSTER_REGION} --approve
```

Get the identity issuer URL by running the following commands:

export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
export OIDC_PROVIDER_URL=$(aws eks describe-cluster --name $CLUSTER_NAME --region $CLUSTER_REGION \
--query "cluster.identity.oidc.issuer" --output text | cut -c9-)

Create an IAM policy with access to the S3 bucket where pipeline artifacts will be stored. The following policy grants full access to the S3 bucket, you can scope it down by giving read, write and GetBucketLocation permissions.

cat <<EOF > s3_policy.json
{
   "Version": "2012-10-17",
   "Statement": [
            {
         "Effect": "Allow",
         "Action": "s3:*",
         "Resource": [
            "arn:aws:s3:::${S3_BUCKET}",
            "arn:aws:s3:::${S3_BUCKET}/*"
               ]
            }
      ]
}
EOF

Create Pipeline Backend Role

cat <<EOF > backend-trust.json
{
"Version": "2012-10-17",
"Statement": [
   {
   "Effect": "Allow",
   "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER_URL}"
   },
   "Action": "sts:AssumeRoleWithWebIdentity",
   "Condition": {
      "StringEquals": {
      "${OIDC_PROVIDER_URL}:aud": "sts.amazonaws.com",
      "${OIDC_PROVIDER_URL}:sub": "system:serviceaccount:kubeflow:ml-pipeline"
      }
   }
   }
]
}
EOF

export PIPELINE_BACKEND_ROLE_NAME=kf-pipeline-backend-role-$CLUSTER_NAME
aws --region $CLUSTER_REGION iam create-role --role-name $PIPELINE_BACKEND_ROLE_NAME --assume-role-policy-document file://backend-trust.json
export BACKEND_ROLE_ARN=$(aws --region $CLUSTER_REGION iam get-role --role-name $PIPELINE_BACKEND_ROLE_NAME --output text --query 'Role.Arn')

Create Profile Role

cat <<EOF > profile-trust.json
{
"Version": "2012-10-17",
"Statement": [
   {
   "Effect": "Allow",
   "Principal": {
      "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_PROVIDER_URL}"
   },
   "Action": "sts:AssumeRoleWithWebIdentity",
   "Condition": {
      "StringEquals": {
      "${OIDC_PROVIDER_URL}:aud": "sts.amazonaws.com",
      "${OIDC_PROVIDER_URL}:sub": "system:serviceaccount:kubeflow-user-example-com:default-editor"
      }
   }
   }
]
}
EOF

export PROFILE_ROLE_NAME=kf-pipeline-profile-role-$CLUSTER_NAME
aws --region $CLUSTER_REGION iam create-role --role-name $PROFILE_ROLE_NAME --assume-role-policy-document file://profile-trust.json
export PROFILE_ROLE_ARN=$(aws --region $CLUSTER_REGION iam get-role --role-name $PROFILE_ROLE_NAME --output text --query 'Role.Arn')

Attach S3 Policy to Roles

aws --region $CLUSTER_REGION iam put-role-policy --role-name $PIPELINE_BACKEND_ROLE_NAME --policy-name kf-pipeline-s3 --policy-document file://s3_policy.json
aws --region $CLUSTER_REGION iam put-role-policy --role-name $PROFILE_ROLE_NAME --policy-name kf-pipeline-s3 --policy-document file://s3_policy.json

Configure the manifests with role ARNs.

Select the package manager of your choice.

Kustomize
Helm

yq e '.metadata.annotations."eks.amazonaws.com/role-arn"=env(BACKEND_ROLE_ARN)' -i awsconfigs/apps/pipeline/s3/service-account.yaml
yq e '.spec.plugins[0].spec."awsIamRole"=env(PROFILE_ROLE_ARN)' -i awsconfigs/common/user-namespace/overlay/profile.yaml

yq e '.s3.roleArn = env(BACKEND_ROLE_ARN)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.s3.roleArn = env(BACKEND_ROLE_ARN)' -i charts/apps/kubeflow-pipelines/s3-only/values.yaml
yq e '.awsIamForServiceAccount.awsIamRole = env(PROFILE_ROLE_ARN)' -i charts/common/user-namespace/values.yaml

2.2.2.2 Configure using IAM User

Create an IAM user with permissions to get bucket locations and allow read and write access to objects in an S3 bucket where you want to store the Kubeflow artifacts. Take note of the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY of the IAM user that you created to use in the following step, which will be referenced as minio_aws_access_key_id and minio_aws_secret_access_key respectively.

Create and configure S3 Secrets:

Export values:

export S3_SECRET="<your s3 secret name>"
export MINIO_AWS_ACCESS_KEY_ID="<your s3 user access key>"
export MINIO_AWS_SECRET_ACCESS_KEY="<your s3 user secret  key>"

Configure a Secret (e.g. s3-secret) with your AWS credentials. These need to be long-term credentials from an IAM user and not temporary.
- For more details about configuring or finding your AWS credentials, see AWS security credentials
- ```
aws secretsmanager create-secret --name $S3_SECRET --secret-string '{"accesskey":"'$MINIO_AWS_ACCESS_KEY_ID'","secretkey":"'$MINIO_AWS_SECRET_ACCESS_KEY'"}' --region $CLUSTER_REGION
```

Rename the parameters.objects.objectName field in the S3 Secret provider configuration to the name of the Secret.

Select the package manager of your choice.

Kustomize
Helm

yq e -i '.spec.parameters.objects |= sub("s3-secret",env(S3_SECRET))' awsconfigs/common/aws-secrets-manager/s3/secret-provider.yaml

yq e '.s3.secretName = env(S3_SECRET)' -i charts/common/aws-secrets-manager/s3-only/values.yaml
yq e '.s3.secretName = env(S3_SECRET)' -i charts/common/aws-secrets-manager/rds-s3/values.yaml

Install CSI Driver and update KFP configurations

Install AWS Secrets & Configuration Provider with Kubernetes Secrets Store CSI driver

Run the following commands to enable OIDC and create an iamserviceaccount with permissions to retrieve the Secrets created with AWS Secrets Manager.

eksctl utils associate-iam-oidc-provider --region=$CLUSTER_REGION --cluster=$CLUSTER_NAME --approve

eksctl create iamserviceaccount  --name kubeflow-secrets-manager-sa  --namespace kubeflow  --cluster $CLUSTER_NAME --attach-policy-arn  arn:aws:iam::aws:policy/AmazonSSMReadOnlyAccess --attach-policy-arn arn:aws:iam::aws:policy/SecretsManagerReadWrite --override-existing-serviceaccounts   --approve --region $CLUSTER_REGION

Run the following commands to install AWS Secrets & Configuration Provider with Kubernetes Secrets Store CSI driver:

 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/rbac-secretproviderclass.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/csidriver.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/secrets-store.csi.x-k8s.io_secretproviderclasses.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/secrets-store.csi.x-k8s.io_secretproviderclasspodstatuses.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/secrets-store-csi-driver.yaml
 kubectl apply -f https://raw.githubusercontent.com/kubernetes-sigs/secrets-store-csi-driver/v1.3.2/deploy/rbac-secretprovidersyncing.yaml
 kubectl apply -f https://raw.githubusercontent.com/aws/secrets-store-csi-driver-provider-aws/main/deployment/aws-provider-installer.yaml

Update the KFP configurations.

[RDS] Configure the RDS endpoint URL and the metadata DB name:

Select the package manager of your choice.

Kustomize
Helm

printf '
dbHost='$DB_HOST'
mlmdDb='$MLMD_DB'
' > awsconfigs/apps/pipeline/rds/params.env

yq e '.rds.dbHost = env(DB_HOST)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.rds.dbHost = env(DB_HOST)' -i charts/apps/kubeflow-pipelines/rds-s3-static/values.yaml
yq e '.rds.dbHost = env(DB_HOST)' -i charts/apps/kubeflow-pipelines/rds-only/values.yaml
yq e '.rds.mlmdDb = env(MLMD_DB)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.rds.mlmdDb = env(MLMD_DB)' -i charts/apps/kubeflow-pipelines/rds-s3-static/values.yaml
yq e '.rds.mlmdDb = env(MLMD_DB)' -i charts/apps/kubeflow-pipelines/rds-only/values.yaml

[S3] Configure the S3 bucket name and S3 bucket region:

Select the package manager of your choice.

Kustomize
Helm

printf '
bucketName='$S3_BUCKET'
minioServiceHost='$MINIO_SERVICE_HOST'
minioServiceRegion='$CLUSTER_REGION'
' > awsconfigs/apps/pipeline/s3/params.env

yq e '.s3.bucketName = env(S3_BUCKET)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.s3.minioServiceRegion = env(CLUSTER_REGION)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.s3.minioServiceHost = env(MINIO_SERVICE_HOST)' -i charts/apps/kubeflow-pipelines/rds-s3/values.yaml
yq e '.s3.bucketName = env(S3_BUCKET)' -i charts/apps/kubeflow-pipelines/rds-s3-static/values.yaml
yq e '.s3.minioServiceRegion = env(CLUSTER_REGION)' -i charts/apps/kubeflow-pipelines/rds-s3-static/values.yaml
yq e '.s3.minioServiceHost = env(MINIO_SERVICE_HOST)' -i charts/apps/kubeflow-pipelines/rds-s3-static/values.yaml
yq e '.s3.bucketName = env(S3_BUCKET)' -i charts/apps/kubeflow-pipelines/s3-only/values.yaml
yq e '.s3.minioServiceHost = env(MINIO_SERVICE_HOST)' -i charts/apps/kubeflow-pipelines/s3-only/values.yaml
yq e '.s3.minioServiceRegion = env(CLUSTER_REGION)' -i charts/apps/kubeflow-pipelines/s3-only/values.yaml

(Optional) Configure Culling for Notebooks

Enable culling for notebooks by following the instructions in configure culling for notebooks guide.

3.0 Build Manifests and install Kubeflow

Once you have the resources ready, you can deploy the Kubeflow manifests for one of the following deployment options:

both RDS and S3
RDS only
S3 only

Navigate to the root of repository

cd $REPO_ROOT

[RDS and S3] Deploy both RDS and S3

Use the following command to deploy the Kubeflow manifests for both RDS and S3:

Kustomize
Helm

make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

[RDS] Deploy RDS only

Use the following command to deploy the Kubeflow manifests for RDS only:

Kustomize
Helm

make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=rds-only PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=rds-only PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

[S3] Deploy S3 only

Use the following command to deploy the Kubeflow manifests for S3 only:

Kustomize
Helm

make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=s3-only PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=s3-only PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

Once everything is installed successfully, you can access the Kubeflow Central Dashboard by logging in to your cluster.

You can now start experimenting and running your end-to-end ML workflows with Kubeflow!

4.0 Creating Profiles

A default profile named kubeflow-user-example-com for email user@example.com has been configured with this deployment. If you are using IRSA as PIPELINE_S3_CREDENTIAL_OPTION, any additional profiles that you create will also need to be configured with IRSA and S3 Bucket access. Follow the pipeline profiles for instructions on how to create additional profiles.

If you are not using this feature, you can create a profile by just specifying email address of the user.

5.0 Verify the installation

5.1 Verify RDS

Connect to your RDS instance from a pod within the cluster with the following command:

kubectl run -it --rm --image=mysql:5.7 --restart=Never mysql-client -- mysql -h <YOUR RDS ENDPOINT> -u <YOUR LOGIN> -p<YOUR PASSWORD>

You can find your credentials by visiting AWS Secrets Manager or by using the AWS CLI.

For example, use the following command to retrieve the value of a Secret named rds-secret:

aws secretsmanager get-secret-value \
    --region $CLUSTER_REGION \
    --secret-id $RDS_SECRET_NAME \
    --query 'SecretString' \
    --output text

Once you are connected to your RDS instance, verify that the databases kubeflow and mlpipeline exist.

mysql> show databases;

+--------------------+
| Database           |
+--------------------+
| cachedb            |
| information_schema |
| kubeflow           |
| metadb             |
| mlpipeline         |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

Verify that the database mlpipeline has the following tables:

mysql> use mlpipeline; show tables;

+----------------------+
| Tables_in_mlpipeline |
+----------------------+
| db_statuses          |
| default_experiments  |
| experiments          |
| jobs                 |
| pipeline_versions    |
| pipelines            |
| resource_references  |
| run_details          |
| run_metrics          |
| tasks                |
+----------------------+

Access the Kubeflow Central Dashboard by logging in to your cluster and navigate to Katib (under Experiments (AutoML)).
Create an experiment using the following yaml file.
Once the experiment is complete, verify that the following table exists:

mysql> use kubeflow; show tables;

+----------------------+
| Tables_in_kubeflow   |
+----------------------+
| observation_logs     |
+----------------------+

Describe the observation_logs to verify that they are being populated.

mysql> select * from observation_logs;

5.2 Verify S3

Access the Kubeflow Central Dashboard by logging in to your cluster and navigate to Kubeflow Pipelines (under Pipelines).
Create an experiment named test and create a run using the sample pipeline [Demo] XGBoost - Iterative model training.
Once the run is completed, go to the S3 AWS console and open the bucket that you specified for your Kubeflow installation.
Verify that the bucket is not empty and was populated by the outputs of the experiment.

6.0 Uninstall Kubeflow

Run the following command to uninstall your Kubeflow deployment:

Note: Make sure you have the correct INSTALLATION_OPTION, DEPLOYMENT_OPTION and PIPELINE_S3_CREDENTIAL_OPTION environment variables set for your chosen installation

Kustomize
Helm

make delete-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

make delete-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION

To uninstall AWS resources created by the automated setup, run the cleanup script:

Navigate to the tests/e2e directory.

cd tests/e2e

Install the script dependencies.

pip install -r requirements.txt

Make sure that you have the configuration file created by the script in tests/e2e/utils/rds-s3/metadata.yaml.

PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-cleanup.py

Last modified June 8, 2023: Fix helm deployment for rds-s3-static (#749) (4e247e37)