Manifest Deployment Guide
Note: Helm installation option is still in preview.
This guide describes how to deploy Kubeflow on Amazon EKS using Cognito for your identity provider, RDS for your database, and S3 for your artifact storage.
Prerequisites
Refer to the general prerequisites guide and the RDS and S3 setup guide in order to:
- Install the CLI tools
- Clone the repositories
- Create an EKS cluster
- Create an S3 Bucket
- Create an RDS Instance
- Configure AWS Secrets or IAM Role for S3
- Configure AWS Secrets for RDS
- Install AWS Secrets and Kubernetes Secrets Store CSI driver
- Configure an RDS endpoint and an S3 bucket name for Kubeflow Pipelines
Configure Custom Domain and Cognito
- Follow Section 2.0 of the Cognito setup guide in order to:
- Create a custom domain
- Create TLS certificates for the domain
- Create a Cognito Userpool
- Configure Ingress
(Optional) Configure Culling for Notebooks
Enable culling for notebooks by following the instructions in the Configure Culling for Notebooks guide.
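Deploy Kubeflow.
Export your pipeline-s3-credential-option. Set exactly one of the two values below, depending on whether you configured an IAM role (IRSA) or AWS Secrets (static credentials) for S3:
# Pipeline S3 Credential Option to configure
export PIPELINE_S3_CREDENTIAL_OPTION="irsa"
# Pipeline S3 Credential Option to configure
export PIPELINE_S3_CREDENTIAL_OPTION="static"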
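Because the same variable is passed to both the deploy and delete commands, a quick check before continuing can catch a typo early. The following is a minimal sketch; the function name validate_s3_credential_option is hypothetical and is not part of the repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Kubeflow manifests repository):
# succeeds only when the value is one of the two options shown in this guide.
validate_s3_credential_option() {
  case "$1" in
    irsa|static) return 0 ;;
    *)
      echo "PIPELINE_S3_CREDENTIAL_OPTION must be 'irsa' or 'static', got '${1:-<unset>}'" >&2
      return 1
      ;;
  esac
}

# Validate the currently exported value, treating an unset variable as empty.
if validate_s3_credential_option "${PIPELINE_S3_CREDENTIAL_OPTION:-}"; then
  echo "Using PIPELINE_S3_CREDENTIAL_OPTION=${PIPELINE_S3_CREDENTIAL_OPTION}"
fi
```

The check is case-sensitive on purpose, since the guide only shows the lowercase values.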
Install Kubeflow using the command that matches your installation option.
Kustomize:
make deploy-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION
Helm:
make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION
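The make invocation differs only in its target and in the INSTALLATION_OPTION value, so it can help to print the exact command for review before running it. This is a sketch under that assumption; kubeflow_make_cmd is a hypothetical name, and the function only prints the command rather than executing it.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper (not part of the repository): prints the make command
# from this guide for a given target, installation option, and S3 credential
# option so that it can be reviewed before being run.
kubeflow_make_cmd() {
  local target="$1" install_opt="$2" s3_opt="$3"
  case "$install_opt" in
    kustomize|helm) ;;
    *)
      echo "INSTALLATION_OPTION must be 'kustomize' or 'helm'" >&2
      return 1
      ;;
  esac
  printf 'make %s INSTALLATION_OPTION=%s DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=%s\n' \
    "$target" "$install_opt" "$s3_opt"
}

# Prints the Kustomize deploy command with the IRSA option.
kubeflow_make_cmd deploy-kubeflow kustomize irsa
```

The same helper accepts delete-kubeflow as the target when uninstalling.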
Follow the rest of the Cognito guide from section 5.0 (Updating the domain with ALB address) in order to:
- Add/Update the DNS records in a custom domain with the ALB address
- Create a user in a Cognito user pool
- Create a profile for the user from the user pool
- Connect to the central dashboard
Creating Profiles
A default profile named kubeflow-user-example-com for email user@example.com has been configured with this deployment. If you are using IRSA as PIPELINE_S3_CREDENTIAL_OPTION, any additional profiles that you create will also need to be configured with IRSA and S3 bucket access. Follow the pipeline profiles guide for instructions on how to create additional profiles.
If you are not using IRSA, you can create a profile by specifying only the email address of the user.
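For the non-IRSA case, a profile that specifies only the owner's email can be sketched as below. The helper name render_profile, the profile name, and the email address are placeholders, and the manifest assumes that the installed Profile resource is served as kubeflow.org/v1; verify this against your cluster before applying it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): prints a minimal Kubeflow
# Profile manifest whose owner is identified only by email address.
# Assumption: the Profile CRD is available as kubeflow.org/v1 on the cluster.
render_profile() {
  local profile_name="$1" user_email="$2"
  cat <<EOF
apiVersion: kubeflow.org/v1
kind: Profile
metadata:
  name: ${profile_name}
spec:
  owner:
    kind: User
    name: ${user_email}
EOF
}

# Print the manifest for review; the name and email are placeholders.
render_profile my-team-profile teammate@example.com
```

After reviewing the output, it can be piped to kubectl apply -f - to create the profile. A profile created this way does not carry the IRSA and S3 bucket configuration described above.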
Uninstall Kubeflow
Note: Delete all the resources you might have created in your profile namespaces before running these steps.
Run the following command to delete the profiles, the ingress, and the corresponding ingress-managed load balancer:
kubectl delete profiles --all
Delete the Kubeflow deployment using the command that matches your installation option.
Kustomize:
make delete-kubeflow INSTALLATION_OPTION=kustomize DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION
Helm:
make delete-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito-rds-s3 PIPELINE_S3_CREDENTIAL_OPTION=$PIPELINE_S3_CREDENTIAL_OPTION
To delete the rest of the resources (subdomain, certificates, etc.), run the following commands from the root of your repository:
- Ensure you have the configuration file tests/e2e/utils/cognito_bootstrap/config.yaml updated by the cognito_post_deployment.py script. If you did not use the script, update the name, ARN, or ID of the resources that you created in tests/e2e/utils/cognito_bootstrap/config.yaml by referring to the following sample:
cognitoUserpool:
  ARN: arn:aws:cognito-idp:us-west-2:123456789012:userpool/us-west-2_yasI9dbxF
  appClientId: 5jmk7ljl2a74jk3n0a0fvj3l31
  domainAliasTarget: xxxxxxxxxx.cloudfront.net
  domain: auth.platform.example.com
  name: kubeflow-users
kubeflow:
  alb:
    serviceAccount:
      name: alb-ingress-controller
      namespace: kubeflow
      policyArn: arn:aws:iam::123456789012:policy/alb_ingress_controller_kube-eks-clusterxxx
cluster:
  name: kube-eks-cluster
  region: us-west-2
route53:
  rootDomain:
    certARN: arn:aws:acm:us-east-1:123456789012:certificate/9d8c4bbc-3b02-4a48-8c7d-d91441c6e5af
    hostedZoneId: XXXXX
    name: example.com
  subDomain:
    us-west-2-certARN: arn:aws:acm:us-west-2:123456789012:certificate/d1d7b641c238-4bc7-f525-b7bf-373cc726
    hostedZoneId: XXXXX
    name: platform.example.com
    us-east-1-certARN: arn:aws:acm:us-east-1:123456789012:certificate/373cc726-f525-4bc7-b7bf-d1d7b641c238
- Run the following commands to install the script dependencies and delete the resources. You can rerun the script in case some resources fail to delete.
cd tests/e2e
pip install -r requirements.txt
PYTHONPATH=.. python utils/cognito_bootstrap/cognito_resources_cleanup.py
cd -
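The two steps above (confirm the configuration file, then run and possibly rerun the cleanup script) can be sketched as a pair of small helpers. Both function names are hypothetical and not part of the repository. The sketch assumes that cognitoUserpool, kubeflow, cluster, and route53 are top-level keys in config.yaml, and that the cleanup script exits with a non-zero status when a resource fails to delete; neither assumption is confirmed by this guide.

```shell
#!/usr/bin/env bash
# Hypothetical helpers (not part of the repository).

# Succeeds only if the config file exists and contains the four sections from
# the sample, assumed here to be top-level keys starting at column 0.
check_cognito_config() {
  local config_file="$1" key missing=0
  if [ ! -f "$config_file" ]; then
    echo "Missing config file: $config_file" >&2
    return 1
  fi
  for key in cognitoUserpool kubeflow cluster route53; do
    if ! grep -q "^${key}:" "$config_file"; then
      echo "Missing top-level section: $key" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Reruns a command until it succeeds or the attempt limit is reached.
# Assumes the command exits non-zero when something fails to delete.
retry() {
  local max_attempts="$1" attempt=1
  shift
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "${RETRY_DELAY_SECONDS:-10}"
  done
}

# Usage, from the tests/e2e directory after installing requirements.txt:
#   check_cognito_config utils/cognito_bootstrap/config.yaml &&
#     retry 3 env PYTHONPATH=.. python utils/cognito_bootstrap/cognito_resources_cleanup.py
```

The usage line passes PYTHONPATH through env so that the variable reaches the Python process regardless of how the shell handles assignments in front of a function call.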
To delete the rest of the RDS-S3 resources, make sure that you have the configuration file created by the script in tests/e2e/utils/rds-s3/metadata.yaml, then run the following command from the tests/e2e directory:
PYTHONPATH=.. python utils/rds-s3/auto-rds-s3-cleanup.py