Everything you need to get started with Kubeflow on AWS
Kubeflow on AWS provides its own Kubeflow manifests, which integrate with a range of highly available and scalable AWS services. This reduces the operational overhead of maintaining the Kubeflow platform.
If you want to deploy Kubeflow with minimal changes, but optimized for Amazon Elastic Kubernetes Service (Amazon EKS), then consider the vanilla deployment option. The Kubeflow control plane is installed on top of Amazon EKS, which is a managed container service used to run and scale Kubernetes applications in the cloud.
To take greater advantage of the distribution and make use of the AWS managed services, choose one of the following deployment options according to your organization’s requirements:
- Integrate your deployment with Amazon Relational Database Service (RDS) for a highly scalable and available pipelines and metadata store. RDS removes the need to manage a local MySQL database service and its storage. For more information, see the RDS deployment guide.
- Integrate your deployment with Amazon Simple Storage Service (S3) for an easy-to-use pipeline artifacts store. S3 removes the need to host MinIO as local object storage. For more information, see the S3 deployment guide.
- You can also deploy Kubeflow on AWS with both RDS and S3 integrations using the RDS and S3 deployment guide.
- Use Amazon Cognito for Kubeflow user authentication, which removes the complexity of managing users or Dex connectors in Kubeflow’s native Dex authentication service. For more information, see the Cognito deployment guides.
- You can also deploy Kubeflow on AWS with all three service integrations by following the Cognito, RDS, and S3 deployment guide.
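As a rough illustration of how the options above map to guides, the following Python sketch looks up a guide by the set of managed services you want. The `GUIDES` table and the `pick_guide` helper are hypothetical and are not part of Kubeflow on AWS. The guide names are copied from this page, and a combination that this page does not list simply returns `None`.

```python
from typing import Optional

# Guide names are taken from the list above. The lookup table itself is a
# hypothetical aid for reading this page, not an official tool.
GUIDES = {
    frozenset(): "vanilla deployment option",
    frozenset({"rds"}): "RDS deployment guide",
    frozenset({"s3"}): "S3 deployment guide",
    frozenset({"rds", "s3"}): "RDS and S3 deployment guide",
    frozenset({"cognito"}): "Cognito deployment guides",
    frozenset({"cognito", "rds", "s3"}): "Cognito, RDS, and S3 deployment guide",
}


def pick_guide(*integrations: str) -> Optional[str]:
    """Return the guide listed for exactly this set of integrations, or None."""
    wanted = frozenset(name.lower() for name in integrations)
    return GUIDES.get(wanted)


if __name__ == "__main__":
    print(pick_guide("RDS", "S3"))      # both storage integrations
    print(pick_guide())                 # no managed-service integrations
    print(pick_guide("Cognito", "RDS")) # combination not listed on this page
```

A `None` result only means that this overview names no guide for that combination; it says nothing about whether such a deployment is possible.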
Kubeflow on AWS offers add-ons for additional service integrations, which can be used with any of the available deployment options.
- If you want to expose your Kubeflow dashboard to external traffic, then use AWS Application Load Balancer (ALB) for secure traffic management by following the Load Balancer add-on guide.
- Use Amazon Elastic File System (EFS) backed persistent volumes with Kubeflow Notebooks or with your training and inference workloads (Jupyter, model training, model tuning). EFS gives you a persistent, scalable, and shareable workspace that grows and shrinks automatically as you add and remove files, with no need for management. For more information, see the EFS add-on guide.
- Use Amazon FSx for Lustre (FSx) volumes to cache training data with direct connectivity to Amazon S3 as the backing store. These volumes can support Jupyter notebook servers or distributed training. FSx for Lustre provides consistent sub-millisecond latencies and high concurrency, and can scale to TB/s of throughput and millions of IOPS. For more information, see the FSx add-on guide.
- Integrate with Amazon CloudWatch for persistent log management, which addresses the default Kubernetes log limits and improves your log availability and monitoring capabilities. For more information, see the CloudWatch add-on guide.
Deploy Kubeflow on AWS using Amazon Elastic Kubernetes Service (EKS)
Deploying Kubeflow with RDS and S3
Deploying Kubeflow with Amazon Cognito as the identity provider
Deploying Kubeflow with Amazon Cognito, RDS, and S3
Add-on installation guides for Kubeflow on AWS
Delete Kubeflow deployments and Amazon EKS clusters