Deployment Add-ons

Add-on installation guides for Kubeflow on AWS

Kubeflow on AWS offers add-ons for additional service integrations, which can be used with any of the available deployment options.

  • To expose your Kubeflow dashboard to external traffic, use an AWS Application Load Balancer (ALB) for secure traffic management. See the Load Balancer add-on guide.
  • Use Amazon Elastic File System (EFS) backed persistent volumes with Kubeflow Notebooks or with your training and inference workloads (Jupyter, model training, model tuning) to create a persistent, scalable, and shareable workspace. The file system grows and shrinks automatically as you add and remove files, so there is no capacity to manage. See the EFS add-on guide for more information.
  • Use Amazon FSx for Lustre (Amazon FSx) volumes to cache training data with direct connectivity to Amazon S3 as the backing store. These volumes can support Jupyter notebook servers or distributed training. FSx for Lustre provides consistent sub-millisecond latencies and high concurrency, and can scale to TB/s of throughput and millions of IOPS. See the FSx add-on guide for more information.
  • Integrate with Amazon CloudWatch for persistent log management, which addresses the default Kubernetes log limits and improves log availability and monitoring. See the CloudWatch add-on guide for more information.
  • Integrate with Amazon Managed Service for Prometheus and Amazon Managed Grafana to monitor Kubeflow and cluster metrics on a scalable and secure managed platform. See the Amazon Managed Service for Prometheus and Grafana add-on guide for more information.
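
As a rough illustration of the shareable workspace described in the EFS item above, the sketch below builds a Kubernetes PersistentVolumeClaim manifest that several pods could mount at once. It is not taken from the add-on guides: the function name `build_shared_pvc`, the storage class name `efs-sc`, the claim name, and the namespace are all assumptions made for illustration, and the EFS add-on guide remains the reference for the real setup.

```python
import json


def build_shared_pvc(name, namespace, storage_class, size="5Gi"):
    """Return a PersistentVolumeClaim manifest as a plain dict.

    All argument values are supplied by the caller; nothing here is
    prescribed by the Kubeflow on AWS add-on guides.
    """
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # ReadWriteMany allows several pods (for example a notebook
            # server and a training job) to mount the same volume.
            "accessModes": ["ReadWriteMany"],
            # Hypothetical storage class name; use the one created
            # during your own EFS setup.
            "storageClassName": storage_class,
            # A storage request is a required field in the PVC schema.
            "resources": {"requests": {"storage": size}},
        },
    }


if __name__ == "__main__":
    # Placeholder names; replace them with values from your cluster.
    manifest = build_shared_pvc(
        name="shared-workspace",
        namespace="kubeflow-user-example-com",
        storage_class="efs-sc",
    )
    # JSON is valid input for `kubectl apply -f`.
    print(json.dumps(manifest, indent=2))
```

Saving the printed output to a file and passing it to `kubectl apply -f` would create the claim, assuming a storage class with that name already exists in the cluster.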

Load Balancer

Expose Kubeflow over Load Balancer on AWS

EFS

Use Amazon EFS as persistent storage with Kubeflow on AWS

FSx for Lustre

Use Amazon FSx as persistent storage with Kubeflow on AWS

CloudWatch

Set up CloudWatch Container Insights on Amazon EKS

Prometheus

Use Prometheus, Amazon Managed Service for Prometheus, and Amazon Managed Grafana to monitor metrics with Kubeflow on AWS

Last modified September 1, 2023: v1.7.0-aws-b1.0.3 website changes (#791) (7faf1a5)