Spark Operator with YuniKorn
Introduction
The EKS cluster design in this Data on EKS blueprint is optimized for running Spark applications with Spark Operator, using Apache YuniKorn as the batch scheduler. The blueprint covers two node autoscaling options for Spark workloads: Cluster Autoscaler and Karpenter. AWS for Fluent Bit handles logging, and a combination of Prometheus, Amazon Managed Service for Prometheus, and open source Grafana provides observability. Additionally, the Spark History Server Live UI is configured for monitoring running Spark jobs and is exposed through an NLB and an NGINX ingress controller.
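To make the pairing of Spark Operator and YuniKorn concrete, the following is a minimal sketch of how a SparkApplication manifest might hand scheduling to YuniKorn. It is not taken from the blueprint: the function name, the image, the queue name, the node selector label, and the resource sizes are all hypothetical, and the `batchScheduler: yunikorn` value and the `queue` pod label are assumptions that depend on the Spark Operator and YuniKorn versions installed, so check them against your cluster before relying on them.

```python
import json


def spark_application_manifest(name, namespace, image, main_file, queue, node_selector):
    """Build a SparkApplication manifest as a plain dict (illustrative only)."""
    # Settings shared by the driver and executor pods.
    pod_common = {
        # Assumption: YuniKorn reads the target queue from the "queue" pod label.
        "labels": {"queue": queue},
        # Steers pods to nodes provisioned by Karpenter or a managed node group.
        "nodeSelector": dict(node_selector),
    }
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",  # placeholder version
            # Assumption: the installed Spark Operator accepts "yunikorn" here.
            "batchScheduler": "yunikorn",
            "driver": {**pod_common, "cores": 1, "memory": "2g"},
            "executor": {**pod_common, "cores": 1, "memory": "2g", "instances": 2},
        },
    }


if __name__ == "__main__":
    manifest = spark_application_manifest(
        name="sample-job",                       # hypothetical job name
        namespace="spark-team-a",                # hypothetical namespace
        image="example.com/spark:latest",        # hypothetical image
        main_file="local:///opt/spark/app.py",   # hypothetical path
        queue="root.default",                    # hypothetical queue
        node_selector={"example.com/pool": "spark"},  # hypothetical label
    )
    # JSON is valid YAML, so this output can be piped to `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

Building the manifest as a dict keeps the scheduler choice and node placement in one place, so the same job definition can target either Karpenter-provisioned nodes or a managed node group by changing only the node selector.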
Spark workloads with Karpenter
Spark workloads with Cluster Autoscaler and Managed Node Groups
NVMe SSD Instance Storage for Spark Shuffle data
Spark Operator
Deploying the Solution
Execute Sample Spark job with Karpenter
Execute Sample Spark job with Cluster Autoscaler and Managed Node Groups
Example for TPCDS Benchmark test
Cleanup
Caution: To avoid unwanted charges to your AWS account, delete all the AWS resources created during this deployment.