EMR on EKS Stack
Production-ready examples and configurations for running Apache Spark workloads on Amazon EKS with the managed Amazon EMR runtime. The examples cover infrastructure deployment, Spark shuffle storage options, and the EMR Spark Operator.
Getting Started
Deploy Infrastructure
Start with the infrastructure deployment guide to set up your EMR on EKS foundation
Choose Storage Strategy
Select the storage example that matches your performance and cost requirements
Submit Spark Jobs
Run Spark applications with EMR managed runtime and optimized configurations
Monitor & Optimize
Use EMR Studio, CloudWatch, and Spark UI for observability and performance tuning
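As a rough illustration of the Submit Spark Jobs step, the sketch below builds a request for the EMR on EKS StartJobRun API and shows, in a comment, how it could be passed to boto3. The helper name `build_job_run_request`, the job name, the release label, and the executor sizing are placeholder assumptions rather than values taken from these guides. Substitute the virtual cluster ID and execution role created during infrastructure deployment.

```python
import json
import uuid


def build_job_run_request(virtual_cluster_id, execution_role_arn, entry_point,
                          release_label="emr-7.0.0-latest", executor_instances=2):
    """Build keyword arguments for the EMR on EKS StartJobRun API.

    The release label and executor sizing are illustrative defaults only.
    """
    spark_params = " ".join([
        f"--conf spark.executor.instances={executor_instances}",
        "--conf spark.executor.memory=4g",
        "--conf spark.executor.cores=2",
    ])
    return {
        "name": "sample-spark-job",  # hypothetical job name
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "clientToken": str(uuid.uuid4()),  # idempotency token for retries
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": spark_params,
            }
        },
    }


if __name__ == "__main__":
    request = build_job_run_request(
        virtual_cluster_id="<virtual-cluster-id>",
        execution_role_arn="<job-execution-role-arn>",
        entry_point="s3://<bucket>/scripts/app.py",
    )
    print(json.dumps(request, indent=2))
    # To submit (requires boto3 and AWS credentials):
    #   import boto3
    #   boto3.client("emr-containers").start_job_run(**request)
```

Keeping the request builder separate from the API call makes it possible to review or unit test the job definition without AWS access.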
Infrastructure Deployment
Complete infrastructure deployment guide for EMR on EKS with virtual cluster setup, IAM roles, and Karpenter configuration
EBS Hostpath Storage
Cost-effective EBS root volume storage for Spark shuffle data. Simple setup with shared node storage and ~70% cost reduction vs per-pod PVCs
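A minimal sketch of how hostPath shuffle storage is typically wired into a Spark on Kubernetes job, using Spark's standard `spark.kubernetes.executor.volumes.hostPath.*` properties. The host directory `/mnt/spark-local`, the mount path `/data1`, and both helper names are assumptions for illustration, and the linked example may use different paths. Spark treats a volume whose name starts with `spark-local-dir-` as local scratch space for shuffle data.

```python
def hostpath_shuffle_conf(host_path="/mnt/spark-local", mount_path="/data1"):
    """Return Spark properties that mount a node directory for shuffle data.

    Both paths are placeholder assumptions; the host path must exist on the node.
    """
    prefix = "spark.kubernetes.executor.volumes.hostPath.spark-local-dir-1"
    return {
        f"{prefix}.options.path": host_path,   # directory on the node's EBS root volume
        f"{prefix}.mount.path": mount_path,    # where the executor container sees it
        f"{prefix}.mount.readOnly": "false",
    }


def to_spark_submit_params(conf):
    """Render a properties dict as a string of --conf flags."""
    return " ".join(f"--conf {key}={value}" for key, value in conf.items())


if __name__ == "__main__":
    print(to_spark_submit_params(hostpath_shuffle_conf()))
```

Because every executor on a node shares the same host directory, no volume is provisioned per pod, which is where the cost saving relative to per-pod PVCs comes from.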
EBS Dynamic PVC Storage
Production-ready EBS Dynamic PVC with automatic volume provisioning for Spark shuffle storage. Isolated storage per executor with gp3 volumes
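For comparison, per-executor dynamic volumes can be requested with Spark's `persistentVolumeClaim` volume properties, where the special claim name `OnDemand` asks Spark to create one PVC per executor. This is a sketch under assumptions: the storage class name `gp3`, the 100Gi size, the mount path, and the helper name are illustrative, and a matching StorageClass backed by the EBS CSI driver is assumed to exist in the cluster.

```python
def dynamic_pvc_shuffle_conf(storage_class="gp3", size="100Gi", mount_path="/data1"):
    """Return Spark properties that provision one PVC per executor for shuffle data.

    The storage class name and size are placeholder assumptions.
    """
    prefix = ("spark.kubernetes.executor.volumes."
              "persistentVolumeClaim.spark-local-dir-1")
    return {
        f"{prefix}.options.claimName": "OnDemand",       # create a new claim per executor
        f"{prefix}.options.storageClass": storage_class,
        f"{prefix}.options.sizeLimit": size,
        f"{prefix}.mount.path": mount_path,
        f"{prefix}.mount.readOnly": "false",
    }


if __name__ == "__main__":
    for key, value in dynamic_pvc_shuffle_conf().items():
        print(f"{key}={value}")
```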
NVMe SSD Storage
Maximum I/O performance with NVMe instance store SSDs. Leverage local NVMe drives on Graviton instances for ultra-fast shuffle operations
EMR Spark Operator
Declarative Spark job management with Kubernetes-native EMR Spark Operator. GitOps-ready with SparkApplication CRDs for streamlined workflows
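To give a sense of what a declarative job definition looks like, the sketch below assembles a minimal SparkApplication manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. The helper name, the resource sizes, the `sparkVersion` value, and the service account name are hypothetical placeholders. The image should be the EMR release image for your Region, which is not shown here.

```python
import json


def build_spark_application(name, namespace, main_file, image,
                            service_account="<spark-service-account>"):
    """Build a minimal SparkApplication manifest for the Spark operator.

    Resource sizes and the Spark version are illustrative assumptions.
    """
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",  # placeholder; match the chosen image
            "restartPolicy": {"type": "Never"},
            "driver": {
                "cores": 1,
                "memory": "2g",
                "serviceAccount": service_account,
            },
            "executor": {"cores": 2, "instances": 2, "memory": "4g"},
        },
    }


if __name__ == "__main__":
    manifest = build_spark_application(
        name="sample-job",
        namespace="<job-namespace>",
        main_file="s3://<bucket>/scripts/app.py",
        image="<emr-release-image-uri>",
    )
    print(json.dumps(manifest, indent=2))
```

Because the manifest is plain data, it can be committed to a repository and reconciled by a GitOps tool like any other Kubernetes resource.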