EMR on EKS Stack

Production-ready Amazon EMR on EKS examples and configurations for running Apache Spark workloads on Amazon EKS with managed EMR capabilities. Choose from infrastructure deployment and storage optimization use cases.

Getting Started

1. Deploy Infrastructure

Start with the infrastructure deployment guide to set up your EMR on EKS foundation.

2. Choose Storage Strategy

Select the storage example that matches your performance and cost requirements.

3. Submit Spark Jobs

Run Spark applications with the EMR managed runtime and optimized configurations.

4. Monitor & Optimize

Use EMR Studio, CloudWatch, and the Spark UI for observability and performance tuning.
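
Step 3 above can be sketched as a request payload for the EMR on EKS StartJobRun API. This is a minimal illustration, not the stack's own submission script: the function name, the job name, the default release label, and every ID passed in are hypothetical placeholders to replace with values from your deployment.

```python
from typing import Dict, Optional


def build_start_job_run_request(
    virtual_cluster_id: str,
    execution_role_arn: str,
    entry_point: str,
    release_label: str = "emr-7.2.0-latest",  # assumed label; use one available to you
    spark_conf: Optional[Dict[str, str]] = None,
) -> dict:
    """Build keyword arguments for the emr-containers StartJobRun call."""
    conf = spark_conf or {}
    # Render Spark properties as "--conf key=value" pairs in a stable order.
    submit_params = " ".join(f"--conf {k}={v}" for k, v in sorted(conf.items()))
    return {
        "name": "sample-spark-job",  # hypothetical job name
        "virtualClusterId": virtual_cluster_id,
        "executionRoleArn": execution_role_arn,
        "releaseLabel": release_label,
        "jobDriver": {
            "sparkSubmitJobDriver": {
                "entryPoint": entry_point,
                "sparkSubmitParameters": submit_params,
            }
        },
    }


# With boto3 installed and AWS credentials configured, the payload could be sent as:
#   import boto3
#   boto3.client("emr-containers").start_job_run(**request)
```

Building the payload in a plain function keeps it easy to inspect or unit test before any AWS call is made.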

EBS Hostpath Storage

Cost-effective Spark shuffle storage on the EBS root volume of each node. Simple setup with shared node storage and ~70% cost reduction compared with per-pod PVCs.

Tags: Storage, Optimization
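
As a rough illustration of the hostPath approach, the sketch below builds the Spark-on-Kubernetes properties that mount a node directory into each executor. The property names follow Spark's documented `spark.kubernetes.executor.volumes.[VolumeType].[VolumeName]` scheme, and a volume name starting with `spark-local-dir-` is what Spark treats as local scratch space. The helper name, the host directory, and the mount path are assumptions and may differ from what the stack's example uses.

```python
from typing import Dict


def hostpath_shuffle_conf(
    host_path: str = "/mnt/spark-local",  # hypothetical directory on the node's root volume
    mount_path: str = "/data1",           # hypothetical path inside the executor container
    volume_name: str = "spark-local-dir-1",
) -> Dict[str, str]:
    """Return Spark properties that back executor local dirs with a hostPath volume."""
    prefix = f"spark.kubernetes.executor.volumes.hostPath.{volume_name}"
    return {
        f"{prefix}.options.path": host_path,  # directory on the node
        f"{prefix}.mount.path": mount_path,   # where executors see it
        f"{prefix}.mount.readOnly": "false",  # shuffle data must be writable
    }
```

Because every executor on a node shares the same host directory, no per-pod volume is provisioned, which is where the cost saving comes from.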

EBS Dynamic PVC Storage

Production-ready EBS Dynamic PVC with automatic volume provisioning for Spark shuffle storage. Isolated storage per executor with gp3 volumes

Tags: Storage, Performance
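
For comparison, a dynamic PVC setup can be sketched with Spark's on-demand claim properties, where `claimName=OnDemand` asks Spark to create one PersistentVolumeClaim per executor. The helper name, the size, and the mount path are hypothetical, and the sketch assumes a StorageClass named `gp3` backed by the EBS CSI driver already exists in the cluster.

```python
from typing import Dict


def dynamic_pvc_shuffle_conf(
    storage_class: str = "gp3",  # assumed StorageClass name in the cluster
    size: str = "100Gi",         # hypothetical volume size per executor
    mount_path: str = "/data1",  # hypothetical path inside the executor container
    volume_name: str = "spark-local-dir-1",
) -> Dict[str, str]:
    """Return Spark properties that give each executor its own on-demand PVC."""
    prefix = f"spark.kubernetes.executor.volumes.persistentVolumeClaim.{volume_name}"
    return {
        f"{prefix}.options.claimName": "OnDemand",  # Spark creates the claim per executor
        f"{prefix}.options.storageClass": storage_class,
        f"{prefix}.options.sizeLimit": size,
        f"{prefix}.mount.path": mount_path,
        f"{prefix}.mount.readOnly": "false",
    }
```

Each executor gets an isolated volume, at the price of one EBS volume per executor pod.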

EMR Spark Operator

Declarative Spark job management with Kubernetes-native EMR Spark Operator. GitOps-ready with SparkApplication CRDs for streamlined workflows

Tags: Guide, Infrastructure
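
To give a feel for the declarative style, the sketch below assembles a SparkApplication manifest as a Python dictionary and serializes it to JSON, which `kubectl apply -f` accepts alongside YAML. The `sparkoperator.k8s.io/v1beta2` API group and the field names follow the Spark Operator CRD; the helper name, the resource sizes, the Spark version, the service account, and all values in the usage note are hypothetical.

```python
import json


def spark_application_manifest(
    name: str,
    namespace: str,
    image: str,
    main_file: str,
    executors: int = 2,
) -> dict:
    """Build a minimal SparkApplication custom resource for a PySpark job."""
    return {
        "apiVersion": "sparkoperator.k8s.io/v1beta2",
        "kind": "SparkApplication",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "type": "Python",
            "mode": "cluster",
            "image": image,                    # EMR runtime image for your region
            "mainApplicationFile": main_file,
            "sparkVersion": "3.5.0",           # assumed; match the chosen image
            "restartPolicy": {"type": "Never"},
            "driver": {
                "cores": 1,
                "memory": "2g",
                "serviceAccount": "spark-driver-sa",  # hypothetical service account
            },
            "executor": {"cores": 1, "instances": executors, "memory": "2g"},
        },
    }


def to_json(manifest: dict) -> str:
    """Serialize the manifest so it can be committed to Git or piped to kubectl."""
    return json.dumps(manifest, indent=2, sort_keys=True)
```

For example, `to_json(spark_application_manifest("demo-job", "emr-jobs", "<emr-image-uri>", "s3://<bucket>/app.py"))` yields a document that can live in a Git repository and be applied by a GitOps controller.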