Flink on EKS Infrastructure Deployment
Complete guide for deploying and configuring the Flink on EKS infrastructure for streaming workloads.
Prerequisites
Before deploying, ensure you have the following tools installed:
- AWS CLI - Install Guide
- Terraform (>= 1.0) - Install Guide
- kubectl - Install Guide
- Helm (>= 3.0) - Install Guide
- AWS credentials configured - Run aws configure or use IAM roles
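Before moving on, it can be worth confirming the tools above are actually on your PATH. A minimal check (the check_tool helper is illustrative, not part of the repository):

```shell
# Report whether each required CLI is installed and on PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
  fi
}

# The four tools listed in the prerequisites above.
for tool in aws terraform kubectl helm; do
  check_tool "$tool"
done
```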
Overview
The Flink on EKS infrastructure provides a production-ready foundation for Apache Flink streaming workloads on Amazon EKS. It includes:
- EKS Cluster with streaming-optimized configurations
- Flink Operator for native Kubernetes Flink job management
- Kafka Cluster for event streaming and data ingestion
- State Backend with S3 storage for fault tolerance
- Monitoring Stack with Flink-specific metrics and dashboards
Quick Start
1. Clone the Repository
# Clone the repository
git clone https://github.com/awslabs/data-on-eks.git
cd data-on-eks/data-stacks/emr-on-eks
2. Review Configuration
Edit terraform/data-stack.tfvars to customize your deployment:
# EMR on EKS Data Stack Configuration
# This file enables EMR on EKS Virtual Clusters for running Spark jobs
name = "emr-on-eks"
region = "us-west-2"
deployment_id = "your-unique-id"
# Enable EMR on EKS Virtual Clusters
enable_emr_on_eks = true
# Enable EMR Spark Operator for declarative Spark job management
enable_emr_spark_operator = true
# Enable the EMR Flink Kubernetes Operator, replacing the open-source Flink Kubernetes Operator
enable_emr_flink_operator = true
# Optional: Enable additional addons as needed
enable_ingress_nginx = true
enable_ipv6 = false
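Before running the deploy script, you can sanity-check your edits to the variable file with Terraform itself. A sketch, assuming the Terraform root sits under terraform/ as the examples in this guide suggest (the validate_stack helper is illustrative):

```shell
# Validate the Terraform configuration without touching any backend state.
validate_stack() {
  if command -v terraform >/dev/null 2>&1 && [ -d "$1" ]; then
    terraform -chdir="$1" init -backend=false \
      && terraform -chdir="$1" validate
  else
    echo "skipped: terraform or $1 not found"
  fi
}

validate_stack terraform
```

This catches syntax errors and invalid variable names early, without the 30-40 minute wait of a full deploy.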
3. Deploy Infrastructure
./deploy.sh
This script will:
- Initialize Terraform
- Create VPC and networking (if they do not already exist)
- Deploy EKS cluster with managed node groups
- Install Karpenter for autoscaling
- Install YuniKorn scheduler
- Create EMR virtual clusters for Team A and Team B
- Configure IAM roles and pod identity associations
- Set up S3 buckets for logs and data
- Deploy EMR Flink Kubernetes Operator
Deployment Time
Initial deployment takes approximately 30-40 minutes. Subsequent updates are faster.
4. View Terraform Outputs
After deployment completes, view the infrastructure details:
cd terraform/_local
terraform output
You should see output similar to:
cluster_arn = "arn:aws:eks:us-west-2:123456789:cluster/emr-on-eks"
cluster_name = "emr-on-eks"
configure_kubectl = "aws eks --region us-west-2 update-kubeconfig --name emr-on-eks"
deployment_id = "abcdefg"
emr_on_eks = {
"cloudwatch_log_groups" = {
"emr-data-team-a" = {
"arn" = "arn:aws:logs:us-west-2:301444719761:log-group:/emr-on-eks-logs/emr-on-eks/emr-data-team-a"
"name" = "/emr-on-eks-logs/emr-on-eks/emr-data-team-a"
}
"emr-data-team-b" = {
"arn" = "arn:aws:logs:us-west-2:301444719761:log-group:/emr-on-eks-logs/emr-on-eks/emr-data-team-b"
"name" = "/emr-on-eks-logs/emr-on-eks/emr-data-team-b"
}
}
"job_execution_role_arns" = {
"emr-data-team-a" = "arn:aws:iam::301444719761:role/emr-on-eks-emr-data-team-a"
"emr-data-team-b" = "arn:aws:iam::301444719761:role/emr-on-eks-emr-data-team-b"
}
"namespaces" = {
"emr-data-team-a" = "emr-data-team-a"
"emr-data-team-b" = "emr-data-team-b"
}
"virtual_clusters" = {
"emr-data-team-a" = {
"arn" = "arn:aws:emr-containers:us-west-2:301444719761:/virtualclusters/rthjrl76dgz7x1xixlf11lbc0"
"id" = "rthjrl76dgz7x1xixlf11lbc0"
"name" = "emr-on-eks-emr-data-team-a"
"namespace" = "emr-data-team-a"
}
"emr-data-team-b" = {
"arn" = "arn:aws:emr-containers:us-west-2:301444719761:/virtualclusters/agvpvoyl5poe1to9mwjizrbsk"
"id" = "agvpvoyl5poe1to9mwjizrbsk"
"name" = "emr-on-eks-emr-data-team-b"
"namespace" = "emr-data-team-b"
}
}
}
emr_s3_bucket_name = "emr-on-eks-spark-logs-123456789"
region = "us-west-2"
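These outputs can also be consumed programmatically instead of copy-pasted. A sketch (the get_output helper is hypothetical; the output names and the terraform/_local path match the sample above):

```shell
# Read a single Terraform output by name, e.g. for use in scripts.
get_output() {
  if command -v terraform >/dev/null 2>&1 && [ -d terraform/_local ]; then
    terraform -chdir=terraform/_local output -raw "$1"
  else
    echo "unavailable: run from the repo root after a successful deploy" >&2
    return 1
  fi
}

# e.g. point kubectl at the new cluster:
#   eval "$(get_output configure_kubectl)"
# or look up the log bucket:
get_output emr_s3_bucket_name || true
```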
note
If deployment fails:
- Rerun the same command: ./deploy.sh
- If it still fails, debug using kubectl commands or raise an issue
Post-Deployment Verification
The deployment script automatically configures kubectl. Verify the cluster is ready:
source set-env.sh
# Set kubeconfig
export KUBECONFIG=kubeconfig.yaml
# Verify cluster nodes
kubectl get nodes
# Check all namespaces
kubectl get namespaces
# Verify ArgoCD applications
kubectl get applications -n argocd
Quick Verification
Run these commands to verify successful deployment:
# 1. Check nodes are ready
kubectl get nodes
# Expected: 4-5 nodes with STATUS=Ready
# 2. Check ArgoCD applications are synced
kubectl get applications -n argocd
# Expected: All apps showing "Synced" and "Healthy"
# 3. Check Karpenter NodePools are ready
kubectl get nodepools
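A further check worth running: confirm the EMR Flink operator pods are up. A sketch (the helper is illustrative; the operator's namespace depends on the install, so this greps across all namespaces rather than guessing it):

```shell
# Look for Flink operator pods anywhere in the cluster.
check_flink_operator() {
  if ! command -v kubectl >/dev/null 2>&1; then
    echo "kubectl not found; skipping"
    return 0
  fi
  kubectl get pods -A 2>/dev/null | grep -i flink \
    || echo "no flink pods found"
}

check_flink_operator
```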
Cleanup
Infrastructure Cleanup
# Complete cleanup
./cleanup.sh
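To confirm the teardown actually removed the cluster, you can list the EKS clusters left in the account afterwards. A sketch, assuming the us-west-2 region from the tfvars above (the helper is illustrative):

```shell
# List remaining EKS clusters; after a successful cleanup the deployed
# cluster should no longer appear.
confirm_cleanup() {
  if ! command -v aws >/dev/null 2>&1; then
    echo "aws CLI not found; skipping"
    return 0
  fi
  aws eks list-clusters --region us-west-2 2>/dev/null \
    || echo "could not list clusters (check AWS credentials)"
}

confirm_cleanup
```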
Next Steps
After deploying the infrastructure: