Apache Airflow on EKS Infrastructure

Deploy Apache Airflow platform on Amazon EKS with GitOps, auto-scaling, and observability.

Architecture

This stack deploys Airflow on EKS with Karpenter for elastic node provisioning and ArgoCD for GitOps-based application management.

[Architecture diagram: Airflow on Amazon EKS with Karpenter node provisioning and ArgoCD GitOps]

Prerequisites

Before deploying, ensure you have the following tools installed and configured; all of them are used by the steps below:

  • AWS CLI, configured with credentials for the target AWS account
  • Terraform
  • kubectl
  • git
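
A quick way to confirm each tool is available on your PATH (this sketch only prints versions; it does not check them against any minimum requirement):

# Print the version of each required tool; "command not found" means it still needs to be installed
aws --version
terraform version
kubectl version --client
git --version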

Step 1: Clone Repository & Navigate

git clone https://github.com/awslabs/data-on-eks.git
cd data-on-eks/data-stacks/airflow-on-eks

Step 2: Customize Stack

Edit the terraform/data-stack.tfvars file to customize settings if required, using vi, nano, or any other text editor.
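
For example, to review the current defaults and then change any values you need:

# Review the default settings shipped with the stack
cat terraform/data-stack.tfvars

# Edit values as needed; any text editor works
vi terraform/data-stack.tfvars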

Step 3: Deploy Infrastructure

Run the deployment script:

./deploy.sh
note

If deployment fails:

  • Rerun the same command: ./deploy.sh
  • If it still fails, debug using kubectl commands (see the sketch below) or raise an issue
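
A few generic starting points for inspecting a partially deployed cluster (standard kubectl commands, not specific to deploy.sh):

# Look for pods that are not Running
kubectl get pods -A

# Recent cluster events, newest last, often point at the failing component
kubectl get events -A --sort-by=.lastTimestamp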
info

Expected deployment time: 15-20 minutes

Step 4: Verify Deployment

The deployment script automatically configures kubectl. Verify the cluster is ready:

# Set kubeconfig
export KUBECONFIG=kubeconfig.yaml

# Verify cluster nodes
kubectl get nodes

# Check all namespaces
kubectl get namespaces

# Verify ArgoCD applications
kubectl get applications -n argocd
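
If kubeconfig.yaml is not available (for example, when verifying from another machine), kubectl can usually be pointed at the cluster with the AWS CLI instead; the cluster name and region below are placeholders to replace with your own values:

# Regenerate a kubeconfig entry for the cluster (name and region are placeholders)
aws eks update-kubeconfig --name <cluster-name> --region <region>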

Quick Verification

Run these commands to verify successful deployment:

# 1. Check nodes are ready
kubectl get nodes
# Expected: 4-5 nodes with STATUS=Ready

# 2. Check ArgoCD applications are synced
kubectl get applications -n argocd
# Expected: All apps showing "Synced" and "Healthy"

# 3. Check Karpenter NodePools and NodeClaims are ready
kubectl get nodepools
kubectl get nodeclaims
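
If some applications are still progressing, a small wait loop can save re-running the checks by hand. The jsonpath below uses the standard ArgoCD Application status fields; the 15-second interval is arbitrary:

# Poll until every ArgoCD application reports "Synced Healthy"
while kubectl get applications -n argocd \
    -o jsonpath='{range .items[*]}{.status.sync.status} {.status.health.status}{"\n"}{end}' \
    | grep -qv '^Synced Healthy$'; do
  echo "Waiting for all ArgoCD applications to become Synced and Healthy..."
  sleep 15
done
echo "All ArgoCD applications are Synced and Healthy."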

Expected Output Examples

Nodes:

NAME                                          STATUS   ROLES    AGE     VERSION
ip-100-64-112-7.us-west-2.compute.internal    Ready    <none>   6h16m   v1.33.5-eks-113cf36
ip-100-64-113-62.us-west-2.compute.internal   Ready    <none>   6h16m   v1.33.5-eks-113cf36
ip-100-64-12-166.us-west-2.compute.internal   Ready    <none>   6h10m   v1.33.5-eks-113cf36

ArgoCD Applications:

NAME                SYNC STATUS   HEALTH STATUS
airflow             Synced        Healthy
argo-events         Synced        Healthy
argo-workflows      Synced        Healthy
aws-for-fluentbit   Synced        Healthy
...

Karpenter NodeClaims (one per node that Karpenter has provisioned):

NAME                              TYPE          CAPACITY    ZONE         NODE                                          READY   AGE
general-purpose-d2mrg             m7g.4xlarge   spot        us-west-2a   ip-100-64-12-166.us-west-2.compute.internal   True    6h12m
general-purpose-d5mgm             m7g.4xlarge   spot        us-west-2a   ip-100-64-50-67.us-west-2.compute.internal    True    6h15m
memory-optimized-graviton-xkn5x   r8g.4xlarge   on-demand   us-west-2a   ip-100-64-59-123.us-west-2.compute.internal   True    6h14m

Step 5: Access ArgoCD UI

The deployment script displays ArgoCD credentials at the end. Access the UI:

# Port forward ArgoCD server
kubectl port-forward svc/argocd-server -n argocd 8080:443

Open https://localhost:8080 in your browser:

  • Username: admin
  • Password: Displayed at the end of the deploy.sh output (see below for how to retrieve it from the cluster if you no longer have that output)
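
If the terminal output is gone, the initial admin password can usually be read from the argocd-initial-admin-secret secret that a default ArgoCD installation creates (this assumes the secret has not been deleted or rotated):

# Read the initial ArgoCD admin password from the cluster
kubectl -n argocd get secret argocd-initial-admin-secret \
  -o jsonpath='{.data.password}' | base64 -d; echo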
info

It may take an additional ~5 minutes for all applications to show Synced and Healthy status.

[Screenshot: ArgoCD Applications dashboard]

Troubleshooting

Common Issues

Pods stuck in Pending:

# Check node capacity
kubectl describe nodes

# Check Karpenter logs
kubectl logs -n karpenter -l app.kubernetes.io/name=karpenter
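
To see which pods are affected and why the scheduler cannot place them (generic commands; the pod name and namespace are placeholders):

# List pending pods across all namespaces
kubectl get pods -A --field-selector=status.phase=Pending

# Inspect the Events section of one pending pod for the scheduling reason
kubectl describe pod <pod-name> -n <namespace>

# Check whether Karpenter has created NodeClaims in response
kubectl get nodeclaims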

ArgoCD applications not syncing:

# Check ArgoCD application status
kubectl get applications -n argocd

# Check a specific application (airflow shown here; use any name from the list above)
kubectl describe application airflow -n argocd

Try refreshing and syncing the applications in the ArgoCD UI. If you have the argocd CLI installed (it is not part of this stack's prerequisites), you can also do this from the terminal, as shown below.
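
A minimal sketch using the argocd CLI over the port-forward from Step 5; the admin password placeholder is whatever deploy.sh printed (or the secret shown earlier):

# Log in through the existing port-forward (accept the self-signed certificate with --insecure)
argocd login localhost:8080 --username admin --password <admin-password> --insecure

# Force a refresh, then sync a single application (airflow shown as an example)
argocd app get airflow --refresh
argocd app sync airflow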

Next Steps

With the infrastructure deployed, you can now run any of the Airflow examples.

Cleanup

To remove all resources, use the dedicated cleanup script:

# Navigate to stack directory
cd data-on-eks/data-stacks/airflow-on-eks

# Run cleanup script
./cleanup.sh
warning

This command will delete all resources and data. Make sure to back up any important data first.

note

If cleanup fails:

  • Rerun the same command: ./cleanup.sh
  • Keep rerunning until all resources are deleted
  • Some AWS resources may have dependencies that require multiple cleanup attempts
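
To confirm the cluster itself has been removed after cleanup (the region below is an assumption based on the example outputs earlier on this page; adjust it to the region you deployed to):

# The Airflow cluster should no longer appear in the list
aws eks list-clusters --region us-west-2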