ClickHouse on EKS Stack
Deploy ClickHouse, a high-performance, column-oriented OLAP database for real-time analytics on petabyte-scale datasets, on Amazon EKS. This stack provisions a sharded, replicated ClickHouse cluster managed by the ClickHouse Kubernetes operator, with a dedicated ClickHouse Keeper ensemble that coordinates replication and distributed DDL.
Getting Started
Deploy Infrastructure
Provision the EKS cluster and Karpenter node pools, then install the ClickHouse operator through ArgoCD
Launch a ClickHouse Cluster
Deploy a sharded, replicated ClickHouse installation backed by ClickHouse Keeper
Load Sample Data
Ingest the ClickHouse hits dataset from S3 into a Distributed table backed by ReplicatedMergeTree
Query and Test Failover
Run analytical queries, inspect index usage with EXPLAIN, and validate replica failover
Infrastructure Deployment
Deploy a scalable ClickHouse platform on Amazon EKS, using Terraform for provisioning, Karpenter for node autoscaling, ArgoCD for GitOps management, and the ClickHouse operator for cluster lifecycle management.
Sample Workload: Hits Dataset
Load the canonical ClickHouse hits Parquet dataset from S3 into a Distributed table over ReplicatedMergeTree, run analytical queries, and demonstrate replica failover by deleting a pod.
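The table layout described above can be sketched as a small Python helper that builds the two DDL statements: a ReplicatedMergeTree table on each shard, and a Distributed table that fans reads and writes out across those shards. This is a minimal sketch, not the stack's actual schema. The database, table, and cluster names, the column subset and its types, the Keeper path, and the sharding key are all illustrative assumptions. It also assumes the `{shard}` and `{replica}` macros are defined in the server configuration.

```python
"""Sketch: build DDL for a sharded, replicated table. All names are hypothetical."""


def local_table_ddl(database, table, cluster, columns, order_by):
    """Return DDL for the per-shard ReplicatedMergeTree table."""
    cols = ",\n    ".join(f"{name} {type_}" for name, type_ in columns)
    # Assumed Keeper path layout; {shard} and {replica} are left as literal
    # macros for the server to expand, so they are escaped in the f-string.
    keeper_path = f"/clickhouse/tables/{{shard}}/{database}/{table}"
    return (
        f"CREATE TABLE IF NOT EXISTS {database}.{table} ON CLUSTER '{cluster}'\n"
        f"(\n    {cols}\n)\n"
        f"ENGINE = ReplicatedMergeTree('{keeper_path}', '{{replica}}')\n"
        f"ORDER BY ({', '.join(order_by)})"
    )


def distributed_table_ddl(database, table, local_table, cluster, sharding_key):
    """Return DDL for the Distributed table layered over the local tables."""
    return (
        f"CREATE TABLE IF NOT EXISTS {database}.{table} ON CLUSTER '{cluster}'\n"
        f"AS {database}.{local_table}\n"
        f"ENGINE = Distributed('{cluster}', '{database}', '{local_table}', {sharding_key})"
    )


if __name__ == "__main__":
    # Illustrative subset of columns; the types here are assumptions.
    columns = [
        ("CounterID", "UInt32"),
        ("EventDate", "Date"),
        ("UserID", "UInt64"),
        ("URL", "String"),
    ]
    print(local_table_ddl("demo", "hits_local", "demo_cluster",
                          columns, ["CounterID", "EventDate", "UserID"]))
    print()
    print(distributed_table_ddl("demo", "hits", "hits_local",
                                "demo_cluster", "cityHash64(UserID)"))
```

Queries and inserts would target the Distributed table (`demo.hits` in this sketch), while each replica stores its data in the local table. Hashing the sharding key keeps all rows for one user on the same shard. That is one possible design choice, not a requirement of the stack.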