StarRocks on EKS: Shared-Data vs Shared-Nothing
Introduction
StarRocks is a high-performance analytical database that supports two deployment architectures on Kubernetes. Which architecture you choose has major implications for cost, elasticity, and performance — so this benchmark gives you hard numbers to make an informed decision.
What is StarRocks and Where Does It Fit?
If you are coming from Spark, Trino/Presto, ClickHouse, or a traditional data warehouse, StarRocks will feel familiar but different. It is a massively parallel processing (MPP) analytical database with its own vectorized execution engine, cost-based optimizer, and columnar storage format. A cluster has two roles: Frontend (FE) nodes parse SQL, plan queries, and hold metadata; Backend (BE) or Compute Node (CN) workers execute those plans in parallel. Unlike Spark — which spins up an application per workload — a StarRocks cluster is a long-lived service that users connect to with standard MySQL clients, returning results in milliseconds to seconds. Three patterns drive most production adoption on EKS:
- OLAP directly on the data lake (external catalogs) — Teams with terabytes to petabytes already in S3 as Iceberg, Hudi, Delta Lake, or Hive tables point StarRocks at the Glue or Hive metastore and query those tables in place, with zero ETL. The vectorized engine plus the local NVMe data cache delivers interactive performance on top of the existing lakehouse, while Spark, EMR, and Athena keep full interop.
- Real-time analytics with streaming ingestion — Data lands continuously from Kafka (Routine Load), Flink, or CDC sources into native StarRocks tables. Writes are visible within seconds, making StarRocks a good fit for live dashboards, fraud detection, IoT telemetry, and user-facing product metrics.
- High-concurrency customer-facing analytics — Embedded dashboards, SaaS reporting, and internal BI tools that must serve thousands of concurrent users with sub-second latency — usually a mix of native tables, materialized views, and selective external-catalog queries.
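The streaming-ingestion pattern above is usually wired up with Routine Load. As a minimal sketch — the database, table, topic, column names, and broker address here are all hypothetical placeholders:

```sql
-- Continuously consume a Kafka topic into a native StarRocks table.
-- Job name, table, schema, brokers, and topic are illustrative only.
CREATE ROUTINE LOAD metrics_db.ingest_page_views ON page_views
COLUMNS (event_time, user_id, page_id)
PROPERTIES (
    "format" = "json",                    -- one JSON object per Kafka message
    "desired_concurrent_number" = "3"     -- parallel consumer tasks
)
FROM KAFKA (
    "kafka_broker_list" = "kafka-0.kafka:9092",
    "kafka_topic" = "page_views"
);
```

Once created, the job runs continuously in the background, and rows become queryable within seconds of landing on the topic — the property that makes the live-dashboard and fraud-detection use cases work.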
Native Tables vs. External Catalogs, and Scaling to Thousands of Users
The most important mental model in StarRocks is the distinction between native (internal) tables and external catalogs. Native tables use StarRocks' own columnar format — sorted, bucketed, indexed, supporting UPDATE and DELETE through the primary-key model, and accelerated by synchronous and asynchronous materialized views that transparently rewrite queries to pre-aggregated results. They live on BE local disk in shared-nothing or on S3 with NVMe caching on CNs in shared-data. External catalogs leave the data untouched in open formats (Iceberg / Hudi / Delta / Hive / Paimon / JDBC) and query it in place via the underlying metastore — you get lakehouse interoperability at the cost of slightly higher first-query latency until the data cache warms up. Most production deployments use both: external catalogs for cold lakehouse data, native tables for hot and high-concurrency workloads, and materialized views to bridge the two.
Serving thousands of concurrent OLAP users is where the shared-data architecture is designed to shine. Because CNs are stateless, you can scale compute horizontally in seconds without re-sharding data, run multiple isolated warehouses against the same S3 storage to separate ad-hoc BI from batch ETL, and drive it all with Kubernetes-native autoscaling through HPA and Karpenter. Combined with the data cache on NVMe, the query cache, and materialized views, a StarRocks cluster on EKS can sustain very high QPS cost-effectively. Shared-nothing, by contrast, couples compute and storage and trades elasticity for predictable single-query latency on smaller, steadier workloads. This benchmark quantifies exactly that trade-off on identical hardware.
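The "use both" pattern is visible in a single statement: a query can join a hot native table with a cold lakehouse table in place by fully qualifying each side with its catalog. Catalog, schema, and table names below are hypothetical:

```sql
-- Hypothetical names: join a hot native aggregate (default_catalog)
-- with a cold Iceberg dimension queried in place (iceberg_lake).
SELECT d.d_year, SUM(f.revenue) AS revenue
FROM default_catalog.dw.daily_revenue AS f     -- native table, hot path
JOIN iceberg_lake.warehouse.date_dim  AS d     -- external catalog, lake data
  ON f.date_sk = d.d_date_sk
GROUP BY d.d_year
ORDER BY d.d_year;
```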
Workload Isolation Within a Single Cluster
A common question is whether you can keep dimension tables on BE and fact tables on CN inside one cluster. You cannot. StarRocks' storage mode is cluster-wide — a cluster is either shared-nothing (BE) or shared-data (CN); FEs only speak one mode, and you cannot colocate BE and CN under the same FE quorum. What you can do to get similar effects:
- Multi-warehouse (shared-data only) — run multiple isolated CN warehouses backed by the same S3 storage so an ad-hoc BI warehouse and a batch-ETL warehouse never steal CPU from each other. Scale each warehouse independently.
- Resource groups (shared-nothing and shared-data) — available since v2.2, with hard CPU limits via `exclusive_cpu_cores` in v3.3.5+. Partition CPU, memory, and concurrency across users / roles / query types inside a single BE or CN pool. Use this when a separate warehouse is overkill.
- Colocated joins + materialized views — the idiomatic way to accelerate dimension × fact joins. Put dims and facts in the same colocation group (same bucket key / count) so joins execute locally with no shuffle; layer async MVs on top to pre-compute the expensive joins and let the optimizer rewrite queries to them transparently.
- Two-cluster federation — for teams that genuinely want both profiles, run a small shared-nothing cluster for hot dims and a large shared-data cluster for facts, and join across them using an external catalog.
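For the resource-group option, a minimal sketch looks like the following. The group name, user, and limit values are hypothetical, and exact property names vary by StarRocks version (e.g. `cpu_weight` superseded the older `cpu_core_limit`):

```sql
-- Cap ad-hoc BI sessions so they cannot starve ETL on the same BE/CN pool.
CREATE RESOURCE GROUP bi_adhoc
TO (user = 'bi_user')                  -- classifier: which sessions match
WITH (
    'cpu_weight'        = '10',        -- relative CPU share under contention
    'mem_limit'         = '30%',       -- fraction of BE/CN query memory
    'concurrency_limit' = '20'         -- max concurrent queries in the group
);
```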
Capabilities That Shape Benchmark Results
A few StarRocks features directly influence the numbers you see in the sections below. Understanding them helps you read the results in context and reproduce them on your own data:
- Table models (4 types) — Duplicate (append-only), Aggregate (pre-aggregated on load), Unique (latest-wins by key), and Primary Key (real-time updates/deletes, partial updates, fast point lookups). TPC-DS uses Duplicate tables, which is why this benchmark stresses scan + join + aggregation rather than upsert paths.
- Asynchronous materialized views with query rewrite (v2.5+) — the optimizer transparently rewrites SPJG queries (Scan / Filter / Project / Aggregate) to matching MVs, including MVs built over Hive / Hudi / Iceberg external catalogs. A raw-TPC-DS run (like this one) intentionally disables MVs so the engine / storage comparison stays apples-to-apples; in production you'd typically add MVs for your hot query shapes.
- Data cache (two-tier) — in-memory page cache for decoded pages + on-disk block cache on BE/CN local disk (NVMe strongly recommended). This is what closes the gap between S3-backed shared-data and EBS-backed shared-nothing once the working set is warm, and is why we report cold vs. warm iterations separately.
- Query cache — caches partial aggregation results across queries with overlapping predicates. Helpful for dashboard fan-out, not a factor in single-pass TPC-DS runs.
- Colocated joins — when dim and fact share bucketing, the join is purely local; network shuffle disappears. Heavily relevant to TPC-DS `store_sales` × `date_dim` / `item` / `customer` patterns.
- Streaming / batch ingestion — Routine Load (Kafka), Stream Load (HTTP), Broker Load (S3/HDFS bulk), and the Flink connector land data in seconds into native tables; Iceberg/Hudi/Delta external catalogs let you skip loading entirely for data already in the lake.
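To make the MV point concrete, here is a hedged sketch of an async materialized view over the benchmark's store_sales fact table (the MV name, refresh interval, and bucketing are illustrative, not what this benchmark ran — MVs were deliberately disabled here):

```sql
-- Pre-aggregate daily store profit; from v2.5 the optimizer can
-- transparently rewrite matching SPJG queries to this MV.
CREATE MATERIALIZED VIEW store_sales_daily_mv
DISTRIBUTED BY HASH(ss_sold_date_sk)
REFRESH ASYNC EVERY (INTERVAL 1 HOUR)
AS
SELECT ss_sold_date_sk,
       COUNT(*)           AS sales_cnt,
       SUM(ss_net_profit) AS net_profit
FROM store_sales
GROUP BY ss_sold_date_sk;
```

A query such as `SELECT ss_sold_date_sk, SUM(ss_net_profit) FROM store_sales GROUP BY ss_sold_date_sk` can then be answered from the MV without scanning the fact table.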
Architecture at a Glance
Both clusters share the same Frontend (FE) tier — metadata, SQL parsing, query planning, and the MySQL-protocol endpoint. They diverge at the worker tier (BE vs. CN) and at the storage substrate (local EBS vs. S3 with NVMe data cache). Both can query external catalogs (Iceberg / Hudi / Delta Lake / Hive) in S3 without copying data.
Shared-Nothing — BE with local EBS
Shared-Data — CN with S3 and NVMe data cache
How to read these diagrams
- FE tier is identical in both — only the worker tier and storage substrate change.
- In shared-nothing, native tables live on each BE's local EBS volume; losing a BE means replicating tablets.
- In shared-data, native tables are authoritative on S3; CNs are stateless and only hold a hot-data cache on NVMe, so they scale in and out in seconds.
- The external catalog path is the same in both: read Iceberg / Hudi / Delta / Hive tables directly from S3 via the Glue or Hive Metastore — no ingest, no copy.
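The external-catalog path in both diagrams boils down to one statement. A sketch for an Iceberg catalog backed by AWS Glue — property names follow the StarRocks Iceberg catalog documentation, and the catalog name, region, and auth mode are placeholders:

```sql
-- Register the Glue metastore as an external catalog; no data is copied.
CREATE EXTERNAL CATALOG iceberg_lake
PROPERTIES (
    "type"                         = "iceberg",
    "iceberg.catalog.type"         = "glue",
    "aws.glue.use_instance_profile" = "true",  -- IRSA / instance-role auth
    "aws.glue.region"              = "us-east-1",
    "aws.s3.use_instance_profile"  = "true",
    "aws.s3.region"                = "us-east-1"
);

-- Lake tables are then queryable in place, e.g.:
-- SELECT COUNT(*) FROM iceberg_lake.sales_db.orders;
```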
Why This Benchmark Matters
When you're building an analytics platform on EKS, you face a fundamental storage architecture choice:
- Shared-Data (cloud-native): Stateless Compute Nodes (CN) read from S3 with local NVMe SSD cache. Storage and compute scale independently. Lower storage cost, higher elasticity.
- Shared-Nothing (traditional): Backend Nodes (BE) store data on local EBS volumes and perform both storage and compute. Data locality, but tighter coupling.
Both claim to deliver excellent OLAP performance, but the trade-offs aren't obvious from documentation alone. Common questions we answer here:
- Does S3 with NVMe cache match the speed of local EBS?
- Is the shared-data model worth the S3 latency for interactive analytics?
- What's the CPU/memory footprint difference under identical workloads?
- Which architecture gives better price-performance on AWS?
This benchmark answers these questions by running the TPC-DS 1TB decision support benchmark against both architectures on identical compute hardware, isolating storage as the only variable.
Executive Summary (TL;DR)
Shared-Nothing Wins Decisively on Speed
EBS-backed BE delivers 128.4s total, 1,309ms mean across 98 queries — 33.7% faster than shared-data. Local EBS (6000 IOPS) has lower and more predictable latency than S3+NVMe cache at 1 TB scale, and BE shows almost no cold-vs-warm penalty across iterations.
Shared-Data Wins on Flexibility
S3 + NVMe cache delivers 193.8s total, 1,977ms mean. The ~34% speed gap is the price paid for unlimited storage, second-scale elasticity (CN can scale 0→N), ~4× lower storage cost at scale, and durable multi-AZ persistence. On warm-cache iterations alone the gap narrows to ~12%.
Key findings:
- Compute is identical: Both clusters use 3× r8g.8xlarge / r8gd.8xlarge nodes (96 vCPU / 768 GiB total). The ONLY variable is storage architecture.
- Performance gap is meaningful at 1 TB (~34%): Shared-nothing finishes 98 queries in 128.4 s vs. 193.8 s for shared-data. EBS gp3 latency is both lower and more predictable than S3+NVMe-cache hydration.
- Cold start is the main penalty for shared-data: iteration 1 total is 289.9s (S3 GETs hydrating the cache) but iterations 2–3 drop to ~145s — roughly 2× overall, and 10–25× on individual cold queries (e.g. q03 18.0 s cold → 0.74 s warm).
- Shared-nothing has almost no warm-up effect: all three iterations finish in 122.5–131.5 s. EBS page cache plus tablet metadata is effectively always warm.
- Warm-only comparison narrows the gap to ~12%: if you discard iteration 1 and average iterations 2–3, shared-data is ~146 s vs. shared-nothing ~128 s. For long-lived dashboards that sit on a warm cache, the architectures are much closer than the raw totals suggest.
- q23 is the slowest query on both (36.1 s SN / 35.9 s SD — shared-data is actually 169 ms faster here). q23 is a 3-way cross-join on fact tables where compute dominates, not I/O.
- q95 times out on both: Same 600 s timeout, same query failure pattern — query-planner / optimizer issue, not storage.
- Small, repeatable queries punish shared-data: q03, q13, q62, q64, q66 all run 3.5–17.7× slower on shared-data purely because the cold-iteration S3 round-trip dominates the runtime.
- BE memory baseline is high: Shared-nothing BE uses ~85 GiB baseline even idle (page cache, tablet metadata). Size accordingly.
Use shared-data when: unpredictable workloads, need elasticity, large datasets (>10TB), multi-tenant, cost-sensitive storage.
Use shared-nothing when: predictable workloads, consistent low latency required, smaller datasets (<10TB), dedicated clusters.
Benchmark Configuration
Test Environment
| Component | Configuration |
|---|---|
| EKS Cluster | Amazon EKS v1.34.6 |
| Region | us-east-1 |
| StarRocks Version | 3.5-latest |
| Operator Version | v1.11.3 (deployed via ArgoCD) |
| Dataset | TPC-DS Scale Factor 1000 (~1TB raw) |
| Queries | All 99 standard TPC-DS queries |
| Iterations | 3 per query — reports min / avg / max |
| Query Timeout | 600 seconds |
Cluster Specifications
Both clusters have identical compute resources — the only variable is storage. This isolates the impact of storage architecture.
Shared-Data (CN + S3)
| FE | 3× m6g.2xlarge (8 vCPU, 32 GiB) |
| CN | 3× r8gd.8xlarge (32 vCPU, 256 GiB) |
| CN requests | 28 vCPU, 220 GiB |
| CN limits | 31 vCPU, 240 GiB |
| Data storage | S3 (unlimited) |
| Cache | 1.9 TB NVMe SSD per CN (RAID0) |
| FE metadata | 100Gi gp3 EBS |
Shared-Nothing (BE + EBS)
| FE | 3× m6g.2xlarge (8 vCPU, 32 GiB) |
| BE | 3× r8g.8xlarge (32 vCPU, 256 GiB) |
| BE requests | 28 vCPU, 220 GiB |
| BE limits | 31 vCPU, 240 GiB |
| Data storage | EBS gp3-starrocks |
| EBS spec | 500Gi × 3, 6000 IOPS, 250 MB/s |
| FE metadata | 100Gi gp3 EBS |
Why These Instance Types?
StarRocks recommends a minimum 1:4 vCPU:GB memory ratio for compute nodes. The r family (memory-optimized Graviton) delivers 1:8 — exceeding the minimum with headroom for:
- Vectorized hash joins and aggregation state (lives in RAM)
- Data cache page cache on the CN side (NVMe-backed on r8gd)
- Bloom filters, dictionary caches, tablet metadata (always in memory)
The d suffix (r8gd) adds local NVMe SSD — essential for shared-data CN cache. The shared-nothing BE uses r8g without local NVMe since data persistence lives on EBS PVCs.
Why not compute-optimized (c7g, c8g)?
Compute-optimized instances have a 1:2 vCPU:GB ratio — below StarRocks' minimum 1:4 recommendation. You'll see OOM kills on big joins long before you run out of CPU. We tested this early and observed CN pods failing to schedule with 128 GiB memory requests on c-family instances.
Why not network-optimized (r6in, m6in)?
Network-optimized instances (up to 200 Gbps) are a tempting trap for shared-storage architectures. In practice, once the Data Cache hit rate is >80% (typical in production), the bottleneck shifts from network to local NVMe IOPS and memory. The extra network bandwidth costs 15-25% more for throughput you won't use after warmup.
Why not storage-optimized (i4i, im4gn)?
i4i/im4gn have excellent local NVMe, but their per-node NVMe capacity far outstrips what the StarRocks data cache needs at this scale — you pay for local storage you won't use. These instances make more sense for ClickHouse-style workloads that keep the entire dataset on local disk.
Storage Architecture Deep Dive
| Dimension | Shared-Data (CN + S3) | Shared-Nothing (BE + EBS) |
|---|---|---|
| Primary storage cost | S3 Standard (~$0.023/GB/mo) | EBS gp3 (~$0.08/GB/mo) |
| Cache layer | 1.9 TB NVMe (free with instance) | N/A |
| Cold read latency | 10-50ms (S3 GET) | 1-2ms (EBS gp3) |
| Warm read latency | ~0.1ms (NVMe cache hit) | 1-2ms (EBS gp3) |
| Cache hit rate | ~90% after warmup (TPC-DS) | 100% (all data is local) |
| Storage IOPS | NVMe: ~100K+ per node | EBS: 6,000 per volume |
| Storage throughput | NVMe: 2+ GB/s per node | EBS: 250 MB/s per volume |
| Max storage size | Unlimited (S3) | 500 GiB per BE (1.5TB total) |
| Stateful | No (CN is stateless, cache is ephemeral) | Yes (data on EBS PVCs) |
| Elasticity | Scale CN 0→N in seconds | Scale BE requires data rebalancing |
| Data durability | 99.999999999% (S3 11 nines) | 99.999% (EBS) |
| Multi-AZ | S3 cross-AZ by default | Requires explicit replication |
Benchmark Methodology
TPC-DS Table Format
Both clusters use identical StarRocks native table DDL — no external catalogs, no Iceberg, no Hive. Data is loaded directly into StarRocks managed tables so the benchmark measures pure StarRocks query execution.
All 24 TPC-DS tables use the Duplicate Key model — StarRocks' default table type optimized for append-heavy OLAP workloads:
- Table type: `duplicate key` (append-only, no primary-key uniqueness)
- Distribution: `distributed by hash(<key>) buckets N` — data is sharded across buckets by hash of the join key
- Replication: `replication_num = 1` — single replica (no data redundancy for the benchmark)
- Storage engine: OLAP (columnar, sorted by duplicate key for range scans)
Example from store_sales:
```sql
create table store_sales (
    ss_sold_date_sk integer,
    ss_item_sk integer not null,
    ss_ticket_number bigint not null,
    -- ... 20+ more columns ...
    ss_net_profit decimal(7,2)
)
duplicate key (ss_sold_date_sk, ss_item_sk, ss_ticket_number)
distributed by hash(ss_item_sk, ss_ticket_number) buckets 261
properties("replication_num" = "1");
```
Why Duplicate Key model?
- Best for immutable analytical data (fact tables like store_sales) — TPC-DS loads append-only
- Lower write amplification than Primary Key model (no compaction for uniqueness)
- Full columnar compression with sort-by-key push-down
- Matches StarRocks' own recommended TPC-DS schema
The key difference between the two clusters is WHERE the data is physically stored:
- Shared-Data: Duplicate Key tables → S3 objects (managed by StarRocks starmgr)
- Shared-Nothing: Duplicate Key tables → EBS volumes on BE pods (local storage)
The SQL DDL is byte-for-byte identical. StarRocks abstracts the storage layer from the table definition.
Data Generation
We used the standard TPC-DS dsdgen tool at scale factor 1000 (~1TB raw data):
- Output: 10,015 `.dat` files, ~911 GB total
- Duration: ~2 hours using 32 parallel `dsdgen` workers on a Graviton r6g.8xlarge
- Storage: ReadWriteOnce PVC (1 TiB gp3), shared across both loaders
Data Loading
Both clusters were loaded from the same PVC using the StarRocks Stream Load (`_stream_load`) HTTP API:
- Loading target: ~12 GiB columnar-compressed in S3 (shared-data) / ~12 GiB on EBS (shared-nothing)
- Compression ratio: ~3-5x vs raw text
- Loading duration: 5-7 hours per cluster (single-threaded via curl)
- Data parity: Row counts verified near-identical between both clusters (store_sales: 2.75B rows, catalog_sales: ~1.4-1.5B, web_sales: 720M)
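Single-threaded Stream Load via curl is the simplest path but explains the 5-7 hour duration. For faster bulk loads straight from S3, a Broker Load job (mentioned in the capabilities list) would look roughly like this — bucket path, label, and auth mode are placeholders:

```sql
-- Bulk-load raw .dat files from S3 into the native table in parallel.
LOAD LABEL tpcds.load_store_sales
(
    DATA INFILE("s3://my-tpcds-bucket/sf1000/store_sales.*.dat")
    INTO TABLE store_sales
    COLUMNS TERMINATED BY "|"
)
WITH BROKER
(
    "aws.s3.use_instance_profile" = "true",
    "aws.s3.region"               = "us-east-1"
);
```

Progress is then visible via `SHOW LOAD WHERE LABEL = "load_store_sales"`.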
Query Execution
Each benchmark run executes all 99 TPC-DS queries for 3 iterations:
- Iteration 1: Cold cache (CN reads from S3, BE reads from EBS for first time)
- Iteration 2: Warm cache (NVMe for CN, page cache for BE)
- Iteration 3: Fully warm
Results report min / avg / max per query across iterations. The avg is used as the representative query time.
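The per-query setup is reproducible from any MySQL-protocol client against the FE endpoint; the 600 s limit maps to a standard session variable. A hedged sketch (the actual driver script lives in the repo linked at the end):

```sql
-- Session setup before each benchmark iteration.
SET query_timeout = 600;        -- seconds; queries over this are killed (q95 hits it)
-- Disable the query cache so iterations measure storage, not result reuse:
SET enable_query_cache = false;
-- Then run each TPC-DS query three times and record wall-clock min/avg/max.
```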
Results
Overall Runtime Comparison
Shared-nothing is 33.7% faster overall (128.4 s vs 193.8 s, a 65.4 s gap across 98 queries). The gap is driven largely by iteration 1 on shared-data: its cold cache incurs ~290 s of S3 GET latency before NVMe warms up, while BE is consistent across all three iterations (131.5 s → 131.2 s → 122.5 s).
Warm-only comparison (iterations 2 + 3 averaged): shared-data ≈ 145.8 s, shared-nothing ≈ 126.9 s — a much smaller ~13% gap. Which number you care about depends on your workload: ad-hoc exploration pays the cold tax; dashboards and embedded analytics mostly see the warm number.
Per-Cluster Headlines
Shared-Nothing Results
Shared-Data Results
Latency Distribution
The distributions look broadly similar at the coarse level — ~83% of shared-nothing queries and ~72% of shared-data queries complete in under 2 seconds — but the fast bucket (<500 ms) tells the real story: 57 queries for shared-nothing vs only 27 for shared-data. The S3 round-trip cost shifts most sub-500 ms SN queries into the 500 ms – 2 s bucket on SD.
Slowest Queries
The top-10 slowest queries per cluster, based on avg time across 3 iterations. The same query IDs dominate both clusters — joins and window functions on the fact tables — but the cold S3 penalty inflates several shared-data runs (notably q03, q04, q09, q28).
Shared-Nothing (top 10 by avg)
| Rank | Query | Avg (ms) | Min (ms) | Max (ms) | Notes |
|---|---|---|---|---|---|
| 1 | q23 | 36,071 | 31,567 | 39,498 | 3-way cross-join across fact tables |
| 2 | q67 | 6,197 | 6,172 | 6,246 | Deep window functions over store_sales |
| 3 | q14 | 5,398 | 5,389 | 5,405 | 3-way self-join on item |
| 4 | q78 | 5,132 | 5,095 | 5,189 | Multi-aggregation on web_sales |
| 5 | q04 | 5,107 | 4,561 | 5,869 | Heavy customer window functions |
| 6 | q11 | 3,887 | 3,844 | 3,911 | Repeat-customer year-over-year |
| 7 | q74 | 3,719 | 3,677 | 3,756 | Income-band ranking |
| 8 | q09 | 3,189 | 3,131 | 3,273 | Case-when aggregations |
| 9 | q75 | 3,141 | 3,074 | 3,259 | Multi-year comparison |
| 10 | q88 | 2,965 | 2,953 | 2,973 | Hourly store-sales aggregation |
| — | q95 | timeout | — | — | Complex web_sales self-join (hits 600 s limit on both clusters) |
Shared-Data (top 10 by avg)
| Rank | Query | Avg (ms) | Min (ms) | Max (ms) | Notes |
|---|---|---|---|---|---|
| 1 | q23 | 35,902 | 34,311 | 38,155 | 3-way cross-join — compute-bound, storage-independent |
| 2 | q04 | 7,878 | 4,880 | 12,859 | Cold iter 1 at 12.9 s pulls the avg up |
| 3 | q14 | 6,938 | 5,639 | 9,454 | 3-way self-join on item |
| 4 | q03 | 6,511 | 740 | 18,041 | Huge cold-vs-warm variance (24×) |
| 5 | q67 | 6,412 | 6,392 | 6,432 | Deep window functions — tied with SN |
| 6 | q78 | 6,171 | 5,991 | 6,506 | Multi-aggregation on web_sales |
| 7 | q09 | 5,931 | 3,386 | 10,943 | Case-when aggregations, cold-inflated |
| 8 | q28 | 5,224 | 2,603 | 10,465 | Multi-union aggregations |
| 9 | q59 | 5,155 | 3,175 | 6,146 | Week-over-week analysis |
| 10 | q75 | 5,127 | 4,396 | 6,533 | Multi-year comparison |
| — | q95 | timeout | — | — | Same 600 s timeout as shared-nothing |
Where Shared-Nothing Wins by the Biggest Margin
Small-to-medium queries where the cold-iteration S3 round-trip dominates runtime. These are the cases where shared-nothing's data-locality pays off most:
| Query | Shared-Nothing (avg ms) | Shared-Data (avg ms) | SD / SN | Notes |
|---|---|---|---|---|
| q03 | 368 | 6,511 | 17.7× slower | SD cold iter 1 alone = 18,041 ms |
| q66 | 358 | 2,234 | 6.2× slower | Cold iter dominates SD avg |
| q62 | 315 | 1,381 | 4.4× slower | Multi-table union on warehouse |
| q13 | 989 | 3,884 | 3.9× slower | Cold-heavy SD iter 1 = 9,625 ms |
| q64 | 1,153 | 4,089 | 3.5× slower | Customer purchase-path join |
| q28 | 2,707 | 5,224 | 1.9× slower | Multi-union on store/catalog/web sales |
Where Shared-Data Ties or Wins
On long-running compute-bound queries the storage difference is dwarfed by join cost, and on a handful of small queries the NVMe cache matches EBS after warm-up. Only q23 and q20 come out ahead for shared-data; the rest below are effectively ties:
| Query | Shared-Nothing (avg ms) | Shared-Data (avg ms) | Delta | Notes |
|---|---|---|---|---|
| q23 | 36,071 | 35,902 | SD 169 ms faster (0.5%) | Compute-bound cross-join |
| q20 | 173 | 159 | SD 14 ms faster (8.1%) | Tiny aggregation, NVMe caching effect |
| q11 | 3,887 | 3,892 | SN 5 ms faster (0.1%) | Essentially tied |
| q99 | 91 | 93 | SN 2 ms faster (2.2%) | Essentially tied |
| q67 | 6,197 | 6,412 | SN 215 ms faster (3.5%) | Long window query — compute dominates |
| q51 | 2,227 | 2,314 | SN 87 ms faster (3.9%) | Rolling aggregations |
Resource Utilization
Observed during benchmark execution via kubectl top pods and Grafana:
CPU Utilization
| Metric | Shared-Data CN | Shared-Nothing BE |
|---|---|---|
| Peak CPU during query | ~3,000m (~9.4% of 32 vCPU) | ~4,000m (~12.5%) |
| Avg CPU during benchmark | ~2,200m (~7%) | ~3,000m (~9%) |
| Idle CPU | <100m | <50m |
Observation: Neither architecture is CPU-bound at 1TB scale. The bottleneck is I/O (S3+NVMe vs EBS) and query planning on the FE. At 10TB scale, CPU saturation would become more relevant — consider that for capacity planning.
Memory Footprint
| Metric | Shared-Data CN | Shared-Nothing BE |
|---|---|---|
| Baseline memory | ~94 GiB (37% of 256) | ~85 GiB (33% of 256) |
| Peak memory during query | ~105 GiB (41%) | ~95 GiB (37%) |
| Contents | Data cache + query buffers + tablet metadata | Page cache + tablet metadata (always resident) |
Observation: Both architectures carry a significant baseline memory footprint. The BE doesn't consume less memory just because it's idle — StarRocks pre-loads tablet metadata and keeps page cache hot. Budget for ~85-95 GiB baseline per node regardless of query load.
Storage I/O
- Shared-Nothing: EBS IOPS approached the 6,000 provisioned limit on iteration 1 (cold read) but stayed well below on warm iterations thanks to Linux page cache
- Shared-Data: Iteration 1 generates ~600 MB/s of S3 GET traffic per CN (network-bound on cold cache). Iterations 2-3 drop S3 traffic to near-zero as NVMe cache serves reads
Cost Analysis (Monthly, us-east-1)
The prices shown in this section are illustrative estimates captured at the time of the benchmark run and will drift over time. AWS pricing changes frequently with discounts, Savings Plans, Spot, regional variation, and instance-family updates. Always refer to the official AWS pricing pages for current costs before making architectural or budgeting decisions:
- Amazon EC2 Pricing — for FE/BE/CN instance costs
- Amazon EBS Pricing — for gp3 volume costs
- Amazon S3 Pricing — for object storage + request costs
- AWS Pricing Calculator — for region-specific "what-if" modeling
Use the StarRocks Cost Analyzer or kubecost on your own EKS cluster for real-world figures once the workload is live.
| Component | Shared-Data | Shared-Nothing |
|---|---|---|
| 3× FE (m6g.2xlarge) | $560 | $560 |
| 3× CN/BE (r8gd/r8g.8xlarge) | ~$4,220 | ~$3,580 |
| Data storage (1TB compressed) | S3: ~$30 (1TB @ $0.023/GB) | EBS: ~$120 (1.5TB @ $0.08/GB) |
| EBS for FE + logs | ~$30 | ~$30 |
| Total (3-node cluster) | ~$4,840/month | ~$4,290/month |
r8gd costs ~18% more than r8g because of the bundled NVMe SSD — the premium that buys the local data cache fronting S3.
At 1 TB scale, shared-nothing is cheaper by ~$550/month (~13%). As data grows:
| Dataset Size | Shared-Data Storage | Shared-Nothing Storage | Storage Delta |
|---|---|---|---|
| 1 TB | $23/mo | $80/mo | +$57/mo |
| 10 TB | $230/mo | $800/mo (scaling BE) | +$570/mo |
| 100 TB | $2,300/mo | $8,000/mo | +$5,700/mo |
| 1 PB | $23,000/mo | $80,000/mo | +$57,000/mo |
The inflection point for shared-data cost advantage is around 5-8 TB of stored data (after compression).
Factor In: Scale-to-Zero (Shared-Data Only)
With KEDA autoscaling (scale CN to 0 during idle), shared-data can also shed each CN's ~$1,400/month cost for the hours it sits idle — e.g. nights and weekends. For a 3-CN cluster idle 60% of the time, that's an extra ~$2,500/month in savings.
When To Use Which
Choose Shared-Data When
- Unpredictable workloads — queries arrive in bursts, need scale-out
- Large datasets (>10 TB after compression)
- Multi-tenant — different users want different compute sizes
- Cost-sensitive storage — pay-as-you-grow with S3
- Cloud-native ops — want to treat compute as cattle
- Need multi-AZ DR — S3 provides it for free
- Scale-to-zero needs — idle weekends/nights are free
Examples: Data lakehouse queries, ad-hoc analytics, multi-team data platforms
Choose Shared-Nothing When
- Predictable workloads — steady traffic, known query patterns
- Consistent low latency required — dashboards, real-time analytics
- Smaller datasets (< 10 TB)
- Dedicated cluster — single team, stable compute
- Simpler operations — familiar DBA patterns
- No S3 latency budget — every query must be sub-second
- Every millisecond counts — 34% faster overall at 1 TB scale, and still ~13% faster on warm-cache iterations alone
Examples: BI dashboards, customer-facing analytics, real-time reporting
Reproducing This Benchmark
Full step-by-step guide: TPC-DS Benchmark on StarRocks
Infrastructure setup: StarRocks Infrastructure Deployment
Code: data-stacks/starrocks-on-eks