
9 Iceberg Table Compaction Tools Compared for Production Lakehouses

Compaction keeps Apache Iceberg lakehouses fast and lean — but every tool approaches it differently. A side-by-side look at nine production options: LakeOps, AWS Glue, Amazon S3 Tables, Snowflake, Google BigLake, Cloudera, Starburst, Dremio, and Databricks.

Jonathan Saring


17 min read
Iceberg Table Maintenance Solution Comparison — side-by-side feature matrix for LakeOps, AWS Glue, S3 Tables, Snowflake, BigLake, Cloudera, and Starburst

Every Apache Iceberg guide explains compaction the same way: small files slow down queries, so you merge them into bigger ones. Run `rewrite_data_files`, set a target size, schedule it nightly, move on.

That description is accurate for a single table with predictable batch loads. In a production lakehouse — hundreds of tables, streaming and batch mixed, multiple engines, teams with different SLAs — compaction is a systems problem. A streaming pipeline checkpointing every 60 seconds against 100 active partitions generates 144,000 new files per day. A nightly Spark job processes yesterday's output, but by morning the backlog has rebuilt itself. Delete files pile up alongside data files, forcing every query to reconcile both. Sort orders chosen at table creation go stale as access patterns shift. And maintenance operations scheduled independently start conflicting: compaction rewrites files that snapshot expiration is about to dereference, orphan cleanup deletes temporary files before the engine has committed them.
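The file-count figure above is simple arithmetic, sketched here under the assumption that every checkpoint writes exactly one new data file per active partition (real pipelines may write more or fewer). The function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def files_per_day(checkpoint_interval_s: int, active_partitions: int) -> int:
    """Files written per day, assuming one file per partition per checkpoint."""
    checkpoints_per_day = SECONDS_PER_DAY // checkpoint_interval_s
    return checkpoints_per_day * active_partitions


# 1,440 checkpoints per day across 100 partitions
print(files_per_day(checkpoint_interval_s=60, active_partitions=100))
```

Halving the checkpoint interval doubles the daily file count, which is why streaming tables tend to outrun a nightly job.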

The ecosystem now offers nine production-grade approaches to this problem. This guide compares them — from cloud-native managed services to dedicated optimization platforms and engine-integrated maintenance — so you can evaluate which fits your architecture, scale, and operational model.

The compaction problem at scale

Most tools handle file consolidation. The real differentiators are everything else a compaction system must address at production scale:

- Physical layout optimization: sorting data on columns that queries filter on, so engines can skip entire row groups via Parquet min/max statistics. Sorted layouts can cut scan volume in half compared to unsorted data.
- Delete file handling: physically applying position and equality delete files during compaction so readers stop paying reconciliation overhead on every query.
- Maintenance sequencing: running compaction after snapshot expiration and orphan cleanup so it operates on the clean, current dataset rather than rewriting files about to be garbage-collected.
- Trigger intelligence: compacting based on actual table state (file count, average size, delete-file ratio) rather than cron schedules that waste compute on idle tables and miss degraded ones.
- Per-table tuning: tables serving point lookups perform best at 64–128 MB files; full-partition scan tables at 256–512 MB. One default does not fit every workload.
- Engine cost: JVM-based compaction (Spark, Trino) carries startup, GC, and idle-cluster overhead that compounds across hundreds of daily compaction runs.
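To make the layout point concrete, here is a toy model of min/max pruning. It is a sketch, not how any engine implements skipping: each Python list stands in for a file or row group, and the only statistic kept is the minimum and maximum of one column. All names are illustrative.

```python
from typing import List, Tuple


def min_max_stats(chunks: List[List[int]]) -> List[Tuple[int, int]]:
    """Per-chunk (min, max) for a single column, like Parquet column statistics."""
    return [(min(c), max(c)) for c in chunks]


def chunks_to_scan(stats: List[Tuple[int, int]], lo: int, hi: int) -> int:
    """Count chunks whose [min, max] range overlaps the filter lo <= x <= hi."""
    return sum(1 for cmin, cmax in stats if cmax >= lo and cmin <= hi)


values = list(range(1000))
num_chunks = 10

# Sorted layout: each chunk holds a contiguous range of 100 values.
sorted_chunks = [values[i * 100:(i + 1) * 100] for i in range(num_chunks)]
# Unsorted layout (round-robin): every chunk spans almost the whole value range.
scattered_chunks = [values[i::num_chunks] for i in range(num_chunks)]

sorted_scan = chunks_to_scan(min_max_stats(sorted_chunks), 250, 260)
scattered_scan = chunks_to_scan(min_max_stats(scattered_chunks), 250, 260)
print(sorted_scan, scattered_scan)
```

With the sorted layout only one chunk overlaps the filter; with the scattered layout every chunk does, so nothing can be skipped.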

1. LakeOps

In LakeOps, compaction is not a standalone feature. It is the core of a system-level optimization loop: file rewrites run as one step in a sequenced set of operations, triggered by events and thresholds across the lakehouse, tuned per table, informed by the queries that hit each table in production, and executed on an engine built for this workload from the ground up.

LakeOps Control Plane — catalogs, engines, and autonomous optimization
LakeOps connects to your existing Iceberg catalogs and query engines as a dedicated control plane — no data movement, no pipeline changes. Compaction runs alongside observability, snapshot management, orphan cleanup, manifest optimization, routing, and policy governance as a unified system.

LakeOps is a dedicated lakehouse control plane built in Rust on Apache DataFusion. It connects to your existing catalogs and storage — no code, infrastructure, or pipeline changes. Your data stays in your account; LakeOps reads metadata and telemetry, nothing else.

Once connected, LakeOps autonomously runs compaction, snapshot expiration, orphan cleanup, and manifest optimization across every table — coordinated, sequenced, and tuned to each table's workload. You get full observability into lake health, intelligent automation that adapts to real query patterns, and granular control over every operation through fleet-wide policies. The result is a lakehouse that stays optimized continuously, without manual scripts or dedicated maintenance teams.

LakeOps Dashboard — lake-wide compaction activity and table health
The LakeOps Dashboard: 30-day optimization activity, query acceleration, cost savings, and table health across the lake — Critical, Warning, and Healthy — with recent compaction and maintenance operations visible at a glance.

Event-driven triggers, not cron. The system continuously monitors structural signals per table and partition — file count, average file size relative to target, delete-file-to-data-file ratio, manifest depth. Compaction fires only when thresholds are crossed. A streaming table may compact multiple times per hour; a weekly batch table compacts once. The trigger also responds to upstream events — after snapshot expiration frees data files, the system re-evaluates whether a compaction pass is warranted. No wasted runs, no missed tables.
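A threshold-based trigger of this kind can be sketched as a pure function over table state. This is an illustration only: the class names, field names, and every default threshold below are assumptions, not LakeOps settings or APIs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TableState:
    data_file_count: int
    avg_file_size_mb: float
    delete_file_count: int
    manifest_count: int


@dataclass
class Thresholds:
    # All defaults are illustrative assumptions.
    target_file_size_mb: float = 256.0
    max_data_files: int = 500
    min_avg_size_ratio: float = 0.5   # average size relative to target
    max_delete_ratio: float = 0.10    # delete files per data file
    max_manifests: int = 50


def compaction_reasons(state: TableState, t: Thresholds) -> List[str]:
    """Return the signals that crossed a threshold; an empty list means no run."""
    reasons = []
    if state.data_file_count > t.max_data_files:
        reasons.append("file_count")
    if state.avg_file_size_mb < t.min_avg_size_ratio * t.target_file_size_mb:
        reasons.append("avg_file_size")
    if (state.data_file_count > 0
            and state.delete_file_count / state.data_file_count > t.max_delete_ratio):
        reasons.append("delete_ratio")
    if state.manifest_count > t.max_manifests:
        reasons.append("manifest_count")
    return reasons


streaming = TableState(data_file_count=1200, avg_file_size_mb=8.0,
                       delete_file_count=30, manifest_count=20)
weekly_batch = TableState(data_file_count=40, avg_file_size_mb=240.0,
                          delete_file_count=0, manifest_count=5)
print(compaction_reasons(streaming, Thresholds()))
print(compaction_reasons(weekly_batch, Thresholds()))
```

Returning the list of reasons rather than a boolean makes each run auditable: the log can record which signal fired.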

Query-aware sort, driven by production telemetry. The compaction engine does not just merge files — it physically reorganizes data based on how tables are actually queried. It collects telemetry across every connected engine (Trino, Spark, Snowflake, Athena, DuckDB, Flink), identifies which columns appear in WHERE, JOIN, and GROUP BY clauses per table, and applies the sort order that maximizes data skipping. The planner evaluates single-dimension sort, multi-column sort, and z-order, selecting the strategy per table independently — not a global default applied blindly.
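The column-ranking idea can be illustrated with a simple frequency count. This is a deliberately naive sketch: the query-log shape and function name are hypothetical, and a real planner would presumably also weigh selectivity, scan cost, and cardinality rather than raw counts.

```python
from collections import Counter
from typing import Dict, List


def rank_sort_candidates(query_log: List[Dict[str, List[str]]], top_n: int = 2) -> List[str]:
    """Rank columns by how often they appear in WHERE, JOIN, or GROUP BY clauses."""
    counts: Counter = Counter()
    for query in query_log:
        for clause in ("where", "join", "group_by"):
            counts.update(query.get(clause, []))
    return [column for column, _ in counts.most_common(top_n)]


# Hypothetical telemetry: one dict per observed query, columns already extracted.
query_log = [
    {"where": ["event_date", "customer_id"]},
    {"where": ["event_date"], "group_by": ["region"]},
    {"join": ["customer_id"], "where": ["event_date"]},
]
candidates = rank_sort_candidates(query_log)
print(candidates)
```

Here `event_date` appears three times and `customer_id` twice, so they become the leading sort candidates.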

That telemetry also feeds the Insights engine, which surfaces when file structure and access patterns are out of alignment — excessive manifests, partition skew, small-file buildup — before query latency degrades. Platform teams see which tables need layout work and which columns queries actually filter on, rather than guessing from static sort orders set at table creation.

LakeOps Insights — proactive alerts driven by table and query signals
Lake-wide Insights: CRITICAL alerts for partition file issues, HIGH for excessive manifests and snapshots, WARNING for partition skew and small files — severity-ranked signals that inform when and how to compact and sort.

Sequenced with the full maintenance stack. Compaction does not run in a vacuum. In LakeOps, maintenance runs as a coordinated pipeline: (1) snapshot expiration trims stale data → (2) orphan cleanup removes newly unreferenced files → (3) compaction rewrites the clean, current dataset → (4) manifest optimization consolidates metadata against the final layout. Each operation's output becomes the next one's clean input. One production run removed ~200 TB of orphan data across 324 tables in under 30 minutes — the kind of cleanup that makes subsequent compaction dramatically more efficient.
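The four-step ordering can be expressed as a small runner that executes steps in a fixed sequence and stops at the first failure, so a later step never runs on input the earlier step failed to clean. The step names and callables are placeholders for illustration, not a LakeOps interface.

```python
from typing import Callable, Dict, List

# Order taken from the pipeline described above.
PIPELINE_ORDER = [
    "expire_snapshots",
    "remove_orphans",
    "compact_data_files",
    "rewrite_manifests",
]


def run_pipeline(table: str, steps: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Run maintenance steps in order; return the names of steps that completed."""
    completed = []
    for name in PIPELINE_ORDER:
        succeeded = steps[name](table)
        if not succeeded:
            break  # do not compact or rewrite manifests on a half-cleaned table
        completed.append(name)
    return completed


# Placeholder steps that always succeed; real steps would call an engine.
always_ok = {name: (lambda table: True) for name in PIPELINE_ORDER}
print(run_pipeline("sales.customer_orders", always_ok))
```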

That figure comes from a real customer deployment, not a benchmark slide: ~200 TB of orphan data removed across 324 tables in under 30 minutes, for less than the cost of a cup of coffee in compute.

Every step in that pipeline is logged in the Events view — compaction, snapshot expiration, orphan removal, and manifest rewrites — with duration, before/after file counts, and status so operators can verify sequencing and impact per table.

LakeOps table events — compaction, snapshot expiration, and manifest rewrites
Table-level Events: every step in the maintenance pipeline for customer_orders — Compact Data Files (970→87 files), Expire Snapshots, Rewrite Manifests — with duration, impact, and status.

Layout simulations before committing. Changing a sort order rewrites every data file in the table. If the new layout is worse for the dominant query pattern, you have paid the full cost with negative returns. Layout Simulations test proposed changes on a real Iceberg branch — the layout is applied, actual production queries are replayed, and cost/performance impact is compared before any production data is modified.

LakeOps Layout Simulations — testing sort strategies against real query patterns
Layout Simulations: field access frequency from real queries (SELECT, FILTER, JOIN) per column, three candidate sort configurations compared against the baseline, and the Layout Customization Diff showing projected file sizes for each approach.

A Rust engine that runs 86% faster than Spark. The execution engine matters as much as the strategy. LakeOps processes Parquet data through Arrow columnar buffers with bounded memory, lock-free parallelism, no garbage collection, and no executor provisioning. Every commit is non-blocking, so concurrent readers and writers are never interrupted. In production benchmarks across 10 tables totaling 5.5 TB, binpack completed in 221 seconds versus 1,612 for Spark (86% faster). Peak throughput reached 2,522 MB/s. File counts dropped from 101K to 19K (81% reduction). On a 1.2 TB table that caused Spark to fail with an out-of-memory error, the Rust engine completed compaction in 11 minutes.

The engine self-improves: it records per-table throughput, partition structure, and memory usage from each run, so subsequent passes execute faster as the planner converges on optimal resource allocation. In production, the same table went from 22 minutes to 11 minutes across consecutive runs with zero configuration changes.

LakeOps compaction benchmarks — Rust engine vs Spark vs S3 Tables
Compaction benchmarks: S3 Tables (6,300s), Apache Spark (1,612s), LakeOps binpack (221s), LakeOps sort (780s). The Rust engine delivers the fastest compaction at a fraction of the cost across both strategies.
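The percentages follow directly from the reported durations and file counts. The short check below uses only the numbers quoted in this section; the helper name is illustrative.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, as a percentage."""
    return (before - after) / before * 100


LAKEOPS_BINPACK_S = 221
SPARK_S = 1_612
S3_TABLES_S = 6_300

vs_spark = percent_reduction(SPARK_S, LAKEOPS_BINPACK_S)
vs_s3_tables = percent_reduction(S3_TABLES_S, LAKEOPS_BINPACK_S)
file_count_drop = percent_reduction(101_000, 19_000)

print(f"binpack vs Spark:     {vs_spark:.1f}% less time")
print(f"binpack vs S3 Tables: {vs_s3_tables:.1f}% less time")
print(f"file count:           {file_count_drop:.1f}% fewer files")
```

Expressed as a ratio, 1,612 s against 221 s is roughly a 7.3x speedup over Spark.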

Fleet-wide policies with per-table precision. Compaction rules are declared as policies — trigger conditions, strategy, schedule constraints, and scope. Scope follows a specificity hierarchy: table-level overrides namespace, which overrides catalog-wide baseline. Platform teams set defaults for the entire lake; individual teams override for their SLAs. Policies are versioned with a full audit trail.
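The specificity hierarchy amounts to layering settings from the broadest scope to the narrowest. The sketch below assumes policies are plain dictionaries keyed by a dotted scope name; that representation and the setting names are illustrative assumptions, not the LakeOps policy format.

```python
from typing import Any, Dict


def resolve_policy(catalog: str, namespace: str, table: str,
                   policies: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Merge catalog, namespace, and table policies; narrower scopes win."""
    scopes = [
        catalog,                              # catalog-wide baseline
        f"{catalog}.{namespace}",             # namespace override
        f"{catalog}.{namespace}.{table}",     # table override
    ]
    effective: Dict[str, Any] = {}
    for scope in scopes:
        effective.update(policies.get(scope, {}))
    return effective


policies = {
    "glue": {"strategy": "binpack", "target_file_size_mb": 256},
    "glue.sales": {"strategy": "sort"},
    "glue.sales.orders": {"target_file_size_mb": 128},
}
print(resolve_policy("glue", "sales", "orders", policies))
print(resolve_policy("glue", "hr", "staff", policies))
```

The table-level entry only has to state what differs; everything else is inherited from the namespace and catalog defaults.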

Delete file handling: Position and equality delete files are physically applied during compaction in the same pass. Standalone delete file rewriting is also available as a separate operation for addressing delete accumulation independently.

Observability: Full lake-wide and per-table visibility — compaction activity, file counts, throughput, cost impact, and health classification (Critical/Warning/Healthy). Every operation is logged with before/after metrics. Proactive alerts at four severity levels surface degradation before users notice it.

At the table level, the Insights tab pinpoints exactly what is wrong — a HIGH alert for 92 manifest files where 50 is the threshold, a WARNING for partition skew, a LOW note for early small-file accumulation. Each insight links directly to the affected table so operators can act or let the autonomous pipeline remediate on the next cycle.

LakeOps Insights tab — per-table alerts for manifests, skew, and small files
Table Insights for customer_orders: a HIGH alert for 92 manifest files (threshold: 50) with 43 undersized, a WARNING for partition data skew, and a LOW note for small file accumulation — severity-ranked signals that drive compaction and maintenance decisions.

Catalog support: AWS Glue, Hive Metastore, REST catalogs (Polaris, Gravitino, Nessie, Lakekeeper), and S3 Tables — multiple catalogs across regions and clouds simultaneously.

The compound effect: When event-driven triggers, production query telemetry, maintenance sequencing, layout simulation, a Rust engine, and fleet-wide policies all work together, compaction stops being a maintenance task and becomes a continuous optimization loop. The lake does not drift toward degradation between runs — it converges toward an optimal state, continuously, table by table, at petabyte scale.

Trade-offs: LakeOps operates as an external control plane, not embedded in a single cloud vendor's console. Teams with very low volume running a handful of tables may not need this level of orchestration. For teams running many tables across multiple catalogs or engines — where the cost of not coordinating compaction with the rest of the maintenance stack compounds — the dedicated approach produces better results with less operational work.

2. AWS Glue Data Catalog Table Optimization

AWS Glue Data Catalog provides managed table optimization for Iceberg tables directly within the Glue console, CLI, or SDKs. Compaction runs as one of three table optimizers alongside snapshot retention and orphan file deletion.

Compaction strategies: Binpack (default), Sort, and Z-order. Sort organizes data by specified columns for improved filtered query performance. Z-order clusters data for efficient multi-column queries.

Automation: Compaction activates when a table or partition exceeds 100 files with each file smaller than 75% of the target size (default 512 MB). Can also be triggered manually via the API. As of December 2024, Glue supports compaction of delete files and nested data types, with partial progress commits to reduce conflicts.
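The activation rule can be read as a predicate over a partition's file sizes. This is one interpretation of the rule as stated above, not Glue's actual implementation: it counts files smaller than 75% of the target and fires when that count exceeds 100. The function and parameter names are hypothetical.

```python
from typing import Iterable


def glue_style_should_compact(file_sizes_mb: Iterable[float],
                              target_mb: float = 512,
                              min_small_files: int = 100,
                              small_fraction: float = 0.75) -> bool:
    """True when more than `min_small_files` files are under 75% of the target size."""
    cutoff_mb = small_fraction * target_mb  # 384 MB with the default target
    small_files = sum(1 for size in file_sizes_mb if size < cutoff_mb)
    return small_files > min_small_files


print(glue_style_should_compact([64] * 150))    # many small files
print(glue_style_should_compact([64] * 100))    # exactly 100 does not exceed 100
print(glue_style_should_compact([400] * 150))   # files already near target
```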

Configuration: Target file size, strategy, sort columns, and compression codecs (zstd, brotli, lz4, gzip, snappy) are all configurable. Schema evolution and partition spec evolution are supported. Available in 14 AWS regions.

Trade-offs: The natural choice for teams committed to Glue. Compaction runs on AWS-managed infrastructure — no Spark clusters to provision. However, it is scoped to Glue-cataloged tables only. There is no cross-engine query telemetry and no coordinated sequencing between maintenance operations.

3. Amazon S3 Tables

S3 Tables brings object-storage-native Iceberg support directly into Amazon S3. Tables are created in dedicated table buckets, and all maintenance — including compaction — is automatic and enabled by default.

Compaction strategies: Auto (default — S3 selects binpack or sort based on the table's sort order), Binpack, Sort, and Z-order. Target file size configurable between 64 MB and 512 MB.

Automation: Fully automatic with zero configuration. S3 handles scheduling and execution transparently. AWS reports up to 3x query performance improvement with compaction enabled and up to 10x higher transactions per second compared to general-purpose buckets.

Cost: Per-object prices dropped 50% and per-byte processing dropped 90% for binpack (80% for sort/z-order) as of July 2025.

Trade-offs: The most hands-off option. The trade-off is control — tables must live in S3 table buckets, strategy selection is limited, and there is no query-aware sort, layout simulation, or maintenance sequencing.

4. Snowflake Managed Iceberg Tables

Snowflake provides managed Iceberg table support where Snowflake acts as the catalog and handles table maintenance internally. Data is written to customer-owned cloud storage (S3, GCS, or Azure Blob) in Iceberg format, accessible by external engines through the Iceberg catalog.

Snowflake managed Iceberg architecture
Snowflake manages Iceberg tables with data written to customer-owned external cloud storage in Iceberg format, accessible by external engines through the Iceberg catalog.

Compaction strategies: Automatic compaction is enabled by default. Since September 2025, the `ENABLE_DATA_COMPACTION` parameter allows disabling it at the account, database, schema, or table level. Configurable target file sizes (AUTO, 16 MB, 32 MB, 64 MB, 128 MB) tune interoperability, bringing external engines such as Spark, Trino, and Databricks SQL to within 7–8% of performance parity.

Automation: Fully automatic. Costs billed through Snowflake credits, trackable via the `ICEBERG_STORAGE_OPTIMIZATION_HISTORY` view. Billing for compaction started October 2025.

Trade-offs: Works well when Snowflake is the primary write engine. The compaction strategy is opaque — users control whether it runs and the file size, but not sorting or trigger conditions. Not informed by query patterns from non-Snowflake engines.

5. Google BigLake Managed Iceberg Tables

BigQuery manages Iceberg tables with data stored in Google Cloud Storage. BigLake provides automatic maintenance — compaction, clustering, garbage collection, and metadata optimization — as part of the managed table experience.

Compaction strategies: Automatic compaction with adaptive file sizing, automatic clustering and auto-reclustering. History-based optimization (GA as of 2025) tunes table layouts based on observed workloads. BigLake Metastore (GA in 2025) and Iceberg REST Catalog API (preview) enable external engine access.

Trade-offs: Tightly integrated with BigQuery and GCP. The automatic maintenance is zero-configuration but opaque — no user control over sort strategies, trigger thresholds, or sequencing. Multi-cloud architectures need a separate solution for tables outside BigLake.

6. Cloudera Lakehouse Optimizer

Cloudera provides an automated maintenance service within Cloudera Data Platform (CDP), available on AWS and Azure. Compaction runs as Spark jobs orchestrated by the Lakehouse Optimizer in the CDP Management Console.

Cloudera open data lakehouse architecture
Cloudera's open data lakehouse architecture: streaming and batch data flow through ingestion, processing, and consumption layers with Iceberg as the central format. The Lakehouse Optimizer runs Spark-based compaction across this stack.

Compaction strategies: Implements Iceberg's `rewrite_data_files` with Binpack, Sort, and Z-order. Configurable target file size, minimum input files, delete file threshold, max concurrent file group rewrites, and partial progress commits.
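For reference, the underlying Iceberg Spark procedure takes these settings as named arguments and an options map. The helper below only builds the `CALL` statement as a string; it is a sketch with hypothetical catalog and table names, it does not reflect how Cloudera's optimizer invokes the procedure, and it assumes trusted inputs since values are interpolated directly.

```python
from typing import Dict, Optional


def rewrite_data_files_sql(catalog: str, table: str, strategy: str = "binpack",
                           sort_order: Optional[str] = None,
                           options: Optional[Dict[str, object]] = None) -> str:
    """Build a Spark SQL CALL for Iceberg's rewrite_data_files procedure."""
    args = [f"table => '{table}'", f"strategy => '{strategy}'"]
    if sort_order:
        # Z-order is requested through the sort strategy, e.g. 'zorder(c1, c2)'.
        args.append(f"sort_order => '{sort_order}'")
    if options:
        pairs = ", ".join(f"'{key}', '{value}'" for key, value in options.items())
        args.append(f"options => map({pairs})")
    return f"CALL {catalog}.system.rewrite_data_files({', '.join(args)})"


binpack_sql = rewrite_data_files_sql(
    "my_catalog", "db.orders",
    options={"target-file-size-bytes": 268435456, "min-input-files": 5},
)
zorder_sql = rewrite_data_files_sql(
    "my_catalog", "db.orders", strategy="sort",
    sort_order="zorder(customer_id, event_date)",
    options={"partial-progress.enabled": "true"},
)
print(binpack_sql)
print(zorder_sql)
```

In a Spark session with the Iceberg extensions configured, a string like this would be passed to `spark.sql(...)`.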

Automation: Policy-driven — schedule-based (cron) or event-based triggers. The optimizer prioritizes tables based on usage patterns and ROI. Snapshot expiration and orphan deletion available as separate policy types.

Trade-offs: The natural choice for CDP organizations. Spark-based execution is proven and configurable, but carries JVM overhead. Maintenance operations are independent policies without built-in sequencing.

7. Starburst Managed Iceberg Pipelines

Starburst Galaxy provides Live Table Maintenance through serverless operations. Starburst Enterprise (SEP) offers data maintenance through the platform interface with granular scheduling.

Compaction strategies: Leverages Trino's Iceberg connector for file consolidation. Scheduling via predefined frequencies or custom cron expressions. Galaxy handles execution serverlessly.

Additional maintenance: Snapshot expiration, orphan file removal, and statistics collection as separate tasks. SEP provides broad schema-level policies and targeted table-level configurations.

Trade-offs: Well-suited for Trino-native organizations. Compaction performance is bound to Trino's characteristics. No cross-engine telemetry — sort decisions reflect only Trino workloads. Maintenance tasks scheduled independently.

8. Dremio Automatic Optimization

Dremio provides automatic optimization for Iceberg tables through its Open Catalog and Dremio Cloud platform. Maintenance runs on a dedicated engine separate from query workloads, so compaction never impacts production queries.

Optimization operations: Five integrated operations — data file compaction (target 256 MB), row-level delete handling (position deletes in v2, deletion vectors in v3), clustering by specified keys, partition evolution alignment, and manifest file rewriting. All five are evaluated automatically when optimization triggers.

Automation: Automatic optimization runs periodically (default every 3 hours for optimization, every 24 hours for vacuum). Tables are processed incrementally in batches — Dremio determines batch size based on available engine resources.

Iceberg v3 support: Dremio supports Iceberg v3 deletion vectors — compact bitmaps stored in Puffin files that replace v2 position delete files, eliminating join overhead during reads.

Trade-offs: Bundles compaction with clustering, delete handling, partition evolution, and manifest optimization into a single automatic pass. The 256 MB target file size is not configurable per table. No cross-engine query telemetry. Best suited for organizations already running Dremio as their primary query layer.

9. Databricks Predictive Optimization

Databricks provides Predictive Optimization for Unity Catalog managed tables — including Managed Iceberg Tables accessible via the Iceberg REST Catalog API. The system continuously analyzes usage patterns to automatically determine when and how to run maintenance.

Operations: Three automatic operations — OPTIMIZE (compaction and Liquid Clustering), VACUUM (unreferenced file removal), and ANALYZE (statistics maintenance for query planning). For Managed Iceberg tables, both `PARTITIONED BY` and `CLUSTER BY` produce Liquid Clustered tables.

Automatic Liquid Clustering: Powered by Predictive Optimization, this feature automatically selects and updates clustering columns based on query workload analysis. Automatic Statistics (GA) improved query performance by 22% across the platform.

Scale: Throughout 2025, Predictive Optimization vacuumed exabytes of unreferenced data, compacted hundreds of petabytes, and onboarded millions of tables to Automatic Liquid Clustering.

Trade-offs: Deeply integrated with the Databricks ecosystem and Unity Catalog. Workload-driven clustering is a genuine differentiator for teams fully on Databricks. Requires Unity Catalog as the catalog layer for managed Iceberg. Query telemetry is scoped to Databricks workloads, so clustering decisions do not reflect queries from other engines.

Choosing the right approach

The right tool depends on where you are and where you are headed:

Single-cloud managed tables. If your tables live entirely within one vendor's managed Iceberg offering — Glue, S3 Tables, Snowflake, BigLake, Dremio, or Databricks — the built-in compaction is the lowest-friction starting point. It runs automatically, requires minimal configuration, and integrates natively.

Multi-catalog, multi-engine architectures. When Glue and REST catalogs coexist, and Trino, Snowflake, Athena, and Spark all query the same tables, the built-in options reach their limits. Each covers only its own catalog or engine. Compaction decisions informed by query patterns across all engines, coordinated with the full maintenance lifecycle, and enforced through fleet-wide policies require a dedicated optimization layer.

Cost-sensitive compaction at scale. If you compact hundreds of tables daily, the execution engine matters. JVM-based engines carry startup, GC, and idle-cluster overhead that compounds. A purpose-built Rust engine eliminates that overhead structurally — the cost-per-terabyte can drop by an order of magnitude.

Growing lakehouse, growing maintenance burden. When tables drift out of optimal file distribution between cron runs, orphan files accumulate silently, snapshot depth grows unchecked, and sort orders go stale — the question is not which compaction tool to pick. It is whether compaction should be one function within a broader control plane that handles the entire optimization surface continuously. When event-driven triggers, production query context, maintenance sequencing, layout simulation, a Rust engine, and fleet-wide policies all work together — compaction is no longer a chore. It is how your lakehouse scales.

LakeOps Control Plane — Modern Lakehouse Open Platform
Modern Lakehouse Open Platform: LakeOps sits between data and metadata sources (Iceberg Catalogs, AWS Glue, REST Catalogs, S3 Tables) and query engines (Spark, Trino, Flink, Snowflake, Athena, DuckDB, Databricks). Outcomes: Lower Cost, Faster Queries, Healthier Tables, Less Waste, AI-Ready Lakehouse.

Every solution in this comparison solves the small-files problem. The differentiators are what happens beyond file consolidation — how intelligently the sort strategy adapts to real query patterns, how compaction coordinates with other maintenance operations, how efficiently the engine uses compute, and how gracefully the system scales from ten tables to ten thousand.
