
Managed Iceberg in 2026: Autonomous Data Lake
Iceberg tables degrade silently — small files pile up, snapshots bloat metadata, and query latency creeps higher. A breakdown of the nine components every production data lake needs to stay healthy.
Connect catalogs, metadata, storage, and query engines through one autonomous control layer for optimization, governance, and cost-efficient scale.
Runs on your stack
| Operation | Table | Duration | Impact | Time | Status |
|---|---|---|---|---|---|
| Compact Data Files | customer_orders orders | 4s | 1.24 TB, 16 → 1 files | 57 minutes ago | SUCCESS |
| Expire Snapshots | payment_transactions payments | 27s | 8.2 TB | 4 hours ago | SUCCESS |
| Expire Snapshots | inventory_snapshots_20250702 warehouse | 3s | 2.1 TB | 4 hours ago | SUCCESS |
| Rewrite Manifests | raw_clickstream analytics | 1.9s | 3 → 1 manifests | 5 hours ago | SUCCESS |
| Compact Data Files | product_catalog products | 6m 11.3s | 3,008 → 1,256 files | 6 hours ago | SUCCESS |
Capabilities
Every layer of your lakehouse — from compaction and metadata to engines, observability, and policy enforcement — managed from one control plane.
[Chart with axes: Seconds, Cost ($)]
Compaction
Not just file merging — LakeOps analyzes which columns your queries actually filter, join, and group on, then organizes data files accordingly. The result: predicate pushdown and column pruning skip entire file groups, reducing I/O, query time, and compute cost across every engine reading the table. Powered by a Rust-based engine with Apache DataFusion — 95% faster and ~10x cheaper than Spark.
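As a rough illustration of the idea above, the sketch below ranks columns by how often queries filter, join, or group on them. The `query_log` structure and the `rank_sort_columns` helper are hypothetical and are not the LakeOps engine; a real system would extract column usage from engine query plans.

```python
from collections import Counter

def rank_sort_columns(query_log, top_n=2):
    """Rank columns by how many queries filter, join, or group on them."""
    counts = Counter()
    for query in query_log:
        # Count each column once per query, whichever clause it appears in.
        used = (
            set(query.get("filter", []))
            | set(query.get("join", []))
            | set(query.get("group", []))
        )
        counts.update(used)
    # Most frequent first; ties broken by name so the result is deterministic.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [column for column, _ in ranked[:top_n]]

# Hypothetical query log: each entry lists the columns a query touched.
query_log = [
    {"filter": ["event_date", "region"], "group": ["region"]},
    {"filter": ["event_date"], "join": ["customer_id"]},
    {"filter": ["event_date", "region"]},
]
sort_columns = rank_sort_columns(query_log)
```

The top-ranked columns are candidates for the table's sort order, so that files written during compaction carry tight min/max ranges on the columns queries filter by.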
Compaction: 38% small files, merging 970 → 87 at 512 MB target
Expire Snapshots: 154 snapshots, 62 past 30-day retention
Rewrite Manifests: 12 manifests, below threshold, waiting for compaction
Orphan Cleanup: 847 MB unreferenced, scheduled after expiration
Query patterns: event_date, region (top sort columns, Trino + Spark)
Improvement: 12.4× faster (avg query speed after optimization)
Cycle: Self-tuning (sort orders adapt as patterns change)
Maintenance
LakeOps continuously collects telemetry — file counts, partition health, snapshot velocity, delete ratios, manifest growth, and query patterns — and uses that signal to decide what to run, when, and in what order. Each operation's outcome feeds back into the next decision. The result is a coordinated maintenance loop that eliminates redundant work, adapts to changing workloads, and keeps every table in optimal shape without human intervention.
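A minimal sketch of how telemetry could drive the choice and order of operations. The `TableTelemetry` fields and both thresholds are assumptions made for illustration, not LakeOps internals; the ordering mirrors the status panel, where manifest rewrites wait for compaction and orphan cleanup follows expiration. The feedback step, where each outcome informs the next decision, is not modeled here.

```python
from dataclasses import dataclass

@dataclass
class TableTelemetry:
    small_file_ratio: float        # fraction of data files below the target size
    snapshots_past_retention: int  # snapshots older than the retention window
    manifest_count: int
    orphan_bytes: int              # bytes not referenced by any snapshot

# Hypothetical thresholds, chosen only for illustration.
SMALL_FILE_RATIO_LIMIT = 0.25
MANIFEST_LIMIT = 100

def plan_maintenance(t):
    """Return the operations to run for one table, in execution order."""
    plan = []
    if t.small_file_ratio > SMALL_FILE_RATIO_LIMIT:
        plan.append("compact_data_files")
    if t.snapshots_past_retention > 0:
        plan.append("expire_snapshots")
    # Manifests are rewritten after compaction, which changes the file set.
    if t.manifest_count > MANIFEST_LIMIT:
        plan.append("rewrite_manifests")
    # Orphan cleanup runs last, after expiration has dropped old references.
    if t.orphan_bytes > 0:
        plan.append("orphan_cleanup")
    return plan

# Values taken from the status panel: 38% small files, 62 snapshots past
# retention, 12 manifests, 847 MB unreferenced.
plan = plan_maintenance(TableTelemetry(0.38, 62, 12, 847_000_000))
```

With these inputs the manifest rewrite is skipped because 12 manifests sits below the assumed limit, which matches the "below threshold" status shown for that operation.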
Total Snapshots: 154
Retention: 30 days
Expired Today: 12
Storage Freed: 18.4 GB
Automated retention, expiration, and version history for every table. Set policies once — LakeOps expires old snapshots safely with full awareness of concurrent readers. Time-travel to any point, compare snapshots, and roll back without manual intervention.
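The retention rule can be sketched as a simple age filter. This is an illustrative helper under stated assumptions, not the LakeOps implementation: `snapshots_to_expire` and its `min_keep` parameter are hypothetical names, and the sketch does not model concurrent readers.

```python
from datetime import datetime, timedelta, timezone

def snapshots_to_expire(snapshots, now, retention_days=30, min_keep=1):
    """Return ids of snapshots older than the retention window.

    `snapshots` is a list of (snapshot_id, committed_at) tuples. The newest
    `min_keep` snapshots are always retained, whatever their age.
    """
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    protected = {snapshot_id for snapshot_id, _ in newest_first[:min_keep]}
    return [
        snapshot_id
        for snapshot_id, committed_at in snapshots
        if committed_at < cutoff and snapshot_id not in protected
    ]

now = datetime(2026, 1, 31, tzinfo=timezone.utc)
snapshots = [
    (1, now - timedelta(days=45)),
    (2, now - timedelta(days=31)),
    (3, now - timedelta(days=2)),
]
expired = snapshots_to_expire(snapshots, now)
```

Keeping at least the newest snapshot means a table that has not been written to for longer than the retention window still has a current state to read.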
Manifests: 487 → 12 (97.5% reduced)
Planner Latency: −2.1s (3.4s → 1.3s)
Puffin Stats: 100% (all columns indexed)
Rewrite Manifests: Consolidate manifest files for faster query planning
Rewrite Position Deletes: Optimize position delete files to improve read performance
Compute Statistics (Puffin): Calculate column stats to optimize query planning and pruning
Consolidate and rewrite manifest files so query planning stays fast at any scale. Fewer, consolidated manifests mean faster planning and fewer metadata scans for Trino, Spark, Flink, and every other engine that touches your lake. Includes position delete file optimization and Puffin statistics computation.
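One way to picture manifest consolidation is as a packing problem: many small manifests are grouped into fewer files of roughly a target size. The sketch below uses a simple next-fit pass over sizes sorted largest first; the `pack_manifests` helper and the 8 MB target are assumptions for illustration, not a description of how LakeOps groups manifests.

```python
def pack_manifests(sizes_mb, target_mb=8):
    """Group manifest sizes (in MB) into bins of at most `target_mb` each.

    Next-fit over sizes sorted largest first. A manifest larger than the
    target ends up alone in its own group.
    """
    groups = []
    current, current_size = [], 0
    for size in sorted(sizes_mb, reverse=True):
        # Close the current group when adding this manifest would overflow it.
        if current and current_size + size > target_mb:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups

groups = pack_manifests([1, 1, 2, 3, 5, 8])
```

Each resulting group would be rewritten as one manifest, so the planner opens three files here instead of six.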
Unreferenced: 847 MB (59,831 files)
Age Threshold: 7 days (safety window)
Last Cleanup: 74.8 GB (reclaimed 3 hrs ago)
Detect and safely remove files no longer referenced by any table. Eliminate storage drift from failed jobs, aborted commits, and legacy tables. Configurable retention thresholds, catalog-wide or per-table scope, and scheduled execution — reclaim capacity without risking data integrity.
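At its core, orphan detection is a set difference between what storage holds and what table metadata references, gated by an age threshold. The sketch below assumes both listings are already available in memory; the `find_orphans` helper and the example paths are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def find_orphans(stored_files, referenced_paths, now, min_age_days=7):
    """Return paths that no table references and that are past the safety window.

    `stored_files` maps each path to its last-modified time. Recent files are
    skipped because they may belong to a commit that is still in flight.
    """
    cutoff = now - timedelta(days=min_age_days)
    return sorted(
        path
        for path, modified in stored_files.items()
        if path not in referenced_paths and modified < cutoff
    )

now = datetime(2026, 1, 31, tzinfo=timezone.utc)
stored_files = {
    "s3://lake/orders/data/a.parquet": now - timedelta(days=30),
    "s3://lake/orders/data/b.parquet": now - timedelta(days=30),
    "s3://lake/orders/data/c.parquet": now - timedelta(days=1),
}
referenced_paths = {"s3://lake/orders/data/a.parquet"}
orphans = find_orphans(stored_files, referenced_paths, now)
```

Here `c.parquet` is unreferenced but only one day old, so it stays until it ages past the 7-day window.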
Queries Today: 12,485 (+12% from yesterday)
Avg Latency: 1.2s (−0.3s from last week)
Active Engines: 4 / 6 (all critical online)
Active Alerts: 3 (1 critical)
Alert: 312 partitions exceed file threshold (query scan amplified 8×)
Alert: Excessive manifests (487), planning overhead (planner latency +2.1s)
Alert: Small file ratio 38%, compaction recommended (S3 GET costs elevated)
Observability
Continuous analysis of table structure, file health, and optimization opportunities. Monitor active engines, query latency, throughput, and error rates. Cross-system telemetry from S3, GCS, ADLS, and every engine — view, alert, and act from one place.
Active Groups: 2 / 3 (routing traffic)
Engines in Use: 7 (8 registered)
Routed Volume: 7,285 (this period)
Query Routing
Connect Trino, Spark, Snowflake, Athena, DuckDB, and Flink to one routing layer. Intelligent query routing optimizes for cost, latency, or throughput automatically. Compare engine performance, monitor health, and add new engines — all without engine-specific scripts or duplicate tooling.
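A minimal sketch of objective-based routing, assuming per-query cost and latency estimates already exist for each engine. The `Engine` fields, the `route` helper, and all numbers are hypothetical; a real router would derive estimates from table statistics and engine telemetry, and a throughput objective could be added the same way.

```python
from dataclasses import dataclass

@dataclass
class Engine:
    name: str
    healthy: bool
    est_cost_usd: float    # estimated cost of running this query
    est_latency_s: float   # estimated time to first result

# One scoring function per objective; lower is better for both.
OBJECTIVES = {
    "cost": lambda engine: engine.est_cost_usd,
    "latency": lambda engine: engine.est_latency_s,
}

def route(engines, objective="cost"):
    """Pick the healthy engine that minimizes the chosen objective."""
    candidates = [engine for engine in engines if engine.healthy]
    if not candidates:
        raise RuntimeError("no healthy engine available")
    return min(candidates, key=OBJECTIVES[objective])

engines = [
    Engine("trino", healthy=True, est_cost_usd=0.40, est_latency_s=1.2),
    Engine("duckdb", healthy=True, est_cost_usd=0.05, est_latency_s=3.5),
    Engine("spark", healthy=False, est_cost_usd=0.02, est_latency_s=0.5),
]
cheapest = route(engines, "cost").name
fastest = route(engines, "latency").name
```

The unhealthy engine is never selected, even though its estimates are the best on both objectives.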
Catalogs: 4
Tables: 127
Columns: 1,842
ReadOnly: Blocks DDL and DML from agent sessions
CostEstimate: Rejects queries exceeding scan thresholds
PIIMask: Hashes sensitive columns before results reach the model
HumanApproval: Pauses high-stakes operations for review
Agentic AI
Built for AI and ML pipelines — optimized metadata, layout, and table structure for agents, feature stores, and autonomous data workflows. Run simulations on file layout changes before applying them. Fast, consistent access to table state and history so AI pipelines get the data they need without extra glue.
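Two of the guardrails listed above, ReadOnly and PIIMask, can be sketched in a few lines. This is a deliberately naive illustration and not the LakeOps implementation: the keyword check only inspects the first word of a statement (a production guard would parse the SQL), and the helper names and keyword list are assumptions.

```python
import hashlib

# Statement types treated as writes in this sketch (DDL and DML).
WRITE_KEYWORDS = {
    "INSERT", "UPDATE", "DELETE", "MERGE",
    "CREATE", "ALTER", "DROP", "TRUNCATE",
}

def read_only_guard(sql):
    """Return True if the statement may run in a read-only agent session."""
    words = sql.strip().split(None, 1)
    if not words:
        return False  # reject empty statements
    return words[0].upper() not in WRITE_KEYWORDS

def pii_mask(row, sensitive_columns):
    """Replace sensitive values with their SHA-256 hex digest."""
    return {
        column: hashlib.sha256(str(value).encode("utf-8")).hexdigest()
        if column in sensitive_columns
        else value
        for column, value in row.items()
    }

allowed = read_only_guard("SELECT region, count(*) FROM orders GROUP BY region")
blocked = read_only_guard("drop table orders")
masked = pii_mask({"email": "a@example.com", "region": "EU"}, {"email"})
```

Hashing rather than dropping the column keeps values joinable and countable for the agent without exposing the raw data.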
Total Policies: 5
Maintenance: 4
Configuration: 1
Governance
Define and enforce compaction, retention, orphan cleanup, and maintenance policies across catalogs and tables. Set schedules, priorities, and target scopes — then let LakeOps execute continuously. Every policy is auditable, versioned, and controllable with one toggle.
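To make schedules, priorities, and target scopes concrete, here is a sketch of how policies might be declared and resolved for a table. The `Policy` fields, the glob-style scope syntax, and the policy names are assumptions for illustration, not the LakeOps policy format.

```python
from dataclasses import dataclass
from fnmatch import fnmatch

@dataclass
class Policy:
    name: str
    operation: str
    scope: str       # glob over "catalog.namespace.table" (hypothetical syntax)
    schedule: str    # cron expression
    priority: int    # lower number runs first
    enabled: bool = True

def policies_for(table, policies):
    """Return the enabled policies whose scope matches `table`, by priority."""
    matching = [p for p in policies if p.enabled and fnmatch(table, p.scope)]
    return sorted(matching, key=lambda p: p.priority)

policies = [
    Policy("nightly-compaction", "compact_data_files", "prod.*", "0 2 * * *", 1),
    Policy("orders-retention", "expire_snapshots", "prod.orders.*", "0 3 * * *", 2),
    Policy("dev-cleanup", "orphan_cleanup", "dev.*", "0 4 * * 0", 3),
    Policy("paused-rewrite", "rewrite_manifests", "prod.*", "0 5 * * *", 0, enabled=False),
]
applied = [p.name for p in policies_for("prod.orders.customer_orders", policies)]
```

The `enabled` flag plays the role of the single toggle: the paused policy matches the table's scope but is never applied.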


Netflix spent years building an intelligent lakehouse — Polaris, Autotune, janitors, and Metacat. LakeOps lets every team build the same — and go beyond — in minutes.

How to route queries across Trino, Spark, DuckDB, Snowflake, Athena, and Flink on shared Iceberg tables — SQL routing proxy, dialect translation, and table-aware optimization.
Why LakeOps
From cost and performance to AI readiness — one platform that covers every dimension of lake operations.
A unified control plane that understands your lake end-to-end — tables, engines, queries, and costs — and acts on it autonomously.
Snapshot expiration, manifest rewrites, orphan cleanup, and metadata: automated, scheduled, and safe.
Rust-based engine that organizes data by real query patterns. Every cycle cuts I/O, shrinks file counts, and speeds up reads.
Route queries across Trino, Spark, Snowflake, and more, optimized for cost, latency, or throughput per workload.
Agent-native MCP interface, guardrails, and a self-optimizing lake ready for AI agents, feature stores, and autonomous pipelines.
Table health, engine metrics, cross-system telemetry, and policy-driven governance from one control plane.
Results
Benchmarks from production-grade tables across multiple engines and cloud providers.
Compaction speed
vs. Apache Spark on identical datasets
Query performance
After compaction + layout optimization
Cost savings
In compute & storage spend
Table health
Autonomous maintenance keeps every table optimized
Get in touch
Get a personalized walkthrough of the LakeOps platform with your data. Short call, your architecture.
No commitment · Typically 30 min