Autonomous Lakehouse Control Plane

Connect catalogs, metadata, storage, and query engines through one autonomous control layer for optimization, governance, and cost-efficient scale.

Lakehouse control plane diagram showing LakeOps orchestrating catalogs, S3 lake storage, metadata, and multiple query engines through observability, optimization, maintenance, routing, and AI guardrails

Runs on your stack

AWS
Azure
Google Cloud
Snowflake
Databricks
Apache Flink
Apache Hadoop
Apache Iceberg
Delta Lake
Spark
Lakekeeper
StarRocks

Full Iceberg benefits.
Snowflake-level ease.

Monitor health, run compaction and maintenance—across catalogs and engines—and manage policies from a single view.

LakeOps

Last 30 days Optimization Activity

Total Operations: 12,211 (last 90 days)
Query Speed: 12.4× (avg. acceleration across engines)
Cost Savings: $1,374,672 (saved in last 3 months)
CPU & Storage: -76% (last 90 days)
Data Optimized: 46.8 PB (last 30 days)

Key Metrics

Total Tables: 786 (tables in all catalogs)
Critical Tables: 70 (require immediate attention)
Warning Tables: 105 (should be addressed or auto-piloted)
Healthy Tables: 566 (tables in optimal state)
Total Data: 112.4 PB (total lake data size)

Recent Operations

Last 10 operations
Operation | Table | Duration | Impact | Time | Status
Compact Data Files | orders.customer_orders | 4s | 1.24 TB, 16 → 1 files | 57 minutes ago | SUCCESS
Expire Snapshots | payments.payment_transactions | 27s | 8.2 TB | 4 hours ago | SUCCESS
Rewrite Manifests | analytics.raw_clickstream | 1.9s | 3 → 1 manifests | 5 hours ago | SUCCESS
Compact Data Files | products.product_catalog | 6m 11.3s | 3,008 → 1,256 files | 6 hours ago | SUCCESS
Remove Orphan Files | analytics.user_sessions | 13m 6.9s | 59,831 files, 74.81 GB freed | 7 hours ago | SUCCESS

Table Status Distribution

Critical: 70 (9%)
Warning: 105 (13%)
Healthy: 566 (72%)

Top 5 Tables Needing Optimization

By Size
Table Name | Table Size | Status | Last Scan
analytics.raw_clickstream | 4.6 TB | CRITICAL | 2 hours ago
analytics.search_query_logs | 3.2 TB | CRITICAL | 3 hours ago
analytics.user_sessions | 1.9 TB | CRITICAL | 4 hours ago
orders.customer_orders | 1.24 TB | CRITICAL | 1 hour ago
payments.payment_transactions | 860 GB | CRITICAL | 2 hours ago

Minutes to value with no risk

1. Connect & collect telemetry: Apache Iceberg, AWS, Snowflake, Trino
2. Manual or autonomous management: Manual, Autonomous
3. Operations run & optimize: Compaction, Snapshots, Orphan cleanup, Manifests & metadata
4. Observability & governance: Metrics, Health, Agents, Routing, Logs, Policies

No vendor lock-in
No code / infra changes
No data changes

Capabilities

Managed Data. Optimized Ops.
Agentic AI ready.

Every layer of your lakehouse — from compaction and metadata to engines, observability, and policy enforcement — managed from one control plane.

Compaction Duration (seconds): S3 Tables 6300s, Apache Spark 1612s, LakeOps 221s, LakeOps (Sort) 780s

[Chart: Cost of Compaction, cost in $, comparing S3 Tables, Apache Spark, LakeOps, and LakeOps (Sort)]

Compaction

Intelligent Compaction

Rust-based compaction engine for Iceberg — analyzes query patterns and access frequency to optimize file layout at scale. Run more compactions in less time with minimal resource footprint, so your lake stays performant without blocking writes or queries.

  • 95% faster compaction than Apache Spark on identical datasets, using a Rust engine and AI
  • Organize data by real query usage to cut IO
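
As a rough illustration of the kind of decision a compaction planner makes, the sketch below groups small data files into rewrite batches that approach a target file size. It is a generic first-fit-decreasing bin-packing example in Python, not the LakeOps Rust engine; `plan_compaction`, `target_bytes`, and `small_file_ratio` are hypothetical names chosen for this example.

```python
from typing import Dict, List


def plan_compaction(file_sizes: Dict[str, int],
                    target_bytes: int = 512 * 1024 * 1024,
                    small_file_ratio: float = 0.75) -> List[List[str]]:
    """Group small files into batches of at most target_bytes each.

    Files at or above small_file_ratio * target_bytes are left alone.
    Batches that end up with a single file are dropped, because rewriting
    one file on its own does not reduce the file count.
    """
    threshold = target_bytes * small_file_ratio
    small = sorted(
        ((path, size) for path, size in file_sizes.items() if size < threshold),
        key=lambda item: item[1],
        reverse=True,  # largest first: first-fit decreasing
    )
    bins: List[List[str]] = []
    free: List[int] = []  # remaining capacity of each bin, in bytes
    for path, size in small:
        for i, capacity in enumerate(free):
            if size <= capacity:
                bins[i].append(path)
                free[i] -= size
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([path])
            free.append(target_bytes - size)
    return [batch for batch in bins if len(batch) > 1]
```

A production planner would also weigh partition boundaries, delete files, and sort order; this sketch only shows the size-based grouping step.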
customer_orders: Snapshots (154)

Snapshot ID | Timestamp | Operation | Manifests | Added
6847201938742 | Mar 15, 2026 12:18 PM | Append | 12 | +4
6847201938740 | Mar 15, 2026 11:45 AM | Append | 11 | +2
6847201938738 | Mar 15, 2026 10:30 AM | Append | 10 | +6
6847201938736 | Mar 14, 2026 08:00 PM | Append | 8 | +3

Version History & Time Travel

Total Snapshots: 154
Retention Policy: 30 days
Latest Operation: Append

Management

Snapshot Lifecycle Management

Automated retention, expiration, and version history for every table. Set policies once — LakeOps expires old snapshots safely with full awareness of concurrent readers. Time-travel to any point, compare snapshots, and roll back without manual intervention.
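
To make the retention idea concrete, here is a minimal Python sketch of how a policy might pick snapshots to expire: anything older than the retention window is a candidate, unless it is among the newest snapshots or still in use by a reader. The `Snapshot` type, `snapshots_to_expire`, `min_to_keep`, and `in_use` are assumptions for illustration and do not describe how LakeOps tracks concurrent readers.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, List


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    committed_at: datetime


def snapshots_to_expire(snapshots: List[Snapshot],
                        now: datetime,
                        retention: timedelta = timedelta(days=30),
                        min_to_keep: int = 1,
                        in_use: Iterable[int] = ()) -> List[int]:
    """Return IDs of snapshots older than the retention window.

    The newest min_to_keep snapshots and any snapshot listed in in_use
    (for example, one pinned by a running query) are never returned.
    """
    newest_first = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)
    protected = {s.snapshot_id for s in newest_first[:min_to_keep]} | set(in_use)
    cutoff = now - retention
    return [s.snapshot_id for s in newest_first
            if s.committed_at < cutoff and s.snapshot_id not in protected]
```

The 30-day default mirrors the retention policy shown in the panel above; the protection rules are a simplification of what a safe expiry has to check.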

Metadata Optimization: 3 operations

Rewrite Manifests

Consolidate manifest files for faster query planning

Rewrite Position Deletes

Optimize position delete files to improve read performance

Compute Statistics (Puffin)

Calculate column stats to optimize query planning and pruning

Manifest & Metadata Optimization

Consolidate and rewrite manifest files so query planning stays fast at any scale. Smaller manifests mean faster planning and fewer metadata scans for Trino, Spark, Flink, and every engine that touches your lake. Includes position delete file optimization and Puffin statistics computation.
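
The effect of manifest rewriting on planning can be shown with a toy model. In the Python sketch below, each manifest is a list of (partition, file) entries; regrouping entries by partition means a query filtered to one partition has fewer manifests to open. `rewrite_manifests` and `manifests_to_scan` are hypothetical helpers for this example, and real engines prune using partition summaries stored in the manifest list rather than by reading every entry.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Entry = Tuple[str, str]  # (partition value, data file path)


def rewrite_manifests(manifests: List[List[Entry]]) -> List[List[Entry]]:
    """Regroup entries so that each output manifest covers one partition."""
    by_partition: Dict[str, List[Entry]] = defaultdict(list)
    for manifest in manifests:
        for entry in manifest:
            by_partition[entry[0]].append(entry)
    return [by_partition[p] for p in sorted(by_partition)]


def manifests_to_scan(manifests: List[List[Entry]], partition: str) -> int:
    """Count manifests that contain at least one entry for the partition."""
    return sum(1 for m in manifests if any(p == partition for p, _ in m))
```

With three interleaved manifests that each touch the same day, a filter on that day has to open all three before the rewrite and only one after it.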

Remove Orphan Files Policy

Clean up files no longer referenced by any table

1. Basic Information: name and priority (e.g. Production Orphan Cleanup, 1). Status: Enabled
2. Target Scope: where this policy applies. Select a catalog. Leave empty for entire catalog.
3. Execution Schedule: when the policy runs. 0 0 * * * (at 12:00 AM daily)
4. Orphan File Configuration: how orphans are identified. 7 (unreferenced files older than this are removed)

Orphan File Detection & Cleanup

Detect and safely remove files no longer referenced by any table. Eliminate storage drift from failed jobs, aborted commits, and legacy tables. Configurable retention thresholds, catalog-wide or per-table scope, and scheduled execution — reclaim capacity without risking data integrity.
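
Conceptually, orphan detection is a set difference with an age guard: a file is an orphan only if no table metadata references it and it is older than the configured threshold, so files from in-flight commits are not removed. The Python sketch below illustrates that rule under stated assumptions: `find_orphans` is a hypothetical helper, and the 7-day default assumes the threshold in the policy form is measured in days.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def find_orphans(storage_listing: Dict[str, datetime],
                 referenced: Set[str],
                 now: datetime,
                 older_than: timedelta = timedelta(days=7)) -> List[str]:
    """Return unreferenced paths whose last-modified time is before the cutoff.

    storage_listing maps each object path to its last-modified time;
    referenced is the set of paths reachable from table metadata.
    """
    cutoff = now - older_than
    return sorted(path for path, modified in storage_listing.items()
                  if path not in referenced and modified < cutoff)
```

The age guard is what keeps a cleanup run from deleting a file that a writer has uploaded but not yet committed.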

Query Routing: 4 engines connected
Auto-optimize

Engine Load Distribution

Trino 42% · Snowflake 31% · AWS Athena 18% · DuckDB 9%

Engine | Status | Queries | Avg
Trino | Active | 256 | 1.8s
Snowflake | Active | 192 | 2.1s
AWS Athena | Active | 128 | 2.3s
DuckDB | Active | 64 | 0.5s

Engines and AIs

Multi-Engine Query Routing

Connect Trino, Spark, Snowflake, Athena, DuckDB, and Flink to one routing layer. Intelligent query routing optimizes for cost, latency, or throughput automatically. Compare engine performance, monitor health, and add new engines — all without engine-specific scripts or duplicate tooling.
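
One way to picture objective-based routing is as a minimum over healthy engines for the chosen metric. The Python sketch below covers two of the objectives named above, latency and cost; throughput is left out for brevity. `EngineStats`, `route`, and the per-query cost field are hypothetical and are not taken from the LakeOps API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class EngineStats:
    name: str
    avg_latency_s: float
    cost_per_query_usd: float  # hypothetical metric for this example
    healthy: bool = True


def route(engines: List[EngineStats], optimize_for: str = "latency") -> str:
    """Return the name of the healthy engine that minimizes the objective."""
    objectives = {
        "latency": lambda e: e.avg_latency_s,
        "cost": lambda e: e.cost_per_query_usd,
    }
    if optimize_for not in objectives:
        raise ValueError(f"unknown objective: {optimize_for}")
    candidates = [e for e in engines if e.healthy]
    if not candidates:
        raise RuntimeError("no healthy engine available")
    return min(candidates, key=objectives[optimize_for]).name
```

A real router would also need to consider which engines can run a given statement and how loaded each one is; the sketch only shows the objective switch.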

Routing Endpoints

Groups: 4 (configured endpoints)
Active: 3 (passing traffic)
Inactive: 1 (paused groups)
Endpoints: 8 (publicly available)

All routing groups

Storefront Analytics (active)
Interactive customer behavior and funnel exploration workloads.
storefront-analytics.lakeops.dev
Engines: Trino, DuckDB
Query types: SELECT, AGGREGATE
Updated Mar 16, 2026 03:20 PM · High

Checkout Transactions (active)
Operational writes and near-real-time checkout event ingestion.
checkout-transactions.lakeops.dev
Engines: Snowflake, StarRocks
Query types: INSERT, UPDATE, MERGE
Updated Mar 16, 2026 02:10 PM · Medium

Catalog ETL (inactive)
Nightly catalog transformations and product availability sync jobs.
catalog-etl.lakeops.dev
Engines: AWS Athena, Spark
Query types: INSERT, DELETE
Updated Mar 15, 2026 11:42 PM · Low

Executive Reporting (active)
Scheduled BI reports and financial dashboard refresh workloads.
executive-reporting.lakeops.dev
Engines: Snowflake, Trino
Query types: SELECT, JOIN
Updated Mar 16, 2026 01:07 PM · Medium
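
The routing groups listed above can be read as a small lookup table from statement type to endpoint. The Python sketch below encodes the four groups and returns the active endpoints that accept a statement, matching only on its leading keyword. `RoutingGroup` and `eligible_endpoints` are hypothetical names, and the leading-keyword match is a simplification: types such as AGGREGATE and JOIN would need real SQL analysis.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Sequence


@dataclass(frozen=True)
class RoutingGroup:
    endpoint: str
    query_types: FrozenSet[str]
    active: bool


# Values copied from the routing group cards above.
GROUPS = (
    RoutingGroup("storefront-analytics.lakeops.dev",
                 frozenset({"SELECT", "AGGREGATE"}), True),
    RoutingGroup("checkout-transactions.lakeops.dev",
                 frozenset({"INSERT", "UPDATE", "MERGE"}), True),
    RoutingGroup("catalog-etl.lakeops.dev",
                 frozenset({"INSERT", "DELETE"}), False),
    RoutingGroup("executive-reporting.lakeops.dev",
                 frozenset({"SELECT", "JOIN"}), True),
)


def eligible_endpoints(sql: str,
                       groups: Sequence[RoutingGroup] = GROUPS) -> List[str]:
    """Return active endpoints whose query types include the leading keyword."""
    words = sql.split()
    if not words:
        return []
    keyword = words[0].upper()
    return [g.endpoint for g in groups if g.active and keyword in g.query_types]
```

Because Catalog ETL is inactive, a DELETE statement currently matches no endpoint in this model, which is the kind of gap a routing dashboard would surface.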

Agentic AI Enablement

Built for AI and ML pipelines — optimized metadata, layout, and table structure for agents, feature stores, and autonomous data workflows. Run simulations on file layout changes before applying them. Fast, consistent access to table state and history so AI pipelines get the data they need without extra glue.

Monitoring: All systems operational

Active Engines: 4/6
Avg Latency: 1.2s (↓ 15% vs yesterday)
Throughput: 2.4K q/h (↑ 8% vs last week)
Error Rate: 0.02% (↓ vs last week)

Recent Alerts

Table optimization completed for orders.customer_orders

2 min ago

Snapshot expiration policy executed — 12 snapshots cleaned

15 min ago

High query latency detected on Trino endpoint

1 hr ago

Orphan file cleanup completed — 847 MB reclaimed

3 hrs ago

Observability

Full Lake Observability

Continuous analysis of table structure, file health, and optimization opportunities. Monitor active engines, query latency, throughput, and error rates. Cross-system telemetry from S3, GCS, ADLS, and every engine — view, alert, and act from one place.
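
As one example of what a file-health check could look like, the Python sketch below labels a table CRITICAL, WARNING, or HEALTHY from the share of undersized data files. The status names match the dashboard; `classify_table` and every threshold in it are assumptions made for illustration, not LakeOps's actual scoring.

```python
from typing import Sequence


def classify_table(file_sizes: Sequence[int],
                   target_bytes: int = 512 * 1024 * 1024,
                   warning_ratio: float = 0.3,
                   critical_ratio: float = 0.6) -> str:
    """Classify table health by the fraction of files under a quarter of target size."""
    if not file_sizes:
        return "HEALTHY"
    small = sum(1 for size in file_sizes if size < target_bytes // 4)
    ratio = small / len(file_sizes)
    if ratio >= critical_ratio:
        return "CRITICAL"
    if ratio >= warning_ratio:
        return "WARNING"
    return "HEALTHY"
```

A fuller health model would also look at snapshot count, manifest count, and delete-file buildup; small-file ratio is just the simplest signal to compute.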

Policies

Manage all policies, including configuration, maintenance, delete, and truncate policies.

Policy | Type | Next
Orders compaction | Manifests | Mar 16, 02:00
Catalog manifest rewrite | Manifests |
Payments orphan cleanup | Orphan Files | Mar 16, 03:00
Warehouse snapshot expiry | Snapshots | Mar 16, 01:00
Loyalty stats refresh | Config |

Governance

Governance and Policies

Define and enforce compaction, retention, orphan cleanup, and maintenance policies across catalogs and tables. Set schedules, priorities, and target scopes — then let LakeOps execute continuously. Every policy is auditable, versioned, and controllable with one toggle.
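
A policy with a schedule, priority, target scope, and on/off toggle maps naturally onto a small record type. The Python sketch below shows one possible shape and a helper that resolves which enabled policies apply to a table, in priority order. `Policy`, `policies_for_table`, the field names, and the convention that a lower priority number runs first are all assumptions for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Policy:
    name: str
    kind: str                    # e.g. "Orphan Files", "Snapshots", "Manifests"
    catalog: str
    table: Optional[str] = None  # None means the entire catalog
    cron: str = "0 0 * * *"      # daily at 12:00 AM
    priority: int = 1            # assumption: lower number runs first
    enabled: bool = True         # the single on/off toggle


def policies_for_table(policies: List[Policy],
                       catalog: str,
                       table: str) -> List[Policy]:
    """Return enabled policies whose scope covers the table, by priority."""
    matching = [p for p in policies
                if p.enabled and p.catalog == catalog and p.table in (None, table)]
    return sorted(matching, key=lambda p: p.priority)
```

Keeping policies as plain, immutable records is one way to make them easy to version and audit, since every change produces a new value that can be stored and compared.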

Results

Measured impact on real Iceberg workloads

Benchmarks from production-grade tables across multiple engines and cloud providers.

Compaction speed: 95% faster (vs. Apache Spark on identical datasets)
Query performance: 12× faster (after compaction + layout optimization)
Cost savings: 80% reduction (in compute & storage spend)

TPC-DS benchmark suite · Production Iceberg tables · Multi-cloud, multi-engine

Get in touch

See LakeOps in action

Get a personalized walkthrough of the LakeOps platform with your data. Short call, your architecture.

No commitment · Typically 30 min