Autonomous Lakehouse Control Plane

Connect catalogs, metadata, storage, and query engines through one autonomous control layer for optimization, governance, and cost-efficient scale.

Get a Demo

Explore solutions

Lakehouse control plane diagram showing LakeOps orchestrating catalogs, S3 lake storage, metadata, and multiple query engines through observability, optimization, maintenance, routing, and AI guardrails

Runs on your stack

Full Iceberg benefits.
Snowflake-level ease.

Monitor health, run compaction and maintenance across catalogs and engines, and manage policies from a single view.

LakeOps

Last 30 days Optimization Activity

Total Operations

12,211

Last 90 days

Query Speed

12.4×

Avg. acceleration across engines

Cost Savings

$1,374,672

Saved in last 3 months

CPU & Storage

-76%

Last 90 days

Data Optimized

46.8 PB

Last 30 days

Key Metrics

Total Tables

786

Tables in all catalogs

Critical Tables

70

Require immediate attention

Warning Tables

105

Should be addressed or auto-piloted

Healthy Tables

566

Tables in optimal state

Total Data

112.4 PB

Total lake data size

Recent Operations

Last 10 operations

Operation	Table	Duration	Impact	Time	Status
Compact Data Files	orders.customer_orders	4s	1.24 TB, 16 → 1 files	57 minutes ago	SUCCESS
Expire Snapshots	payments.payment_transactions	27s	8.2 TB	4 hours ago	SUCCESS
Rewrite Manifests	analytics.raw_clickstream	1.9s	3 → 1 manifests	5 hours ago	SUCCESS
Compact Data Files	products.product_catalog	6m 11.3s	3,008 → 1,256 files	6 hours ago	SUCCESS
Remove Orphan Files	analytics.user_sessions	13m 6.9s	59,831 files, 74.81 GB freed	7 hours ago	SUCCESS

Table Status Distribution

Critical: 70 (9%)

Warning: 105 (13%)

Healthy: 566 (72%)

Top 5 Tables Needing Optimization

By Size

Table Name	Table Size	Status	Last Scan
analytics.raw_clickstream	4.6 TB	CRITICAL	2 hours ago
analytics.search_query_logs	3.2 TB	CRITICAL	3 hours ago
analytics.user_sessions	1.9 TB	CRITICAL	4 hours ago
orders.customer_orders	1.24 TB	CRITICAL	1 hour ago
payments.payment_transactions	860 GB	CRITICAL	2 hours ago

Minutes to value with no risk

Connect & collect telemetry

Manual or autonomous management

Manual

Autonomous

Operations run & optimize

Compaction

Snapshots

Orphan cleanup

Manifests & metadata

Observability & governance

Metrics

Health

Agents

Routing

Logs

Policies

No vendor lock-in

No code / infra changes

No data changes

Get a Demo

Get in touch ->

Capabilities

Managed Data. Optimized Ops.
Agentic AI ready.

Every layer of your lakehouse — from compaction and metadata to engines, observability, and policy enforcement — managed from one control plane.

[Chart: Compaction Duration, in seconds, for S3 Tables, Apache Spark, LakeOps, and LakeOps (Sort). Bar labels as extracted: 6300s, 1612s, 221s, 780s.]

[Chart: Cost of Compaction, cost ($) on a 0 to 100% scale, for S3 Tables, Apache Spark, LakeOps, and LakeOps (Sort).]

Compaction

Intelligent Compaction

Rust-based compaction engine for Iceberg — analyzes query patterns and access frequency to optimize file layout at scale. Run more compactions in less time with minimal resource footprint, so your lake stays performant without blocking writes or queries.

95% faster compaction than Apache Spark on identical datasets, using Rust and AI
Organize data by real query usage to cut IO
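
As a rough illustration of the kind of decision a compaction planner makes, the sketch below groups small data files into rewrite batches near a target size using first-fit-decreasing bin packing. It is not the LakeOps engine, which the page describes as Rust-based and query-aware; the function name `plan_compaction`, the 512 MB target, and the sample file sizes are all hypothetical.

```python
from typing import List


def plan_compaction(file_sizes_mb: List[int], target_mb: int = 512) -> List[List[int]]:
    """Group small files into batches whose combined size stays <= target_mb."""
    # Files already at or above the target are left untouched.
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)

    batches: List[List[int]] = []
    for size in small:
        # First fit: put the file in the first batch that still has room.
        for batch in batches:
            if sum(batch) + size <= target_mb:
                batch.append(size)
                break
        else:
            batches.append([size])

    # A batch with a single file would not reduce the file count, so skip it.
    return [b for b in batches if len(b) > 1]


# Hypothetical file sizes in MB; the 600 MB file is already large enough.
plan = plan_compaction([600, 300, 200, 200, 100, 50, 10])
print(plan)
```

Each inner list is one rewrite batch, so seven input files would end up as three: the untouched 600 MB file plus two merged outputs.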

customer_orders: Snapshots (154)

SNAPSHOT ID	TIMESTAMP	OPERATION	MANIFESTS	ADDED
6847201938742	Mar 15, 2026 12:18 PM	Append	12	+4
6847201938740	Mar 15, 2026 11:45 AM	Append	11	+2
6847201938738	Mar 15, 2026 10:30 AM	Append	10	+6
6847201938736	Mar 14, 2026 08:00 PM	Append	8	+3

Version History & Time Travel

Total Snapshots

154

Retention Policy

30 days

Latest Operation

Append

Management

Snapshot Lifecycle Management

Automated retention, expiration, and version history for every table. Set policies once — LakeOps expires old snapshots safely with full awareness of concurrent readers. Time-travel to any point, compare snapshots, and roll back without manual intervention.
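
To make the retention idea concrete, here is a minimal sketch of how a time-based expiry rule could pick snapshots to remove. It assumes a 30-day retention window like the one shown in the panel above and always protects the most recent snapshot; the `Snapshot` type, `snapshots_to_expire`, and the sample IDs are hypothetical, and the sketch does not model the concurrent-reader awareness the page describes.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Snapshot(NamedTuple):
    snapshot_id: int
    committed_at: datetime


def snapshots_to_expire(
    snapshots: List[Snapshot],
    now: datetime,
    retention_days: int = 30,
    min_keep: int = 1,
) -> List[int]:
    """Return IDs of snapshots older than the retention window.

    The newest `min_keep` snapshots are never expired, so the table
    always keeps a current state even if nothing was written recently.
    """
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(snapshots, key=lambda s: s.committed_at, reverse=True)
    protected = {s.snapshot_id for s in newest_first[:min_keep]}
    return [
        s.snapshot_id
        for s in newest_first
        if s.committed_at < cutoff and s.snapshot_id not in protected
    ]


now = datetime(2026, 3, 16)
history = [
    Snapshot(1, datetime(2026, 1, 1)),   # older than 30 days
    Snapshot(2, datetime(2026, 2, 20)),  # inside the window
    Snapshot(3, datetime(2026, 3, 15)),  # current snapshot
]
expired = snapshots_to_expire(history, now)
print(expired)
```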

Metadata Optimization: 3 operations

Rewrite Manifests

Consolidate manifest files for faster query planning

Rewrite Position Deletes

Optimize position delete files to improve read performance

Compute Statistics (Puffin)

Calculate column stats to optimize query planning and pruning

Manifest & Metadata Optimization

Consolidate and rewrite manifest files so query planning stays fast at any scale. Smaller manifests mean faster planning and fewer metadata scans for Trino, Spark, Flink, and every engine that touches your lake. Includes position delete file optimization and Puffin statistics computation.

Remove Orphan Files Policy

Clean up files no longer referenced by any table

Basic Information

Name and priority

Policy Name

e.g. Production Orphan Cleanup

Priority

Status: Enabled

Target Scope

Where this policy applies

Catalog

Select a catalog

Namespace / Tables (optional)

Leave empty for entire catalog

Execution Schedule

When the policy runs

Cron Expression

0 0 * * * (At 12:00 AM daily)

Orphan File Configuration

How orphans are identified

Older Than (Days)

7 (Unreferenced files older than this are removed)

Orphan File Detection & Cleanup

Detect and safely remove files no longer referenced by any table. Eliminate storage drift from failed jobs, aborted commits, and legacy tables. Configurable retention thresholds, catalog-wide or per-table scope, and scheduled execution — reclaim capacity without risking data integrity.
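
The detection rule described here can be sketched in a few lines: a file is a cleanup candidate only if no table references it and it is older than the configured threshold. This is an illustrative sketch, not the LakeOps implementation; `find_orphans`, the sample paths, and the timestamps are hypothetical, and the 7-day default mirrors the "Older Than (Days)" value shown in the policy form.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Set


def find_orphans(
    storage_files: Dict[str, datetime],
    referenced: Set[str],
    now: datetime,
    older_than_days: int = 7,
) -> List[str]:
    """Return paths that are unreferenced and older than the age threshold.

    The age check keeps files from in-flight writes out of the result:
    a file written moments ago may simply not be committed yet.
    """
    cutoff = now - timedelta(days=older_than_days)
    return sorted(
        path
        for path, modified in storage_files.items()
        if path not in referenced and modified < cutoff
    )


now = datetime(2026, 3, 16)
files = {
    "data/a.parquet": datetime(2026, 2, 1),   # referenced by a table
    "data/b.parquet": datetime(2026, 2, 1),   # unreferenced and old
    "data/c.parquet": datetime(2026, 3, 15),  # unreferenced but recent
}
orphans = find_orphans(files, referenced={"data/a.parquet"}, now=now)
print(orphans)
```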

Query Routing: 4 engines connected

Auto-optimize

Engine Load Distribution

Trino 42%, Snowflake 31%, AWS Athena 18%, DuckDB 9%

Trino

Active

Queries: 256, Avg: 1.8s

Snowflake

Active

Queries: 192, Avg: 2.1s

AWS Athena

Active

Queries: 128, Avg: 2.3s

DuckDB

Active

Queries: 64, Avg: 0.5s

Optimize for:

Engines and AIs

Multi-Engine Query Routing

Connect Trino, Spark, Snowflake, Athena, DuckDB, and Flink to one routing layer. Intelligent query routing optimizes for cost, latency, or throughput automatically. Compare engine performance, monitor health, and add new engines — all without engine-specific scripts or duplicate tooling.
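
One simple way to picture objective-based routing is to choose, among active engines, the one that minimizes the selected metric. The sketch below does that for latency and cost only. It is an assumption-laden illustration rather than the LakeOps router: `EngineStats` and `choose_engine` are hypothetical names, the latencies are the averages shown in the panel above, and the per-query costs are invented for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EngineStats:
    name: str
    avg_latency_s: float
    cost_per_query_usd: float  # hypothetical figures, not from the page
    active: bool = True


def choose_engine(engines: Iterable[EngineStats], objective: str = "latency") -> str:
    """Pick the active engine with the lowest value for the chosen objective."""
    metrics = {
        "latency": lambda e: e.avg_latency_s,
        "cost": lambda e: e.cost_per_query_usd,
    }
    if objective not in metrics:
        raise ValueError(f"unknown objective: {objective!r}")

    candidates = [e for e in engines if e.active]
    if not candidates:
        raise RuntimeError("no active engines to route to")
    return min(candidates, key=metrics[objective]).name


engines = [
    EngineStats("Trino", 1.8, 0.004),
    EngineStats("Snowflake", 2.1, 0.012),
    EngineStats("AWS Athena", 2.3, 0.002),
    EngineStats("DuckDB", 0.5, 0.003),
]
print(choose_engine(engines, "latency"))
print(choose_engine(engines, "cost"))
```

A real router would also need to weigh throughput, current load, and which query types each endpoint accepts; those are left out to keep the sketch short.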

Routing Endpoints

Groups

Configured endpoints

Active

Passing traffic

Inactive

Paused groups

Endpoints

Publicly available

All routing groups

Storefront Analytics (active)

Interactive customer behavior and funnel exploration workloads.

storefront-analytics.lakeops.dev

Engines: Trino, DuckDB

Query types: SELECT, AGGREGATE

Updated Mar 16, 2026 03:20 PM · High

Checkout Transactions (active)

Operational writes and near-real-time checkout event ingestion.

checkout-transactions.lakeops.dev

Engines: Snowflake, StarRocks

Query types: INSERT, UPDATE, MERGE

Updated Mar 16, 2026 02:10 PM · Medium

Catalog ETL (inactive)

Nightly catalog transformations and product availability sync jobs.

catalog-etl.lakeops.dev

Engines: AWS Athena, Spark

Query types: INSERT, DELETE

Updated Mar 15, 2026 11:42 PM · Low

Executive Reporting (active)

Scheduled BI reports and financial dashboard refresh workloads.

executive-reporting.lakeops.dev

Engines: Snowflake, Trino

Query types: SELECT, JOIN

Updated Mar 16, 2026 01:07 PM · Medium

Agentic AI Enablement

Built for AI and ML pipelines — optimized metadata, layout, and table structure for agents, feature stores, and autonomous data workflows. Run simulations on file layout changes before applying them. Fast, consistent access to table state and history so AI pipelines get the data they need without extra glue.

Monitoring: All systems operational

Active Engines

4/6

Avg Latency

1.2s

↓ 15% vs yesterday

Throughput

2.4K q/h

↑ 8% vs last week

Error Rate

0.02%

↓ vs last week

Recent Alerts

Table optimization completed for orders.customer_orders

2 min ago

Snapshot expiration policy executed — 12 snapshots cleaned

15 min ago

High query latency detected on Trino endpoint

1 hr ago

Orphan file cleanup completed — 847 MB reclaimed

3 hrs ago

Observability

Full Lake Observability

Continuous analysis of table structure, file health, and optimization opportunities. Monitor active engines, query latency, throughput, and error rates. Cross-system telemetry from S3, GCS, ADLS, and every engine — view, alert, and act from one place.

Policies

Manage all policies including configuration, maintenance, delete, and truncate policies.

All Types

All Status

Policy	Type	Next
Orders compaction	Manifests	Mar 16, 02:00
Catalog manifest rewrite	Manifests	—
Payments orphan cleanup	Orphan Files	Mar 16, 03:00
Warehouse snapshot expiry	Snapshots	Mar 16, 01:00
Loyalty stats refresh	Config	—

Governance

Governance and Policies

Define and enforce compaction, retention, orphan cleanup, and maintenance policies across catalogs and tables. Set schedules, priorities, and target scopes — then let LakeOps execute continuously. Every policy is auditable, versioned, and controllable with one toggle.
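
For readers who think in data structures, a policy with a schedule, priority, target scope, and on/off toggle might be modeled roughly as below. This is a hypothetical sketch, not the LakeOps policy schema: the `Policy` fields, the `policies_for` helper, the convention that a higher priority number runs first, and the sample policies are all assumptions. The empty-scope rule follows the form text "Leave empty for entire catalog".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Policy:
    name: str
    kind: str                                        # e.g. "orphan_files", "snapshots"
    catalog: str
    tables: List[str] = field(default_factory=list)  # empty means entire catalog
    cron: str = "0 0 * * *"                          # daily at 12:00 AM
    priority: int = 0                                # assumption: higher runs first
    enabled: bool = True                             # the single on/off toggle


def policies_for(policies: List[Policy], catalog: str, table: str) -> List[Policy]:
    """Return enabled policies whose scope covers the table, highest priority first."""
    matches = [
        p
        for p in policies
        if p.enabled and p.catalog == catalog and (not p.tables or table in p.tables)
    ]
    return sorted(matches, key=lambda p: p.priority, reverse=True)


policies = [
    Policy("Payments orphan cleanup", "orphan_files", "prod", ["payments.payment_transactions"], priority=1),
    Policy("Warehouse snapshot expiry", "snapshots", "prod", priority=5),
    Policy("Paused cleanup", "orphan_files", "prod", enabled=False, priority=9),
]
names = [p.name for p in policies_for(policies, "prod", "payments.payment_transactions")]
print(names)
```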

Why LakeOps

The control plane
for your lakehouse

From cost and performance to AI readiness — one platform that covers every dimension of lake operations.

Managed Iceberg

Autonomous compaction, snapshots, manifests, and orphan cleanup for every table.

Explore Managed Iceberg

Agentic AI readiness

Agent-native MCP interface, guardrails, and a self-optimizing lake for AI workloads.

Explore AI enablement

Cost reduction

Eliminate small files, orphans, and over-provisioned compute automatically.

Explore cost optimization

Query performance

Adaptive data layout, lean manifests, and optimized file sizes for faster reads.

Explore performance impact

Multi-engine routing

Route queries across Trino, Spark, Snowflake, and more — optimized per workload.

Explore routing

Lakehouse observability

Table health, engine metrics, and cross-system telemetry from one control plane.

Explore observability

Explore use-cases

Results

Measured impact on
real Iceberg workloads

Benchmarks from production-grade tables across multiple engines and cloud providers.

Compaction speed

95% faster

vs. Apache Spark on identical datasets

Spark

LakeOps

+ Sort

Query performance

12× faster

After compaction + layout optimization

Cost savings

80% reduction

In compute & storage spend

TPC-DS benchmark suite · Production Iceberg tables · Multi-cloud, multi-engine

Get in touch

See LakeOps in action

Get a personalized walkthrough of the LakeOps platform with your data. Short call, your architecture.

Get a Demo

No commitment · Typically 30 min
