LakeOps Documentation
LakeOps is the autonomous control plane for Apache Iceberg data lakes. It automates compaction, table maintenance, multi-engine query routing, and real-time optimization across your entire lake — with built-in observability, governance policies, and agentic AI support.
These docs cover every feature of the platform with step-by-step guides, configuration references, and best practices for common data lake operations.
Platform features
Getting Started
Connect your catalogs and start optimizing in under 10 minutes.
Compaction
Rust-based compaction engine that organizes data files around real query patterns for faster reads and lower costs.
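At its core, compaction planning groups many small data files into tasks that each produce roughly one well-sized output file. The sketch below is illustrative only: a simple size-based bin-packing pass, with a hypothetical 128 MB target that is not a LakeOps default.

```python
# Illustrative sketch of a compaction planner's grouping step.
# The target size is a made-up example value, not a LakeOps setting.

TARGET_BYTES = 128 * 1024 * 1024  # hypothetical target output file size


def plan_compaction(file_sizes, target=TARGET_BYTES):
    """Group small files into compaction tasks of roughly `target` bytes each."""
    groups, current, current_size = [], [], 0
    for size in sorted(file_sizes):
        # Start a new task when adding this file would overflow the target.
        if current and current_size + size > target:
            groups.append(current)
            current, current_size = [], 0
        current.append(size)
        current_size += size
    if current:
        groups.append(current)
    return groups
```

Real planners also weigh partition boundaries, delete files, and observed query patterns; this shows only the size dimension.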
Snapshot Management
Automated retention, expiration, time-travel, and rollback for every table.
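A retention policy of this kind typically combines two rules: keep the most recent N snapshots, and keep anything younger than a maximum age. The helper below is a sketch under those assumptions; the function name and parameters are hypothetical, not the LakeOps API.

```python
# Illustrative retention logic (hypothetical names, not a LakeOps API):
# a snapshot is expirable only if it is outside the "keep last N" window
# AND older than the maximum age.

from datetime import datetime, timedelta


def snapshots_to_expire(snapshots, keep_last=5, max_age=timedelta(days=7), now=None):
    """snapshots: list of (snapshot_id, committed_at), ordered newest first."""
    now = now or datetime.now()
    cutoff = now - max_age
    return [
        sid
        for i, (sid, committed_at) in enumerate(snapshots)
        if i >= keep_last and committed_at < cutoff
    ]
```

Expiring a snapshot removes the ability to time-travel or roll back to it, which is why both guards must pass before a snapshot is a candidate.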
Manifest Optimization
Consolidate manifest files so query planning stays fast at any table scale.
Orphan File Cleanup
Detect and safely remove unreferenced files from failed jobs, aborted commits, and legacy tables.
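The core check is a set difference: a file that exists in storage but is referenced by no table metadata is an orphan candidate, and a grace period protects files from in-flight writes. A minimal sketch, assuming hypothetical inputs (not LakeOps internals):

```python
# Sketch of the orphan-file check. Parameter shapes are assumptions:
# files_in_storage maps path -> last-modified epoch seconds,
# files_in_metadata is the set of paths reachable from table metadata.

def find_orphans(files_in_storage, files_in_metadata, min_age_s, now_s):
    """Return paths that are unreferenced AND older than the grace period."""
    return sorted(
        path
        for path, mtime in files_in_storage.items()
        if path not in files_in_metadata and now_s - mtime >= min_age_s
    )
```

The grace period matters: a file written by a commit that has not yet landed is unreferenced but must not be deleted, so recent files are always skipped.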
Observability
Table health, engine metrics, query latency, throughput, and cross-system telemetry from one place.
Policies
Define and enforce compaction, retention, and cleanup policies across catalogs and tables.
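Applying one policy across many catalogs and tables amounts to scoping it with a pattern over fully qualified table names. The glob-style syntax below is an illustrative assumption, not LakeOps' actual policy language:

```python
# Illustrative policy scoping over "catalog.namespace.table" names.
# The glob pattern syntax here is an assumption for the sketch.

from fnmatch import fnmatch


def tables_in_scope(policy_pattern, tables):
    """Return the tables a policy pattern applies to."""
    return [t for t in tables if fnmatch(t, policy_pattern)]
```

For example, a compaction policy scoped to `prod.sales.*` would match every table in that namespace while leaving dev catalogs untouched.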
Engine Management
Connect Trino, Spark, Snowflake, Athena, DuckDB, and Flink. Monitor health and compare performance.
Query Routing
Automatically route each query to the engine best suited for cost, latency, or throughput.
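Conceptually, routing scores every healthy engine on the chosen objective and picks the best one. The sketch below is a toy illustration; the engine metrics and field names are made up, not LakeOps telemetry:

```python
# Hypothetical routing sketch. Each engine dict carries made-up metrics:
# 'cost' per query, 'latency_ms', and 'rows_per_s' throughput.

def route(engines, objective="cost"):
    """Pick the name of the best healthy engine for the given objective."""
    key = {
        "cost": lambda e: e["cost"],
        "latency": lambda e: e["latency_ms"],
        "throughput": lambda e: -e["rows_per_s"],  # higher is better
    }[objective]
    candidates = [e for e in engines if e["healthy"]]
    if not candidates:
        raise RuntimeError("no healthy engine available")
    return min(candidates, key=key)["name"]
```

A real router would also consider the query shape itself (an interactive point lookup and a large batch join rarely favor the same engine), but the objective-driven selection is the essential idea.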
Simulations
Run layout simulations to preview the impact of compaction strategies before applying them.
Agentic AI
Agent-native MCP interface, guardrails, and self-optimizing lake for AI pipelines.
