Quick Start

Connect your Iceberg catalog and start optimizing in under 10 minutes. No agents to install, no data movement, no pipeline changes.

1. Connect your catalog

Go to Data > Catalogs > + Add Catalog. Pick your catalog type, enter connection details, and LakeOps discovers every namespace and table automatically.

  • Glue + S3: AWS Glue Data Catalog
  • DynamoDB + S3: DynamoDB catalog store
  • REST + S3: REST API endpoint
  • S3 Tables: S3 Tables catalog & storage
  • Custom: Any catalog implementation

Connect multiple catalogs across regions and cloud environments.

2. Optimize tables

Open any table in Data > Explore > Optimization. Each operation has its own card with two ways to run it:

Manual

Configure, Save, then Execute to run once. Check results in the Events tab before automating.

Automated

Toggle Enabled, set a cron schedule, and LakeOps runs the operation automatically on that schedule.

Mix modes freely — automate compaction on your busiest tables while keeping other operations manual elsewhere. Start with one table and expand as you see results.
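If cron syntax is unfamiliar, the sketch below shows a few schedules that suit maintenance work and a small shape check for them. It assumes the standard five-field format (minute, hour, day of month, month, day of week); this guide does not say which cron dialect LakeOps accepts, and `is_valid_cron` is a hypothetical helper, not part of the product.

```python
# Assumption: standard five-field cron (minute hour day-of-month month day-of-week).
# The dialect LakeOps accepts is not stated in this guide.
FIELDS = [
    ("minute", 0, 59),
    ("hour", 0, 23),
    ("day_of_month", 1, 31),
    ("month", 1, 12),
    ("day_of_week", 0, 6),
]


def _part_ok(part: str, lo: int, hi: int) -> bool:
    """Check one comma-separated part: '*', 'n', or 'a-b', optionally with '/step'."""
    base, _, step = part.partition("/")
    if "/" in part and not (step.isdecimal() and int(step) > 0):
        return False
    if base == "*":
        return True
    bounds = base.split("-")
    if len(bounds) > 2 or not all(b.isdecimal() for b in bounds):
        return False
    nums = [int(b) for b in bounds]
    return all(lo <= n <= hi for n in nums) and nums == sorted(nums)


def is_valid_cron(expr: str) -> bool:
    """Hypothetical helper: True if expr looks like a valid five-field cron expression."""
    fields = expr.split()
    if len(fields) != len(FIELDS):
        return False
    return all(
        _part_ok(part, lo, hi)
        for field, (_, lo, hi) in zip(fields, FIELDS)
        for part in field.split(",")
    )


# Illustrative schedules (not LakeOps defaults):
EXAMPLES = {
    "0 2 * * *": "every day at 02:00",
    "*/30 * * * *": "every 30 minutes",
    "0 3 * * 0": "Sundays at 03:00",
}

if __name__ == "__main__":
    for expr, meaning in EXAMPLES.items():
        print(f"{expr:<14} {meaning:<20} valid={is_valid_cron(expr)}")
```

A frequent schedule such as every 30 minutes tends to suit compaction on busy tables, while a nightly or weekly schedule is usually enough for cleanup operations.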

Operation             What it does
File Compaction       Merge small files into optimally-sized data files
Snapshot Retention    Expire old snapshots and reclaim storage
Orphan Files Cleanup  Remove unreferenced files left by failed jobs or expired snapshots
Rewrite Manifests     Consolidate manifest files for faster query planning
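For readers who already maintain Iceberg tables by hand, the four operations resemble Apache Iceberg's standard Spark maintenance procedures. The sketch below builds the corresponding `CALL` statements for orientation only. That LakeOps maps onto these procedures is an assumption, since this guide does not describe how the operations are implemented. The function name, the catalog and table names, and the `retain_last` default are illustrative.

```python
def maintenance_statements(catalog: str, table: str, retain_last: int = 5) -> dict[str, str]:
    """Return one Iceberg Spark SQL CALL statement per operation in the table above.

    Assumption: the mapping to LakeOps operations is for orientation only.
    The procedure names are Apache Iceberg's Spark procedures.
    """
    prefix = f"CALL {catalog}.system"
    return {
        # Merge small data files into larger ones.
        "File Compaction": f"{prefix}.rewrite_data_files(table => '{table}')",
        # Expire old snapshots, always keeping the most recent `retain_last`.
        "Snapshot Retention": (
            f"{prefix}.expire_snapshots(table => '{table}', retain_last => {retain_last})"
        ),
        # Delete files that no table metadata references.
        "Orphan Files Cleanup": f"{prefix}.remove_orphan_files(table => '{table}')",
        # Rewrite manifests to speed up scan planning.
        "Rewrite Manifests": f"{prefix}.rewrite_manifests(table => '{table}')",
    }


if __name__ == "__main__":
    # 'my_catalog' and 'analytics.events' are placeholder names.
    for operation, sql in maintenance_statements("my_catalog", "analytics.events").items():
        print(f"{operation}: {sql}")
```

With LakeOps you do not need to run these statements yourself. The comparison is meant only to show what each card in the Optimization tab is responsible for.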

3. Scale with policies

Instead of configuring tables one by one, create policies that apply rules across your lake. Go to Manage > Policies > + Create Policy.

Maintenance Policy

Automate one operation per policy (snapshot expiration, orphan cleanup, manifest rewrites), or use Adaptive Maintenance to bundle all operations into a single data-driven policy that reacts to table activity.

Configuration Policy

Enforce table settings — Iceberg format version, file format, write distribution mode — across catalogs and namespaces.

Scope each policy to a catalog, a namespace, or specific tables, then choose how it runs: on a cron schedule, at a fixed interval, or manually only. Per-table policies override broader ones for the same operation type.
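The override rule can be pictured as "the most specific matching policy wins, per operation type". The sketch below models that idea under stated assumptions: the `Policy` class and `effective_policy` function are hypothetical and not a LakeOps API, each policy targets a single table for simplicity, and ranking a namespace policy above a catalog policy is an assumption (the guide only states that per-table policies override broader ones).

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy model; not the LakeOps data model."""
    name: str
    operation: str                   # e.g. "snapshot_expiration"
    catalog: str
    namespace: Optional[str] = None  # None means the whole catalog
    table: Optional[str] = None      # None means every table in scope

    def matches(self, catalog: str, namespace: str, table: str) -> bool:
        """True if the given table falls inside this policy's scope."""
        return (
            self.catalog == catalog
            and self.namespace in (None, namespace)
            and self.table in (None, table)
        )

    def specificity(self) -> int:
        """2 = table scope, 1 = namespace scope, 0 = catalog scope."""
        if self.table is not None:
            return 2
        return 1 if self.namespace is not None else 0


def effective_policy(
    policies: Iterable[Policy], operation: str, catalog: str, namespace: str, table: str
) -> Optional[Policy]:
    """Pick the most specific policy for one operation type on one table."""
    candidates = [
        p for p in policies
        if p.operation == operation and p.matches(catalog, namespace, table)
    ]
    return max(candidates, key=Policy.specificity, default=None)


if __name__ == "__main__":
    policies = [
        Policy("lake-wide expiry", "snapshot_expiration", "prod"),
        Policy("events expiry", "snapshot_expiration", "prod", "analytics", "events"),
        Policy("lake-wide orphans", "orphan_cleanup", "prod"),
    ]
    chosen = effective_policy(policies, "snapshot_expiration", "prod", "analytics", "events")
    print(chosen.name)  # the table-scoped policy wins for this operation
```

Because the lookup is keyed by operation type, a table-scoped snapshot policy does not hide a catalog-wide orphan cleanup policy for the same table.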

4. Monitor & act on insights

On top of optimization, LakeOps continuously analyzes your lake and surfaces what needs attention:

  • Dashboard — real-time operations count, query speed gains, cost savings, and resource reduction across every catalog.
  • Insights (Data > Insights) — AI-generated recommendations ranked by severity. Issues like small-file accumulation, excessive snapshots, and missing retention policies are flagged before they impact queries. Insights resolve automatically as optimizations take effect.
  • Events — full audit log for every operation with before/after metrics, duration, and status. Available per-table or lake-wide.

Key guarantees

  • No vendor lock-in — works with your existing catalogs, engines, and storage.
  • No code or infra changes — connects via metadata. No agents, sidecars, or pipeline rewrites.
  • No data movement — reads metadata and writes optimized files back to your storage (S3, GCS, ADLS). Data never leaves your account.