Organization-Wide Policies
LakeOps lets you define and enforce compaction, retention, orphan cleanup, and maintenance policies across catalogs and tables. Set schedules, priorities, and target scopes — then let LakeOps execute continuously. Every policy is auditable, versioned, and controllable with one toggle.
Why use policies?
Configuring optimization settings table-by-table works for a handful of tables, but doesn't scale. Policies let you define optimization rules once and apply them across hundreds or thousands of tables automatically.
- Consistency: every table gets the same optimization standards without manual setup
- Scale: new tables are onboarded automatically as they inherit catalog-wide or namespace-wide policies
- Governance: every policy change is versioned and auditable with full history
- Control: enable or disable any policy instantly with a single toggle
How policies work
A policy is a named rule that applies a specific optimization operation to a set of tables on a schedule. Each policy has:
- A name and optional description
- A type (Adaptive Maintenance, Expire Snapshots, Remove Orphan Files, Rewrite Manifests, or Configuration)
- A scope (which tables it applies to)
- An execution schedule: cron expression, fixed interval, or manual-only
- An enable/disable toggle for instant control
- Type-specific settings (retention period, minimum snapshots, age threshold, etc.)
When enabled, LakeOps executes the policy on its configured schedule. You can also trigger any policy manually at any time.
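The parts of a policy listed above can be pictured as a small record. The following is a minimal sketch for illustration only: the class, field names, scope notation, and settings keys are hypothetical and are not LakeOps's API or storage format.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class PolicyType(Enum):
    ADAPTIVE_MAINTENANCE = "Adaptive Maintenance"
    EXPIRE_SNAPSHOTS = "Expire Snapshots"
    REMOVE_ORPHAN_FILES = "Remove Orphan Files"
    REWRITE_MANIFESTS = "Rewrite Manifests"
    CONFIGURATION = "Configuration"


@dataclass
class Policy:
    name: str
    type: PolicyType
    scope: str                    # hypothetical notation, e.g. "prod" or "prod.sales.orders"
    cron: Optional[str] = None    # None models a manual-only policy
    enabled: bool = True          # the single on/off toggle
    description: str = ""
    settings: dict = field(default_factory=dict)  # type-specific settings

    def runs_on_schedule(self) -> bool:
        # A policy executes automatically only if it is enabled and has a schedule.
        return self.enabled and self.cron is not None


policy = Policy(
    name="prod_daily_snapshot_cleanup",
    type=PolicyType.EXPIRE_SNAPSHOTS,
    scope="prod",
    cron="0 3 * * *",
    settings={"retention_days": 7, "min_snapshots": 5},  # illustrative values
)
```

Keeping the toggle separate from the schedule mirrors the behavior described above: disabling a policy stops scheduled runs without discarding its configuration.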
Policy types
LakeOps supports the following policy types. When creating a new maintenance policy, you choose the operation type from the selection screen:
Adaptive Maintenance
Adaptive Maintenance bundles compaction, snapshot expiry, manifest rewrite, and delete-file handling into a single data-driven policy. LakeOps monitors table activity signals and runs the right operations at the right time, eliminating the need to configure each operation separately.
Key behavior
- When active on a table, Adaptive Maintenance manages the individual compaction, snapshot retention, orphan cleanup, and manifest rewrite sections in the Optimization tab, and those sections are locked
- All bundled operations run automatically based on table activity
- Configurable name, description, scope, and cron schedule
Best for teams that have validated optimization on individual tables and want fully autonomous management. See the Quick Start guide for a recommended adoption path.
Expire Snapshots
Remove snapshots older than the retention period while respecting minimum retention count and concurrent readers. Keeps metadata lean and enables storage reclamation.
Configurable settings
- Retention period (use table config or custom days)
- Minimum snapshots to retain
- Delete associated metadata files (on/off)
- Delete associated files (on/off)
- Cron schedule
Learn more in Snapshot Management docs
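How the retention period and the minimum snapshot count interact can be sketched as follows. This is a simplified model, not LakeOps's implementation: the function and parameter names are hypothetical, and it does not model concurrent readers.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshot_times, retention_days, min_snapshots, now=None):
    """Return the snapshot timestamps eligible for expiry, newest first.

    A snapshot is eligible only if it is older than the retention period
    and it is not among the `min_snapshots` most recent snapshots.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(snapshot_times, reverse=True)
    protected = set(newest_first[:min_snapshots])  # always kept, regardless of age
    return [t for t in newest_first if t < cutoff and t not in protected]
```

For example, with snapshots aged 1, 5, 10, 20, and 30 days, a 7-day retention period, and a minimum of 3 snapshots, only the 20-day and 30-day snapshots are eligible: the 10-day snapshot is past retention but is protected by the minimum count.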
Remove Orphan Files
Detect and safely remove unreferenced data files older than the configured age threshold. Reclaims storage from failed writes, expired snapshots, and dropped tables.
Configurable settings
- Retention threshold (default: 7 days)
- Cron schedule
Learn more in Orphan Cleanup docs
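The detection rule described above (unreferenced and older than the threshold) can be sketched as a simple filter. The function name and input shapes are assumptions for illustration; a real implementation would list object storage and read table metadata.

```python
from datetime import datetime, timedelta, timezone


def find_orphans(storage_files, referenced_paths, min_age_days=7, now=None):
    """Return paths that no table metadata references and that exceed the age threshold.

    storage_files: dict mapping file path -> last-modified datetime
    referenced_paths: set of paths referenced by table metadata
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=min_age_days)
    return sorted(
        path
        for path, modified in storage_files.items()
        if path not in referenced_paths and modified < cutoff
    )
```

The age check is what makes removal safe in this sketch: a recently written file may simply belong to a write that has not committed yet, so it is skipped until it is older than the threshold.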
Rewrite Manifests
Consolidate manifest files to reduce metadata overhead and improve query planning performance across all connected engines.
Configurable settings
- Cron schedule
Learn more in Manifest Optimization docs
Compact Data Files (coming soon as a standalone policy)
Merge small data files into optimally-sized files. Reduces file count, improves query performance, and lowers storage request costs. Currently available through Adaptive Maintenance (which bundles compaction with other operations) or via the per-table Optimization tab.
Configuration & Governance (UI label: Configuration)
Configuration & Governance policies let you enforce table-level settings, format standards, and operational guardrails across your organization. Instead of relying on teams to manually configure each table, define rules once and apply them everywhere.
What you can enforce
- Iceberg format version (e.g. require v2 across all production catalogs)
- Default file format (Parquet, ORC, Avro)
- Write distribution mode (hash, range, none)
- Commit retry and isolation settings
- Naming conventions and metadata standards
Example use cases
- Standardize format version: ensure every table uses Iceberg v2 so all teams get row-level deletes, position deletes, and improved statistics.
- Enforce Parquet as default: prevent teams from accidentally creating ORC or Avro tables that break downstream tooling assumptions.
- Set write distribution mode: apply hash distribution across high-ingestion tables to prevent write hotspots and ensure balanced partition sizing.
- Governance for new tables: when a team creates a new table in a governed catalog, it automatically inherits the organization's configuration policy, with no manual setup required.
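As a rough illustration of what enforcing these settings involves, the sketch below compares a table's properties against a required set. The property keys are standard Iceberg table property names; the helper function, the required values, and the choice to compare values as strings are assumptions, not LakeOps behavior.

```python
# Required settings, keyed by Iceberg table property name (values are illustrative).
REQUIRED_PROPERTIES = {
    "format-version": "2",
    "write.format.default": "parquet",
    "write.distribution-mode": "hash",
}


def find_violations(table_properties, required=REQUIRED_PROPERTIES):
    """Return {property: (expected, actual)} for each setting that differs.

    A missing property is reported with an actual value of None.
    """
    return {
        key: (expected, table_properties.get(key))
        for key, expected in required.items()
        if table_properties.get(key) != expected
    }


violations = find_violations({
    "format-version": "1",
    "write.format.default": "orc",
    "write.distribution-mode": "hash",
})
```

Here `violations` reports the format version and file format mismatches and omits the distribution mode, which already matches.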
Policy scope
Policies can be scoped at different levels of your data hierarchy:
| Scope | Applies to | Use case |
|---|---|---|
| Per-table | A single specific table | Custom settings for critical or unusual tables |
| Per-namespace | All tables in a namespace | Team or domain-level standards |
| Per-catalog | All tables in a catalog | Environment-level rules (prod, staging) |
| Exclude lists | Skip specific namespaces or tables within scope | Exempt critical tables from broad rules |
Precedence rules
More specific policies override broader ones. A per-table policy always takes precedence over a namespace or catalog-wide policy for the same operation type. This lets you set sensible defaults at the catalog level and override only where needed.
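The precedence rule can be sketched as "the longest matching scope wins" for a given operation type. This is an illustrative model under stated assumptions: the class and function names are hypothetical, and a table is identified by a (catalog, namespace, table) tuple with a single-level namespace.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedPolicy:
    name: str
    op_type: str
    scope: tuple                       # ("prod",), ("prod", "sales"), or ("prod", "sales", "orders")
    excludes: frozenset = frozenset()  # scope tuples this policy should skip


def _covers(scope, table):
    # A scope covers a table if it is a prefix of the table's identifier.
    return table[:len(scope)] == scope


def effective_policy(policies, table, op_type) -> Optional[ScopedPolicy]:
    """Pick the most specific policy of `op_type` that applies to `table`."""
    candidates = [
        p for p in policies
        if p.op_type == op_type
        and _covers(p.scope, table)
        and not any(_covers(ex, table) for ex in p.excludes)
    ]
    return max(candidates, key=lambda p: len(p.scope), default=None)


policies = [
    ScopedPolicy("prod_expire_snapshots", "Expire Snapshots", ("prod",),
                 excludes=frozenset({("prod", "scratch")})),
    ScopedPolicy("orders_expire_snapshots", "Expire Snapshots", ("prod", "sales", "orders")),
]
```

With these two policies, `prod.sales.orders` resolves to the per-table policy, other tables in `prod` fall back to the catalog-wide default, and tables in the excluded `prod.scratch` namespace get no snapshot expiration policy at all.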
Global Policies screen
Navigate to Manage > Policies in the sidebar to access the central policy management screen. This is where you create, search, filter, and manage all policies across your organization.
Screen layout
| Element | Description |
|---|---|
| + Create Policy | Opens a form to define a new policy (name, type, scope, schedule, settings) |
| Search bar | Filter policies by name or description |
| Type filter | Filter by policy type (All Types, Compact Data Files, Expire Snapshots, etc.) |
| Status filter | Filter by enabled/disabled status (All Status, Enabled, Disabled) |
Policy table columns
| Column | Description |
|---|---|
| Status | Toggle switch to enable or disable the policy instantly |
| Policy | Policy name and optional description (e.g. “Expire snapshots for all prod tables every 7 days”) |
| Type | Color-coded badge showing the policy type |
| Next Run | When the policy will next execute (based on cron schedule) |
| Last Run | Timestamp of the most recent execution |
| Updated | When the policy configuration was last modified |
| Actions | Edit (pencil icon) and Delete (trash icon) buttons |
Creating a policy
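Click + Create Policy on the Global Policies screen and complete the form: name, type, scope, schedule, and settings. Give the policy a descriptive name (e.g. prod_daily_snapshot_cleanup).
Per-table policy assignment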
You can also view and manage policies from the perspective of a single table. Navigate to Data > Explore, select a table, then open the Policies tab.
What you see
The per-table Policies tab shows all policies currently assigned to the selected table, including inherited policies from namespace, catalog, or organization scope. Each row shows:
| Column | Description |
|---|---|
| Status | Toggle to enable/disable the policy for this table |
| Policy | Policy name |
| Type | Color-coded type badge |
| Next Run | Next scheduled execution |
| Last Run | Most recent execution timestamp |
Assigning a policy
Click + Assign Policy to link an existing policy to this table. You can assign multiple policies of different types to the same table (e.g. one Expire Snapshots policy plus one Remove Orphan Files policy).
Example: typical policy set
A production table typically has multiple policies covering different optimization operations:
| Policy | Type | Schedule |
|---|---|---|
| prod_adaptive_maintenance | Adaptive Maintenance | Data-driven |
| prod_expire_snapshots | Expire Snapshots | Hourly |
| prod_rewrite_manifests | Rewrite Manifests | Daily at 4:00 AM |
| org_orphan_cleanup | Remove Orphan Files | Daily at 3:00 AM |
Adaptive Maintenance runs compaction and its other bundled operations based on table activity rather than on a fixed schedule. Manifest rewrites run daily to keep metadata consolidated. Snapshot expiration runs hourly to prevent metadata bloat. Orphan cleanup runs daily at the catalog level to catch stragglers.
Policies vs. per-table Optimization tab
Both policies and the per-table Optimization tab configure the same underlying operations. The difference is scope and management:
Policies
- Managed centrally from the Policies screen
- Apply to many tables at once
- Versioned with full audit trail
- New tables inherit automatically
- Best for catalog-wide and cross-namespace standards
Per-table Optimization tab
- Configured on individual tables
- Quick setup for one-off cases
- Override policy defaults when needed
- Includes Execute button for immediate runs
- Best for table-specific tuning
A common pattern: set catalog-wide policies for baseline hygiene, then use the per-table Optimization tab to override settings on tables that need special treatment.
Scheduling best practices
Policies use cron expressions to control execution timing. Consider these guidelines:
- Stagger schedules: avoid running compaction, snapshot expiration, and manifest rewrites at the same time. Spread them across different hours.
- Run during low-traffic windows: schedule heavy operations like compaction during off-peak hours to minimize impact on query workloads.
- Order matters: run snapshot expiration before orphan cleanup so that expired data files become detectable as orphans.
- Start conservative: begin with longer intervals and tighten as you observe results.
Recommended schedule
| Operation | Frequency | Cron |
|---|---|---|
| Compaction (via per-table or Adaptive) | Daily | 0 2 * * * |
| Expire Snapshots | Hourly | 0 * * * * |
| Rewrite Manifests | Daily | 0 4 * * * |
| Remove Orphan Files | Daily | 0 3 * * * |
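To check a set of schedules against the staggering guideline, a small script can list the times of day at which more than one operation starts. This is a minimal sketch using a deliberately limited cron parser (plain numbers or `*` in the minute and hour fields only); the function names are hypothetical and not part of LakeOps.

```python
from collections import defaultdict

# Cron expressions from the recommended schedule table.
SCHEDULE = {
    "Compaction": "0 2 * * *",
    "Expire Snapshots": "0 * * * *",
    "Rewrite Manifests": "0 4 * * *",
    "Remove Orphan Files": "0 3 * * *",
}


def daily_start_times(cron):
    """Expand a 'minute hour * * *' expression into a set of (hour, minute) pairs."""
    minute, hour, dom, month, dow = cron.split()
    if (dom, month, dow) != ("*", "*", "*"):
        raise ValueError(f"unsupported cron expression: {cron!r}")
    minutes = range(60) if minute == "*" else [int(minute)]
    hours = range(24) if hour == "*" else [int(hour)]
    return {(h, m) for h in hours for m in minutes}


def shared_start_times(schedule):
    """Return {(hour, minute): [operation names]} for slots where several operations start."""
    by_time = defaultdict(list)
    for name, cron in schedule.items():
        for slot in daily_start_times(cron):
            by_time[slot].append(name)
    return {slot: sorted(names) for slot, names in by_time.items() if len(names) > 1}


overlaps = shared_start_times(SCHEDULE)
```

For the schedule above, `overlaps` contains 02:00, 03:00, and 04:00, because the hourly snapshot expiration starts at the top of every hour alongside each daily job. If that matters for your workload, one option is to offset the hourly job, for example `30 * * * *`.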
Policy inheritance
When a policy is scoped at the catalog or namespace level, tables within that scope automatically inherit the policy. The per-table Optimization tab shows inherited policies with an Inherited badge.
- Inherited policies: shown with an amber badge indicating the source policy name and scope type. Clicking the policy name navigates to the governance policy.
- Table-level overrides: saving a table-level configuration creates a local policy that takes precedence over the inherited one. Removing the table policy reverts to the governance policy.
- Excluding a table: toggle off an inherited policy to exclude the specific table from the governance policy without affecting other tables.
When Adaptive Maintenance is inherited, it locks all individual operation sections in the Optimization tab for that table. The notice indicates which governance policy manages the table.
Auditing & versioning
Every policy change is tracked with full audit history:
- Version history: see when a policy was created, modified, enabled, or disabled, and by whom
- Execution log: every policy run is recorded in the Events tab for each affected table, showing the operation performed, duration, and impact
- Updated timestamp: the global Policies table shows when each policy was last modified
Monitoring policy execution
Track policy health and impact through:
- Global Policies screen: Next Run and Last Run columns show scheduling health at a glance
- Events tab (per-table): detailed log of every operation executed by the policy, including before/after metrics
- Insights tab (per-table): warnings that should resolve after policy-driven optimizations take effect
- Dashboard: aggregated operations count and cost savings reflect the cumulative impact of your policies
