Organization-Wide Policies

LakeOps lets you define and enforce compaction, retention, orphan cleanup, and maintenance policies across catalogs and tables. Set schedules, priorities, and target scopes — then let LakeOps execute continuously. Every policy is auditable, versioned, and controllable with one toggle.

Why use policies?

Configuring optimization settings table-by-table works for a handful of tables, but doesn't scale. Policies let you define optimization rules once and apply them across hundreds or thousands of tables automatically.

  • Consistency — every table gets the same optimization standards without manual setup
  • Scale — onboard new tables automatically as they inherit catalog or namespace-wide policies
  • Governance — every policy change is versioned and auditable with full history
  • Control — enable or disable any policy instantly with a single toggle

How policies work

A policy is a named rule that applies a specific optimization operation to a set of tables on a schedule. Each policy has:

  • A name and optional description
  • A type (Adaptive Maintenance, Expire Snapshots, Remove Orphan Files, Rewrite Manifests, or Configuration)
  • A scope (which tables it applies to)
  • An execution schedule — cron expression, fixed interval, or manual-only
  • An enable/disable toggle for instant control
  • Type-specific settings (retention period, minimum snapshots, age threshold, etc.)

When enabled, LakeOps executes the policy on its configured schedule (cron expression or fixed interval). You can also trigger any policy manually at any time, including policies set to manual-only.
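The fields listed above can be pictured as a small record. The following is a minimal sketch for illustration only: the `Policy` class, its field names, the string encoding of the scope, and the setting keys are all assumptions, not LakeOps's actual data model or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional


class PolicyType(Enum):
    # The five policy types named in this document.
    ADAPTIVE_MAINTENANCE = "Adaptive Maintenance"
    EXPIRE_SNAPSHOTS = "Expire Snapshots"
    REMOVE_ORPHAN_FILES = "Remove Orphan Files"
    REWRITE_MANIFESTS = "Rewrite Manifests"
    CONFIGURATION = "Configuration"


@dataclass
class Policy:
    """Hypothetical in-memory model of a policy (not the LakeOps schema)."""
    name: str
    type: PolicyType
    scope: str                      # assumed encoding, e.g. "prod" or "prod.sales"
    schedule: Optional[str] = None  # cron expression; None stands for manual-only
    enabled: bool = True
    description: str = ""
    settings: dict = field(default_factory=dict)  # type-specific settings

    def runs_automatically(self) -> bool:
        # A policy runs unattended only if it is enabled and has a schedule.
        return self.enabled and self.schedule is not None


# Illustrative values only.
policy = Policy(
    name="prod_daily_snapshot_cleanup",
    type=PolicyType.EXPIRE_SNAPSHOTS,
    scope="prod",
    schedule="0 * * * *",
    settings={"retention_days": 7, "min_snapshots": 5},
)
```

A disabled or manual-only policy in this sketch still exists as a record; it simply reports `runs_automatically()` as false, which mirrors the toggle and manual-trigger behavior described above.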

Policy types

LakeOps supports the following policy types. When creating a new maintenance policy, you choose the operation type from the selection screen:

Adaptive Maintenance

Adaptive Maintenance bundles compaction, snapshot expiry, manifest rewrite, and delete-file handling into a single data-driven policy. LakeOps monitors table activity signals and runs the right operations at the right time, eliminating the need to configure each operation separately.

Key behavior

  • When active on a table, it manages the individual compaction, snapshot retention, orphan cleanup, and manifest rewrite sections in the Optimization tab, and those sections are locked
  • All bundled operations run automatically based on table activity
  • Configurable name, description, scope, and cron schedule

Best for teams that have validated optimization on individual tables and want fully autonomous management. See the Quick Start guide for a recommended adoption path.

Expire Snapshots

Remove snapshots older than the retention period while respecting minimum retention count and concurrent readers. Keeps metadata lean and enables storage reclamation.

Configurable settings

  • Retention period (use table config or custom days)
  • Minimum snapshots to retain
  • Delete associated metadata files (on/off)
  • Delete associated files (on/off)
  • Cron schedule

Learn more in Snapshot Management docs
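To make the interaction between the retention period and the minimum snapshot count concrete, here is a minimal sketch of the selection rule. It assumes snapshots are represented only by their commit timestamps and does not model concurrent readers; the function name and parameters are hypothetical and are not part of LakeOps or Iceberg.

```python
from datetime import datetime, timedelta, timezone


def snapshots_to_expire(snapshot_times, retention_days, min_snapshots, now):
    """Return timestamps eligible for expiry, newest first.

    A snapshot is eligible only if it is older than the retention cutoff
    AND it is not among the `min_snapshots` most recent snapshots.
    """
    cutoff = now - timedelta(days=retention_days)
    newest_first = sorted(snapshot_times, reverse=True)
    # Skip the newest `min_snapshots` entries unconditionally.
    return [t for t in newest_first[min_snapshots:] if t < cutoff]


# Illustrative data: four snapshots in January, evaluated on January 31.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
snapshots = [datetime(2024, 1, day, tzinfo=timezone.utc) for day in (1, 10, 20, 30)]
expired = snapshots_to_expire(snapshots, retention_days=7, min_snapshots=2, now=now)
```

In this example the January 20 snapshot is older than the 7-day cutoff but is kept because it is one of the two most recent snapshots; only the January 10 and January 1 snapshots are selected.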

Remove Orphan Files

Detect and safely remove unreferenced data files older than the configured age threshold. Reclaims storage from failed writes, expired snapshots, and dropped tables.

Configurable settings

  • Retention threshold (default: 7 days)
  • Cron schedule

Learn more in Orphan Cleanup docs
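The detection rule described above (unreferenced and older than the threshold) can be sketched as follows. This is an illustration under simplifying assumptions: storage is a dictionary of path to last-modified time, and the set of referenced paths is already known. The function and variable names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_orphans(stored_files, referenced_paths, min_age_days, now):
    """Return paths that no table metadata references and that are old enough.

    The age threshold protects files from in-flight writes, which exist in
    storage before any snapshot references them.
    """
    cutoff = now - timedelta(days=min_age_days)
    return sorted(
        path
        for path, modified in stored_files.items()
        if path not in referenced_paths and modified < cutoff
    )


# Illustrative data.
now = datetime(2024, 1, 31, tzinfo=timezone.utc)
stored = {
    "data/a.parquet": datetime(2024, 1, 1, tzinfo=timezone.utc),   # referenced
    "data/b.parquet": datetime(2024, 1, 1, tzinfo=timezone.utc),   # old, unreferenced
    "data/c.parquet": datetime(2024, 1, 30, tzinfo=timezone.utc),  # recent, unreferenced
}
orphans = find_orphans(stored, {"data/a.parquet"}, min_age_days=7, now=now)
```

Here only `data/b.parquet` qualifies; `data/c.parquet` is unreferenced but too recent, so it is left alone in case a write is still in progress.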

Rewrite Manifests

Consolidate manifest files to reduce metadata overhead and improve query planning performance across all connected engines.

Configurable settings

  • Cron schedule

Learn more in Manifest Optimization docs

Compact Data Files (coming soon as a standalone policy)

Merge small data files into optimally-sized files. Reduces file count, improves query performance, and lowers storage request costs. Currently available through Adaptive Maintenance (which bundles compaction with other operations) or via the per-table Optimization tab.
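As a rough illustration of what merging small files toward a target size involves, the sketch below groups files with a first-fit-decreasing heuristic. This is a generic bin-packing example chosen for clarity, not a description of how LakeOps plans compaction; the function name, the size units, and the target value are assumptions.

```python
def plan_compaction(file_sizes_mb, target_mb):
    """Group files smaller than the target into merge groups.

    Uses first-fit decreasing: place each file, largest first, into the
    first group that still has room. Groups with a single file are dropped
    because rewriting one file alone gains nothing.
    """
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)
    groups = []
    for size in small:
        for group in groups:
            if sum(group) + size <= target_mb:
                group.append(size)
                break
        else:
            groups.append([size])
    return [g for g in groups if len(g) > 1]


# Illustrative sizes in MB with a hypothetical 512 MB target.
plan = plan_compaction([600, 10, 20, 300, 200, 64], target_mb=512)
```

Files already at or above the target (the 600 MB file here) are left untouched, and every resulting group stays within the target size.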

Configuration & Governance (UI label: Configuration)

Configuration & Governance policies let you enforce table-level settings, format standards, and operational guardrails across your organization. Instead of relying on teams to manually configure each table, define rules once and apply them everywhere.

What you can enforce

  • Iceberg format version (e.g. require v2 across all production catalogs)
  • Default file format (Parquet, ORC, Avro)
  • Write distribution mode (hash, range, none)
  • Commit retry and isolation settings
  • Naming conventions and metadata standards

Example use cases

  • Standardize format version — ensure every table uses Iceberg v2 so all teams get row-level deletes, position deletes, and improved statistics.
  • Enforce Parquet as default — prevent teams from accidentally creating ORC or Avro tables that break downstream tooling assumptions.
  • Set write distribution mode — apply hash distribution across high-ingestion tables to prevent write hotspots and ensure balanced partition sizing.
  • Governance for new tables — when a team creates a new table in a governed catalog, it automatically inherits the organization's configuration policy — no manual setup required.
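The kind of check a configuration policy implies can be sketched as a comparison between required and actual table properties. The property keys below (`format-version`, `write.format.default`, `write.distribution-mode`) are standard Iceberg table properties; the `violations` function and the shape of the result are hypothetical and do not reflect how LakeOps evaluates or enforces policies.

```python
# Required settings, expressed as Iceberg table properties.
REQUIRED = {
    "format-version": "2",
    "write.format.default": "parquet",
    "write.distribution-mode": "hash",
}


def violations(table_properties, required):
    """Map each non-compliant property to an (actual, expected) pair."""
    return {
        key: (table_properties.get(key), expected)
        for key, expected in required.items()
        if table_properties.get(key) != expected
    }


# Illustrative table: wrong file format, distribution mode not set.
table = {"format-version": "2", "write.format.default": "orc"}
problems = violations(table, REQUIRED)
```

For this table the result flags `write.format.default` (actual `orc`, expected `parquet`) and `write.distribution-mode` (unset, expected `hash`), while `format-version` passes.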

Policy scope

Policies can be scoped at different levels of your data hierarchy:

Scope | Applies to | Use case
Per-table | A single specific table | Custom settings for critical or unusual tables
Per-namespace | All tables in a namespace | Team or domain-level standards
Per-catalog | All tables in a catalog | Environment-level rules (prod, staging)
Exclude lists | Skip specific namespaces or tables within scope | Exempt critical tables from broad rules

Precedence rules

More specific policies override broader ones. A per-table policy always takes precedence over a namespace or catalog-wide policy for the same operation type. This lets you set sensible defaults at the catalog level and override only where needed.
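Scope matching, exclude lists, and the "most specific wins" rule can be combined in a short sketch. Everything here is an assumption made for illustration: the `ScopedPolicy` class, the numeric specificity ranking, and the string encoding of exclusions are not LakeOps internals.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ScopedPolicy:
    name: str
    operation: str                    # e.g. "Expire Snapshots"
    catalog: str
    namespace: Optional[str] = None   # None: every namespace in the catalog
    table: Optional[str] = None       # None: every table in the namespace
    excluded: frozenset = frozenset() # entries like "ns" or "ns.table"

    def specificity(self) -> int:
        # Per-table (2) beats per-namespace (1) beats per-catalog (0).
        if self.table is not None:
            return 2
        return 1 if self.namespace is not None else 0

    def applies_to(self, catalog: str, namespace: str, table: str) -> bool:
        if catalog != self.catalog:
            return False
        if namespace in self.excluded or f"{namespace}.{table}" in self.excluded:
            return False
        if self.namespace is not None and namespace != self.namespace:
            return False
        return self.table is None or table == self.table


def effective_policy(policies, operation, catalog, namespace, table):
    """Pick the most specific matching policy for one operation type."""
    matches = [
        p for p in policies
        if p.operation == operation and p.applies_to(catalog, namespace, table)
    ]
    return max(matches, key=lambda p: p.specificity(), default=None)


policies = [
    ScopedPolicy("prod_default", "Expire Snapshots", "prod",
                 excluded=frozenset({"sales.audit_log"})),
    ScopedPolicy("orders_custom", "Expire Snapshots", "prod", "sales", "orders"),
]
```

With these two policies, `prod.sales.orders` resolves to `orders_custom`, other tables in `prod` fall back to `prod_default`, and `prod.sales.audit_log` resolves to no policy because it is on the exclude list.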

Global Policies screen

Navigate to Manage > Policies in the sidebar to access the central policy management screen. This is where you create, search, filter, and manage all policies across your organization.

Screen layout

Element | Description
+ Create Policy | Opens a form to define a new policy (name, type, scope, schedule, settings)
Search bar | Filter policies by name or description
Type filter | Filter by policy type (All Types, Compact Data Files, Expire Snapshots, etc.)
Status filter | Filter by enabled/disabled status (All Status, Enabled, Disabled)

Policy table columns

Column | Description
Status | Toggle switch to enable or disable the policy instantly
Policy | Policy name and optional description (e.g. “Expire snapshots for all prod tables every 7 days”)
Type | Color-coded badge showing the policy type
Next Run | When the policy will next execute (based on cron schedule)
Last Run | Timestamp of the most recent execution
Updated | When the policy configuration was last modified
Actions | Edit (pencil icon) and Delete (trash icon) buttons

Creating a policy

1. Click + Create Policy in the top-right of the Policies screen.
2. Select a policy category: choose between Configuration and Maintenance. For Maintenance, select an operation: Adaptive Maintenance, Expire Snapshots, Remove Orphan Files, or Rewrite Manifests.
3. Enter a policy name, optional description, and priority. Use descriptive names that reflect scope and purpose (e.g. prod_daily_snapshot_cleanup).
4. Define the scope: select a catalog, then optionally narrow to specific namespaces or tables.
5. Set the execution schedule: cron expression, fixed interval, or manual-only.
6. Configure type-specific settings (e.g. retention period and minimum snapshot count for Expire Snapshots). Toggle Enabled and click Create Policy.

Per-table policy assignment

You can also view and manage policies from the perspective of a single table. Navigate to Data > Explore, select a table, then open the Policies tab.

What you see

The per-table Policies tab shows all policies currently assigned to the selected table, including inherited policies from namespace, catalog, or organization scope. Each row shows:

Column | Description
Status | Toggle to enable/disable the policy for this table
Policy | Policy name
Type | Color-coded type badge
Next Run | Next scheduled execution
Last Run | Most recent execution timestamp

Assigning a policy

Click + Assign Policy to link an existing policy to this table. You can assign multiple policies of different types to the same table (e.g. one Rewrite Manifests policy plus one Expire Snapshots policy).

Example: typical policy set

A production table typically has multiple policies covering different optimization operations:

Policy | Type | Schedule
prod_adaptive_maintenance | Adaptive Maintenance | Data-driven
prod_expire_snapshots | Expire Snapshots | Hourly
prod_rewrite_manifests | Rewrite Manifests | Daily at 4:00 AM
org_orphan_cleanup | Remove Orphan Files | Daily at 3:00 AM

Adaptive Maintenance runs compaction and its other bundled operations based on table activity rather than a fixed schedule. Manifest rewrites run daily to keep manifests consolidated. Snapshot expiration runs hourly to prevent metadata bloat. Orphan cleanup runs daily at the catalog level to catch stragglers.

Policies vs. per-table Optimization tab

Both policies and the per-table Optimization tab configure the same underlying operations. The difference is scope and management:

Policies

  • Managed centrally from the Policies screen
  • Apply to many tables at once
  • Versioned with full audit trail
  • New tables inherit automatically
  • Best for catalog-wide and cross-namespace standards

Per-table Optimization tab

  • Configured on individual tables
  • Quick setup for one-off cases
  • Override policy defaults when needed
  • Includes Execute button for immediate runs
  • Best for table-specific tuning

A common pattern: set catalog-wide policies for baseline hygiene, then use the per-table Optimization tab to override settings on tables that need special treatment.

Scheduling best practices

Policies use cron expressions to control execution timing. Consider these guidelines:

  • Stagger schedules — avoid running compaction, snapshot expiration, and manifest rewrites at the same time. Spread them across different hours.
  • Run during low-traffic windows — schedule heavy operations like compaction during off-peak hours to minimize impact on query workloads.
  • Order matters — run snapshot expiration before orphan cleanup so that expired data files become detectable as orphans.
  • Start conservative — begin with longer intervals and tighten as you observe results.

Recommended schedule

Operation | Frequency | Cron
Compaction (via per-table or Adaptive) | Daily | 0 2 * * *
Expire Snapshots | Hourly | 0 * * * *
Rewrite Manifests | Daily | 0 4 * * *
Remove Orphan Files | Daily | 0 3 * * *
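One way to sanity-check the staggering advice is to look for daily jobs that start at the same time. The sketch below is deliberately limited: it only understands five-field cron expressions whose minute and hour fields are plain numbers, and it ignores anything else (such as the hourly `0 * * * *` schedule). The function names are hypothetical and this is not a full cron parser.

```python
from collections import defaultdict


def daily_start_times(schedules):
    """Extract (hour, minute) for expressions with a fixed minute and hour."""
    starts = {}
    for name, expr in schedules.items():
        minute, hour, *_ = expr.split()
        if minute.isdigit() and hour.isdigit():
            starts[name] = (int(hour), int(minute))
    return starts


def collisions(schedules):
    """Return start times shared by more than one fixed-time job."""
    by_time = defaultdict(list)
    for name, start in daily_start_times(schedules).items():
        by_time[start].append(name)
    return {t: sorted(names) for t, names in by_time.items() if len(names) > 1}


# The recommended schedule from the table above.
recommended = {
    "compaction": "0 2 * * *",
    "expire_snapshots": "0 * * * *",
    "rewrite_manifests": "0 4 * * *",
    "remove_orphan_files": "0 3 * * *",
}
overlaps = collisions(recommended)
```

The recommended schedule produces no collisions among the daily jobs. Adding a second job at `0 3 * * *` would be reported together with `remove_orphan_files` under the key `(3, 0)`.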

Policy inheritance

When a policy is scoped at the catalog or namespace level, tables within that scope automatically inherit the policy. The per-table Optimization tab shows inherited policies with an Inherited badge.

  • Inherited policies — shown with an amber badge indicating the source policy name and scope type. Clicking the policy name navigates to the governance policy.
  • Table-level overrides — saving a table-level configuration creates a local policy that takes precedence over the inherited one. Removing the table policy reverts to the governance policy.
  • Excluding a table — toggle off an inherited policy to exclude the specific table from the governance policy without affecting other tables.

When Adaptive Maintenance is inherited, it locks all individual operation sections in the Optimization tab for that table. The notice indicates which governance policy manages the table.

Auditing & versioning

Every policy change is tracked with full audit history:

  • Version history — see when a policy was created, modified, enabled, or disabled, and by whom
  • Execution log — every policy run is recorded in the Events tab for each affected table, showing the operation performed, duration, and impact
  • Updated timestamp — the global Policies table shows when each policy was last modified

Monitoring policy execution

Track policy health and impact through:

  • Global Policies screen — Next Run and Last Run columns show scheduling health at a glance
  • Events tab (per-table) — detailed log of every operation executed by the policy, including before/after metrics
  • Insights tab (per-table) — warnings that should resolve after policy-driven optimizations take effect
  • Dashboard — aggregated operations count and cost savings reflect the cumulative impact of your policies