Snapshot Management

LakeOps automates snapshot retention, expiration, and version history for every Iceberg table. Set policies once — LakeOps expires old snapshots safely with full awareness of concurrent readers. Time-travel to any point, compare snapshots, and roll back without manual intervention.

How Iceberg snapshots work

Every write operation (append, overwrite, delete) to an Iceberg table creates an immutable snapshot. Each snapshot references a set of manifest files which in turn reference actual data files. This design enables time-travel, atomic rollbacks, and concurrent reads during writes.
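The snapshot, manifest, and data file relationship described above can be sketched as a small in-memory model. This is an illustrative toy, not Iceberg's or LakeOps's actual metadata format; the class names and the `append` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple

@dataclass(frozen=True)            # frozen: a snapshot is never modified after creation
class Manifest:
    data_files: Tuple[str, ...]    # paths of the data files this manifest tracks

@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]
    operation: str                 # "append", "overwrite", or "delete"
    manifests: Tuple[Manifest, ...]

    def data_files(self) -> Set[str]:
        """All data files reachable from this snapshot via its manifests."""
        return {f for m in self.manifests for f in m.data_files}

def append(parent: Snapshot, new_id: int, new_files: Tuple[str, ...]) -> Snapshot:
    """Hypothetical append: reuse the parent's manifests and add one new manifest."""
    return Snapshot(new_id, parent.snapshot_id, "append",
                    parent.manifests + (Manifest(new_files),))

s1 = Snapshot(1, None, "append", (Manifest(("a.parquet",)),))
s2 = append(s1, 2, ("b.parquet",))

print(sorted(s1.data_files()))  # ['a.parquet']  (the old snapshot is unchanged)
print(sorted(s2.data_files()))  # ['a.parquet', 'b.parquet']
```

Because the write produces a new snapshot instead of altering the old one, a reader holding `s1` keeps seeing a consistent view while `s2` is committed.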

Over time, unused snapshots accumulate and increase metadata overhead, storage costs (from retained data files), and query planning time.

Why manage snapshots?

  • Excessive snapshots bloat metadata and slow down planning
  • Old snapshots retain references to data files, preventing storage reclamation
  • Without expiration, snapshot count grows unbounded with every write
  • LakeOps Insights flags tables with a high ratio of obsolete snapshots
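To illustrate why old snapshots block storage reclamation, the sketch below computes which data files become deletable once a set of snapshots is expired. It assumes a simplified view where each snapshot maps directly to the set of files it references; `reclaimable_files` is a hypothetical helper, not a LakeOps API.

```python
from typing import Dict, Set

def reclaimable_files(snapshots: Dict[int, Set[str]], expired_ids: Set[int]) -> Set[str]:
    """Files referenced only by expired snapshots, and therefore safe to delete."""
    still_referenced: Set[str] = set()
    referenced_by_expired: Set[str] = set()
    for snapshot_id, files in snapshots.items():
        if snapshot_id in expired_ids:
            referenced_by_expired |= files
        else:
            still_referenced |= files
    return referenced_by_expired - still_referenced

snapshots = {
    1: {"a.parquet"},
    2: {"a.parquet", "b.parquet"},
    3: {"b.parquet", "c.parquet"},   # an overwrite replaced a.parquet with c.parquet
}

print(sorted(reclaimable_files(snapshots, {1})))     # []  (snapshot 2 still references a.parquet)
print(sorted(reclaimable_files(snapshots, {1, 2})))  # ['a.parquet']
```

Note that `a.parquet` is no longer part of the current table state after snapshot 3, yet it cannot be removed until every snapshot that references it has been expired.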

Snapshot explorer

Navigate to Explore and select any table, then click the Snapshots tab. The explorer shows a full history table with the following columns:

Column | Description
# | Row number (chronological order)
Snapshot ID | Unique 19-digit identifier
Date | When the snapshot was created
Operation | Append, Overwrite, Delete, Replace
References | Branch/tag labels (e.g. main, latest, custom tags)
Actions | Inspect, Compare, Time Travel, and More options per row

Below the table, the Version History & Time Travel panel shows aggregate stats: total snapshot count, retention policy, and the latest operation.

Snapshot actions

The Snapshots tab provides both global actions (top bar) and per-row actions:

Global actions

  • Tag: label a snapshot for long-term reference
  • Branch: create an isolated branch from a snapshot
  • Rollback to snapshot: revert the entire table to a specific version
  • Set current snapshot: manually designate which snapshot is active

Per-row actions

  • Inspect — view full snapshot details (manifests, added/removed files, parent snapshot)
  • Compare — side-by-side diff of this snapshot against another
  • Time Travel — query the table as it existed at this point in time

Time travel

Time travel lets you query a table as it existed at any historical snapshot. This is useful for:

  • Debugging data issues by comparing current vs historical state
  • Auditing what data was visible at a specific time for compliance
  • Recovering accidentally deleted data by reading from a pre-delete snapshot
  • Testing query results against different versions of the data

Time travel reads are non-destructive and don't affect the current table state.
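One way to picture a timestamp-based time-travel read is as a lookup of the most recent snapshot committed at or before the requested time. The sketch below assumes the snapshot history is available as a list of (commit time, snapshot ID) pairs sorted by commit time; the function name and sample IDs are hypothetical.

```python
from __future__ import annotations
from bisect import bisect_right
from datetime import datetime, timezone

def snapshot_as_of(history: list[tuple[datetime, int]], when: datetime) -> int | None:
    """Return the ID of the latest snapshot committed at or before `when`."""
    times = [committed_at for committed_at, _ in history]
    i = bisect_right(times, when)     # count of snapshots committed at or before `when`
    return history[i - 1][1] if i > 0 else None

def at(hour: int) -> datetime:
    return datetime(2024, 6, 1, hour, tzinfo=timezone.utc)

history = [(at(8), 101), (at(12), 102), (at(16), 103)]

print(snapshot_as_of(history, at(13)))  # 102
print(snapshot_as_of(history, at(12)))  # 102
print(snapshot_as_of(history, at(7)))   # None (the table had no snapshot yet)
```

The lookup only reads the history; nothing about the current snapshot changes, which is why time-travel reads are non-destructive.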

Rollback

Rolling back sets the current snapshot pointer to a previous version. The operation is:

  • Atomic — the table transitions instantly from one state to another
  • Safe for concurrent readers — existing queries finish on the old snapshot
  • Reversible — you can roll forward again to any later snapshot
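The three properties above follow from rollback being a pointer change, with no data rewritten. A minimal sketch, assuming a toy table object that keeps its full snapshot history and a single "current" pointer (the `TableRef` class is hypothetical):

```python
class TableRef:
    """Toy table: a list of snapshot IDs plus a pointer to the current one."""

    def __init__(self, snapshot_ids):
        self.snapshot_ids = list(snapshot_ids)   # full history, oldest first
        self.current = self.snapshot_ids[-1]

    def set_current(self, snapshot_id):
        if snapshot_id not in self.snapshot_ids:
            raise ValueError(f"unknown snapshot: {snapshot_id}")
        self.current = snapshot_id               # single pointer swap; history is untouched

table = TableRef([101, 102, 103])

reader_view = table.current      # a query that started before the rollback pins 103
table.set_current(101)           # roll back
after_rollback = table.current
table.set_current(103)           # roll forward again; 103 is still in the history
after_roll_forward = table.current

print(reader_view, after_rollback, after_roll_forward)  # 103 101 103
```

In this model, rolling forward is only possible while the later snapshot still exists in the history, so a snapshot that has since been expired could not be restored this way.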

Auto vs. Manual mode

Snapshot retention supports two execution modes, toggled at the top of the Snapshot Retention card in the Optimization tab:

Auto (autopilot)

LakeOps expires snapshots autonomously on the configured cron schedule. Ideal for production tables where snapshot buildup must be kept in check without manual oversight.

Safety-aware: always respects the minimum snapshot count before expiring.

Manual (on-demand)

You control when expiration runs. Good for staging environments, or for tables where you want to review simulation results before committing.

The retention configuration is saved; trigger execution whenever you're ready.

Retention configuration

Configure automatic snapshot expiration in the Snapshot Retention card on the Explore > Optimization tab, or via organization-wide Policies.

Setting | Description
Snapshots retention period | How long to keep snapshots before they become eligible for expiration. Use the table's configured value, or specify a custom value in days.
Minimum snapshots to retain | Minimum number of snapshots to always keep, regardless of age. Ensures you always have rollback points.
Delete associated Metadata Files | When enabled, removes manifest lists and manifest files that are only referenced by expired snapshots.
Create associated files | Create cleanup records for data files only referenced by expired snapshots.
Schedule (cron) | Default: 0 0 * * * * (six fields, seconds first: runs at the start of every hour). Controls how often LakeOps checks for snapshots eligible for expiration.
Auto / Manual | Auto: runs expiration on schedule. Manual: triggered on demand.
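The interaction between the retention period and the minimum snapshot count can be sketched as follows. This is a simplified illustration, not LakeOps's implementation: it assumes a snapshot is eligible once it is older than the retention period, that the newest snapshots up to the minimum count are always protected, and it ignores snapshots pinned by branches or tags. The function and parameter names are hypothetical.

```python
from __future__ import annotations
from datetime import datetime, timedelta, timezone

def select_expired(snapshots: list[tuple[int, datetime]], now: datetime,
                   retention_days: int, min_to_retain: int) -> list[int]:
    """Return IDs of snapshots eligible for expiration, oldest first."""
    oldest_first = sorted(snapshots, key=lambda s: s[1])
    keep = max(min_to_retain, 1)                 # assumption: never expire the newest snapshot
    protected = {sid for sid, _ in oldest_first[-keep:]}
    cutoff = now - timedelta(days=retention_days)
    return [sid for sid, created in oldest_first
            if created < cutoff and sid not in protected]

def day(d: int) -> datetime:
    return datetime(2024, 6, d, tzinfo=timezone.utc)

snapshots = [(1, day(1)), (2, day(3)), (3, day(8)), (4, day(9))]
now = day(10)

print(select_expired(snapshots, now, retention_days=5, min_to_retain=2))  # [1, 2]
print(select_expired(snapshots, now, retention_days=5, min_to_retain=3))  # [1]
print(select_expired(snapshots, now, retention_days=5, min_to_retain=4))  # []
```

Raising the minimum count shrinks the expired set even though snapshots 1 and 2 are past the retention period, which is how the setting keeps rollback points available regardless of age.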

Using the snapshot retention UI

In the Optimization tab, the Snapshot Retention card provides the following controls:

1. Toggle Auto to enable scheduled expiration. Toggle Manual if you prefer to trigger it yourself.
2. Choose whether to use the table's built-in retention config or specify custom values for retention period and minimum snapshot count.
3. Set the cron schedule for expiration checks.
4. Configure the options for files associated with expired snapshots (Delete associated Metadata Files, Create associated files).
5. Click Simulate to preview how many snapshots will be expired, then Save.

Monitoring snapshot operations

After expiration runs, verify through:

  • Events tab — shows “Expire Snapshots” operations with count expired and duration
  • Snapshots tab — history table reflects the reduced snapshot count
  • Insights — “Excessive Snapshots” warnings should resolve