Snapshot Management
LakeOps automates snapshot retention, expiration, and version history for every Iceberg table. Set policies once, and LakeOps expires old snapshots safely, with full awareness of concurrent readers. You can time-travel to any retained snapshot, compare snapshots, and roll back without manual intervention.
How Iceberg snapshots work
Every write operation (append, overwrite, delete) to an Iceberg table creates an immutable snapshot. Each snapshot points to a manifest list, which references a set of manifest files, which in turn reference the actual data files. This design enables time travel, atomic rollbacks, and concurrent reads during writes.
Over time, unused snapshots accumulate and increase metadata overhead, storage costs (from retained data files), and query planning time.
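The reference chain can be modelled in a few lines. This is a simplified illustration, not LakeOps or Iceberg code: the `Snapshot` and `Manifest` classes and the file names are hypothetical, and the manifest list that real Iceberg metadata places between a snapshot and its manifests is flattened into a plain tuple.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Manifest:
    """Hypothetical stand-in for a manifest file: the data files it tracks."""
    data_files: frozenset


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical stand-in for an immutable table snapshot."""
    snapshot_id: int
    parent_id: Optional[int]
    operation: str
    manifests: tuple

    def data_files(self) -> frozenset:
        # A snapshot's visible data is the union of the files in its manifests.
        files = set()
        for manifest in self.manifests:
            files |= manifest.data_files
        return frozenset(files)


# First write: one manifest tracking one data file.
m1 = Manifest(frozenset({"data/a.parquet"}))
s1 = Snapshot(1, None, "append", (m1,))

# Second write: the new snapshot reuses m1 and adds a manifest for the new file.
m2 = Manifest(frozenset({"data/b.parquet"}))
s2 = Snapshot(2, 1, "append", (m1, m2))

print(sorted(s1.data_files()))
print(sorted(s2.data_files()))
```

Because the second snapshot reuses the first manifest rather than copying it, both snapshots stay readable at the same time, which is what makes time travel and rollback cheap.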
Why manage snapshots?
- Excessive snapshots bloat metadata and slow down query planning
- Old snapshots retain references to data files, preventing storage reclamation
- Without expiration, the snapshot count grows without bound with every write
- LakeOps Insights flags tables with a high ratio of obsolete snapshots
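To make the storage point concrete, the sketch below computes which data files become reclaimable once a set of snapshots is expired. It is a minimal sketch under assumptions: the `reclaimable_files` helper and the dictionary layout are hypothetical and do not reflect how LakeOps or Iceberg track references internally.

```python
def reclaimable_files(snapshot_files, expired_ids):
    """Return data files referenced only by expired snapshots.

    snapshot_files: dict mapping snapshot id -> set of data file paths
                    (hypothetical layout for illustration).
    expired_ids:    ids of the snapshots that would be expired.
    """
    expired = set(expired_ids)

    # Files that at least one surviving snapshot still needs.
    still_referenced = set()
    for sid, files in snapshot_files.items():
        if sid not in expired:
            still_referenced |= files

    # Files touched by the expired snapshots.
    candidates = set()
    for sid in expired:
        candidates |= snapshot_files.get(sid, set())

    return candidates - still_referenced


history = {
    1: {"a.parquet"},               # initial append
    2: {"a.parquet", "b.parquet"},  # second append
    3: {"c.parquet"},               # overwrite that replaced a and b
}

print(sorted(reclaimable_files(history, expired_ids=[1])))
print(sorted(reclaimable_files(history, expired_ids=[1, 2])))
```

Expiring only snapshot 1 frees nothing, because snapshot 2 still references `a.parquet`. Storage is reclaimed only once every snapshot that references a file has been expired.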
Snapshot explorer
Navigate to Explore and select any table, then click the Snapshots tab. The explorer shows a full history table with:
| Column | Description |
|---|---|
| # | Row number (chronological order) |
| Snapshot ID | Unique 64-bit identifier (up to 19 digits) |
| Date | When the snapshot was created |
| Operation | Append, Overwrite, Delete, Replace |
| References | Branch/tag labels (e.g. main, latest, custom tags) |
| Actions | Inspect, Compare, Time Travel, and More options per row |
Below the table, the Version History & Time Travel panel shows aggregate stats: total snapshot count, retention policy, and the latest operation.
Snapshot actions
The Snapshots tab provides both global actions (top bar) and per-row actions:
Global actions
Per-row actions
- Inspect: view full snapshot details (manifests, added/removed files, parent snapshot)
- Compare: side-by-side diff of this snapshot against another
- Time Travel: query the table as it existed at this point in time
Time travel
Time travel lets you query a table as it existed at any historical snapshot. This is useful for:
- Debugging data issues by comparing current vs historical state
- Auditing what data was visible at a specific time for compliance
- Recovering accidentally deleted data by reading from a pre-delete snapshot
- Testing query results against different versions of the data
Time travel reads are non-destructive and don't affect the current table state.
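Querying "as of" a point in time comes down to finding the latest snapshot committed at or before that time. The following is a rough sketch of that lookup, assuming a sorted in-memory history; the `snapshot_as_of` function and the snapshot ids are made up for illustration and are not a LakeOps API.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def snapshot_as_of(history, when):
    """Return the id of the latest snapshot committed at or before `when`.

    history: list of (commit_time, snapshot_id) tuples sorted by commit_time.
    Returns None if the table had no snapshot yet at that time.
    """
    times = [committed for committed, _ in history]
    idx = bisect_right(times, when)  # number of snapshots committed <= when
    if idx == 0:
        return None
    return history[idx - 1][1]


utc = timezone.utc
history = [
    (datetime(2024, 1, 1, tzinfo=utc), 101),
    (datetime(2024, 1, 2, tzinfo=utc), 102),
    (datetime(2024, 1, 3, tzinfo=utc), 103),
]

print(snapshot_as_of(history, datetime(2024, 1, 2, 12, tzinfo=utc)))
```

The lookup only reads the history, which mirrors the non-destructive nature of time travel: nothing about the current table state changes.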
Rollback
Rolling back sets the current snapshot pointer to a previous version. The operation is:
- Atomic: the table transitions instantly from one state to another
- Safe for concurrent readers: existing queries finish on the old snapshot
- Reversible: you can roll forward again to any later snapshot, as long as it has not been expired
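These three properties follow from rollback being a pointer move rather than a data rewrite. The toy model below illustrates the idea; `TableState` and its methods are hypothetical names invented for this sketch, not part of LakeOps or Iceberg.

```python
class TableState:
    """Hypothetical, minimal model of a table's current-snapshot pointer."""

    def __init__(self):
        self.snapshots = {}     # snapshot id -> contents of that version
        self.current_id = None  # pointer that new readers resolve at query start

    def commit(self, snapshot_id, contents):
        self.snapshots[snapshot_id] = contents
        self.current_id = snapshot_id

    def set_current(self, snapshot_id):
        # Rollback and roll-forward are the same move: repoint to a retained snapshot.
        if snapshot_id not in self.snapshots:
            raise KeyError(f"snapshot {snapshot_id} is not retained")
        self.current_id = snapshot_id

    def read(self, snapshot_id=None):
        # A reader that pinned a snapshot id keeps seeing that version.
        sid = self.current_id if snapshot_id is None else snapshot_id
        return self.snapshots[sid]


table = TableState()
table.commit(1, ["row-a"])
table.commit(2, ["row-a", "row-b"])
table.commit(3, ["row-a"])          # a delete removed row-b

pinned = table.current_id           # an in-flight query pinned snapshot 3
table.set_current(2)                # roll back to before the delete

print(table.read())                 # new readers see snapshot 2
print(table.read(pinned))           # the in-flight query still sees snapshot 3
table.set_current(3)                # roll forward again
```

No snapshot is modified or removed by the pointer move, so the in-flight reader is unaffected and the later snapshot remains available to roll forward to.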
Auto vs. Manual mode
Snapshot retention supports two execution modes, toggled at the top of the Snapshot Retention card in the Optimization tab:
Auto (autopilot)
LakeOps expires snapshots autonomously on the configured cron schedule. Ideal for production tables where snapshot buildup must be kept in check without manual oversight.
Safety-aware: always respects the minimum snapshot count before expiring.
Manual (on-demand)
You control when expiration runs. Good for tables where you want to review simulation results before committing, or in staging environments.
The retention configuration is saved; trigger execution whenever you're ready.
Retention configuration
Configure automatic snapshot expiration from the Explore > Optimization tab under the Snapshot Retention card, or via organization-wide Policies.
| Setting | Description |
|---|---|
| Snapshots retention period | How long to keep snapshots before they become eligible for expiration. Use table config value, or specify a custom value in days. |
| Minimum snapshots to retain | Minimum number of snapshots to always keep, regardless of age. Ensures you always have rollback points. |
| Delete associated Metadata Files | When enabled, removes manifest lists and manifest files that are only referenced by expired snapshots. |
| Create associated files | Create cleanup records for data files only referenced by expired snapshots. |
| Schedule (cron) | Default: 0 0 * * * * (six fields, seconds first: runs at the top of every hour). Controls how often LakeOps checks for expired snapshots. |
| Auto / Manual | Auto: runs expiration on schedule. Manual: triggered on demand. |
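The interaction between the retention period and the minimum snapshot count can be sketched as follows. This is an illustrative simplification, not the LakeOps implementation: the `select_expired` function is hypothetical, it assumes a minimum of at least one snapshot, and it ignores branch and tag references, which would also protect snapshots in practice.

```python
from datetime import datetime, timedelta, timezone


def select_expired(snapshots, now, retention, min_to_keep):
    """Pick snapshot ids eligible for expiration, newest first.

    snapshots:   list of (snapshot_id, commit_time) tuples, in any order.
    retention:   timedelta; snapshots older than this are candidates.
    min_to_keep: the newest `min_to_keep` snapshots are kept regardless of age.
    """
    newest_first = sorted(snapshots, key=lambda s: s[1], reverse=True)
    protected = {sid for sid, _ in newest_first[:min_to_keep]}
    cutoff = now - retention
    return [
        sid
        for sid, committed in newest_first
        if committed < cutoff and sid not in protected
    ]


utc = timezone.utc
now = datetime(2024, 1, 10, tzinfo=utc)
snapshots = [
    (1, datetime(2024, 1, 1, tzinfo=utc)),
    (2, datetime(2024, 1, 3, tzinfo=utc)),
    (3, datetime(2024, 1, 8, tzinfo=utc)),
    (4, datetime(2024, 1, 9, tzinfo=utc)),
]

# 5-day retention, keep at least 2: snapshots 1 and 2 are old enough to expire.
print(select_expired(snapshots, now, timedelta(days=5), min_to_keep=2))

# Same retention, keep at least 3: snapshot 2 is now protected by the minimum.
print(select_expired(snapshots, now, timedelta(days=5), min_to_keep=3))
```

The minimum count acts as a floor: even with a very short retention period, the newest snapshots survive so that a rollback point is always available.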
Using the snapshot retention UI
In the Optimization tab, the Snapshot Retention card provides the following controls:
Monitoring snapshot operations
After expiration runs, verify through:
- Events tab: shows “Expire Snapshots” operations with the number of snapshots expired and the run duration
- Snapshots tab: the history table reflects the reduced snapshot count
- Insights: “Excessive Snapshots” warnings should resolve
