Deep dives on data lakes

Guides and insights from the LakeOps team on Apache Iceberg, lakehouse architecture, and production operations.

Featured

LakeOps dashboard showing optimization activity, key metrics, and recent operations across production Iceberg tables

Managed Iceberg in 2026: Autonomous Data Lake

Iceberg tables degrade silently — small files pile up, snapshots bloat metadata, and query latency creeps higher. A breakdown of the nine components every production data lake needs to stay healthy — starting with observability and telemetry collection, through compaction, snapshot management, and lake-wide policies, to multi-engine routing and agentic AI enablement.

Apache Iceberg, LakeOps, FinOps, Data Platforms, Observability, Cloud Cost
Jonathan Saring
23 min read

Latest articles

AWS Glue Iceberg Optimization — an S3 bucket with scattered data objects funneled through an optimization lens into a geometric iceberg, with icons for Search, Analytics, and Tuning
Apache Iceberg, AWS Glue, Compaction, Table Maintenance

AWS Glue Iceberg Optimization: A Practical Guide

AWS Glue provides native Iceberg support for cataloging, ETL, and built-in table maintenance — but production lakehouses hit limitations fast. This guide covers Glue catalog configuration, ETL best practices, compaction tuning, common pitfalls, and how a dedicated control plane fills the operational gaps.

David W
20 min read
Databricks to Iceberg smooth migration — Databricks and Apache Iceberg connected by a data bridge, with table data flowing into an open Iceberg lakehouse
Databricks, Apache Iceberg, LakeOps, Delta Lake

Databricks to Iceberg Smooth Migration

Migrating from Databricks to Iceberg opens up a multi-engine lakehouse rather than forcing a platform exit. Databricks stays central for ML and Spark; Iceberg adds Trino, Snowflake, and open catalogs. The guide covers five tools: LakeOps, UC managed Iceberg, Delta UniForm, Spark, and Lakehouse Federation.

David W
18 min read
Apache Iceberg with dbt Optimization — dbt logo above SQL model cards flowing through a transformation pipeline into a geometric iceberg, with chart and analytics icons
Apache Iceberg, dbt, Incremental Models, Data Lakehouse

Apache Iceberg with dbt: Optimization Guide

dbt transforms your data — but who maintains the Iceberg tables underneath? A practical guide to dbt adapters, incremental strategies, table properties, and the maintenance gap that every dbt + Iceberg team hits in production.

Rob M
16 min read
Apache Iceberg with Flink Optimization — Flink squirrel mascot with streaming data flowing through an optimization ring into a geometric iceberg, with performance metric icons
Apache Iceberg, Apache Flink, Flink streaming, Iceberg compaction

Apache Iceberg with Flink: Streaming Optimization Guide

Flink streaming into Iceberg creates thousands of small files per hour. This guide covers checkpoint tuning, write distribution modes, Flink SQL patterns, and why external maintenance is essential for production streaming tables.

Amit Gilad
15 min read
Apache Iceberg Delete Files — stacked data blocks with pink delete file markers funneled through compaction into clean, optimized data with a performance gauge showing improved read speed
Apache Iceberg, delete files, merge-on-read, position deletes

Apache Iceberg Delete Files: Reducing Merge-on-Read Overhead

Delete files let Iceberg avoid rewriting data on every UPDATE or DELETE — but every unresolved delete file forces readers to reconcile at query time. A deep guide to position deletes, equality deletes, measuring overhead, and resolving accumulation before it tanks performance.

David W
17 min read
Apache Iceberg Table Partitioning Best Practices — a geometric iceberg branching into date, region, and category partition columns, each with table and folder icons showing the partition hierarchy
Apache Iceberg, Partitioning, Hidden Partitioning, Partition Evolution

Apache Iceberg Table Partitioning Best Practices

Partitioning determines how much data every query must scan. Apache Iceberg's hidden partitioning and partition evolution change the game — but choosing the wrong strategy still creates performance cliffs. A practical guide to transforms, sizing, evolution, and avoiding the small-files trap.

Chris P
18 min read