LakeOps Blog

Insights, tutorials, and best practices for data lakehouse architecture and optimization

Incremental Processing with Apache Iceberg & Spark: A Comprehensive Guide
Apache Iceberg, Spark, Data Processing

Learn how to implement efficient incremental processing with Apache Iceberg and Spark, including best practices for data lake optimization and performance tuning.

Amit Gilad
Sep 17, 2024
9 min read
Delta Lake vs Apache Iceberg: Choosing the Right Table Format
Delta Lake, Apache Iceberg, Comparison, Data Lakes

A detailed comparison between Delta Lake and Apache Iceberg, exploring their architectures, performance characteristics, and ideal use cases to help you make the right choice.

Amit Gilad
Jan 30, 2025
10 min read
Cracking the Ice: The Battle Between Sort and Binpack in Apache Iceberg
Apache Iceberg, Compaction, Data Lakes, Performance

Unlocking performance vs. optimizing storage — choosing the right compaction strategy for your data lake.

Amit Gilad
May 7, 2025
7 min read
Why Every Data Lake Needs a Control Plane: Lessons from Apache Iceberg
Apache Iceberg, Compaction, Data Lakes, Performance

Apache Iceberg delivers speed, but without a control plane, snapshots pile up, costs surge, and queries take longer. The lessons start with snapshot expiration.

Amit Gilad
Sep 11, 2025
8 min read
From 350TB to 230TB in 10 Minutes: The Hidden Weight of Stale Data
Apache Iceberg, Expire Snapshots, Data Lakes, Cost

See how a 350TB data lake shrank to 230TB in 10 minutes by removing stale data—saving 34% in AWS S3 costs and proving the need for a control plane.

Amit Gilad
Sep 23, 2025
5 min read