Understanding Apache Iceberg Architecture
.MP4, AVC, 1280x720, 30 fps | English, AAC, 2 Ch | 32m | 80.2 MB
Instructor: Russ Thomas
.MP4, AVC, 1280x720, 30 fps | English, AAC, 2 Ch | 32m | 80.2 MB
Instructor: Russ Thomas
You’re familiar with the “what” and “why” of Apache Iceberg. This internals course will focus on the “how” of Apache Iceberg: specifically, how it is able to accomplish what it does as an open-storage format powering petabyte-scale data lakes.
What you'll learn
Many courses and white sheets on Apache Iceberg do a good job of describing what it is and why you might want to consider it as a storage format for your data lake or lake house platform.
In this course, Understanding Apache Iceberg Architecture, you’ll gain an understanding of the internals of Apache Iceberg.
First, you’ll explore Apache Iceberg’s layered approach to data and meta-data storage and management, and how Iceberg is able to fully decouple storage from the query engine and catalog.
Next, you’ll discover how Iceberg handles schema evolution gracefully while maintaining full ACID compliance, allowing data engineers to work with a petabyte-scale platform as they would a traditional data platform like SQL Server or Oracle.
Finally, you’ll learn how Iceberg manages and facilitates storage and query optimizations such as partitioning, compaction, and garbage collection.
When you’re finished with this course, you’ll have the skills and knowledge of Apache Iceberg architecture needed to fully assess and support it in your own data project.