Optimize Storage and Performance in Azure Databricks with Delta Lake
.MP4, AVC, 1280x720, 30 fps | English, AAC, 2 Ch | 1h 34m | 245 MB
Instructor: Janani Ravi
Delta Lake is a storage layer that brings reliability and performance to data lakes. This course covers the core concepts of Delta Lake on Azure Databricks, along with batch and streaming operations, schema evolution, performance optimization, and Azure integration.
What you'll learn
Modern data lakes often suffer from data quality issues, inconsistent schemas, and poor performance when handling both batch and streaming data. Without a structured and reliable storage format, it's difficult to build trustworthy and maintainable analytics workflows.
In this course, Optimize Storage and Performance in Azure Databricks with Delta Lake, you’ll gain the ability to effectively manage and operate on large-scale data using the Delta Lake format.
First, you’ll explore the fundamentals of Delta Lake, including its role in the Lakehouse architecture, how it builds on Parquet with transaction logs, and its support for ACID transactions and schema enforcement.
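To make those fundamentals concrete, here is a minimal PySpark sketch (illustrative only, not taken from the course) showing how a Delta write produces Parquet data files plus a _delta_log transaction log, and how schema enforcement rejects an incompatible append. The path /tmp/demo/events is hypothetical.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # preconfigured on Azure Databricks

# Writing in Delta format stores Parquet data files plus a _delta_log/
# directory of JSON commit entries that provide ACID guarantees.
df = spark.createDataFrame([(1, "click"), (2, "view")], ["id", "event"])
df.write.format("delta").mode("overwrite").save("/tmp/demo/events")  # hypothetical path

# Schema enforcement: an append with a mismatched schema fails loudly
# instead of silently corrupting the table.
bad = spark.createDataFrame([("oops",)], ["wrong_column"])
try:
    bad.write.format("delta").mode("append").save("/tmp/demo/events")
except Exception as e:
    print("Rejected by schema enforcement:", type(e).__name__)
```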
Next, you’ll discover how to work with Delta Lake in practice—converting data into Delta format, performing batch and streaming operations, and managing schema evolution using Apache Spark.
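As a hedged sketch of that workflow (paths and sample data are assumptions, not course material), the snippet below converts Parquet data in place, performs a batch MERGE upsert, evolves the schema with mergeSchema, and reads the table as a stream:

```python
from pyspark.sql import SparkSession
from delta.tables import DeltaTable

spark = SparkSession.builder.getOrCreate()

# Convert an existing Parquet directory to Delta format in place.
DeltaTable.convertToDelta(spark, "parquet.`/tmp/demo/parquet_src`")  # hypothetical path

# Batch operation: upsert new rows with MERGE.
target = DeltaTable.forPath(spark, "/tmp/demo/events")
updates = spark.createDataFrame([(2, "purchase")], ["id", "event"])
(target.alias("t")
    .merge(updates.alias("u"), "t.id = u.id")
    .whenMatchedUpdateAll()
    .whenNotMatchedInsertAll()
    .execute())

# Schema evolution: mergeSchema admits a compatible new column.
extra = spark.createDataFrame([(3, "view", "mobile")], ["id", "event", "device"])
(extra.write.format("delta").mode("append")
    .option("mergeSchema", "true").save("/tmp/demo/events"))

# Streaming: a Delta table serves as both streaming source and sink.
stream = (spark.readStream.format("delta").load("/tmp/demo/events")
    .writeStream.format("delta")
    .option("checkpointLocation", "/tmp/demo/_chk")
    .start("/tmp/demo/events_copy"))
```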
Finally, you’ll learn how to optimize Delta Lake performance using features like Z-ordering, data skipping, and caching, and how to maintain data integrity with time travel, retention policies, and Azure-native monitoring tools.
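An illustrative sketch of those maintenance features (table path hypothetical; OPTIMIZE, ZORDER BY, and CACHE SELECT are Databricks-specific commands):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Z-ordering co-locates related values so data skipping can prune files.
spark.sql("OPTIMIZE delta.`/tmp/demo/events` ZORDER BY (id)")

# Prewarm the Databricks disk cache for frequently scanned data.
spark.sql("CACHE SELECT * FROM delta.`/tmp/demo/events`")

# Time travel: read an earlier version of the table.
v0 = spark.read.format("delta").option("versionAsOf", 0).load("/tmp/demo/events")

# Retention: VACUUM deletes files no longer referenced by versions
# inside the retention window (168 hours is the 7-day default).
spark.sql("VACUUM delta.`/tmp/demo/events` RETAIN 168 HOURS")
```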
When you’re finished with this course, you’ll have the skills and knowledge needed to confidently use Delta Lake on Azure as a foundational layer for reliable, scalable, and well-governed data workflows.