Ultimate Azure Data Factory: Cloud Data Engineering
Published 12/2023
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 3.35 GB | Duration: 7h 51m
Real-world Modern Data Warehouse project for Data Engineers using Azure Data Factory, SQL, Data Lake, and Databricks
What you'll learn
You will learn how to build data pipelines in Azure Data Factory (ADF) through a step-by-step approach.
You will learn how to ingest data in different formats into Azure Data Lake Gen2 using Azure Data Factory (ADF)
You will learn how to use and build various types of transformations in Azure Data Factory (ADF)
You will learn hands-on implementations of building generic artifacts in Azure Data Factory (ADF) such as Flowlets and Templates
You will learn how to transform data into the Medallion layers in Azure Data Lake Gen2 using Data Flows in Azure Data Factory (ADF)
You will learn how to implement ETL/ELT using Azure Data Factory (ADF) in order to implement a Data Warehouse
You will learn how to create generic metadata driven pipelines in Azure Data Factory (ADF) to implement the ETL/ELT processes
You will learn the concepts of the Modern Data Warehouse Architecture and the Delta Lake
You will learn the concepts of Slowly Changing Dimensions and how to implement them in Azure Data Factory (ADF)
You will learn how to load transformed data from Azure Data Lake Storage Gen2 to Azure SQL Database using Azure Data Factory (ADF)
You will learn how to implement a Delta Lake using Databricks Notebook Activity in Azure Data Factory (ADF) and load into Azure Data Lake Storage Gen2
You will learn how to transform your raw data into a finished data warehouse using Azure Data Factory (ADF) and then visualize it in Power BI
You will learn how to build pipelines using good practices and naming standards as in a typical real-world data engineering project
You will learn how to implement different types of Triggers in Azure Data Factory (ADF) and how to schedule your data pipelines
You will learn how to monitor pipelines using Azure Data Factory (ADF), Azure Monitor, and how to recover from pipeline failures
By the end of this course you will have learnt all the Azure Data Factory topics required to pass the Azure Data Engineer Associate Certification Exam DP-203
Requirements
Basic understanding of SQL will be beneficial
Basic understanding of cloud computing will be beneficial
Experience in Azure is not required; we will learn it step by step as we build the project
Understanding of data warehouses will be beneficial but is not necessary; we will learn it as we build the project in the course
An Azure account is required for the course; we will learn how to create one during the course
Description
Welcome!
Data engineering is a thriving focus in the IT industry, with Microsoft's Azure Data Factory emerging as a sought-after tool in cloud-based data engineering.
Join this course for a step-by-step journey into mastering Azure Data Factory (ADF). Using a real-world scenario of an e-commerce company grappling with data integration and insights, we'll explore the data of an online wine retailer, showcasing how implementing a modern data warehouse with ADF can provide solutions.
Distinguishing itself from other Udemy offerings on Azure Data Factory and Data Engineering Technologies, this course guides you hands-on in transforming raw data into a Modern Data Warehouse using Azure Data Factory (ADF). Upon completion, you'll be proficient in ADF and ready to tackle real-world data engineering projects.
Given the course's focus on real-world business scenarios, it adopts a sequential approach mirroring how such requirements unfold in actual projects. This method ensures you not only implement business needs but also grasp the technical concepts explained at each stage of implementing data pipelines with Azure Data Factory (ADF).
This course covers more than just modern data warehouse concepts like architecture, medallion layers, and delta lake. You'll also gain expertise in using diverse Azure ecosystem solutions, including Azure Data Lake Storage, Azure SQL Database, and Azure Databricks. Additionally, you'll learn to visually represent the completed data warehouse through Power BI reports.
This course enables you to grasp the concepts and skills assessed in the Azure Data Engineer Associate Certification exam DP-203. While it equips you with the necessary skills, the course is designed for comprehensive learning rather than solely for passing the certification.
I appreciate your time, and I've crafted this course to be practical and focused. I aim for simplicity and conciseness, starting from the basics and ensuring proficiency in the technologies covered.
Currently the course teaches you the following:
Azure Data Factory
Constructing a Modern Data Warehouse architecture for a data engineering solution using Azure data engineering technologies such as Azure Data Factory (ADF), Azure Data Lake Gen2, Azure SQL Database, Azure Databricks, Azure Key Vault, and Microsoft Power BI.
Ingesting data from varied sources and in diverse formats into Azure Data Lake Gen2 using Azure Data Factory.
Comprehending Azure concepts, including resources and their provisioning methods.
Learning to incorporate and use tools such as Azure Storage Explorer, Azure Data Studio, and Visual Studio Code in the development workflow.
Implementing Azure Data Factory (ADF) pipelines using different control flow activities such as Get Metadata, ForEach, If Condition, etc.
Using Parameters and Variables in Pipelines, Datasets and Linked Services to create generic parameter-driven pipelines in Azure Data Factory (ADF).
Using parameters in conjunction with Azure Key Vault to create generic parameter-driven pipelines in Azure Data Factory (ADF).
Implementing Mapping Data Flows to create transformation logic that handles a variety of transformation scenarios such as Filter, Conditional Split, Derived Column, Aggregate, Join, Select, and Sink transformations.
Developing reusable components in data pipelines, such as Flowlets, and speeding up development with pre-built pipeline templates.
Learning how to implement error handling in data pipelines and control pipeline flow.
Implementing data quality rules using the Assert transformation within a data pipeline.
Implementing data pipelines to handle common slowly changing dimension scenarios such as SCD Type 1 and SCD Type 2.
Implementing data pipelines to build a Fact table.
Learning how to debug data pipelines and resolve issues.
Implementing pipeline scheduling using different types of triggers such as Event Trigger, Schedule Trigger and Tumbling Window Trigger in Azure Data Factory (ADF).
Implementing Azure Data Factory pipelines to invoke and execute Mapping Data Flows.
Creating ADF pipelines to execute Databricks Notebook activities to carry out transformations and implement a Delta Lake table.
Creating pipeline dependencies and using the Execute Pipeline activity to orchestrate the ETL/ELT process.
Implementing trigger dependencies to understand how to chain pipelines and orchestrate the data flow.
Monitoring data pipelines, creating alert notifications, and reporting data factory metrics using Azure Data Factory Monitor.
Monitoring Azure Data Factory pipelines in Azure Monitor using Data Factory-specific metrics.
Modern Data Warehouse
Understand the different types of Data Warehouse Architectures.
Understand the concepts of a Delta Lake.
Understand the Dimensional Model and a Star Schema based Data Warehouse.
Understand the concept of Medallion Layers and how to implement them within Azure Data Lake Storage.
Azure Databricks
Understand how to create an Azure Databricks Workspace and Databricks clusters, mount storage accounts, create Databricks notebooks, perform transformations using Databricks notebooks, and invoke Databricks notebooks from Azure Data Factory.
Understand the implementation of a Delta Lake table using the Azure Databricks Notebook activity from an Azure Data Factory pipeline.
Understand the concepts of Optimizing a Delta Lake Table, Time Travel, Vacuuming, and Delta Logs.
Azure Resources and Azure Storage Solutions
Learn the different approaches to creating Azure Resources.
Learn how to create an Azure Storage Account resource, create containers, and upload data into the Azure storage resource through the Azure Portal or Azure Storage Explorer.
Learn how to create an Azure SQL Database resource, understand the Pricing Tiers, create an Admin User, create tables, load data, query the database, and interact with Azure SQL Database through Azure Data Studio.
Overview
Section 1: Overview
Lecture 1 Welcome
Lecture 2 What you will learn?
Lecture 3 Goal of this course
Lecture 4 Commitment
Lecture 5 Course Materials
Lecture 6 Course Slides
Section 2: Introduction
Lecture 7 Introduction to Azure Data Factory
Lecture 8 Why Azure Data Factory?
Lecture 9 What is Azure Data Factory?
Lecture 10 Benefits of Azure Data Factory
Lecture 11 Azure Account
Lecture 12 User Interface Azure Portal
Lecture 13 Module Summary
Section 3: Project Overview
Lecture 14 Hands-On Project Overview
Lecture 15 Business Case for the Project
Lecture 16 Solution Requirements
Lecture 17 Architectural Patterns
Lecture 18 Modern Data Warehouse Architecture
Lecture 19 Hands-On Project Architecture
Lecture 20 Repositories
Lecture 21 Module Summary
Section 4: Environment
Lecture 22 Module Overview
Lecture 23 Software Tools
Lecture 24 Software Tools Setup
Lecture 25 Azure Resources
Lecture 26 Setup Azure Resources
Lecture 27 Setup Azure Resources in Azure Portal
Lecture 28 Setup Azure Resource Group
Lecture 29 Setup Azure Data Lake Storage
Lecture 30 Setup Azure Data Factory Resource
Lecture 31 Setup Azure Sql DB Resource
Lecture 32 Review Azure Resources
Lecture 33 Setup Azure Data Studio
Lecture 34 Setup Azure Storage Explorer
Lecture 35 Module Summary
Section 5: Building a Data Pipeline
Lecture 36 Module Overview
Lecture 37 Building Blocks of Azure Data Factory - Main Components
Lecture 38 Building Blocks of Azure Data Factory - Pipelines and Activities
Lecture 39 Building Blocks of Azure Data Factory - How they Tie Together
Lecture 40 Azure Data Factory User Interface - Main Page
Lecture 41 Azure Data Factory User Interface - Authoring Canvas
Lecture 42 Data Sources
Lecture 43 Data Sources - Data Ingestion
Lecture 44 Data Sources - Data Organization
Lecture 45 Building the Data Pipeline
Lecture 46 Building the Data Pipeline - Creating the Containers
Lecture 47 Building the Data Pipeline - Creating the Pipeline
Lecture 48 Building the Data Pipeline - Review and Organize
Lecture 49 Importing Semi-Structured Data
Lecture 50 Importing Semi-Structured Data - Building the Pipeline
Lecture 51 Importing Semi-Structured Data - Organizing the Pipeline
Lecture 52 Importing Semi-Structured Data - Recap of the Lesson
Lecture 53 Naming Conventions
Lecture 54 Module Summary
Section 6: Pipeline Activities and Parameters
Lecture 55 Module Overview
Lecture 56 Activities
Lecture 57 Activity Dependencies
Lecture 58 Activity Dependencies - Examples
Lecture 59 Copy Activity
Lecture 60 Copy Activity Concepts - Examples
Lecture 61 Expressions and Variables
Lecture 62 Expressions and Variables - Examples
Lecture 63 Parameters
Lecture 64 Parameters - Examples
Lecture 65 Azure Key Vault - Overview
Lecture 66 Azure Key Vault - Setup
Lecture 67 Azure Key Vault - Create Linked Service
Lecture 68 Importing Semi-Structured Data
Lecture 69 Module Summary
Section 7: Mapping Data Flows
Lecture 70 Module Overview
Lecture 71 Introduction to Mapping Data Flows
Lecture 72 Scenarios for Mapping Data Flows
Lecture 73 User Interface of Mapping Data Flows
Lecture 74 User Interface of Mapping Data Flows - Debug Feature
Lecture 75 Implementing a Mapping Data Flow - Overview
Lecture 76 Implementing a Mapping Data Flow - Pipeline and Data Sources
Lecture 77 Implementing a Mapping Data Flow - Adding Transformations
Lecture 78 Implementing a Mapping Data Flow - Pipeline Execution
Lecture 79 Mapping Data Flow - Concepts
Lecture 80 Mapping Data Flow - Concepts Example
Lecture 81 Performance of Mapping Data Flows - Integration Runtime
Lecture 82 Performance of Mapping Data Flows
Lecture 83 Module Summary
Section 8: Implementing Flowlets
Lecture 84 Module Overview
Lecture 85 Introduction to Flowlets
Lecture 86 Scenarios for Flowlets
Lecture 87 User Interface of Flowlets - Overview
Lecture 88 User Interface of Flowlets - Create a Demo Flowlet
Lecture 89 Implementing a Flowlet - Create Flowlet
Lecture 90 Implementing a Flowlet - Use the Flowlet
Lecture 91 Module Summary
Section 9: Controlling Pipeline Flow
Lecture 92 Module Overview
Lecture 93 Asserts
Lecture 94 Implementing Asserts - Assert Expect True
Lecture 95 Implementing Asserts - Identifying Error Rows
Lecture 96 Implementing Asserts - Processing Error Rows
Lecture 97 Error Handling Overview
Lecture 98 Implementing Error Handling - Fail Activity
Lecture 99 Implementing Error Handling - Capturing Errors
Lecture 100 Implementing Error Handling - Logging Errors
Lecture 101 Implementing Error Handling - Review of Error Pipeline
Lecture 102 Integrating Data Quality and Error Handling
Lecture 103 Building Pipelines using Pre-Built Templates
Lecture 104 Module Summary
Section 10: Building the Data Warehouse - Part 1
Lecture 105 Module Overview
Lecture 106 Data Warehouse Overview
Lecture 107 Data Warehouse Models
Lecture 108 Data Warehouse Vino World
Lecture 109 Data Process
Lecture 110 Building the Azure Sql Database - Create the Stage Tables
Lecture 111 Building the Azure Sql Database - Create the DW Tables
Lecture 112 Building the Staging Layer Master Data - Master data
Lecture 113 Building the Staging Layer Master Data - Product data
Lecture 114 Building the Staging Layer Master Data - Metadata approach
Lecture 115 Building the Staging Layer Master Data - Create Parameter Datasets
Lecture 116 Building the Staging Layer Master Data - Create Metadata Pipeline
Lecture 117 Building the Staging Layer Master Data - Pipeline execution
Lecture 118 Building the Staging Layer Transaction Data
Lecture 119 Building the Staging Layer Product Data - Combine Product Data
Lecture 120 Building the Staging Layer Transaction Data - Combine Sales Data
Lecture 121 Module Summary
Section 11: Building the Data Warehouse - Part 2
Lecture 122 Module Overview
Lecture 123 Dimensions - Overview of Dimensions
Lecture 124 Dimensions - Slowly Changing Dimensions
Lecture 125 Dimensions - Master Dimensions and SCD Type
Lecture 126 Building Type1 Dimensions - Using Data Flows
Lecture 127 Building Type 1 Dimensions - Pipeline Review
Lecture 128 Building Type 1 Dimensions - Using Stored Procedures
Lecture 129 Dimensions - Overview of Type2 Dimensions
Lecture 130 Building Type 2 Dimensions - Product Dimension
Lecture 131 Building Type 2 Dimensions - Using Data Flows - Step1
Lecture 132 Building Type 2 Dimensions - Using Data Flows - Step2
Lecture 133 Building Type 2 Dimensions - Pipeline Review
Lecture 134 Building Type 2 Dimensions - Using Stored Procedures
Lecture 135 Building Dimensions - Build remaining dimensions
Lecture 136 Facts - Overview
Lecture 137 Building Facts
Lecture 138 Data Warehouse Review and Data Analysis
Lecture 139 Module Summary
Section 12: Building the Delta Lake
Lecture 140 Module Overview
Lecture 141 Recap of what we implemented
Lecture 142 What we will implement
Lecture 143 Azure Databricks
Lecture 144 What is Azure Databricks
Lecture 145 Core Artifacts of Azure Databricks
Lecture 146 Setup Azure Databricks
Lecture 147 Setup Databricks Resource
Lecture 148 Databricks UI Overview
Lecture 149 Databricks Cluster Overview
Lecture 150 Create Databricks Cluster
Lecture 151 Azure Service Principal and Access to Data Lake Storage
Lecture 152 Mount Azure Data Lake Storage
Lecture 153 Overview of Delta Lake Implementation
Lecture 154 What is a Delta Lake
Lecture 155 Create Data Source for the Delta Table
Lecture 156 Create Delta Table
Lecture 157 Load Delta Table
Lecture 158 Update Delta Table
Lecture 159 Delta Table Concepts
Lecture 160 Create Linked Service to Databricks from Data Factory
Lecture 161 Executing Databricks Notebook from Data Factory
Lecture 162 Module Summary
Section 13: Presentation Layer
Lecture 163 Module Overview
Lecture 164 Overview - Modern Data Warehouse
Lecture 165 Overview - What we implemented
Lecture 166 Overview - What we will implement
Lecture 167 PowerBI - Installation
Lecture 168 PowerBI - Overview
Lecture 169 PowerBI - Connecting to the Data Warehouse
Lecture 170 PowerBI - Building the Tabular Model
Lecture 171 PowerBI - Building the Report
Lecture 172 PowerBI - Report Requirements
Lecture 173 PowerBI - Report Review
Lecture 174 Module Summary
Section 14: Triggers
Lecture 175 Module Overview
Lecture 176 Overview of Triggers
Lecture 177 Approach to Pipeline Execution
Lecture 178 Implementing a Master Pipeline
Lecture 179 Executing the Master Pipeline
Lecture 180 Implementing Event-based triggers
Lecture 181 Executing Event-based Triggers
Lecture 182 Scheduling Pipelines
Lecture 183 Creating a Tumbling Window Trigger
Lecture 184 Module Summary
Section 15: Monitoring
Lecture 185 Module Overview
Lecture 186 Executing Event-based triggers
Lecture 187 Overview of Data Factory Monitoring
Lecture 188 What do we monitor in Azure Data Factory
Lecture 189 Visual Monitoring in Azure Data Factory
Lecture 190 Pipeline Recovery
Lecture 191 Setup Alerts
Lecture 192 Validate the Alert
Lecture 193 Metrics
Lecture 194 Module Summary
Section 16: Conclusion
Lecture 195 Summary
Who this course is for
Beginners or Students who want to break into the Data Engineering field
Developers who want to learn Data Engineering
Data Engineers who want to learn how to implement a Modern Data Warehouse through a step-by-step approach
Data Engineers/Data Warehouse developers who want to get the skills necessary to implement cloud-based data engineering solutions
Data Engineers who want to understand how to build an end-to-end solution using Azure Data Factory (ADF)