    Basics To Advanced: Azure Synapse Analytics Hands-On Project

    Posted By: ELK1nG
    Published 8/2023
    MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
    Language: English | Size: 6.86 GB | Duration: 18h 40m

    Build a complete project using only Azure Synapse Analytics, focused on PySpark and including Delta Lake and Spark optimizations

    What you'll learn

    Understand Azure Synapse Analytics services in practice

    Gain a complete, basic-to-advanced understanding of Azure Synapse Analytics

    Gain hands-on experience in applying Spark optimization techniques to real-world scenarios, achieving faster insights.

    Understand 50+ most commonly used PySpark Transformations

    Acquire a comprehensive library of 45+ PySpark notebooks for data cleansing, enrichment, and transformation.

    Hands-on learning on building a modern data warehouse using Azure Synapse

    Explore the capabilities of Spark Pools and their role in processing large-scale data workloads

    Understand how Python is used in data engineering

    Understand and transform data with Serverless SQL pool

    Understand the principles and advantages of Delta Lake as a reliable data storage and management solution.

    Learn how Spark evolved and grew

    Gain insight into the services you need to know to clear DP-203

    Create and configure a Serverless SQL pool

    Create external data sources, external file formats, and external tables in a Serverless SQL pool

    Configure Spark Pools and understand how they work

    Understand the Integration of Power BI with Azure Synapse Analytics

    Create and work with a Dedicated SQL pool at a high level

    Optimize your PySpark code with Spark optimization techniques

    Learn the history of data processing before Spark

    Implement an incremental UPSERT using Delta Lake

    Understand and implement versioning in Delta Lake

    Use MSSparkUtils and learn what its utilities are for

    Learn how to mount a data lake in Synapse notebooks

    Requirements

    No Azure Synapse Analytics experience needed. You will learn everything you need.

    Basics of Python programming

    Basics of SQL language

    Description

    Are you ready to revolutionize your data analytics skills? Look no further. Welcome to this comprehensive course, where you'll delve deep into Azure Synapse Analytics with PySpark and come away equipped with the tools to excel in modern data analysis. Unlock the power of Azure Synapse Analytics!

    18.5+ hours of in-depth learning content!

    In this course we will be learning about:

    Serverless SQL Pool - Perform flexible querying of structured data and initial data exploration.

    Spark Pools - Dive into advanced data processing and analytics with the power of Apache Spark.

    Spark SQL - Seamlessly query structured data using Spark's SQL capabilities.

    MSSparkUtils - Leverage the MSSpark Utilities for enhanced Spark functionality in Synapse.

    50+ PySpark Transformations - Harness over 50 PySpark transformations to manipulate and refine your data.

    Dedicated SQL Pool - Report data efficiently to Power BI.

    Integrating Power BI with Azure Synapse Analytics - Seamlessly connect Power BI for enriched data visualization and insights.

    Delta Lake and its features - Integrate Delta Lake for reliable, ACID-compliant data.

    Spark Optimization Techniques - Employ optimization techniques to enhance Spark processing speed and efficiency.

    You will also learn how Python is helpful in data analysis. Our project-based approach ensures hands-on learning, giving you the practical experience needed to conquer real-world data challenges.

    While this course does not focus entirely on certification, you will also gain the practical understanding of the Azure Synapse Analytics service that is needed to pass DP-203 - "Microsoft Certified Azure Data Engineer" and DP-500 "Designing and Implementing Enterprise-Scale Analytics Solutions Using Microsoft Azure and Microsoft Power BI".

    Join me in mastering Azure Synapse Analytics!

    Overview

    Section 1: Introduction

    Lecture 1 Introduction

    Lecture 2 Project Architecture

    Lecture 3 Course Slides

    Section 2: Origin of Azure Synapse Analytics

    Lecture 4 Section Introduction

    Lecture 5 Need for a separate Analytical system

    Lecture 6 OLAP vs OLTP

    Lecture 7 A typical Datawarehouse

    Lecture 8 Datalake Introduction

    Lecture 9 Modern datawarehouse and its problem

    Lecture 10 The solution - Azure Synapse Analytics and its Components

    Lecture 11 Azure Synapse Analytics - A Single stop solution

    Lecture 12 Section Summary

    Section 3: Environment Setup

    Lecture 13 Section Introduction

    Lecture 14 Creating a resource group in Azure

    Lecture 15 Create Azure Synapse Analytics Service

    Lecture 16 Exploring Azure Synapse Analytics

    Lecture 17 Understanding the dataset

    Section 4: Serverless SQL Pool

    Lecture 18 Section Introduction

    Lecture 19 Serverless SQL Pool - Introduction

    Lecture 20 Serverless SQL Pool - Architecture

    Lecture 21 Serverless SQL Pool - Benefits and Pricing

    Lecture 22 Uploading files into Azure Datalake Storage

    Lecture 23 Initial Data Exploration

    Lecture 24 How to import SQL scripts or ipynb notebooks to Azure Synapse

    Lecture 25 Fixing the Collation warning

    Lecture 26 Creating External datasource

    Lecture 27 Creating database scoped credential Using SAS

    Lecture 28 Creating Database scoped cred using MI

    Lecture 29 Deleting existing data sources for cleanup

    Lecture 30 Creating an external file format - Demo

    Lecture 31 Creating an External File Format - Practical

    Lecture 32 Creating External DataSource for Refined container

    Lecture 33 Creating an External Table

    Lecture 34 End of section

    Section 5: History and Data processing before Spark

    Lecture 35 Section Introduction

    Lecture 36 Big Data Approach

    Lecture 37 Understanding Hadoop YARN - Cluster Manager

    Lecture 38 Understanding Hadoop - HDFS

    Lecture 39 Understanding Hadoop - MapReduce Distributed Computing

    Section 6: Emergence of Spark

    Lecture 40 Section Introduction

    Lecture 41 Drawbacks of MapReduce Framework

    Lecture 42 Emergence of Spark

    Section 7: Spark Core Concepts

    Lecture 43 Section Introduction

    Lecture 44 Spark EcoSystem

    Lecture 45 Difference between Hadoop & Spark

    Lecture 46 Spark Architecture

    Lecture 47 Creating a Spark Pool & its benefits

    Lecture 48 RDD Overview

    Lecture 49 Functions Lambda, Map and Filter - Overview

    Lecture 50 Understanding RDD in practical

    Lecture 51 RDD - Lazy loading - Transformations and Actions

    Lecture 52 What is RDD Lineage

    Lecture 53 RDD - Word count program - Demo

    Lecture 54 RDD - Word count - PySpark Program - Practical

    Lecture 55 Optimization - ReduceByKey vs GroupByKey Explanation

    Lecture 56 RDD - Understanding about Jobs in spark Practical

    Lecture 57 RDD - Understanding Narrow and Wide Transformations

    Lecture 58 RDD - Understanding Stages - Practical

    Lecture 59 RDD - Understanding Tasks - Practical

    Lecture 60 Understand DAG, RDD Lineage and Differences

    Lecture 61 Spark Higher level APIs Intro

    Lecture 62 Synapse Notebook - Creating dataframes practical

    Section 8: PySpark Transformation 1 - Select and Filter functions

    Lecture 63 Introduction for PySpark Transformations

    Lecture 64 Walkthrough on Notebook, Markdown cells

    Lecture 65 Using Free Databricks Community Edition to practise and Save Costs

    Lecture 66 Display and show Functions

    Lecture 67 Stop Spark Session when not in use

    Lecture 68 Select and SelectExpr

    Lecture 69 Filter Function

    Lecture 70 Organizing notebooks into a folder

    Section 9: PySpark Transformation 2 - Handling Nulls, Duplicates and aggregation

    Lecture 71 Understanding fillna and na.fill

    Lecture 72 Identifying duplicates using Aggregations

    Lecture 73 Handling Duplicates using dropna

    Lecture 74 Organising notebooks into a folder

    Lecture 75 Transformations summary of this section

    Section 10: PySpark Transformation 3 - Data Transformation and Manipulation

    Lecture 76 withColumn to Create or Update columns

    Lecture 77 Transforming and updating column withColumnRenamed

    Section 11: PySpark 4 - Synapse Spark - MSSparkUtils

    Lecture 78 What is MSSpark Utilities

    Lecture 79 MSSpark Utils - Env utils

    Lecture 80 What is mount point

    Lecture 81 Creating and accessing mount point in Notebook

    Lecture 82 All File System Utils

    Lecture 83 Notebook Utils - Exit command

    Lecture 84 Creating another spark pool

    Lecture 85 Procedure to increase vCores request (optional)

    Lecture 86 Calling notebook from another notebook

    Lecture 87 Calling notebook from another using runtime parameters

    Lecture 88 Magic commands

    Lecture 89 Attaching two notebooks to a single spark pool

    Lecture 90 Accessing Mount points from another notebook

    Section 12: PySpark 5 - Synapse - Spark SQL

    Lecture 91 Accessing data using Temporary Views - Practical

    Lecture 92 Lake Database - Overview

    Lecture 93 Understanding and creating database in Lake Database

    Lecture 94 Using Spark SQL in notebook

    Lecture 95 Managed vs External tables in Spark

    Lecture 96 Metadata sharing between Spark pool and Serverless SQL Pool

    Lecture 97 Deleting unwanted folders

    Section 13: PySpark Transformation 6 - Join Transformations

    Lecture 98 Uploading required files for Joins

    Lecture 99 Python notebooks till Union

    Lecture 100 Inner join

    Lecture 101 Left Join

    Lecture 102 Right Join

    Lecture 103 Full outer join

    Lecture 104 Left Semi Join

    Lecture 105 Left anti and Cross Join

    Lecture 106 Union Operation

    Lecture 107 Performing Join Transformation on Project Dataset

    Lecture 108 Summary of Transformations performed

    Section 14: PySpark Transformation 7 - String Manipulation and sorting

    Lecture 109 Replace function to change spaces

    Lecture 110 PySpark Notebook for this section

    Lecture 111 Split and concat functions

    Lecture 112 Order by and sort

    Lecture 113 Section Summary

    Section 15: PySpark Transformation 8 - Window Functions

    Lecture 114 Row number function

    Lecture 115 PySpark Notebook used in this section

    Lecture 116 Rank Function

    Lecture 117 Dense Rank function

    Section 16: PySpark Transformation 9 - Conversions and Pivoting

    Lecture 118 Conversion using cast function

    Lecture 119 PySpark Notebook needed for casting and pivoting lectures

    Lecture 120 Pivot function

    Lecture 121 Unpivot using stack function

    Lecture 122 Using to_date to convert date column

    Section 17: PySpark Transformation 10 - Schema definition and Management

    Lecture 123 PySpark Notebook used in this lecture

    Lecture 124 StructType and StructField - Demo

    Lecture 125 Implementing explicit schema with StructType and StructField

    Section 18: PySpark Transformation 11 - UDFs

    Lecture 126 User Defined Functions - Demo

    Lecture 127 Implementing UDFs in Notebook

    Lecture 128 Writing transformed data to Processed container

    Section 19: Dedicated SQL Pool

    Lecture 129 Dedicated SQL pool - Demo

    Lecture 130 Dedicated SQL Pool Architecture

    Lecture 131 How distribution takes place based on DWU

    Lecture 132 Factors to consider when choosing dedicated SQL pool

    Lecture 133 Creating Dedicated SQL pool in Synapse

    Lecture 134 Ways to copy data into Dedicated SQL Pool

    Lecture 135 Copy command to copy to dedicated SQL pool

    Lecture 136 Clustered Columnstore Index (optional)

    Lecture 137 Types of Distributions or Sharding patterns

    Lecture 138 Using Pipeline to Copy to dedicated SQL Pool

    Section 20: Reporting data to Power BI

    Lecture 139 Section Introduction

    Lecture 140 Installing Power BI Desktop

    Lecture 141 Creating report from Power BI Desktop

    Lecture 142 Creating new user in Azure AD for creating workspace (if using personal account)

    Lecture 143 Creating a shared workspace in Power BI

    Lecture 144 Publishing report to Shared Workspace

    Lecture 145 Accessing Power BI from Azure Synapse Analytics

    Lecture 146 Download Power BI .pbix file from here

    Lecture 147 Creating Dataset and report from Synapse Analytics

    Lecture 148 Concluding the Power BI Section

    Lecture 149 Summary and end of project implementation

    Section 21: Spark - Optimisation Techniques

    Lecture 150 Optimisation Section Intro

    Lecture 151 Uploading required files for Optimisation

    Lecture 152 Spark Optimisation levels

    Lecture 153 Avoid using Collect function

    Lecture 154 Making notebook into particular folder

    Lecture 155 Avoid InferSchema

    Lecture 156 Use Cache Persist 1 - Understanding Serialization and DeSerialization

    Lecture 157 Use Cache Persist 2 - How cache or persist will work - Demo

    Lecture 158 Use Cache Persist 3 - Understanding cache practically

    Lecture 159 Use Cache Persist 4 - Persist - What is persist and different storage levels

    Lecture 160 Use Cache Persist - Notebook for persist with all storage levels

    Lecture 161 Use Cache Persist 5 - Persist - MEMORY_ONLY

    Lecture 162 Use Cache Persist 6 - Persist - MEMORY AND DISK

    Lecture 163 Use Cache Persist 7 - Persist - MEMORY_ONLY_SER (Scala Only)

    Lecture 164 Use Cache Persist 8 - Persist - MEMORY_AND_DISK_SER (Scala Only)

    Lecture 165 Use Cache Persist 9 - Persist - DISK ONLY

    Lecture 166 Use Cache Persist 10 - Persist - OFF HEAP (Scala Only)

    Lecture 167 Use Cache Persist 11 - Persist - MEMORY_ONLY_2 (PySpark only)

    Lecture 168 Use Partitioning 1 - Understanding partitioning - Demo

    Lecture 169 Use Partitioning 2 - Understand partitioning - Practical

    Lecture 170 Repartition and coalesce 1 - Understanding repartition and coalesce - Demo

    Lecture 171 Repartition and coalesce 2 - Understanding repartition and coalesce - Practical

    Lecture 172 Broadcast variables 1 - Understanding broadcast variables - Demo

    Lecture 173 Broadcast variables 2 - Implementing broadcast variables in notebook

    Lecture 174 Use Kryo Serializer

    Section 22: Delta Lake

    Lecture 175 Section Introduction

    Lecture 176 Drawbacks of ADLS

    Lecture 177 What is Delta lake

    Lecture 178 Lakehouse Architecture

    Lecture 179 Uploading required file for Delta lake

    Lecture 180 Problems with Azure Datalake - Practical

    Lecture 181 Creating a Delta lake

    Lecture 182 Understanding Delta format

    Lecture 183 Contents of Transaction Log or Delta log file - Practical

    Lecture 184 Contents of a transaction log demo

    Lecture 185 Creating delta table by Path using SQL

    Lecture 186 Creating delta table in Metastore using Pyspark and SQL

    Lecture 187 Schema Enforcement - Files required for Understanding Schema Enforcement

    Lecture 188 What is schema enforcement - Demo

    Lecture 189 Schema Enforcement - Practical

    Lecture 190 Schema Evolution - Practical

    Lecture 191 Versioning and Time Travel

    Lecture 192 Vacuum command

    Lecture 193 Convert to Delta command

    Lecture 194 Checkpoints in delta log

    Lecture 195 Optimize command - Demo

    Lecture 196 Optimize command - Practical

    Lecture 197 Applying UPSERT using MERGE Command

    Section 23: Conclusion

    Lecture 198 Course Conclusion

    Lecture 199 Bonus Lecture

    Who this course is for

    Beginners who want to step into the world of data engineering

    Professional Data Engineers who want to advance their data analysis skills

    Students who are keen to learn Data Analytics

    Data Engineers who want to learn data warehousing in the cloud using Azure Synapse Analytics