Data Engineering Essentials - SQL, Python and Spark
MP4 | Video: h264, 1280x720 | Audio: AAC, 44100 Hz
Language: English | Size: 15.9 GB | Duration: 37h 47m
Build Data Engineering Pipelines using SQL, Python and Spark
What you'll learn
Setup Development Environment on GCP
Database Essentials using Postgres
Programming Essentials using Python
Data Engineering using Spark DataFrame APIs
Data Engineering using Spark SQL
Requirements
Laptop with decent configuration (Minimum 4 GB RAM and Dual Core)
Free GCP sign-up with the available credit
CS or IT degree or prior IT experience is highly desired
Description
As part of this course, you will learn the Data Engineering Essentials needed to build Data Pipelines using SQL, Python and Spark.
About Data Engineering
Data Engineering is the processing of data to meet downstream needs. As part of Data Engineering, we build different kinds of pipelines, such as Batch Pipelines and Streaming Pipelines. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, these roles are known as ETL Development, Data Warehouse Development, etc.
Course Details
As part of this course, you will be learning Data Engineering Essentials such as SQL, Programming using Python and Spark. Here is the detailed agenda for the course.
Database Essentials - SQL using Postgres
Getting Started with Postgres
Basic Database Operations (CRUD - Insert, Select, Update and Delete)
Writing Basic SQL Queries (Filtering, Joins and Aggregations)
Creating Tables and Indexes
Partitioning Tables and Indexes
Predefined Functions (String Manipulation, Date Manipulation and other functions)
Writing Advanced SQL Queries
Programming Essentials using Python
Perform Database Operations
Getting Started with Python
Basic Programming Constructs
Predefined Functions
Overview of Collections - list and set
Overview of Collections - dict and tuple
Manipulating Collections using loops
Understanding Map Reduce Libraries
Overview of Pandas Libraries
Database Programming - CRUD Operations
Database Programming - Batch Operations
Setting up Single Node Cluster for Practice
Setup Single Node Hadoop Cluster
Setup Hive and Spark on Single Node Cluster
Introduction to Hadoop Ecosystem
Overview of HDFS Commands
Data Engineering using Spark SQL
Getting Started with Spark SQL
Basic Transformations
Managing Tables - Basic DDL and DML
Managing Tables - DML and Partitioning
Overview of Spark SQL Functions
Windowing Functions
Data Engineering using Spark DataFrame APIs
Data Processing Overview
Processing Column Data
Basic Transformations - Filtering, Aggregations and Sorting
Joining Data Sets
Windowing Functions - Aggregations, Ranking and Analytic Functions
Spark Metastore Databases and Tables
Desired Audience
Here is the desired audience for this course.
College students and entry-level professionals to gain hands-on expertise in Data Engineering. This course provides enough skills to face interviews for entry-level Data Engineer roles.
Experienced application developers to gain expertise related to Data Engineering.
Conventional Data Warehouse Developers, ETL Developers, Database Developers and PL/SQL Developers to gain the skills needed to transition into successful Data Engineers.
Testers to improve their testing capabilities related to Data Engineering applications.
Any other hands-on IT Professional who wants to gain knowledge of Data Engineering through hands-on practice.
Prerequisites
Logistics
Computer with decent configuration (At least 4 GB RAM, however 8 GB is highly desired)
Dual Core is required and Quad Core is highly desired
Chrome Browser
High Speed Internet
Desired Background
Engineering or Science Degree
Ability to use a computer
Knowledge or working experience with databases and any programming language is highly desired
Who this course is for:
Computer Science or IT Students or other graduates with a passion to get into IT
Data Warehouse Developers who want to transition to Data Engineering roles
ETL Developers who want to transition to Data Engineering roles
Database or PL/SQL Developers who want to transition to Data Engineering roles
BI Developers who want to transition to Data Engineering roles
QA Engineers to learn about Data Engineering
Application Developers to gain Data Engineering Skills