Learn NumPy, Pandas, and PySpark for ETL Testing from Scratch
Published 6/2025
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 9.29 GB | Duration: 16h 30m
NumPy, Pandas, and PySpark for ETL and Machine Learning
What you'll learn
Master Python for Data Analysis – Write efficient Python code for data manipulation, cleaning, and transformation using core programming concepts.
Leverage NumPy for Numerical Computing – Perform high-performance numerical operations, array manipulations, and mathematical computations using NumPy.
Analyze & Manipulate Data with Pandas – Clean, explore, and analyze structured datasets using Pandas DataFrames, including handling missing data, grouping, and
Process Big Data with PySpark – Scale data processing using PySpark, including distributed computing, SQL operations, and optimizing performance for large data
Requirements
The only requirement for this course is prior knowledge of Python basics.
Description
This is a completely hands-on course for learning NumPy, Pandas, and PySpark. The emphasis is on NumPy, with one entire section each on PySpark and Pandas to get you started. The course is designed to prepare you for ETL and Machine Learning jobs.
NumPy is covered in full because its concepts are similar to those in PySpark and Pandas, and they will help you better understand DataFrames in both libraries.
There is an entire section on PySpark to help you overcome the main challenges of getting started with PySpark on a personal Windows computer.
There is an entire section on Pandas to get the student started and overcome the main challenges.
There are 11 sections in this course, 9 of which are dedicated to NumPy. The sections are as follows:
Section 1: Introduction
This section is an introduction to this course and to Udemy.
Section 2: Getting started with Python and NumPy
This section covers the initial Python and NumPy installation and configuration, and the first lessons about NumPy.
Section 3: Introduction to NumPy Attributes
This section describes NumPy attributes such as shape, dtype, size, and ndim.
Section 4: NumPy Special Arrays
This section describes NumPy special arrays such as eye, diag, random, and default_rng.
Section 5: NumPy Array Indexing and Slicing
This section describes NumPy indexing and slicing in 1D, 2D, and 3D, and modifying array elements.
Section 6: NumPy Operations and Broadcasting and filtering
This section covers basic operations in NumPy.
Section 7: NumPy Reshaping and combining Arrays
This section covers reshaping and combining arrays using functions like reshape, flatten, ravel, transposing axes, concatenate, stack, vstack, hstack, hsplit, and vsplit.
Section 8: NumPy and Linear Algebra
This section covers NumPy functions related to linear algebra, such as determinant, inverse, eigenvalues, and eigenvectors.
Section 9: NumPy and statistics
This section covers statistics in NumPy, such as the Normal, Uniform, Binomial, and Poisson distributions.
Section 10: PySpark
This section provides a starting point for PySpark and its functions for ETL testing.
Section 11: Pandas
This section provides a starting point for learning Pandas.
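As a rough illustration of the NumPy topics named in Sections 3 through 8, the sketch below touches one call from each section. It is not taken from the course materials; the array values, the variable names, and the seed are arbitrary choices made for this example.

```python
import numpy as np

# Section 3: attributes of a small 2D array
a = np.arange(6).reshape(2, 3)          # [[0, 1, 2], [3, 4, 5]]
print(a.shape, a.ndim, a.size, a.dtype)

# Section 4: special arrays and a seeded random generator
identity = np.eye(3)
diagonal = np.diag([1, 2, 3])
rng = np.random.default_rng(seed=42)    # seed value is an arbitrary assumption
samples = rng.normal(loc=0.0, scale=1.0, size=5)

# Section 5: slicing and boolean filtering
first_row = a[0, :]
evens = a[a % 2 == 0]                   # keeps 0, 2, 4

# Section 6: broadcasting a 1D array across each row of a 2D array
row_offsets = np.array([10, 20, 30])
shifted = a + row_offsets               # [[10, 21, 32], [13, 24, 35]]

# Section 7: flattening, stacking, and splitting
flat = a.ravel()
stacked = np.vstack([a, row_offsets])   # shape (3, 3)
left, right = np.hsplit(stacked, [1])   # first column | remaining two columns

# Section 8: determinant and solving the linear system m @ x = b
m = np.array([[2.0, 1.0], [1.0, 3.0]])
b = np.array([3.0, 5.0])
det = np.linalg.det(m)
x = np.linalg.solve(m, b)
print(det, x)
```

The printed dtype depends on the platform and NumPy version, so no expected output is shown for it.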
Overview
Section 1: Introduction
Lecture 1 Introduction
Lecture 2 Course Introduction, audience, purpose, and goals
Lecture 3 Instructor Introduction and style
Lecture 4 Introduction to Udemy
Section 2: Getting started with Python and NumPy
Lecture 5 Installing Python, MySQL, Git, and modules
Lecture 6 Python Refresher for this course
Lecture 7 Installing and inspecting NumPy
Lecture 8 Introduction to NumPy and np.array
Lecture 9 NumPy Array part 2
Lecture 10 NumPy np.array() multi-dimensional addition and multiplication correction
Lecture 11 NumPy np.zeros()
Lecture 12 NumPy np.ones()
Lecture 13 NumPy np.arange()
Lecture 14 NumPy np.linspace()
Section 3: Introduction to NumPy Attributes
Lecture 15 NumPy Shape
Lecture 16 NumPy dtype
Lecture 17 NumPy Size
Lecture 18 NumPy ndim
Section 4: NumPy Special Arrays
Lecture 19 NumPy np.eye
Lecture 20 NumPy Diagonal np.diag()
Lecture 21 NumPy Random np.random() Beginner
Lecture 22 NumPy default_rng()
Lecture 23 NumPy Random np.random() Advanced
Section 5: NumPy Array Indexing and slicing
Lecture 24 NumPy Basic indexing and slicing 1D, 2D, and nD arrays
Lecture 25 NumPy Intermediate/Advanced indexing and slicing
Lecture 26 Modifying array elements
Section 6: NumPy Operations and Broadcasting and filtering
Lecture 27 NumPy Array Arithmetic Operations (+, -, *, /, //, %, **)
Lecture 28 NumPy Broadcasting rules and examples
Section 7: NumPy Reshaping and combining Arrays
Lecture 29 Reshape arrays using reshape(), flatten(), ravel()
Lecture 30 Transposing and swapping axes
Lecture 31 Concatenation: np.concatenate, np.stack, np.vstack, np.hstack
Lecture 32 Splitting Arrays: split, hsplit, and vsplit
Section 8: NumPy and Linear Algebra
Lecture 33 Basic Linear Algebra
Lecture 34 Determinant
Lecture 35 Inverse
Lecture 36 Eigenvalues and Eigenvectors
Lecture 37 Solving Linear Equations np.linalg.solve
Lecture 38 SVD: Singular Value Decomposition
Section 9: NumPy and statistics
Lecture 39 Random Number generation rand, randn, and randint
Lecture 40 Probability Distributions (Normal, Uniform, Binomial, and Poisson)
Lecture 41 Statistical function np.mean()
Lecture 42 Statistical function np.median()
Lecture 43 Statistical function np.percentile()
Lecture 44 Statistical function np.corrcoef()
Section 10: PySpark
Lecture 45 PySpark overview, setup, and starting first Spark session
Lecture 46 PySpark DataFrame Basics (CSV, Lists etc)
Lecture 47 PySpark basic data frame operations select(), filter(), withColumn()
Lecture 48 PySpark Aggregations (groupBy(), agg())
Lecture 49 PySpark and SQL - spark.sql()
Lecture 50 PySpark RDDs quick intro map(), collect()
Section 11: Pandas
Lecture 51 Introduction to pandas Series vs DataFrames
Lecture 52 Pandas Data Loading & Inspection (CSV, JSON)
Lecture 53 Pandas Data Selection and Filtering
Lecture 54 Pandas Data Cleaning & Transformation
Lecture 55 Pandas Data Joining and Merging
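To give a feel for how the Pandas topics in Section 11 (loading, cleaning, merging) might combine in an ETL test, here is a minimal sketch. It is an assumption about what such a test could look like, not the course's own method: the column names, the sample rows, and the rule that missing amounts load as 0.0 are all hypothetical.

```python
import io
import pandas as pd

# Hypothetical source and target extracts, inlined as CSV text so the
# sketch runs without any external files.
source_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1,alice,10.5\n"
    "2,bob,\n"
    "3,carol,7.25\n"
)
target_csv = io.StringIO(
    "order_id,customer,amount\n"
    "1,alice,10.5\n"
    "2,bob,0.0\n"
)

source = pd.read_csv(source_csv)
target = pd.read_csv(target_csv)

# Cleaning: apply the assumed transformation rule (missing amount -> 0.0)
expected = source.fillna({"amount": 0.0})

# Check 1: row counts should match after the load
row_count_matches = len(expected) == len(target)

# Check 2: an outer merge with an indicator column shows which keys
# exist on only one side
diff = expected.merge(
    target,
    on="order_id",
    how="outer",
    suffixes=("_src", "_tgt"),
    indicator=True,
)
missing_in_target = diff.loc[diff["_merge"] == "left_only", "order_id"].tolist()

# Check 3: for keys present on both sides, compare the transformed values
both = diff[diff["_merge"] == "both"]
amount_mismatches = both[both["amount_src"] != both["amount_tgt"]]

print("row counts match:", row_count_matches)
print("missing in target:", missing_in_target)
print("amount mismatches:", len(amount_mismatches))
```

With this sample data the checks report one order missing from the target and no value mismatches among the rows that were loaded.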
This course is for anyone who has some knowledge of Python and wants to learn NumPy, Pandas, and PySpark for ETL testing.