    Pyspark - Python Spark Hadoop Coding Framework & Testing

    Posted By: ELK1nG
    Last updated 11/2021
    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
    Language: English | Size: 1.46 GB | Duration: 3h 33m

    Big data Python Spark PySpark coding framework logging error handling unit testing PyCharm PostgreSQL Hive data pipeline

    What you'll learn
    Python Spark PySpark industry standard coding practices - Logging, Error Handling, reading configuration, unit testing
    Building a data pipeline using Hive, Spark and PostgreSQL
    Python Spark Hadoop development using PyCharm
    Requirements
    Basic programming skills
    Basic database skills
    Hadoop entry level knowledge
    Description
    This course will bridge the gap between your academic and real-world knowledge and prepare you for an entry-level Big Data Python Spark developer role.
    You will learn the following:
    Python Spark coding best practices
    Logging
    Error Handling
    Reading configuration from properties file
    Doing development work using PyCharm
    Using your local environment as a Hadoop Hive environment
    Reading and writing to a Postgres database using Spark
    Python unit testing framework
    Building a data pipeline using Hadoop, Spark and Postgres
    Prerequisites:
    Basic programming skills
    Basic database knowledge
    Hadoop entry level knowledge
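
    As a rough illustration of the practices named in the description (a logger per class, a log level read from configuration, try/except/raise error handling, and a unit test that checks for an error), here is a minimal sketch using only the Python standard library. It is not taken from the course: the names Transform, to_upper, read_log_level and TransformTest are hypothetical, the configuration is assumed to be an INI-style file with a [logging] section, and a plain list of strings stands in for a Spark DataFrame so the example runs without Spark installed.

```python
import configparser
import logging
import unittest


def read_log_level(config_text: str) -> str:
    """Return the log level from INI-style text, defaulting to INFO.

    Assumption: the configuration has a [logging] section with a 'level' key.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get("logging", "level", fallback="INFO")


class Transform:
    """Hypothetical pipeline step that owns a logger named after its class."""

    def __init__(self, log_level: str = "INFO"):
        self.logger = logging.getLogger(self.__class__.__name__)
        self.logger.setLevel(log_level)

    def to_upper(self, rows):
        """Upper-case every row; log and re-raise if a row is not a string."""
        try:
            return [row.upper() for row in rows]
        except AttributeError as exc:
            self.logger.error("Non-string row encountered: %s", exc)
            raise ValueError("rows must contain only strings") from exc


class TransformTest(unittest.TestCase):
    """Checks both the normal path and the error path of the transformation."""

    def setUp(self):
        self.transform = Transform(read_log_level("[logging]\nlevel = DEBUG\n"))

    def test_to_upper(self):
        self.assertEqual(self.transform.to_upper(["a", "b"]), ["A", "B"])

    def test_bad_row_raises(self):
        with self.assertRaises(ValueError):
            self.transform.to_upper(["a", 1])
```

    Saved as a module, the tests could be run with python -m unittest. In a real PySpark project the list would be replaced by a DataFrame and the test would typically create a local SparkSession, which this sketch deliberately leaves out.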

    Overview

    Section 1: Introduction

    Lecture 1 Introduction

    Lecture 2 What is Big Data Spark?

    Section 2: Setting up Hadoop Spark development environment

    Lecture 3 Environment setup steps

    Lecture 4 Installing Python

    Lecture 5 Installing PyCharm

    Lecture 6 Creating a project in the main Python environment

    Lecture 7 Installing JDK

    Lecture 8 Installing Spark 3 & Hadoop

    Lecture 9 Running PySpark in the Console

    Lecture 10 PyCharm PySpark Hello DataFrame

    Lecture 11 PyCharm Hadoop Spark programming

    Lecture 12 Special instructions for Mac users

    Lecture 13 Quick tips - winutils permission

    Lecture 14 Python basics

    Section 3: Creating a PySpark coding framework

    Lecture 15 Structuring code with classes and methods

    Lecture 16 How Spark works?

    Lecture 17 Creating and reusing SparkSession

    Lecture 18 Spark DataFrame

    Lecture 19 Separating out Ingestion, Transformation and Persistence code

    Section 4: Logging and Error Handling

    Lecture 20 Python Logging

    Lecture 21 Managing log level through a configuration file

    Lecture 22 Having custom logger for each Python class

    Lecture 23 Error Handling with try except and raise

    Lecture 24 Logging using log4p and log4python packages

    Section 5: Creating a Data Pipeline with Hadoop Spark and PostgreSQL

    Lecture 25 Ingesting data from Hive

    Lecture 26 Transforming ingested data

    Lecture 27 Installing PostgreSQL

    Lecture 28 Spark PostgreSQL interaction with Psycopg2 adapter

    Lecture 29 Spark PostgreSQL interaction with JDBC driver

    Lecture 30 Persisting transformed data in PostgreSQL

    Section 6: Reading configuration from properties file

    Lecture 31 Organizing code further

    Lecture 32 Reading configuration from a property file

    Section 7: Unit testing PySpark application

    Lecture 33 Python unittest framework

    Lecture 34 Unit testing PySpark transformation logic

    Lecture 35 Unit testing an error

    Section 8: spark-submit

    Lecture 36 PySpark spark-submit

    Lecture 37 Thank you

    Section 9: Appendix - PySpark on Colab and DataFrame deep dive

    Lecture 38 Running Python Spark 3 on Google Colab

    Lecture 39 SparkSQL and DataFrame deep dive on Colab

    Section 10: Appendix - Big Data Hadoop Hive for beginners

    Lecture 40 Big Data concepts

    Lecture 41 Hadoop concepts

    Lecture 42 Hadoop Distributed File System (HDFS)

    Lecture 43 Understanding Google Cloud (GCP) Dataproc

    Lecture 44 Signing up for a Google Cloud free trial

    Lecture 45 Storing a file in HDFS

    Lecture 46 MapReduce and YARN

    Lecture 47 Hive

    Lecture 48 Querying HDFS data using Hive

    Lecture 49 Deleting the Cluster

    Lecture 50 Analyzing a billion records with Hive

    Who this course is for
    Students looking to move from an academic Big Data Spark background to a real-world developer role