Spark Programming In Python For Beginners - Apache Spark 3

Posted By: ELK1nG

Published 7/2023
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 kHz
Language: English | Size: 851.56 MB | Duration: 2h 21m

Unlock the Power of Apache Spark 3

What you'll learn

Basic knowledge of Apache Spark

Apache Spark installation and configuration on a local machine as well as in the cloud

How to use the Spark shell

Installation of a multi-node cluster on Google Cloud Platform

Using clusters in notebooks

Creating and configuring a Spark session

Creating a Spark project build configuration

Configuring Spark application logs

How to load different file formats into a DataFrame

DataFrame and Dataset transformations

Aggregations in Spark

Spark DataFrame joins

Requirements

Basic programming knowledge of Python

A recent 64-bit Windows/Mac/Linux machine with 8 GB of RAM

Description

Get ready for the complete Apache Spark with Python course. Gain familiarity with the course details and topics designed to help you succeed.

Apache Spark™ is a multi-language engine for executing data engineering, data science, and machine learning on single-node machines or clusters. This course is designed for students, professionals, and people in non-technical roles who want to develop data engineering pipelines and applications using Apache Spark.

Learn Apache Spark with Hands-On Labs

The Spark Programming in Python course is a hands-on practice course designed to teach you the basic and intermediate concepts of Spark through practical demonstrations in hands-on labs. The course comprises approximately 22 labs, starting from the basics and progressing to more complex topics.

Who should take this course?

The course is intended for software developers who want to build Apache Spark-based data engineering pipelines and applications. Data architects and data engineers who are in charge of creating the data-centric architecture for their company can also benefit from it. Managers and architects are another audience: they are not directly involved in the Spark implementation process, but they collaborate with those who put Apache Spark into practice.

Overview

Section 1: Introduction

Lecture 1 Introduction to Apache Spark

Lecture 2 Demo: Installing and Running Apache Spark in local mode using cmd

Lecture 3 Demo: Configuring Apache Spark in the PyCharm IDE

Lecture 4 Demo: Apache Spark in the Cloud with Databricks

Section 2: Spark Architecture and Execution Model

Lecture 5 Spark Execution Methods

Lecture 6 Distributed Processing Model of Spark

Lecture 7 Demo: Working with PySpark Shell

Lecture 8 Demo: Creating a Multi-Node Spark Cluster using GCP

Lecture 9 Demo: Working with Zeppelin Notebook in cluster

Section 3: Spark Programming Model

Lecture 10 Introduction to DataFrame

Lecture 11 Demo: Creating Spark Project Build Configuration

Lecture 12 Demo: Configuring Spark Project Application Logs

Lecture 13 Demo: Creating and Configuring Spark Session

Section 4: Spark Data Sources and Sinks

Lecture 14 Spark APIs

Lecture 15 Demo: Reading CSV, JSON and Parquet Files

Lecture 16 Demo: Creating Spark DataFrame Schema

Lecture 17 Demo: Writing data using DataFrame Writer and Managing Layouts

Lecture 18 Demo: Working with Spark SQL Tables

Section 5: Spark DataFrame and Dataset Transformations

Lecture 19 Demo: Working with DataFrame Rows

Lecture 20 Demo: DataFrame Rows and Unit Testing

Lecture 21 Demo: Working with DataFrame Columns

Lecture 22 Demo: Creating and Using User-Defined Functions

Section 6: Aggregations in Apache Spark

Lecture 23 Demo: Simple Aggregations

Lecture 24 Demo: Grouping Aggregations

Lecture 25 Demo: Windowing Aggregations

Section 7: Spark DataFrame Joins

Lecture 26 Demo: Inner Join

Lecture 27 Demo: Outer Join

Who this course is for

Software engineers and architects who want to design and develop big data engineering projects using Apache Spark

Programmers and developers who aspire to grow and learn data engineering using Apache Spark