Learning Apache Spark | Master Spark For Big Data Processing

Posted By: ELK1nG
Published 10/2024
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.77 GB | Duration: 7h 11m

Embark on a comprehensive journey to master Apache Spark, from data manipulation to machine learning!

What you'll learn

Understand the fundamentals of Spark’s architecture and its distributed computing capabilities

Learn to write and optimize Spark SQL queries for efficient data processing

Master the creation and manipulation of DataFrames, a core component of Spark

Learn to read data from different file formats such as CSV and Parquet

Develop skills in filtering, sorting, and aggregating data to extract meaningful insights

Learn to process and analyze streaming data for real-time insights

Explore the capabilities of Spark’s MLlib for machine learning

Learn to create and fine-tune models using pipelines and transformers for predictive analytics

Requirements

You should know how to write and run Python code

Basic understanding of Python syntax and concepts is necessary

Understanding SQL (Structured Query Language) is important

You should know how to create and manage tables, transform data, and run queries

Description

Unlock the power of big data with Apache Spark!

In this course, you’ll learn how to use Apache Spark with Python to work with data. We’ll start with the basics and move up to advanced projects and machine learning. Whether you’re just starting or already know some Python, this course will teach you step-by-step how to process and analyze big data.

What You’ll Learn:

Use PySpark’s DataFrame: Learn to organize and work with data.

Store Data Efficiently: Use formats like Parquet to store data quickly.

Use SQL in PySpark: Work with data using SQL, just like with DataFrames.

Connect PySpark with Python Tools: Dig deeper into data with Python’s data tools.

Machine Learning with PySpark’s MLlib: Work on big projects using machine learning.

Real-World Examples: Learn by doing with practical examples.

Handle Large Data Sets: Understand how to manage big data easily.

Solve Real-World Problems: Apply Spark to real-life data challenges.

Build Confidence in PySpark: Get better at big data processing.

Manage and Analyze Data: Gain skills for both work and personal projects.

Prepare for Data Jobs: Build skills for jobs in tech, finance, and healthcare.

By the end of this course, you’ll have a solid foundation in Spark, ready to tackle real-world data challenges.

Overview

Section 1: Getting Started

Lecture 1 Why Should You Learn Apache Spark?

Lecture 2 What Does This Course Offer on Apache Spark?

Section 2: All about Apache Spark

Lecture 3 Let’s understand WordCount

Lecture 4 Let’s understand Map and Reduce

Lecture 5 Programming with Map and Reduce

Lecture 6 Let’s understand Hadoop

Lecture 7 Apache Hadoop Architecture

Lecture 8 Apache Hadoop and Apache Spark

Lecture 9 Apache Spark Architecture

Lecture 10 What is PySpark

Section 3: Installations for Apache Spark

Lecture 11 Install JAVA JDK

Lecture 12 Install Python

Lecture 13 Install JupyterLab

Lecture 14 Install PySpark

Lecture 15 Spark Session by Initialization

Lecture 16 Running PySpark on AWS EC2 Instances P1

Lecture 17 Running PySpark on AWS EC2 Instance P2

Section 4: Using Databricks Community Edition

Lecture 18 Why Use Databricks Community Edition

Lecture 19 Register for Databricks Community Edition

Lecture 20 When to use Databricks Community Edition

Lecture 21 Running Magic Commands in Databricks P1

Lecture 22 Running Magic Commands in Databricks P2

Section 5: Spark DataFrames

Lecture 23 Apache Spark DataFrame

Lecture 24 Create DataFrames from CSV Files P1

Lecture 25 Create DataFrames from CSV Files P2

Lecture 26 Create DataFrames from Parquet Files

Section 6: Spark Data Transformations

Lecture 27 Using SELECT

Lecture 28 Using FILTER

Lecture 29 Using ORDER BY

Lecture 30 Using GROUP BY

Lecture 31 Using AGGREGATE Functions

Lecture 32 Using INNER JOIN

Section 7: Spark SQL Catalog

Lecture 33 Spark SQL Catalogs

Lecture 34 Access Spark SQL Catalogs

Lecture 35 List Databases from Catalogs

Lecture 36 List Tables from Current Database

Lecture 37 Create Spark Temp View

Lecture 38 Run SQL Queries on Temp Views

Lecture 39 Drop Temp Views

Section 8: Databricks Utility FileSystem for Apache Spark

Lecture 40 Using Databricks Utilities

Lecture 41 Using dbfs - Databricks Utility FileSystem

Lecture 42 Using dbfs - Make Directory

Lecture 43 Using dbfs - Copy Files

Lecture 44 Using dbfs - Delete Files

Section 9: Pandas API on Spark

Lecture 45 Introduction to Pandas

Lecture 46 Pandas API on Spark

Lecture 47 Reading and Writing Data with Pandas P1

Lecture 48 Reading and Writing Data with Pandas P2

Lecture 49 Data Manipulation with PySpark Pandas

Lecture 50 Merging and Joining in PySpark Pandas

Lecture 51 Grouping and Aggregation with PySpark Pandas

Lecture 52 Visualizing Data in PySpark Pandas

Section 10: Structured Streaming Using Apache Spark

Lecture 53 What is Apache Spark Structured Streaming

Lecture 54 How Apache Spark handles Structured Streaming

Lecture 55 Handling Programmatically Streaming Data

Lecture 56 Programmatic Modes by Apache Spark

Lecture 57 DataFrames for Streaming

Lecture 58 readStream API

Lecture 59 writeStream API

Lecture 60 Querying Data

Lecture 61 StreamingQuery - stop

Lecture 62 Structured Streaming with Kafka and Spark P1

Lecture 63 Structured Streaming with Kafka and Spark P2

Lecture 64 Structured Streaming with Kafka and Spark P3

Lecture 65 Terminate the Kafka Environment

Lecture 66 Handling Late Data Arrivals and Watermarking P1

Lecture 67 Handling Late Data Arrivals and Watermarking P2

Section 11: Machine Learning with Spark

Lecture 68 About this section

Lecture 69 Learning about Machine Learning

Lecture 70 How to build a Machine Learning Model

Lecture 71 Apache Spark MLlib Overview

Lecture 72 Learning about ML Pipelines using Spark MLlib

Lecture 73 Data Sources by Spark MLlib to Build ML Models

Lecture 74 Create DataFrames from Data Sources

Lecture 75 Learning about Featurization using Spark MLlib

Lecture 76 Using Apache Spark MLlib - Feature Transformers

Lecture 77 Using Tokenizer

Lecture 78 Using StringIndexer

Lecture 79 Using Pipelines

Lecture 80 Using VectorAssembler

Lecture 81 Using VectorIndexer

Lecture 82 Using MLlib Estimator - Linear Regression

Lecture 83 Using MLlib Estimator - Logistic Regression

Lecture 84 Measure ML Efficiency using Spark MLlib Evaluators

Lecture 85 Using ML for Solving Real World Problem

Lecture 86 Building ML Model P1 - Using Local Host

Lecture 87 Building ML Model P2 - Using Databricks Community Edition

Lecture 88 Using Apache Spark MLflow with Databricks Community Edition

IT professionals interested in big data and analytics, Aspiring Data Scientists, Aspiring Data Analysts, Aspiring Machine Learning Engineers, Business Analysts, Software Engineers, Students and Academics, Researchers, Anyone Interested in Big Data