    Spark Performance Tuning For Data Engineers: Part1 - Storage

    Posted By: ELK1nG

    Published 5/2025
    MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 kHz
    Language: English | Size: 1.36 GB | Duration: 3h 23m

    Data Engineering & Apache Spark Optimization Techniques on Databricks to Boost Speed, Reduce Cost & Handle Big Data

    What you'll learn

    Hands-on demos based on different scenarios and use cases

    Learn the nuances of Spark performance tuning

    Get detailed insights into different operations in Spark

    Get a clear understanding of how Spark configs work hand in hand and which combinations give optimal results

    Learn to identify and solve bottlenecks and errors in your Spark application

    Requirements

    Basic Spark Architecture & internals

    Spark programming in PySpark or Scala

    Databricks Cloud Platform

    Description

    Unlock the true potential of Apache Spark by mastering storage-related performance tuning techniques. This hands-on course is packed with real-world scenarios, guided demos, and practical use cases that will help you fine-tune Spark storage strategies for speed, efficiency, and scalability.

    This course is aimed at intermediate Data Engineers and Spark Developers, as well as aspiring Architects, who want to optimize Spark jobs, reduce resource costs, and ensure fast, reliable performance for large-scale data applications.

    What You'll Learn

    1. Understand how Apache Spark handles storage internally: memory vs disk

    2. Learn when and how to use Spark caching and persistence effectively

    3. Compare and choose the right storage levels: MEMORY_ONLY, MEMORY_AND_DISK, etc.

    4. Use real-world examples and hands-on demos to benchmark storage decisions

    5. Learn how to monitor storage metrics using the Spark UI

    6. Handle memory spills, disk I/O bottlenecks, and storage tuning in cluster environments

    7. Apply best practices for storage optimization in cloud and on-prem Spark clusters

    Why Take This Course?

    100% Hands-on: Focused on practical implementation, not just theory

    Designed for Data Engineers, Spark Developers, and Big Data Practitioners

    Covers both foundational concepts and advanced tuning techniques

    Teaches how to measure performance gains using real metrics

    Helps you make cost-efficient decisions for big data storage

    Tools & Technologies Covered

    Apache Spark (2.x and 3.x)

    Databricks

    Spark UI

    HDFS, Data Lake (for storage scenarios)

    Overview

    Section 1: Introduction

    Lecture 1 Introduction

    Lecture 2 What is Optimization

    Lecture 3 What is Benchmarking

    Section 2: Important Concepts

    Lecture 4 Spark High Level Architecture

    Lecture 5 Spark Job Execution

    Lecture 6 Reading Spark UI

    Lecture 7 Physical Plans & DAG - Part 1

    Lecture 8 Physical Plans & DAG - Part 2

    Section 3: Optimizing Storage

    Lecture 9 Schema Inference Problem

    Lecture 10 Reuse DataFrame

    Lecture 11 Column Elimination

    Lecture 12 Row Elimination

    Lecture 13 Directory Scan Problem

    Lecture 14 Optimal File Size

    Lecture 15 Haystack Query

    Who this course is for: Data Engineers and Spark Developers, as well as aspiring Architects, curious about advanced techniques of performance tuning and optimization