    A Big Data Hadoop and Spark project for absolute beginners (3/2021)

    Posted By: lucky_aut
    Duration: 8h 57m | .MP4 1280x720, 30 fps(r) | AAC, 44100 Hz, 2ch | 3.85 GB
    Genre: eLearning | Language: English

    Data Engineering, Spark, Hive, Python, PySpark, Scala, Coding framework, Testing, IntelliJ, Maven, Glue, Streaming

    What you'll learn
    Big Data, Hadoop and Spark from scratch by solving a real-world use case using Python and Scala
    Spark Scala & PySpark real-world coding framework
    Real-world coding best practices, logging, error handling and configuration management using both Scala and Python
    Serverless big data solution using AWS Glue, Athena and S3

    Requirements
    Students should have some programming background and some knowledge of SQL queries.

    Description
    This course will prepare you for a real-world Data Engineer role!

    Get started with Big Data quickly by using a free cloud cluster and solving a real-world use case. Learn Hadoop, Hive and Spark (both Python and Scala) from scratch!

    Learn to code Spark Scala and PySpark like a real-world developer. Understand real-world coding best practices, logging, error handling and configuration management in both Scala and Python.

    Project

    A bank is launching a new credit card and wants to identify prospects it can target in its marketing campaign.

    It has received prospect data from various internal and third-party sources. The data has issues such as missing or unknown values in certain fields, so it must be cleansed before any analysis can be done.

    Because the data set is huge, with billions of records, the bank has asked you to use Big Data Hadoop and Spark technology to cleanse, transform and analyze it.
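
    To make the cleansing step concrete, here is a minimal plain-Python sketch of the kind of rule involved: replacing missing or unknown field values with a default. It is not taken from the course, which does this work in Spark and Hive. The field names, the set of "unknown" markers and the default values are all assumptions made for illustration.

```python
from typing import Optional

# Hypothetical markers for "unknown" values; the real data may use others.
UNKNOWN_MARKERS = {"", "unknown", "n/a", "na", "null"}


def clean_value(value: Optional[str], default: str) -> str:
    """Return the trimmed value, or the default if it is missing or unknown."""
    if value is None:
        return default
    trimmed = value.strip()
    if trimmed.lower() in UNKNOWN_MARKERS:
        return default
    return trimmed


def cleanse_prospect(record: dict) -> dict:
    """Apply the rule to one prospect record (field names are hypothetical)."""
    return {
        "name": clean_value(record.get("name"), "UNKNOWN"),
        "country": clean_value(record.get("country"), "UNKNOWN"),
        "salary": clean_value(record.get("salary"), "0"),
    }


# Two made-up records showing an unknown marker and a missing value.
raw = [
    {"name": " Alice ", "country": "unknown", "salary": "52000"},
    {"name": "Bob", "country": "US", "salary": None},
]
cleansed = [cleanse_prospect(r) for r in raw]
```

    At the scale described here, the same rule would be expressed as a distributed Spark transformation rather than a Python loop over records.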

    What you will learn:

    Big Data, Hadoop concepts

    How to create a free Hadoop and Spark cluster using Google Dataproc

    Hadoop hands-on - HDFS, Hive

    Python basics

    PySpark RDD - hands-on

    PySpark SQL, DataFrame - hands-on

    Project work using PySpark and Hive

    Scala basics

    Spark Scala DataFrame

    Project work using Spark Scala

    Spark Scala real-world coding framework and development using Winutil, Maven and IntelliJ

    Python Spark Hadoop Hive coding framework and development using PyCharm

    Building a data pipeline using Hive, PostgreSQL and Spark

    Logging, error handling and unit testing of PySpark and Spark Scala applications

    Spark Scala Structured Streaming

    Applying Spark transformations to data stored in AWS S3 using Glue, and viewing the data using Athena

    Prerequisites:

    Some basic programming skills

    Some knowledge of SQL queries

    Who this course is for:
    Beginners who want to learn Big Data or experienced people who want to transition to a Big Data role
    Big data beginners who want to learn how to code in the real world