Databricks And Pyspark For Big Data: From Zero To Expert

Posted By: ELK1nG

Published 12/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 1013.05 MB | Duration: 2h 54m

Complete course to learn Databricks, including PySpark, Dataframes, Machine Learning, Advanced Analytics and Streaming

What you'll learn

Processing Big Data with PySpark in Databricks

Databricks environment and Platform

ETL, Dataframes and data visualization in Databricks

PySpark in Databricks with RDDs, Spark Dataframes API or Spark SQL

Spark Column Expressions and Dataframe Aggregations

Spark Data Sources and Format types

Spark Architecture Concepts and Query Optimization

Advanced analytics and data visualization with Databricks

Machine Learning with Spark at Databricks

Spark Streaming at Databricks

Requirements

PySpark Fundamentals

Description

If you are looking for a hands-on, complete and advanced course to learn Databricks and PySpark, you have come to the right place.

Databricks is a data analytics platform powered by Apache Spark for data engineering, data science, and machine learning. Databricks has become one of the most important platforms to work with Spark, compatible with Azure, AWS and Google Cloud. This makes Databricks and Apache Spark some of the most in-demand skills for data engineers and data scientists, and some of the most valuable skills today. This course will teach you everything you need to know to position yourself in the Big Data job market.

This course is designed to prepare you to learn everything related to Databricks and Apache Spark, from the Databricks environment, platform and functionalities, to Spark SQL API, Spark Dataframes, Spark Streaming, Machine Learning, advanced analytics and data visualization in Databricks.

With complete training, downloadable study guides, hands-on exercises, and real-world use cases, this is the only course you'll ever need to learn Databricks and Apache Spark. You will learn Databricks, starting from the basics to the most advanced functionalities. To do so, we will use visual presentations, sharing clear explanations and useful professional advice.

This course covers the following sections:
• Introduction to Big Data and Apache Spark
• Spark Fundamentals with Spark RDDs, Dataframes
• Databricks environment
• Advanced analytics and data visualization with Databricks
• Machine Learning with Spark at Databricks
• Spark Streaming at Databricks

If you're ready to improve your skills, increase your career opportunities, and become a Big Data expert, join today and get immediate and lifetime access to:
• Complete Guide to Databricks with Apache Spark (PDF e-book)
• Downloadable project files
• Practical exercises and questionnaires
• Databricks resources such as cheatsheets and summaries
• 1 to 1 expert support
• Course Q&A forum

See you there!

Overview

Section 1: Introduction to Apache Spark and Big Data

Lecture 1 How to get the most out of this course

Lecture 2 Spark Fundamentals

Lecture 3 How Apache Spark works

Lecture 4 Apache Spark ecosystem and official documentation

Lecture 5 PySpark: cluster management and architecture

Section 2: Spark Architecture Concepts

Lecture 6 Spark Optimization Techniques

Lecture 7 Lazy Evaluation

Lecture 8 Wide and Narrow Transformations

Lecture 9 Parquet file in Spark

Lecture 10 Parallelism and Partitions

Lecture 11 Shuffling

Lecture 12 Caching and Storage Levels

Section 3: Databricks Fundamentals

Lecture 13 Introduction to Databricks

Lecture 14 Databricks Terminology and Databricks Community

Lecture 15 Create a free Databricks account

Lecture 16 Introduction to the Databricks environment

Lecture 17 First steps with Databricks

Section 4: Databricks Platform

Lecture 18 Importing notebooks, language configuration and markdown

Lecture 19 Databricks File System (DBFS)

Lecture 20 Create, manipulate and visualize tables

Lecture 21 Databricks widgets

Section 5: ETL, Dataframes and data visualization in Databricks

Lecture 22 Creating and saving DataFrames in Databricks

Lecture 23 Transformation and visualization of data in Databricks

Lecture 24 Population Data Analytics Lab

Section 6: Spark DataFrame API

Lecture 25 Spark SQL and SQL Dataframe API

Lecture 26 Temporary Views vs Global Temporary Views

Lecture 27 Spark Dataframes

Lecture 28 Spark SQL and SQL Dataframe API Lab

Section 7: Spark Column Expressions

Lecture 29 Introduction to Spark Column Expressions

Lecture 30 Column Expressions, operators and methods

Lecture 31 DataFrame Transformation Methods

Lecture 32 Subset Rows in Dataframe

Section 8: Dataframe Aggregations

Lecture 33 Spark Aggregation Methods

Lecture 34 Grouped data methods

Lecture 35 Aggregate Functions and Math Functions

Lecture 36 Functions and built-in functions review

Lecture 37 Dataframe NaN functions and dataframe join

Section 9: Machine Learning with Databricks and Apache Spark

Lecture 38 Import and exploratory analysis of data

Lecture 39 Variable preprocessing with PySpark and Databricks

Lecture 40 Definition of the Machine Learning model and development of the Pipeline

Lecture 41 Model evaluation with PySpark and Databricks

Lecture 42 Hyperparameter tuning and registration in MLflow

Lecture 43 Predictions with new data and visualization of the results

Section 10: Databricks Koalas: The Pandas API for Apache Spark

Lecture 44 Spark Koalas Fundamentals

Lecture 45 Feature Engineering with Koalas

Lecture 46 Creating DataFrames with Koalas

Lecture 47 Data Manipulation and DataFrames with Koalas

Lecture 48 Working with missing data in Koalas

Lecture 49 Data visualization and graph generation with Koalas

Lecture 50 Import and export data with Koalas

Section 11: Spark Streaming at Databricks

Lecture 51 Example of Streaming word count with Spark Streaming

Lecture 52 Spark Streaming Configurations: Output Modes and Operation Types

Lecture 53 Spark Streaming Capabilities

Who this course is for

Anyone who wants to learn Databricks

Anyone who wants to learn advanced big data skills

Anyone who wants to make a career as a data engineer, data analyst or data scientist

Anyone interested in learning Apache Spark and PySpark for Big Data analytics

Anyone who wants to learn cutting-edge technology in data processing