Databricks And Pyspark For Big Data: From Zero To Expert
Published 12/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 1013.05 MB | Duration: 2h 54m
Complete course to learn Databricks, including PySpark, Dataframes, Machine Learning, Advanced Analytics and Streaming
What you'll learn
Processing Big Data with PySpark in Databricks
Databricks environment and Platform
ETL, Dataframes and data visualization in Databricks
PySpark in Databricks with RDDs, Spark Dataframes API or Spark SQL
Spark Column Expressions and Dataframe Aggregations
Spark Data Sources and Format types
Spark Architecture Concepts and Query Optimization
Advanced analytics and data visualization with Databricks
Machine Learning with Spark at Databricks
Spark Streaming at Databricks
Requirements
PySpark Fundamentals
Description
If you are looking for a hands-on, complete and advanced course to learn Databricks and PySpark, you have come to the right place.
Databricks is a data analytics platform powered by Apache Spark for data engineering, data science, and machine learning. Databricks has become one of the most important platforms to work with Spark, compatible with Azure, AWS and Google Cloud. This makes Databricks and Apache Spark some of the most in-demand skills for data engineers and data scientists, and some of the most valuable skills today. This course will teach you everything you need to know to position yourself in the Big Data job market.
This course is designed to prepare you to learn everything related to Databricks and Apache Spark, from the Databricks environment, platform and functionalities, to the Spark SQL API, Spark Dataframes, Spark Streaming, Machine Learning, advanced analytics and data visualization in Databricks.
With complete training, downloadable study guides, hands-on exercises, and real-world use cases, this is the only course you'll ever need to learn Databricks and Apache Spark. You will learn Databricks, starting from the basics to the most advanced functionalities. To do so, we will use visual presentations, sharing clear explanations and useful professional advice.
This course covers the following sections:
• Introduction to Big Data and Apache Spark
• Spark Fundamentals with Spark RDDs and Dataframes
• Databricks environment
• Advanced analytics and data visualization with Databricks
• Machine Learning with Spark at Databricks
• Spark Streaming at Databricks
If you're ready to improve your skills, increase your career opportunities, and become a Big Data expert, join today and get immediate and lifetime access to:
• Complete Guide to Databricks with Apache Spark (PDF e-book)
• Downloadable project files
• Practical exercises and questionnaires
• Databricks resources such as cheat sheets and summaries
• 1-to-1 expert support
• Course Q&A forum
See you there!
Overview
Section 1: Introduction to Apache Spark and Big Data
Lecture 1 How to get the most out of this course
Lecture 2 Spark Fundamentals
Lecture 3 How Apache Spark works
Lecture 4 Apache Spark ecosystem and official documentation
Lecture 5 PySpark: cluster management and architecture
Section 2: Spark Architecture Concepts
Lecture 6 Spark Optimization Techniques
Lecture 7 Lazy Evaluation
Lecture 8 Wide and Narrow Transformations
Lecture 9 Parquet file in Spark
Lecture 10 Parallelism and Partitions
Lecture 11 Shuffling
Lecture 12 Caching and Storage Levels
Section 3: Databricks Fundamentals
Lecture 13 Introduction to Databricks
Lecture 14 Databricks Terminology and Databricks Community
Lecture 15 Create a free Databricks account
Lecture 16 Introduction to the Databricks environment
Lecture 17 First steps with Databricks
Section 4: Databricks Platform
Lecture 18 Importing notebooks, language configuration and markdown
Lecture 19 Databricks File System (DBFS)
Lecture 20 Create, manipulate and visualize tables
Lecture 21 Databricks widgets
Section 5: ETL, Dataframes and data visualization in Databricks
Lecture 22 Creating and saving DataFrames in Databricks
Lecture 23 Transformation and visualization of data in Databricks
Lecture 24 Population Data Analytics Lab
Section 6: Spark DataFrame API
Lecture 25 Spark SQL and SQL Dataframe API
Lecture 26 Temporary Views vs Global Temporary Views
Lecture 27 Spark Dataframes
Lecture 28 Spark SQL and SQL Dataframe API Lab
Section 7: Spark Column Expressions
Lecture 29 Introduction to Spark Column Expressions
Lecture 30 Column Expressions, operators and methods
Lecture 31 DataFrame Transformation Methods
Lecture 32 Subset Rows in Dataframe
Section 8: Dataframe Aggregations
Lecture 33 Spark Aggregation Methods
Lecture 34 Grouped data methods
Lecture 35 Aggregate Functions and Math Functions
Lecture 36 Functions and built-in functions review
Lecture 37 Dataframe NaN functions and dataframe join
Section 9: Machine Learning with Databricks and Apache Spark
Lecture 38 Import and exploratory analysis of data
Lecture 39 Variable preprocessing with PySpark and Databricks
Lecture 40 Definition of the Machine Learning model and development of the Pipeline
Lecture 41 Model evaluation with PySpark and Databricks
Lecture 42 Hyperparameter tuning and registration in MLflow
Lecture 43 Predictions with new data and visualization of the results
Section 10: Databricks Koalas: The Pandas API for Apache Spark
Lecture 44 Spark Koalas Fundamentals
Lecture 45 Feature Engineering with Koalas
Lecture 46 Creating DataFrames with Koalas
Lecture 47 Data Manipulation and DataFrames with Koalas
Lecture 48 Working with missing data in Koalas
Lecture 49 Data visualization and graph generation with Koalas
Lecture 50 Import and export data with Koalas
Section 11: Spark Streaming at Databricks
Lecture 51 Example of Streaming word count with Spark Streaming
Lecture 52 Spark Streaming Configurations: Output Modes and Operation Types
Lecture 53 Spark Streaming Capabilities
Who this course is for
Anyone who wants to learn Databricks
Anyone who wants to learn advanced big data skills
Anyone who wants to build a career as a data engineer, data analyst or data scientist
Anyone interested in learning Apache Spark and PySpark for Big Data analytics
Anyone who wants to learn cutting-edge data processing technology