
    Apache Spark 3 For Data Engineering & Analytics With Python

    Posted By: Sigha

    MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
    Language: English (US) | Size: 1.93 GB | Duration: 8h 39m

    Learn how to use Python and PySpark 3.0.1 for Data Engineering / Analytics (Databricks) - Beginner to Ninja

    What you'll learn
    Learn the Spark Architecture
    Learn Spark Execution Concepts
    Learn Spark Transformations and Actions using the Structured API
    Learn Spark Transformations and Actions using the RDD (Resilient Distributed Datasets) API
    Learn how to set up your own local PySpark Environment
    Learn how to interpret the Spark Web UI
    Learn how to interpret DAG (Directed Acyclic Graph) for Spark Execution
    Learn the RDD (Resilient Distributed Datasets) API (Crash Course)
    Learn the Spark DataFrame API (Structured APIs)
    Learn Spark SQL
    Learn Spark on Databricks
    Learn to Visualize (Graphs and Dashboards) Data on Databricks

    Requirements
    A basic laptop or PC running Windows or macOS with at least 6-8 GB of RAM
    Basic programming knowledge

    Description
    The key objectives of this course are as follows:
    Learn the Spark Architecture
    Learn Spark Execution Concepts
    Learn Spark Transformations and Actions using the Structured API
    Learn Spark Transformations and Actions using the RDD (Resilient Distributed Datasets) API
    Learn how to set up your own local PySpark Environment
    Learn how to interpret the Spark Web UI
    Learn how to interpret DAG (Directed Acyclic Graph) for Spark Execution
    Learn the RDD (Resilient Distributed Datasets) API (Crash Course)
        RDD Transformations
        RDD Actions
    Learn the Spark DataFrame API (Structured APIs)
        Create Schemas and Assign DataTypes
        Read and Write Data using the DataFrame Reader and Writer
        Read Semi-Structured Data such as JSON
        Create and Add New Data Columns to the DataFrame using Expressions
        Filter the DataFrame using the "Filter" and "Where" Transformations
        Ensure that the DataFrame has unique rows
        Detect and Drop Duplicates
        Augment the DataFrame by Adding New Rows
        Combine 2 or More DataFrames
        Order the DataFrame by Specific Columns
        Rename and Drop Columns from the DataFrame
        Clean the DataFrame by Detecting and Removing Missing or Bad Data
        Create User-Defined Spark Functions
        Read and Write to/from Parquet Files
        Partition the DataFrame and Write to Parquet Files
        Aggregate the DataFrame using Spark SQL functions (count, countDistinct, Max, Min, Sum, SumDistinct, AVG)
        Perform Aggregations with Grouping
    Learn Spark SQL and Databricks
        Create a Databricks Account
        Create a Databricks Cluster
        Create Databricks SQL and Python Notebooks
        Learn Databricks shortcuts
        Create Databases and Tables using Spark SQL
        Use DML, DQL, and DDL with Spark SQL
        Use Spark SQL Functions
        Learn the differences between Managed and Unmanaged Tables
        Read CSV Files from the Databricks File System
        Learn to write Complex SQL
        Create Visualisations with Databricks
        Create a Databricks Dashboard

    The Python Spark projects that we are going to do together:
    Sales Data
        Create a Spark Session
        Read a CSV file into a Spark DataFrame
        Learn to Infer a Schema
        Select data from the Spark DataFrame
        Produce analytics that show the topmost sales orders per Region and Country
    Convert Fahrenheit to Degrees Centigrade
        Create a Spark Session
        Read and Parallelize data using the Spark Context into an RDD
        Create a Function to Convert Fahrenheit to Degrees Centigrade
        Use the Map Function to convert data contained within an RDD
        Filter temperatures greater than or equal to 13 degrees Celsius
    XYZ Research
        Create a set of RDDs that hold Research Data
        Use the union transformation to combine RDDs
        Learn to use the subtract transformation to remove values from an RDD
        Use the RDD API to answer the following questions
            How many research projects were initiated in the first three years?
            How many projects were completed in the first year?
            How many projects were completed in the first two years?
    Sales Analytics
        Create the Sales Analytics DataFrame from a set of CSV Files
        Prepare the DataFrame by applying a Structure
        Remove bad records from the DataFrame (Cleaning)
        Generate New Columns from the DataFrame
        Write a Partitioned DataFrame to a Parquet Directory
        Answer the following questions and create visualizations using Seaborn and Matplotlib
            What was the best month in sales?
            What city sold the most products?
            What time should the business display advertisements to maximize the likelihood of customers buying products?
            What products are often sold together in the state "NY"?

    Technology Spec
        Python
        Jupyter Notebook
        Jupyter Lab
        PySpark (Spark with Python)
        Pandas
        Matplotlib
        Seaborn
        Databricks
        SQL
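
    The Fahrenheit-to-Centigrade project listed above comes down to a map step followed by a filter step. The following is only a rough sketch in plain Python, not the course's own code: the function name and the sample readings are hypothetical, and the built-in map and a list comprehension stand in for the RDD map and filter transformations. With PySpark installed, the same function could be passed to an RDD, as noted in the closing comment.

```python
def fahrenheit_to_centigrade(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Centigrade."""
    return (temp_f - 32) * 5 / 9


# Hypothetical sample readings in Fahrenheit (assumed, not taken from the course).
readings_f = [50, 59, 32, 68, 41]

# Map step: convert every reading (plain-Python stand-in for rdd.map).
readings_c = list(map(fahrenheit_to_centigrade, readings_f))

# Filter step: keep temperatures >= 13 degrees Celsius (stand-in for rdd.filter).
warm_c = [t for t in readings_c if t >= 13]

# Assumed PySpark equivalent, given an existing SparkContext named sc:
#   sc.parallelize(readings_f) \
#     .map(fahrenheit_to_centigrade) \
#     .filter(lambda t: t >= 13) \
#     .collect()
```

    Keeping the conversion as a small standalone function means it can be checked on its own before it is handed to a distributed transformation.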

    Who this course is for:
    Python Developers who wish to learn how to use the language for Data Engineering and Analytics with PySpark
    Aspiring Data Engineering and Analytics Professionals
    Data Scientists / Analysts who wish to learn an analytical processing strategy that can be deployed over a big data cluster
    Data Managers who want to gain a deeper understanding of managing data over a cluster

