Tags
Language
Tags
June 2025
Su Mo Tu We Th Fr Sa
1 2 3 4 5 6 7
8 9 10 11 12 13 14
15 16 17 18 19 20 21
22 23 24 25 26 27 28
29 30 1 2 3 4 5
    Attention❗ To save your time, in order to download anything on this site, you must be registered 👉 HERE. If you do not have a registration yet, it is better to do it right away. ✌

    ( • )( • ) ( ͡⚆ ͜ʖ ͡⚆ ) (‿ˠ‿)
    SpicyMags.xyz

    Data Engineering using Databricks features on AWS and Azure (2021)

    Posted By: ParRus
    Data Engineering using Databricks features on AWS and Azure (2021)

    Data Engineering using Databricks features on AWS and Azure
    WEBRip | English | MP4 | 1280 x 720 | AVC ~893 Kbps | 30 fps
    AAC | 128 Kbps | 44.1 KHz | 2 channels | Subs: English (.srt) | ~9.5 hours | 4.67 GB
    Genre: eLearning Video / IT & Software, Other IT & Software, Databricks

    Build Data Engineering Pipelines using Databricks core features such as Spark, Delta Lake, cloudFiles, etc.
    What you'll learn
    Data Engineering leveraging Databricks features
    Databricks CLI to manage files, Data Engineering jobs and clusters for Data Engineering Pipelines
    Deploying Data Engineering applications developed using PySpark on job clusters
    Deploying Data Engineering applications developed using PySpark using Notebooks on job clusters
    Perform CRUD Operations leveraging Delta Lake using Spark SQL for Data Engineering Applications or Pipelines
    Perform CRUD Operations leveraging Delta Lake using Pyspark for Data Engineering Applications or Pipelines
    Setting up development environment to develop Data Engineering applications using Databricks
    Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters
    Incremental File Processing using Spark Structured Streaming leveraging Databricks Auto Loader cloudFiles
    Overview of Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
    Differences between Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications
    Differences between traditional Spark Structured Streaming and leveraging Databricks Auto Loader cloudFiles for incremental file processing.

    Requirements
    Programming experience using Python
    Data Engineering experience using Spark
    Ability to write and interpret SQL Queries
    This course is ideal for experience data engineers to add Databricks as one of the key skill as part of the profile

    Description
    As part of this course, you will learn all the Data Engineering using cloud platform-agnostic technology called Databricks.

    About Data Engineering

    Data Engineering is nothing but processing the data depending upon our downstream needs. We need to build different pipelines such as Batch Pipelines, Streaming Pipelines, etc as part of Data Engineering. All roles related to Data Processing are consolidated under Data Engineering. Conventionally, they are known as ETL Development, Data Warehouse Development, etc.

    About Databricks

    Databricks is the most popular cloud platform-agnostic data engineering tech stack. They are the committers of the Apache Spark project. Databricks run time provide Spark leveraging the elasticity of the cloud. With Databricks, you pay for what you use. Over a period of time, they came up with an idea of Lakehouse by providing all the features that are required for traditional BI as well as AI & ML. Here are some of the core features of Databricks.

    Spark - Distributed Computing

    Delta Lake - Perform CRUD Operations. It is primarily used to build capabilities such as inserting, updating, and deleting the data from files in Data Lake.

    cloudFiles - Get the files in an incremental fashion in the most efficient way leveraging cloud features.

    Course Details

    As part of this course, you will be learning Data Engineering using Databricks.

    Getting Started with Databricks

    Setup Local Development Environment to develop Data Engineering Applications using Databricks

    Using Databricks CLI to manage files, jobs, clusters, etc related to Data Engineering Applications

    Spark Application Development Cycle to build Data Engineering Applications

    Databricks Jobs and Clusters

    Deploy and Run Data Engineering Jobs on Databricks Job Clusters as Python Application

    Deploy and Run Data Engineering Jobs on Job Cluster using Notebooks

    Deep Dive into Delta Lake using Dataframes

    Deep Dive into Delta Lake using Spark SQL

    Building Data Engineering Pipelines using Spark Structured Streaming on Databricks Clusters

    Incremental File Processing using Spark Structured Streaming leveraging Databricks Auto Loader cloudFiles

    Overview of Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications

    Differences between Auto Loader cloudFiles File Discovery Modes - Directory Listing and File Notifications

    Differences between traditional Spark Structured Streaming and leveraging Databricks Auto Loader cloudFiles for incremental file processing.

    We will be adding few more modules related to Pyspark, Spark with Scala, Spark SQL, Streaming Pipelines in the coming weeks.

    Desired Audience

    Here is the desired audience for this advanced course.

    Experienced application developers to gain expertise related to Data Engineering with prior knowledge and experience of Spark.

    Experienced Data Engineers to gain enough skills to add Databricks to their profile.

    Testers to improve their testing capabilities related to Data Engineering applications using Databricks.

    Prerequisites

    Logistics

    Computer with decent configuration (At least 4 GB RAM, however 8 GB is highly desired)

    Dual Core is required and Quad-Core is highly desired

    Chrome Browser

    High-Speed Internet

    Valid AWS Account

    Valid Databricks Account (free Databricks Account is not sufficient)

    Experience as Data Engineer especially using Apache Spark

    Knowledge about some of the cloud concepts such as storage, users, roles, etc.

    Associated Costs

    As part of the training, you will only get the material. You need to practice on your own or corporate cloud account and Databricks Account.

    You need to take care of the associated AWS or Azure costs.

    You need to take care of the associated Databricks costs.

    Training Approach

    Here are the details related to the training approach.

    It is self-paced with reference material, code snippets, and videos provided as part of Udemy.

    One needs to sign up for their own Databricks environment to practice all the core features of Databricks.

    We would recommend completing 2 modules every week by spending 4 to 5 hours per week.

    It is highly recommended to take care of all the tasks so that one can get real experience of Databricks.

    Support will be provided through Udemy Q&A.

    Who this course is for:
    Beginner or Intermediate Data Engineers who want to learn Databricks for Data Engineering
    Intermediate Application Engineers who want to explore Data Engineering using Databricks
    Data and Analytics Engineers who want to learn Data Engineering using Databricks
    Testers who want to learn Databricks to test Data Engineering applications built using Databricks

    also You can find my other last: IT & Software-posts

    General
    Complete name : 003 Interacting with File System using CLI.mp4
    Format : MPEG-4
    Format profile : Base Media
    Codec ID : isom (isom/iso2/avc1/mp41)
    File size : 70.4 MiB
    Duration : 9 min 34 s
    Overall bit rate : 1 029 kb/s
    Writing application : Lavf58.12.100

    Video
    ID : 1
    Format : AVC
    Format/Info : Advanced Video Codec
    Format profile : Main@L3.1
    Format settings : CABAC / 4 Ref Frames
    Format settings, CABAC : Yes
    Format settings, RefFrames : 4 frames
    Format settings, GOP : M=4, N=60
    Codec ID : avc1
    Codec ID/Info : Advanced Video Coding
    Duration : 9 min 34 s
    Bit rate : 893 kb/s
    Nominal bit rate : 3 000 kb/s
    Width : 1 280 pixels
    Height : 720 pixels
    Display aspect ratio : 16:10
    Frame rate mode : Constant
    Frame rate : 30.000 FPS
    Color space : YUV
    Chroma subsampling : 4:2:0
    Bit depth : 8 bits
    Scan type : Progressive
    Bits/(Pixel*Frame) : 0.032
    Stream size : 61.1 MiB (87%)
    Writing library : x264 core 148
    Encoding settings : cabac=1 / ref=3 / deblock=1:0:0 / analyse=0x1:0x111 / me=umh / subme=6 / psy=1 / psy_rd=1.00:0.00 / mixed_ref=1 / me_range=16 / chroma_me=1 / trellis=1 / 8x8dct=0 / cqm=0 / deadzone=21,11 / fast_pskip=1 / chroma_qp_offset=-2 / threads=22 / lookahead_threads=3 / sliced_threads=0 / nr=0 / decimate=1 / interlaced=0 / bluray_compat=0 / constrained_intra=0 / bframes=3 / b_pyramid=2 / b_adapt=1 / b_bias=0 / direct=1 / weightb=1 / open_gop=0 / weightp=2 / keyint=60 / keyint_min=6 / scenecut=0 / intra_refresh=0 / rc_lookahead=60 / rc=cbr / mbtree=1 / bitrate=3000 / ratetol=1.0 / qcomp=0.60 / qpmin=0 / qpmax=69 / qpstep=4 / vbv_maxrate=3000 / vbv_bufsize=6000 / nal_hrd=none / filler=0 / ip_ratio=1.40 / aq=1:1.00

    Audio
    ID : 2
    Format : AAC
    Format/Info : Advanced Audio Codec
    Format profile : LC
    Codec ID : mp4a-40-2
    Duration : 9 min 34 s
    Bit rate mode : Constant
    Bit rate : 128 kb/s
    Channel(s) : 2 channels
    Channel positions : Front: L R
    Sampling rate : 44.1 kHz
    Frame rate : 43.066 FPS (1024 SPF)
    Compression mode : Lossy
    Stream size : 8.76 MiB (12%)
    Default : Yes
    Alternate group : 1

    Screenshots

    Data Engineering using Databricks features on AWS and Azure (2021)

    Data Engineering using Databricks features on AWS and Azure (2021)

    Data Engineering using Databricks features on AWS and Azure (2021)

    Data Engineering using Databricks features on AWS and Azure (2021)

    Data Engineering using Databricks features on AWS and Azure (2021)

    ✅ Exclusive eLearning Videos ParRus-blogadd to bookmarks
    Feel free to contact me PM
    when links are dead or want any repost

    Data Engineering using Databricks features on AWS and Azure (2021)