Google Dataflow With Apache Beam - Beginner To Pro Course
Published 7/2025
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.44 GB | Duration: 4h 30m
Master Google Dataflow with hands-on projects | Apache Beam basics to advanced streaming & batch data pipelines
What you'll learn
Understand what Google Cloud Dataflow is and how it enables scalable data processing
Learn the Apache Beam programming model, including PCollections and PTransforms
Build end-to-end ETL pipelines for both batch and streaming data
Use Google Pub/Sub for real-time data ingestion and understand its architecture
Implement template-based pipelines for reusability and automation
Requirements
Basic understanding of Python
Familiarity with GCP is helpful but not mandatory
A willingness to learn hands-on and solve real-world challenges
Description
Are you looking to master Google Dataflow and Apache Beam to build scalable, production-ready data pipelines on Google Cloud Platform (GCP)? Whether you're a data engineer, cloud enthusiast, or aspiring GCP professional, this course takes you from zero to advanced through hands-on labs, real-world case studies, and practical assignments.
What You'll Learn
Understand the fundamentals of Google Cloud Dataflow and how it fits into the data engineering ecosystem
Explore the Apache Beam framework, the programming model behind Dataflow
Learn core concepts such as PCollections and PTransforms
Differentiate Dataflow vs Dataproc and know when to use each
Set up your own Cloud Workbench environment for hands-on practice
Build real-world ETL (Extract, Transform, Load) pipelines using Apache Beam
Use Google Pub/Sub for real-time data ingestion and understand its architecture
Develop pipelines using both approaches:
Template-based method - Case Study 1: template-driven pipeline
Custom code - Case Study 2: end-to-end batch pipeline; Case Study 3: end-to-end streaming pipeline
Complete hands-on assignments to reinforce learning and prepare for real-world scenarios
Hands-On Labs Include:
Beam basics with the Python/Java SDK
ETL development on Dataflow
Streaming pipeline using Pub/Sub
Batch pipeline using Cloud Storage
Debugging, monitoring, and optimizing pipeline performance
End-to-end pipeline creation from scratch
Overview
Section 1: Dataflow with Apache Beam
Lecture 1 Material and Datasets
Lecture 2 Course Introduction
Lecture 3 What is Dataflow - Apache Beam Introduction - How It Differs from Dataproc
Lecture 4 Workbench Creation - Beam Basics - Extract data from Multiple Data Sources
Lecture 5 How to write Data to Multiple Sinks
Lecture 6 Apache Beam Transformations
Lecture 7 Pipeline Creation: Template Method - Case Study 1
Lecture 8 Batch Pipeline Creation: Custom Code - Case Study 2
Lecture 9 Streaming Pipeline Creation with Pub/Sub: Custom Code - Case Study 3
Who this course is for: Beginners who want to learn and master Google Dataflow and Apache Beam, aspiring data engineers, and GCP enthusiasts