    GCP Data Engineering - End to End Project - Retailer Domain

    Posted By: lucky_aut
    Published 5/2025
    Duration: 6h 5m | .MP4 1920x1080 30 fps(r) | AAC, 44100 Hz, 2ch | 3.18 GB
    Genre: eLearning | Language: English

    An industry-standard project in the retailer domain using GCP services such as GCS, BigQuery, Dataproc, and Composer, with GitHub and CI/CD

    What you'll learn
    - Understand the End to End Data Engineering Project for Retailer Domain
    - Design and Implement Scalable ETL Pipelines for Healthcare Data
    - Implement key techniques such as incremental data loading, SCD2, a metadata-driven approach, Medallion architecture, error handling, CDM, CI/CD, and more
    - Develop and Deploy Data Solutions with CI/CD Practices

    Requirements
    - Basic knowledge of Python and SQL

    Description
    This project focuses on building a data lake in Google Cloud Platform (GCP) for the retailer domain.

    The goal is to centralize, clean, and transform data from multiple sources, enabling retail providers and insurance companies to streamline billing, claims processing, and revenue tracking.

    GCP Services Used:

    Google Cloud Storage (GCS): Stores raw and processed data files.

    BigQuery: Serves as the analytical engine for storing and querying structured data.

    Dataproc: Used for large-scale data processing with Apache Spark.

    Cloud Composer (Apache Airflow): Automates ETL pipelines and workflow orchestration.

    Cloud SQL (MySQL): Stores transactional Electronic Medical Records (EMR) data.

    GitHub & Cloud Build: Enables version control and CI/CD implementation.

    CI/CD (Continuous Integration & Continuous Deployment): Automates deployment pipelines for data processing and ETL workflows.

    Techniques involved:

    Metadata Driven Approach

    SCD Type 2 implementation

    CDM (Common Data Model)

    Medallion Architecture

    Logging and Monitoring

    Error Handling

    Optimizations

    CI/CD implementation

    Many more best practices
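
    To make one of the listed techniques concrete, here is a minimal sketch of SCD Type 2 in plain Python. It is not the course's implementation (which presumably runs on BigQuery or Spark); the names DimRow and apply_scd2 and the sample records are hypothetical and chosen only for illustration. The idea: when a tracked attribute changes, the current row is closed with an end date and a new current row is appended, so history is preserved.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class DimRow:
    """One version of a dimension record (hypothetical layout)."""
    key: str                   # business key
    attrs: Tuple[str, ...]     # tracked attributes
    valid_from: date
    valid_to: Optional[date]   # None means the version is still open
    is_current: bool


def apply_scd2(dim: List[DimRow],
               incoming: Dict[str, Tuple[str, ...]],
               load_date: date) -> List[DimRow]:
    """Return a new dimension table with SCD Type 2 changes applied."""
    result: List[DimRow] = []
    keys_with_current = set()
    for row in dim:
        if row.is_current and row.key in incoming:
            keys_with_current.add(row.key)
            new_attrs = incoming[row.key]
            if new_attrs != row.attrs:
                # Attribute changed: close the old version, open a new one.
                result.append(replace(row, valid_to=load_date, is_current=False))
                result.append(DimRow(row.key, new_attrs, load_date, None, True))
                continue
        # Unchanged current rows and historical rows are kept as-is.
        result.append(row)
    # Keys without a current version are inserted as new current rows.
    for key, attrs in incoming.items():
        if key not in keys_with_current:
            result.append(DimRow(key, attrs, load_date, None, True))
    return result


# Illustrative data only.
dim = [DimRow("P001", ("Alice", "Boston"), date(2024, 1, 1), None, True)]
incoming = {"P001": ("Alice", "Chicago"), "P002": ("Bob", "Denver")}
updated = apply_scd2(dim, incoming, date(2025, 5, 1))
for r in updated:
    print(r.key, r.attrs, r.valid_from, r.valid_to, r.is_current)
```

    In a warehouse the same close-and-insert logic is usually expressed as a merge against the dimension table; the in-memory version above only shows the row-level rules.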

    Data Sources

    EMR (Electronic Medical Records) data from two hospitals

    Claims files

    CPT (Current Procedural Terminology) Code

    NPI (National Provider Identifier) Data

    Expected Outcomes

    Efficient Data Pipeline: Automated ingestion and transformation of RCM (Revenue Cycle Management) data.

    Structured Data Warehouse: Gold tables in BigQuery for analytical queries.

    KPI Dashboards: Insights into revenue collection, claims processing efficiency, and financial trends.

    Who this course is for:
    - Aspiring data engineers and data professionals
    - Anyone preparing for data engineering interviews