System Design For Big Data Pipelines

Posted By: ELK1nG
Published 4/2023
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 1.82 GB | Duration: 6h 33m

Analyze, Design and Build scalable, resilient and cost-effective Big Data pipelines with a methodical process

What you'll learn

Learn about the building blocks of a big data pipeline, their functions and challenges

Adopt an end-to-end methodical approach to designing a big data pipeline

Explore techniques to ensure overall scaling of a big data pipeline

Study design patterns for building blocks, their advantages, shortcomings, applications and available technologies

Focus additionally on Infrastructure, Operations and Security for Big Data deployments

Exercise the learnings in the course with a Batch and Realtime use case study

Requirements

Big Data Technology Concepts

Familiarity with Big Data Technologies like Apache Spark, Apache Kafka and NoSQL

Development / Deployment Experience with Big Data Technologies and Pipelines

Software Design and Development Experience including Cloud & Microservices

Description

Big data technologies have grown rapidly over the past few years and have penetrated every domain and industry in software development. Working with them has become a core skill for software engineers. Robust and effective big data pipelines are needed to support the growing volume of data and applications in the big data world. These pipelines have become business critical and help increase revenue and reduce cost.

Do quality big data pipelines happen by magic? High-quality designs that are scalable, reliable and cost-effective are needed to build and maintain these pipelines.

How do you build an end-to-end big data pipeline that leverages big data technologies and practices effectively to solve business problems? How do you integrate them in a scalable and reliable manner? How do you deploy, secure and operate them? How do you look at the overall forest and not just the individual trees? This course focuses on this skill gap.

What are the topics covered in this course?

We start off by discussing the building blocks of big data pipelines, their functions and challenges.

We introduce a structured design process for building big data pipelines.

We then discuss individual building blocks, focusing on the design patterns available, their advantages, shortcomings, use cases and available technologies.

We recommend several best practices across the course.

We finally implement two use cases to illustrate how to apply the learnings in the course to a real-world problem. One is a batch use case and the other is a real-time use case.

Overview

Section 1: Introduction & Expectations

Lecture 1 Need for Quality Pipeline Design

Lecture 2 Course Coverage and Pre-requisites

Lecture 3 Cloud Serverless Technologies

Section 2: Building Blocks for Big Data Pipelines

Lecture 4 The Big Data Pipeline Network

Lecture 5 Data Acquisition Blocks

Lecture 6 Data Transport Blocks

Lecture 7 Data Processing Blocks

Lecture 8 Data Storage Blocks

Lecture 9 Data Serving Blocks

Lecture 10 Data Pipeline Infrastructure

Lecture 11 Data Pipeline Operations

Section 3: System Design Process

Lecture 12 System Design Process Overview

Lecture 13 Analyze Functional Requirements

Lecture 14 Analyze Pipeline Input

Lecture 15 Analyze Non-functional Requirements

Lecture 16 Draw a Pipeline Flowchart

Lecture 17 Create a Skeleton Design

Lecture 18 Analyze Scaling

Lecture 19 Select Technologies

Lecture 20 Design Infrastructure and Operations

Lecture 21 Develop a Test Strategy

Section 4: Scalable Pipelines - Design Principles

Lecture 22 Batch vs Realtime Pipelines

Lecture 23 Distributed Architectures

Lecture 24 Microservices based Architectures

Lecture 25 Batch Pipelines - Best Practices

Lecture 26 Realtime Pipelines - Best Practices

Lecture 27 Performance Benchmarking for Big Data Pipelines

Section 5: Data Acquisition Design

Lecture 28 File Transfer Pattern

Lecture 29 Extraction Client Pattern

Lecture 30 Ingestion API Pattern

Lecture 31 Pub Sub Acquisition Pattern

Lecture 32 Data Acquisition Design Practices

Section 6: Data Transport Design

Lecture 33 Extract Load Pattern

Lecture 34 Request Response Pattern

Lecture 35 Event Streaming Pattern

Lecture 36 Data Transport Design Practices

Section 7: Data Processing & Transformation Design

Lecture 37 Data Processing Patterns

Lecture 38 Distributed Processing with Big Data

Lecture 39 Batch Processing Design Practices - Part 1

Lecture 40 Batch Processing Design Practices - Part 2

Lecture 41 Stream Processing Design Practices

Lecture 42 Batch vs Realtime Processing

Lecture 43 Input and Output Considerations for Processing

Lecture 44 Processing Engine Technologies

Section 8: Storage Design

Lecture 45 Distributed File System Pattern

Lecture 46 Relational Database Pattern

Lecture 47 Document Database Pattern

Lecture 48 Columnar Database Pattern

Lecture 49 Graph Database Pattern

Lecture 50 Distributed Cache Pattern

Lecture 51 Data Storage Design Practices - 1

Lecture 52 Data Storage Design Practices - 2

Section 9: Serving Design

Lecture 53 Query Interface Pattern

Lecture 54 Serving API Pattern

Lecture 55 Push Client Pattern

Lecture 56 Publish Subscribe Pattern

Lecture 57 Data Serving Design Practices

Section 10: Infrastructure and Deployments

Lecture 58 Infrastructure Technologies

Lecture 59 Microservices Deployments

Lecture 60 Processing Jobs Deployments

Lecture 61 Databases and Queues Deployments

Lecture 62 Geographical Distribution

Section 11: Security

Lecture 63 Pipeline Security by Design

Lecture 64 Secure External Interfaces

Lecture 65 Secure Data Storage

Lecture 66 Privacy Considerations

Lecture 67 Multi-Tenancy Considerations

Section 12: Serviceability

Lecture 68 Elements of Serviceability

Lecture 69 Monitoring Pipelines

Lecture 70 Data to Monitor

Lecture 71 Pipeline Troubleshooting

Section 13: Use Case I : Customer Journey Analytics (CJA)

Lecture 72 Problem Definition for CJA

Lecture 73 Study CJA Functional Requirements

Lecture 74 Analyze CJA Input Data

Lecture 75 Study CJA Non-Functional Requirements

Lecture 76 Study CJA Pipeline Flowchart

Lecture 77 Create CJA Skeleton Design

Lecture 78 Analyze CJA Scaling

Lecture 79 Select Technologies for CJA

Lecture 80 Design Infrastructure and Operations for CJA

Section 14: Use Case II : Suspicious Login Alerting (SLA)

Lecture 81 Problem Definition for SLA

Lecture 82 Study SLA Functional Requirements

Lecture 83 Analyze SLA Input Data

Lecture 84 Study SLA Non-Functional Requirements

Lecture 85 Draw SLA Pipeline Flowchart

Lecture 86 Create SLA Skeleton Design

Lecture 87 Analyze SLA Scaling

Lecture 88 Select SLA Technologies

Lecture 89 Define SLA Infrastructure and Operations

Section 15: Conclusion

Lecture 90 Thank You

Who this course is for

Big Data Pipeline Designers & Architects

Big Data Developers looking to move into Design/Architecture roles

Software Architects looking to gain Big Data Experience