Data Pipelines With Snowflake And Streamlit

Posted By: ELK1nG

Published 9/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 kHz
Language: English | Size: 1.97 GB | Duration: 5h 18m

Using Snowflake to engineer Kaggle and Google Trends data with Python procedures and tasks

What you'll learn

Set up Snowflake and AWS accounts

Work with Kaggle and SerpAPI

Download and manipulate data with Jupyter Notebooks on VS Code

Work with External Access Integration and Storage Integration on Snowflake

Create Snowflake Python-based procedures

Create Snowflake tasks

Create Streamlit apps inside of Snowflake

Requirements

Proficiency in SQL and basic knowledge of the Snowflake database

Basic knowledge of data modeling and data engineering

Proficiency in Python

Description

This course focuses on building a data engineering pipeline that integrates multiple data sources, including Kaggle datasets and Google Trends data (fetched via SerpAPI), to analyze the relationship between Netflix show releases and the popularity of actors. You'll learn to gather and combine data on Netflix actors and their trends on Google, particularly in the weeks following a show's release.

You will use Kaggle as a source for the Netflix shows and actors dataset and Google Trends (accessed via SerpAPI) to fetch real-time search data for the actors. This data will be stored and processed within the Snowflake database, leveraging its cloud-native architecture for optimal scalability and performance.

Technical Stack Overview:

Snowflake Database: The central repository for storing and querying data.

Streamlit in Snowflake: A web app framework to visualize the data directly inside Snowflake.

AWS S3: For data storage and retrieval, particularly for intermediate datasets.

Snowflake Python Procedures: Automating data manipulation and pipeline processes.

Snowflake External Access & Storage Integrations: Managing secure access to external services and storage.

By the end of the course, you'll have a fully functional data pipeline that processes and combines streaming data, cloud storage, and APIs for trend analysis, visualized through an interactive Streamlit app within Snowflake.
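To make the core analysis in the description concrete, here is a minimal sketch in plain Python of how weekly search interest for an actor might be compared before and after a show's release. It is not taken from the course material: the sample rows, the field names (actor, release_date, week_start, interest), and the function names are all hypothetical, and the course itself performs this work inside Snowflake rather than in local Python.

```python
from datetime import date, timedelta
from statistics import mean

# Hypothetical sample data. In the course, show and actor data would come
# from Kaggle and weekly interest scores from Google Trends via SerpAPI.
release = {"actor": "Actor A", "show": "Show X", "release_date": date(2024, 1, 5)}

trend_rows = [
    {"actor": "Actor A", "week_start": date(2023, 12, 17), "interest": 10},
    {"actor": "Actor A", "week_start": date(2023, 12, 24), "interest": 10},
    {"actor": "Actor A", "week_start": date(2023, 12, 31), "interest": 12},
    {"actor": "Actor A", "week_start": date(2024, 1, 7), "interest": 40},
    {"actor": "Actor A", "week_start": date(2024, 1, 14), "interest": 30},
    {"actor": "Actor A", "week_start": date(2024, 1, 21), "interest": 20},
    {"actor": "Actor A", "week_start": date(2024, 1, 28), "interest": 14},
    {"actor": "Actor B", "week_start": date(2024, 1, 7), "interest": 99},
]


def interest_in_window(actor, rows, start, end):
    """Return interest scores for `actor` whose week starts in [start, end)."""
    ordered = sorted(rows, key=lambda r: r["week_start"])
    return [
        r["interest"]
        for r in ordered
        if r["actor"] == actor and start <= r["week_start"] < end
    ]


def release_lift(release, rows, weeks=4):
    """Compare average interest in the weeks after a release to the weeks before."""
    day = release["release_date"]
    window = timedelta(weeks=weeks)
    before = interest_in_window(release["actor"], rows, day - window, day)
    after = interest_in_window(release["actor"], rows, day, day + window)
    return {
        "before_avg": mean(before) if before else None,
        "after_avg": mean(after) if after else None,
    }


summary = release_lift(release, trend_rows)
print(summary)
```

In a warehouse setting the same comparison would more likely be written as a join between a shows table and a trends table on actor and date range; the Python version is only meant to show the shape of the calculation.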

Overview

Section 1: Introduction

Lecture 1 Introduction

Section 2: Setup - Part 1

Lecture 2 Snowflake Setup - 1

Lecture 3 Snowflake Setup - 2

Lecture 4 External Access Integration Request

Lecture 5 Kaggle Setup

Lecture 6 SerpAPI Setup

Lecture 7 VS Code Setup

Lecture 8 AWS Account Setup

Section 3: Sample download code

Lecture 9 Kaggle download script - 1

Lecture 10 Kaggle download script - 2

Lecture 11 SerpAPI download script - 1

Lecture 12 SerpAPI download script - 2

Section 4: Setup - Part 2

Lecture 13 Snowflake EAI request completion

Section 5: Database preparation

Lecture 14 Database preparation - 1

Lecture 15 Database preparation - 2

Section 6: Kaggle Python procedure

Lecture 16 Kaggle Python procedure - 1

Lecture 17 Kaggle Python procedure - 2

Lecture 18 Kaggle Python procedure - 3

Section 7: SerpAPI Python procedure

Lecture 19 SerpAPI Python procedure - 1

Lecture 20 SerpAPI Python procedure - 2

Lecture 21 SerpAPI Python procedure - 3

Lecture 22 SerpAPI Python procedure - 4

Section 8: Task design and DWH layer

Lecture 23 Task design - 1

Lecture 24 Task design - 2

Lecture 25 DWH design - 1

Lecture 26 DWH design - 2

Section 9: Streamlit app

Lecture 27 Streamlit app - 1

Lecture 28 Streamlit app - 2

Section 10: Pipeline enhancements

Lecture 29 Improvements summary

Lecture 30 Kaggle procedure update

Lecture 31 SerpAPI procedure update - 1

Lecture 32 SerpAPI procedure update - 2

Lecture 33 SerpAPI procedure update - 3

Lecture 34 SerpAPI procedure update - 4

Lecture 35 SerpAPI procedure update - 5

Lecture 36 SerpAPI procedure update - 6

Lecture 37 SerpAPI procedure update - 7

Lecture 38 SerpAPI procedure update - 8

Section 11: Conclusion

Lecture 39 Conclusion

Lecture 40 Course content

Who this course is for

Data engineers looking to become proficient with Snowflake and Streamlit for building data pipelines