Exploratory Data Analysis | Build EDA App (Streamlit)
Last updated 6/2023
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 101.08 MB | Duration: 2h 42m
Master The Analysis and Transformation techniques done before the ML Project | Ensure Maximum Value for your data
What you'll learn
What is EDA
Why EDA is needed
What is multicollinearity
How to identify outliers
How to identify relationships between variables
Chi Square Test & other tests
How to transform continuous data
How to transform categorical data
Central Tendency Vs Dispersion
How to handle missing values in your dataset
How to apply EDA (through an assignment)
How to derive maximum value for your data
Requirements
Knowledge of Python and Machine Learning
Description
Recent updates
Jan 2023: EDA libraries (Klib, Sweetviz) that complete all the EDA activities with a few lines of code have been added.
July 2022: An explanatory video on the differences between data analysis and exploratory data analysis has been added.
Jan 2022: Conditional scatter plots have been added to assist with bivariate analysis.
Nov 2021: An exhaustive exercise covering all the possibilities of EDA has been added.
Testimonials about the course
"I found this course interesting and useful. Mr. Govind has tried to cover all important concepts in an effective manner. This course can be considered as an entry-level course for all machine learning enthusiasts. Thank you for sharing your knowledge with us." Dr. Raj Gaurav M.
"He is very clear. It's a perfect course for people doing ML based on data analysis." Dasika Sri Bhuvana V.
"This course gives you a good advice about how to understand your data, before start using it. Avoids that you create a bad model, just because the data wasn't cleaned." Ricardo V
Setting the context
Before you start a machine learning project, it's important to ensure that the data is ready for modeling work. Exploratory Data Analysis (EDA) helps ensure the readiness of the data for machine learning. In fact, EDA makes the data more usable. Without a proper EDA, machine learning work suffers from accuracy issues, and many times the algorithms won't work.
What is exploratory data analysis?
Exploratory data analysis (EDA) is used by data scientists to analyze and investigate data sets and summarize their main characteristics, often employing data visualization methods.
It helps determine how best to manipulate data sources to get the answers you need, making it easier for data scientists to discover patterns, spot anomalies, test a hypothesis, or check assumptions.
EDA is primarily used to see what data can reveal beyond the formal modeling or hypothesis testing task, and it provides a better understanding of data set variables and the relationships between them. It can also help determine if the statistical techniques you are considering for data analysis are appropriate. Originally developed by American mathematician John Tukey in the 1970s, EDA techniques continue to be a widely used method in the data discovery process today.
Why is exploratory data analysis important in data science?
The main purpose of EDA is to help look at data before making any assumptions. It can help identify obvious errors, better understand patterns within the data, detect outliers or anomalous events, and find interesting relations among the variables.
Data scientists can use exploratory analysis to ensure the results they produce are valid and applicable to any desired business outcomes and goals. EDA also helps stakeholders by confirming they are asking the right questions. EDA can help answer questions about standard deviations, categorical variables, and confidence intervals. Once EDA is complete and insights are drawn, its features can then be used for more sophisticated data analysis or modeling, including machine learning.
Programming Language Used
Python: an interpreted, object-oriented programming language with dynamic semantics. Its high-level, built-in data structures, combined with dynamic typing and dynamic binding, make it very attractive for rapid application development, as well as for use as a scripting or glue language to connect existing components together.
Python and EDA can be used together to identify missing values in a data set, which is important so you can decide how to handle missing values for machine learning.
What is covered in this course?
This course will teach you the techniques and approaches in exploratory data analysis, which will help you to derive maximum value from the data. If you jump into machine learning without doing this EDA, you are setting yourself up for failure and will likely end up with lower accuracy. This course is designed by an AI and tech veteran and comes to you straight from the oven!
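To illustrate the missing-value check mentioned in the description, here is a minimal sketch that assumes pandas and NumPy are installed. The toy DataFrame and its column names are made up for illustration and are not taken from the course material.

```python
import numpy as np
import pandas as pd

# Hypothetical toy dataset; the column names are illustrative only.
df = pd.DataFrame({
    "age": [25, np.nan, 47, 31],
    "city": ["Pune", "Delhi", None, "Mumbai"],
    "income": [52000.0, 61000.0, np.nan, np.nan],
})

# Count missing values per column.
missing_counts = df.isna().sum()

# Share of missing values per column, as a percentage.
missing_pct = df.isna().mean() * 100

# Combine both views into one summary table.
summary = pd.DataFrame({"missing": missing_counts, "percent": missing_pct})
print(summary)
```

A summary like this is one possible starting point for deciding whether to drop a column, drop rows, or impute values before modeling.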
Overview
Section 1: Introduction to EDA (Exploratory Data Analysis)
Lecture 1 Introduction to EDA
Section 2: Clarification between data analysis and EDA
Lecture 2 Clarification between data analysis and EDA
Section 3: Understanding EDA
Lecture 3 Dependent and Independent Variables & Data Type
Lecture 4 Null Values and Encoding
Lecture 5 Outliers and Data Transformation
Lecture 6 Multi Collinearity
Lecture 7 Imbalanced Dataset
Lecture 8 Data Scaling
Section 4: Data Analysis Using Pandas
Lecture 9 Getting Started with Pandas
Lecture 10 Data Analysis Using Pandas
Section 5: Code Walkthrough for EDA
Lecture 11 Code Walkthrough
Section 6: Assignment
Section 7: Deep Dive into Bivariate Analysis
Lecture 12 Bivariate Analysis - Continuous & Continuous
Lecture 13 Conditional Scatter Plots and Heatmap Using Seaborn: Advanced Data Visualization
Lecture 14 Bivariate Analysis - Categorical and Continuous
Lecture 15 Bivariate Analysis - Categorical and Categorical
Section 8: Addressing an imbalanced dataset
Lecture 16 Addressing an imbalanced dataset
Section 9: EDA Apps/Libraries - Klib, Sweetviz
Lecture 17 EDA Apps/Libraries - Klib, Sweetviz
Section 10: Create EDA App Using Streamlit
Lecture 18 Context Setting
Lecture 19 Infrastructure for Streamlit
Lecture 20 Creating a very simple web app and Getting started with streamlit
Lecture 21 Header and Sub Header
Lecture 22 Reading and displaying contents of a file
Lecture 23 Uploading a file
Lecture 24 EDA app
Section 11: Quiz
Section 12: Bonus Lecture
Lecture 25 Bonus Lecture
Data Scientists, Python Programmers, ML Practitioners, IT Managers managing data science projects, Beginners in Machine Learning