Python Statistical Methods: Machine Learning & Data Science
Published 10/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.30 GB | Duration: 7h 21m
Practical Statistics with Python for Data Science & Machine Learning Modeling Using scikit-learn and SciPy
What you'll learn
You will learn to use exploratory data analysis in data science.
You will learn the most common data types such as continuous and categorical data.
You will learn the central tendency measures and the dispersion measures in statistics.
You will learn the concepts of population data vs sample data.
You will learn what random sampling means and how it affects data analysis.
You will learn about outliers and sampling errors and how they are related to data analysis.
You will learn how to visualize data distribution using boxplots, violin plots, histograms, and density plots.
You will learn how to visualize categorical data using bar plots and pie charts.
You will learn how to calculate correlation and covariance between features in the dataset.
You will learn how to visualize a correlation matrix using heat maps.
You will learn the most common probability distributions such as normal distribution and binomial distribution.
You will learn how to perform normality tests to check for deviation from normality.
You will learn how to test skewed distributions in real-world data.
You will learn how to standardize and normalize data to have the same scale.
You will learn how to transform skewed data to be normally distributed using different transformation methods such as log, square root, and power transformation.
You will learn how to calculate confidence intervals for statistical estimates such as model accuracy.
You will learn bootstrapping in statistics and how it is used in machine learning.
You will learn how to evaluate machine learning models.
You will practically understand the concepts of bias and variance in data modeling.
You will understand what we mean by underfitting and overfitting in machine learning and statistical modeling.
You will learn the most common evaluation metrics for regression models in machine learning.
You will learn the evaluation metrics for classification models.
You will learn how to validate predictive machine learning models such as regression and classification models.
You will learn how to use different validation techniques for machine learning such as hold-out validation and cross-validation techniques.
Requirements
No background in statistics is needed; everything will be explained in this course. Basic knowledge of Python is helpful.
Description
This course is ideal for you if you want to gain knowledge of the statistical methods required for data science and machine learning!
Learning statistics is an essential part of becoming a professional data scientist. Most data science learners study Python for data science and ignore or postpone studying statistics. One reason for that is the lack of resources and courses that teach statistics for data science and machine learning.
Statistics is a huge field of science, but the good news for data science learners is that not all of statistics is required for data science and machine learning. However, this fact makes it more difficult for learners to study statistics, because they are not sure where to start or which topics of statistics are most relevant to data science. This course aims to close this gap.
This course is designed both for beginners with no background in statistics for data science and for those looking to extend their knowledge in the field of statistics for data science.
I have organized this course to be used as a video library so that you can use it in the future as a reference. Every lecture in this comprehensive course covers a single topic. I will guide you through the most common and essential methods of statistics for data analysis and data modeling.
My course is equivalent to a college-level course in statistics for data science and machine learning that usually costs thousands of dollars. Here, I give you the opportunity to learn all that information at a fraction of the cost! It includes 77 HD video lectures, many exercises, and two projects with solutions. All materials presented in this course are provided in detailed downloadable notebooks for every lecture.
Most students focus on learning Python code for data science; however, this is not enough to be a proficient data scientist. You also need to understand the statistical foundation of Python methods. Models and data analysis can be easily created in Python, but to be able to choose the correct method or select the best model, you need to understand the statistical methods that are used in these models.
Here are a few of the topics that you will be learning in this comprehensive course:
· Data Types and Structures
· Exploratory Data Analysis
· Central Tendency Measures
· Dispersion Measures
· Visualizing Data Distributions
· Correlation, Scatterplots, and Heat Maps
· Data Distribution and Data Sampling
· Data Scaling and Transformation
· Confidence Intervals
· Evaluation Metrics for Machine Learning
· Model Validation Techniques in Machine Learning
Enroll in the course and gain the essential knowledge of statistical methods for data science today!
Overview
Section 1: Introduction
Lecture 1 Overview of Course Curriculum
Lecture 2 Installing Jupyter Notebook Environment
Lecture 3 How to Download Exercises & Course Notebooks
Section 2: Data Types and Structures
Lecture 4 Built-in Data Structures - Tuple and List
Lecture 5 Built-in Data Structures - Dictionary and Set
Lecture 6 Numpy Arrays
Lecture 7 Pandas Series and Dataframes
Lecture 8 Data Types (Numeric or Categorical)
Lecture 9 Exercise: Create Data Structures in Python
Section 3: Exploratory Data Analysis (1): Central Tendency Measures
Lecture 10 Mean (Average)
Lecture 11 Weighted Average
Lecture 12 Median
Lecture 13 Population vs. Sample
Lecture 14 Application in Data Science
Lecture 15 Exercise: Calculate Central Tendency Measures
Section 4: Exploratory Data Analysis (2): Variability Measures
Lecture 16 Range
Lecture 17 Variance and Standard Deviation
Lecture 18 Percentile & Quartile
Lecture 19 Outlier – part 1
Lecture 20 Outlier – part 2
Lecture 21 Sampling Error
Lecture 22 Application in Data Science
Lecture 23 Exercise: Calculate Variability Measures
Section 5: Visualizing Data Distributions
Lecture 24 Box Plot
Lecture 25 Violin Plot
Lecture 26 Histogram and Density Plot
Lecture 27 Bar Plot for Categorical Data
Lecture 28 Pie Chart for Categorical Data
Lecture 29 Application in Data Science
Lecture 30 Exercise: Exploring Data Distribution
Section 6: Correlation, Scatterplots, and Heat Maps
Lecture 31 Correlation and Covariance Coefficients
Lecture 32 Correlation Using Scatter plot
Lecture 33 Mapping with Scatter plots
Lecture 34 Heat Maps
Lecture 35 Application in Data Science
Lecture 36 Exercise: Create Mapped Scatterplots and Heat Maps
Section 7: Capstone Project for Exploratory Analysis
Lecture 37 Project Description
Lecture 38 Solution Walk-through of the Project
Section 8: Data Distributions and Data Sampling
Lecture 39 Random Sampling and Bias
Lecture 40 Central Limit Theorem
Lecture 41 Normal Distribution
Lecture 42 Normality Tests for Real-World Data
Lecture 43 Skewed Data: Real-life Distributions
Lecture 44 Probability: A Practical Introduction
Lecture 45 Common Probability Distributions
Lecture 46 Exercise: Normal Distribution and Skewness
Section 9: Data Scaling and Transformation
Lecture 47 Data Scaling: Standardization
Lecture 48 Data Scaling: Normalization
Lecture 49 Log and Square Root Transformations
Lecture 50 Power Transformation (PowerTransformer)
Lecture 51 Application in Data Science
Lecture 52 Exercise: Data Scaling and Transformation
Section 10: Confidence Intervals (CI)
Lecture 53 CI for Continuous Data
Lecture 54 CI for Classification Data
Lecture 55 Bootstrapping For Unknown Distributions
Lecture 56 Nonparametric Confidence Interval with Bootstrapping
Lecture 57 Exercise: Create Confidence Interval
Section 11: Evaluation Metrics for Machine Learning
Lecture 58 Bias vs. Variance
Lecture 59 Overfitting and Underfitting
Lecture 60 Information Criteria for Model Selection
Lecture 61 Evaluation Metrics for Regression Models
Lecture 62 Evaluation Metrics for Classification Models – Part One
Lecture 63 Evaluation Metrics for Classification Models – Part Two
Lecture 64 Application in Data Science
Lecture 65 Exercise: Evaluating Machine Learning Models
Section 12: Model Validation Techniques in Machine Learning
Lecture 66 Hold Out Validation - Train/Test Split
Lecture 67 K-Fold Cross-Validation
Lecture 68 Leave-One-Out Cross-Validation (LOOCV)
Lecture 69 Application in Data Science
Lecture 70 Exercise: Validation Techniques in Machine Learning
Section 13: Final Project
Lecture 71 Project Description
Lecture 72 Walk-through Solution of the Project – Part One
Lecture 73 Walk-through Solution of the Project – Part Two
Lecture 74 Walk-through Solution of the Project – Part Three
This course is for students who want to learn statistics from a data science perspective.