Statistics With R: Core Concepts & Applications
Published 3/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 1.79 GB | Duration: 5h 48m
Discovering Statistical Analysis: Exploring Essential Skills and Concepts
What you'll learn
Learn the essentials of R programming, including installation, setup, and exploring datasets for effective data analysis.
Understand the concept of subjects within a population and their relevance in statistical analysis.
Explore five types of statistical questions and their applications in summarizing, comparing, and predicting data.
Differentiate between categorical and quantitative data and understand their significance in statistical analysis.
Gain insights into both descriptive and inferential statistics and their usage in analyzing sample and population data.
Explore variable distribution and frequency tables to gain insights into data patterns.
Learn to visualize categorical and quantitative data distributions using various graphical representations.
Understand the different shapes of distributions for quantitative variables and their implications.
Learn methods to describe the center of quantitative data, including mean, median, and mode.
Explore measures of variability, including range and standard deviation, to understand data spread.
Gain insights into the empirical rule for understanding data distribution and identifying outliers.
Understand percentiles and quartiles and their significance in summarizing data variability.
Explore the relationship between different variables, including categorical and quantitative variables, and understand correlation analysis.
Learn predictive analysis techniques to make informed predictions based on data patterns and trends.
Requirements
The prerequisites for taking this course include a willingness to learn. No prior experience with statistical analysis or programming is required. Whether you're a beginner or seeking to enhance your skills, this course offers a solid foundation in statistical analysis techniques using R programming.
Description
Discover the world of statistical analysis in our comprehensive course, designed to cover key concepts and skills. Starting with R programming, we'll explore its significance and how to set it up for statistical computing. Then we'll look at subjects within populations and the different types of statistical questions. Moving forward, we'll learn about descriptive and inferential statistics, data types, and visualization techniques. We'll explore the distribution of variables, visualize distributions using graphs, and understand the shape of distributions. We'll also cover measuring the center and variability of quantitative data, the empirical rule, percentiles, quartiles, and their graphical representation using box plots. Additionally, we'll explore the relationship between variables, including categorical and quantitative variables, and delve into correlation analysis. Through engaging modules, you'll gain practical skills and knowledge essential for effective statistical analysis in various settings.
Course Outline:
1: R Programming
In this section, we'll explore the world of statistical computing with R, an essential tool in data science. We'll learn why R is a fantastic choice: it's free and widely used in machine learning. We'll cover how to download and set up R and RStudio on different operating systems, ensuring everyone can follow along. Then we'll explore the RStudio interface and learn how to create a new project. We'll also discuss R packages, which extend R's capabilities, and how to install and load them. We'll start by installing R and RStudio, and then we'll learn the basics of R programming. We'll delve into R comments and datasets, and explore datasets stored in data frames, covering variable types such as integers, numerics, and factors (categorical variables).
Understanding these concepts will aid in effective data analysis in R. By the end, you'll be ready to embark on your statistical journey using R!
2: Subjects in the Population
This section outlines the concept of subjects within a population for statistical analysis. Subjects can range from people to various objects, such as orange trees, cars, or chickens, depending on the research focus. Examples illustrate the diverse nature of subjects in statistical studies.
3: Statistical Questions
Statistical questions are categorized into five types: descriptive, comparative, relationship, causal, and predictive. Descriptive questions aim to summarize data, while comparative questions compare groups. Relationship questions explore connections between variables, while causal questions investigate direct causation. Predictive questions use data summaries to make predictions. Examples illustrate each question type's purpose and application.
4: Types of Data
In this section, we will look at how the answers to our statistical questions can be divided into two types of data: categorical and quantitative. We will also explore their subtypes and understand why it is essential to classify data into these two categories.
5: Descriptive and Inferential Statistics
Here we will discuss why, when we have only a sample from the population, we need both descriptive and inferential statistics. If we have data for the entire population, however, descriptive statistics are enough and there is no need for inferential statistics. We'll use real-life examples, like surveys from the General Social Survey website and the UK age pyramid from the 2020 census, to understand these concepts.
7: Distribution of a Variable and Frequency Table
After that, we're going to explore the concept of variable distribution and how frequency tables can help us see this distribution clearly.
To understand this, we'll use survey data from the General Social Survey website, look at an age pyramid based on UK census data, and analyze information from the Titanic dataset on Wikipedia.
8: Visualize the Distribution of a Variable Using Graphs
We will create bar graphs and pie charts to show the distribution of categorical variables using the Titanic dataset. For a discrete variable, we will make a histogram using the GSS survey data. Lastly, we will create a histogram for a continuous variable using the Titanic dataset.
9: The Shape of the Distribution
We're going to talk about different shapes of distributions for quantitative variables. Imagine distributions like hills or valleys. There are three types: one with a single hill in the middle, another where one side stretches out longer than the other, and one where there are two hills. We'll learn about these using examples like heights of people, survey results, and how people rate products. It's like looking at different patterns in numbers.
10: Center of Quantitative Data
There are three ways to describe the center of quantitative data: mean, median, and mode. We will examine the population mean and median using a hypothetical population of female heights, and then we'll explore the sample mean and median using a simulated sample from this same hypothetical population.
11: Measuring the Variability of Quantitative Data
The simplest way to describe the variability of quantitative data is the range, which is easily affected by outliers. A more robust measure of variability is the standard deviation.
We will discuss how sample variability usually underestimates population variability, which is why the sample standard deviation formula is adjusted slightly (dividing by n-1 rather than n). We will also learn how to generate a hypothetical normal distribution using just two values: the mean and the standard deviation.
12: Empirical Rule
Then we will discuss the empirical rule, which uses the standard deviation to tell us how much of the data in a normal distribution falls within a given number of standard deviations of the mean. Then we will find out what makes an observation an outlier. At the end, we will see what the plausible values of the standard deviation could be.
13: Percentiles and Quartiles
In this section, we will discuss percentiles and quartiles using growth charts. Growth charts are special kinds of charts that help us understand how kids are growing by comparing their height and weight to other children of the same age and gender. We will also discuss SAT exam scores, another good example of using percentiles to compare how students perform relative to each other. Then we will summarize an entire dataset using five numbers, known as the five-number summary. At the end, we will see another way of measuring variability, the interquartile range, which, unlike the range and standard deviation, is not affected by outliers, along with its graphical representation, the box plot.
14: Relationship Between Variables
Up until now, we've focused on individual variables, examining their types, distributions, and methods for visual representation using graphs.
However, it's time to explore how two variables relate to each other. There are three primary types of relationships:
The relationship between two categorical variables.
The relationship between two quantitative variables.
The relationship between a categorical variable and a quantitative variable.
Furthermore, we will delve into the concept of correlation, which not only reveals the direction but also measures the strength of the relationship between two quantitative variables.
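As a rough illustration of the correlation idea described above, the short R sketch below computes Pearson's correlation coefficient for a small made-up dataset. The variable names `hours_studied` and `exam_score` and their values are invented for this example and are not taken from the course materials; `cor()` and `sd()` are base R functions.

```r
# Hypothetical data, invented purely for illustration
hours_studied <- c(1, 2, 3, 4, 5, 6)
exam_score    <- c(52, 55, 61, 64, 70, 73)

# Pearson correlation: the sign gives the direction of the relationship,
# and the magnitude (between 0 and 1) gives its strength
r <- cor(hours_studied, exam_score)

# The same quantity computed by hand from its definition
n <- length(hours_studied)
r_manual <- sum((hours_studied - mean(hours_studied)) *
                (exam_score - mean(exam_score))) /
  ((n - 1) * sd(hours_studied) * sd(exam_score))

direction <- if (r > 0) "positive" else if (r < 0) "negative" else "none"

print(r)
print(direction)
```

A value close to +1 or -1 indicates a strong linear relationship, while a value near 0 indicates a weak one or none.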
Overview
Section 1: Introduction
Lecture 1 Introduction
Lecture 2 What you will learn in this tutorial
Section 2: R Programming
Lecture 3 Why Do We Need R?
Lecture 4 What Is R and RStudio?
Lecture 5 R Installation
Lecture 6 RStudio Installation
Lecture 7 RStudio Interface (Console and Help Tab)
Lecture 8 RStudio Interface (File, Packages, Plot, and Environment Tabs)
Lecture 9 Create a New Project in R
Lecture 10 Download Code and Data Files for This Project
Lecture 11 R Packages
Lecture 12 Install a Package in R
Lecture 13 Load a Package in R
Lecture 14 R Data Sets
Lecture 15 Broom in RStudio
Lecture 16 Get the Feel of the Data
Lecture 17 Get the Feel of the Data: View()
Lecture 18 Get the Feel of the Data: glimpse()
Lecture 19 Types of Variables in R
Lecture 20 Types of Variables in R: Integers
Lecture 21 Types of Variables in R: Numerics
Lecture 22 Types of Variables in R: Factors or Categorical Variables
Lecture 23 Types of Variables in R: Types of Factors or Categorical Variables
Lecture 24 Errors, Warnings, and Messages
Lecture 25 Error, Warning, and Messages: Errors
Lecture 26 Error, Warning, and Messages: Information Messages
Lecture 27 Error, Warning, and Messages: Warning Messages
Lecture 28 Error, Warning, and Messages: A Quick Recap
Section 3: Subjects in the Population
Lecture 29 Subjects in a Population
Section 4: Statistical Questions
Lecture 30 Statistical Questions
Lecture 31 Types of Statistical Questions
Lecture 32 Descriptive Questions
Lecture 33 Comparative Questions
Lecture 34 Relationship Questions
Lecture 35 Causal Questions
Lecture 36 Predictive Questions
Section 5: Types of Data
Lecture 37 Types of Data
Lecture 38 Categorical Data
Lecture 39 Nominal Categorical Data
Lecture 40 Ordinal Categorical Data
Lecture 41 Dichotomous Categorical Data
Lecture 42 Quantitative Data
Lecture 43 Discrete Quantitative Data
Lecture 44 Continuous Quantitative Data
Lecture 45 Why It Is Important to Classify Variables
Section 6: Descriptive and Inferential Statistics
Lecture 46 Descriptive and Inferential Statistics
Lecture 47 Descriptive Statistics
Lecture 48 General Social Survey
Lecture 49 Real World Example of Descriptive Statistics: GSS Survey
Lecture 50 Descriptive Statistics for the Entire Population
Lecture 51 Inferential Statistics
Lecture 52 Sample Statistics and Population Parameters
Section 7: Distribution of a Variable and Frequency Table
Lecture 53 Distribution of a Variable and Frequency Table
Lecture 54 Distribution of a Categorical Variable
Lecture 55 Frequency Table for Categorical Variables
Lecture 56 Understanding Relative Frequency: Proportions and Percentages
Lecture 57 Distribution of Quantitative Variables
Lecture 58 Frequency Table for Discrete Quantitative Variables
Lecture 59 Frequency table for discrete Var: Hours Per Day Watching TV (Limited Outcomes)
Lecture 60 Frequency table for discrete Var: Ideal Number of Kids (Limited Outcomes)
Lecture 61 Frequency table for discrete Var: Math Exam Scores (Wide Range of Outcomes)
Lecture 62 Frequency Table: Continuous Quantitative Variables
Lecture 63 Frequency Table for Continuous Variables: Age in Census
Lecture 64 Calculate Proportion and Percentages in Excel
Lecture 65 Task 1: Frequency Table for Discrete Quantitative Variable in Excel
Lecture 66 Task 2: Frequency Table for Categorical Variable in Excel
Lecture 67 Titanic Dataset
Lecture 68 Loading Titanic Data Set from Excel into R
Lecture 69 Getting to Know the Titanic Dataset: Exploring Variables
Lecture 70 Frequency Table for Categorical Variables in R
Lecture 71 Proportions and Percentages in Frequency Table in R
Lecture 72 Pipe Operator
Lecture 73 Discrete Data
Lecture 74 General Social Survey: Number of Kids
Lecture 75 Recreating GSS Survey Data in Excel: Ideal Number of Kids
Lecture 76 Frequency Table for Discrete Data 1: Loading Number of Kids Data in R
Lecture 77 Frequency Table for Discrete Data 2: Creating a Frequency Table
Lecture 78 Frequency Table for Continuous Data 1: Loading Titanic Data In R
Lecture 79 Frequency Table for Continuous Data 2: Calculating Range
Lecture 80 Frequency Table for Continuous Data 3: Grouping Passengers by Age Group
Lecture 81 Frequency Table for Continuous Data 4: Left-Closed and Right-Open Intervals
Lecture 82 Frequency Table for Continuous Data 5: Creating a Frequency Table
Lecture 83 Frequency Table for Continuous Data 6: Missing Values NAs
Section 8: Visualizing the Distribution of a Variable Using Graphs
Lecture 84 Graphs
Lecture 85 Bar Graph
Lecture 86 Pie Chart
Lecture 87 Pie Chart in R
Lecture 88 Bar Graph or Pie Chart (Article)
Lecture 89 Histogram for a Discrete Variable
Lecture 90 Histogram for a Discrete Variable in R
Lecture 91 Histogram for a Continuous Variable
Lecture 92 Histogram for a Continuous Variable
Lecture 93 Histogram for a Continuous Var: Left-Closed, Right-Open: Adjusting Intervals
Lecture 94 Histogram for a Continuous Var: Dealing with Missing Values
Lecture 95 Histogram for a Continuous Var: Setting Interval Lengths in a Histogram
Section 9: The Shape of the Distribution
Lecture 96 The Shape of a Distribution
Lecture 97 Unimodal Distribution
Lecture 98 Symmetric Distribution
Lecture 99 Symmetric Distribution of Male Height: An Example
Lecture 100 Simulating Hypothetical Normal Distribution in R (Male Heights)
Lecture 101 Symmetric Distribution Histogram in R
Lecture 102 Skewed Distributions
Lecture 103 Left-Skewed Distribution
Lecture 104 Left-Skewed Distribution Histogram in R
Lecture 105 Right-Skewed Distribution
Lecture 106 Right-Skewed Distribution Histogram in R
Lecture 107 Bimodal Distributions
Lecture 108 Histogram for Bimodal Distribution
Lecture 109 Another Example of Bimodal Distribution
Lecture 110 Histogram for Bimodal Distribution 2
Lecture 111 Uniform Distribution
Lecture 112 Simulating Hypothetical Uniform Distribution in R (Dice Output)
Lecture 113 Histogram for Uniform Distribution
Section 10: Center of Quantitative Data
Lecture 114 Center of Quantitative Data
Lecture 115 Mode
Lecture 116 Mode in Symmetric Discrete Distribution
Lecture 117 Mode in Left Skewed Discrete Distribution
Lecture 118 Mode in Right Skewed Discrete Distribution
Lecture 119 Mode in Uniform Discrete Distribution
Lecture 120 Mode in Categorical Variables
Lecture 121 Mean
Lecture 122 Calculating Mean in Excel
Lecture 123 Impact of Outliers on the Mean
Lecture 124 Formula for the Mean (Article)
Lecture 125 Compute Population Mean in R
Lecture 126 Compute Sample Mean in R (Part 1)
Lecture 127 Compute Sample Mean in R (Part 2)
Lecture 128 Median
Lecture 129 Impact of Outliers on the Median
Lecture 130 Formula for Median (Article)
Lecture 131 Compute Population Median in R
Lecture 132 Compute Sample Median in R
Lecture 133 Outliers and Skewed Distributions (Article)
Lecture 134 Mean and Median in Symmetric Distribution
Lecture 135 Mean and Median in Symmetric Distribution In R
Lecture 136 Mean, Median in Skewed Distribution (Article)
Lecture 137 Mean, Median in Right Skewed Distribution In R
Lecture 138 Mean, Median in Left Skewed Distribution In R
Lecture 139 The Mean or Median? (Article)
Section 11: Measuring the Variability of Quantitative Data
Lecture 140 What is the Variability
Lecture 141 Range
Lecture 142 Standard Deviation
Lecture 143 Compute Standard Deviation Manually in Excel (Part 1)
Lecture 144 Compute Standard Deviation Manually in Excel (Part 2)
Lecture 145 Formula for Population Standard Deviation (Article)
Lecture 146 Formula for the Sample Standard Deviation (Article)
Lecture 147 Population Standard Deviation In R
Lecture 148 Sample Standard Deviation In R
Lecture 149 Sample Standard Deviation Formula Recap
Lecture 150 Calculate Standard Deviation Manually in R Using n-1 (Part 1)
Lecture 151 Calculate Standard Deviation Manually in R Using n-1 (Part 2)
Lecture 152 Sample Standard Deviation vs. Population Standard Deviation (Article)
Lecture 153 "n" versus "n-1" Mathematically (Article)
Lecture 154 Hypothetical Normal Distribution
Lecture 155 Hypothetical Normal Distribution In R
Lecture 156 Z-Score (Article)
Section 12: Empirical Rule
Lecture 157 Empirical Rule
Lecture 158 Empirical Rule in R (Part 1)
Lecture 159 Empirical Rule in R (Part 2)
Lecture 160 Understanding Data Distribution with the Empirical Rule
Lecture 161 Outliers in Normal Distributions (Article)
Lecture 162 Plausible Value of Standard Deviation (Article)
Section 13: Percentiles and Quartiles
Lecture 163 Percentiles and Quartiles
Lecture 164 Percentiles and Quartiles in R
Lecture 165 Real-Life Usage of Percentiles: Growth Charts
Lecture 166 Real-Life Usage of Percentiles: SAT Exams
Lecture 167 Making Sense of SAT Scores: The Scaling Process Simplified (Article)
Lecture 168 5-Number Summary Using Quartiles (Article)
Lecture 169 Measuring Variability Using Interquartile Range (IQR)
Lecture 170 Identifying Outliers Using Interquartile Range (IQR)
Lecture 171 Boxplot in R
Lecture 172 Creating Side-by-Side Box Plots
Lecture 173 Comparing Two Distributions Using Box Plots
Section 14: Relationship Between Variables
Lecture 174 Relationship Between Variables
Lecture 175 Response and Explanatory Variables (Article)
Lecture 176 Types of Relationship
Lecture 177 Relationship Between Two Categorical Variables
Lecture 178 Contingency Table
Lecture 179 Contingency Table In R
Lecture 180 Stacked Bar Plot (Article)
Lecture 181 Stacked Bar Plot In R
Lecture 182 Relationship between Categorical and Quantitative Variables
Lecture 183 Relationship between Two Quantitative Variables
Lecture 184 Positive Relationship (Article)
Lecture 185 Negative Relationship (Article)
Lecture 186 No Relationship (Article)
Lecture 187 Scatterplot in R
Lecture 188 Correlation (Article)
Lecture 189 Correlation In R
Section 15: Summary
Lecture 190 Summary
This course is suitable for anyone interested in learning statistical analysis techniques using R programming. Whether you're a beginner looking to acquire new skills or someone already familiar with statistical concepts seeking to deepen your knowledge, this course provides valuable insights and practical guidance.