Bioinformatics: RNA-Seq/Differential Expression in Bash & R!
Published 11/2022
MP4 | Video: h264, 1280x720 | Audio: AAC, 44.1 KHz
Language: English | Size: 3.05 GB | Duration: 4h 31m
Carry out an RNA-seq pipeline on the command line in Colab, and use R to carry out differential expression & GO analysis!
What you'll learn
The basics of Next Generation Sequencing and how it can be used for differential gene expression analysis via RNA sequencing.
How to use RNA-seq tools from the command line. Examples done in Google Colab so everyone can follow.
Preprocessing RNA sequencing data.
Mapping the reads to a transcriptome using Salmon.
Transcript quantification.
Differential Expression in R.
Gene ontology and Pathway analysis in R.
Ultimately understand how technologies like RNA sequencing could be used to identify specific genes that can cause certain conditions.
An explanation of how you would upload your data to a server to run a big job.
Requirements
Background knowledge of biology and genetics.
Knowledge of the composition and function of DNA & RNA.
An interest in the bioinformatics side of genetics and how it applies to understanding gene expression in different conditions.
No coding background is needed, but an interest in coding is required.
Description
Ever wondered which technologies allow researchers to discover new markers of cancer or to gain a greater understanding of genetic diseases? Or even just which genes are important for cellular growth? This is usually carried out using an application of Next Generation Sequencing technology called RNA sequencing. RNA sequencing allows you to interpret the gene expression pattern of cells. Throughout this course, you will be equipped with the tools and knowledge to not only understand but also perform RNA-seq analysis using bash scripting and R. Discover how the transcriptome of a cell changes throughout its growth cycle.
To ensure that you have a full understanding of how to perform RNA-seq analysis yourself, every step of the process will be explained! You will first learn how to use bash scripting in Google Colab to run the important RNA-seq tools. I will then explain what a bash script that you might upload to an HPC server would look like. We will then take the output of the pipeline and move into RStudio, where you will learn the basics of coding in R! Here you will also learn how to quality control your counts, perform differential expression analysis and perform gene ontology analysis. As an added bonus, I will also show you how to map differential expression results onto KEGG pathways!
Once you've completed this course you will know how to:
Download publicly available data from an FTP site directly to an HPC cluster.
Obtain the raw files needed for read mapping.
Map the reads to the transcriptome using a tool called Salmon.
Analyse the quality of your RNA-seq data using FastQC and MultiQC, while also doing a custom analysis in R.
Carry out a differential expression analysis using DESeq2 to find out what changes between a cell on day 4 vs day 7 of growth.
Carry out gene ontology analysis to understand which pathways are up- and down-regulated using fgsea and clusterProfiler.
Use Pathview to create annotated KEGG maps that can be used to look at specific pathways in more detail.
Practical Based
The course has one initial lecture explaining some of the basics of sequencing and what RNA sequencing can be used for. Then it's straight into the practical! Throughout the 19 lectures, you are guided step by step through the process, from downloading the data to how you could potentially interpret the data at the final stages. Unlike most courses, the process is not simplistic. The project has real-world issues, such as dealing with code errors and using a non-model organism, and shows how you can get around them with some initiative!
This course is made for anyone who has an interest in Next Generation Sequencing and the technologies currently being used to make breakthroughs in genetic and medical research! The course is also meant for beginners in RNA-seq who want to learn the general process and complete a full walkthrough that is applicable to their own data!
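To give a rough feel for the command-line half of the workflow described above, here is a minimal sketch of how the Salmon, FastQC and MultiQC steps might be chained in a bash script. It is not the course's actual script: the file names, directory names, sample names and thread count are all hypothetical placeholders, and paired-end reads are assumed. By default it only prints the commands (a dry run), so it can be read and tried without the tools installed.

```shell
#!/usr/bin/env bash
# Sketch of the pipeline order: index -> quant -> FastQC -> MultiQC.
# Every path and sample name below is a hypothetical placeholder.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print each command; 0 = also execute it

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

TRANSCRIPTS="transcripts.fa.gz"   # reference transcript sequences (assumed name)
INDEX_DIR="salmon_index"
SAMPLES="day4_rep1 day7_rep1"     # assumed sample names
THREADS=2

pipeline() {
  # 1. Build the Salmon index from the reference transcripts.
  run salmon index -t "$TRANSCRIPTS" -i "$INDEX_DIR"

  # 2. Quantify each sample against the index (-l A auto-detects library type).
  for s in $SAMPLES; do
    run salmon quant -i "$INDEX_DIR" -l A \
      -1 "reads/${s}_1.fastq.gz" -2 "reads/${s}_2.fastq.gz" \
      -p "$THREADS" --validateMappings -o "quants/${s}"
  done

  # 3. Per-file read quality reports, then one collated MultiQC report.
  run mkdir -p qc
  run fastqc -o qc reads/*.fastq.gz
  run multiqc qc -o multiqc_report
}

pipeline
```

Running the script as-is lists the commands in order; setting DRY_RUN=0 would execute them, assuming the tools are installed and the placeholder files exist. The quants/ folders produced by Salmon are what would later be imported into R.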
Overview
Section 1: Introduction
Lecture 1 Introduction
Section 2: Using Google Colab's bash shell function to write an RNA-Seq pipeline.
Lecture 2 Getting Started with Google Colab
Lecture 3 Downloading the raw sequence reads
Lecture 4 Installing the tools we need for RNA-seq in Google Colab
Lecture 5 Creating a transcriptome index using Salmon
Lecture 6 Aligning the reads with Salmon quant.
Lecture 7 Running FastQC and collating reports using MultiQC
Lecture 8 Talking through the MultiQC Report
Lecture 9 An example of a Bash script and workflow managers.
Section 3: Differential Expression and Gene Ontology in R
Lecture 10 Installing R and RStudio
Lecture 11 Installing the required packages for the exercise
Lecture 12 Importing the Salmon abundance estimations into R using tximport
Lecture 13 Creating an annotated dataset using BiomartR
Lecture 14 Counts and Quality Control (Part 1)
Lecture 15 Counts and Quality Control (Part 2)
Lecture 16 Starting Differential Expression using DESeq2
Lecture 17 Visualising the DESeq2 results
Lecture 18 Fast Gene Set Enrichment Analysis (fgsea)
Lecture 19 KEGG Pathviews and clusterProfiler overrepresentation analysis
Who this course is for
People who want to perform RNA-seq analysis themselves using tools such as R and the command line.
People generally interested in new research methodologies who would like to try them themselves!
People looking to learn about RNA-seq and differential gene expression.
People who want to become independent of bioinformatics tools such as Galaxy.