Concept & Coding of LLMs: Transformer, Attention, DeepSeek in PyTorch
Published 3/2025
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 1.28 GB | Duration: 3h 37m
How do LLMs work? Understand the concepts and coding of Transformers, Attention, and DeepSeek using PyTorch
What you'll learn
Learn how attention helps models focus on important text parts.
Understand transformers, self-attention, and multi-head attention mechanisms.
Explore how LLMs process, tokenize, and generate human-like text.
Study DeepSeek’s architecture and its optimizations for efficiency.
Explore the transformer architecture and implement its components in PyTorch.
Requirements
Python
Description
Welcome to this comprehensive course on how Large Language Models (LLMs) work! In recent years, LLMs have revolutionized the field of artificial intelligence, powering applications like ChatGPT, DeepSeek, and other advanced AI assistants. But how do these models understand and generate human-like text? In this course, we will break down the fundamental concepts behind LLMs, including attention mechanisms, transformers, and modern architectures like DeepSeek.

We will start by exploring the core idea of attention mechanisms, which allow models to focus on the most relevant parts of the input text, improving contextual understanding. Then, we will dive into transformers, the backbone of LLMs, and analyze how they enable efficient parallel processing of text, leading to state-of-the-art performance in natural language processing (NLP). You will also learn about self-attention, positional encodings, and multi-head attention, key components that help models capture long-range dependencies in text.

Beyond the basics, we will examine DeepSeek, a cutting-edge open-weight model designed to push the boundaries of AI efficiency and performance. You'll gain insights into how DeepSeek optimizes attention mechanisms and what makes it a strong competitor to other LLMs.

By the end of this course, you will have a solid understanding of how LLMs work, how they are trained, and how they can be fine-tuned for specific tasks. Whether you're an AI enthusiast, a developer, or a researcher, this course will equip you with the knowledge to work with and build upon the latest advancements in deep learning and NLP. Let's get started!
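To make the self-attention idea described above concrete, here is a minimal, illustrative PyTorch sketch of scaled dot-product self-attention. It is not the course's actual code; the class name `SelfAttention` and the sizes used are assumptions for illustration only.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfAttention(nn.Module):
    """Single-head scaled dot-product self-attention (illustrative sketch)."""
    def __init__(self, d_model: int):
        super().__init__()
        # Learned projections that map each token embedding to query, key, value
        self.q_proj = nn.Linear(d_model, d_model)
        self.k_proj = nn.Linear(d_model, d_model)
        self.v_proj = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        q, k, v = self.q_proj(x), self.k_proj(x), self.v_proj(x)
        # Similarity of every token with every other token, scaled by sqrt(d_k)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
        weights = F.softmax(scores, dim=-1)   # attention weights sum to 1 per token
        return weights @ v                    # weighted mix of value vectors

x = torch.randn(1, 5, 16)        # a batch with 5 tokens, embedding size 16
out = SelfAttention(16)(x)
print(out.shape)                 # torch.Size([1, 5, 16])
```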
Overview
Section 1: Introduction
Lecture 1 Introduction to Course
Section 2: Introduction to Transformer
Lecture 2 AI History
Lecture 3 Language as a Bag of Words
Section 3: Transformer Embedding
Lecture 4 Word embedding
Lecture 5 Vector Embedding
Lecture 6 Types of Embedding
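To make the word/vector embedding lectures in Section 3 concrete, here is a minimal illustrative sketch using PyTorch's `nn.Embedding`; the vocabulary size and embedding dimension are arbitrary assumptions, not values from the course.

```python
import torch
import torch.nn as nn

vocab_size, d_model = 1000, 16          # arbitrary illustrative sizes
embedding = nn.Embedding(vocab_size, d_model)

token_ids = torch.tensor([[5, 42, 7]])  # a "sentence" of three token ids
vectors = embedding(token_ids)          # each id is looked up as a dense vector

print(vectors.shape)                    # torch.Size([1, 3, 16])
# Once trained, tokens used in similar contexts end up with similar vectors
```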
Section 4: Transformer - Encoder Decoder context
Lecture 7 Encoding Decoding context
Lecture 8 Attention Encoder Decoder context
Section 5: Transformer Architecture
Lecture 9 Transformer Architecture with Attention
Lecture 10 GPT vs BERT Model
Lecture 11 Context length and number of parameters
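As a companion to the context-length and parameter-count lecture above, here is a short illustrative sketch, assuming the Hugging Face transformers library and the public "gpt2" checkpoint (the course may use a different model).

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Illustrative example with the public GPT-2 checkpoint (an assumption, not course code)
config = AutoConfig.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

print("context length:", config.n_positions)                      # 1024 for GPT-2
print("parameters:", sum(p.numel() for p in model.parameters()))  # roughly 124M for GPT-2
```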
Section 6: Transformer - Tokenization code
Lecture 12 Tokenization
Lecture 13 Code Tokenization
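An illustrative tokenization example for Section 6, assuming the Hugging Face transformers library and the GPT-2 tokenizer; the course may code tokenization differently or use another tokenizer.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

text = "Attention is all you need"
ids = tokenizer.encode(text)                   # text -> list of integer token ids
tokens = tokenizer.convert_ids_to_tokens(ids)  # ids -> subword strings ('Ġ' marks a leading space)

print(tokens)                 # the subword pieces the model actually sees
print(ids)                    # the integer ids fed to the model
print(tokenizer.decode(ids))  # round-trip back to the original text
```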
Section 7: Transformer model and block
Lecture 14 Transformer architecture
Lecture 15 Transformer block
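To illustrate what a transformer block (Section 7) contains, here is a minimal pre-norm, decoder-style block sketch in PyTorch. The layer sizes, names, and the use of `nn.MultiheadAttention` are assumptions for illustration, not the course's exact implementation.

```python
import torch
import torch.nn as nn

class TransformerBlock(nn.Module):
    """Illustrative pre-norm transformer block: attention + feed-forward, each with a residual."""
    def __init__(self, d_model: int = 64, n_heads: int = 4):
        super().__init__()
        self.norm1 = nn.LayerNorm(d_model)
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm2 = nn.LayerNorm(d_model)
        self.ff = nn.Sequential(                    # position-wise feed-forward network
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.norm1(x)
        attn_out, _ = self.attn(h, h, h)            # self-attention (query = key = value)
        x = x + attn_out                            # residual connection around attention
        x = x + self.ff(self.norm2(x))              # residual connection around the MLP
        return x

x = torch.randn(2, 10, 64)                          # (batch, seq_len, d_model)
print(TransformerBlock()(x).shape)                  # torch.Size([2, 10, 64])
```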
Section 8: Transformer coding
Lecture 16 Decoder Transformer setup and code
Lecture 17 Transformer model download
Lecture 18 Transformer model code architecture
Lecture 19 Transformer model summary
Lecture 20 Transformer code generate token
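As an illustrative companion to the model download, summary, and token-generation lectures in Section 8, here is a short sketch assuming the Hugging Face transformers library and the public "gpt2" checkpoint; the course's own setup and model may differ.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Download (or load from cache) a small public causal LM -- an illustrative choice
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

print(model)  # prints the module tree: embeddings, transformer blocks, LM head

inputs = tokenizer("Large language models are", return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=20, do_sample=False)  # greedy decoding
print(tokenizer.decode(output_ids[0]))
```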
Section 9: Attention-Intro
Lecture 21 Transformer attention
Lecture 22 Word embedding
Lecture 23 Positional encoding
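For the positional-encoding lecture above, here is an illustrative sketch of the sinusoidal positional encoding from the original Transformer paper, one common choice; the course may present positional encoding differently.

```python
import math
import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(...)."""
    position = torch.arange(seq_len).unsqueeze(1)                       # (seq_len, 1)
    div_term = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)   # even dimensions use sine
    pe[:, 1::2] = torch.cos(position * div_term)   # odd dimensions use cosine
    return pe

pe = sinusoidal_positional_encoding(seq_len=50, d_model=16)
embeddings = torch.randn(50, 16)        # token embeddings for a 50-token sequence
x = embeddings + pe                     # position information is simply added
print(x.shape)                          # torch.Size([50, 16])
```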
Section 10: Attention-Maths
Lecture 24 Attention Math Intro
Lecture 25 Attention Query, Key, Value example
Lecture 26 Attention Q, K, V in transformer
Lecture 27 Encoded value
Lecture 28 Attention formulae
Lecture 29 Calculate Q,K transpose
Lecture 30 Attention softmax
Lecture 31 Why multiply by V in attention
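To tie the Section 10 lectures together (Q times K-transpose, scaling, softmax, and why we multiply by V), here is an illustrative step-by-step numeric sketch with tiny hand-made tensors; the numbers are arbitrary, not taken from the course.

```python
import math
import torch
import torch.nn.functional as F

# Three tokens, each with a 2-dimensional query/key/value (arbitrary illustrative numbers)
Q = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
K = torch.tensor([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
V = torch.tensor([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])

scores = Q @ K.T                         # step 1: Q times K-transpose -> (3, 3) similarity matrix
scaled = scores / math.sqrt(K.size(-1))  # step 2: divide by sqrt(d_k) to keep values well-scaled
weights = F.softmax(scaled, dim=-1)      # step 3: softmax turns each row into probabilities
output = weights @ V                     # step 4: multiply by V -> each token becomes a weighted
                                         #         mix of the value vectors it attends to
print(weights)   # each row sums to 1
print(output)    # contextualized representations, shape (3, 2)
```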
Section 11: Attention-code
Lecture 32 Attention code overview
Lecture 33 Attention code
Lecture 34 Attention code Part2
Section 12: Masked Self-Attention
Lecture 35 Masked self-attention
Section 13: Masked Self-Attention code
Lecture 36 Masked Self-Attention code overview
Lecture 37 Masked Self-Attention code
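An illustrative sketch of the causal (masked) self-attention idea from Sections 12 and 13: future positions are masked out before the softmax so each token can only attend to itself and earlier tokens. The names and sizes here are assumptions for illustration.

```python
import math
import torch
import torch.nn.functional as F

seq_len, d_k = 4, 8
q = torch.randn(seq_len, d_k)
k = torch.randn(seq_len, d_k)
v = torch.randn(seq_len, d_k)

scores = q @ k.T / math.sqrt(d_k)

# Causal mask: lower-triangular matrix; False entries mark "future" positions
causal_mask = torch.tril(torch.ones(seq_len, seq_len, dtype=torch.bool))
scores = scores.masked_fill(~causal_mask, float("-inf"))  # -inf -> weight 0 after softmax

weights = F.softmax(scores, dim=-1)
print(weights)          # upper triangle is exactly 0: no attention to future tokens
output = weights @ v
```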
Section 14: Multimodal Attention
Lecture 38 Encoder decoder transformer
Lecture 39 Types of Transformer
Lecture 40 Multimodal attention
Section 15: Multi-Head Attention
Lecture 41 Multi-Head Attention
Lecture 42 Multi-Head Attention Code Part1
Section 16: Multi-Head Attention code
Lecture 43 Multi-head attention code overview
Lecture 44 Multi-head attention with encoder-decoder attention code
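To illustrate multi-head attention and the encoder-decoder (cross) attention named in Sections 15 and 16, here is a short sketch using PyTorch's built-in `nn.MultiheadAttention`; the course likely builds the heads by hand, so treat this as an assumption-laden shortcut rather than the course's code.

```python
import torch
import torch.nn as nn

d_model, n_heads = 64, 4
mha = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

# Multi-head SELF-attention: query, key, and value all come from the same sequence
x = torch.randn(2, 10, d_model)               # (batch, seq_len, d_model)
self_out, self_weights = mha(x, x, x)
print(self_out.shape)                         # torch.Size([2, 10, 64])

# Encoder-decoder (cross) attention: queries from the decoder, keys/values from the encoder
decoder_states = torch.randn(2, 7, d_model)   # 7 target-side tokens
encoder_states = torch.randn(2, 10, d_model)  # 10 source-side tokens
cross_out, cross_weights = mha(decoder_states, encoder_states, encoder_states)
print(cross_out.shape)                        # torch.Size([2, 7, 64])
```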
Section 17: DeepSeek R1 and R1-Zero
Lecture 45 DeepSeek R1 training
Lecture 46 DeepSeek R1-Zero
Lecture 47 DeepSeek R1 Architecture
Lecture 48 DeepSeek R1 Paper
Section 18: DeepSeek R1 Paper
Lecture 49 DeepSeek R1 paper Intro
Lecture 50 DeepSeek R1 Paper Aha moments
Lecture 51 DeepSeek R1 Paper Aha moments Part 2
Section 19: Bonus lecture
Lecture 52 DeepSeek R1 summary
Who this course is for:
Generative AI enthusiasts, developers, and researchers who want to understand how LLMs work