How To Benchmark Machine Learning Models
Published 12/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.18 GB | Duration: 5h 3m
Master the art of benchmarking machine learning models for any use case, from generative AI to narrow AI such as computer vision
What you'll learn
What is Machine Learning benchmarking and how does it work
Standard metrics used in AI (Reliability, F1 Score, Recall)
Run a test through an API
How to run a benchmark against GLUE Metric
How to run a benchmark against the BLEU Metric
MMLU (Massive Multitask Language Understanding) Benchmarking
TruthfulQA - Evaluation of Truthfulness in Language Models
Run Benchmark against SQuAD (Stanford Question Answering Dataset)
Understand the AI Model Lifecycle
Perplexity and Bias Benchmarking
Benchmark against AI Fairness - Bias in Bios
Usage of HuggingFace models for benchmark and training
Computer Vision benchmark with CIFAR 10 dataset
Requirements
Some Python programming experience is helpful, but you can follow along without it
basic understanding of AI Principles
Desire to learn the hottest skill on the market
$5 in OpenAI API credits (optional; you can use free models)
VS Code, Postman, Python, Node
Description
This comprehensive course delves into the essential practices, tools, and datasets for AI model benchmarking. Designed for AI practitioners, researchers, and developers, this course provides hands-on experience and practical insights into evaluating and comparing model performance across tasks like Natural Language Processing (NLP) and Computer Vision.
What You’ll Learn:
Fundamentals of Benchmarking:
Understanding AI benchmarking and its significance.
Differences between NLP and CV benchmarks.
Key metrics for effective evaluation.
Setting Up Your Environment:
Installing tools and frameworks like Hugging Face, Python, and CIFAR-10 datasets.
Building reusable benchmarking pipelines.
Working with Datasets:
Utilizing popular datasets like CIFAR-10 for Computer Vision.
Preprocessing and preparing data for NLP tasks.
Model Performance Evaluation:
Comparing performance of various AI models.
Fine-tuning and evaluating results across benchmarks.
Interpreting scores for actionable insights.
Tooling for Benchmarking:
Leveraging Hugging Face and OpenAI GPT tools.
Python-based approaches to automate benchmarking tasks.
Utilizing real-world platforms to track performance.
Advanced Benchmarking Techniques:
Multi-modal benchmarks for NLP and CV tasks.
Hands-on tutorials for improving model generalization and accuracy.
Optimization and Deployment:
Translating benchmarking results into practical AI solutions.
Ensuring robustness, scalability, and fairness in AI models.
Hands-On Modules:
Implementing end-to-end benchmarking pipelines.
Exploring CIFAR-10 for image recognition tasks.
Comparing supervised, unsupervised, and fine-tuned model performance.
Leveraging industry tools for state-of-the-art benchmarking.
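To give a feel for the "key metrics" mentioned above, here is a minimal sketch of how accuracy, precision, recall, and F1 can be computed from a ground-truth table. It is not taken from the course materials: the function name `classification_metrics`, the binary-label setup, and the sample labels are all assumptions made for illustration.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels.

    Hypothetical helper for illustration; not part of the course repository.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives
    tn = sum(1 for t, p in pairs if t != positive and p != positive)  # true negatives

    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Made-up ground truth and model predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

In practice a library such as Hugging Face Evaluate or scikit-learn would usually be used for these metrics; the hand-rolled version only makes the definitions explicit.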
Overview
Section 1: Introduction
Lecture 1 Introduction
Lecture 2 About your Instructor
Lecture 3 5 minute AI Benchmark Challenge
Section 2: What is benchmarking and how does it work
Lecture 4 Additional Testing required for AI
Lecture 5 How basic Model Tuning works
Lecture 6 How Benchmarking Works
Lecture 7 The Myth of the all Powerful AI Model
Section 3: Introduction to AI - Optional if you know the basics of AI
Lecture 8 What makes up AI
Lecture 9 Natural Language Processing - NLP
Lecture 10 Types of Machine Learning
Lecture 11 Machine Learning - Supervised ML
Lecture 12 Machine Learning - Unsupervised ML
Lecture 13 Machine Learning - Reinforced ML
Lecture 14 Importance of Training Data
Lecture 15 What is a token in LLMs
Lecture 16 Weak AI vs Gen AI vs AGI - Know the difference
Section 4: Setting up the Environment
Lecture 17 Install VS Code
Lecture 18 Installing Python
Lecture 19 Install Python Dependencies - PIP
Lecture 20 Installing Conda - Environment Isolator tool
Lecture 21 Install NodeJS and NPM
Lecture 22 Clone the Repository on your machine
Lecture 23 Create ChatGPT Subscription
Lecture 24 Get OpenAI API Key
Section 5: Hugging Face Platform - AI Engineer repo
Lecture 25 Introduction to Hugging Face Community Page
Lecture 26 Hugging Face Transformers Python Package
Lecture 27 How to load and use any model from Huggingface
Lecture 28 Hugging Face Evaluate Python Package
Section 6: Common Traditional Metrics for LLMs ML Model and how to calculate them
Lecture 29 Ground Truth Table - source of Truth | Test Oracle
Lecture 30 Machine Learning Metrics - Accuracy for LLMs
Lecture 31 Machine Learning Metrics - Precision of LLMs
Lecture 32 Machine Learning Metrics - Recall in LLMs
Lecture 33 Machine Learning Metrics - F1 Score for LLMs
Lecture 34 Machine Learning Metrics - Perplexity for LLMs
Lecture 35 Demo - PyTorch - Calculate Perplexity for a Model
Section 7: GLUE - Benchmark against NLP
Lecture 36 What is GLUE NLP Benchmark
Lecture 37 What are the 11 benchmark Tasks of GLUE
Lecture 38 How to run a GLUE benchmark test
Lecture 39 GLUE benchmarking on BERT Hugging Face Model
Lecture 40 Demo - Python GLUE benchmark against ChatGPT
Section 8: AI Fairness - Bias in Bio benchmarking
Lecture 41 About Bias in Bios evaluation benchmark
Lecture 42 Model Selection and Download
Lecture 43 Data Download and Tokenization
Lecture 44 Run the benchmark and get results
Section 9: Evaluation Metric for Machine Translation
Lecture 45 What is BLEU
Lecture 46 Demo - Benchmark ChatGPT for BLEU Score
Lecture 47 What is TER
Lecture 48 Demo - TER Metric Calculation on ChatGPT
Section 10: TruthfulQA - Evaluation of Truthfulness in Language Models
Lecture 49 TruthfulQA: Measuring How Models Mimic Human Falsehoods
Lecture 50 Demo TruthfulQA - Model Benchmarking
Lecture 51 What is Hellaswag NLU Benchmark
Lecture 52 Demo - Hellaswag Evaluation - Python - Tensorflow
Section 11: Evaluate the toxicity of a model
Lecture 53 What is PerspectiveAPI
Lecture 54 Get a Perspective API Key
Lecture 55 Installing Postman and first API Test
Lecture 56 Demo - VS Code - Call Perspective API
Lecture 57 Demo - Python - Test ChatGPT Response against Perspective APIs
Lecture 58 Test the Toxicity of a Hugging Face model - GPT2
Section 12: CIFAR10 - Image Classification
Lecture 59 Intro to CIFAR-10 Computer Vision benchmarking
Lecture 60 Download Dataset and Model for Test
Lecture 61 Run the Benchmark - model not fine tuned
Lecture 62 Understand the result - Read the confusion Matrix
Lecture 63 Train the model and redo benchmark
Section 13: Conclusions
Lecture 64 Conclusion
AI Engineers, AI Project Managers, ML Testers, AI Testers, Production Owners that work with AI