How To Benchmark Machine Learning Models

Posted By: ELK1nG
Published 12/2024
MP4 | Video: h264, 1920x1080 | Audio: AAC, 44.1 KHz
Language: English | Size: 2.18 GB | Duration: 5h 3m

Master the art of benchmarking machine learning models for any use case, from generative AI to narrow AI such as computer vision

What you'll learn

What is Machine Learning benchmarking and how does it work

Standard metrics used in AI (Reliability, F1 Score, Recall)

Run a test through an API

How to run a model against the GLUE benchmark

How to score a model with the BLEU metric

MMLU (Massive Multitask Language Understanding) Benchmarking

TruthfulQA - Evaluation of Truthfulness in Language Models

Run Benchmark against SQuAD (Stanford Question Answering Dataset)

Understand the AI Model Lifecycle

Perplexity and Bias Benchmarking

Benchmark against AI fairness - Bias in Bios

Usage of HuggingFace models for benchmark and training

Computer Vision benchmark with CIFAR 10 dataset

Requirements

Some Python programming experience is helpful, but you can follow along without it

Basic understanding of AI principles

Desire to learn the hottest skill on the market

$5 of OpenAI API credits (optional; you can use free models instead)

VS Code, Postman, Python, Node

Description

This comprehensive course delves into the essential practices, tools, and datasets for AI model benchmarking. Designed for AI practitioners, researchers, and developers, this course provides hands-on experience and practical insights into evaluating and comparing model performance across tasks like Natural Language Processing (NLP) and Computer Vision.

What You’ll Learn:

Fundamentals of Benchmarking:
Understanding AI benchmarking and its significance.
Differences between NLP and CV benchmarks.
Key metrics for effective evaluation.

Setting Up Your Environment:
Installing tools and frameworks like Hugging Face, Python, and CIFAR-10 datasets.
Building reusable benchmarking pipelines.

Working with Datasets:
Utilizing popular datasets like CIFAR-10 for Computer Vision.
Preprocessing and preparing data for NLP tasks.

Model Performance Evaluation:
Comparing performance of various AI models.
Fine-tuning and evaluating results across benchmarks.
Interpreting scores for actionable insights.

Tooling for Benchmarking:
Leveraging Hugging Face and OpenAI GPT tools.
Python-based approaches to automate benchmarking tasks.
Utilizing real-world platforms to track performance.

Advanced Benchmarking Techniques:
Multi-modal benchmarks for NLP and CV tasks.
Hands-on tutorials for improving model generalization and accuracy.

Optimization and Deployment:
Translating benchmarking results into practical AI solutions.
Ensuring robustness, scalability, and fairness in AI models.

Hands-On Modules:
Implementing end-to-end benchmarking pipelines.
Exploring CIFAR-10 for image recognition tasks.
Comparing supervised, unsupervised, and fine-tuned model performance.
Leveraging industry tools for state-of-the-art benchmarking.
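
As a rough illustration of the "key metrics" the description refers to, the sketch below computes accuracy, precision, recall, and F1 for a binary classifier by comparing predictions against ground-truth labels. It is not taken from the course material: the function name `classification_metrics` and the sample labels are hypothetical, and only the Python standard library is assumed.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute basic binary-classification metrics from two label lists.

    Hypothetical helper for illustration only; not part of the course code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0

    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up ground truth and model predictions, purely for demonstration.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

With these sample labels there are three true positives, one false positive, one false negative, and three true negatives, so all four metrics come out to 0.75. In practice a library such as Hugging Face Evaluate or scikit-learn would usually be used instead of a hand-written helper.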

Overview

Section 1: Introduction

Lecture 1 Introduction

Lecture 2 About your Instructor

Lecture 3 5 minute AI Benchmark Challenge

Section 2: What is benchmarking and how does it work

Lecture 4 Additional Testing required for AI

Lecture 5 How basic Model Tuning works

Lecture 6 How Benchmarking Works

Lecture 7 The Myth of the all Powerful AI Model

Section 3: Introduction to AI - Optional if you know the basics of AI

Lecture 8 What makes up AI

Lecture 9 Natural Language Processing - NLP

Lecture 10 Types of Machine Learning

Lecture 11 Machine Learning - Supervised ML

Lecture 12 Machine Learning - Unsupervised ML

Lecture 13 Machine Learning - Reinforcement Learning

Lecture 14 Importance of Training Data

Lecture 15 What is a token in LLMs

Lecture 16 Weak AI vs Gen AI vs AGI - Know the difference

Section 4: Setting up the Environment

Lecture 17 Install VS Code

Lecture 18 Installing Python

Lecture 19 Install Python Dependencies - PIP

Lecture 20 Installing Conda - Environment Isolator tool

Lecture 21 Install NodeJS and NPM

Lecture 22 Clone the Repository on your machine

Lecture 23 Create ChatGPT Subscription

Lecture 24 Get OpenAI API Key

Section 5: Hugging Face Platform - AI Engineer repo

Lecture 25 Introduction to Hugging Face Community Page

Lecture 26 Hugging Face Transformers Python Package

Lecture 27 How to load and use any model from Hugging Face

Lecture 28 Hugging Face Evaluate Python Package

Section 6: Common Traditional Metrics for LLMs and ML Models and How to Calculate Them

Lecture 29 Ground Truth Table - source of Truth | Test Oracle

Lecture 30 Machine Learning Metrics - Accuracy for LLMs

Lecture 31 Machine Learning Metrics - Precision of LLMs

Lecture 32 Machine Learning Metrics - Recall in LLMs

Lecture 33 Machine Learning Metrics - F1 Score for LLMs

Lecture 34 Machine Learning Metrics - Perplexity for LLMs

Lecture 35 Demo - PyTorch - Calculate Perplexity for a Model
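
For orientation, perplexity is commonly defined as the exponential of the average negative log-likelihood a model assigns to the tokens of a text. The sketch below shows only that arithmetic in plain Python; it is not the course's PyTorch demo, and the function name `perplexity` and the sample probabilities are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity from per-token natural-log probabilities.

    Hypothetical helper for illustration; a real benchmark would obtain
    these log-probabilities from a language model's output.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(avg_nll)


# Made-up example: the model gave each of four tokens probability 0.25,
# so the perplexity is close to 4 (as uncertain as a uniform 4-way choice).
log_probs = [math.log(0.25)] * 4
print(perplexity(log_probs))
```

Lower perplexity means the model found the text less surprising; a model that assigned probability 1.0 to every token would reach the minimum value of 1.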

Section 7: GLUE - Benchmark against NLP

Lecture 36 What is GLUE NLP Benchmark

Lecture 37 What are the 11 benchmark Tasks of GLUE

Lecture 38 How to run a GLUE benchmark test

Lecture 39 GLUE benchmarking on BERT Hugging Face Model

Lecture 40 Demo - Python GLUE benchmark against ChatGPT

Section 8: AI Fairness - Bias in Bios benchmarking

Lecture 41 About Bias in Bios evaluation benchmark

Lecture 42 Model Selection and Download

Lecture 43 Data Download and Tokenization

Lecture 44 Run the benchmark and get results

Section 9: Evaluation Metric for Machine Translation

Lecture 45 What is BLEU

Lecture 46 Demo - Benchmark ChatGPT for BLEU Score

Lecture 47 What is TER

Lecture 48 Demo - TER Metric Calculation on ChatGPT

Section 10: TruthfulQA - Evaluation of Truthfulness in Language Models

Lecture 49 TruthfulQA: Measuring How Models Mimic Human Falsehoods

Lecture 50 Demo - TruthfulQA Model Benchmarking

Lecture 51 What is HellaSwag NLU Benchmark

Lecture 52 Demo - HellaSwag Evaluation - Python - TensorFlow

Section 11: Evaluate the toxicity of a model

Lecture 53 What is PerspectiveAPI

Lecture 54 Get a Perspective API Key

Lecture 55 Installing Postman and first API Test

Lecture 56 Demo - VS Code - Call Perspective API

Lecture 57 Demo - Python - Test ChatGPT Response against Perspective API

Lecture 58 Test the Toxicity of a Hugging Face model - GPT-2

Section 12: CIFAR10 - Image Classification

Lecture 59 Intro to CIFAR-10 Computer Vision benchmarking

Lecture 60 Download Dataset and Model for Test

Lecture 61 Run the Benchmark - model not fine tuned

Lecture 62 Understand the result - Read the confusion Matrix

Lecture 63 Train the model and redo benchmark

Section 13: Conclusions

Lecture 64 Conclusion

AI Engineers, AI Project Managers, ML Testers, AI Testers, Production Owners who work with AI