Complete Guide to Evaluating Large Language Models (LLMs)

Posted By: IrGens

.MP4, AVC, 1280x720, 30 fps | English, AAC, 2 Ch | 7h 56m | 1.32 GB
Instructor: Sinan Ozdemir

In this comprehensive course, AI and LLM expert Sinan Ozdemir shares the knowledge and skills you need to assess LLM performance effectively. Get a detailed introduction to evaluating LLMs, multimodal AI, and AI-powered applications such as agents and RAG. Learn how to thoroughly evaluate these powerful and often unwieldy AI tools so you can make sure they meet your real-world needs. This course prepares you to evaluate and optimize LLMs so you can produce cutting-edge AI applications.

Learning objectives

  • Distinguish between generative and understanding tasks.
  • Apply key metrics for common tasks.
  • Evaluate multiple-choice tasks.
  • Evaluate free-text response tasks.
  • Evaluate embedding tasks.
  • Evaluate classification tasks.
  • Build an LLM classifier with BERT and ChatGPT.
  • Evaluate LLMs with benchmarks.
  • Probe LLMs.
  • Fine-tune LLMs.
  • Evaluate and clean data.
  • Evaluate AI agents.
  • Evaluate retrieval-augmented generation (RAG) systems.
  • Evaluate a recommendation engine.
  • Use evaluation to combat AI drift.