Complete Computer Vision Bootcamp: YOLO to Multimodal AI

Posted By: lucky_aut

Date: Sept. 29, 2025

Published 9/2025
Duration: 4h 13m | .MP4 1280x720 30 fps(r) | AAC, 44100 Hz, 2ch | 4.79 GB
Genre: eLearning | Language: English

Build practical applications with YOLO, DeepSORT, Streamlit, and state-of-the-art vision-language models

What you'll learn
- Getting Started with YOLO11
- YOLO11 Implementation | Google Colab
- Creating Analytical Graphs and Visualizing Data with YOLO11
- Counting Object Entries and Exits using YOLO11 and DeepSORT
- Streamlit Application: Object Detection, Segmentation & Pose Estimation
- Using Ultralytics YOLO11 with SAHI for Object Detection in Drone Footage
- Estimate Real Distance to Objects with ML Depth Pro and YOLO11
- Performing Zero-Shot Object Detection with Qwen2.5-VL
- Run Vision Tasks: Object Detection, Image Captioning & OCR with Florence 2
- Google Gemini 2.5 Pro: Detect Objects, Generate Captions & OCR

Requirements
- Basic knowledge of Python programming

Description
This course takes you from the basics of YOLO11 to advanced computer vision applications. You’ll explore object detection, segmentation, pose estimation, and image classification, while also learning to create analytical graphs and track object movements. Beyond YOLO11, you’ll build real-world projects with Streamlit, enhance detection with SAHI, estimate distances with Depth Pro, and explore cutting-edge multimodal AI models like Qwen2.5-VL, Florence 2, and Google Gemini 2.5. By the end, you’ll have hands-on experience with modern tools to solve practical computer vision challenges.

What You Will Learn:

Getting Started with YOLO11:

YOLO11 Updates and New Features

Implementing YOLO11 in Google Colab:

YOLO11 for Object Detection, Segmentation, Pose Estimation & Classification

Creating Analytical Graphs and Visualizing Data with YOLO11:

How to Generate Analytical Graphs with YOLO11

Counting Object Entries and Exits using YOLO11 and DeepSORT:

Tracking Objects with YOLO11 and DeepSORT for Entry–Exit Counts
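
As a rough illustration of what entry and exit counting involves, here is a minimal sketch in plain Python. It is not taken from the course and does not call YOLO11 or DeepSORT. It assumes a tracker has already produced a stable ID and a centroid for each object in each frame, and that "entry" means crossing a horizontal line downward. The function name count_line_crossings and the line_y parameter are hypothetical.

```python
from typing import Dict, Iterable, Tuple

Centroid = Tuple[float, float]


def count_line_crossings(
    frames: Iterable[Dict[int, Centroid]], line_y: float
) -> Tuple[int, int]:
    """Count tracked objects crossing a horizontal line.

    frames: one dict per video frame, mapping track ID -> (cx, cy) centroid.
    line_y: y coordinate of the counting line, in pixels (assumed horizontal).
    Returns (entries, exits), where an entry is a downward crossing.
    """
    last_y: Dict[int, float] = {}  # last seen centroid y for each track ID
    entries = 0
    exits = 0
    for detections in frames:
        for track_id, (_, cy) in detections.items():
            prev = last_y.get(track_id)
            if prev is not None:
                if prev < line_y <= cy:
                    entries += 1  # moved from above the line to on/below it
                elif prev >= line_y > cy:
                    exits += 1  # moved from on/below the line to above it
            last_y[track_id] = cy
    return entries, exits


# Hypothetical tracker output for three frames, two tracked objects.
frames = [
    {1: (10.0, 40.0), 2: (50.0, 70.0)},
    {1: (10.0, 55.0), 2: (50.0, 45.0)},
    {1: (10.0, 60.0)},
]
print(count_line_crossings(frames, line_y=50.0))
```

A centroid that jitters around the line would be counted more than once by this sketch. A practical counter would usually add a dead zone or count each track ID only once.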

Streamlit Application: Object Detection, Segmentation & Pose Estimation:

Building a Streamlit App for Object Detection, Segmentation, and Pose Estimation

Using Ultralytics YOLO11 with SAHI for Object Detection in Drone Footage:

YOLO11 + SAHI = Better Detection for Small Objects! (Step-by-Step Guide)
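
The idea behind sliced inference is to cut a large frame into overlapping tiles, run the detector on each tile, and merge the results, so that small objects occupy more of the pixels the model sees. The sketch below only computes the tile coordinates in plain Python. It is not SAHI's own API, and the function name slice_boxes and its parameters are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def slice_boxes(width: int, height: int, tile: int = 640, overlap_px: int = 128) -> List[Box]:
    """Return overlapping tile boxes that together cover a width x height image.

    tile: side length of each square tile, in pixels.
    overlap_px: pixels shared by neighbouring tiles (must be smaller than tile).
    The last tile in each row and column is shifted back to stay inside the image.
    """
    if not 0 <= overlap_px < tile:
        raise ValueError("overlap_px must be in [0, tile)")
    step = tile - overlap_px
    boxes: List[Box] = []
    y = 0
    while True:
        y0 = min(y, max(height - tile, 0))  # clamp so the tile stays in bounds
        x = 0
        while True:
            x0 = min(x, max(width - tile, 0))
            boxes.append((x0, y0, min(x0 + tile, width), min(y0 + tile, height)))
            if x0 + tile >= width:
                break  # this tile reached the right edge
            x += step
        if y0 + tile >= height:
            break  # this row reached the bottom edge
        y += step
    return boxes


tiles = slice_boxes(1280, 720)
print(len(tiles), tiles[0], tiles[-1])
```

Detections from each tile would then be shifted back by the tile's (x_min, y_min) offset and merged, for example with non-maximum suppression, since objects in the overlap regions are detected more than once.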

Estimate Real Distance to Objects with ML Depth Pro and YOLO11:

Learn how to estimate real distances to objects using Depth Pro and YOLO11.

Performing Zero-Shot Object Detection with Qwen2.5-VL:

Zero-Shot Object Detection Using Qwen2.5-VL

Run Vision Tasks: Object Detection, Image Captioning & OCR with Florence 2:

How to use Florence 2 for Object Detection, Image Captioning and OCR

Google Gemini 2.5 Pro: Detect Objects, Generate Captions & OCR:

How to do Object Detection, Image Captioning, Reasoning and OCR with Gemini-2.5

Who this course is for:
- Anyone interested in Computer Vision
- Students and researchers exploring AI and vision-language models
- Anyone excited about building AI-powered applications