

Audio-Visual Active Speaker Detection on Embedded Devices - Optimizing Deep Learning Models

EDGE AI FOUNDATION via YouTube

Overview

Watch a technical conference presentation on developing and optimizing Active Speaker Detection (ASD) models for embedded devices. Learn how researchers at NXP Semiconductors built computationally efficient deep learning architectures that identify active speakers in video by analyzing both visual and audio features in real time. See how multi-objective optimization and a novel modality fusion scheme reduce computational cost enough to allow implementation on both high-end MPUs and resource-constrained MCUs. Follow the full optimization process, from model design modifications through quantization and integration on NXP devices, with detailed analysis of the trade-offs between computational efficiency and system accuracy.

Syllabus

tinyML EMEA - Baptiste Pouthier: Audio-Visual Active Speaker Detection on Embedded Devices

Taught by

EDGE AI FOUNDATION

