From Images to Text: New Forms of Human-AI Interaction

AI Doctoral Academy via YouTube

Overview

Watch a 48-minute lecture on the convergence of Computer Vision and Natural Language Processing, focusing on recent developments in Vision-Language integration and Embodied AI. Discover how AI systems can generate image descriptions, answer questions, and navigate environments by following natural language instructions. Explore techniques for generating text from visual content, methods for human-controlled AI systems, and the training of large-scale models on web datasets. Learn how these technologies apply to embodied agents that navigate and interact with the physical world. Gain insights into evaluation metrics and open challenges in the field, with emphasis on recent research in human-AI interaction.

Syllabus

From Images to Text: New Forms of Human-AI Interaction

Taught by

AI Doctoral Academy

