XLSTM: Understanding Extended LSTMs with sLSTM and mLSTM Architecture

AI Bites via YouTube

Overview

This 14-minute technical video traces how Long Short-Term Memory networks (LSTMs) are extended into XLSTM (Extended LSTM). It explains why traditional LSTMs are limited in parallel processing and GPU utilization, and how the XLSTM architecture addresses these constraints through its two main components, sLSTM and mLSTM. Starting from the basics of Recurrent Neural Networks, the video works through the underlying equations, covering the normalizer and stabilizer in sLSTM and the full structure of the mLSTM block. It closes with the practical advantages of XLSTM and an evaluation that compares this parallel-capable LSTM variant with modern transformers, which makes it relevant to professionals working on sequence tasks such as text generation and translation.

Syllabus

- Intro and overview of XLSTM
- Problems with LSTMs
- Recurrent Neural Networks (RNNs)
- LSTMs overview
- Drawbacks of LSTMs
- Sigmoid vs Exponential Function
- sLSTM block
- Normalizer in sLSTM
- Stabilizer in sLSTM
- mLSTM block
- Detailed block of sLSTM
- Detailed block of mLSTM
- XLSTM
- Advantages of XLSTM
- Evaluation
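
The syllabus topics on exponential gating, the normalizer, and the stabilizer in sLSTM can be made concrete with a small example. The sketch below is not taken from the video; it follows the commonly published sLSTM update equations as best understood, reduced to a single scalar cell. The function name `slstm_step` and its argument names are hypothetical, and the gate pre-activations are passed in directly rather than computed from learned weights.

```python
import math

def slstm_step(c_prev, n_prev, m_prev, i_pre, f_pre, z_pre, o_pre):
    """One scalar sLSTM step (illustrative sketch, not a reference implementation).

    Assumes exponential input and forget gates, so log(i) = i_pre and
    log(f) = f_pre. Returns the new cell state c, normalizer n,
    stabilizer m, and hidden state h.
    """
    # Stabilizer: running maximum of the gates in the log domain.
    m = max(f_pre + m_prev, i_pre)
    # Stabilized gates: both exponents are <= 0, so exp() cannot overflow.
    i_gate = math.exp(i_pre - m)
    f_gate = math.exp(f_pre + m_prev - m)
    z = math.tanh(z_pre)                      # candidate cell input
    o_gate = 1.0 / (1.0 + math.exp(-o_pre))   # sigmoid output gate
    c = f_gate * c_prev + i_gate * z          # cell state
    n = f_gate * n_prev + i_gate              # normalizer accumulates gate mass
    h = o_gate * (c / n)                      # normalized hidden state
    return c, n, m, h

# A very large input-gate pre-activation: math.exp(1000.0) alone would
# raise OverflowError, but the stabilized step stays finite.
c, n, m, h = slstm_step(0.0, 0.0, 0.0, i_pre=1000.0, f_pre=0.0, z_pre=0.5, o_pre=0.0)
```

Because the cell state and the normalizer are scaled by the same factor, the stabilizer leaves the ratio `c / n`, and therefore `h`, unchanged compared with the unstabilized update; it only keeps the intermediate values within floating-point range.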

Taught by

AI Bites
