Delta Keyword Transformer: Bringing Transformers to the Edge Through Dynamically Pruned Multi-Head Self-Attention

Overview

This 21-minute conference talk from the tinyML Research Symposium 2022 presents the Delta Keyword Transformer, an approach to running Transformers on edge devices by dynamically pruning multi-head self-attention. The speaker is Zuzana Jelčicová, an Industrial PhD student at Oticon. The talk covers an analysis of the Keyword Transformer (KWT) model, the Delta algorithm, and how that algorithm applies to delta-regular matrix multiplication, delta-delta matrix multiplication, and softmax. It then presents the results and conclusions, and closes with a segment on Edge Impulse.

Syllabus

Intro
Overview Transformers
Previous work
Keyword Transformer (KWT)
KWT-Model analysis
Delta algorithm
Delta-regular matrix multiplication
Delta-delta matrix multiplication
Delta for softmax
Delta Keyword Transformer
Results 1
Conclusion
Premier: EDGE IMPULSE

Taught by

tinyML
