Mozilla's DeepSpeech and Common Voice Projects - Open and Offline-Capable Voice Recognition
Mozilla Hacks via YouTube
Overview
Explore Mozilla's DeepSpeech and Common Voice projects in this 27-minute conference talk presented by Tilman Kamp at FOSDEM 2018. The talk introduces open, offline-capable voice recognition and the motivations behind both projects, then covers scaling challenges and the user perspective. It walks through preparing the environment, downloading and importing the data, and contributing to an open dataset. Training topics include loss, propagation, overfitting, and the differences between acoustic and language models. The final part applies the workflow to German: building a corpus, preparing CSV files and language models, and running a German training. The talk closes with next steps and ways to get involved in both projects.
Syllabus
Introduction
What are we doing
Why are we doing this
Scaling
User perspective
Preparing the environment
Downloading the data
Code
Rust bindings
Quality
Development
Preparation
Getting Data
Contributing
Open Data
Import CV
Loss
Propagation
Overfitting
Acoustic vs Language
Terminology
German
Corpus
CSV
Language models
Conversion
Run
German Training
Next Steps
Contact Us
Taught by
Mozilla Hacks