Bernice: A Multilingual Pre-trained Encoder for Twitter

Center for Language & Speech Processing (CLSP), JHU via YouTube

Overview

Explore Bernice, a multilingual RoBERTa language model designed specifically for Twitter data. Learn about the development of this pre-trained encoder, which was trained from scratch on 2.5 billion tweets spanning multiple languages. Discover how Bernice outperforms both models adapted to social media data and strong multilingual baselines on a range of monolingual and multilingual Twitter benchmarks. Gain insight into the unique challenges of processing Twitter's multilingual content and how Bernice addresses the significant differences between Twitter language and the domains commonly used to train large language models.

Syllabus

Bernice: A Multilingual Pre-trained Encoder for Twitter - Alexandra DeLucia - October 2022

Taught by

Center for Language & Speech Processing (CLSP), JHU
