Detoxification of Large Language Models Using TrustyAI Detoxify and HuggingFace SFTTrainer

Overview

Explore how to detoxify large language models in this DevConf.US 2024 conference talk. Learn how TrustyAI Detoxify, an open-source library for scoring and rephrasing toxic content, works together with HuggingFace's Supervised Fine-tuning Trainer (SFTTrainer) to streamline the detoxification process. Discover why curating high-quality, human-aligned training data is challenging, and how TrustyAI Detoxify can rephrase toxic content to produce data for supervised fine-tuning. Gain insight into the capabilities of TrustyAI Detoxify and its practical use in reducing toxic output from language models. Follow along as speaker Christina Xu demonstrates how to integrate these tools into a detoxification protocol and build more responsible AI systems.