Overview
Explore data curation techniques for open-source LLM fine-tuning in this 19-minute conference talk by Clemens Schroeer from Lemon AI, presented at the Data Science Festival MayDay event 2024. Examine the challenges of acquiring high-quality data for fine-tuning models like Mistral 7B, and learn strategies for iterating towards an effective dataset. Learn about the complexities of understanding company data and using it to get good fine-tuning results. Designed for technical practitioners, this talk offers experience-based lessons on dataset curation and on what to consider when fine-tuning open-source LLMs.
Syllabus
Data Curation for Open Source LLM Fine Tuning - Data Science Festival
Taught by
Data Science Festival