Overview
Learn about the development and evaluation of TelBench, a specialized benchmark for measuring Large Language Model (LLM) performance in telecommunications services, in this 24-minute conference talk from SK AI SUMMIT 2024. Discover how the TelTask and TelInstruct training datasets were designed to teach LLMs telecommunications terminology, knowledge, and business context. Explore the benchmarking process and results used to validate Telco LLM, including collaboration between linguists and engineers, expert evaluations by customer service representatives, and the development of a telecommunications-specific LLM-as-a-judge. Gain insights into the potential of large language models in the telecommunications industry from Sunwoo Lee, SK Telecom's Data Construction/Evaluation Team Leader, who combines linguistic expertise with NLP implementation to design training data, evaluate model performance, and drive service applications.
Syllabus
TelBench: Developing and Evaluating a Benchmark for Measuring LLM Performance in Telco Services | Sunwoo Lee, SK Telecom
Taught by
SK AI SUMMIT 2024