Influential Data Retrieval Explained

Overview

Explore a 58-minute session with Huawei Lin of the Rochester Institute of Technology, co-author of the paper "Token-wise Influential Training Data Retrieval for Large Language Models". The session presents RapidIn, the framework proposed in the paper: a scalable approach to estimating how much individual training examples influence the output of a large language model (LLM). Learn how its two-stage process, caching followed by retrieval, is adapted to LLMs. The session page also links to additional resources on AI research and industry trends, including The Deep Dive newsletter and Unify's blog. Connect with the Unify community on social media, and explore Unify's GitHub repository for hands-on work with AI deployment technologies.
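
To make the caching-then-retrieval idea concrete, here is a minimal toy sketch of a two-stage influence lookup. It is not the RapidIn implementation. It assumes, purely for illustration, that the cached quantity is a compressed per-sample gradient and that influence is scored by an inner product with the query's gradient; the description above does not confirm either detail. The compression is a plain dense Gaussian random projection, the "gradients" are hand-written vectors rather than model gradients, and every function name is hypothetical.

```python
import math
import random


def make_projection(dim, k, seed=0):
    """Dense Gaussian random projection with k rows and dim columns (an assumption)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)] for _ in range(k)]


def compress(grad, proj):
    """Project a gradient vector down to len(proj) dimensions."""
    return [sum(p * g for p, g in zip(row, grad)) for row in proj]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def build_cache(train_grads, proj):
    """Stage 1 (caching): compress each training gradient once and store it."""
    return {name: compress(grad, proj) for name, grad in train_grads.items()}


def retrieve(cache, query_grad, proj, top_k=3):
    """Stage 2 (retrieval): score cached vectors against the compressed query."""
    q = compress(query_grad, proj)
    scores = {name: dot(vec, q) for name, vec in cache.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]


# Hand-written stand-ins for per-sample gradients (not real model gradients).
train_grads = {
    "sample_a": [1.0, 2.0, 0.0, -1.0, 0.5, 0.0],
    "sample_b": [0.0, 0.1, 3.0, 0.0, 0.0, -2.0],
    "sample_c": [-1.0, -2.0, 0.0, 1.0, -0.5, 0.0],  # exact negation of sample_a
}

proj = make_projection(dim=6, k=4)
cache = build_cache(train_grads, proj)

# The query gradient points the same way as sample_a and opposite to sample_c.
query_grad = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0]
ranking = retrieve(cache, query_grad, proj, top_k=3)

for name, score in ranking:
    print(f"{name}: {score:+.3f}")
```

The point of the split is that the expensive step (computing and compressing training gradients) happens once, while each later query only needs one compression and a pass over the cache. In this toy setup sample_a always ranks above sample_c, because the projection is linear and sample_c is the exact negation of the query; where sample_b lands depends on the random projection.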