Layer-Increasing Network Scaling (LiNeS) - Preventing Catastrophic Forgetting in Language Models
Discover AI via YouTube
Overview
Learn about a post-training technique called LiNeS (Layer-increasing Network Scaling) in this 21-minute technical video that addresses catastrophic forgetting in large pre-trained models. Explore how this layer-wise approach linearly scales parameter updates with layer depth: shallow-layer updates are scaled down so they stay close to the pre-trained weights and preserve foundational knowledge, while deeper layers retain more of their task-specific updates. Discover the technique's effectiveness in multi-task scenarios, where it enhances model merging methods across vision and NLP benchmarks by mitigating negative task interference. Follow along as the presentation covers VLM limitations, ChatGPT Search hallucinations, a detailed visualization of the LiNeS methodology, multi-model merging strategies, AdaMerging for multi-task learning, anisotropic scaling, and task vector implementations. Gain insights into this computationally efficient solution that maintains both specialized task performance and broad generalization in resource-constrained AI environments.
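The layer-wise scaling described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' reference implementation: the list-of-arrays weight format, the function name lines_scale, and the alpha/beta defaults are assumptions made for the example. The core idea is that each layer's fine-tuning update (its task vector) is multiplied by a coefficient that grows linearly from shallow to deep layers before being added back to the pre-trained weights.

```python
# Minimal sketch of layer-increasing scaling of a fine-tuning update.
# Assumptions: weights are a list of per-layer arrays ordered shallow -> deep;
# alpha/beta values and the function name are illustrative, not from the paper.
from typing import List
import numpy as np


def lines_scale(pretrained: List[np.ndarray],
                finetuned: List[np.ndarray],
                alpha: float = 0.1,
                beta: float = 0.9) -> List[np.ndarray]:
    """Rescale the fine-tuning update (task vector) layer by layer.

    Shallow layers get a coefficient close to `alpha`, keeping them near the
    pre-trained weights to preserve general features; the coefficient grows
    linearly with depth toward roughly `alpha + beta`, so deeper layers keep
    most of their task-specific update.
    """
    num_layers = len(pretrained)
    merged = []
    for layer_idx, (w_pre, w_ft) in enumerate(zip(pretrained, finetuned)):
        task_vector = w_ft - w_pre                       # per-layer update
        depth_frac = layer_idx / max(num_layers - 1, 1)  # 0 (shallow) -> 1 (deep)
        coeff = alpha + beta * depth_frac                # linear layer-wise scaling
        merged.append(w_pre + coeff * task_vector)
    return merged


# Toy usage: a 4-layer "model" whose fine-tuned weights drift from the base.
rng = np.random.default_rng(0)
base = [rng.normal(size=(3, 3)) for _ in range(4)]
tuned = [w + 0.5 * rng.normal(size=w.shape) for w in base]
scaled = lines_scale(base, tuned)
for i, (b, s) in enumerate(zip(base, scaled)):
    # Shallow layers stay close to the pre-trained weights; deep layers move more.
    print(f"layer {i}: retained update norm = {np.linalg.norm(s - b):.3f}")
```

In a multi-task setting, the same scaling would be applied to a merged task vector (for example, the sum of several tasks' updates), which is how the video relates LiNeS to merging methods such as task arithmetic and AdaMerging.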
Syllabus
VLMs suffer from catastrophic forgetting
ChatGPT Search with GPT-4o (omni)
ChatGPT Search hallucinates a complete study
Post-training layer scaling for VLMs and LLMs
Simple visualization of the LiNeS method
Multi-task to multi-model merging
AdaMerging for multi-task learning
Anisotropic scaling
Task vectors are not task vectors
Taught by
Discover AI