Layer-Increasing Network Scaling (LiNeS) - Preventing Catastrophic Forgetting in Language Models
Discover AI via YouTube
Overview
Learn about a post-training technique called LiNeS (Layer-increasing Network Scaling) in this 21-minute technical video that addresses catastrophic forgetting in large pre-trained models. Explore how this layer-wise approach linearly scales parameter updates with layer depth: shallow-layer updates are scaled down so they stay close to the pre-trained weights and preserve foundational knowledge, while deeper layers retain more of their task-specific updates. Discover the technique's effectiveness in multi-task scenarios, where it enhances model merging methods across vision and NLP benchmarks by mitigating negative task interference. Follow along as the presentation covers VLM limitations, ChatGPT Search hallucinations, a detailed visualization of the LiNeS methodology, multi-model merging strategies, AdaMerging for multi-task learning, anisotropic scaling, and task vector implementations. Gain insights into this computationally efficient solution that maintains both specialized task performance and broad generalization in resource-constrained AI environments.
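The layer-wise scaling described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' reference implementation: the list-of-arrays weight format, the function name lines_scale, and the alpha/beta defaults are assumptions made for the example. The core idea is that each layer's fine-tuning update (its task vector) is multiplied by a coefficient that grows linearly from shallow to deep layers before being added back to the pre-trained weights.

```python
# Minimal sketch of layer-increasing scaling of a fine-tuning update.
# Assumptions: weights are a list of per-layer arrays ordered shallow -> deep;
# alpha/beta values and the function name are illustrative, not from the paper.
from typing import List
import numpy as np


def lines_scale(pretrained: List[np.ndarray],
                finetuned: List[np.ndarray],
                alpha: float = 0.1,
                beta: float = 0.9) -> List[np.ndarray]:
    """Rescale the fine-tuning update (task vector) layer by layer.

    Shallow layers get a coefficient close to `alpha`, keeping them near the
    pre-trained weights to preserve general features; the coefficient grows
    linearly with depth toward roughly `alpha + beta`, so deeper layers keep
    most of their task-specific update.
    """
    num_layers = len(pretrained)
    merged = []
    for layer_idx, (w_pre, w_ft) in enumerate(zip(pretrained, finetuned)):
        task_vector = w_ft - w_pre                       # per-layer update
        depth_frac = layer_idx / max(num_layers - 1, 1)  # 0 (shallow) -> 1 (deep)
        coeff = alpha + beta * depth_frac                # linear layer-wise scaling
        merged.append(w_pre + coeff * task_vector)
    return merged


# Toy usage: a 4-layer "model" whose fine-tuned weights drift from the base.
rng = np.random.default_rng(0)
base = [rng.normal(size=(3, 3)) for _ in range(4)]
tuned = [w + 0.5 * rng.normal(size=w.shape) for w in base]
scaled = lines_scale(base, tuned)
for i, (b, s) in enumerate(zip(base, scaled)):
    # Shallow layers stay close to the pre-trained weights; deep layers move more.
    print(f"layer {i}: retained update norm = {np.linalg.norm(s - b):.3f}")
```

In a multi-task setting, the same scaling would be applied to a merged task vector (for example, the sum of several tasks' updates), which is how the video relates LiNeS to merging methods such as task arithmetic and AdaMerging.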
Syllabus
VLMs suffer from catastrophic forgetting
ChatGPT Search with GPT-4o (omni)
ChatGPT Search hallucinates a complete study
Post-training layer scaling for VLMs and LLMs
Simple visualization of the LiNeS method
Multi-task to multi-model merging
AdaMerging for multi-task learning
Anisotropic scaling
Task vectors are not task vectors
Taught by
Discover AI