Analyzing and Exposing Vulnerabilities in Language Models - Session 131

Overview

Explore critical vulnerabilities in large language models (LLMs) through this 37-minute research presentation from Stanford University's MedAI Group Exchange Sessions. PhD student Yibo Wang presents two papers that examine robustness and fairness challenges in LLMs. Learn about a Distribution-Aware Adversarial Attack method that improves attack effectiveness while remaining harder to detect and transferring better across models. Understand how gender biases manifest in text generation systems and discover approaches to quantify and mitigate these stereotypes. Gain insights from Wang's work in natural language processing, particularly on trustworthy language models and code generation systems. Part of Stanford's weekly MedAI series on the intersection of artificial intelligence and medicine, this talk includes an interactive discussion and Q&A session that connects theoretical concepts with practical applications.