Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

YouTube

Instruction Backdoor Attacks Against Customized Large Language Models

USENIX via YouTube

Overview

Save Big on Coursera Plus. 7,000+ courses at $160 off. Limited Time Only!
Explore a research presentation from USENIX Security '24 that investigates critical security vulnerabilities in customized Large Language Models (LLMs). Learn about the first instruction backdoor attacks targeting applications integrated with untrusted customized LLMs, particularly focusing on GPTs. Discover how researchers developed three levels of attacks - word-level, syntax-level, and semantic-level - that can embed backdoors through prompt design without modifying the underlying LLM architecture. Examine experimental results across 6 major LLMs and 5 benchmark text classification datasets, demonstrating successful attack implementation while maintaining model utility. Understand proposed defense strategies against these vulnerabilities and grasp the broader implications for LLM customization security.

Syllabus

USENIX Security '24 - Instruction Backdoor Attacks Against Customized LLMs

Taught by

USENIX

Reviews

Start your review of Instruction Backdoor Attacks Against Customized Large Language Models

Never Stop Learning.

Get personalized course recommendations, track subjects and courses with reminders, and more.

Someone learning on their laptop while sitting on the floor.