Poisoning Web-Scale Training Datasets is Practical

Overview

Explore how model builders assemble large training datasets, and discover two attacks that exploit those collection mechanics, in this 33-minute Black Hat conference talk. Learn why deep learning models that rely on massive, distributed datasets gathered from the internet are vulnerable, including how expired domains can be exploited by malicious actors. Understand how this problem affects not only Stable Diffusion but also large language models such as ChatGPT that are trained on internet-sourced data. Gain insight into the practical implications of poisoning web-scale training datasets and the impact on popular AI models.

Syllabus

Poisoning Web-Scale Training Datasets is Practical

Taught by

Black Hat
