AI-Driven Solutions for Text Anonymization - Implementing Machine Learning for Data Privacy
Data Science Conference via YouTube
Overview
Learn about AI-driven text anonymization in this 26-minute conference talk from DSC ADRIA 23, which explores automated, machine-learning-based solutions for protecting sensitive data. Discover essential concepts including data labeling processes, NLP model training, and human-in-the-loop monitoring systems. Gain insights into applying MLOps principles, understanding PII detection modules, and using synthetic replacement techniques. Learn the fundamentals of text anonymization, why it matters, and how it applies in practice when working with large language models. Follow the journey from initial rule engines to continuous improvement strategies, with detailed explanations of AWS and Azure human-in-the-loop implementations. Understand key considerations for deploying these solutions in industrial settings while protecting privacy and maintaining model performance.
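To make the rule engine and synthetic replacement ideas mentioned above more concrete, here is a minimal Python sketch. It is not the speakers' implementation: the regular expressions, the `RULES` and `SYNTHETIC` tables, and the function names `detect_pii` and `anonymize` are all hypothetical, and a production PII detection module of the kind the talk describes would typically pair rules like these with a trained NLP model.

```python
import re

# Hypothetical rule set: these patterns are illustrative assumptions,
# not the rules presented in the talk.
RULES = {
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}

# Hypothetical synthetic stand-ins, one fixed value per PII label.
SYNTHETIC = {
    "EMAIL": "user@example.com",
    "PHONE": "+1 555 0100",
}


def detect_pii(text):
    """Return (start, end, label) spans for every rule match, sorted by start."""
    spans = []
    for label, pattern in RULES.items():
        for match in pattern.finditer(text):
            spans.append((match.start(), match.end(), label))
    return sorted(spans)


def anonymize(text):
    """Replace each detected span with a synthetic value of the same type."""
    pieces = []
    cursor = 0
    for start, end, label in detect_pii(text):
        if start < cursor:
            continue  # skip spans that overlap one already replaced
        pieces.append(text[cursor:start])
        pieces.append(SYNTHETIC[label])
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)


if __name__ == "__main__":
    print(anonymize("Contact jane.doe@corp.io or call +385 91 234 5678."))
```

Replacing each span with a realistic value of the same type, rather than masking it, keeps the text readable for downstream applications. A fuller system would likely draw varied replacements instead of one fixed value per label.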
Syllabus
Intro
What is text anonymization?
Why anonymization?
Large Language Models
Model Training and Fine Tuning
Data Labelling
Step 1 - Model
Rule Engine
Step 2 - PII Detection Module
Human in the Loop - AWS
Human in the Loop - Azure
Synthetic replacement
Continuous Improvement
Downstream Application
Key takeaways
Taught by
Data Science Conference