Explore a conference talk from USENIX Security '23 that introduces Calpric, an approach to labeling privacy policies that combines crowdsourcing with active learning. Learn how the method pairs automatic text selection and segmentation with crowdsourced annotators to generate a large, balanced training set for privacy policies at reduced cost. Discover how Calpric enables more accurate deep learning models that cover a wider range of data categories and provide more detailed, fine-grained labels than previous work. Understand the cost-effectiveness of this approach, which obtains reliable labeled data at approximately $0.92 to $1.71 per text segment and produces a labeled dataset of 16K privacy policy text segments across 9 data categories, with balanced positive and negative samples.
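To make the pairing of active learning and crowdsourcing concrete, here is a minimal Python sketch of one generic round: pick the text segments a model is least sure about, then combine several annotators' labels for each one. This is an illustration under assumptions, not Calpric's actual algorithm. The description above does not say how segments are selected or how crowd labels are aggregated, so uncertainty sampling and majority voting are stand-ins, and the function names `select_for_annotation` and `majority_vote` are hypothetical.

```python
from collections import Counter


def select_for_annotation(probabilities, batch_size):
    """Return indices of the segments whose predicted probability of being a
    positive example is closest to 0.5 (generic uncertainty sampling, assumed
    here; not necessarily the selection rule Calpric uses)."""
    ranked = sorted(
        range(len(probabilities)),
        key=lambda i: abs(probabilities[i] - 0.5),
    )
    return ranked[:batch_size]


def majority_vote(labels):
    """Aggregate several crowd labels for one segment.

    Returns the most common label, or None when there are no labels or the
    top two labels are tied (such a segment could be sent out again).
    Majority voting is an assumed aggregation rule for illustration only.
    """
    if not labels:
        return None
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


if __name__ == "__main__":
    # Hypothetical model scores for five unlabeled policy segments.
    scores = [0.95, 0.52, 0.10, 0.45, 0.70]
    chosen = select_for_annotation(scores, batch_size=2)
    print("segments sent to annotators:", chosen)
    # Hypothetical labels from three annotators for one chosen segment.
    print("aggregated label:", majority_vote([1, 1, 0]))
```

In a loop of this kind, the aggregated labels would be added to the training set, the model retrained, and the selection repeated. Choosing uncertain segments is one common way to spend an annotation budget on examples the model has not yet learned.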