Overview
Explore the complexities of dataset licensing in AI software development through this conference talk. Delve into the OpenDataology project, an open-source initiative addressing license compliance challenges for publicly available datasets. Learn about the risks of combining multiple data sources, each of which may carry a different license. Discover how OpenDataology proposes a novel approach to assessing potential license compliance violations and serves as a crowd-sourced medium for identifying and documenting those risks. Gain insights into the project's four key thrusts, its available tools, and its efforts to enhance SPDX for better dataset license compliance analysis. Understand why proper dataset licensing matters and what steps are being taken to improve the current landscape in AI development.
Syllabus
Introduction
About Gopi
Disclaimer
Project Logo
Shoutouts
How to Get Data
Licenses
Rights
Challenges
Project Aims
Project Process
Extract License
Extract Metadata
Data License Format
Tagging Distribution
The Only Difference
The Problem
Custom Licenses
Standard Licenses
Montreal Data License Research Paper
Risks
Other Scary Aspects
OpenDataology
Four Key Thrusts
Tools
Generating a License
Community
Standards Community
Conclusion
Dataset Licensing
Taught by
Linux Foundation