Overview

Computing applications involving large amounts of data – the domain of data science – impact the lives of most people in the U.S. and the world. These impacts include recommendations made to us by internet-based systems, information that is available about us online, techniques that are used for security and surveillance, data that is used in health care, and many more. In many cases, these impacts are shaped by techniques in artificial intelligence and machine learning. This course examines some of the ethical issues related to data science, with the fundamental objective of making data science professionals aware of and sensitive to ethical considerations that may arise in their careers. It does this through a combination of discussion of ethical frameworks, examination of a variety of data science applications that lead to ethical considerations, reading current media and scholarly articles, and drawing upon the perspectives and experiences of fellow students and computing professionals.

Ethical Issues in Data Science can be taken for academic credit as part of CU Boulder’s Master of Science in Data Science (MS-DS) degree offered on the Coursera platform. The MS-DS is an interdisciplinary degree that brings together faculty from CU Boulder’s departments of Applied Mathematics, Computer Science, Information Science, and others. With performance-based admissions and no application process, the MS-DS is ideal for individuals with a broad range of undergraduate education and/or professional experience in computer science, information science, mathematics, and statistics. Learn more about the MS-DS program at https://www.coursera.org/degrees/master-of-science-data-science-boulder.

Syllabus

Ethical Foundations

This module begins with an introduction to the course including motivation for the topic, the course goals, what topics the course will cover, and what is expected of the students. It then reviews the three ethical frameworks that are most commonly applied to ethical discussions in data science and computing: Kantianism/deontology, virtue ethics, and utilitarianism. Case studies are used to illustrate the application and properties of these frameworks.

Internet, Privacy, and Security

This module begins with some background about the Internet, which is the foundation for most of the topics that we study in this course. It then discusses the two most basic ethical issues in using the Internet, privacy and security, in the context of data science. It goes through a number of real case studies and examples of each to illustrate the diversity of issues.

Professional Ethics

This module provides insight into the ethical issues in the data science profession and workplace (as opposed to technical topics in data science). It starts with a discussion of two highly relevant codes of professional ethics, one from a professional society in statistics and one from a professional society in computing. It then looks at a variety of recent workplace ethics issues in tech companies. A key part of this module is interviewing a data science professional about ethical issues they have encountered in their career.

Algorithmic Bias

Algorithmic bias may be the topic that people associate most with ethical issues in data science. This module begins by providing some general background on algorithmic bias and considering varying views on the pros and cons of algorithmic vs. human decision making. It then reviews an illustrative set of examples of algorithmic bias related to gender and race, a particularly important class of instances of algorithmic bias. The final part of the module discusses what is perhaps the single most prominent and widely discussed instance of algorithmic decision making and bias: facial recognition.

Medical Applications and Implications

Data science is applied to a wide variety of important application areas, each with its own ethical issues. This module focuses on an application area that is both particularly important and leads to a rich set of ethical issues: medical applications. This includes looking at current issues involved with health databases and the uses of artificial intelligence in healthcare, as well as more futuristic issues such as gene editing and neurological interventions. The module concludes with a crucial topic that every data science professional should consider: the implications of the fields of data science and computing for the future of human work.