Introduction to Data Science

Ball State University via Coursera

Overview

We live in a world where the volume of data is growing rapidly, and this growth is drawing interest across many fields. Data science combines the collection of large amounts of data with domain expertise, programming skills, mathematics, and statistics to derive meaningful insights. Because the field is both broad and deep, this course aims to give you a theoretical foundation and framework for starting out in it. The course is divided into five parts, each centered on a core topic related to data. The first part introduces data ethics, outlining the ethical issues surrounding data collection, usage, and reporting. The second part covers data collection, acquisition sources, and data structures. The third part focuses on cutting-edge research in data science. The fourth part introduces basic data processing through programming, specifically in R, a widely used data analytics tool. Here, you will learn R fundamentals, carry out basic data wrangling tasks, develop an understanding of data storage and management, and gain experience in data visualization. The fifth part teaches the fundamentals of probability and statistics, preparing you for the next stage of exploration.

Syllabus

  • Informed Consent and Data Ownership
    • What is data science, and what activities and topics does it include? This module answers those questions first and then turns to one of those topics: data ethics. It gives a big-picture view of the ethical issues within data science and focuses on two critical topics, Informed Consent and Data Ownership. You will learn to define, explain, and discuss these two topics and to identify ethical and unethical activities related to them.
  • Privacy, Transaction Transparency and Anonymity
    • This module focuses on three important concepts in data ethics: Privacy, Transaction Transparency, and Anonymity. These concepts often intersect and influence each other. The module explains each term and provides examples of how the concepts apply in data science, with special attention to de-identification for privacy protection.
  • Data Validity and Algorithmic Fairness
    • This module discusses two important concepts: Data Validity and Algorithmic Fairness. Data validity concerns the accuracy and bias of input data, which strongly influence the outcomes and fairness of algorithms. The module explores how and why invalid or unethically handled data can result in unfairness.
  • Societal Consequences and Code of Ethics
    • Unethical activities during research design, data collection, and data analysis usually lead to societal consequences. However, even when the whole data handling process is ethical, there may still be unintended consequences arising from the development of new technology. This module discusses societal consequences in data science and outlines codes of ethics in research and environmental sciences that can guide the behavior of data scientists.
  • Data Sources and Data Acquisition
    • This module focuses on the initial phase of a data science project: obtaining data. It covers identifying and describing data sources, sampling techniques for data collection, and the impact of sampling bias on research, with the aim of giving a thorough understanding of these first steps.
  • Data Types and Data Structures
    • This module explores several concepts about data: file formats for delivering and sharing data, data types that describe the basic nature and characteristics of variables, and data structures for data manipulation and analysis. It covers the concepts of data files, data types, and data structures, the common data types and structures in programming languages, and data structures in R specifically.
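
As a rough illustration of the sampling bias topic mentioned in the Data Sources and Data Acquisition module, the sketch below compares a simple random sample with a biased convenience sample. It is not taken from the course materials. It is written in Python rather than R, and the population of ages, the function names, and the age cutoff are all hypothetical choices made only for this example.

```python
import random
import statistics


def make_population(n=10_000, seed=0):
    """Build a hypothetical population of ages between 18 and 90."""
    rng = random.Random(seed)
    return [rng.randint(18, 90) for _ in range(n)]


def simple_random_sample(population, k, seed=1):
    """Draw k members so that every member is equally likely to be chosen."""
    rng = random.Random(seed)
    return rng.sample(population, k)


def convenience_sample(population, k, cutoff=40):
    """Draw k members only from those younger than the cutoff.

    This mimics a hypothetical survey that reaches only younger
    respondents, which is one way sampling bias can arise.
    """
    eligible = [age for age in population if age < cutoff]
    return eligible[:k]


population = make_population()
true_mean = statistics.mean(population)
srs_mean = statistics.mean(simple_random_sample(population, 500))
biased_mean = statistics.mean(convenience_sample(population, 500))

print(f"population mean age:     {true_mean:.1f}")
print(f"simple random sample:    {srs_mean:.1f}")
print(f"convenience sample (<40): {biased_mean:.1f}")
```

Under these assumptions, the simple random sample should give an average age close to the population average, while the convenience sample gives a much lower one because older people are never included.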

Taught by

Dr. Aihua Li
