This course continues our gentle introduction to programming in R, designed for three types of learners. It will be right for you if:
• you want to do data analysis but don’t know programming
• you know programming but aren’t too familiar with R
• you know some R programming but want to learn more about the tidyverse verbs
It is best taken after the first course in the specialization, or if you are already familiar with ggplot2, R Markdown, and basic function writing in R. You will learn to use readr to read in your data, dplyr to analyze your data, and stringr and forcats to manipulate strings and factors.
Overview
Syllabus
- Projects, Tibbles and Importing Data
- When analyzing data, you will often be required to import data from CSV or txt files. In this module, you will learn how to import and parse data using base R and readr, a package in the tidyverse. You will also be introduced to R projects, which help store and organize the data files associated with an analysis.
- Tidying Data
- Data are stored in tabular form and are often organized differently depending on their use. In this module, you will learn how to reorganize data to produce a "tidy" data set, where every variable is stored in its own column, every observation is stored in its own row, and each value is stored in its own cell.
- Relational Data
- Data analysis rarely involves a single data table, so you will often need to combine multiple related tables to answer the questions you are interested in. In this module, you will learn and practice mutating variables and filtering observations using relational data.
- String Manipulation and Regular Expressions
- This module will introduce string manipulation in R. You will learn the basics of strings, including string creation, merging, and subsetting. Then, you will use regular expressions to describe and view patterns in strings.
- Categorical Variables and Factors
- In the last module of the course, you will use the forcats package in the tidyverse to work with categorical variables, that is, variables that take a fixed set of discrete values. R represents these as factors, data objects that categorize the data into levels, and forcats provides tools for working with them. You will practice creating and modifying factors.
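To give a flavor of how the modules above fit together, here is a minimal sketch in R. It is not taken from the course materials: the data, the column names (site, year, count, region), and the variable names are all hypothetical, and it assumes that readr (2.0 or later), tidyr, dplyr, stringr, forcats, and tibble are installed.

```r
library(readr)
library(tidyr)
library(dplyr)
library(stringr)
library(forcats)
library(tibble)

# Importing: parse a small CSV. I() marks the string as literal data
# rather than a file path. The data here are made up for illustration.
wide <- read_csv(
  I("site,2019,2020\nnorth,10,11\nsouth,20,22\n"),
  show_col_types = FALSE
)

# Tidying: move the year columns into rows, so each row is one
# site-year observation and each variable has its own column.
tidy <- pivot_longer(
  wide,
  cols = -site,
  names_to = "year",
  values_to = "count"
)

# Relational data: a second (hypothetical) table keyed by site,
# combined with a join that adds its region column to each row.
regions <- tibble(
  site = c("north", "south"),
  region = c("Upland", "Coastal")
)
joined <- left_join(tidy, regions, by = "site")

# Strings: flag rows whose region starts with "Coast",
# using a regular expression.
joined <- mutate(joined, is_coastal = str_detect(region, "^Coast"))

# Factors: store site as a factor and move "south" to the first level.
joined <- mutate(joined, site = fct_relevel(factor(site), "south"))

print(joined)
print(levels(joined$site))
```

After the tidying step the table has one row per site and year, which is the shape the later steps rely on. The join, the string match, and the factor releveling are just one example each of what the corresponding modules cover.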
Taught by
Jane Wall