Overview
Learn about the principles of tidy data and discover how to import, transform, clean, and wrangle data using the R programming language.
Syllabus
Introduction
- Preparing for data wrangling
- What you need to know
- Exercise files
- What is tidy data?
- Variables, observations, and values
- Common data problems
- Using the tidyverse
- Building and printing tibbles
- Subsetting tibbles
- Filtering tibbles
- What are CSV files?
- Importing CSV files into R
- What are TSV files?
- Importing TSV files into R
- Importing delimited files into R
- Importing fixed-width files into R
- Importing Excel files into R
- Reading data from databases and the web
- Wide vs. long datasets
- Making wide datasets long with pivot_longer()
- Making long datasets wide with pivot_wider()
- Converting data types in R
- Working with dates and times in R
- Detecting outliers
- Missing and special values in R
- Breaking apart columns with separate()
- Combining columns with unite()
- Manipulating strings in R with stringr
- Understanding the coal dataset
- Reading in the coal dataset
- Converting the coal dataset from wide to long
- Segmenting the coal dataset
- Visualizing the coal dataset
- Understanding the water quality dataset
- Reading in the water quality dataset
- Filtering the water quality dataset
- Water quality data types
- Correcting data entry errors
- Identifying and removing outliers
- Converting temperature from Fahrenheit to Celsius
- Widening the water quality dataset
- Understanding the social security disability dataset
- Importing the social security disability dataset
- Making the social security disability dataset long
- Formatting dates in the social security disability dataset
- Fiscal years in the social security disability dataset
- Widening the social security disability dataset
- Visualizing the social security disability dataset
- Next steps
Taught by
Mike Chapple