Overview

Build your statistics and probability expertise with this short course from the University of Leeds.

The first week introduces you to statistics as the art and science of learning from data. Through multiple real-life examples, you will explore the differences between data and information, discovering the necessity of statistical models for obtaining objective and reliable inferences. You will consider the meaning of "unbiased" data collection, reflecting on the role of randomization. Exploring various examples of data misrepresentation, misconception, or incompleteness will help develop your statistical intuition and good practice skills, including peer review.

In the second week, you will learn and practice R software skills in RStudio for exploratory data analysis, creating graphical and numerical summaries.

The final week will involve completing probability experiments and computer simulations of binomial trials, such as tossing a coin or rolling a die. This will help you develop an intuitive concept of probability, encompassing both frequentist and subjective perspectives.

Throughout the course, you will acquire vital statistical skills by practicing techniques and software commands and engaging in discussions with fellow students.

By the end of the course, you will be able to:
- Understand and explain the role of statistical models in making inferences from data.
- Implement appropriate tools for numerical and graphical summaries using RStudio, and interpret the results.
- Evaluate the stability of frequencies in computer simulations through experimental justification and "measurement" of probability.

No matter your current level of mathematical skill, you will find practical and real-life examples of statistics in action within this course. This course is a taster for the Online MSc in Data Science (Statistics), but it can also be completed by learners who want an introduction to programming and to explore the basics of R.

Syllabus

The Role of Statistical Models in Data Analysis

This first week introduces you to statistics as the art and science of learning from data. You will learn how to recognise the difference between data and information and realise the need for statistical models to gain objective and reliable inferences. You will consider examples of datasets and reflect on suitable research questions that can be posed and answered using these data. You will see the importance of 'unbiased' data collection and learn about randomisation as a tool to achieve this. Various examples of data misrepresentation, misconception or incompleteness will help you develop statistical intuition and good practice skills. In the activities section, you are introduced to the peer review tool, which is a useful way for you to improve your statistical data analysis skills.

The Basics of Exploratory Data Analysis

Week 2 gives you the opportunity to learn and practise your R skills in exploratory data analysis by producing numerical and graphical summaries of a variety of datasets. You learn to distinguish between different types of data (categorical vs numerical) and to use appropriate numerical and graphical summaries. You also gain experience in distinguishing between 'normal' and skewed data using box plots and histograms. This week offers a substantive task in RStudio to complete.
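As a rough illustration of the kind of numerical and graphical summaries described above, the sketch below uses mtcars, a dataset that ships with base R. The choice of dataset and variables is an assumption made for this example only and is not taken from the course materials.

```r
# Illustrative only: mtcars is a built-in R dataset, not a course dataset.
data(mtcars)

# Numerical summary of a numerical variable (miles per gallon):
# minimum, quartiles, median, mean and maximum.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Frequency table for a variable treated as categorical
# (number of cylinders).
cyl_counts <- table(factor(mtcars$cyl))
print(cyl_counts)

# Graphical summaries: a histogram and a box plot help judge
# whether the data look roughly symmetric or skewed.
hist(mtcars$mpg, main = "Miles per gallon", xlab = "mpg")
boxplot(mtcars$mpg, horizontal = TRUE, main = "Miles per gallon")
```

A frequency table suits the categorical variable, while the five-number summary, histogram and box plot suit the numerical one.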

Explore and Reflect: Random Experiments and Computer Simulations

This final week gives you the opportunity to explore the remarkable stability of frequencies, which provides experimental support for the concept of probability. You apply your R skills and conduct computer simulations of repeated random trials (e.g. tossing a coin or rolling a die). Based on these observations, you develop an intuitive concept of probability (frequentist and subjective). You share your findings on a discussion board or 'forum' to discuss long-term experiments as a way to 'measure' the probability of various events of interest (e.g. long runs of 6 in dice rolls, or shared birthdays in a class of students).
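A minimal sketch of such a simulation, using only base R, might look like the following. The seed, the number of tosses, the class size of 30 and the helper name shared_birthday are all assumptions chosen for illustration and are not taken from the course.

```r
set.seed(1)  # arbitrary seed so that a run can be reproduced

# --- Coin tossing: stability of the relative frequency of heads ---
n_tosses <- 10000
# Each toss is 1 (heads) or 0 (tails), with probability 0.5 of heads.
tosses <- rbinom(n_tosses, size = 1, prob = 0.5)

# Relative frequency of heads after 1, 2, ..., n_tosses tosses.
running_freq <- cumsum(tosses) / seq_len(n_tosses)

# Inspect the frequency at a few checkpoints and plot the whole path.
checkpoints <- c(10, 100, 1000, 10000)
print(round(running_freq[checkpoints], 3))
plot(running_freq, type = "l", ylim = c(0, 1),
     xlab = "Number of tosses", ylab = "Relative frequency of heads")
abline(h = 0.5, lty = 2)  # reference line at the theoretical value

# --- Birthdays: 'measuring' a probability by repeated experiment ---
# Hypothetical helper: TRUE if at least two students share a birthday.
# Assumes 365 equally likely birthdays and ignores leap years.
shared_birthday <- function(class_size) {
  birthdays <- sample(1:365, class_size, replace = TRUE)
  any(duplicated(birthdays))
}

# Proportion of 5000 simulated classes of 30 with a shared birthday.
p_hat <- mean(replicate(5000, shared_birthday(30)))
print(p_hat)
```

The running frequency typically fluctuates widely over the first few tosses and then settles close to 0.5, which is the stability the text refers to. The birthday estimate works the same way: the proportion of simulated classes with a match serves as an experimental 'measurement' of the probability.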