Introduction to the tidyverse

The pipe operator, %>%, greatly simplifies statistical programming in R. The tidyverse is a collection of R packages that are designed to work with the pipe to produce concise, readable and efficient R code. The tidyverse is particularly suited to data analysis and modelling. This course will introduce the tidyverse by taking a dataset and processing it, from reading the data through to the final report. The course will emphasise the three most important tidyverse packages: dplyr, a package for data manipulation; ggplot2, a package for graphics; and rmarkdown, a package for report production. Time will be set aside for course participants to run their own analysis of a similar dataset. The course will end with a brief overview of the more advanced tidyverse packages and their associated programming styles, including the map functions from the purrr package. The course is suitable for anyone who has a basic knowledge of R and RStudio and who wants to modernise and extend their skills. If possible, bring along your own laptop with R and RStudio installed.

John is a biostatistician with an interest in epidemiology and the analysis of genetic studies. He was Professor of Genetic Epidemiology at the University of Leicester until he retired just before COVID struck. He now has an Emeritus post at the University. Over the last decade, John experimented with a range of ways of teaching R before settling on the tidyverse and the problem-based approach that will be used for this short course. If you want to know more about John's approach to teaching R, he has a series of blog posts on the subject that start here.