Additional Resources

Computational Education and Skills Training

Textbooks

The Epidemiologist R Handbook, from Applied Epi

A free digital reference book for R code that is specifically designed for people working in applied epidemiology, public health practice, and disease control. Its objective is to serve as a quick R code reference manual (online and offline) with task-centered examples that address common epidemiological problems.

 

R for Data Science, by Hadley Wickham and Garrett Grolemund

This book teaches how to do data science with R, including how to import data into R, load it into the most useful structure, transform it, visualize it, and model it. It also includes an introduction to the grammar of graphics, literate programming, and best practices for reproducible research.

 

R Graphics Cookbook, by Winston Chang

A practical guide that provides more than 150 recipes for generating high-quality graphs quickly, without having to comb through all the details of R’s graphics systems.

 

R Workflow, by Frank Harrell

This book covers workflows for producing reproducible research reports. It is a general primer for using R and Quarto, with many examples of code, output, and interactive graphics.

 

Mastering Shiny, by Hadley Wickham

A primer for going from knowing nothing about Shiny to developing complex apps using Shiny’s reactive programming model.

 

Bookdown.org by Posit (formerly RStudio)

The website bookdown.org is a Posit Connect (formerly RStudio Connect) server provided by Posit to host books. It is free to publish the static output files of your book, and you hold the full copyright of your own books. Many of the books hosted there cover topics implemented in R.

 

Modern Statistics for Modern Biology, by Susan Holmes and Wolfgang Huber

Statistical analysis of biological high-throughput data with a focus on computational functionality in R and Bioconductor. Covers fundamental concepts of statistical analysis such as important probability distributions, generative models, and hypothesis testing with applications to real data. Provides statistical theory as well as numerous code examples for various analysis steps, including normalization, clustering, dimensionality reduction, and differential expression analysis.

 

Orchestrating Single-Cell Analysis with Bioconductor, by Robert Amezquita, Aaron Lun, Stephanie Hicks, and Raphael Gottardo

Comprehensive introduction to using the Bioconductor ecosystem for single-cell RNA-seq analysis. Includes walk-throughs for the steps of typical analysis workflows using real-world example datasets, covering computational methods, standards for data representation and manipulation, and interactive data visualization tools.

 

 

Where Can I Get Additional Help?

 

HMS Quad-based students, postdocs, research staff, and faculty are welcome to reach out to the CCB for help. If you are having an issue with a particular pipeline, it is worth checking whether it has an existing user community. General platforms like Stack Overflow can be good for coding questions but can be difficult to navigate. Some specific active community forums that may be helpful are listed below.

Additionally, many individual computational packages and platforms have their own Q&A forums, such as the Scanpy community (https://scanpy.discourse.group/).

Please note that it is useful to check whether someone has already asked your question before posting, because an existing thread may already have the answer you need.