Summer Institute in Statistical Genetics, and Statistics and Modeling of Infectious Diseases
Module 3: Introduction to R
Instructors: Ken Rice and Ting Ye

This page will feature slides from our sessions, recordings of the various sessions, exercises for you to complete, and their solutions (all to follow). Prior to the module, please install up-to-date versions of R and RStudio on the computer you will use during the summer institute. Both are free.

Slides and exercises

Script files are posted following each session; these will contain our R code for the exercises. To make them work on your computer, remember to modify file names and locations appropriately. Also note that many different 'correct' solutions are possible. All times/dates are Pacific, i.e. Seattle time.
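A minimal sketch of what modifying file names and locations can look like in practice. The folder and file name here are placeholders (a temporary folder stands in for wherever you saved the course files), not the names used in the posted scripts.

```r
# Where is R currently looking for files?
getwd()

# Placeholder folder: replace with the folder where you saved the data,
# e.g. "C:/Users/yourname/Documents/SISG" (forward slashes work on Windows too)
data_dir <- tempdir()

# Write a tiny made-up file so that this sketch runs as-is
write.csv(data.frame(x = 1:3, y = c(2.5, 3.1, 4.8)),
          file.path(data_dir, "my_data.csv"), row.names = FALSE)

# Build the full path from folder + file name, and check it before reading
data_path <- file.path(data_dir, "my_data.csv")
if (!file.exists(data_path)) stop("File not found: ", data_path)

my_data <- read.csv(data_path)
head(my_data)
```

If a script stops with a "cannot open file" error, comparing the output of getwd() with the path used in the script is usually the quickest check.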

The module has 10 sessions, each of 80 minutes. The basic format for a session is:

  • 45 minutes of lecture material. Lectures will be recorded and made available as soon as possible. Recordings from 2021 are also available, if you want to watch in advance.
  • 25 minutes of exercises for you to try, with small-group "breakout" Zoom sessions available, attended by other class participants and Teaching Assistants. These are not recorded.
  • A 10-minute discussion of the exercises, in which the instructors will present possible solutions and answer questions. These discussions will be recorded.

Please join the module's Slack channel, where you can ask questions and see real-time updates from the instructors and TAs.

Zoom room for all sessions.

Monday, July 9th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 1. Introductions, reading in data | Slides [.pdf], Code [.R], Zoom (2022), starts 0:27:00 | Exercises [.docx, .pdf], Key: [.R], Zoom (2022), starts 1:24:00
9:40am-11:00am | 2. More data summary and using functions | Slides [.pdf], Code [.R], Zoom (2022), starts 1:31:30 | Exercises [.docx, .pdf], Key: [.R], Zoom (2022), starts 2:19:22
11:30am-12:50pm | 3. Plotting functions, and formulas | Slides [.pdf], Code [.R], Zoom (2022), starts 2:29:20 | Exercises [.docx, .pdf], Key: [.R], Zoom (2022), starts 3:13:45
1:10pm-2:30pm | 4. Adding features to plots | Slides [.pdf], Code [.R], Zoom (2022), starts 3:28:30 | Exercises [.docx, .pdf], Key: [.R], Zoom (2022), starts 3:57:02
Tuesday, July 10th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 5. Over and over (i.e. loops) | Slides [.pdf], Code [.R], Zoom (2022), starts 0:14:25 | Exercises [.docx, .pdf], Key: [.R], Zoom, starts 1:34:44
9:40am-11:00am | 6. More loops, Control Structures, and Bootstrapping | Slides [.pdf], Code [.R], Zoom (2022) | Exercises [.docx, .pdf], Key: [.R], Zoom
11:30am-12:50pm | 7. Fitting models | Slides [.pdf], Code [.R], Zoom (2022) | Exercises [.docx, .pdf], Key: [.R], Zoom
1:10pm-2:30pm | 8. Introduction to R packages | Slides [.pdf], Code [.R], Zoom (2022) | Exercises [.docx, .pdf], Key: [.R], Zoom
Wednesday, July 11th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 9. Writing your own functions | Slides [.pdf], Code [.R], Zoom (2022) | Exercises [.docx, .pdf], Key: [.R], Zoom
9:40am-11:00am | 10. The End! | Slides [.pdf], resources below, Zoom-whole session (2022) |

Some special material for Session 10:

Comment forms and attendance certificates are available here (requires a SISG account).


Datasets - in alphabetical order

Before trying to read data into your R session, we recommend looking at it first, in a text editor. Is the data comma- or tab-delimited? Does it have a 'header' row containing variable names?
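The same check can also be done from within R. As a minimal sketch, the code below writes a small made-up tab-delimited file to a temporary location, prints its first raw lines with readLines(), and then reads it in with read.table(). The file name, variable names, and values are invented for illustration and are not one of the course datasets.

```r
# Create a small example file (hypothetical data, for illustration only)
example_file <- file.path(tempdir(), "example_data.txt")
writeLines(c("id\tage\theight",
             "1\t34\t172.5",
             "2\t51\t160.2",
             "3\t28\t181.0"),
           example_file)

# Look at the first few raw lines before reading anything in:
# is there a header row? what separates the values?
first_lines <- readLines(example_file, n = 3)
print(first_lines)

# A crude guess at the delimiter: tab if the first line contains one,
# otherwise comma (this simple rule is an assumption of the sketch)
sep <- if (grepl("\t", first_lines[1], fixed = TRUE)) "\t" else ","

# Read the file, telling R that the first row holds variable names
dat <- read.table(example_file, header = TRUE, sep = sep)
str(dat)
```

For files without a header row, header = FALSE would be used instead; read.csv() and read.delim() are convenience versions of read.table() with comma and tab separators as their defaults.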


Other resources

Some recommended books: