Module 14: Association Mapping: GWAS and Sequencing Data
Instructors: Timothy Thornton and Michael Wu
This page will feature slides, exercises, some solutions, and video recordings (all to follow).
Install R and RStudio, and download PLINK
Prior to the module, please install up-to-date versions of R (Version 4.0.2), RStudio, and PLINK on the laptop you will use during the summer institute. All three are free.
R Packages for Module 14
The following R packages from CRAN will be used and should be installed prior to the module:
- qqman
- SKAT
The R commands below can be used to install the two CRAN R packages:
install.packages("qqman")
install.packages("SKAT")
The following R packages from Bioconductor will be used and should be installed prior to the module:
- GWASTools
- gdsfmt
- SNPRelate
- GENESIS
The R commands below can be used to install the R packages from Bioconductor with recent versions of R (Version 3.6 or later):
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install(c("GWASTools", "gdsfmt","SNPRelate","GENESIS"))
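Once the installation commands above have finished, it can help to confirm that every package is actually available before the first session. The following is a minimal sketch, not part of the official course material; it only checks the six packages named on this page and does not install anything.

```r
# Packages named on this page (two from CRAN, four from Bioconductor).
pkgs <- c("qqman", "SKAT", "GWASTools", "gdsfmt", "SNPRelate", "GENESIS")

# requireNamespace() returns TRUE if a package can be loaded, FALSE otherwise.
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# Report anything that still needs to be installed.
not_installed <- names(installed)[!installed]
if (length(not_installed) > 0) {
  message("Still to install: ", paste(not_installed, collapse = ", "))
} else {
  message("All Module 14 packages are available.")
}
```

Any package reported as missing can be installed again with the corresponding command above.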
Session Format
The module has 10 sessions of 80 minutes each. The standard format for a session is approximately:
- 40 minutes of lecture material, recorded live via Zoom and posted at the end of the day
- 25 minutes of exercises for you to try, with small-group Zoom "breakout" sessions available, attended by other class participants and Teaching Assistants
- 15 minutes of discussion of the exercises, during which the instructors will present possible solutions and answer questions
Please join the module's Slack channel, where you can ask questions and see real-time updates from the instructors and TAs.
Schedule, Slides, Exercises, and Video Recordings:
For each exercise performed in R or PLINK, script files will be posted. To make them work on your computer, remember to modify file names and locations appropriately. Also note that many different 'correct' solutions are possible.
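One way to make a posted script easier to adapt is to define the data location once at the top and build every file name from it. The sketch below assumes a hypothetical folder (`~/SISG/Module14/data`) and a hypothetical file name (`example_phenotypes.txt`); neither comes from the course material, so substitute the location and names you actually use.

```r
# Hypothetical folder holding the downloaded course data; change this
# one line to match your own computer.
data_dir <- file.path("~", "SISG", "Module14", "data")

# Hypothetical file name, built from the folder above.
pheno_file <- file.path(data_dir, "example_phenotypes.txt")

# Check that the file is where the script expects it before reading.
if (file.exists(pheno_file)) {
  pheno <- read.table(pheno_file, header = TRUE)
} else {
  message("File not found: ", pheno_file,
          " (edit data_dir or the file name to match your setup)")
}
```

With this layout, moving the data to a different folder only requires editing `data_dir`.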
Video recordings of the lectures and exercises will be posted at the end of each day.
All times listed below for the schedule are Pacific Daylight Time (PDT).
Monday, July 27th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 1. Introduction, Case Control Association Testing | Slides: (Intro) [.pdf], (Lecture) [.pdf], video | Exercises: [.pdf], video, R Script: [.R], Key: [.html], [.Rmd]
9:40am-11:00am | 2. Association Testing with Quantitative Traits | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.R], Key: [.html], [.Rmd]
11:30am-12:50pm | 3. Introduction to the PLINK Software for GWAS | Slides: [.pdf], video | Exercises: [.pdf], video, PLINK Script: [.txt], R Script: [.R], Key (R Script): [.R]
1:10pm-2:30pm | 4. Gene and Pathway Level Analysis of Genetic Association Studies | Slides: [.pdf], video | Exercises: [.pdf], video, PLINK and R Script: [.txt]
Tuesday, July 28th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 5. Population Structure Inference | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.R], Key: [.html], [.Rmd]
9:40am-11:00am | 6. GWAS in Samples with Structure | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.R]
11:30am-12:50pm | 7. Interaction Analysis | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.txt]
1:10pm-2:30pm | 8. Introduction to Rare Variant Analysis and Collapsing Tests | Slides: [.pdf], video | Exercises: [.pdf], video, Key: R Script: [.txt]
Wednesday, July 29th
Time | Topic | Lecture | Exercises/Discussion
8:00am-9:20am | 9. Rare Variant Analysis: Kernel (Variance Component) Tests and Omnibus Tests | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.txt]
9:40am-11:00am | 10. Power and Sample Size, Design Considerations, and Emerging Issues | Slides: [.pdf], video | Exercises: [.pdf], video, R Script: [.txt]
Datasets
All individual data files below can be downloaded as a single zipped folder from Dropbox:
SISG2021Data.zip
Alternatively, you can download each of the data files below.
Before trying to read data into an R or PLINK session, we recommend looking at it first in a text editor. Is the data comma- or tab-delimited? Does it have a 'header' row containing variable names?
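The same check can also be done from within R by printing the first few raw lines of a file before parsing it. The sketch below is illustrative only: it writes a small made-up tab-delimited file to a temporary location so that it runs on its own, and the column names and values are invented rather than taken from the course datasets.

```r
# Made-up example file so the sketch is self-contained; replace `path`
# with the location of a real data file.
path <- tempfile(fileext = ".txt")
writeLines(c("id\ttrait\tsnp1", "1\t0.52\t0", "2\t1.10\t2"), path)

# Look at the first few raw lines before parsing anything.
first_lines <- readLines(path, n = 5)
print(first_lines)

# Count how often a single character occurs in a string.
count_char <- function(x, ch) {
  lengths(regmatches(x, gregexpr(ch, x, fixed = TRUE)))
}

# Guess the delimiter from the first line: tabs versus commas.
n_tabs   <- count_char(first_lines[1], "\t")
n_commas <- count_char(first_lines[1], ",")
sep <- if (n_tabs >= n_commas) "\t" else ","

# header = TRUE assumes the first row holds variable names, as it does
# in this made-up file; check the printed lines before relying on it.
dat <- read.table(path, header = TRUE, sep = sep)
str(dat)
```

The delimiter guess is deliberately crude. For files delimited by spaces or other characters, inspect the printed lines and set `sep` by hand.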
Other Resources