Summer Institute in Statistical Genetics
Module 14: Elements of R for Genetics & Bioinformatics
Instructors: Thomas Lumley and Ken Rice

This page will feature slides from our sessions, exercises for you to complete, and their solutions (all to follow). Prior to the module, please install an up-to-date version of R on the laptop you will use during the summer institute. R is free, and is available from this site.

To download and install Bioconductor on your laptop, first connect to the internet. Then open an R session and enter the following:

source("http://bioconductor.org/biocLite.R")
biocLite()

Once this is done, you can download further Bioconductor packages (for example the hexbin package) with the following commands:

source("http://bioconductor.org/biocLite.R")
biocLite("hexbin")
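
The biocLite.R script above dates from older Bioconductor releases. On a recent R installation, the same steps might be sketched as follows, assuming the BiocManager package from CRAN is the installer in use. The helper name `install_bioc` is ours, for illustration only, and is not part of the course materials.

```r
# Hypothetical helper (not from the course materials): install Bioconductor
# packages through BiocManager, the installer used by later Bioconductor releases.
install_bioc <- function(pkgs = character()) {
  # Get the BiocManager package from CRAN if it is not already installed
  if (!requireNamespace("BiocManager", quietly = TRUE)) {
    install.packages("BiocManager")
  }
  # With no package names this sets up Bioconductor and offers to update
  # installed packages; otherwise it installs the named packages
  BiocManager::install(pkgs)
}

# Example calls (these need an internet connection), left commented out:
# install_bioc()
# install_bioc("hexbin")
```

Wrapping the steps in a function means nothing is downloaded until you call it.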


Slides and exercises

Script files are posted following each session; these will contain our R code for the exercises. To make them work on your computer, remember to modify file names and locations appropriately. Also note that many different 'correct' solutions are possible.
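
As a rough illustration of adjusting file names and locations, the sketch below builds a path to one of the datasets listed further down this page and checks that the file exists before reading it. The folder `~/sisg/data` is a made-up location; substitute wherever you saved the files. Reading bpdata.csv with `read.csv()` assumes it is comma-delimited with a header row.

```r
# Hypothetical location: replace with the folder where you saved the datasets
data_dir <- file.path("~", "sisg", "data")

# file.path() joins the pieces with a separator that R accepts on any platform
bp_file <- file.path(data_dir, "bpdata.csv")

# Check the file is where you think it is before trying to read it
if (file.exists(bp_file)) {
  bpdata <- read.csv(bp_file)  # assumes comma-delimited data with a header row
  str(bpdata)
} else {
  message("File not found: ", bp_file,
          "\nCurrent working directory: ", getwd())
}
```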

Session 1; Introductions, reading in data (exercises) (R script file)

Session 2; Learning to Draw (exercises) (R script file)

Session 3; Data Manipulation (exercises) (R script file)

Session 4; Model Fitting (exercises) (R script file)

Session 5; Permutation Tests and Debugging (exercises) (R script file)

Session 6; High-throughput Work, and Writing Loops (code for timing and speedup) (exercises) (R script file)

Special Exercise: This is a more in-depth programming problem, for you to try on Thursday night; we'll discuss it in the final session.

Session 7; Handling Large Datasets (exercises) (R script file)

Session 8; Bioconductor #1 (exercises) (R script file)

Session 9; Bioconductor #2 (exercises) (R script file)

Session 10; Special Exercise review (R script file), interfacing R to other software (R2WinBUGS example) (SVG+ example) (Code for the Google Maps example - unfortunately we can't post the data)

For easier searching, here are all the slides in one document (PDF).


Datasets - in alphabetical order

Before trying to read data into your R session, we recommend looking at it first, in a text editor. Is the data comma- or tab-delimited? Does it have a 'header' row containing variable names?
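
The same check can also be done from inside R. The sketch below writes a small made-up tab-delimited file (not one of the course datasets), looks at its first raw lines, and then chooses the delimiter and header setting accordingly.

```r
# Write a tiny tab-delimited file so the example is self-contained;
# the contents are invented and do not come from the course datasets
tmp <- tempfile(fileext = ".txt")
writeLines(c("id\tsnp1\tsnp2", "1\tAA\tCT", "2\tAG\tCC"), tmp)

# Look at the first few raw lines, much as you would in a text editor
first_lines <- readLines(tmp, n = 3)
print(first_lines)

# Decide on the delimiter from the first line: tab or comma?
is_tab <- grepl("\t", first_lines[1], fixed = TRUE)
sep <- if (is_tab) "\t" else ","

# header = TRUE because the first row holds the variable names
snps <- read.table(tmp, header = TRUE, sep = sep, stringsAsFactors = FALSE)
str(snps)
```

For real files, `read.csv()` and `read.delim()` are convenient wrappers around `read.table()` with comma and tab as their respective default separators.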

AMDchrom1snpStats.Rdata -- see session 8
annt.txt
bpdata.csv
data.vsn.csv
example-pheno.txt  
example-pheno.csv 
example-snp.txt
foursnps.csv
foursnps.txt
genepi.txt
justsnps.txt
niehs.csv
psa.txt
ribogreen.rda
salary.dat
sampleinfo.csv
sisg.nc -- see session 7
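
Files ending in .Rdata or .rda are saved R workspaces rather than text, so they are read with `load()` instead of `read.table()`. A minimal sketch, using a made-up object in a temporary file rather than one of the datasets above:

```r
# Save an invented object to a temporary .Rdata file so the example runs anywhere
tmp <- tempfile(fileext = ".Rdata")
example_scores <- c(a = 1.5, b = 2.5)
save(example_scores, file = tmp)
rm(example_scores)

# load() restores the saved objects under their original names and
# invisibly returns those names, so you can see what was loaded
loaded_names <- load(tmp)
print(loaded_names)
print(example_scores)
```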


Other resources

Some recommended books: