This week's lab will cover two programs.
Note that some write-up is requested for each program.
If using the Biostat Linux computers:
Remember that unless you have added the line
source ~statgen/.statgen.cshrc
to your own .cshrc (or equivalent) file,
you will need to give this command each
time you log on, in order to access the statgen programs.
Alternatively, you may wish to try the more modern
fastPHASE, which is available through
Matthew Stephens' page at Chicago.
Here are the full instructions for the PHASE software, if you need
them or are interested in understanding the input format, etc.
(The PHASE instructions are still here as of April 25, 2012.)
Here are the data on 6 individuals at 5 SNP loci that you applied Clark's algorithm to in Homework 6. Each row is one individual.
10, 10, 10, 10, 10
00, 00, 00, 11, 10
00, 00, 00, 10, 00
10, 10, 10, 11, 00
11, 11, 11, 11, 00
00, 00, 00, 10, 11
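Before running PHASE, it can help to see which individuals are ambiguous. The following is only a rough sketch in Python, not part of the lab software. It assumes that each two-character entry in the table above lists the two (unordered) alleles at one locus, so that "10" is a heterozygote and "00" or "11" is a homozygote. The function names are made up for this illustration. An individual with k heterozygous loci is compatible with 2^(k-1) haplotype pairs (one pair if k = 0), and individuals with at most one heterozygous locus are the unambiguous ones from which Clark's algorithm starts.

```python
# The six genotype rows from the table, one list per individual.
# Assumption: each entry gives the two unordered alleles at that locus.
GENOTYPES = [
    ["10", "10", "10", "10", "10"],
    ["00", "00", "00", "11", "10"],
    ["00", "00", "00", "10", "00"],
    ["10", "10", "10", "11", "00"],
    ["11", "11", "11", "11", "00"],
    ["00", "00", "00", "10", "11"],
]


def het_count(row):
    """Number of heterozygous loci in one individual's genotype row."""
    return sum(1 for g in row if g[0] != g[1])


def n_haplotype_pairs(row):
    """Number of distinct haplotype pairs compatible with the genotypes."""
    k = het_count(row)
    return 1 if k == 0 else 2 ** (k - 1)


def resolved_haplotypes(row):
    """Return the haplotype pair if it is unambiguous, otherwise None."""
    if het_count(row) > 1:
        return None
    # With at most one heterozygous locus, taking the first allele of every
    # locus for one haplotype and the second for the other is the only option.
    return ("".join(g[0] for g in row), "".join(g[1] for g in row))


for i, row in enumerate(GENOTYPES, start=1):
    print(i, het_count(row), n_haplotype_pairs(row), resolved_haplotypes(row))
```

The printed columns are: individual number, number of heterozygous loci, number of compatible haplotype pairs, and the resolved pair (or None when the individual is ambiguous).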
To run PHASE:
Write a couple of brief paragraphs explaining the output, and comparing your estimates with the estimates you got from Clark's algorithm in your Homework 6.
The data file ibd_test1.markers
gives marker genotypes at 2000 very closely linked SNPs for 8 individuals.
The parameter file ibd_haplo.par
specifies the input data file and gives a number of other parameters
that are used in running ibd_haplo.
Download these two files to wherever you are running MORGAN programs.
Check that your system knows where to find the program by saying
% which ibd_haplo
You may now run ibd_haplo by saying
% ibd_haplo ibd_haplo.par > haplo_pair.out
You will have some descriptive but not very useful output in the file
haplo_pair.out. The useful output is in the file
qibd_lab3.out (this file name was specified in the parameter file).
This file contains about 16000 lines, one for each of the 2000 markers
for each of the 8 individuals.
Each line, apart from the header lines, consists of 4 numbers:
Marker number, Marker position (in Mbp), ibd-probability, non-ibd probability
Find some segments of inferred ibd between the two chromosomes of any
of these 8 individuals -- that is, high probabilities in the third column of the
output. You should be able to find at least 3 segments in total.
Write a brief paragraph, specifying
the individual, and the marker numbers/positions of these segments.
Hint: I did it quite crudely, by searching for the pattern
of three spaces followed by 0.9, and then I also tried
three spaces followed by 0.8 to find other less certain segments.
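If you prefer something less crude than a text search, the same idea can be sketched in Python. This is only an illustration under several assumptions that you should check against your own output file. It assumes that each data line holds four whitespace-separated numbers in the order given above, that header lines can be recognized because they do not parse as four numbers, and that the marker number starting over again signals the next individual. The function name find_ibd_segments and the block index it reports are made up here; they are not part of ibd_haplo.

```python
def find_ibd_segments(lines, threshold=0.9):
    """Find runs of consecutive markers with ibd probability >= threshold.

    Returns tuples (block, first_marker, last_marker, first_pos, last_pos).
    'block' counts how often the marker number has restarted, which is
    ASSUMED here to correspond to moving on to the next individual.
    """
    segments = []
    block = 0
    prev_marker = None
    current = None  # the run that is currently open, if any
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # not a data line
        try:
            marker = int(fields[0])
            pos, p_ibd = float(fields[1]), float(fields[2])
        except ValueError:
            continue  # header or other non-numeric line
        if prev_marker is not None and marker <= prev_marker:
            # Marker numbering restarted: close any open run, start new block.
            if current is not None:
                segments.append(tuple(current))
                current = None
            block += 1
        prev_marker = marker
        if p_ibd >= threshold:
            if current is None:
                current = [block, marker, marker, pos, pos]
            else:
                current[2], current[4] = marker, pos
        elif current is not None:
            segments.append(tuple(current))
            current = None
    if current is not None:
        segments.append(tuple(current))
    return segments


# Possible use on the real output file (file name taken from the lab text):
# with open("qibd_lab3.out") as f:
#     for seg in find_ibd_segments(f, threshold=0.9):
#         print(seg)
```

Rerunning with threshold=0.8 corresponds to the second, less certain search in the hint.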
ibd_haplo is still not very user-friendly; it is a work in progress.