This week's lab will cover two programs.
Note that some write-up is requested for each program.
The program PHASE is due to Matthew Stephens, and produces
estimates of haplotypes of individuals given their genotypes at multiple loci.
It uses a model for similarities among haplotypes in a population, and samples
phasings of the genotypic data under this model.
PHASE is installed on the Biostat computers.
Alternatively, a better version of PHASE, or fastPHASE, for either
Linux or Windows can be downloaded from
Matthew Stephens' old UW download page. (You may be able to find a
more recent source, but this was the clearest I found.)
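To make the idea of a "phasing" concrete, here is a minimal Python sketch that lists every pair of haplotypes consistent with one multilocus genotype. It is not what PHASE does internally (PHASE samples phasings under a population model). It only enumerates the possibilities that such a method chooses among. The function name phasings is hypothetical, and the sketch assumes each genotype is written as one two-character entry per locus (for example "10" for a heterozygote), with the order of the two alleles carrying no phase information.

```python
from itertools import product


def phasings(genotype):
    """Return all unordered haplotype pairs consistent with a genotype.

    genotype: list of two-character strings, one per locus, e.g. ["10", "11"].
    Assumption: the two characters are the two alleles, in no particular order.
    """
    # Loci where the two alleles differ are the only ones with unknown phase.
    het = [k for k, g in enumerate(genotype) if g[0] != g[1]]
    base = [g[0] for g in genotype]
    if not het:
        h = "".join(base)
        return [(h, h)]

    first, rest = het[0], het[1:]
    pairs = []
    # Fix the orientation at the first heterozygous locus so that each
    # unordered pair of haplotypes is listed exactly once.
    for choice in product((0, 1), repeat=len(rest)):
        h1, h2 = list(base), list(base)
        h1[first], h2[first] = genotype[first][0], genotype[first][1]
        for locus, flip in zip(rest, choice):
            a, b = genotype[locus]
            h1[locus], h2[locus] = (b, a) if flip else (a, b)
        pairs.append(("".join(h1), "".join(h2)))
    return pairs


if __name__ == "__main__":
    for h1, h2 in phasings(["10", "11", "00", "10", "10"]):
        print(h1, h2)
```

A genotype with k heterozygous loci has 2^(k-1) possible phasings when k is at least 1, so only individuals heterozygous at two or more loci are ambiguous.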
The second program is a MORGAN program, kin, which computes the kinship coefficient between specified pairs of individuals, and the inbreeding coefficient of specified individuals.
If using the Biostat Linux computers:
Remember that unless you have put the line
source ~statgen/.statgen.cshrc
into your own .cshrc (or equivalent)
file, you will need to give this command each
time you log on in order to access the statgen programs.
10, 10, 10, 10, 10 |
00, 00, 00, 11, 10 |
00, 00, 00, 10, 00 |
10, 10, 10, 11, 00 |
11, 11, 11, 11, 00 |
00, 00, 00, 10, 11 |
A2) Use the PHASE program
to estimate the haplotype
frequencies and haplotypes for each individual in the problem above.
The data set in the file phase.inp is the same data as above,
but in Matthew Stephens' format!
Compare
your estimates with the estimates you get from Clark's algorithm.
To run PHASE:
Here are the full instructions for the PHASE software, if you need
them or are interested in understanding the input format, etc.
However, if you have problems, the first thing to do is to email me!
(The PHASE instructions are still here as of May 1, 2009, but
may go now that Matthew Stephens has left UW.
They will still be somewhere on the web, so try Google if interested.
Also, if you are into this area, you may wish to try the more modern
fastPHASE, which is available through
Matthew Stephens' page at Chicago.)
To run the program kin on the example file, type:
% kin jv_rep_kin.par
or, if you would like to send your output to a file, such as
kin.out, type
% kin jv_rep_kin.par > kin.out
or, if you decided to call the pedigree file something different,
such as my-pedfile, type
% kin kin.par ped my-pedfile
(The ped key on the command line overrides whatever is in the
parameter file.)
Look at your output, or output file. It should be self-explanatory.
Practice modifying the jv_rep_kin.par file to calculate an inbreeding coefficient or kinship coefficient for another individual or pair of individuals. Apparently, kin treats a request for the kinship of an individual with itself as an error: I need to fix this, but it is not yet fixed!
Also note that for your single-component pedigrees you just leave out the component 1 bit of the statement. You can ask for any number of pairs for kinship in one statement (as the two pairs in the example), and any number of individuals for inbreeding (as the two in the example). Your parameter file should have just one kinship request statement and just one inbreeding request statement (or one of each per component, if you have more than one component).
Your output should include the following:
(i) Calculations of the kinship coefficients between at least two pairs of individuals in your pedigree. One of these pairs should be a pair of bilateral relatives other than siblings.
(ii) Calculations of the inbreeding coefficients for at least two individuals in your pedigree, at least one of whom is inbred.
Turn in a sheet of paper with your results, with a brief explanation of the relationships between the individuals you have chosen (for kinship) or between their parents (for inbreeding).
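If you want to check the kin output by hand, the standard recursive definition of kinship can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method kin itself uses. The pedigree below is hypothetical (a first-cousin marriage, with made-up individual numbers), and the names PEDIGREE, ORDER, kinship and inbreeding are invented for this sketch. Parents must be listed before their children.

```python
from functools import lru_cache

# Hypothetical pedigree: individual -> (father, mother).
# None marks the unknown parents of a founder.
# Assumption: every parent appears earlier in the dict than its children.
PEDIGREE = {
    1: (None, None),
    2: (None, None),
    3: (1, 2),
    4: (1, 2),
    5: (None, None),
    6: (None, None),
    7: (3, 5),
    8: (6, 4),
    9: (7, 8),  # offspring of first cousins 7 and 8
}
ORDER = {ind: k for k, ind in enumerate(PEDIGREE)}


@lru_cache(maxsize=None)
def kinship(i, j):
    """Kinship coefficient between i and j (founders assumed unrelated)."""
    if i is None or j is None:
        return 0.0
    if i == j:
        father, mother = PEDIGREE[i]
        return 0.5 * (1.0 + kinship(father, mother))
    # Always expand the individual listed later, so that we never
    # expand an ancestor of the other individual.
    if ORDER[i] < ORDER[j]:
        i, j = j, i
    father, mother = PEDIGREE[i]
    return 0.5 * (kinship(father, j) + kinship(mother, j))


def inbreeding(i):
    """Inbreeding coefficient of i: the kinship between i's parents."""
    father, mother = PEDIGREE[i]
    return kinship(father, mother)


if __name__ == "__main__":
    print("kinship(3, 4)  =", kinship(3, 4))   # full sibs
    print("kinship(7, 8)  =", kinship(7, 8))   # first cousins
    print("inbreeding(9)  =", inbreeding(9))   # child of first cousins
```

In this example the full sibs 3 and 4 have kinship 1/4, the first cousins 7 and 8 have kinship 1/16, and so their child 9 has inbreeding coefficient 1/16. This shows why the write-up asks about the relationship between the parents when you report an inbreeding coefficient.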