Date: Nov. 12, 2009
The programs and scripts in this directory compute k-statistics
{k_0, k_1, k_2} and condensed identity coefficients for pairs of
individuals, based on genome-wide marker data, but without reference
to known or presumed pedigree structure. A fuller description
is given in the following paper:
Choi Y, Wijsman EM, Weir BS (2009) Case-control association testing
in the presence of unknown relationships. Genetic Epidemiology (in press)
PMID: 19333967, Online March 30, 2009.
To use the programs and scripts here, you will need to have Perl
installed on your computer, and for the SNP versions, a C compiler.
All instructions assume you are working in a Linux environment.
There are two subdirectories here, and three sets of programs.
1. The directory "snp" contains C code to compute k-coefficients with
the program kstat, and condensed identity coefficients with the program
ibd_d9. Both programs assume that the markers in the files are diallelic
(SNP) markers.
2. The directory "str" contains Perl scripts to compute k-coefficients and
allele frequencies for STR (multiallelic) markers with a different version of
the program kstat.
3. In the top level directory, there is a Perl script, kinship.pl, which
computes kinship coefficients from the estimated k-coefficients obtained
for either STR or SNP markers from one of the versions of kstat, or from
the condensed identity coefficients produced by the ibd_d9 program. This script
works on the output files from any of these three programs, and produces
a file that has, on each record, a pair of IDs followed by the estimated kinship
coefficient for that pair of individuals. The script is very simple,
and you can equally well use R or another tool to perform the equivalent
computations.
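
For readers who prefer to do the equivalent computation outside of Perl,
the following is a minimal Python sketch of the standard textbook relations
between kinship and the two kinds of coefficients: for a non-inbred pair,
kinship = k_1/4 + k_2/2, and in terms of the nine condensed identity
coefficients, kinship = D1 + (D3 + D5 + D7)/2 + D8/4. This is an illustration
only, not a description of what kinship.pl does internally. The function
names are hypothetical, the usual Jacquard ordering D1..D9 is assumed and
should be checked against the ibd_d9 output, and no file parsing is shown
because the output formats are not described here.

```python
def kinship_from_k(k0, k1, k2):
    """Kinship coefficient for a non-inbred pair from k-coefficients.

    k0, k1, k2 are the probabilities that the pair shares 0, 1, or 2
    alleles identical by descent. k0 does not enter the formula; it is
    accepted only so that a full (k0, k1, k2) triple can be passed in.
    """
    return k1 / 4.0 + k2 / 2.0


def kinship_from_condensed(delta):
    """Kinship coefficient from nine condensed identity coefficients.

    delta is a sequence (D1, ..., D9), ASSUMED to follow the usual
    Jacquard ordering; verify this against the ibd_d9 output first.
    """
    if len(delta) != 9:
        raise ValueError("expected nine condensed identity coefficients")
    d1, _d2, d3, _d4, d5, _d6, d7, d8, _d9 = delta
    return d1 + (d3 + d5 + d7) / 2.0 + d8 / 4.0


if __name__ == "__main__":
    # Theoretical values for non-inbred full sibs: k = (1/4, 1/2, 1/4).
    print(kinship_from_k(0.25, 0.5, 0.25))
    # The same pair expressed as condensed coefficients
    # (D7 = k2, D8 = k1, D9 = k0, all others zero).
    print(kinship_from_condensed([0, 0, 0, 0, 0, 0, 0.25, 0.5, 0.25]))
```

Both calls in the example describe the same full-sib relationship, so the
two functions should agree. Estimated coefficients from real data need not
sum exactly to 1, so the sketch does not enforce that constraint.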
To use the kinship.pl script, simply type:
./kinship.pl