Homework 4: A bit more computing; for discussion 10/29

Recopy the files in /user0/thompson/Class578C/Test, and rerun the make test_set_up program as before. Remember to do the make ultraclean command to clean out old stuff.
A little bit has been added to this program to give the inbreeding coefficients of all the inbred individuals of the JV pedigree.

Now make test_sim_ibd.
Be patient; it may take up to 30 seconds if the machine is quite busy.
This is a new program, in the same /user0/thompson/Class578C/Test directory. It uses the same pedigree set-up to simulate the descent of genes in pedigrees, by simulating those meiosis indicators we met in the very first class. In general it will simulate a set of n linked marker loci (labelled 1,2,...,n), and a trait locus (labelled 0) which could also be on the same chromosome. However, right now, with the input file it is using, it is simulating 6 independent (unlinked) loci, so you are just getting six independent replicates of the same simulation.
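To make "simulating meiosis indicators" concrete, here is a minimal Python sketch of dropping genes down a pedigree at a single locus. It is not the test_sim_ibd code, and it does not use the JV pedigree: the five-person pedigree (the child of a full-sib mating), the function names, and the convention that indicator 0 means "the parent's paternal gene" are all assumptions made for illustration.

```python
import random

# Hypothetical pedigree (NOT the JV pedigree): (individual, father, mother).
# Founders have no parents listed; parents always appear before their children.
PEDIGREE = [
    ("gf", None, None),
    ("gm", None, None),
    ("bro", "gf", "gm"),
    ("sis", "gf", "gm"),
    ("child", "bro", "sis"),
]


def drop_genes(pedigree, rng):
    """One realization: return {individual: (paternal gene, maternal gene)}."""
    genes = {}
    next_label = 0
    for ind, father, mother in pedigree:
        if father is None:
            # Each founder carries two distinct gene labels.
            genes[ind] = (next_label, next_label + 1)
            next_label += 2
        else:
            # Meiosis indicators (assumed convention): 0 copies the parent's
            # paternal gene, 1 copies the parent's maternal gene.
            s_pat = rng.randint(0, 1)
            s_mat = rng.randint(0, 1)
            genes[ind] = (genes[father][s_pat], genes[mother][s_mat])
    return genes


def estimate_inbreeding(pedigree, ind, n_real, seed=0):
    """Fraction of realizations in which ind's two genes are IBD."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_real):
        pat, mat = drop_genes(pedigree, rng)[ind]
        hits += pat == mat
    return hits / n_real


if __name__ == "__main__":
    print(estimate_inbreeding(PEDIGREE, "child", 20000))
```

For the child of a full-sib mating the exact inbreeding coefficient is 1/4, so the printed estimate should be close to 0.25. Two genes are IBD in a realization exactly when they carry the same founder label.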

This time the output is in the file out5 (just to be different). Type more out5 to look at your output. After all the set-up checks, which you can ignore, you are given a table of the estimated probabilities of the IBD patterns among the paternal and maternal genes of 531 (the person at the bottom of that pedigree) and of one of 531's parents (431).

                look here
 pt 1 label 0    1 1 1 1 :  0.0109 0.0120 0.0118 0.0119 0.0115 0.0111
 pt 2 label 1    1 1 1 2 :  0.0255 0.0256 0.0258 0.0255 0.0255 0.0254
 pt 3 label 3    1 1 2 1 :  0.0712 0.0730 0.0722 0.0729 0.0715 0.0731
 pt 6 label 6    1 2 1 1 :  0.0520 0.0513 0.0501 0.0503 0.0526 0.0504
 pt 7 label 7    1 2 1 2 :  0.0715 0.0712 0.0706 0.0726 0.0713 0.0725
 etc. etc.
The way to read this is by looking at the column marked "look here", which specifies which of the 4 genes of the two individuals are IBD: genes with the same digit are IBD. The first two digits are for 531 (pat, mat), and the second two are for 431 (pat, mat). Each of the six columns of numbers after the colon is the estimate from one of the six simulated loci.
So 531 has two IBD genes if the first two digits are the same,
and 431 has two IBD genes if the second pair of digits are the same.
So now find your best estimate of the inbreeding coefficient of 531, and of 431. How well do these agree with the exact values from the new test_set_up program? How could you use these numbers (all the numbers, not just the inbreeding coeffs) to get an estimate of the kinship coefficient of 531 and 431? What would this kinship coefficient be if 431 and 531 were not inbred?
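The reading rule for the "look here" column can be sketched in Python as follows. The row layout is assumed from the five excerpted rows above (pattern digits just before the colon, one probability per locus after it); the function names are made up for this sketch. Because only five rows of the table are included here, the totals printed are partial sums, not inbreeding estimates. You would paste in the whole table from your own out5.

```python
# Rows copied from the excerpt above; the full table in out5 has more rows.
EXCERPT = """\
 pt 1 label 0    1 1 1 1 :  0.0109 0.0120 0.0118 0.0119 0.0115 0.0111
 pt 2 label 1    1 1 1 2 :  0.0255 0.0256 0.0258 0.0255 0.0255 0.0254
 pt 3 label 3    1 1 2 1 :  0.0712 0.0730 0.0722 0.0729 0.0715 0.0731
 pt 6 label 6    1 2 1 1 :  0.0520 0.0513 0.0501 0.0503 0.0526 0.0504
 pt 7 label 7    1 2 1 2 :  0.0715 0.0712 0.0706 0.0726 0.0713 0.0725
"""


def parse_row(line):
    """Split one table row into (pattern, per-locus probabilities)."""
    left, right = line.split(":")
    pattern = tuple(int(tok) for tok in left.split()[-4:])
    probs = [float(tok) for tok in right.split()]
    return pattern, probs


def parse_table(text):
    """Parse every line that contains a colon; skip anything else."""
    return [parse_row(line) for line in text.splitlines() if ":" in line]


def own_genes_ibd(pattern, person):
    """person 0 is 531 (first two digits); person 1 is 431 (last two)."""
    return pattern[2 * person] == pattern[2 * person + 1]


def inbreeding_sums(rows, person):
    """Per-locus total probability of patterns where person's genes are IBD."""
    totals = [0.0] * len(rows[0][1])
    for pattern, probs in rows:
        if own_genes_ibd(pattern, person):
            for k, p in enumerate(probs):
                totals[k] += p
    return totals


if __name__ == "__main__":
    rows = parse_table(EXCERPT)
    print("531, partial sums per locus:", inbreeding_sums(rows, 0))
    print("431, partial sums per locus:", inbreeding_sums(rows, 1))
```

In the excerpt, the first three rows have 531's two genes IBD, while the rows with patterns 1 1 1 1 and 1 2 1 1 have 431's two genes IBD.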

The file n_iter, which should have been copied to your directory when you recopied the files above, contains two numbers:
100000 1000000
The first of these is the number of realizations for each locus (currently 100,000). (The second is irrelevant, but should be larger.) This is the one thing you can change easily; edit the first number and try make test_sim_ibd again. (Don't get me, or yourself, yelled at by MSCC by making it too huge.)
How many realizations do you think you need to get good estimates of the IBD probabilities? (How good? -- you decide)
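One way to think about "how good" is the usual binomial standard error. If the N realizations at a locus are independent (an assumption, though it fits the description above), an estimated probability p has standard error sqrt(p(1-p)/N). The short Python sketch below computes that, and the N needed for a chosen standard error; the function names are hypothetical, and 0.0712 is just one entry taken from the table excerpt.

```python
import math


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n independent draws."""
    return math.sqrt(p * (1.0 - p) / n)


def n_for_se(p, target_se):
    """Smallest n (up to rounding) giving a standard error of about target_se."""
    return math.ceil(p * (1.0 - p) / target_se ** 2)


if __name__ == "__main__":
    p = 0.0712  # one estimated pattern probability from the excerpt
    for n in (1000, 10000, 100000, 1000000):
        print(n, binomial_se(p, n))
    print("n for SE of 0.0005:", n_for_se(p, 0.0005))
```

Note that the standard error shrinks only as 1/sqrt(N), so each extra decimal place of accuracy costs 100 times as many realizations. You can compare the standard error at N = 100,000 with the spread you see across the six columns of your own output.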