Homework 5: due 11/5

1. (Based on Lange Ch 6, #1, and an example of Cavalli-Sforza and Bodmer)
In an idealized infinite population, 10% of people marry their first cousins (kinship coefficient 1/16), 25% marry their second cousins (kinship coefficient 1/64), and the remaining 65% marry unrelated individuals. All marriages have the same family size distribution. Consider a very rare recessive trait, with allele frequency q.
(a) Show that the mean inbreeding coefficient in the population is just over 1%.
(b) Show that the overall probability an individual is affected is q(0.01 + 0.99q).
(c) Show that the probability an affected individual is the child of a first-cousin marriage is 0.00625(1+15q)/(0.01 + 0.99q).
(d) How small must q be for the posterior probability that an affected individual is the child of a first-cousin marriage to exceed the posterior probability that the individual is the child of a second-cousin marriage? (Answer: about 0.015, not guaranteed.)
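
The quantities in parts (a), (c) and (d) can be checked numerically. The following is only a sketch: the function names are made up for illustration, and it uses the exact mean inbreeding coefficient rather than the rounded value 0.01 that appears in parts (b) and (c), so the results agree with those formulas only approximately.

```python
from fractions import Fraction as F

# Mating classes from the problem: (share of marriages, kinship of the parents).
classes = {
    "first cousins": (F(10, 100), F(1, 16)),
    "second cousins": (F(25, 100), F(1, 64)),
    "unrelated": (F(65, 100), F(0)),
}

def mean_inbreeding():
    # A child's inbreeding coefficient equals the kinship of its parents.
    return sum(share * kin for share, kin in classes.values())

def joint_affected(label, q):
    # P(child comes from this mating class AND is affected).
    share, f = classes[label]
    return share * (f * q + (1 - f) * q * q)

def posterior(label, q):
    # P(child comes from this mating class | child is affected).
    total = sum(joint_affected(name, q) for name in classes)
    return joint_affected(label, q) / total

mean_f = mean_inbreeding()
print("mean inbreeding:", mean_f, "=", float(mean_f))
for q in (F(1, 100), F(2, 100)):
    first = float(posterior("first cousins", q))
    second = float(posterior("second cousins", q))
    print("q =", q, "first cousins:", first, "second cousins:", second)
```

Trying values of q on either side of the stated answer shows where the two posterior probabilities cross.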

2. (based on Lange Ch 5 #3)
A brother-sister full-sib pair, A and B (mice, not humans), produce two offspring, D and E.
(a) What is the inbreeding coefficient of D?
(b) What is the kinship coefficient between D and B?
(c) What is the kinship coefficient between D and E?
(d) What is the probability D and E carry 4 IBD genes at a locus?
(e) What is the probability D and E each carries two IBD genes at a locus, but these are different genes?

Answers (?): 1/4, 3/8, 3/8, 1/16, 1/32
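
Parts (a) to (c) can be checked with the usual recursive kinship calculation on the pedigree. This is a sketch under the assumption that the parents of A and B are unrelated and non-inbred; the labels P and M for those parents and the function name are hypothetical. Parts (d) and (e) need joint identity states and are not covered by it.

```python
from fractions import Fraction as F
from functools import lru_cache

# child -> (father, mother); founders map to None.
# "P" and "M" are hypothetical labels for the unrelated parents of A and B.
pedigree = {
    "P": None, "M": None,
    "A": ("P", "M"), "B": ("P", "M"),
    "D": ("A", "B"), "E": ("A", "B"),
}
order = list(pedigree)  # every individual is listed after its parents

@lru_cache(maxsize=None)
def kinship(x, y):
    # Recurse on whichever individual appears later in the pedigree order,
    # so that it can never be an ancestor of the other.
    if order.index(x) < order.index(y):
        x, y = y, x
    if pedigree[x] is None:
        # x is a founder, hence so is y: non-inbred and mutually unrelated.
        return F(1, 2) if x == y else F(0)
    father, mother = pedigree[x]
    if x == y:
        return (1 + kinship(father, mother)) / 2
    return (kinship(father, y) + kinship(mother, y)) / 2

print("f_D        =", kinship("A", "B"))  # inbreeding of D is the kinship of its parents
print("kin(D, B)  =", kinship("D", "B"))
print("kin(D, E)  =", kinship("D", "E"))
```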

3. (Lange Ch 5, #9)
In producing inbred lines of mice, repeated brother-sister mating is often used. That is, starting with an unrelated pair at time 0, full-sib litter-mates from generation n are mated to produce the litter at generation (n+1). Let f_n be the inbreeding coefficient of the individuals at generation n, and g_n be the kinship coefficient between two litter-mates at generation n (I am using g, because I cannot do psi in html.)
Note f_{n+1} = g_n and show
g_{n+1} = (1/2)g_n + (1/4)f_n + (1/4) = (1/2)g_n + (1/4)g_{n-1} + (1/4)
(Think about the possibilities of where two genes segregating from generation n+1 mates came from at generation n.)
Hence (1 - g_{n+1}) = (1/2)(1 - g_n) + (1/4)(1 - g_{n-1})
Show this second-order difference equation, with initial values g_0 = 0 and g_1 = 1/4, has solution
1 - g_n = ((1/2)+(1/s)) ((1+s)/4)^n + ((1/2)-(1/s)) ((1-s)/4)^n, where s = sqrt(5).
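
A numerical check of the recursion may help before attempting the algebra. The sketch below assumes g_0 = 0 (the founding pair is unrelated) and g_1 = 1/4 (full sibs of unrelated parents), iterates the recursion exactly, and then fits the two constants of the general solution a*r1^n + b*r2^n to the first two terms. The names kinship_sequence, a and b are illustrative only.

```python
from fractions import Fraction as F
from math import sqrt

def kinship_sequence(n_max):
    # g[0] = 0 for the unrelated founding pair,
    # g[1] = 1/4 for full sibs of unrelated, non-inbred parents.
    g = [F(0), F(1, 4)]
    while len(g) <= n_max:
        g.append(g[-1] / 2 + g[-2] / 4 + F(1, 4))
    return g[: n_max + 1]

# Roots of the characteristic equation x^2 = x/2 + 1/4.
s = sqrt(5)
r1, r2 = (1 + s) / 4, (1 - s) / 4

# Fit a*r1^n + b*r2^n to h_n = 1 - g_n at n = 0 and n = 1.
h0, h1 = 1.0, 0.75
a = (h1 - h0 * r2) / (r1 - r2)
b = h0 - a

g = kinship_sequence(10)
worst = max(abs((1 - float(gn)) - (a * r1**n + b * r2**n)) for n, gn in enumerate(g))
print("g_0..g_5:", [str(x) for x in g[:6]])
print("a =", a, " b =", b, " largest discrepancy:", worst)
```

Comparing the fitted constants with 1/2 + 1/s and 1/2 - 1/s gives a quick sanity check on the closed form.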

4. (based on Crow and Kimura, Ch 4, #15)
An individual B has phenylketonuria, a rare recessive condition, for which the allele frequency q is 0.01. What is the probability that B's relative C has phenylketonuria if
(a) C is a first cousin of B
(b) C is a nephew of B
(c) C is a double first cousin of B
(d) C is a quadruple half first cousin of B
(Note the kinship coefficient between B and C is the same for relationships (b), (c) and (d).)
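
One way to organize the calculation is through the probabilities k0, k1, k2 that two non-inbred relatives share 0, 1 or 2 genes IBD at a locus. Given that B is affected, C is affected with probability 1 if they share two genes IBD, q if they share one, and q^2 if they share none. The sketch below assumes neither B nor C is inbred; the function name is hypothetical, and only the standard sharing probabilities for first cousins and for uncle and nephew are filled in. Relationships (c) and (d) need their own k values.

```python
from fractions import Fraction as F

def prob_relative_affected(k0, k1, k2, q):
    # P(C affected | B affected) for a recessive trait with allele frequency q,
    # where (k0, k1, k2) are the probabilities of sharing 0, 1, 2 genes IBD.
    assert k0 + k1 + k2 == 1
    return k2 + k1 * q + k0 * q * q

q = F(1, 100)
# Standard IBD-sharing probabilities (k0, k1, k2) for non-inbred relatives.
first_cousins = (F(3, 4), F(1, 4), F(0))
uncle_nephew = (F(1, 2), F(1, 2), F(0))

p_cousin = prob_relative_affected(*first_cousins, q)
p_nephew = prob_relative_affected(*uncle_nephew, q)
print("first cousin:", p_cousin, "=", float(p_cousin))
print("nephew:      ", p_nephew, "=", float(p_nephew))
```

Because the answer depends on k1 and k2 separately, relationships with the same kinship coefficient need not give the same probability.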

5. (based on Crow and Kimura, Ch 4, #9)
This makes exact the notion due to Wright that gene identity by descent leads to correlations between relatives.
Consider a particular allele A with allele frequency q, and define indicator random variables I(g), for a gene g, where I(g)=1 if the allelic type of gene g is A, and 0 otherwise.
(a) Show E(I(g)) = q, and var(I(g)) = q(1-q)
(b) Show that for genes g1 and g2 segregating from B and from C, the correlation between I(g1) and I(g2) is the kinship coefficient between B and C.
(c) If g1 and g2 are the two genes in an individual, show that the variance of (I(g1)+I(g2)) is 2q(1-q)(1+f), where f is the inbreeding coefficient of the individual.
(d) If g1 and g2 are the genes in a parent B, and g2 and g3 are the genes in his child C (that is, the g2 gene is the one inherited by C from B), show that the correlation between (I(g1)+I(g2)) and (I(g2)+I(g3)) is (1+2f_C+f_B)/(2((1+f_C)(1+f_B))^(1/2)), where f_B is the inbreeding coefficient of B, and f_C is the inbreeding coefficient of C.
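
The variance in part (c) can be checked exactly by writing out the distribution of I(g1)+I(g2). The sketch below assumes the usual model in which the two genes are IBD with probability f (and then share a single allelic type drawn with frequency q) and are otherwise independent draws; the function name is illustrative.

```python
from fractions import Fraction as F

def allele_count_variance(q, f):
    # Distribution of S = I(g1) + I(g2) for an individual with inbreeding f.
    p2 = f * q + (1 - f) * q * q                  # both genes are of type A
    p0 = f * (1 - q) + (1 - f) * (1 - q) ** 2     # neither gene is of type A
    p1 = 1 - p0 - p2                              # exactly one gene is of type A
    mean = p1 + 2 * p2
    second_moment = p1 + 4 * p2
    return second_moment - mean ** 2

for q, f in [(F(1, 10), F(0)), (F(1, 10), F(1, 4)), (F(3, 10), F(1, 16))]:
    exact = allele_count_variance(q, f)
    claimed = 2 * q * (1 - q) * (1 + f)
    print("q =", q, "f =", f, "variance =", exact, "formula =", claimed)
```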