Homework 2: due 10/8

1. In the MN blood group system there are two alleles, M and N, which are codominant. In a sample of 1000 people, 400 are MM, 440 are MN, and 160 are NN.
(a) Estimate the population frequency of the M allele.
(b) Assuming Hardy-Weinberg equilibrium, estimate the variance of your estimator.
(c) Do you believe this population is in Hardy-Weinberg equilibrium?
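
For checking arithmetic in Problem 1, the following is a minimal Python sketch, not a required method. It assumes the usual gene-counting estimator, the binomial variance p(1-p)/(2n) that holds under Hardy-Weinberg equilibrium, and a Pearson chi-square statistic with 1 degree of freedom as one possible way to approach part (c). The function names are hypothetical.

```python
def allele_freq_codominant(n_hom1, n_het, n_hom2):
    """Gene-counting estimate of the frequency of the first allele."""
    n = n_hom1 + n_het + n_hom2
    return (2 * n_hom1 + n_het) / (2 * n)


def variance_under_hwe(p, n_people):
    """Binomial variance of the gene-counting estimator, assuming HWE."""
    return p * (1 - p) / (2 * n_people)


def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Pearson chi-square statistic comparing observed genotype counts
    with Hardy-Weinberg expectations (1 degree of freedom).
    Assumes 0 < p < 1 so that no expected count is zero."""
    n = n_hom1 + n_het + n_hom2
    p = allele_freq_codominant(n_hom1, n_het, n_hom2)
    q = 1 - p
    observed = [n_hom1, n_het, n_hom2]
    expected = [n * p * p, 2 * n * p * q, n * q * q]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Counts from Problem 1: 400 MM, 440 MN, 160 NN.
p_hat = allele_freq_codominant(400, 440, 160)
print(p_hat, variance_under_hwe(p_hat, 1000), hwe_chi_square(400, 440, 160))
```

The chi-square value would be compared with the 1-df critical value at whatever significance level you choose.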

2. At one of the loci of the rhesus blood group system, there are two alleles, C and c.
(a) In the simplest test, with only anti-C reagent, C is dominant to c. In a test of 1000 people, 160 are C-negative (genotype cc). Assuming Hardy-Weinberg equilibrium, estimate the frequency of the c allele and the variance of your estimator.
(b) An anti-c reagent now becomes available, so that C and c are codominant. It is found that 400 individuals are CC and 440 are Cc. How does this change your estimate of the frequency of the c allele, and of the variance of the estimator?
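
The two situations in Problem 2 can be compared numerically with a short Python sketch. It assumes the square-root estimator of q for the dominant case with the large-sample (delta-method) variance (1 - q^2)/(4n), and the gene-counting estimator with variance q(1 - q)/(2n) for the codominant case. Both variance formulas assume Hardy-Weinberg equilibrium, and the function names are hypothetical.

```python
import math


def q_hat_dominant(n_recessive, n_people):
    """Square-root estimate of the recessive allele frequency when only
    recessive homozygotes can be identified (assumes HWE)."""
    return math.sqrt(n_recessive / n_people)


def var_dominant(q, n_people):
    """Delta-method approximation to the variance of the square-root
    estimator: (1 - q^2) / (4n)."""
    return (1 - q * q) / (4 * n_people)


def q_hat_codominant(n_CC, n_Cc, n_cc):
    """Gene-counting estimate of the c allele frequency."""
    n = n_CC + n_Cc + n_cc
    return (2 * n_cc + n_Cc) / (2 * n)


def var_codominant(q, n_people):
    """Binomial variance of the gene-counting estimator under HWE."""
    return q * (1 - q) / (2 * n_people)


# Counts from Problem 2: 160 cc out of 1000, then 400 CC and 440 Cc.
q_dom = q_hat_dominant(160, 1000)
q_cod = q_hat_codominant(400, 440, 160)
print(q_dom, var_dominant(q_dom, 1000))
print(q_cod, var_codominant(q_cod, 1000))
```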

3. In some diallelic blood type system, suppose the A and B alleles are codominant. A set of N mother-child pairs is sampled. Suppose there are
n11 AA mothers with an AA child
n12 AA mothers with an AB child
n21 AB mothers with an AA child
n22 AB mothers with an AB child
n23 AB mothers with a BB child
n32 BB mothers with an AB child
n33 BB mothers with a BB child
Suppose the A allele frequency is p, and the B allele frequency is q = 1 - p. Assume Hardy-Weinberg equilibrium. As shown in class, the probabilities of the above 7 type pairs are then p^3, p^2 q, p^2 q, pq, p q^2, p q^2, and q^3.
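
As a quick sanity check on these seven probabilities, the Python sketch below confirms numerically that they sum to 1 for any p. The dictionary keys reuse the count labels n11 through n33; the function name is hypothetical.

```python
def pair_probabilities(p):
    """Probabilities of the seven mother-child type pairs under HWE,
    keyed by the label of the corresponding count."""
    q = 1 - p
    return {
        "n11": p ** 3,       # AA mother, AA child
        "n12": p ** 2 * q,   # AA mother, AB child
        "n21": p ** 2 * q,   # AB mother, AA child
        "n22": p * q,        # AB mother, AB child
        "n23": p * q ** 2,   # AB mother, BB child
        "n32": p * q ** 2,   # BB mother, AB child
        "n33": q ** 3,       # BB mother, BB child
    }


for p in (0.1, 0.5, 0.9):
    total = sum(pair_probabilities(p).values())
    assert abs(total - 1.0) < 1e-12
```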

As shown in class, the maximum likelihood estimator of p, using all the data, is
(3 n11 +2 (n12+n21) + (n22 + n23 + n32))/(3N-n22)
(a) Suppose we only use the mothers to estimate the allele frequency; show the MLE is
(2(n11+n12) + n21 + n22 + n23)/(2N).
Show this estimator is unbiased. What would be its disadvantage, by comparison with the MLE?
(b) Suppose we use all our individuals to estimate p, disregarding the relationships between mothers and children. Show the estimator of p is
(4 n11 + 3(n12 + n21) + 2 n22 + n23 + n32)/(4N).
Show this estimator is unbiased. What would be its disadvantage, by comparison with the MLE?
(Only a brief comment concerning comparisons is needed; no quantitative assessment is required.)
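
For anyone who wants to experiment with the three estimators in Problem 3, the following Python sketch computes each one from a dictionary of counts and draws simulated counts from the seven pair probabilities. It is an optional illustration, not part of the required answer. The function names, the seed, and the values p = 0.3 and N = 500 are arbitrary assumptions.

```python
import random

PAIR_KEYS = ["n11", "n12", "n21", "n22", "n23", "n32", "n33"]


def simulate_counts(p, n_pairs, rng):
    """Draw n_pairs mother-child pairs from the seven HWE pair probabilities."""
    q = 1 - p
    weights = [p**3, p**2 * q, p**2 * q, p * q, p * q**2, p * q**2, q**3]
    draws = rng.choices(PAIR_KEYS, weights=weights, k=n_pairs)
    return {key: draws.count(key) for key in PAIR_KEYS}


def mle_all_data(c):
    """MLE of p using the full mother-child pair data."""
    n = sum(c.values())
    num = 3 * c["n11"] + 2 * (c["n12"] + c["n21"]) + c["n22"] + c["n23"] + c["n32"]
    return num / (3 * n - c["n22"])


def mothers_only(c):
    """Gene-counting estimate of p using the mothers alone."""
    n = sum(c.values())
    return (2 * (c["n11"] + c["n12"]) + c["n21"] + c["n22"] + c["n23"]) / (2 * n)


def ignoring_relationships(c):
    """Gene-counting estimate of p over all 2N individuals,
    treating mothers and children as unrelated."""
    n = sum(c.values())
    num = 4 * c["n11"] + 3 * (c["n12"] + c["n21"]) + 2 * c["n22"] + c["n23"] + c["n32"]
    return num / (4 * n)


rng = random.Random(1)  # arbitrary seed, for reproducibility only
counts = simulate_counts(0.3, 500, rng)
print(mle_all_data(counts), mothers_only(counts), ignoring_relationships(counts))
```

Repeating the simulation many times and comparing the spread of the three estimates is one way to see how they differ, although the problem asks only for a brief qualitative comment.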

4. (For class discussion only.)
The following comes from a real study, here at UW, and concerns typing of a family at a genetic marker locus with codominant alleles here labelled A, B, C and D. There are no definitive answers.
An (untyped) couple have six kids. Three kids are typed and are all type BD. The other three are untyped, but have children.
Untyped kid1 has a spouse of type AD and four typed children: AB, BD, AB, and CD.
Question: if no errors, what type was kid1?
Untyped kid2 has a spouse of type AC and two typed children, types AA and AC.
Question: if no errors, what can be said about the type of kid2?
Untyped kid3 and his untyped spouse have four typed children, types AC, BC, AD, and BC.
Question: what does this tell us about the types of kid3 and his spouse?

(a) Is there a problem with these data? Why?
(b) Is it more likely a pedigree error or a typing error? What additional information might help resolve that?
(c) Where is the error, most likely? Identify some possibilities. What additional information might help resolve that?
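
As an aid to the discussion of Problem 4, the Python sketch below enumerates genotypes by brute force. It assumes Mendelian transmission with no typing errors, no mutation, and correct stated parentage, and it only lists what is compatible with the children's types; interpreting the result is left for class. The function names are hypothetical.

```python
from itertools import combinations_with_replacement

ALLELES = "ABCD"
# All ten unordered genotypes, written with alleles in alphabetical order.
GENOTYPES = ["".join(g) for g in combinations_with_replacement(ALLELES, 2)]


def offspring(parent1, parent2):
    """Set of child genotypes possible from two parental genotypes."""
    return {"".join(sorted(a + b)) for a in parent1 for b in parent2}


def compatible_parents(spouse, children):
    """Genotypes of an untyped parent that, together with a typed spouse,
    could produce every one of the typed children."""
    return [g for g in GENOTYPES
            if all(child in offspring(g, spouse) for child in children)]


def compatible_couples(children):
    """Unordered pairs of parental genotypes that could produce all the
    typed children when neither parent is typed."""
    return [(g1, g2)
            for i, g1 in enumerate(GENOTYPES)
            for g2 in GENOTYPES[i:]
            if all(child in offspring(g1, g2) for child in children)]


print(compatible_parents("AD", ["AB", "BD", "AB", "CD"]))  # kid1
print(compatible_parents("AC", ["AA", "AC"]))              # kid2
print(compatible_couples(["AC", "BC", "AD", "BC"]))        # kid3 and spouse
print(compatible_couples(["BD", "BD", "BD"]))              # the untyped couple
```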