STAT/BIOSTAT 550 B (DL): Homework 2.


Page numbers refer to the book, but the same information is available (briefly) in the audio lectures. For this homework, refer to the second audio lecture (Chapter 1 part 2) and to sections 1.3 and 1.4 of the notes.

1. (See P.12)
In the MN blood group system there are two alleles, M and N, which are codominant. In a sample of 1000 people, 400 are MM, 440 are MN, and 160 are NN.
(a) Estimate the population frequency of the M allele.
(b) Assuming Hardy-Weinberg equilibrium, estimate the variance of your estimator.
(c) Do you believe this population is in Hardy-Weinberg equilibrium?
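
As a rough aid for checking arithmetic in this kind of problem, here is a minimal Python sketch of allele counting at a codominant locus. It assumes the usual binomial variance p(1-p)/(2n) for the allele-counting estimator under Hardy-Weinberg equilibrium, and a one-degree-of-freedom chi-square comparison of observed and expected genotype counts. The function names are hypothetical, and the counts in the usage lines are made up for illustration; they are not the homework data.

```python
import math

def allele_freq_codominant(n_hom1, n_het, n_hom2):
    """Allele-counting estimate of the frequency of allele 1, with its
    estimated variance assuming Hardy-Weinberg equilibrium."""
    n = n_hom1 + n_het + n_hom2
    p_hat = (2 * n_hom1 + n_het) / (2 * n)
    # Under HWE the 2n sampled alleles behave like independent draws.
    var_hat = p_hat * (1 - p_hat) / (2 * n)
    return p_hat, var_hat

def hwe_chi_square(n_hom1, n_het, n_hom2):
    """Pearson chi-square statistic (1 df) for departure from HWE.
    Assumes 0 < p_hat < 1 so that all expected counts are positive."""
    n = n_hom1 + n_het + n_hom2
    p_hat, _ = allele_freq_codominant(n_hom1, n_het, n_hom2)
    expected = [n * p_hat ** 2,
                2 * n * p_hat * (1 - p_hat),
                n * (1 - p_hat) ** 2]
    observed = [n_hom1, n_het, n_hom2]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts, for illustration only.
p_hat, var_hat = allele_freq_codominant(30, 50, 20)
stat, p_value = hwe_chi_square(30, 50, 20)
print(p_hat, var_hat, stat, p_value)
```

The single degree of freedom comes from three genotype classes, minus one for the fixed total, minus one for the estimated allele frequency.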

2. (See P.17)
At one of the loci of the rhesus blood group system, there are 2 alleles C and c.
(a) In the simplest test, with only anti-C reagent, C is dominant to c. In a test of 1000 people, 160 are C-negative (genotype cc). Assuming Hardy-Weinberg equilibrium, estimate the frequency of the c allele, and the variance of your estimator.
(b) An anti-c reagent now becomes available, so that C and c are codominant. It is found that 400 individuals are CC and 440 are Cc. How does this change your estimate of the frequency of the c allele? How does it change the variance of the estimator?
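
The two situations in this problem can be compared with a short Python sketch. It assumes Hardy-Weinberg equilibrium throughout: when only the recessive homozygote is distinguishable, the estimate is the square root of the recessive proportion, with delta-method variance approximately (1-q^2)/(4n); when all three genotypes are distinguishable, the estimate is by allele counting, with variance q(1-q)/(2n). The function names are hypothetical and the counts in the usage lines are illustrative, not the homework data.

```python
import math

def recessive_only_estimate(n_recessive, n):
    """Estimate the recessive allele frequency q from the number of
    recessive homozygotes only, assuming HWE."""
    q_hat = math.sqrt(n_recessive / n)
    # Delta method: Var(sqrt(X/n)) is about (1 - q^2) / (4n).
    var_hat = (1 - q_hat ** 2) / (4 * n)
    return q_hat, var_hat

def codominant_estimate(n_dom_hom, n_het, n_rec_hom):
    """Allele-counting estimate of q when all genotypes are observed."""
    n = n_dom_hom + n_het + n_rec_hom
    q_hat = (2 * n_rec_hom + n_het) / (2 * n)
    var_hat = q_hat * (1 - q_hat) / (2 * n)
    return q_hat, var_hat

# Hypothetical counts, for illustration only.
print(recessive_only_estimate(25, 100))
print(codominant_estimate(30, 45, 25))
```

Comparing the two variance formulas at the same q shows how much information the heterozygotes carry once they can be told apart from the dominant homozygotes.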

3. (Example, due to Weir.)
Suppose a population consists of a mixture of two randomly-mating subpopulations of equal size. Consider a locus with three codominant alleles. In one subpopulation, the three alleles have frequencies p=0.6, q=0.3, and r=0.1, while in the other subpopulation they have frequencies p=0.4, q=0.1, and r=0.5. Which genotypes have a smaller frequency than they would have in a single random-mating population with the same overall allele frequencies?
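
The comparison asked for here can be organised as in the following Python sketch. It assumes each subpopulation is itself in Hardy-Weinberg proportions, computes the genotype frequencies in the mixture as a weighted average over subpopulations, and sets them beside the Hardy-Weinberg proportions at the pooled allele frequencies. The function name and the two-allele frequencies in the usage lines are hypothetical and are not the values given in the problem.

```python
from itertools import combinations_with_replacement

def mixture_vs_single_population(subpops, weights):
    """subpops: list of dicts mapping allele name -> frequency.
    weights: mixing proportions of the subpopulations (summing to 1).
    Returns {genotype: (frequency in mixture, frequency under random
    mating at the pooled allele frequencies)}."""
    alleles = sorted(subpops[0])
    pooled = {a: sum(w * s[a] for w, s in zip(weights, subpops))
              for a in alleles}
    result = {}
    for a, b in combinations_with_replacement(alleles, 2):
        mult = 1 if a == b else 2  # heterozygotes arise in two orders
        mixed = sum(w * mult * s[a] * s[b]
                    for w, s in zip(weights, subpops))
        single = mult * pooled[a] * pooled[b]
        result[(a, b)] = (mixed, single)
    return result

# Hypothetical two-allele example, for illustration only.
demo = mixture_vs_single_population(
    [{"A": 0.8, "a": 0.2}, {"A": 0.4, "a": 0.6}], [0.5, 0.5])
for genotype, (mixed, single) in demo.items():
    print(genotype, mixed, single)
```

With more than two alleles, whether a given heterozygote is deficient depends on how the two allele frequencies involved vary together across subpopulations.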

4. (See P. 18)
In some diallelic blood type system suppose the A and B alleles are codominant. A set of N mother-child pairs are sampled. Suppose there are
n00 AA mothers with an AA child
n01 AA mothers with an AB child
n10 AB mothers with an AA child
n11 AB mothers with an AB child
n12 AB mothers with a BB child
n21 BB mothers with an AB child
n22 BB mothers with a BB child
Suppose the A allele frequency is q, and the B allele frequency is 1-q. Assume Hardy-Weinberg equilibrium. From Table 2.1 (P.17), the probabilities of the above 7 type pairs are then q^3, q^2(1-q), q^2(1-q), q(1-q), q(1-q)^2, q(1-q)^2, and (1-q)^3.
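
As a sanity check on the seven probabilities listed above, the following Python sketch encodes them, confirms that they sum to one, and writes down the multinomial log-likelihood they imply (up to a constant that does not depend on q). The function names and the dictionary keys, which reuse the count labels n00 to n22, are hypothetical conveniences.

```python
import math

def pair_probabilities(q):
    """Probabilities of the seven mother-child type pairs under HWE,
    keyed by the label of the corresponding count."""
    s = 1 - q
    return {
        "n00": q ** 3,      # AA mother, AA child
        "n01": q ** 2 * s,  # AA mother, AB child
        "n10": q ** 2 * s,  # AB mother, AA child
        "n11": q * s,       # AB mother, AB child
        "n12": q * s ** 2,  # AB mother, BB child
        "n21": q * s ** 2,  # BB mother, AB child
        "n22": s ** 3,      # BB mother, BB child
    }

def log_likelihood(q, counts):
    """Multinomial log-likelihood of q, omitting the constant term.
    Requires 0 < q < 1."""
    probs = pair_probabilities(q)
    return sum(counts[k] * math.log(probs[k]) for k in counts)

# The probabilities should sum to one for any q in (0, 1).
for q in (0.1, 0.5, 0.9):
    print(q, sum(pair_probabilities(q).values()))
```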

Then we showed in class (see also equation (2.8), P.18) that we can find the maximum likelihood estimator of q using all the data: call this MLE q0.
(a) Suppose we only use the mothers to estimate the allele frequency; show the MLE is
(2(n00+n01) +n10+n11+ n12)/2N.
Show this estimator is unbiased. What would be its disadvantage, by comparison with q0?
(b) Suppose we use all our individuals to estimate q, disregarding the relationships between mothers and children. Show the estimator of q is
(4 n00+3(n01 + n10)+2 n11 + n12 +n21)/4N
Show this estimator is unbiased. What would be its disadvantage, by comparison with q0?
(Only a brief comment concerning comparisons is needed in (a) and (b) -- no quantitative assessment.)
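
The two estimators in (a) and (b) can be checked numerically with a small Python sketch. Both are linear in the counts, so replacing each count by its expected proportion under the seven pair probabilities gives the exact expectation of the estimator. This is only a numerical check at chosen values of q, not the proof the question asks for. The function names and dictionary keys are hypothetical.

```python
def pair_probabilities(q):
    """Probabilities of the seven mother-child type pairs under HWE."""
    s = 1 - q
    return {"n00": q ** 3, "n01": q ** 2 * s, "n10": q ** 2 * s,
            "n11": q * s, "n12": q * s ** 2, "n21": q * s ** 2,
            "n22": s ** 3}

def mothers_only_estimate(c):
    """Estimator from part (a): allele counting among mothers."""
    n_pairs = sum(c.values())
    return (2 * (c["n00"] + c["n01"])
            + c["n10"] + c["n11"] + c["n12"]) / (2 * n_pairs)

def all_individuals_estimate(c):
    """Estimator from part (b): allele counting among all 2N
    individuals, ignoring the mother-child relationships."""
    n_pairs = sum(c.values())
    return (4 * c["n00"] + 3 * (c["n01"] + c["n10"])
            + 2 * c["n11"] + c["n12"] + c["n21"]) / (4 * n_pairs)

def expected_value(estimator, q):
    """Exact expectation of a linear estimator: plug in the expected
    proportions (equivalent to expected counts with N = 1)."""
    return estimator(pair_probabilities(q))

for q in (0.2, 0.5, 0.8):
    print(q, expected_value(mothers_only_estimate, q),
          expected_value(all_individuals_estimate, q))
```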


UW - Statistics: Wednesday, 24-Jul-19 Contact: Elizabeth Thompson <eathomp@u.washington.edu>