QSCI 482 HOMEWORK 2, DUE FRIDAY, OCTOBER 11, 2002
1. This problem is about the relationship between level of significance
(alpha) and the "P-value" ["P" denotes probability]. For the example from
Topic 1, "The Language of Singles Bars", the computed value of the test
statistic was found to be 6.98. The "P-value" of .0305 in the notes was
actually obtained via computer. Let's see how to get "close to it" using
ONLY Table B.1, and interpret the P-value.
a. Using ONLY Table B.1, find the narrowest possible range for the
probability of exceeding the value 6.98 under a chi-square distribution with
2 degrees of freedom. Your answer should look like "L < P-value < U", where
"L" is the lower bound for this probability and "U" is the upper bound.
b. Now let's interpret. If the null hypothesis were true, the probability
of seeing a test statistic at least as large as the one that we actually
observed (6.98) is equal to what? That's your answer to part [1a]. Now, compare
your part [1a] answer to the specified level of significance for this test
(.05), and explain how your answer to part [1a] lends evidence either for,
or against, the null hypothesis of uniformity among the 3 categories.
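
For readers who want to see how the computer-obtained P-value relates to a table lookup, here is a minimal Python sketch. It relies on the fact that, for exactly 2 degrees of freedom, the chi-square upper-tail probability has the closed form exp(-x/2). The list of tail areas is an assumption about what a typical chi-square table tabulates; it is not copied from Table B.1, so check it against the table's actual column headings.

```python
import math

def chi2_sf_2df(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with 2 df."""
    return math.exp(-x / 2.0)

def chi2_crit_2df(alpha):
    """Critical value whose upper-tail area is alpha (2 df only)."""
    return -2.0 * math.log(alpha)

x_obs = 6.98                      # test statistic quoted in problem 1
p_value = chi2_sf_2df(x_obs)      # what "the computer" would report

# Assumed tail areas; a printed table may list a different set of columns.
alphas = [0.10, 0.05, 0.025, 0.01, 0.005]
crit = {a: chi2_crit_2df(a) for a in alphas}

# Tightest bracket: the nearest tabulated critical values on either side of x_obs.
upper = min(a for a in alphas if crit[a] <= x_obs)
lower = max(a for a in alphas if crit[a] > x_obs)

print(f"P-value = {p_value:.4f}")
print(f"{lower} < P-value < {upper}")
```

The closed form only holds for 2 degrees of freedom; other cases need a table or a statistics library.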
2. A biological oceanographer is studying the distribution of zooplankton
in the water column. She wants to know if the zooplankton are uniformly
distributed in the mixed layer (an oceanography term we don't need to
worry about). She has divided that layer into 5 sub-layers and counted
the number of zooplankton occurring in each after releasing 60 zooplankton
into a large mesocosm (large enough that the zooplankton may be considered
independent and don't eat each other!).
The zooplankton distributed themselves as follows:
Sub-layer      No. Zoopl.
Surface             6
Sub-surface         8
Mid                13
Lower Mid          15
Deep Mixed         18
a. If she analyzes this as a goodness-of-fit problem (treating the layers
simply as named categories and not considering their order), what are her
results? Include the p-value associated with your test statistic.
b. If she now recognizes that she has ordered categories and uses a test
appropriate to ordered categories, what are her results? Include the
p-value here, too.
c. Which was the "better" test to use and why?
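
The two analyses in problem 2 can be sketched in Python as follows. This is an illustration under stated assumptions, not the required method: the chi-square P-value uses the closed form for 4 degrees of freedom, and the "ordered categories" statistic is assumed to be the Kolmogorov-Smirnov maximum cumulative difference (d_max), which the problem statement does not name. The P-value for d_max has to come from the appropriate table and is not computed here.

```python
import math
from itertools import accumulate

def chi2_sf_4df(x):
    """Upper-tail probability for a chi-square variable with 4 df (closed form)."""
    return math.exp(-x / 2.0) * (1.0 + x / 2.0)

observed = [6, 8, 13, 15, 18]          # counts from the problem, surface to deep
n = sum(observed)                      # 60 zooplankton
k = len(observed)                      # 5 sub-layers
expected = [n / k] * k                 # uniform null hypothesis: 12 per layer

# Goodness of fit, ignoring the order of the layers
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1
p_value = chi2_sf_4df(chi2)

# Ordered categories (assumed here to mean a Kolmogorov-Smirnov-type test):
# largest absolute gap between cumulative observed and cumulative expected counts.
cum_obs = list(accumulate(observed))
cum_exp = list(accumulate(expected))
d_max = max(abs(o - e) for o, e in zip(cum_obs, cum_exp))

print(f"chi-square = {chi2:.3f}, df = {df}, P = {p_value:.4f}")
print(f"d_max = {d_max:g}")
```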
PROBLEM #3 TO BE DONE FOLLOWING TOPIC 3 [you can try it earlier, but you might find it a little confusing …].
3. The following data are frequencies of ferrets in two geographic areas,
with and without a particular disease.
a. Test the null hypothesis (use the .10 level of significance) that the
prevalence of the disease is the same in both areas. Do this in three ways:
compute X^2, X^2 (Yates) and X^2 (Cochran-Haber). Compare the three values
of the test statistic and comment on their relationship to each other. The
data are shown below.
AREA      WITH DISEASE   WITHOUT DISEASE
Area 1         20               39
Area 2         16               51
b. For the ferrets of Area 1, estimate the probability of disease from
the data.
c. For the ferrets of Area 2, estimate the probability of disease from the
data.
d. For the ferrets of the two Areas combined, estimate the probability of
disease from the data.
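
As a check on hand calculations for problem 3, the following Python sketch computes the uncorrected and Yates-corrected chi-square statistics for a 2 x 2 table using the standard shortcut formulas, plus the three sample proportions. The Cochran-Haber correction is deliberately left out, since its steps are not given in this handout. Variable names are illustrative only.

```python
import math

def chi2_sf_1df(x):
    """Upper-tail probability for a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(x / 2.0))

# Rows: Area 1, Area 2; columns: with disease, without disease
(a, b), (c, d) = [[20, 39], [16, 51]]
n = a + b + c + d
r1, r2 = a + b, c + d                  # row totals
c1, c2 = a + c, b + d                  # column totals
denom = r1 * r2 * c1 * c2

# Uncorrected chi-square for a 2 x 2 table
x2 = n * (a * d - b * c) ** 2 / denom

# Yates continuity correction; max(0, ...) guards the case |ad - bc| < n/2
x2_yates = n * max(0.0, abs(a * d - b * c) - n / 2.0) ** 2 / denom

# Estimated probabilities of disease from the data
p_area1 = a / r1
p_area2 = c / r2
p_pooled = c1 / n

print(f"X^2 = {x2:.4f}, P = {chi2_sf_1df(x2):.4f}")
print(f"X^2 (Yates) = {x2_yates:.4f}, P = {chi2_sf_1df(x2_yates):.4f}")
print(f"Area 1: {p_area1:.3f}, Area 2: {p_area2:.3f}, combined: {p_pooled:.3f}")
```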