Homework 2

QSCI 482 HOMEWORK 2, DUE FRIDAY, OCTOBER 11, 2002

1. This problem is about the relationship between the level of significance
(alpha) and the "P-value" ["P" denotes probability]. For the example from
Topic 1, "The Language of Singles Bars", the computed value of the test
statistic was 6.98. The P-value of .0305 given in the notes was obtained
by computer. Let's see how to get close to it using ONLY Table B.1, and
then interpret the P-value.

a. Using ONLY Table B.1, find the narrowest possible range for the
probability that a chi-square variable with 2 degrees of freedom exceeds
6.98. Your answer should look like "L < P-value < U", where "L" is the
lower bound for this probability and "U" is the upper bound.

b. Now let's interpret. If the null hypothesis were true, what is the
probability of seeing a test statistic at least as large as the one that
we actually observed (6.98)? That is your answer to part [1a]. Now compare
your part [1a] answer to the specified level of significance for this test
(.05), and explain how it lends evidence either for or against the null
hypothesis of uniformity among the 3 categories.
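
For readers who want to see where the computer-generated P-value of .0305 in Problem 1 comes from: with 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(-x/2). The Python sketch below evaluates that expression at 6.98. The function name is hypothetical and chosen only for illustration, and the sketch is a cross-check on the table bounds, not a substitute for using Table B.1.

```python
import math

def chi2_upper_tail_2df(x):
    """Upper-tail probability P(X >= x) for a chi-square variable with 2 df.

    With 2 degrees of freedom the chi-square distribution is an
    exponential distribution with mean 2, so the tail area is exp(-x / 2).
    """
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.exp(-x / 2.0)

# Test statistic from the Topic 1 example.
p_value = chi2_upper_tail_2df(6.98)
print(round(p_value, 4))  # 0.0305, the value quoted in the notes
```

The closed form holds only for 2 degrees of freedom; other values of df need a table or a statistics library.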

2. A biological oceanographer is studying the distribution of zooplankton
in the water column. She wants to know if the zooplankton are uniformly
distributed in the mixed layer (an oceanography term we don't need to
worry about). She has divided that layer into 5 sub-layers and counted
the number of zooplankton occurring in each after releasing 60 zooplankton
into a large mesocosm (large enough that the zooplankton may be considered
independent and don't eat each other!).

The zooplankton distributed themselves as follows:

Sub-layer       No. of Zooplankton
Surface                  6
Sub-surface              8
Mid                     13
Lower Mid               15
Deep Mixed              18

a. If she analyzes this as a goodness-of-fit problem (treating the layers
simply as named categories and not considering their order), what are her
results? Include the p-value associated with your test statistic.
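
The goodness-of-fit arithmetic can be sketched in Python as follows, assuming the usual Pearson statistic with equal expected counts under the uniform null hypothesis. The helper name is hypothetical. It uses the closed form for the chi-square upper tail that holds when the degrees of freedom are even, which is the case here (5 categories, 4 df); for odd df a table or a statistics library would be needed instead.

```python
import math

def chi2_upper_tail_even_df(x, df):
    """P(X >= x) for a chi-square variable with an even number of df.

    For df = 2k the tail area is exp(-x/2) * sum_{i<k} (x/2)**i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2.0
    term, total = 1.0, 0.0
    for i in range(df // 2):
        total += term            # add (x/2)**i / i!
        term *= half / (i + 1)   # advance to the next term of the series
    return math.exp(-half) * total

# Observed counts, surface to deep mixed layer.
observed = [6, 8, 13, 15, 18]
n = sum(observed)
expected = n / len(observed)     # uniform null: same count in every layer

chi_square = sum((o - expected) ** 2 / expected for o in observed)
df = len(observed) - 1
p_value = chi2_upper_tail_even_df(chi_square, df)
print(f"chi-square = {chi_square:.3f}, df = {df}, p = {p_value:.4f}")
```

Note that the statistic would be unchanged if the five counts were shuffled among the layers, which is what "not considering their order" means in practice.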

b. If she now recognizes that she has ordered categories and uses a test
appropriate to ordered categories, what are her results? Include the
p-value here, too.
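
One test often used for ordered categories is the Kolmogorov-Smirnov goodness-of-fit test for discrete data; that this is the test intended here is an assumption, so check the course notes. Its statistic is the largest absolute difference between the cumulative observed and cumulative expected frequencies. The Python sketch below computes only that statistic; the p-value would still have to come from the appropriate table of critical values.

```python
from itertools import accumulate

# Observed counts in layer order, surface to deep mixed layer.
observed = [6, 8, 13, 15, 18]
n = sum(observed)
expected = [n / len(observed)] * len(observed)   # uniform null hypothesis

# Running totals down the water column.
cum_observed = list(accumulate(observed))
cum_expected = list(accumulate(expected))

# Largest absolute gap between the two cumulative distributions.
d_max = max(abs(o - e) for o, e in zip(cum_observed, cum_expected))
print(cum_observed, cum_expected, d_max)
```

Unlike the chi-square statistic, this one changes if the counts are rearranged among the layers, because the running totals depend on the order.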

c. Which was the "better" test to use and why?

PROBLEM #3 TO BE DONE FOLLOWING TOPIC 3 [you can try it earlier, but you might find it a little confusing …].

3. The following data are frequencies of ferrets in two geographic areas,
with and without a particular disease.

a. Test the null hypothesis (use the .10 level of significance) that the
prevalence of the disease is the same in both areas. Do this in three
ways: compute X^2, X^2 (Yates), and X^2 (Cochran-Haber). Compare the three
values of the test statistic and comment on their relationship to each
other. The data are given below.

AREA       WITH DISEASE    WITHOUT DISEASE
Area 1          20               39
Area 2          16               51
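
A minimal Python sketch of the first two statistics for this 2 x 2 table, assuming the standard shortcut formula n(ad - bc)^2 / (R1 R2 C1 C2) and the Yates continuity correction that subtracts n/2 from |ad - bc| before squaring. The function name is hypothetical. The Cochran-Haber statistic is left out because its adjustment depends on the rule given in the course notes.

```python
def two_by_two_chi_square(a, b, c, d, yates=False):
    """Chi-square statistic for the 2 x 2 table [[a, b], [c, d]].

    With yates=True the continuity correction n/2 is subtracted from
    |ad - bc| (never letting the difference go below zero).
    """
    r1, r2 = a + b, c + d        # row totals
    c1, c2 = a + c, b + d        # column totals
    n = r1 + r2                  # grand total
    diff = abs(a * d - b * c)
    if yates:
        diff = max(diff - n / 2.0, 0.0)
    return n * diff ** 2 / (r1 * r2 * c1 * c2)

# Rows: Area 1, Area 2. Columns: with disease, without disease.
x2 = two_by_two_chi_square(20, 39, 16, 51)
x2_yates = two_by_two_chi_square(20, 39, 16, 51, yates=True)
print(f"X^2 = {x2:.4f}, X^2 (Yates) = {x2_yates:.4f}")
```

Because the correction shrinks |ad - bc|, the Yates value can never exceed the uncorrected one, which is a useful thing to keep in mind when comparing the three statistics.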

b. For the ferrets of Area 1, estimate the probability of disease from
the data.

c. For the ferrets of Area 2, estimate the probability of disease from
the data.

d. For the ferrets of the two Areas combined, estimate the probability of
disease from the data.
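
The estimates in parts b through d are sample proportions: diseased ferrets divided by all ferrets in the group. A short Python sketch, with hypothetical variable and function names, is:

```python
# Counts from the table in Problem 3.
with_disease = {"Area 1": 20, "Area 2": 16}
without_disease = {"Area 1": 39, "Area 2": 51}

def prevalence(diseased, healthy):
    """Sample proportion of diseased animals in a group."""
    return diseased / (diseased + healthy)

estimates = {
    area: prevalence(with_disease[area], without_disease[area])
    for area in with_disease
}
# Pool both areas by adding the counts, not by averaging the proportions.
estimates["Combined"] = prevalence(
    sum(with_disease.values()), sum(without_disease.values())
)

for area, p in estimates.items():
    print(f"{area}: {p:.4f}")
```

The combined estimate pools the raw counts, so it is weighted toward the area with more ferrets and need not equal the simple average of the two area estimates.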