QSCI 482/Dr. Conquest               TAs Kennedy, Malinick

 

HW1--DUE *THIS* FRIDAY, OCTOBER 4, 2002. ALSO, IF YOU DID NOT FILL OUT A

CLASS SURVEY SHEET ON DAY 1, PLEASE DO SO BY THE END OF THE WEEK.

 

1. Attach a copy of a recent photo of yourself; it will help me learn

everyone's names. I will return it if you wish. You don't have to be

smiling ["mug shots" are OK] but photo must be in good taste [no pictures

of your last skinny-dipping party, please. No "Full Monty's".]. Thanks!

 

2. In 1995 the Puget Sound Gillnetters Association conducted a field test

to test the effects of different kinds of fishing gear on alcid bird

entanglement (diving seabirds like rhinoceros auklet and common murre,

covered under the Migratory Bird Treaty Act).  First, a test was done to

see if the total bird entanglements followed a Poisson distribution with

mean mu=.09, since this is what was found from previous field seasons.

(That is, the overall entanglement rate is about .09 birds per net set.)

Out of 449 net sets (a fishing net set in the water and fishing for the

same specified period of time), 418 of them had 0 birds caught, 30 sets

had 1 bird caught, and 1 of the sets experienced 2-or-more birds caught.

At the .05 level of significance, do a goodness-of-fit test to test

whether the observed entanglements follow a Poisson distribution with mean

mu=.09, against the alternative hypothesis that they do not follow such a

distribution. What do you conclude?  [NOTE. The Poisson probability for a

particular count, X, is on p. 571 in Zar (in the 3rd ed. it's p. 569) and

is: exp(-mu)*(mu^X)/X!.]

 

H0: bird entanglement follows a Poisson distribution with mean 0.09

Ha: bird entanglement follows some other distribution

 

Assumptions: The data are an independent random sample; none of the expected values is less than 1, and no more than 20% of them are less than 5.

 

Critical value: Chisq0.05,2 = 5.991

 

Test Statistic:

num.entangled  observed.count  p(x)   expected.count  residual  Chisq
0              418             0.914  410.355          7.645    0.142424
1               30             0.082   36.932         -6.932    1.301097
2                1             0.004    1.713         -0.713    0.296731
Sums           449             1      449                       1.740252

 

Test statistic = 1.740

 

Decision: 1.740<5.991, therefore we fail to reject H0.  There is insufficient evidence to distinguish this data set from a Pois(0.09).
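The test statistic above can be checked with a short Python sketch. It assumes the last cell pools every count of 2 or more (so its probability is 1 minus the first two), and it takes the critical value 5.991 from the key rather than computing it. The variable and function names are illustrative only.

```python
import math

mu = 0.09                    # hypothesized Poisson mean (from the problem)
observed = [418, 30, 1]      # net sets with 0, 1, and 2-or-more birds
n_sets = sum(observed)       # 449 net sets in total


def poisson_pmf(x, mean):
    """Poisson probability exp(-mean) * mean^x / x!."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


# Cell probabilities; the last cell is the pooled "2 or more" tail.
p0 = poisson_pmf(0, mu)
p1 = poisson_pmf(1, mu)
probs = [p0, p1, 1.0 - p0 - p1]

expected = [n_sets * p for p in probs]
chisq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

critical_value = 5.991       # chi-square, alpha = 0.05, df = 2 (taken from the key)
reject_h0 = chisq > critical_value

print([round(e, 3) for e in expected])
print(round(chisq, 3), reject_h0)
```

Because mu is specified in advance rather than estimated from the data, the degrees of freedom are the number of cells minus 1, which is where df = 2 comes from.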

 

3. This exercise involves computing probabilities for the binomial

distribution (in Zar, the chapter titled "More on Dichotomous Variables",

or look under "binomial distribution" in any elementary statistics book).

According to a mathematician who works there, The Anchor Gaming Company is

the largest supplier of slot machines in the world. Let's consider a

simple slot machine and compute some EXPECTED VALUES OF OUTCOMES FOR A

BINOMIAL PROBABILITY DISTRIBUTION.  Suppose a slot machine has 4 slots,

each of which shows a picture of either a cherry or a pear, and the slots

work independently. For a given slot, the probability that a cherry shows

up is 0.4, and the probability that a pear shows up is 0.6. So for a given

"pull" of the machine, one can observe either: 4 cherries; 3 cherries and

1 pear; 2 cherries and 2 pears; 1 cherry and 3 pears; or 4 pears. Suppose

a given machine is "pulled" 10,000 times. How many times would we expect

to see:

 

a. 4 cherries

b. 3 cherries and 1 pear

c. 2 cherries and 2 pears

d. 1 cherry and 3 pears

e. 4 pears

 

a. Pr{4 cherries} = (.4)^4 = .0256; x 10,000 = 256 times.

b. Pr{3 cherries,1 pear} = 4 x (.4)^3 x .6 = .1536; x 10,000 = 1536

times.

c. Pr{2 cherries,2 pears} = 6 x (.4)^2 x (.6)^2 = .3456; x 10,000 = 3456

times.

d. Pr{1 cherry,3 pears} = 4 x .4 x (.6)^3 = .3456; x 10,000 = 3456

times.

e. Pr{4 pears} = (.6)^4 = .1296; x 10,000 = 1296 times.
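As a check on parts a through e, here is a minimal Python sketch that evaluates the binomial probability C(n,k) * p^k * (1-p)^(n-k) for each number of cherries and scales it by the 10,000 pulls. The names used are illustrative, not part of the assignment.

```python
from math import comb

n_slots = 4          # slots per pull
p_cherry = 0.4       # probability that a single slot shows a cherry
pulls = 10_000       # number of pulls of the machine


def binomial_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


# Expected number of pulls showing k cherries, for k = 4 down to 0.
expected_counts = {
    k: pulls * binomial_pmf(k, n_slots, p_cherry)
    for k in range(n_slots, -1, -1)
}

for k, count in expected_counts.items():
    print(f"{k} cherries, {n_slots - k} pears: {count:.0f}")
```

The five expected counts should add up to 10,000, which is a quick way to catch a missing binomial coefficient.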

 

The following problems (4-6, on the back side of this sheet) fall under the heading "algebra review". If you have not done algebra for a while, this will give you needed practice. If this stuff comes easily to you, give a classmate some help--we're all in this together.

 

 

4. You may recall that the formula for the standard error for a sample

mean X-bar is: Standard Error of X-bar = sqrt(sample variance/n), where n

is the sample size, and "sqrt" stands for "square root of". You are

reading a journal paper and trying to find out what the original sample

size was (which the authors failed to state, and the editors did not catch

the omission).  In a table, the authors state that "the standard error for

the data was 10.3 kg." Elsewhere in the paper, you find that the sample

variance was 1,697.44 kg^2. Now, solve for the original sample size, n.

 

10.3 = sqrt(1697.44/n)

10.3^2 = 1697.44/n

n = 1697.44/10.3^2

n = 16
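The same algebra can be checked numerically. This is a small Python sketch with illustrative variable names; it rearranges SE = sqrt(s^2/n) into n = s^2/SE^2 and then plugs the result back in.

```python
import math

standard_error = 10.3        # kg, as reported in the paper's table
sample_variance = 1697.44    # kg^2, as reported elsewhere in the paper

# Rearranged from SE = sqrt(variance / n).
n = sample_variance / standard_error ** 2
sample_size = round(n)

# Plug the sample size back in to confirm it reproduces the standard error.
check = math.sqrt(sample_variance / sample_size)

print(sample_size, round(check, 1))
```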

 

5. We have a random sample of n = 5 weights (kg) of a particular kind of

animal: 3.1, 3.4, 3.6, 3.7, 4.0. The sample mean, X-bar, is 3.56 kg.

Now compute the sample variance of these data, using the standard

"machine formula" calculation for the sample variance, s^2:

s^2 = {Sum([Xi^2]) - [(Sum[Xi])^2]/n}/(n-1)

 

Sum([Xi^2]) = 63.82;      [Sum(Xi)]^2 = 17.8^2 = 316.84; n = 5

 

s^2 = (63.82 - 316.84/5)/4 = 0.113
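For anyone who wants to verify the arithmetic, the machine formula can be sketched in Python as follows. The variable names are illustrative, and the comparison with statistics.variance from the standard library is added here only as a cross-check.

```python
import statistics

weights = [3.1, 3.4, 3.6, 3.7, 4.0]   # kg, from the problem
n = len(weights)

sum_x = sum(weights)                      # Sum(Xi)
sum_x_sq = sum(x ** 2 for x in weights)   # Sum(Xi^2)

# Machine formula: s^2 = {Sum(Xi^2) - [Sum(Xi)]^2 / n} / (n - 1)
s2 = (sum_x_sq - sum_x ** 2 / n) / (n - 1)

# Cross-check against the library's sample variance (n - 1 denominator).
s2_library = statistics.variance(weights)

print(round(sum_x_sq, 2), round(sum_x ** 2, 2), round(s2, 3))
```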

 

 

6. The following data (X) are known to come from a lognormal distribution;

that is, the natural logarithms of the data, Y = ln(X), will follow a

normal (bell-shaped) distribution. Here are the data in the original

units:  3.67, 4.01, 3.85, 3.92, 3.71, 3.88, 3.74, 3.82 ml.

 

a. Compute the sample mean of the log-transformed data.

Let yi = ln(xi)

ybar = sum(yi)/8 = 10.729/8 = 1.34

 

b. Compute the sample variance of the log-transformed data. Take the sqrt

to get the sample standard deviation.

 

s_y^2 = (sum(yi^2) - [sum(yi)]^2/n)/(n-1)

 = [14.396 - 10.729^2/8]/7 = 0.00089

s_y = sqrt(0.00089) = 0.0298
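Parts a and b can be reproduced with the Python sketch below, which log-transforms the data and applies the same machine formula. Variable names are illustrative. The code carries full precision, so intermediate sums agree with the key only to the number of digits shown there.

```python
import math

volumes_ml = [3.67, 4.01, 3.85, 3.92, 3.71, 3.88, 3.74, 3.82]

logs = [math.log(x) for x in volumes_ml]   # yi = ln(xi)
n = len(logs)

sum_y = sum(logs)
sum_y_sq = sum(y * y for y in logs)

y_bar = sum_y / n
s2_y = (sum_y_sq - sum_y ** 2 / n) / (n - 1)   # machine formula
s_y = math.sqrt(s2_y)

print(round(sum_y, 3), round(y_bar, 2))
print(round(s2_y, 5), round(s_y, 4))
```

The variance here is a small difference between two numbers near 14.4, so rounding the sums to three decimals before subtracting can noticeably change the answer; keeping the unrounded sums avoids that.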

 

c. It turns out that the 95% confidence interval (CI) for the mean of the

log-transformed data is xbar +- 2.365*(std. deviation/sqrt(n)). Compute the 95%

CI for the log-transformed data; then transform each of the endpoints back

to get the 95% CI in the original units [ml]. Comment upon the symmetry of

the CI for the log-transformed units [ln(ml)], and the back-transformed

units [ml].

 

1.34 +/- 2.365*0.0298/sqrt(8) = 1.34 +/- 0.0249

1.32<=mu(y)<=1.36

e^1.32 <= mu(x) <= e^1.36

3.74<=mu(x)<=3.90

xbar=3.825

 

The CI in the transformed units is inherently symmetric about the mean because we added and subtracted the same value from the mean to obtain the endpoints.  To assess symmetry in the original units, let’s measure the distance between the original mean and each of the lower and upper endpoints:

 

3.825-3.74 = 0.085

3.9-3.825 = 0.075

 

It appears that there is asymmetry in the CI for the original units, with the mean closer to the upper limit.  This is not surprising because we utilized a non-linear transformation.  That means that the relationship between the mean and the endpoints is not preserved when the data are back-transformed (it’s not simply a difference in scale).
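The interval and the symmetry check can be sketched in Python as below. The multiplier 2.365 is the one given in the problem; the variable names are illustrative. The sketch uses the definitional form of the sample variance (equivalent to the machine formula) and carries full precision, so the back-transformed endpoints and the two distances may differ in the second decimal place from values obtained by rounding the log-scale endpoints first.

```python
import math

volumes_ml = [3.67, 4.01, 3.85, 3.92, 3.71, 3.88, 3.74, 3.82]
t_crit = 2.365                       # multiplier given in the problem

n = len(volumes_ml)
logs = [math.log(x) for x in volumes_ml]
y_bar = sum(logs) / n
s_y = math.sqrt(sum((y - y_bar) ** 2 for y in logs) / (n - 1))

# 95% CI on the log scale: symmetric about y_bar by construction.
margin = t_crit * s_y / math.sqrt(n)
lower_log, upper_log = y_bar - margin, y_bar + margin

# Back-transform each endpoint to the original units (ml).
lower_ml, upper_ml = math.exp(lower_log), math.exp(upper_log)

# Distances from the mean of the original data to each endpoint.
x_bar = sum(volumes_ml) / n
dist_lower = x_bar - lower_ml
dist_upper = upper_ml - x_bar

print(round(lower_log, 4), round(upper_log, 4))
print(round(lower_ml, 3), round(upper_ml, 3))
print(round(dist_lower, 4), round(dist_upper, 4))
```

Printing the two distances side by side is the kind of explicit calculation the grading note asks for, and it also shows how sensitive the comparison is to where the rounding happens.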

 

Many people attributed the difference to rounding error, and in fact different results were seen due to rounding.  Points were subtracted if symmetry was commented on without any calculation to assess it (you can’t just say something is so; you have to show that it is).