QSCI381: Introduction to Probability and Statistics

Problem set #1

#1. The following scores were obtained by a group of 4 boys and 4 girls in a test:

girls: 35, 39, 38, 36;

boys: 38, 44, 19, 35;

Which of the following conclusions can be obtained from these figures by purely descriptive methods, and which require a generalization (statistical inference)?

  1. The girls' average is better than the boys' average.
  2. Girls make better students.
  3. A boy was the top scorer among all students.
  4. The score of 19 for one of the boys is probably a mistake.
  5. If the third boy's score had been 29, there would hardly be any difference between the two averages.
  6. Girls as a group tend to be more consistent achievers than the boys.
  7. The range of the girls' scores is much narrower than that of the boys' scores.
  8. The boys did not study hard enough for the test.
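The purely descriptive quantities behind statements like these can be computed directly from the eight scores. The following is a minimal Python sketch of that step; the `describe` helper and the variable names are illustrative assumptions, not part of the problem set. It only summarizes the data at hand and says nothing about which statements require inference.

```python
from statistics import mean

# Scores exactly as given in the problem statement.
girls = [35, 39, 38, 36]
boys = [38, 44, 19, 35]


def describe(scores):
    """Return simple descriptive summaries of a list of scores."""
    return {
        "mean": mean(scores),
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
    }


girls_summary = describe(girls)
boys_summary = describe(boys)

print("girls:", girls_summary)
print("boys:", boys_summary)
```

Any statement that cannot be checked against summaries of this kind goes beyond description of the observed scores.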

#2. One of the things to guard against when collecting data for statistical analysis is bias. The magnitude of a bias is difficult to measure, and it may well lead to wrong inferences. Explain why each of the following samples may give biased results:

a) In order to assess public attitudes toward family violence, 1000 names are randomly picked from phone books. Attempts are made to reach each listed person by phone, and the conclusions are based on the responses from those who were contacted.

b) To study attitudes toward personal cleanliness, a random sample of the campus student community is asked to state the number of times per week they take a bath.

c) To find out about the quality of campus food service, all those entering the campus cafeteria between 12 noon and 1 pm are questioned on the adequacy of campus food service.

d) In order to determine the percent survival in a forest plantation, sample plots are located on both sides of the only access road to that area.

e) In order to gauge the income level of the class of 1975, a written questionnaire is sent to all graduating students, with no provision for follow-up attempts to contact those who do not respond.

#3. In EXCEL or MINITAB, enter the following data in a column and name it 'RANDOM'.

04 60 67 89 32 95 55 35 57 86

30 81 02 18 87 68 28 44 86 84

a. Generate 4 columns of data as follows, and name these columns 'SQUARE', 'LOGE', 'LOGTEN' and 'SQRT' respectively.

SQUARE - square of data in column ‘RANDOM’

LOGE - natural log of data in column ‘RANDOM’

LOGTEN - logarithm to base 10 of data in column ‘RANDOM’

SQRT - square-root of data in column ‘RANDOM’
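For readers who want to cross-check their spreadsheet columns, the four transformations can also be sketched in Python. This is an assumed alternative to EXCEL or MINITAB, not a required tool; the dictionary `columns` and the variable `random_col` are illustrative names.

```python
import math

# The 20 values of column RANDOM, in the order given above.
random_col = [4, 60, 67, 89, 32, 95, 55, 35, 57, 86,
              30, 81, 2, 18, 87, 68, 28, 44, 86, 84]

# One derived column per transformation, keyed by the column name.
columns = {
    "RANDOM": random_col,
    "SQUARE": [x ** 2 for x in random_col],
    "LOGE": [math.log(x) for x in random_col],      # natural log
    "LOGTEN": [math.log10(x) for x in random_col],  # log to base 10
    "SQRT": [math.sqrt(x) for x in random_col],
}

# Show the first three entries of each column as a quick check.
for name, values in columns.items():
    print(name, [round(v, 3) for v in values[:3]], "...")
```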

b. Generate plots of:

SQUARE vs RANDOM; LOGE vs RANDOM; LOGTEN vs RANDOM.

c. Compute means of data in these columns, and do the following:

- compute mean of RANDOM

- compute the square root of the mean of SQUARE

- compute the antilogs of the means of LOGE and LOGTEN

- compute the square of the mean of SQRT

Are these values equal? Explain the differences in their values.
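The back-transformed means in part c can be cross-checked with a short Python sketch. Python is an assumed stand-in for the EXCEL or MINITAB steps, and all variable names are illustrative; the sketch only computes the five values and leaves the comparison and explanation to the reader.

```python
import math
from statistics import mean

# The 20 values of column RANDOM.
random_col = [4, 60, 67, 89, 32, 95, 55, 35, 57, 86,
              30, 81, 2, 18, 87, 68, 28, 44, 86, 84]

# Mean of RANDOM itself.
mean_random = mean(random_col)

# Square root of the mean of SQUARE.
root_mean_square = math.sqrt(mean([x ** 2 for x in random_col]))

# Antilogs of the means of LOGE and LOGTEN.
antilog_mean_loge = math.exp(mean([math.log(x) for x in random_col]))
antilog_mean_logten = 10 ** mean([math.log10(x) for x in random_col])

# Square of the mean of SQRT.
squared_mean_sqrt = mean([math.sqrt(x) for x in random_col]) ** 2

for label, value in [
    ("mean of RANDOM", mean_random),
    ("sqrt of mean of SQUARE", root_mean_square),
    ("antilog of mean of LOGE", antilog_mean_loge),
    ("antilog of mean of LOGTEN", antilog_mean_logten),
    ("square of mean of SQRT", squared_mean_sqrt),
]:
    print(f"{label}: {value:.4f}")
```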

 

d. Create a printout of your work. Make sure that you annotate and highlight the results in the computer output.