Libr 599: Methods of Research in Librarianship

MINITAB NOTES

[Based on Minitab Reference Manual, Release 7]

The following notes are designed to aid you in using Minitab for inputting and manipulating your raw data. They are meant to be a bridge from the raw questionnaire data to the production of statistical tests based on a matrix of data in Minitab. Hence it is assumed that you already possess a large sheaf of completed questionnaires, and that you know which tests to use.

Suppose that you are a long-suffering Mariners fan, and decide to investigate various aspects of the American League standings. You create the following questionnaire and administer it to each team in the AL West (Oakland, Chicago, Texas, California, Kansas City, Seattle and Minnesota) and AL East (Boston, Toronto, Milwaukee, Detroit, Baltimore, Cleveland and New York).

*****************************

Baseball Questionnaire

1. Check one: __ AL West __ AL East

2. Games won: __

3. Games lost: __

4. Games back: __

*****************************

One of your first acts is to create a code book that documents exactly how you are going to interpret your questionnaire data.

*****************************

Baseball Questionnaire Code Book:

Question 1. These are nominal data, coded 1 if AL West and 2 if AL East. [No provision is made for missing data in this example. However, you will probably have to account for missing or unusable responses.]

2/3. These are ratio data that will be entered as numbers.

4. These are ratio data, with the one complication that the leading club will not have a score. You decide to use * as the missing value indicator.

*****************************
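The code book rules can also be written down as a small checking routine, which is handy if you want to test a transcribed line before typing it in. What follows is a minimal Python sketch and not part of Minitab; the function name parse_row and the use of None for a missing value are assumptions made for illustration only.

```python
def parse_row(line):
    """Turn one transcribed line such as '1 80 48 *' into coded values."""
    location, won, lost, games_back = line.split()
    # Question 1: 1 = AL West, 2 = AL East; anything else is a coding error
    if location not in ("1", "2"):
        raise ValueError("location must be 1 or 2: " + line)
    # Question 4: '*' marks a missing value (the division leader has no score)
    gb = None if games_back == "*" else float(games_back)
    # Questions 2 and 3: ratio data entered as whole numbers
    return int(location), int(won), int(lost), gb


print(parse_row("1 80 48 *"))     # a division leader, games back missing
print(parse_row("1 73 54 6.5"))
```

A line that breaks the code book (for example, a location of 3) raises an error instead of slipping quietly into the data.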

Next you get a large piece of lined paper and draw some columns on it. Beginning with your first questionnaire, you enter the following on your paper:

Location  Won  Lost  Games back
1         80   48    *           [Oakland]
1         73   54    6.5         [Chicago]
1         66   63    14.5        Etc.
1         65   65    16
1         64   64    16
1         63   66    17.5
1         58   71    22.5
2         71   57    *
2         66   64    6
2         60   68    11
2         60   69    11.5
2         59   68    11.5        Etc.
2         59   69    12          [Cleveland]
2         55   73    16          [New York]

 

You check your work and then decide to commit the data to Minitab.

Getting data into Minitab

Sitting at a terminal running Minitab, you place a ruler under the first row of data, and give the command:

Read c1-c4

At the Minitab data prompt you enter the first row:

1 80 48 *

and press carriage return. At the new data prompt you enter the second line:

1 73 54 6.5

and so on entering each line in turn. Ultimately you have no more data to input and at the data prompt you type:

end

A good policy is to check your work, so you decide to print out your data matrix:

print c1-c4

and your data scrolls past you on the screen. [Depending on the terminal you use, there may be special conventions for producing a hard copy of your results. Check with the lab personnel.]

*******> DANGER! If you exit Minitab without saving your data matrix, your data are lost forever. Having invested all this time and energy, you want to save your data. You use the save command. [Depending on the terminal you use, there may be special conventions about the drive name. Hence the pathname used here may not be appropriate for you. In the following command, I'm saving my work to a diskette on drive A in the file AL.DAT.]

save 'a:al.dat'

Now remove your diskette, tuck your paper printout under your arm and leave to get a coffee.

The next time you want to use your data matrix, you begin Minitab, insert your diskette into drive A [or appropriate drive for the terminal you are using] and retrieve the matrix with the command:

retrieve 'a:al.dat'

Changing the data

Over coffee you discover two things: "c1" is not a very informative column name, and you typed in several wrong numbers.

Sitting at the terminal again, having retrieved your data, you decide to name the columns with these commands:

name c1 = 'location'

name c2 = 'won'

name c3 = 'lost'

name c4 = 'gb' [for games back]

Now when you issue the print c1-c4 command, the data are displayed with these more informative headings.

There are several changes you want to make in your data:

(a) The first row is incorrect and should read:

1 80 49 *

To change a single number you use the let command as follows:

Let c3(1) = 49

which indicates that the first number of column c3 should be 49.

(b) The last row of the matrix (that is, the fourteenth row) is completely wrong. It should read:

2 56 72 17

You decide to delete the whole row with the delete command:

Delete 14 c1-c4

and to insert the new data with the insert command:

insert c1-c4

2 56 72 17

end
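If it helps to see corrections (a) and (b) outside Minitab, here is a rough Python sketch of the same three steps on a list of rows. It is an illustration under assumptions, not a description of how Minitab works internally: the variable name rows is made up, None stands in for *, and the insert step is shown as adding to the end because that is how it is used above.

```python
# The matrix as first typed in; None stands for the '*' missing value.
rows = [
    [1, 80, 48, None], [1, 73, 54, 6.5], [1, 66, 63, 14.5], [1, 65, 65, 16],
    [1, 64, 64, 16], [1, 63, 66, 17.5], [1, 58, 71, 22.5],
    [2, 71, 57, None], [2, 66, 64, 6], [2, 60, 68, 11], [2, 60, 69, 11.5],
    [2, 59, 68, 11.5], [2, 59, 69, 12], [2, 55, 73, 16],
]

rows[0][2] = 49               # Let c3(1) = 49 (Minitab counts from 1, Python from 0)
del rows[13]                  # Delete 14 c1-c4
rows.append([2, 56, 72, 17])  # insert c1-c4, then the new row, then end

print(len(rows), rows[0], rows[-1])
```

The matrix still has fourteen rows afterwards; only the first and last rows have changed.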

******> DANGER! It is very important that you do not use the read command to input these new data. If you had issued the command

read c1-c4

you would instruct Minitab to write over the data already residing in these columns. This is not what you want to do. Of course if something disastrous does occur, you can always retrieve your data from your diskette and start over again.

Remember that all the changes you make are modifications to the version of your data residing in the main memory of the computer. The changes will not be written to your diskette until you use the save command again. Therefore permanent changes must be captured for future use by using the save command.

(c) Finally the effect of caffeine has given you insight into the baseball standings, to wit: there is a correlation between the number of syllables in the city name and the performance of the team!!! That is, you want to add another column to your data matrix to express "Oakland" as 2, "Chicago" as 3, "Texas" as 2, and so on. Your data occupy columns 1 through 4 leaving 46 free columns. You can set the new data in any of these unused columns with the set command:

set c5

2 3 2 5 4 3 4 2 3 3 2 3 2 2

end
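The one thing to watch with set is that the values must be in the same order as the rows of the matrix. The short Python sketch below pairs each value with its team to make that visible. It assumes the rows follow the order in which the teams were listed at the start of these notes (West first, then East); the names teams and by_team are invented for the example.

```python
# Assumed row order: the AL West teams, then the AL East teams, as listed earlier.
teams = ["Oakland", "Chicago", "Texas", "California", "Kansas City", "Seattle",
         "Minnesota", "Boston", "Toronto", "Milwaukee", "Detroit", "Baltimore",
         "Cleveland", "New York"]

# The values typed after "set c5", in the same order.
syllables = [2, 3, 2, 5, 4, 3, 4, 2, 3, 3, 2, 3, 2, 2]

# One value per row of the matrix, or the new column will not line up.
assert len(syllables) == len(teams)

by_team = dict(zip(teams, syllables))
print(by_team["Oakland"], by_team["Chicago"], by_team["Texas"])
```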

You name this new column "Syllable". Satisfied that you have set the stage for a successful data analysis, you save your data matrix again and log off.

Manipulating the data for various tests

Several days later you are ready to do some tests on your data. Sitting at a terminal running Minitab with your diskette in the drive, you retrieve your data matrix and print it out:

Location  Won  Lost  GB    Syllable
1         80   49    *     2
1         73   54    6.5   3
1         66   63    14.5  2
1         65   65    16    5
1         64   64    16    4
1         63   66    17.5  3
1         58   71    22.5  4
2         71   57    *     2
2         66   64    6     3
2         60   68    11    3
2         60   69    11.5  2
2         59   68    11.5  3
2         59   69    12    2
2         56   72    17    2

 

You are now ready to do any test that requires stacked data. But first consider how you are going to capture the results. If your strategy is "cut and paste", you can toggle on the printer for your terminal [check with the lab assistant]. You will produce paper results that you can then cut up and paste into a report. On the other hand, if your strategy is "digital copy", you can use the outfile command to create a log file that records everything that appears on the screen. [Note that the file declaration may be terminal dependent. Once again, check with the lab assistant. I am showing the declaration of a file in drive A. Be sure your diskette is in the drive.]

outfile = 'a:mydata.out'

This command directs a copy of everything shown at the screen to be reproduced in your file "mydata.out". Note that before exiting Minitab, you will have to close this log file with the "Nooutfile" command.

Suppose that you want to do an independent samples t test where Location is the independent variable [the categorizing variable in this case] and Won is the dependent variable:

TWOT for data in 'Won' and groups in 'Location';

Pooled.
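To see what the Pooled subcommand asks for, it may help to work the arithmetic by hand once. The Python sketch below computes a pooled-variance two-sample t statistic on the corrected Won data using the textbook formula. It is only an illustration of that formula, not Minitab's own code, and it stops short of a p-value; the variable names are assumptions.

```python
from math import sqrt
from statistics import mean, variance

# Stacked data: one Location code and one Won value per team.
location = [1] * 7 + [2] * 7
won = [80, 73, 66, 65, 64, 63, 58, 71, 66, 60, 60, 59, 59, 56]

west = [w for loc, w in zip(location, won) if loc == 1]
east = [w for loc, w in zip(location, won) if loc == 2]

n1, n2 = len(west), len(east)
df = n1 + n2 - 2
# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var = ((n1 - 1) * variance(west) + (n2 - 1) * variance(east)) / df
# t statistic: difference in means over its pooled standard error
t = (mean(west) - mean(east)) / sqrt(pooled_var * (1 / n1 + 1 / n2))

print("t =", round(t, 3), "df =", df)
```

Pooling assumes the two divisions have roughly equal variances; dropping the subcommand asks Minitab for the version that does not make that assumption.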

Suppose that for one reason or another, you were not completely satisfied with the independent samples t test for the Location/Won relationship and decide to do a Mann-Whitney test on this relationship instead. At this time, Minitab does not have a stacked version of the Mann-Whitney test so you must unstack your data. This can be done with the copy command:

Copy 'Won' into c20;

Use 'Location' = 1.

Copy 'Won' into c21;

Use 'Location' = 2.

The first copy command makes a copy of the data in Won where Location is "1" and puts it in column 20. The second copy command copies the data where Location is "2" into column 21. As a result you have:

C20  C21
80   71
73   66
66   60
65   60
64   59
63   59
58   56
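The two copy commands amount to splitting one stacked column by the value in another. As a hedged illustration only (the list names mirror the Minitab columns but are otherwise arbitrary), the same unstacking looks like this in Python:

```python
# Stacked data: Location says which group each Won value belongs to.
location = [1] * 7 + [2] * 7
won = [80, 73, 66, 65, 64, 63, 58, 71, 66, 60, 60, 59, 59, 56]

# Copy 'Won' into c20; Use 'Location' = 1.
c20 = [w for loc, w in zip(location, won) if loc == 1]
# Copy 'Won' into c21; Use 'Location' = 2.
c21 = [w for loc, w in zip(location, won) if loc == 2]

print(c20)
print(c21)
```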

 

Now that you have unstacked your data, you can issue the Mann-Whitney command:

Mann-Whitney on c20 and c21
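The Mann-Whitney test works on ranks rather than on the raw numbers. The Python sketch below shows one standard way to get the rank sum for the first column, giving tied values the average of the ranks they share, and from it the U statistic. It is meant as a worked illustration of the idea; the helper name midrank is made up, and the sketch does not reproduce Minitab's full output (no confidence interval or significance level).

```python
c20 = [80, 73, 66, 65, 64, 63, 58]   # Won, Location = 1
c21 = [71, 66, 60, 60, 59, 59, 56]   # Won, Location = 2

combined = sorted(c20 + c21)


def midrank(value):
    """Average of the 1-based positions that value occupies in the sorted pool."""
    first = combined.index(value) + 1
    last = first + combined.count(value) - 1
    return (first + last) / 2


w = sum(midrank(x) for x in c20)           # rank sum of the first sample
u = w - len(c20) * (len(c20) + 1) / 2      # U statistic for the first sample
print("W =", w, "U =", u)
```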

Suppose that you wish to do another analysis: the number of games lost by teams other than leaders. You need to have two columns, Lost and GB, but you want to remove teams with "*" in the GB column. To separate these data out from the matrix, leaving behind the rows of data where games back = *, you can use the copy command:

Copy 'Lost' 'GB' c20 c21;

Omit 'GB' = '*'.

The data in columns 20 and 21 would be shorn of the two division leaders who have asterisks in column GB:

C20  C21
54   6.5
63   14.5
65   16
64   16
66   17.5
71   22.5
64   6
68   11
69   11.5
68   11.5
69   12
72   17
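The point of Omit is that a row is dropped from both columns together, so Lost and GB stay paired. A small Python sketch of the same idea follows; None stands in for *, and the names kept, c20 and c21 are chosen only to mirror the Minitab columns.

```python
# Corrected Lost and GB columns; None stands for the '*' missing value.
lost = [49, 54, 63, 65, 64, 66, 71, 57, 64, 68, 69, 68, 69, 72]
gb = [None, 6.5, 14.5, 16, 16, 17.5, 22.5, None, 6, 11, 11.5, 11.5, 12, 17]

# Copy 'Lost' 'GB' c20 c21; Omit 'GB' = '*'.
# Each row is kept or dropped as a whole, so the two columns stay aligned.
kept = [(l, g) for l, g in zip(lost, gb) if g is not None]
c20 = [l for l, g in kept]
c21 = [g for l, g in kept]

print(len(c20), c20[0], c21[0])
```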

 

Finally you are ready to test your theory about the effect of syllables on team performance. You try various tests and come to realize that the number of syllables takes too many different values (it ranges from 2 to 5). Suppose that you want to compare those cities with two-syllable names against all the rest. In other words, you want to recode "3", "4" and "5" in the column Syllable to "9" [or any other number that you choose]. To recode a variable you use the code command:

Code (3 4 5) 9 'Syllable' c6

This command takes the values 3, 4, and 5 in the column Syllable and changes each to a 9. The result is placed in column 6; the column Syllable itself is left unchanged. Suppose you name this column 'New_Syll'. Your matrix now looks like this:

Location  Won  Lost  GB    Syllable  New_Syll
1         80   49    *     2         2
1         73   54    6.5   3         9
1         66   63    14.5  2         2
1         65   65    16    5         9
1         64   64    16    4         9
1         63   66    17.5  3         9
1         58   71    22.5  4         9
2         71   57    *     2         2
2         66   64    6     3         9
2         60   68    11    3         9
2         60   69    11.5  2         2
2         59   68    11.5  3         9
2         59   69    12    2         2
2         56   72    17    2         2
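Recoding is simply a value-by-value replacement that writes its result to a new column. For comparison, here is the same recode as a minimal Python sketch; the list names are assumptions that echo the column names used above.

```python
syllable = [2, 3, 2, 5, 4, 3, 4, 2, 3, 3, 2, 3, 2, 2]

# Code (3 4 5) 9 'Syllable' c6: values 3, 4 and 5 become 9, everything else is kept.
# The original list is not modified; the result goes into a new one.
new_syll = [9 if s in (3, 4, 5) else s for s in syllable]

print(new_syll)
print(new_syll.count(2), "two-syllable cities,", new_syll.count(9), "others")
```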

 

Two parting admonitions: Be sure to save any new matrix of data, and be sure to close your log file with the command:

Nooutfile

[Note the lack of spacing in this command.] This will close your results file. You can then edit the results file, and in this fashion embed your statistical results into your report.