CREATING
EFFICIENT AND SYSTEMATIC CATALOGS:
LUBETZKY'S SECOND OBJECTIVE AND EMPIRICAL INVESTIGATIONS OF AUTHORS AND
WORKS
by
ALLYSON CARLYLE
1. Introduction
The intellectual challenge stimulated by the study of descriptive
cataloging is matched by few topics in library and information science. As a student entering library school at UCLA
in 1982, I had no idea that cataloging would provide such a rich area of
study. I was, however, quickly
enlightened, and my life changed as a result.
Seymour Lubetzky was largely responsible for
this change in that it was his work, which has provided the conceptual
foundation of the study of descriptive cataloging in this century,
that fascinated me first. I was
fortunate to have been introduced to descriptive cataloging by Betty Baughman,
who worked with Seymour Lubetzky in the development
of the cataloging courses at UCLA. The
combination of her excellent teaching and the challenges posed by Lubetzky's analysis of cataloging problems drew me and my
research to the heart of descriptive cataloging.
Central to Lubetzky's thought is the notion
of a catalog as "a systematically designed instrument in which all
entries, as component parts, must be properly integrated." (Lubetzky,
1969: p. 3). One important means
by which a catalog becomes such an instrument is by meeting the second
objective of the catalog. This
objective, most clearly articulated by Lubetzky
in his Code of Cataloging Rules: Author and Title Entries. An Unfinished Draft, states that a
catalog must "relate and display together the editions which a library has
of a given work and the works which it has of a given author." (Lubetzky, 1960: p. ix.)
In this paper I briefly review my research, which focuses on the
second objective, in particular the organization
of author and work records in online catalogs.
My dissertation (1994), published in summary form in the Journal of the American Society for
Information Science (1996), "Ordering Author and Work Records: An Evaluation of Collocation in Online
Catalog Displays," examines the effect of various system features on the
collocation of author and work records in online catalogs. "Fulfilling the Second Objective in the
Online Catalog: Schemes for Organizing
Author and Work Records Into Usable Displays", published in Library Resources & Technical Services
(1997), investigates how codes of filing rules and Barbara Tillett's
bibliographic relationship taxonomy (1991) may be used to help organize author
and work records more effectively in online catalogs. "The Role of Classification in the
Creation of Author and Work Displays in Online Catalogues," delivered at
the Sixth International Study Conference on Classification Research
(1997), investigates the methods by
which library classification schemes have organized author and work
records. Current research,
"User Categorisation of Works: Toward
Improved Organisation of Online Catalogue Displays" (in press), looks at the
characteristics people use for grouping the editions and works related to a
particular work.
2. Ordering Author and
Work Records: An Evaluation of
Collocation in Online Catalog Displays
2.1 Introduction
Computerization has vastly expanded the
catalog's power to retrieve records. It
has, at the same time, confounded catalog designers' attempts to create
sensible displays. In my search for a
dissertation topic, I was intrigued that no study had ever been done to
determine how well a catalog of any kind fulfilled the second objective. I
also suspected that the
computerization of catalogs had an effect on the ability of catalogs to
fulfill the second objective. As a result, I decided that my dissertation
research would consist of a survey of online catalogs posing the question: What is the effect of online catalog features
on the collocation of author and work records in online catalogs?
Several measures were used to analyze the
effect of online catalog system features on collocation in online
catalogs. Five worst-case authors
(Homer, William James, H.D., Alice Walker, and Peter Gray) were searched in
eighteen sample online catalogs using author commands available in those
catalogs. Five worst-case works (Paradise Lost, John Milton; A Christmas Carol, Charles Dickens; Ulysses, James Joyce; Sonnets, William Shakespeare; Utopia, Sir Thomas More) were searched
in the same catalogs using title commands. The worst-case method was used
because worst cases were seen as more likely to bring out the weaknesses of
online catalogs than a random sample of searches.
Dependent variables used to measure the
collocation of author and work records included number of interruptions of
author and work record sets, number of irrelevant intervening records retrieved, and
precision. Independent variables
representing system features analyzed for their effects on collocation included
match type (character-string vs. keyword) and catalog size (large, medium, or
small).
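The three dependent measures can be made concrete with a short sketch. The definitions below are assumptions made for illustration, not the operational definitions used in the study: a display is modeled as a list of booleans (True for a record that belongs to the author or work record set sought), an interruption is one unbroken run of non-member records falling between the first and last member records, irrelevant intervening records are the records in those runs, and precision is the proportion of retrieved records that belong to the set. The function name collocation_measures is hypothetical.

```python
from typing import Dict, List


def collocation_measures(display: List[bool]) -> Dict[str, float]:
    """Compute illustrative collocation measures for one retrieved display.

    display[i] is True when the i-th displayed record belongs to the
    author or work record set being sought (an assumed encoding).
    """
    positions = [i for i, is_member in enumerate(display) if is_member]
    if not positions:
        # Nothing relevant was retrieved, so there is no set to interrupt.
        return {"interruptions": 0, "irrelevant_intervening": 0, "precision": 0.0}

    # Only records between the first and last member can interrupt the set.
    span = display[positions[0]:positions[-1] + 1]

    # Each member-to-non-member transition starts one run of interrupting records.
    interruptions = sum(1 for prev, cur in zip(span, span[1:]) if prev and not cur)
    irrelevant_intervening = span.count(False)
    precision = len(positions) / len(display)

    return {
        "interruptions": interruptions,
        "irrelevant_intervening": irrelevant_intervening,
        "precision": precision,
    }


# Hypothetical display of eight records, four of which belong to the set.
example = [True, True, False, True, False, False, True, False]
measures = collocation_measures(example)
print(measures)
```

In this example the record set is interrupted twice, by three intervening records in all, and four of the eight retrieved records belong to the set.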
The authors and works selected for sample
searches, as mentioned above, were examples of worst cases. Worst cases were defined as authors and works
that had many records in online catalogs and that were represented by a variety
of record types. Sample catalogs were
selected based on catalog vendor (major vendors were selected), availability
via Internet, size (three sizes of catalog per vendor were selected: small, medium, and large), and collection
characteristics (at least 75 percent of retrospective conversion to online
catalog complete; English language, general library collection; located in the
United States).
2.2 Selected Results and
Discussion
Descriptive statistics were used to analyze
the data collected in this study, because the selection of worst-case searches and
sample catalogs was not random. Because
sample sizes were relatively small and many of the standard deviations were
large, the median was reported instead of the mean.
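The choice of the median over the mean can be illustrated with a small sketch. The interruption counts below are invented for illustration and are not data from the study; they show how a single extreme search inflates the mean and the standard deviation while leaving the median unchanged.

```python
import statistics

# Hypothetical interruption counts for seven searches; one search is an extreme case.
interruptions = [0, 1, 2, 2, 3, 4, 110]

mean = statistics.mean(interruptions)
median = statistics.median(interruptions)
spread = statistics.stdev(interruptions)

# The mean is pulled far above the typical search; the median is not.
print(f"mean={mean:.2f} median={median} stdev={spread:.2f}")
```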
Match type, that is, left-to-right phrase matching (here called
character-string matching) versus keyword matching, was investigated for its
effect on collocation. Character-string
matches performed better than keyword matches for most searches. When author searches were measured using
number of interruptions, the results were dramatic (Figure 1). Only ten percent of character-string matches
had three or more interruptions. The
median number of interruptions for keyword matches (3.5) was greater than the
median for character-string matches (2.0).
That most character-string matches had two or fewer interruptions
indicated that in most cases character-string matches accomplished the goal of
collocating author record sets.
[Figure 1: Match Type
(Authors): No. of Interruptions about
here]
The effect of match type on number of interruptions in work record sets
was less clear, although character-string matches still performed better than
keyword matches (Figure 2). Seventy
percent of character-string matches for work record sets had three or more
interruptions, as opposed to 83 percent of keyword matches. The median number of interruptions for
character-string matches was four, and for keyword matches it was five.
[Figure 2: Match Type
(Works): No. of Interruptions about
here]
These results were not unexpected, particularly with respect to author
record sets. Character-string matches
matched a search string in a single field.
Most character-string matches arranged records with identical author
headings together, ensuring collocation of author record sets. Although author commands with
character-string matches were available in every catalog surveyed, author
commands with keyword matches were not.
This was perhaps because systems designers assumed the superiority of
character-string matches for searching for individual authors. The results of this research support such an
assumption.
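The difference between the two match types can be sketched as follows. This is a simplified illustration, not a description of how any surveyed catalog worked: it assumes that a character-string match compares the search string, left to right, with the beginning of a single field, and that a keyword match succeeds when every search word occurs anywhere in the record. The function names and sample records are hypothetical.

```python
import re
from typing import Dict, List


def tokens(text: str) -> List[str]:
    """Lowercase the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def character_string_match(query: str, field_value: str) -> bool:
    """Left-to-right phrase match against the start of a single field."""
    return " ".join(tokens(field_value)).startswith(" ".join(tokens(query)))


def keyword_match(query: str, record: Dict[str, str]) -> bool:
    """Every query word occurs somewhere in the record, in any field and order."""
    record_words = {word for value in record.values() for word in tokens(value)}
    return all(word in record_words for word in tokens(query))


# Invented sample records, for illustration only.
records = [
    {"author": "James, William", "title": "Pragmatism"},
    {"author": "James, Henry", "title": "Letters to William"},
]

author_hits = [r["title"] for r in records
               if character_string_match("James, William", r["author"])]
keyword_hits = [r["title"] for r in records
                if keyword_match("William James", r)]

print(author_hits)
print(keyword_hits)
```

Under these assumptions the character-string search retrieves only the record whose author heading begins with the search string, while the keyword search also retrieves a record by a different author, because the search words are scattered across its author and title fields.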
The finding that character-string matches performed nearly as poorly as
keyword matches in collocating worst-case work record sets was somewhat
unexpected. However, because uniform
titles for works are not required in AACR2,
it was not surprising that work records did not collocate well. Also, a work is, by definition, determined by
the contents of two fields, an author and a title field, as opposed to an
author, which is determined by the contents of a single field, an author
field. The level of complexity
engendered by the additional field, the title field, may itself have had an
effect on record arrangement. This
variable, record structure, was not studied. What was unexpected was that
character-string matches differed so little from keyword matches in achieving
collocation, especially considering that keyword matches often arranged records
in essentially random (record number) order, and character-string title matches
almost always arranged records in alphabetical order by title.
Catalogs searched were selected to be representative of catalog
databases of different sizes. Small
catalogs contained fewer than 300,000 bibliographic records, medium catalogs
contained between 300,000 and 999,999 bibliographic records, and large catalogs
contained more than 1,000,000 records.
Catalog size had a much smaller effect on collocation of author and
work records than may have been expected; only about half of the results in the
study showed an impact. Catalog size had
the smallest effect on collocation of author record sets. Measured by the number of irrelevant records
intervening in an author record set (no. irrelevant intervening records), very
little effect was seen (Figure 3).
Although catalog size had some effect on the collocation of work record
sets, when measured by precision, that effect was negligible (Figure 4).
[Figure 3: Catalog Size
(Authors): No. Irrelevant Intervening
Records about here]
[Figure 4: Catalog Size
(Works): Precision about here]
The finding that catalog size had an effect on collocation only in
about half the searches performed was surprising for two reasons. First, two of the measures used in the study,
number of interruptions and number of irrelevant intervening records, were
directly related to the number of records retrieved. One would expect that increasing numbers of
records would be retrieved in small, medium, and large catalogs, respectively,
and that measures based on record numbers would reflect that increase. Second, it is reasonable to expect that the number
of interruptions would increase as the number of records retrieved increases, and
that precision would be lower in a large catalog search than a small catalog
search.
That author record sets were little affected by catalog size was
perhaps not so surprising when one examines the performance of the system
variables. For author searches, the
variables that determined collocation most strongly were those associated with
match type; character-string matches collocated author
records more successfully than did keyword matches. Since catalogs of all sizes had
character-string and keyword matches, one might predict that catalog size would
not be an important factor.
The finding that catalog size did not have a powerful influence on
collocation has implications for catalog maintenance and cataloging policies in
small and medium-sized catalogs.
Cataloging folklore holds that collocation is better in smaller
catalogs because not as many records exist to interrupt a related record
set. For example, Anglo-American Cataloguing Rules, 2nd ed., 1988 Revision (AACR2)
Rule 1.0.D, which gives catalogers the option of three different levels of
description, and Rule 25.1.A, which allows catalogers the option not to use
uniform titles, are evidence that smaller catalogs have been seen as having
different requirements from larger catalogs.
The findings of this research indicate that this assumption may be
incorrect. Smaller catalogs may not be
exempt from collocation problems, particularly for worst cases. They may require use of uniform author names
and uniform titles as much as a larger catalog.
3. Fulfilling
the Second Objective in the Online Catalog:
Schemes for Organizing Author and Work Records Into Usable Displays
3.1 Introduction
A study of filing rules as schemes for display came about as a result
of both my dissertation research and Lubetzky's
perception of the importance of the catalog as a "systematically designed
instrument." I had attempted a
study of filing rules for my doctoral qualifying exam, which, for various
reasons, did not work out. However, an
initial analysis of filing rules informed my thinking as I collected data for
my dissertation, which in turn informed my thinking further with respect to
filing rules. What followed from this
interaction is the research described in this section.
3.2 Filing Rules as Schemes for
Display
Catalog displays as constructed by codes of filing rules are frequently
highly organized, consisting of a variety of categories or groups of similar
records. An historical analysis of codes
of filing rules discovered the following common categories:
Work Categories
• editions of the work in the original language
• analytics, that is, editions of the work contained within
collections
• translations
• special classes of materials, including selections and
manuscripts
• works about the work
Author Categories
• complete works
• selected works
• selections from a single work or from various works
• single works
• spurious and doubtful works
• works about the author
While these categories may be used to create systematic displays in
online catalogs, they fall short with respect to works, particularly when
viewed in the light of Tillett's bibliographic
relationship theory (1991). They do not,
for instance, clearly distinguish or identify works related to a work, nor do
they clearly distinguish or identify sequential relationships.
3.3 Tillett's
Taxonomy of Bibliographic Relationships as a Scheme for Display
Tillett's taxonomy of bibliographic relationships
(1991) and Smiraglia's refinement of the derivative
relationship (1992) were analyzed for their potential contribution to the
creation of systematic displays of work records in the catalog. The analysis suggested the following
interpretation of the bibliographic relationships taxonomy to be used as a
basis of a scheme for display organization:
• equivalence relationships, including:
    • equivalent texts, which share identical content and authorship
    • near equivalents, which in addition to identical content and authorship,
      share other characteristics as well
• derivative relationships, including:
    • revisions
    • adaptations
    • translations
    • extractions
    • amplifications
• whole-part relationships
• sequential relationships
• descriptive relationships
• shared characteristic relationships
Using this scheme as a basis for online catalog work displays also has
limitations, particularly the lack of a distinction between derivations whose
intellectual or artistic content is close to the original edition and those
whose intellectual or artistic content is not.
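A minimal sketch of how the top-level relationship types listed above might drive a grouped display follows. It assumes that each record has already been assigned exactly one relationship type with respect to the work searched, which the taxonomy itself does not guarantee; the function name, the sample titles, and the alphabetical ordering within groups are hypothetical choices made for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Top-level relationship types from the list above, in display order.
DISPLAY_ORDER = [
    "equivalence",
    "derivative",
    "whole-part",
    "sequential",
    "descriptive",
    "shared characteristic",
]


def group_for_display(records: List[Tuple[str, str]]) -> List[Tuple[str, List[str]]]:
    """Group (title, relationship type) pairs into ordered display categories."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for title, relationship in records:
        if relationship not in DISPLAY_ORDER:
            raise ValueError(f"unknown relationship type: {relationship}")
        groups[relationship].append(title)
    # Emit only the categories that actually contain records, in scheme order.
    return [(rel, sorted(groups[rel])) for rel in DISPLAY_ORDER if rel in groups]


# Invented records related to one work, for illustration only.
sample = [
    ("Criticism of the work", "descriptive"),
    ("Translation of the work", "derivative"),
    ("Facsimile reprint of the work", "equivalence"),
    ("Adaptation of the work", "derivative"),
]

display = group_for_display(sample)
for category, titles in display:
    print(category, titles)
```

The sketch also makes the limitation noted above visible: the translation and the adaptation fall into the same derivative group, although one stays close to the original content and the other does not.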
3.4 A Relationship-Based,
Organized Scheme for Display of Author and Work Records in the Online Catalog
Analysis of the strengths of the filing rules scheme and the
bibliographic relationships scheme led to the proposal of a new, organized
scheme for display of author and work records in online catalogs based on
relationships among items. The proposed
scheme also incorporated records that could be retrieved in keyword searching
which might or might not be related to the author or work searched, including
records for items which might be only peripherally related to them (see Figures
5 and 6).
[Figure 5: An Organized Display
for Works about here]
[Figure 6: An Organized Display
for Authors about here]
4. The Role of
Classification in the Creation of Author and Work Displays in Online Catalogues
Other organizational schemes that could be used to improve online
catalog displays for author and work records are library classifications. Library classification schemes such as the Universal Decimal Classification (UDC), the
Library of Congress Classification (LCC)
and the Dewey Decimal Classification
(DDC) organize authors and works associated with many items into specific
classes, each with its own notation. In
this research I analyzed the types of classes used in selected religion and
literature schedules and in auxiliary tables in the UDC, the DDC, and the LCC.
Classes identified correspond closely to the types of groupings created
by the codes of catalog filing rules.
Commonly occurring classes for authors in the classification schemes
included:
• complete works of the author
• partial collections or selected works
• individual works
• biographies, criticism, concordances, etc.
Commonly occurring classes for works in the classification schemes
included:
• editions in the original language, sometimes including
groups for early versions, translations, annotated editions, and sequels
• translated editions, sometimes including a special group for bilingual editions
• auxiliary materials, including concordances, indexes, dictionaries,
sources
• parts or selections
• adaptations, paraphrases
• works about the work, including history, commentary,
criticism, etc.
Classification numbers, possibly in combination with book numbers such
as Cutter numbers, which further refine the groupings of authors and works on
the shelf, might be used to create summary or grouped author and work displays
automatically in online catalogs.
Further research is necessary to determine whether or not automatic grouping
would organize records successfully.
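One way such automatic grouping might work is sketched below. It is an illustration only, under the assumption that each record carries a call number whose first space-delimited part is the classification number and whose remainder is a book number; the call numbers are made up and do not come from any actual schedule, and the function names are hypothetical.

```python
from itertools import groupby
from typing import Dict, List, Tuple


def class_number(call_number: str) -> str:
    """Return the classification part of a call number.

    Assumes a non-empty call number whose class number precedes the first space.
    """
    return call_number.split()[0]


def group_by_class(records: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (call number, title) pairs by class number, in shelf order."""
    # Sort by class number, then by full call number so book numbers refine the order.
    ordered = sorted(records, key=lambda rec: (class_number(rec[0]), rec[0]))
    return {
        cls: [title for _, title in members]
        for cls, members in groupby(ordered, key=lambda rec: class_number(rec[0]))
    }


# Made-up call numbers: a class number followed by a book number.
sample = [
    ("AB200 .C5", "Work B, an edition"),
    ("AB100 .A2", "Collected works"),
    ("AB200 .C3", "Work B, another edition"),
    ("AB300 .Z9", "Criticism of the author"),
]

grouped = group_by_class(sample)
for cls, titles in grouped.items():
    print(cls, titles)
```

Whether groups formed this way would correspond to useful display categories is exactly the open question; the sketch shows only that the grouping step itself is mechanical once class numbers are present.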
5. User Categorisation of Works: Toward Improved Organisation of Online Catalogue Displays
The design of online information systems, including online catalogs,
should respond effectively to user needs and searching behavior. The last research project I review here
investigated how people organize items related to a work. In this research project, fifty study
participants were solicited in a shopping mall in
Written descriptions were analyzed using content analysis to discover
the types of characteristics that were used to sort items in the study. Eleven types of characteristics were
discovered. In the list below, types of
characteristics are listed with sample participant descriptions in parentheses
following each type.
• physical format (hard back books, VCR tapes, little kid tapes)
• audience (youth, sight impaired, grown up people, piano players)
• content description (play, more involved plots with more details, short version)
• pictorial elements (animated, cartoon pictorial, had a mans face on the front, color artwork, dull covers)
• usage (could be read by small group for presentation, theater, for relaxation, fun, dull)
• language (foreign language, Spanish, non-English)
• physical characteristics (medium size, largest books, thick hard bind)
• content age, integrity (unabridged, abbreviated versions, classic, original text-line)
• textual characteristics (big print, book [sic.] that say Scrooge on them)
• creator, performer (produced other than Charles Dickens, Disney type story, adapted by other author's take from original)
• 'odds & ends' (alone!, miscellaneous, other)
The grouping data were analyzed using cluster analysis (this
part of the study has not yet been published). Preliminary analysis indicated
the common groups listed below:
• audios (cassettes and CDs)
• children's videos
• adult videos
• large format paperbacks
• small format paperbacks
• foreign language materials
• adult hard cover materials
• illustrated hard cover materials (children's)
• trivia book
• picture book versions with lots of text
• picture book versions with not much text
• activity versions (piano book, Advent calendar)
• item about A Christmas Carol
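The clustering procedure itself is not described here, so the following is only a sketch of a preparatory step commonly used with sorting data: counting how often each pair of items was placed in the same group across participants. The resulting counts could serve as a similarity measure for a cluster analysis. The function name, the item labels, and the two participants' groupings are hypothetical.

```python
from collections import Counter
from itertools import combinations
from typing import List, Tuple

# One participant's sort is a list of piles; each pile is a list of item labels.
Sort = List[List[str]]


def co_occurrence(sorts: List[Sort]) -> Counter:
    """Count, for every pair of items, how many participants grouped them together."""
    counts: Counter = Counter()
    for piles in sorts:
        for pile in piles:
            # Sort labels so each pair is always stored under the same key.
            for pair in combinations(sorted(pile), 2):
                counts[pair] += 1
    return counts


# Invented groupings from two participants, for illustration only.
sorts = [
    [["audio cassette", "compact disc"], ["video", "picture book"]],
    [["audio cassette", "compact disc", "video"], ["picture book"]],
]

pairs = co_occurrence(sorts)
for pair, count in pairs.most_common():
    print(pair, count)
```

Pairs grouped together by many participants, such as the two audio formats in this invented example, would be candidates for a common display group.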
The research
described here is exploratory; the findings indicate possible types of groups
and characteristics that may be useful for organizing online catalog
displays. Every work is unique, and the
group of editions and works about a particular work is equally unique and
individual. Further research using
different works, and different types of works, for example, non-literary works,
is necessary to make generalizations regarding what groups and types of
characteristics are commonly associated with people's perceptions of
works. In addition, further research is
required to determine the impact of grouping in online catalog displays
on user searching, and whether or not grouping based on user categories is more
effective than grouping based purely on relationships among items, such as that
suggested by the filing rules and bibliographic relationships taxonomy
analysis.
6. Conclusion
The extent to which a library catalog fulfills the second objective
affects catalog users every day. In my
doctoral program at UCLA, I supported myself by working as a librarian at the
Beverly Hills Public Library. One day at
the information desk I got a long distance telephone call from a woman in
I submit that it should not take five telephone calls to libraries in
two states and a cataloger at the information desk for a patron to find an
edition of a work held commonly in American libraries. The catalog should be, as the Paris
Principles state, "an efficient instrument for ascertaining ... which
works by a particular author and which editions of a particular work are in the
library." It should not be a roadblock, preventing users and librarians
alike from finding the items they seek. Lubetzky's life's work was to make library catalogs
efficient and systematically designed instruments; unfortunately, as the George
Sand story illustrates all too clearly, much has yet to be done to accomplish
this task.
What stimulates Lubetzky's work also inspires
my own: a recognition of the potential
of library catalogs to be effective and intelligible instruments which help
users to discover valuable resources that they may not have known about before,
and which show the relationships present among the items held in the library
clearly and unambiguously. I am grateful
for Lubetzky's invaluable contribution to the literature
and theory of cataloging. It sparked my
interest in a topic that has become the center of my professional life, and
it remains a source of inspiration and guidance. I am proud to follow in the footsteps of the
most important cataloging scholar of this century.
7. List of Cited Works
Anglo-American cataloguing rules, 2nd edition, 1988 revision. (1988).
Carlyle, Allyson (1996). "Ordering Author and Work Records: An Evaluation of Collocation in Online Catalog Displays." Journal of the American Society for Information Science. 47 (7): 538-554.
----- (1997). "Fulfilling the Second Objective in the Online Catalog: Schemes for Organizing Author and Work Records into Usable Displays." Library Resources & Technical Services. 41 (2): 79-100.
----- (1997). "The Role of Classification in the Creation of Author and Work Displays in Online Catalogues." In Knowledge Organization for Information Retrieval: Proceedings of the Sixth International Study Conference on Classification Research, held at
----- (In press). "User Categorisation of Works: Toward Improved Organisation of Online Catalogue Displays." Journal of Documentation.
Lubetzky,
----- (1969). Principles of Cataloging, Final Report, Phase I: Descriptive Cataloging.
Smiraglia, Richard. (1992). Authority Control and the Extent of Derivative Bibliographic Relationships. Ph.D. diss.,
Tillett, Barbara B. (1991). "A Taxonomy of Bibliographic Relationships." Library Resources & Technical Services. 35 (2): 150-158.

Distribution of Match Type (Authors): No. of Interruptions

Interrupts | Chr.Str. | Percent | Keyword | Percent
0 to 2     | 81       | 90%     | 26      | 42%
3 to 5     | 7        | 8%      | 8       | 13%
6 to 8     | 2        | 2%      | 4       | 6%
9 to 11    | 0        | 0%      | 0       | 0%
12 up      | 0        | 0%      | 24      | 39%

Statistics for Match Type (Authors): No. of Interruptions

STATISTIC      | CHAR.STRING | KEYWORD
Mean           | 1.71        | 17.15
Standard Error | .12         | 2.91
Median         | 2.00        | 3.50
Mode           | 2.00        | 2.00
Standard Dev.  | 1.18        | 22.94
Variance       | 1.40        | 526.03
Range          | 7.00        | 110.00

Figure 1--Match Type (Authors): No. of Interruptions

Distribution of Match Type (Works): No. of Interruptions

Interrupts | Chr.Str. | Percent | Keyword | Percent
0 to 2     | 27       | 30%     | 15      | 17%
3 to 5     | 36       | 40%     | 37      | 41%
6 to 8     | 18       | 20%     | 17      | 19%
9 to 11    | 6        | 7%      | 8       | 9%
12 up      | 3        | 3%      | 13      | 14%

Statistics for Match Type (Works): No. of Interruptions

STATISTIC      | CHAR.STRING | KEYWORD
Mean           | 4.67        | 6.49
Standard Error | 0.36        | 0.60
Median         | 4.00        | 5.00
Mode           | 2.00        | 3.00
Standard Dev.  | 3.42        | 5.73
Variance       | 11.66       | 32.81
Range          | 19.00       | 36.00

Figure 2--Match Type (Works): No. of Interruptions

Distribution of Catalog Size (Authors): No. Irrel. Int. Recs.

Int. Recs. | Small | Percent | Medium | Percent | Large | Percent
0 - 9      | 40    | 84%     | 38     | 78%     | 38    | 70%
10 - 19    | 0     | 0%      | 0      | 0%      | 4     | 7%
20 - 29    | 0     | 0%      | 0      | 0%      | 0     | 0%
30 - 39    | 0     | 0%      | 1      | 2%      | 0     | 0%
40 - 49    | 0     | 0%      | 1      | 2%      | 1     | 2%
50 up      | 8     | 16%     | 9      | 18%     | 11    | 20%

Statistics for Catalog Size (Authors): No. Irrel. Int. Recs.

STATISTIC      | SMALL   | MEDIUM   | LARGE
Mean           | 26.96   | 84.98    | 187.46
Standard Error | 10.10   | 31.42    | 65.08
Median         | 0.00    | 0.00     | 0.00
Mode           | 0.00    | 0.00     | 0.00
Standard Dev.  | 70.69   | 219.93   | 478.24
Variance       | 4996.91 | 48369.94 | 228717.61
Range          | 366.00  | 956.00   | 2101.00

Figure 3--Catalog Size (Authors): No. Irrelevant Intervening Records

Distribution of Catalog Size (Works): Precision

Precision | Small | Percent | Medium | Percent | Large | Percent
.8 - 1    | 2     | 3%      | 0      | 0%      | 0     | 0%
.6 - .79  | 2     | 3%      | 2      | 3%      | 0     | 0%
.4 - .59  | 5     | 8%      | 3      | 5%      | 6     | 10%
.2 - .39  | 13    | 22%     | 13     | 22%     | 20    | 33%
0 - .19   | 38    | 63%     | 42     | 70%     | 34    | 57%

Statistics for Catalog Size (Works): Precision

STATISTIC      | SMALL | MEDIUM | LARGE
Mean           | .22   | .17    | .19
Standard Error | .03   | .02    | .02
Median         | .14   | .12    | .17
Mode           | .17   | .13    | .17
Standard Dev.  | .22   | .16    | .14
Variance       | .05   | .03    | .02
Range          | .99   | .74    | .51

Figure 4--Catalog Size (Works): Precision

WORK NAME / AUTHOR NAME
Editions:
• Books
• Recordings
• Large print, Braille, ...
• Work Name published with other works
• Revisions, updated editions
• Translations
• Parts, selections, ...
Adaptations & Related Works
• Abridgements, simplified versions, summaries
• Sequels, supplements
• Videos, motion pictures
• Musical versions
• Pictures or other images
• Multimedia, computer versions
• Indexes, concordances
• Miscellaneous
Works about Work Name
Items probably related to Work Name
Items that may or may not be related to Work Name
Other works by Author Name
Figure 5--An Organized Display for Works

AUTHOR NAME
Single Works:
• Work names A - H
• Work names I - O
• Work names P - Z
Collected Works
Selections from Author Name's works
Spurious and doubtful works
Works about Author Name
Items probably related to Author Name
Items that may or may not be related to Author Name
Works by the same/related author: Author Name 2
Figure 6--An Organized Display for Authors

Reproduced by permission from The Future of Cataloging: Insights from the Lubetzky Symposium,
edited by Tschera Harkness Connell and Robert L. Maxwell.
© 2000 by The American Library Association.