FINAL REPORT

Joint Task Group on Streamlining Authority Record Creation

(PCC Standing Committees on Automation and Standards)

Executive Summary

The Joint Task Group was charged with examining standards-based and automation-based approaches to streamlining the creation of authority records for contribution to national programs. An important preliminary issue is the normalization conventions used in various systems, which affect heading uniqueness. To the extent these conventions differ, catalogers spend more time on searching and duplicate headings are more easily created.

Two standards-based solutions to streamlining authority record creation are examined: revision of the rules for references in the Anglo-American Cataloguing Rules, 2nd edition, and the associated Library of Congress Rule Interpretations; and the use of a "core" authority record. In addition, two automation-based solutions are considered: the use of "implicit" references in the local system (i.e. programming the local system to provide access to forms of headings that are not explicitly stated in the authority record); and the machine-assisted creation of references.

Three appendices to the report provide more information on normalization rules, a comparison of a draft core authority record with the national standard, and a list of types of references that are possible with implicit references.

The Joint Task Group makes the following recommendations:

  1. Normalization

    Many conflicts in the name authority file stem from inconsistencies in and confusion about applying the rules for normalization, which are designed to establish heading uniqueness. In order to reduce the time spent in searching to prevent conflicts at the time an authority record is created, as well as to reduce the number of name-authority conflicts requiring resolution after records are distributed:

      a. Increase catalogers' knowledge of existing normalization rules, especially as they may differ among systems, by making them widely and easily available.

      b. Review the rules and consider revisions with an emphasis on 1) character-level normalization and 2) the universe of conflict (i.e. whether the MARC tag number should be taken into account in determining uniqueness).

      c. Encourage systems in which authority records are created for contribution to the national name authority file to take normalization rules into account during the creation of authority records, e.g. an automatic check for duplicate headings.

      d. Encourage the Library of Congress, the British Library and the bibliographic utilities to review their application of current normalization rules (agreed to in the mid-1980s) and to report to the PCC and the library community 1) any individual system variations from the rules and 2) if there are variations, the likelihood of their being brought into line with the rules in the near future.


  2. Rule Revision

    Study provisions for references in AACR2 and the associated Library of Congress Rule Interpretations with the goal of reducing exceptions to the rules for cross references to an absolute minimum. This will both reduce catalogers' effort and simplify the programming of automated tools, although in the short term it may increase the keying done by catalogers at sites without highly automated systems.

  3. Core Authority Record

    Do not pursue the idea of a core authority record further, since as we have shown, the current national level authority record is a core record.

  4. Implicit References

    Do not pursue the use of implicit references, which involves the omission of certain types of references in the authority record on the assumption that search engines can be programmed to find such variants. This option is not recommended because the disadvantages in false drops during searching and complications for authority control processing strongly outweigh the advantages. It appears that the degree of uniformity required among all systems in support of this approach is unlikely to be achieved at this time.

  5. Machine-Assisted Creation of Records

    Support the development and widespread implementation of tools for the machine-assisted creation of authority records, particularly in combination with simplified rules for cross references (Recommendation 2 above).

Introduction

The transition from cataloging in a manual mode to that in an automated environment has been occurring for several years. This transition has now progressed to a point that calls for a consideration of the impact of this change on our cataloging practices, many of which were instituted under quite different conditions. Were all libraries working in the same system, this task would be challenging enough. It is made more daunting by the fact that libraries work in systems with differing conventions on several planes (e.g. indexing conventions, sorting conventions, normalization routines, the characters treated as word separators, etc.). How, then, do we manage a common bibliographic enterprise of national, and increasingly, international scope, when the participants work with tools whose very fundamentals may vary?

One way is to begin to understand the issues involved. An area that is currently of interest is the provision of controlled access to headings through the use of authority records. Indeed, the nature of today's technology makes this more achievable than at any time in the past. Crucial to this objective is the complement of cross references appropriate to the environment in place at the time. Now seems a fitting moment to consider whether the references mandated for authority records by AACR2 and the Library of Congress Rule Interpretations (LCRIs) are the ones suitable to the present environment. This calls for a reassessment of reference practice as it relates to an automated environment. It raises the interesting question of whether part of the reference structure can be system-supplied and, if so, whether there is enough commonality among the systems participating in our common bibliographic enterprise to preclude the need to trace such references explicitly.

Another objective of such a reassessment would be to determine whether there can be changes significant enough to address the issues of training and productivity that are connected with the creation of authority records--changes that would encourage more participation in programs such as NACO and BIBCO.

A coordinated strategy is needed on the national front to set standards both for automated systems of bibliographic control and the authority records they support. Such standards may then lead to streamlining authority record creation and facilitating participation in national bibliographic and authority programs. On the international front, work must be done to coordinate emerging U.S. standards with efforts such as the IFLA Working Group on Internationally Shared Authority Data, chaired by Barbara Tillett.

It is the assumption of the Joint Task Group that catalogers who create authority records for contribution to the national database work in an automated environment.

Preliminary Issue: Normalization

(As a preliminary, the issue of normalization is addressed here only briefly. For a full discussion, see Appendix 1.)

One aspect of computerization in which standardization plays an important role, particularly in a shared environment, but which is not often discussed, is the set of normalization conventions applied to data by various systems. What is involved is the treatment of certain fundamental aspects of data by various systems--such as capitalization, diacritical marks, and special symbols. A different treatment of these aspects by different systems may result in the same character string being processed differently. This is of crucial importance in an enterprise such as the building of a common authority file, since the whole objective of the operation is to treat the same things alike, to distinguish uniquely those entities that are different, and to ensure that each unique entity is represented only once.

It is the normalization conventions that make it possible to build a common authority file through the use of interconnecting systems. When normalization rules for indexing are the same among systems, a user can issue the same search on a local system and a target system and expect to retrieve the same type of results. To the extent the rules are different, searching records among systems becomes more difficult. When normalization rules for uniqueness are the same, a user can direct a search to a target system for a heading not in the local system and, if the heading is found, expect it to be acceptable for adding to the local system's files as a unique heading. To the extent the rules are different, sharing of records among systems becomes more difficult.
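As a rough illustration of what such a convention does, the following Python sketch reduces a heading to a comparison key by removing diacritics, folding case, and treating punctuation as a word separator. These three steps are assumptions chosen for illustration only; they are not the agreed normalization rules (see Appendix 1), and the function name is hypothetical.

```python
import unicodedata


def normalize_heading(heading: str) -> str:
    """Return a comparison key for a heading (illustrative rules only)."""
    # Decompose accented letters so the diacritic becomes a separate character.
    decomposed = unicodedata.normalize("NFD", heading)
    # Drop the combining marks (the diacritics themselves).
    without_marks = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Fold case, and turn every non-alphanumeric character into a space.
    spaced = "".join(c if c.isalnum() else " " for c in without_marks.upper())
    # Collapse runs of spaces and trim the ends.
    return " ".join(spaced.split())
```

Under these assumed rules, "Île-de-France" and "ILE DE FRANCE" produce the same key, so a system applying them would treat the two strings as one heading. A system that kept hyphens or diacritics significant would treat them as two, which is the kind of divergence the paragraph above describes.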

One aspect of the Library of Congress' experience with the current normalization conventions, as applied by RLIN in conflict reporting, is that the MARC tag is not considered part of what makes a heading unique. This condition currently requires considerable resources both in initially creating authority records and in maintaining them. Thus it appears prudent to explore any possible changes that might reduce the time and energy now spent on this aspect of authority work. An examination of normalization conventions logically precedes some of the measures outlined below, since it can change how headings and references are formulated or whether they can be considered unique.
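The effect of including or excluding the MARC tag in the "universe of conflict" can be sketched as follows. This is a minimal Python illustration under stated assumptions: the key function is a simplified stand-in for a real normalization routine, the function names are hypothetical, and the sample headings are made up for the example rather than taken from the name authority file.

```python
from collections import defaultdict


def simple_key(heading):
    """Simplified stand-in for a normalization routine (an assumption)."""
    spaced = "".join(c if c.isalnum() else " " for c in heading.upper())
    return " ".join(spaced.split())


def find_conflicts(records, include_tag=False):
    """Group (tag, heading) pairs by key and return the groups that collide.

    When include_tag is True the MARC tag is part of the key, so headings
    with different tags never conflict with each other.
    """
    groups = defaultdict(list)
    for tag, heading in records:
        key = simple_key(heading)
        if include_tag:
            key = (tag, key)
        groups[key].append((tag, heading))
    return [group for group in groups.values() if len(group) > 1]


# Made-up sample: the same string established under two different tags.
SAMPLE = [("151", "Victoria"), ("100", "Victoria"), ("110", "IFLA")]
```

With the tag left out of the key, the two "Victoria" headings are reported as a conflict that a cataloger must resolve; with the tag included, no conflict is reported. Which behavior is preferable is the question raised in Recommendation 1b.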

Streamlining Authority Record Creation

It is generally agreed that authority control is an expensive part of the cataloging process, and that any gains made by streamlining authority record creation will be welcomed by the library community. Of course, streamlining must be done in a way that does not compromise the usefulness and effectiveness of authority control systems. There are several possible approaches to streamlining. One standards-based approach is to simplify the requirements for some of the references specified in AACR2 and the Library of Congress Rule Interpretations (rule revision), which would save catalogers' time. A related idea is the "core" authority record, which would presumably take less effort to create than a "full" authority record. An automation-based approach involves reprogramming library systems to index various forms of headings based on the rules (implicit references), which would again save time by obviating the need to explicitly create certain types of references. Another is the machine-assisted creation of routinely made references, e.g. with macros on the workstation, which again saves time. Some of these approaches are incompatible with each other and simultaneously pursuing all of them can lead to a waste of effort. For example, it makes little sense to spend time programming exceptions to a rule for cross references into a macro when the rule itself can be simplified. A consensus needs to be reached at the national, and perhaps international, level about which approaches should be pursued.

To date, there has been an assumption that all necessary references are stated explicitly in the authority record. This assumption is done away with in the implicit references approach outlined below (section II.A), so that some references are implicit and "made" by the local system.

I. Standards-Based Solutions

A. Rule Revision

To further the goal of devising automated methods to assist in generating references on name authority records, it is useful to examine the nature of the rules that currently govern references on national-level authority records. Some of them find their origin in filing rules for card catalogs or in historical practices of entry, such as references from political jurisdiction for government bodies entered under their own name. Some are based on current normalization routines, such as the rules for references from acronyms and initialisms. And some are designed to accommodate variations from the ways in which different language communities have standardized the entry of names in their alphabetical lists. Especially, we need to ask ourselves what changes in AACR2 and the LCRIs would facilitate the use of computerized routines to generate automatically those rule-governed references which must be supplied by the cataloger, in addition to the variant forms of a name actually found in sources.

The activity of re-evaluating references is not new. In the summer 1983 issue of the Cataloging Service Bulletin, LC's reference tracing practice, as given in its Chapter 26 rule interpretations, is described as "an attempt to provide references required in the various types of manual and machine-readable catalogs...." In this issue, LC solicits feedback from the library community as to the continuing usefulness of certain types of references. If we compare the reference practice reflected in this questionnaire to current guidelines, it becomes apparent that much progress has already been made in the direction of eliminating superfluous references. In fact, it can be argued that we have already pared the core of necessary references to a bare minimum, given the reality that left-to-right string searches are still prevalent in many systems.

Thus, what can still be done on the "rule revision front" to facilitate the automatic generation of references on authority records?--except perhaps to eliminate those references which do not affect the beginning of the character string, such as:

110 India. Ministry of Health
410 India. Health, Ministry of

The present discussion is limited to the rules for generating "see" references, coded 4XX in USMARC. "See also" or 5XX references are excluded, as are references to uniform titles and series (for lack of space). Only Latin-script languages are considered, since the references arising from romanization issues require a separate discussion. The question of "implicit" references, that is, references which are unnecessary since equivalent access can be provided by machine scanning or manipulation of character strings, is covered elsewhere. The examples given are for illustrative purposes only, and are not necessarily actual headings in the national authority file. (Diacritics are omitted throughout; examples are shown with USMARC tags, but without indicators or subfield coding.)

The types of required, cataloger-supplied references are relatively few. They include but are not limited to: a) inversions of heading elements; b) omission of initial modifying elements in the heading, or left-truncations; c) references from political jurisdiction for government bodies entered under their own name; d) references from initials and acronyms; e) references from parent body for corporate name headings; and f) references from the spelled out or written out form of abbreviations and numerals contained in the heading:

a)

100 Hamilton-Byrne, Anne
400 Byrne, Anne Hamilton-

b)

151 La Ventana (San Luis Potosi, Mexico)
451 Ventana (San Luis Potosi, Mexico)

c)

110 National Science Foundation (U.S.)
410 United States. National Science Foundation

d)

110 International Federation of Library Associations
410 IFLA

e)

110 Americans for Safe Food (Project)
410 Center for Science in the Public Interest. Americans for Safe Food

f)

110 Mt. Vernon Genealogical Society
410 Mount Vernon Genealogical Society

The complexity in the rules for references, and thus an impediment to automating the creation of rule-governed references in authority records, arises not so much in the limited repertoire of reference types, but rather in the restrictions, exceptions, and conditions governing their use in specific cases. To the extent these conditions can be simplified and applied consistently across cases, the automatic generation of references will be facilitated. The following sections illustrate this point.

In the case of compound surnames in personal name headings, the rules specify references from each element of the surname (as entry element) in most cases. Thus:

100 Woodham-Smith, Cecil
400 Smith, Cecil Woodham-

The same principle, in a more restricted manner, is applied to personal name references containing compound surnames, which may result in some additional references:

100 Alonso-Bartol, Gonzalo
400 Bartol, Gonzalo Alonso-
400 Alonso-Bartol Ruano, Gonzalo
400 Ruano, Gonzalo Alonso-Bartol

However, this type of reference is not always made. The following surnames are coded as compound with appropriate indicators (not shown here), but the reference from the second element of the surname is contraindicated by special rules, or for other reasons not uniformly made:

100 Ben-Gurion, David, 1886-1973 (Hebrew)

100 Saint-Aubin, Horace de (French)

100 Jara S., A. Antonio (Spanish)

[As often happens, there are many records in the name authority file that contain a reference from, say, the element following Ben- in a Hebrew name. In some cases the person was not writing in Hebrew, so the language-specific rule was not applied. In other cases, it appears the rule interpretation was simply not observed. The same is true for many "exceptions" to general principles, when the exceptions are of limited application. They frequently fall by the wayside in actual practice. For instance, the actual NAF record for Alonso-Bartol (see above) contains an additional reference from Bartol Ruano, Gonzalo Alonso- , which fits the general pattern, but which is specifically contraindicated by the rules.]

Suppose we regularize the application of the general rule, and refer from the second surname element in all cases. Some of these references may seem curious or objectionable to speakers of these languages, but those speakers probably already know where to look. It is the non-speakers of these languages who may benefit most from routinely providing references from the second surname element in all cases, as we commonly do in the following Portuguese example:

100 Bittar Filho, Carlos Alberto
400 Filho, Carlos Alberto Bittar
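If the rule were regularized in this way, the reference from the second surname element could be produced mechanically. The Python sketch below is one possible approach, not an existing tool: the function name is hypothetical, and it assumes a heading of the simple form "Surname, Forenames" with no dates, qualifiers, or particles, and a surname joined by a hyphen or a single space.

```python
import re

# Split a compound surname at its last hyphen or space.
_COMPOUND = re.compile(r"^(.*)([- ])([^- ]+)$")


def second_element_reference(heading):
    """Return a reference from the last surname element, or None.

    Assumes the simple pattern 'Surname, Forenames' (an illustrative
    restriction); anything else is left for the cataloger.
    """
    surname, comma, forenames = heading.partition(", ")
    match = _COMPOUND.match(surname)
    if not comma or match is None:
        return None
    first, joiner, last = match.groups()
    # A hyphenated surname keeps its trailing hyphen in the reference.
    tail = first + "-" if joiner == "-" else first
    return f"{last}, {forenames} {tail}"
```

Applied to "Woodham-Smith, Cecil" this yields "Smith, Cecil Woodham-", and applied to "Bittar Filho, Carlos Alberto" it yields "Filho, Carlos Alberto Bittar", matching the examples above. A heading with a single surname returns None, and the cataloger remains free to delete any generated reference that is unwanted.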

Conversely, when the rules stipulate entry under the last element of a compound surname, the reference from the preceding element is sometimes made (Portuguese) and sometimes not (English):

100 Figueiredo, Adelpha Silva
400 Silva Figueiredo, Adelpha

but no reference for:

100 Adams, John Crawford

Since codes for language of heading are not present in the current edition of the USMARC Authority Format, and no substitute for them has been implemented by LC, these variations (in addition to being frequently neglected) do not lend themselves to automation.

In the case of a personal name heading or reference containing initials, the rules specify adding a parenthetical qualifier with the full names, if known. Consider the following example, where both heading and reference are "found forms":

100 Henderson, Edward Paul
400 Henderson, E. P. (Edward Paul)

But if the initials are contained in the heading, an automatic reference is made from the name with the initials filled in:

100 Stevens, J. D. (John Douglas)
400 Stevens, John Douglas

Here again, the rules specify a number of conditions which lead to exceptions to these principles for creating references and formulating qualifiers. For example, if the initial is not a "primary element" of the heading (such as when it represents a middle name), the reference is discouraged:

100 Pritchard, Frederick R. (Frederick Robert)
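The two cases just described can be told apart by pattern alone, which suggests how a program might handle them. The following Python sketch is an assumption-laden illustration (hypothetical function name, no handling of dates or compound surnames): it makes the fuller-form reference only when the forename portion of the heading consists entirely of initials followed by a parenthetical qualifier.

```python
import re

# Surname, then one or more initials, then a qualifier in parentheses.
_INITIALS_HEADING = re.compile(
    r"^(?P<surname>[^,]+), (?:[A-Z]\.\s*)+\((?P<full>[^)]+)\)$"
)


def fuller_form_reference(heading):
    """Return 'Surname, Full Forenames' if the heading is all initials.

    Returns None when the forename portion begins with a spelled-out name,
    i.e. when the initial is not the primary element.
    """
    match = _INITIALS_HEADING.match(heading)
    if match is None:
        return None
    return f"{match['surname']}, {match['full']}"
```

For "Stevens, J. D. (John Douglas)" the function returns "Stevens, John Douglas". For "Pritchard, Frederick R. (Frederick Robert)" it returns None, since the heading begins with a spelled-out forename and the reference is discouraged.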

The requirement that the reference "match" the heading would result in the following qualifier in the reference, even though it is known that the D. represents Douglas. (The heading may have been established without a qualifier because of the existence of a matching old LC catalog heading):

100 Hays, James D., 1926-
400 Hays, J. D. (James D.), 1926-

Further complications in deciding when and how to qualify references can arise. Referring from the different elements of a compound surname or a surname with particles can require adjustments in the qualifier used in the heading, if the name contains initials:

100 Garcia de Miguel, J. M. (Jose Maria)
400 De Miguel, J. M. Garcia (Jose Maria Garcia)
[Garcia added to qualifier]

400 Miguel, J. M. Garcia de (Jose Maria Garcia)
[de not added to qualifier]

Suffice it to say that the rules for when and how to qualify corporate body headings are even more varied than those for personal names.

Since systems can be programmed to recognize the different types of headings reflected in MARC tags (100, 151, 110, 111), the variations in reference practice that correspond to these heading types are less problematic than those which do not neatly correspond to differently coded categories of headings. An example of such a correspondence is found in the fact that references from acronyms for conferences (111) can receive the same qualifier as the heading (in the opinion of some), while references from acronyms for corporate bodies (110) are not qualified.

But even within one tag family, reference practices are not consistent. Some references for headings coded 110 are generated with inversions,

110 Real Academia de Bellas Artes de San Jorge
410 Academia de Bellas Artes de San Jorge, Real

while others are generated with left-truncations:

110 Doktor Wilmar Schwabe G.m.b.H.
410 Wilmar Schwabe G.m.b.H.
410 Schwabe G.m.b.H.

or

110 M.C. Brackenbury & Co.
410 Brackenbury & Co.
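Left-truncation references of this kind are simple to generate once the number of initial modifying elements is known. In the Python sketch below that number is supplied by the cataloger, which is an assumption of the illustration; deciding it automatically would require the kind of judgment discussed in this section. The function name is hypothetical.

```python
def left_truncations(heading, leading_elements):
    """Return the references made by dropping leading words one at a time.

    leading_elements is the number of initial modifying words to strip,
    as judged by the cataloger (an assumption of this sketch).
    """
    words = heading.split()
    # Never strip the whole heading; at least one word must remain.
    limit = min(leading_elements, len(words) - 1)
    return [" ".join(words[i:]) for i in range(1, limit + 1)]
```

With "Doktor Wilmar Schwabe G.m.b.H." and two leading elements, the result is "Wilmar Schwabe G.m.b.H." and "Schwabe G.m.b.H."; with "M.C. Brackenbury & Co." and one leading element, the result is "Brackenbury & Co.".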

Likewise, the rules for creating corporate body references which "fill in" abbreviated forms in headings, require judgments such as whether the abbreviation represents a personal name, and whether the full form is actually documented in sources. Thus,

110 St. Paul's Episcopal Church (Saint Louis, Mo.)
410 Saint Paul's Episcopal Church (Saint Louis, Mo.)

[reference made routinely]

but

110 Wm. R. Prince & Co.
410 William R. Prince & Co.

[reference made only if full form is documented in sources]
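The distinction between routine and conditional expansions could be represented in a program as a short table of abbreviations that are always spelled out. The Python sketch below assumes such a table with only the two abbreviations illustrated in this report; the table contents and the function name are illustrative, and an abbreviation such as "Wm." is deliberately left out because its expansion depends on documentation in sources.

```python
# Abbreviations expanded routinely at the start of a heading.
# Only the two forms illustrated in this report are listed (an assumption);
# a production table would need review, since e.g. "St." is ambiguous.
ROUTINE_EXPANSIONS = {"St.": "Saint", "Mt.": "Mount"}


def spelled_out_reference(heading):
    """Return a reference with the leading abbreviation spelled out, or None."""
    first, _, rest = heading.partition(" ")
    full = ROUTINE_EXPANSIONS.get(first)
    if full is None or not rest:
        return None
    return f"{full} {rest}"
```

"St. Paul's Episcopal Church (Saint Louis, Mo.)" yields the routine reference beginning with "Saint", while "Wm. R. Prince & Co." yields None and is left to the cataloger's judgment.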

The direction suggested here for rule revision, to facilitate the computer-assisted creation of required references is as follows:

  1. As far as possible, attempt to regularize across languages, and particularly within a given MARC tag group, the rules for applying the types of references described in part 2 of this discussion. This may actually have the effect of increasing the number of references made, but decreasing the time required to make them, since they can be generated automatically.

  2. Of course, it will not be possible to eliminate all exceptional practices of limited application. Even so, systems can be programmed to generate a small set of reference types in all cases. It is always easier and faster for the cataloger to delete unwanted references upon manual inspection of the authority record, than to create new ones.

  3. In all cases, respect the cataloger's judgment in retaining or adding references during the final manual inspection of the authority record, which, at least for the foreseeable future, cannot be dispensed with.

B. Core Authority Record

Based on the model of the PCC core bibliographic standards, the idea of creating a "core" authority record has circulated recently. The hope is that, using a core standard, catalogers can save time and thus reduce the cost of authority work.

One of the primary principles on which the PCC Core bibliographic record depends is:

FULL AUTHORITY CONTROL: Access points on all BIBCO records (core or full) will be supported by a national level authority record. All BIBCO participants are also participants in NACO and SACO. Consequently, BIBCO core records can be accepted with no further authority work.

Any discussion of a core authority record needs to take into consideration this principle to make sure that core authority records are sufficiently complete to support the creation and use of BIBCO records with no additional authority work expected from participants. Is it possible, then, to determine a set of data elements for core-level authority records that will provide enough authoritative data to create bibliographic records that depend on them?

Let us suggest that we already have a virtual core-level authority record. The elements that are mandatory or mandatory if applicable are those that are either required by systems to process a record, or elements without which a cataloger would have difficulty determining the appropriateness or accuracy of an authority record.

There are many data elements for which values cannot always be determined when an authority record is initially created. As additional works are cataloged and additional forms of name to be used as cross references are found, these are routinely added by NACO participants. As corporate bodies change names, new authority records are established and linking references added to the earlier records. The same holds true for series authority records.

Requirements for a "National-level" authority record are laid out in Appendix A of the USMARC format for authority data. The chart below (Appendix 2) was taken from the update to this format dated July 1995. The mandatory data elements can be broken down in this fashion:

  1. Leader data that must be system-supplied;
  2. Leader data without which the content of an authority record cannot be interpreted (Leader/05, Leader/06, Leader/17);
  3. 008 fixed field data that identify the rules used for creating the record, heading usage, and status of the record and the degree of completeness/confidence that can be ascribed to it;
  4. Fields that relate to identification of the record and its version (001, 003, 005, 010, 040);
  5. The heading itself (although all the 1XXs are listed as "A", the actual requirement is that one and only one 1XX field is mandatory).
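Items 4 and 5 of this breakdown lend themselves to a simple automated check. The Python sketch below is a minimal illustration: it takes the list of field tags present in a record and reports missing identification fields and any departure from the "one and only one 1XX" requirement. The function name and the representation of a record as a list of tag strings are assumptions; Leader and 008 values are not examined.

```python
# Identification fields listed in item 4 above.
REQUIRED_TAGS = ("001", "003", "005", "010", "040")


def check_mandatory_fields(tags):
    """Return a list of problems found in a record's field tags.

    tags is a list of three-character MARC tag strings, e.g. ["001", "100"].
    An empty result means the checks in this sketch were all satisfied.
    """
    problems = [f"missing {tag}" for tag in REQUIRED_TAGS if tag not in tags]
    headings = [tag for tag in tags if len(tag) == 3 and tag.startswith("1")]
    if len(headings) != 1:
        problems.append(f"expected exactly one 1XX field, found {len(headings)}")
    return problems
```

A record carrying 001, 003, 005, 008, 010, 040, 100, 400 and 670 passes with no problems reported, while a record lacking an 010 or carrying both a 100 and a 110 is flagged.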

Except for cross-reference tracings, the other fields identified as "mandatory if applicable" depend on what kind of authority record is being created. For series authority records, this data includes 020s or 022s, and only 3 of the note fields: 640, 642, and 670. For name authorities, this includes only the 670 field. Note that the remaining note fields (6XXs), the newly established heading linking entry fields (700-751), and the subdivision heading linking entry fields (78Xs) are all optional.

See reference tracings (4XXs) and see-also reference tracings (5XXs) are all identified as "mandatory if applicable". Changing this "national level" requirement to "optional" for a core-level record would mean that authority records would become less useful for end-users and more time consuming for catalogers. If you found a PCC bibliographic record for an item and a cross-reference seemed to be appropriate for one or more of the names, each PCC cataloger might have to check the authority record to see if cross-references had been added, since their presence could not be reliably predicted. This could slow down the process of accepting bibliographic records and greatly increase the time spent updating authority records to add additional information, even if that information was available at the time the authority record was created or last visited. This seems counter to the efficiencies achieved by the shared building of authority records, which is, after all, one of the principles of the whole cooperative cataloging program.

The fields and requirements for the national level authority record are transcribed in the chart below (Appendix 2). We examined each and tried to determine whether it should be mandatory, mandatory if applicable, or optional for a core-level record. In no case did we feel that a mandatory or mandatory-if-applicable national-level element should become optional for a core-level record. However, working NACO participants should review this conclusion.

II. Automation-Based Solutions

A. Implicit References

In this approach, certain types of references are implicit in the authority record and "made" automatically by the local system. For example, for the heading "Smith Jones, John", the reference from "Jones, John Smith" is not entered in a 400 reference field, but is indexed in the local system. Some systems are already capable of this, although most currently are not.

The advantage of this is that the local system will enable a user to search the catalog under some headings which under current rules would need to be added explicitly by the cataloger. This will save the cataloger the time of making these explicit references. It is difficult to quantify the time; however, if 50% of current references need not be explicitly traced, then approximately 50% of the time catalogers spend making references would be saved.

The disadvantages are several:

  1. For this strategy to be effective all local library systems must be programmed to handle the same implicit references. Some systems currently handle some of the proposed implicit references, but many others do not. In systems which support a variety of searching strategies, patrons may be unsure which strategy to try for which results. In some systems, for example, a basic author search will retrieve author keywords, which would retrieve names varying in choice of entry element, but in other systems a keyword search must be specified. Since the various systems differ so widely in their search algorithms, and since systems attempt to differentiate themselves for marketing reasons, it may be difficult to convince all of them to invest in the programming necessary to support this approach. There would also be some amount of time involved while waiting for the programming to be done, and for each system to release that feature according to its own time frame, making it difficult to schedule a phase-out of affected cataloging rules.

  2. If certain types of references were made implicit (e.g. automatic truncation and keyword search to enable a search from a less full form or initials to the full form), there may be a problem with excessive false drops. For example, searching WLN under the author keyword Williams J# (truncated initial) yields 1499 headings, including names such as Alexander-Williams, John, and Blakesley, Joseph Williams. Searching exact author Williams, J# retrieved 951 headings. The advantages of these types of implicit references would need to be carefully weighed against the problems of retrieving extremely large sets of headings and/or records.

  3. Authority records are not used only by patrons searching in a local system, but also by authority control vendors, who must try to match headings against the authority file. If some references were made implicit, authority control vendors would need to create matching algorithms to compensate. However, the first indicator value of 2 in personal name fields (denoting a multiple/compound surname) is not reliable for generating implicit cross references based on permutations of the various elements of the name. Examples include:

    St. Laurent, Louis S. (Louis Stephen), 1882-1973
    Campa S., Valentin (Campa Salazar), 1904-
    B.-Hollbacher, Marianne (Beck-Hollbacher), 1955-
    Smith of Marlow, Rodney Smith, Baron, 1914-

    Furthermore, the first indicator value of 2 in personal name fields is being made obsolete. This will make it more difficult for authority control vendors to generate implicit cross references of this type since they would need to rely on complex algorithms, rather than on indicators, to generate the references. At least one batch authority control vendor used algorithms, in the absence of indicator values, to generate references related to multiple/compound surnames (references which were not actually added to authority records, but rather, pointed to them), and about 5% of the cross references that were created were problematic and needed to be manually located and eliminated.

  4. If this approach to simplifying authority record creation is adopted, it could have the effect of making authority record maintenance more complex and time consuming if it becomes mandatory to eliminate obsolete references whenever a change to an authority record is made.

  5. If some references are implicitly generated by local system search algorithms such that they would not reside in the authority record, the authority records would lose some of their value as a cataloging tool and as an authority control tool.

  6. Unless batch authority control vendors are also willing to undertake the creation of implicit cross references, the value of batch authority control would be diminished. If they are willing to undertake the algorithmic creation of implicit cross references, and if such algorithmic creation is inherently problematic, the results will be more dramatic for them than for local system vendors. In a local system, an incorrect reference would simply appear in the online index, where non-relevant headings can easily be scanned and discarded by a patron. In batch authority control, an incorrect cross reference could have the effect of erroneously changing headings in a library's bibliographic file.

(See Appendix 3 for types of cross references that are possible with implicit references.)
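The false-drop problem described in item 2 can be illustrated with a minimal sketch. It is not a description of how WLN or any other system implements searching. The function names, the tokenizing rule and the use of # as a truncation symbol in both search types are assumptions made for illustration only. The sample headings are those cited above, plus one hypothetical non-matching heading.

```python
import re


def keyword_match(heading, query):
    """Assumed keyword search: every query term must match some word of the
    heading, in any position. A trailing '#' marks a truncated term."""
    words = re.findall(r"[a-z0-9]+", heading.lower())
    for term in re.findall(r"[a-z0-9]+#?", query.lower()):
        if term.endswith("#"):
            prefix = term[:-1]
            if not any(word.startswith(prefix) for word in words):
                return False
        elif term not in words:
            return False
    return True


def exact_match(heading, query):
    """Assumed exact (left-anchored) search: the heading must begin with the
    query string. A trailing '#' allows anything to follow."""
    h, q = heading.lower(), query.lower()
    if q.endswith("#"):
        return h.startswith(q[:-1])
    return h == q


headings = [
    "Williams, John",
    "Alexander-Williams, John",
    "Blakesley, Joseph Williams",
    "Williams, Roger",  # hypothetical heading that should match neither search
]

keyword_hits = [h for h in headings if keyword_match(h, "Williams J#")]
exact_hits = [h for h in headings if exact_match(h, "Williams, J#")]
```

In this sketch the keyword search retrieves the first three headings, while the exact search retrieves only Williams, John. The difference arises because the keyword search ignores the position of each word within the heading.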

The most practical application of implicit references would seem to be of the and/ampersand type. The possible list of symbol/abbreviation-to-word translations is relatively small, though, and each would need to be carefully analyzed for possible adverse effects on authority control processing. Initialisms with and without periods are another type of reference which could possibly be made implicitly by the system, but again this would need to be analyzed for possible problems for authority control processing. Since these are a small percentage of total references, savings in cataloging costs would be minimal. A local system could adopt a keyword search across all subfields in a string; however, some false drops are likely to occur, and this would be more problematic for authority control vendors to adopt, since algorithms for implicitly creating permuted cross references are unreliable without human review. Automatic truncation from less full forms or initials to full forms, while useful for searching in a small online system, would generate many false drops in a larger system and would cause major problems for authority control processing.
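As an illustration of the and/ampersand and initialism cases, the following minimal sketch builds a match key under which the variant forms compare as equal, so that no explicit reference is needed to connect them. The function name match_key, the one-entry translation table and the sample headings are assumptions made for this example. The sketch is not a rendering of the normalization rules in Appendix 1.

```python
import re

# Assumed symbol-to-word table. A real table would need the case-by-case
# analysis described above before any entry was added.
SYMBOL_WORDS = {"&": "and"}


def match_key(heading):
    """Build a comparison key in which '&' matches 'and' and an initialism
    written with periods matches the same initialism written without them."""
    text = heading.lower()
    for symbol, word in SYMBOL_WORDS.items():
        text = text.replace(symbol, f" {word} ")
    # Close up runs of two or more single letters each followed by a period,
    # e.g. "a.l.a." becomes "ala". Spaced initials ("j. r.") are not joined.
    text = re.sub(r"\b(?:[a-z]\.){2,}",
                  lambda m: m.group(0).replace(".", ""), text)
    # Treat remaining punctuation as blanks and collapse repeated blanks.
    text = re.sub(r"[^a-z0-9 ]+", " ", text)
    return " ".join(text.split())


same_symbol = match_key("Simon & Schuster") == match_key("Simon and Schuster")
same_initialism = match_key("A.L.A.") == match_key("ALA")
```

Both comparisons evaluate to True. The same key would also equate U.S. with the ordinary word "us", which is the kind of side effect that would need to be analyzed before such a rule was adopted.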

The value of authority records as a cataloging tool could be diminished if implicit cross references are adopted. Also, the availability and quality of batch authority control processing could be reduced depending on the inclination of batch processing vendors to effectively address the issue of implicit references.

B. Machine-Assisted Creation of References

Another approach to the streamlining of authority record creation is to provide catalogers with machine-assisted aids to create standard references, while maintaining the current practice of explicitly stating these references in the authority record. Many references can be formulated by machine based on the structure of the heading itself or on information in the associated bibliographic record. An example is the double surname (e.g. Smith-Jones, Edward), where the machine can supply a reference from the second element of the surname (Jones, Edward Smith), which spares the cataloger rekeying the information and thus saves time and expense. Machine-supplied headings should be reviewed by a cataloger, who is ultimately responsible for their form and appropriateness.
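The double surname example can be sketched as follows. This is a minimal illustration, not the method used by any of the programs described in this report. The function name and the decision to decline any heading that is not of the simple form "Surname-Surname, Forenames" are assumptions. Declining in doubtful cases leaves the decision to the cataloger, in keeping with the review requirement stated above.

```python
def second_element_reference(heading):
    """Propose a reference from the second element of a hyphenated double
    surname. Return None when the heading is not of the simple form
    'Surname-Surname, Forenames', so that the cataloger decides instead."""
    surname, separator, rest = heading.partition(", ")
    if not separator or not rest or "," in rest:
        return None  # no forenames, or dates, titles or qualifiers follow
    parts = surname.split("-")
    if len(parts) != 2 or not all(part.isalpha() for part in parts):
        return None  # not exactly two plain surname elements
    first, second = parts
    return f"{second}, {rest} {first}"


proposed = second_element_reference("Smith-Jones, Edward")
declined = second_element_reference(
    "B.-Hollbacher, Marianne (Beck-Hollbacher), 1955-")
```

Here proposed is "Jones, Edward Smith", the reference given in the example above, and declined is None because the heading carries a qualifier and dates. A production tool would need to handle many more heading patterns.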

Several examples of this type of computer program are already in use. One of the first was created by Gary Strawn at Northwestern University Library for use on a NOTIS system. By clicking on a heading in a bibliographic record, the cataloger invokes the program, which creates a new authority record, including the main heading, references and notes. The cataloger reviews the proposed record and edits it as needed. A similar, though somewhat less powerful, program has been supplied by OCLC to operate as a macro under its Passport for Windows terminal software. Additionally, in order to provide guidance for vendors and promote the development of better programs, the PCC Standing Committee on Automation is promulgating a standard in this area (Gary Strawn, "Standard for the machine-assisted generation of authority records," 1996).

The advantages of machine-assisted references are several: 1) this approach saves the cataloger's time, which promotes efficiency and reduces the cost of authority control; 2) it is a proven approach, with successful examples in production in the field; 3) it is easy to implement, since it can be done in stages and does not require close coordination between vendors and utilities; and 4) it does not require the reprogramming of existing bibliographic utilities and local systems, as other approaches would.

The disadvantages of machine-assisted references are: 1) the universal availability of this approach would be difficult to achieve, given the large number of local system vendors and bibliographic utilities that would have to engage in essentially repetitive programming; 2) the partial availability of machine-assisted aids creates "haves" and "have-nots"; and 3) the quality of records contributed to the national authority file could possibly decline if libraries do not provide the necessary level of human review. Both catalogers and administrators would need to understand the limitations of the tools and make a commitment to sufficient levels of review.

Recommendations

1) Normalization

Many conflicts in the name authority file stem from inconsistencies in and confusion about applying the rules for normalization, which are designed to establish heading uniqueness. In order to reduce the time spent in searching to prevent conflicts at the time an authority record is created, as well as to reduce the number of name-authority conflicts requiring resolution after records are distributed:

a. Increase catalogers' knowledge of existing normalization rules, especially as they may differ among systems, by making them widely and easily available.

b. Review the rules and consider revisions with an emphasis on 1) character-level normalization and 2) the universe of conflict (i.e. whether the MARC tag number should be taken into account in determining uniqueness).

c. Encourage systems in which authority records are created for contribution to the national name authority file to take normalization rules into account during the creation of authority records, e.g. an automatic check for duplicate headings.

d. Encourage the Library of Congress, the British Library and the bibliographic utilities to review their application of the current normalization rules (agreed to in the mid-1980s) and to report to the PCC and the library community 1) any individual system variations from the rules and 2) where variations exist, the likelihood that they will be brought into line with the rules in the near future.

2) Rule Revision

Study provisions for references in AACR2 and the associated Library of Congress Rule Interpretations with the goal of reducing exceptions to the rules for cross references to an absolute minimum. This will both reduce catalogers' effort and simplify the programming of automated tools, although in the short term it may increase the keying done by catalogers at sites without highly automated systems.

3) Core Authority Record

Do not pursue the idea of a core authority record further, since, as we have shown, the current national-level authority record is a core record.

4) Implicit References

Do not pursue the use of implicit references, which involves the omission of certain types of references in the authority record on the assumption that search engines can be programmed to find such variants. This option is not recommended because the disadvantages in false drops during searching and complications for authority control processing strongly outweigh the advantages. It appears that the degree of uniformity required among all systems in support of this approach is unlikely to be achieved at this time.

5) Machine-Assisted Creation of Records

Support the development and widespread implementation of tools for the machine-assisted creation of authority records, particularly in combination with simplified rules for cross references (Recommendation 2 above).

Appendix 1:

Normalization Rules

Appendix 2:

National-Level (Full) and Draft Core

Appendix 3:

Types Of Cross References That Are Possible With Implicit References


April 2, 1997

Joint Task Group on Streamlining Authority Record Creation:
Greta de Groat, Stanford (formerly WLN)
Ed Glazier, RLG
Kay Guiles, LC
Rhoda Kesselman, Princeton; Co-chair, Standards
Joe Kiegel, University of Washington; Co-chair, Automation
Dan Miller, BNA
Wilma Minty, Oxford