Application of the variety-generator approach to searches of personal names in bibliographic data bases: Part 1. Microstructure of personal author names
Journal of library automation
Fokker, Dirk W~Lynch, Michael F
Abstract: Conventional approaches to processing records of linguistic origin for storage and retrieval tend to regard the data as immutable. The data generally exhibit great variety and disparate frequency distributions, which are largely ignored and which entail either the storage of extensive lists of items or the use of complex numerical algorithms such as hash coding. The results in each case are far from ideal. The variety-generator approach seeks to reflect the microstructure of data elements in their description for storage and search, and takes advantage of the consistency of statistical characteristics of data elements in homogeneous data bases. In this paper, the application of the variety-generator approach to the description of personal author names from the INSPEC data base by means of small sets of keys is detailed. It is shown that high degrees of partitioning of names can be obtained by key-sets generated from the initial characters of surnames, from the terminal characters of surnames, and from the initials. The implications of the findings for computer-based bibliographical information systems are discussed.
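The idea of partitioning names by keys drawn from the start of the surname, the end of the surname, and the initials can be illustrated with a minimal sketch. This is not the authors' key-set construction, which the abstract does not specify: the fixed two-character prefixes and suffixes, the function names `name_key` and `partition`, and the sample names are all assumptions chosen for illustration.

```python
from collections import defaultdict

def name_key(surname, initials, n_first=2, n_last=2):
    """Build a composite key from the leading surname characters, the
    trailing surname characters, and the initials.
    Hypothetical scheme: fixed-length fragments, not the paper's key-sets."""
    s = surname.upper()
    return (s[:n_first], s[-n_last:], initials.upper())

def partition(names, **kw):
    """Group (surname, initials) pairs by their composite key."""
    buckets = defaultdict(list)
    for surname, initials in names:
        buckets[name_key(surname, initials, **kw)].append((surname, initials))
    return dict(buckets)

# Made-up sample names, not data from the INSPEC file.
names = [
    ("Lynch", "MF"),
    ("Lynn", "MF"),
    ("Fokker", "DW"),
    ("Fowler", "DW"),
    ("Fokker", "DW"),
]
buckets = partition(names)

# Report how finely the keys separate the names.
for key, members in buckets.items():
    print(key, len(members))
```

In this toy sample, "Lynch" and "Lynn" share a prefix but are separated by their terminal characters, while "Fokker" and "Fowler" fall into the same bucket. The degree of partitioning therefore depends on which fragments are chosen as keys, which is the question the paper examines.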