Application of the variety-generator approach to searches of personal names in bibliographic data bases: Part 2. Optimization of key-sets, and evaluation of their retrieval efficiency
Journal of Library Automation
Fokker, Dirk W.; Lynch, Michael F.
Abstract: Keys consisting of variable-length character strings from the front and rear of surnames, derived by analysis of author names in a particular data base, are used to provide approximate representations of author names. When combined in appropriate ratios, and used together with keys for each of the first two initials of personal names, they provide a high degree of discrimination in search. Methods for optimization of key-sets are described, and the performance of key-sets varying in size between 150 and 300 is determined at file sizes up to 50,000 name entries. The effects of varying the proportions of the queries present in the file are also examined. The results obtained with fixed-length keys are compared with those for variable-length keys, showing the latter to be greatly superior. Implications of the work for a variety of types of information systems are discussed.
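To illustrate the kind of representation the abstract describes, the following is a minimal sketch, not the authors' implementation. It assumes that each surname is represented by the longest matching key from the front and from the rear, together with the first two initials. The key-sets `FRONT_KEYS` and `REAR_KEYS` are tiny hypothetical examples invented for illustration. The key-sets in the paper are derived by analysis of a particular data base and contain between 150 and 300 keys. The function names `longest_key`, `encode_name`, and `may_match` are likewise assumptions.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical toy key-sets (illustration only, not taken from the paper).
FRONT_KEYS = {"L", "LY", "F", "FO", "S", "SM", "SMI"}
REAR_KEYS = {"H", "CH", "R", "ER", "N", "SON"}

Code = Tuple[Optional[str], Optional[str], Optional[str], Optional[str]]


def longest_key(name: str, keys: Iterable[str], from_rear: bool = False) -> Optional[str]:
    """Return the longest key matching the front (or rear) of name, or None."""
    if from_rear:
        matches = [k for k in keys if name.endswith(k)]
    else:
        matches = [k for k in keys if name.startswith(k)]
    return max(matches, key=len) if matches else None


def encode_name(surname: str, initials: str,
                front_keys: Iterable[str] = FRONT_KEYS,
                rear_keys: Iterable[str] = REAR_KEYS) -> Code:
    """Approximate a personal name by (front key, rear key, initial 1, initial 2)."""
    s = surname.upper()
    front = longest_key(s, front_keys)
    rear = longest_key(s, rear_keys, from_rear=True)
    first = initials[0].upper() if len(initials) > 0 else None
    second = initials[1].upper() if len(initials) > 1 else None
    return (front, rear, first, second)


def may_match(query: Code, entry: Code) -> bool:
    """Two names are candidate matches when their approximate codes agree."""
    return query == entry


if __name__ == "__main__":
    print(encode_name("Lynch", "MF"))
    print(encode_name("Fokker", "DW"))
```

Because the codes are approximate, distinct names can share a code. A search on the codes therefore returns candidates that still need to be checked against the full name, and the discrimination achieved depends on how the key-set is chosen.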