Library Technology Guides

Current News Service and Archive

Press Release: University of California -- Los Angeles [May 21, 2021]

Online Computer Library Center Non-Roman Script Project

About the Online Computer Library Center (OCLC) non-roman script project

As library cataloging technology evolved, it left one set of record users behind. The OCLC non-roman script project helps researchers who work with languages that do not use the Roman alphabet to search the database in the native scripts themselves, rather than through similar-sounding Roman-alphabet transliterations.

Until recently, going back to augment older records with the original non-roman script was considered a luxury. Making foreign-language records more accessible is an equity, diversity, and inclusion (EDI) solution and therefore critical. This project provides a vital service in discovery tools.

Questions and Answers

Q. How does the UCLA Library work with OCLC?

A. Over a year of collaboration, UCLA Library catalogers working with OCLC have batch-processed older non-roman records in WorldCat to add the original Russian and Armenian script.

Q. Why did the OCLC non-roman script project start?

A. John Riemer, department head at the UCLA Library, initiated this project to add Cyrillic script to WorldCat and so improve discovery of non-Latin materials in the UCLA Library catalog system. Riemer helped spur the project by being the first to bring the need to OCLC, and Karen Smith-Yoshimura in OCLC Research helped advance the idea within OCLC. Within a year, the project became a partnership as OCLC responded to demand for this service from multiple libraries.

OCLC can correct catalog data that maps incorrectly to non-roman characters, for greatest accuracy, and can quickly process millions of records.

Q. What is involved in the non-roman script project?

A.

  1. Adding Cyrillic script to WorldCat records by automation
    a. Automated addition is possible for many languages that use Cyrillic script
  2. One-to-one correspondence between Latin script alphabet and Cyrillic script alphabet
  3. Russian was selected first
  4. Only records with English as the language of cataloging
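The automated step in the list above can be pictured as a table-driven conversion from romanized text back to Cyrillic. The following is only a minimal sketch, not OCLC's actual method: the function name `romanized_to_cyrillic` and the mapping table are hypothetical, and the table is a simplified subset that ignores the diacritics and ligatures real romanization tables use to keep the correspondence unambiguous (without them, a sequence such as "ts" could stand for one Cyrillic letter or two).

```python
# Hypothetical, simplified subset of a romanization table (an assumption for
# illustration; diacritics and ligatures are deliberately left out).
ROMAN_TO_CYRILLIC = {
    "shch": "щ",
    "zh": "ж", "kh": "х", "ts": "ц", "ch": "ч", "sh": "ш",
    "iu": "ю", "ia": "я",
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "e": "е",
    "z": "з", "i": "и", "k": "к", "l": "л", "m": "м", "n": "н",
    "o": "о", "p": "п", "r": "р", "s": "с", "t": "т", "u": "у",
    "f": "ф", "y": "ы",
}

MAX_KEY_LEN = max(len(key) for key in ROMAN_TO_CYRILLIC)


def romanized_to_cyrillic(text: str) -> str:
    """Convert romanized text to Cyrillic, trying the longest match first."""
    out = []
    i = 0
    while i < len(text):
        # Try multi-letter sequences ("shch", "zh") before single letters.
        for size in range(MAX_KEY_LEN, 0, -1):
            chunk = text[i:i + size]
            cyrillic = ROMAN_TO_CYRILLIC.get(chunk.lower())
            if cyrillic is not None:
                # Preserve the capitalization of the first romanized letter.
                out.append(cyrillic.upper() if chunk[0].isupper() else cyrillic)
                i += len(chunk)
                break
        else:
            # Characters outside the table (spaces, digits, punctuation)
            # pass through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(romanized_to_cyrillic("Anna Karenina"))  # → Анна Каренина
```

Matching the longest sequence first is what lets a four-letter sequence such as "shch" resolve to a single Cyrillic letter instead of being split into shorter matches. A production workflow would presumably also need review for ambiguous or non-Russian text in a record, which this sketch does not attempt.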

Q. What is WorldCat?

A. WorldCat is an international database that libraries use to support discovery of library materials worldwide. As of December 2017, it featured 30 million records in 34 non-Latin scripts.

Q. How many Russian- and Armenian-language title records have been augmented?

A. Over 1.2 million Russian-language records, followed by 43,000 Armenian titles, as of early 2021.

Q. Who is on the team that has contributed to this project?

A. From UCLA: Peter Fletcher, Cyrillic cataloger; Sharon Benamou, Hebrew cataloger; Nora Avetyan, Armenian & Persian cataloger; Iman Dagher, Arabic cataloger; Andy Kohler, IT; John Riemer, department head

From OCLC: Jenny Toves & Karen Smith-Yoshimura, OCLC Research; Mary Haessig, Cyrillic cataloger, Contract Cataloging; Brian Baldus, Robert Bremer, Robin Six, & Cynthia Whitacre, Metadata Quality

From the Library of Congress: David Bucknum

Q. Why wasn't this project initiated and done before?

A. When this project got started, some programmers didn't believe this initiative would be feasible.

When computers arrived in the West, no provision was made for cataloging characters in other languages; they were simply not considered. For a long time, non-Latin languages had to be transliterated.

Since the 1980s, Cyrillic has been added to records for different languages. Records for other languages, such as Chinese, Japanese, and Korean, are in better shape than those for Russian. New publications get Cyrillic-script cataloging from the outset. Going back to the millions of older records had been an unattainable challenge.

To implement non-roman script at OCLC, every OCLC product had to be checked to make sure it had the interfaces and settings needed to display the real characters. If something was wrong, little squares showed up where the missing characters should be.

Correcting the many records that lack original script must be done in large batch loads, since fixing them one at a time is very labor-intensive. Oracle software helped provide a systematic alternative to homemade fixes applied one language at a time.

Q. What language is next?

A. OCLC is now working on other Slavic languages.

Q. How do UCLA bibliographic records break down by non-roman language?

A. A 2015 snapshot of all of the UCLA Library holdings by language indicates we hold many other non-roman languages beyond Russian and Armenian.

Within the next year, there are plans to contribute a journal article to the library literature detailing the methodology and lessons learned from the Russian and Armenian work. This could serve as a guide on how to add non-roman script for other languages.

About the Online Computer Library Center

OCLC, Inc., doing business as OCLC, is an international nonprofit library cooperative "dedicated to the public purposes of furthering access to the world's information and reducing information costs." More than 72,000 libraries in 170 countries and territories around the world have used OCLC services to locate, acquire, catalog, lend and preserve library materials.

About the UCLA Library

Consistently ranked among the top academic libraries in the country, the UCLA Library drives the world-class research, groundbreaking discoveries, and innovation for which UCLA is renowned. Whether on-campus or online, the UCLA Library takes the lead in preserving cultural heritage, making knowledge accessible, and building a library of the future. The UCLA Library system serves students, faculty, and researchers of all disciplines as one Library with many physical locations. Holding twelve million print and electronic volumes, the Library is visited by 3.5 million people annually, with 15 million visits per year to library.ucla.edu. Through the Arcadia-supported Modern Endangered Archives Program, the Library funds preservation and digitization of at-risk materials from around the globe.


Publication Year: 2021
Type of Material: Press Release
Language: English
Date Issued: May 21, 2021
Publisher: University of California -- Los Angeles
Company: University of California -- Los Angeles
Online access: https://www.library.ucla.edu/about/about-collections/cataloging-metadata-center/online-computer-library-center-non-roman-script-project
Permalink: https://librarytechnology.org/pr/26293

LTG Bibliography Record number: 26293. Created: 2021-05-21 13:04:58; Last Modified: 2021-05-21 13:08:40.