Smarter Libraries through Technology: Linked Data Brings Challenges and Opportunities to Libraries

Smart Libraries Newsletter [April 2020]

For a decade or more, libraries have been working toward increased adoption of semantic web technologies and linked data. As early implementors of online systems, libraries created their own set of protocols and metadata formats that were optimized for a technical world that pre-dated the web.

The MARC record format was devised in the early days of library automation. At that time computer storage was enormously expensive and network bandwidth was barely a trickle compared to what is available today. This record structure was extremely compact and designed to carry bibliographic information according to the cataloging practices carried forward from physical catalog cards.

The MARC standards proved to be extremely successful in enabling a global ecosystem of shared cataloging, resource sharing, and systems compatibility. All integrated library systems and other library-oriented systems using bibliographic or authority records adopted the MARC formats. As a result, libraries can easily import catalog records from many different external sources, receive records with shelf-ready materials, and generally exchange bibliographic records in a host of scenarios. This ecosystem has amassed billions of MARC records, distributed across central bibliographic services and local integrated library systems. Protocols such as Z39.50 were also defined for the automated search and exchange of MARC records. Z39.50 is ubiquitously implemented in integrated library systems, cataloging services, and related systems.

While this global ecosystem has proved invaluable to the library community, it also has major drawbacks. These formats find use only in libraries, while the broader information ecosystem is based on web standards, such as XML and JSON. As semantic web technologies became prominent, RDF and other forms of linked data gained wide adoption. By adhering to domain-specific standards, libraries became somewhat isolated from the broader web and the growing universe of information encoded as linked data.

In this context, there have been a variety of initiatives to pivot the library metadata ecosystem to the realm of linked data. These efforts have built on each other for the last decade, leading to an increasingly mature linked data ecosystem for libraries. In some cases linked data replaces previous metadata formats, and in others it provides a bridge from MARC-based systems.

The Dublin Core Metadata Initiative was an early step out of an entirely MARC-based bibliographic universe. This effort, launched in 1995, devised a simplified metadata standard more appropriate than MARC for describing digital resources. Dublin Core was based on a finite number of metadata elements. Fifteen elements were defined in the original metadata set, with the option to create new ones to accommodate the needs of specialized communities adopting the standard. Dublin Core has gained wide use well beyond the library community.
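
To make the idea of a finite element set concrete, the following minimal Python sketch lists the fifteen elements of the original Dublin Core set and checks a flat record against them. The helper name `unknown_elements` and the sample record are hypothetical, introduced here only for illustration; they are not part of any Dublin Core tooling.

```python
# The fifteen elements of the original Dublin Core Metadata Element Set.
DC_ELEMENTS = frozenset({
    "contributor", "coverage", "creator", "date", "description",
    "format", "identifier", "language", "publisher", "relation",
    "rights", "source", "subject", "title", "type",
})


def unknown_elements(record):
    """Return the sorted keys of `record` that are not core elements.

    Hypothetical helper: every Dublin Core element is optional and
    repeatable, so only unrecognized names are reported.
    """
    return sorted(key for key in record if key not in DC_ELEMENTS)


# Illustrative record; the last key is deliberately not a core element.
sample = {
    "title": "Linked Data Brings Challenges and Opportunities to Libraries",
    "publisher": "ALA TechSource",
    "date": "2020",
    "language": "English",
    "type": "Article",
    "shelfmark": "n/a",
}

unknown = unknown_elements(sample)
```

A community that needs an element such as the hypothetical `shelfmark` above would define it as an extension rather than overloading one of the fifteen core names.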

Other projects have worked toward alternatives that could replace MARC in the library bibliographic ecosystem with new structures based on linked data and semantic web technologies. These efforts emerged out of selected national libraries, major academic or research libraries, and OCLC.

COMET (Cambridge OPen METadata) was a joint project between OCLC Research and Cambridge University in the UK, with funding from Jisc, to create and release a large set of bibliographic records derived from the Cambridge University Library catalogs in multiple forms of linked data. This project tested the viability of multiple linked data technologies to support library bibliographic content.

OCLC has been a consistent proponent of linked data for libraries. The organization has engaged in many projects with a diverse set of collaborators to develop, test, and implement linked data. OCLC has released multiple data products as linked data, including:

  • FAST (Faceted Access to Subject Terminology), a scheme based on the Library of Congress Subject Headings that relies on the individual components rather than complex coordinated terms. OCLC released the FAST headings as linked data in 2011.
  • VIAF, the Virtual International Authority File
  • WorldCat Works, released as linked data in 2014 with 197 million bibliographic work descriptions

The Library of Congress launched a major initiative to create a linked data alternative to MARC through its Bibliographic Framework Transformation Initiative. The aim of this project was to create a new framework for bibliographic information, consistent with modern concepts of linked data, that captures the content of MARC records but without the constraints of its legacy formats and encodings. The Library of Congress contracted with Zepheira to collaboratively develop a new framework mapped from MARC, now known as BIBFRAME. Since that initial effort, BIBFRAME has continued to mature through ongoing revisions and is generally positioned as the successor to the MARC formats. It is being implemented by many different projects in the US and internationally, including experimental prototypes as well as production environments.

These are just some examples of the work that has taken place to move libraries into the realm of linked data. Despite a maturing set of tools, technologies, and viable frameworks, the operational library bibliographic ecosystem remains well entrenched in MARC. With the vast number of MARC-based systems deployed globally and the generally slow cycles of product development, it is difficult to imagine a wholesale shift in the foreseeable future. Recently created systems are naturally designed to accommodate BIBFRAME but must simultaneously support MARC. The massive installed base of more longstanding systems was essentially hard-coded for MARC and cannot feasibly be reprogrammed to support BIBFRAME.

Despite the persistence of MARC, BIBFRAME and other linked data frameworks will be a growing component of the global bibliographic infrastructure for libraries. Even though library systems may not soon change to BIBFRAME as their internal bibliographic record structure, linked data plays an important role as a bridge to the web. MARC records can be transformed into BIBFRAME and other linked data syntaxes to achieve benefits such as better representation in web search results or integration with other information services. Also, the internal use of MARC does not preclude the delivery of catalog listings or other resource pages with embedded linked data references, such as BIBFRAME. In this way it is possible both to leverage the continuity of MARC as a mature and universally implemented standard and to gain new benefits from linked data.
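
As a rough illustration of this bridging role, the Python sketch below maps a few fields of a simplified MARC-like record to a BIBFRAME-style JSON-LD description and wraps the result in a script element of the kind a catalog page could embed. Everything here is an assumption made for illustration: the dictionary stand-in for a MARC record, the function names `marc_to_jsonld` and `embed_in_page`, the placeholder values, and the `catalog.example.org` address. The mapping covers only a small subset of BIBFRAME terms and is not the Library of Congress conversion specification.

```python
import json

# BIBFRAME vocabulary namespace published by the Library of Congress.
BF = "http://id.loc.gov/ontologies/bibframe/"

# Hypothetical, heavily simplified stand-in for a parsed MARC record:
# tag -> {subfield code -> value}. Real records also carry indicators,
# repeated fields, and many more tags. All values are placeholders.
marc_record = {
    "100": {"a": "Example, Author"},   # main entry, personal name
    "245": {"a": "An example title"},  # title statement
    "020": {"a": "0000000000000"},     # ISBN (placeholder)
}


def marc_to_jsonld(record, base_uri):
    """Map a few MARC fields to a BIBFRAME-style JSON-LD document.

    Illustrative mapping only, using a small subset of BIBFRAME terms.
    """
    work = {
        "@id": base_uri + "#Work",
        "@type": "bf:Work",
        "bf:title": {"@type": "bf:Title", "bf:mainTitle": record["245"]["a"]},
    }
    if "100" in record:
        work["bf:contribution"] = {
            "@type": "bf:Contribution",
            "bf:agent": {"@type": "bf:Agent", "rdfs:label": record["100"]["a"]},
        }
    instance = {
        "@id": base_uri + "#Instance",
        "@type": "bf:Instance",
        "bf:instanceOf": {"@id": work["@id"]},
    }
    if "020" in record:
        instance["bf:identifiedBy"] = {
            "@type": "bf:Isbn",
            "rdf:value": record["020"]["a"],
        }
    context = {
        "bf": BF,
        "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
        "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    }
    return {"@context": context, "@graph": [work, instance]}


def embed_in_page(jsonld):
    """Wrap a JSON-LD document in an HTML script element."""
    body = json.dumps(jsonld, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


doc = marc_to_jsonld(marc_record, "https://catalog.example.org/record/1")
snippet = embed_in_page(doc)
```

In this arrangement the system of record stays in MARC, and the linked data view is generated at delivery time, which is the coexistence the paragraph above describes.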

In this issue of Smart Libraries Newsletter, we feature EBSCO Information Services' acquisition of Zepheira, recognized as the leading linked data services firm in the library industry. Zepheira has played an important role in the promotion and deployment of linked data in the library sphere, notably through its work supporting the development of BIBFRAME.

Publication Year: 2020
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 40 Number 04
Issue: April 2020
Publisher: ALA TechSource
Series: Smarter Libraries through Technology
Place of Publication: Chicago, IL
Record Number:25076
Last Update:2022-12-05 14:42:22
Date Created:2020-04-20 13:59:19