Smart Libraries Q&A: Web discoverability

Smart Libraries Newsletter [May 2018]

In addition to providing catalogs and discovery tools, libraries should do all they can to make their materials easy to find through general search tools such as Google, Google Scholar, Bing, Microsoft Academic, and other services. While libraries work hard to implement the best discovery tools they can offer from their own websites, most users begin their research elsewhere. Libraries can employ a variety of techniques to improve the discoverability of their collection materials and services.

Ideally, information about libraries would be found just as easily on the web as information about other types of organizations. At the broad level, libraries themselves are well represented in the main search engines. In most cases, a query for the name of a library or for "a library near me" will return successful results. But the collections of libraries are less well represented than the inventories of commercial establishments. Searching for the title of a book or an article through a general search engine will usually turn up e-commerce options rather than copies held in libraries. The performance of library collections on search engines can be improved by taking proactive measures, though the changes will be at best gradual and uneven. I am not aware of any technique or technology able to provide reliable access to a library's collection through Google and competing search engines.

Library collections are especially difficult to expose to search engines. The basic technologies libraries use to manage and provide access to their resources by default tend to be isolated from the broader information universe. Most library catalogs and discovery services initially were developed without extensive implementation of linked data and without a strong emphasis on search engine optimization. The MARC formats underlying these systems were designed for efficient storage and must be transformed to RDF, XML, or other linked data structures for better interoperability with the broader information ecosystem and for optimized performance in search engines.

A variety of search engine optimization techniques can be implemented to improve the performance of library collections in Google and other search engines.

One of the general principles of positive coverage by search engines lies in having interesting content delivered with a clear and uncluttered presentation. Each item of content needs to be presented visually to human readers and have clean and well-structured coding behind the scenes. Optimal discoverability involves developing web-based resource pages, optimized both for human readers and for computer-based harvesting and indexing services. It is essential to go beyond the presentation of resources for users to ensure that each catalog or resource page exposes the item it describes in ways that search engines can understand and ingest.

A basic level of search engine optimization involves adding metadata to each content page. It has long been standard practice to include basic metadata in the header of a web page, such as the title element and the description meta tag, to properly label the page in a browser and to define the snippet of text that will be shown in search engine results. Further metadata tags can supply citation elements for bibliographic resources. Although Dublin Core was once commonly used for citation tags, those defined by HighWire Press are now preferred. Including these tags on resource pages for bibliographic items also facilitates their use with citation managers, such as Zotero, EndNote, and Mendeley.
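As a rough illustration of the citation tags described above, the following Python sketch renders HighWire Press style meta tags from a simple record. The record layout (a dictionary with "title", "authors", "date", and "isbn" keys) and the function name are assumptions made for this example, not part of any particular catalog system; the tag names are the commonly used citation_title, citation_author, citation_publication_date, and citation_isbn.

```python
from html import escape


def citation_meta_tags(record):
    """Return HTML meta tag lines for a bibliographic record (a plain dict).

    The dict keys used here are hypothetical; adapt them to your own data.
    """
    tags = [("citation_title", record["title"])]
    # Emit one citation_author tag per author, preserving their order.
    tags += [("citation_author", name) for name in record.get("authors", [])]
    for key, tag in (("date", "citation_publication_date"),
                     ("isbn", "citation_isbn")):
        if record.get(key):
            tags.append((tag, record[key]))
    # Escape values so quotes or ampersands cannot break the attribute.
    return "\n".join(
        f'<meta name="{name}" content="{escape(str(value), quote=True)}">'
        for name, value in tags
    )


record = {
    "title": "Gone with the Wind",
    "authors": ["Mitchell, Margaret"],
    "date": "1936",
}
print(citation_meta_tags(record))
```

The resulting lines would be placed in the head section of the resource page by whatever template produces it.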

Each resource in the collection should be presented through a unique and persistent URL. The canonical URL for a resource should also be simple and clean (for example: https://mylibrary.org/catalog/item/48282/gone_with_the_wind). Many library catalogs present much more complex URLs for each resource, often including session keys or other elements that make it difficult to discern the persistent, canonical URL for the resource.
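One way to picture the difference between a clean canonical URL and a session-laden one is the Python sketch below. It builds a URL in the pattern of the example above and strips session-style query parameters from an existing URL. The function names and the list of session parameter names are hypothetical; a real catalog may use different parameters, so treat this as a starting point rather than a finished tool.

```python
import re
import unicodedata
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical parameter names; check what your catalog actually emits.
SESSION_PARAMS = {"session", "sessionid", "sid", "jsessionid"}


def slugify(title):
    """Reduce a title to a lowercase ASCII slug joined by underscores."""
    ascii_title = (unicodedata.normalize("NFKD", title)
                   .encode("ascii", "ignore").decode("ascii"))
    return re.sub(r"[^a-z0-9]+", "_", ascii_title.lower()).strip("_")


def canonical_item_url(base, item_id, title):
    """Build a clean URL of the form base/catalog/item/<id>/<slug>."""
    return f"{base.rstrip('/')}/catalog/item/{item_id}/{slugify(title)}"


def strip_session_params(url):
    """Drop session-style query parameters and any fragment from a URL."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k.lower() not in SESSION_PARAMS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), ""))


print(canonical_item_url("https://mylibrary.org", 48282, "Gone with the Wind"))
print(strip_session_params("https://mylibrary.org/search?id=48282&sessionid=abc123"))
```

A URL cleaned this way can also be declared to search engines through a link element with rel="canonical" in the page header.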

A good search engine optimization strategy will also include generating a sitemap, following the sitemaps.org protocol. Sitemaps inform indexing services about all the unique pages on a website and can help them index a site more efficiently. Google and other search engines do not rely on sitemaps exclusively and will also crawl through all the internal and external links within a website.
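A minimal sketch of sitemap generation, using only the Python standard library, might look like the following. The function name and the input format (pairs of URL and last-modified date) are assumptions for illustration; the namespace and the urlset, url, loc, and lastmod elements come from the sitemaps.org protocol.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified) pairs.

    last_modified is a date string such as "2018-05-30", or None to omit it.
    """
    # Register the sitemap namespace as the default so tags carry no prefix.
    ET.register_namespace("", SITEMAP_NS)
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        if lastmod:
            ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


print(build_sitemap([
    ("https://mylibrary.org/catalog/item/48282/gone_with_the_wind", "2018-05-30"),
]))
```

The protocol limits a single sitemap file to 50,000 URLs, so a large catalog would need to split its records across several files and list them in a sitemap index.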

Failure to implement standard web practices can also impair discoverability. Google and other search engines increasingly penalize sites that are not friendly to mobile devices. Sites that have not implemented HTTPS may be considered less trustworthy and may not rank highly in search results.

Resources can also be delivered in ways that provide structured data throughout the body of the page. Coding can be embedded surrounding content elements to provide structure without impacting visual presentation. Schema.org defines a robust set of tags to ensure that each item of content on the page can be correctly interpreted by search engines. (See https://schema.org for additional information and examples.) Enhancing web-based resource pages to incorporate structured data elements, such as schema.org, requires the ability to access the programming or templates of the server delivering them. In the library context, this level of control is more likely to be possible for locally developed resources. In many cases, the library may not have extensive control over the presentation of each item as it is displayed.
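To make the idea of structured data more concrete, here is a small Python sketch that produces a schema.org description of a book. Note one substitution: the paragraph above describes tags embedded around content elements in the page body, whereas this sketch uses JSON-LD, an alternative syntax for the same schema.org vocabulary that is delivered as a single script block. The function name and parameters are hypothetical; Book, Person, name, author, url, and datePublished are schema.org terms.

```python
import json


def book_jsonld_script(title, author, url, date_published=None):
    """Return a JSON-LD script block describing a book with schema.org terms."""
    data = {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "url": url,
    }
    if date_published:
        data["datePublished"] = date_published
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a value can never close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


print(book_jsonld_script(
    "Gone with the Wind",
    "Margaret Mitchell",
    "https://mylibrary.org/catalog/item/48282/gone_with_the_wind",
    "1936",
))
```

Because the block is self-contained, it can be added by a page template without touching the visible markup, which may suit libraries with only limited control over their resource pages.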

These techniques can be implemented to improve the discoverability of library resources. Library programmers or web services librarians may be able to customize the templates or adjust the programming to provide metadata and embed tagging for structured data. This level of access may be possible for discovery interfaces based on open source software or even for a proprietary system, where the vendor enables extensive customization. Some libraries may be limited in implementing them for their core collections because they do not have sufficient control over the way that their content management systems, catalogs, or discovery tools format and deliver resource pages.

A number of commercial services for enhanced discoverability have emerged to assist libraries not able to implement their own technical strategies.

Zepheira specializes in providing services to libraries and other organizations, primarily oriented around linked data. For example, the company worked with the Library of Congress to develop BIBFRAME to represent bibliographic data in linked data as a possible successor to MARC. Zepheira has also created a service to help libraries improve the discoverability of their collection resources. The LibraryLink Network they have developed enables library materials to be represented in search results on the open web. Subscribing libraries provide copies of their MARC records that are converted into linked data using BIBFRAME and published within the library's domain, so that they can be easily indexed by search engines. For more information, see https://zepheira.com/.

Zepheira's LibraryLink service is the basis of products distributed by other library technology vendors, including:

  • SirsiDynix, branded as BLUEcloud Visibility (http://www.sirsidynix.com/products/bluecloud-visibility)
  • Innovative, branded as Innovative Linked Data (https://www.iii.com/products/metadata-management/#linked-data)

Demco Software offers its DiscoverLocal service to assist libraries in making their collections and events discoverable through search engine results. This service follows a model similar to Zepheira's: a library provides its MARC records, which are enhanced with geolocation data, converted into linked data, and then exposed to the search engines. The service includes reporting and analytics to enable the library to measure engagement with its services. See http://www.demcosoftware.com/products/discoverlocal/ for more information.

Koios, a relatively new startup, develops technology and services to help libraries market their services through improved presence in web search results. Its current offering, Libre Ads, is based on a set of services to acquire Google Ads for the library, available through Google Ad Grants for nonprofits. More information on Koios is available at https://www.koios.co/.

Each of these strategies and services can improve the representation and placement of a library's materials and services in search results from Google and other search engines. Don't expect immediate and dramatic change. It will take some time for metadata to be harvested and indexed by the search engines, and even longer to gain top relevancy rankings. But even if the difference is gradual, better exposure in Google and other search engines should result in increased use and impact of a library's collection and services.

Publication Year: 2018
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 38 Number 05
Issue: May 2018
Page(s): 5-7
Publisher: ALA TechSource
Series: Smart Libraries Q&A
Place of Publication: Chicago, IL
ISSN: 1541-8820
Record Number: 23508
Last Update: 2022-11-29 00:23:35
Date Created: 2018-05-30 09:31:58