Library Technology Guides

Next-generation discovery: an overview of the European scene

Copyright (c) 2013 Facet Publishing

Abstract: In this chapter we will provide a brief overview of the features and general characteristics of this new genre of library software, focusing on the products that have been deployed or developed in the United Kingdom and other parts of Europe. Some of these projects include adoption of commercial products from international vendors such as Serials Solutions, EBSCO, Ex Libris, or OCLC and others involve locally-developed software or implementation of open source products.


Dissatisfaction with the online catalogues delivered as part of the library management system sparked the emergence of a new genre of products and services that focus entirely on providing an improved experience in the way that libraries provide access to their collections and services. One of the major trends of this phase of library automation involves a separation between the library management system, which provides automation support for internal operations such as cataloguing, circulation, serials management and acquisition of new materials, and the presentation layer facing the users of the library. In this age of decoupled systems, a variety of commercial products and services, as well as projects taken on by library organizations, now find use in many libraries throughout the world.

In this chapter we will provide a brief overview of the features and general characteristics of this new genre of library software, focusing on the products that have been deployed or developed in the United Kingdom and other parts of Europe. Some of these projects include adoption of commercial products from international vendors such as Serials Solutions, EBSCO, Ex Libris, or OCLC and others involve locally-developed software or implementation of open source products.

These new catalogue products, or discovery services as they are more commonly known, embrace a more robust approach to interfaces, to content and to service delivery. In these times, when most library users use the web for a wide array of daily activities, it is essential for libraries to provide interfaces that follow the conventions and meet the expectations set by other successful destinations on the web. Libraries need to make dramatic improvements in the way that they provide access to their collections and deliver their services, as the current library systems in use reflect an earlier phase of the rapidly evolving history of the web.

These new-generation library discovery services aim to follow the interface conventions that have become well established in the mainstream of the web. E-commerce and social networking sites employ features that allow individuals to easily navigate through enormous quantities of information and take advantage of any services offered. While the mission and values of these other destinations differ substantially from those of libraries, many of the same techniques can find quite effective use by libraries as they craft their presence on the web.

Some of these basic interface techniques include:

Visually appealing interfaces with finely tuned usability. In contrast to the legacy library online catalogue, modern web interfaces employ designs that can be readily understood and navigated by typical users without any special instruction or training. They guide users through pathways that take them through the site and to the information or service of interest without making them struggle; they allow the user to think about what they want to do without the interface getting in the way.

Relevant results. When users need to search for information, modern interfaces return results ordered so that the most interesting or important items appear first. Search engines and e-commerce sites have led users to expect the items most closely associated with their query at the top of a list. Successful relevancy goes beyond mechanically ordered keyword matching, bringing in many other factors that help identify the items of highest importance or interest. It is often necessary to fold in usage data extracted from the behaviour of previous searchers and linking frequency, as well as social content such as end-user ratings or rankings. In a library environment, usage data from the library's circulation system, logs from link resolvers and the volume of holdings in local and national libraries might be some of the factors that help libraries provide meaningful ranking of search result candidates.
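As a rough illustration of how usage signals might be folded into keyword relevancy, the following Python sketch multiplies a simple term-match score by a popularity factor. The field names, weights and scoring formula are assumptions made for the example and do not describe any particular product.

```python
import math
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    loans: int              # hypothetical: loans recorded by the circulation system
    resolver_clicks: int    # hypothetical: requests seen in link resolver logs
    holding_libraries: int  # hypothetical: number of libraries holding the item


def keyword_score(query: str, title: str) -> float:
    """Fraction of query terms that appear as words in the title."""
    terms = query.lower().split()
    words = set(title.lower().split())
    if not terms:
        return 0.0
    return sum(1 for term in terms if term in words) / len(terms)


def relevance(query: str, record: Record,
              w_loans: float = 0.2, w_clicks: float = 0.1,
              w_holdings: float = 0.1) -> float:
    """Keyword score boosted by log-damped usage signals (illustrative weights)."""
    text = keyword_score(query, record.title)
    if text == 0.0:
        return 0.0  # popularity never rescues a record that does not match
    popularity = (w_loans * math.log1p(record.loans)
                  + w_clicks * math.log1p(record.resolver_clicks)
                  + w_holdings * math.log1p(record.holding_libraries))
    return text * (1.0 + popularity)


def rank(query: str, records: list[Record]) -> list[Record]:
    """Return records ordered from most to least relevant."""
    return sorted(records, key=lambda r: relevance(query, r), reverse=True)
```

The logarithm damps the usage counts so that a heavily borrowed title cannot completely swamp a better textual match; a production system would tune such weights against real search logs.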

Faceted navigation. Even when the system does a good job of ranking results by relevancy, the number of items retrieved often exceeds a manageable limit. Modern search environments usually employ a convention of facets, or clickable links of terms or categories, to help users narrow the results returned into a smaller number of items. Based on the content of the initial result list, the system will pull out categories and terms that when clicked will qualify the result set accordingly. In a library setting, categories of format (book, media, articles, manuscripts), dates, authors, subject headings, genre, or location are examples of facets that can easily guide a user through a broad result set to specific items of interest.
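The mechanics of facets can be sketched in a few lines of Python: count the values of selected fields across the current result list, then filter the list when a user clicks a value. The record layout below (plain dictionaries with "format" and "year" keys) is an assumption for illustration only.

```python
from collections import Counter


def facet_counts(results, fields):
    """Count how many results carry each value of each facet field."""
    counts = {field: Counter() for field in fields}
    for record in results:
        for field in fields:
            value = record.get(field)
            if value is not None:
                counts[field][value] += 1
    return counts


def apply_facet(results, field, value):
    """Narrow the result list to records matching the clicked facet value."""
    return [record for record in results if record.get(field) == value]


# Hypothetical result list for demonstration.
results = [
    {"title": "A", "format": "book", "year": 2009},
    {"title": "B", "format": "article", "year": 2010},
    {"title": "C", "format": "book", "year": 2010},
]
counts = facet_counts(results, ["format", "year"])
narrowed = apply_facet(results, "format", "book")
```

Because the counts are derived from the result list itself, each displayed facet value is guaranteed to lead to at least one item.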

Social features. Successful websites find ways to initiate interactive engagement with their users. Many individuals today expect not just one-way delivery and consumption of information, but to have the ability to provide feedback, rate the quality of items, write detailed reviews, and share interesting items with their online social circle. These social interactions give website operators important information on the success of their services and help them identify the best and worst content within their respective systems.

Comprehensive scope. One of the major difficulties with the way that libraries have traditionally laid out their offerings was the need to work with many separate systems to find all the material available. Separate interfaces for the online catalogue of the library management system, links to lists of e-journals or aggregated article databases, and options for institutional repositories, digital collections and interlibrary lending services forced library users to follow a lengthy and complex process to find information to satisfy their research goals. One of the main goals of next-generation library interfaces is to provide a much simpler pathway to finding all the content and services offered. Each of the products and projects employs different techniques to attain this goal, but one of the common approaches involves building a consolidated index that includes all of the materials that constitute the library's collection. The scope of these aggregated indexes varies from those that focus mostly on the materials held in the library's local collection (print and digital) to a much more ambitious endeavour that also includes representation of all of the individual articles included in subscriptions to e-journals and article databases. The quantity of articles is vast, especially for larger academic libraries, and may number in the many hundreds of millions. Yet to provide convenient access to the entire library collection, the aggregated index has emerged as one of the major strategies.

It should be noted that some libraries see continued value in a separate interface for their traditional collection. Such an interface can be offered either through the legacy online catalogue module associated with the library management system or through an instance of a discovery interface scoped to local content.

End-user services. These new-generation interfaces must also go beyond basic discovery of what items exist in a library's collections and deliver appropriate services for any given type of content. Within the interface itself, common expectations include personalized features that allow a library customer to log in to a personal account for the purpose of establishing preferences such as subject interests or desired points of contact, creating sets of bookmarks, saving items for later reference, exporting items to a citation manager, or related features.

The discovery interface also needs to perform services associated with the web-based online catalogue, including real-time display of item status, the ability to display items currently issued to the user, and self-service tasks such as placing holds, viewing any pending fines and making payments for fines and fees. A simple, though inelegant, technique can involve the discovery interface invoking the web-based online catalogue of the library management system, simply handing off the user to the legacy interface. A more sophisticated approach involves bringing all these user service features from the library management system into the discovery interface, taking advantage of behind-the-scenes interactions, such as application programming interfaces (APIs) or web services, to offer these services transparently without the jarring hand-off to the legacy catalogue.

Other services of the discovery interface relate to content from other sources, such as viewing of full text from subscription-based electronic content packages. To support these services, the discovery interface needs to integrate appropriate linking mechanisms, such as OpenURL link resolvers, and pass authentication credentials through proxy servers to enable simple click-and-view capabilities when the full text of an item is available in the library's collection. In general terms, the discovery interface needs to facilitate delivery of content and to enable all relevant services in addition to the basic task of letting the user know what content exists in the library collection.
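To make the OpenURL mechanism more concrete, the sketch below builds a link in the key/encoded-value form of OpenURL 1.0 (Z39.88-2004) for a journal article. The resolver address and the citation values are invented for the example; a real discovery interface would use the base URL of the library's own link resolver.

```python
from urllib.parse import urlencode


def build_openurl(resolver_base, citation):
    """Build an OpenURL 1.0 (KEV) link for a journal article citation.

    `citation` maps journal-format keys such as atitle, jtitle, issn,
    volume, spage and date to their values; empty values are skipped.
    """
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    for key, value in citation.items():
        if value:
            params[f"rft.{key}"] = value
    return f"{resolver_base}?{urlencode(params)}"


# Hypothetical resolver and citation, for illustration only.
link = build_openurl(
    "https://resolver.example.org/openurl",
    {
        "atitle": "Next-generation discovery",
        "jtitle": "Example Journal",
        "issn": "1234-5678",
        "volume": "12",
        "spage": "",
        "date": "2011",
    },
)
```

The resolver, rather than the discovery interface, decides which full-text copy the user is entitled to reach, which is why the link carries only the citation metadata.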

Products and Projects

In this section we will summarize some of the major discovery products available, focusing especially on how they have been implemented in Europe. We include commercial products, open source projects and other projects developed by European libraries.

AquaBrowser Library

One of the first products developed as a replacement for the online library catalogue came out of The Netherlands, developed by a company called Medialab Solutions, a multi-media research facility founded in 1990 by Professor M.M. ‘Thijs’ Chanowski as a spin-off of BSO Origin Philips. Medialab developed a search engine technology and interface called AquaBrowser that could be applied to many different kinds of information needs. The technology took hold in libraries, with Medialab customizing and refining AquaBrowser Library to suit their needs. The product was widely adopted by public libraries in The Netherlands, with as much as 80% shifting to it from their original online catalogues. Through a distribution arrangement with The Library Corporation, AquaBrowser Library found great popularity in libraries in the United States, including many public libraries and some major academic libraries such as Harvard University and the University of Chicago.

AquaBrowser makes use of a proprietary search engine developed by Medialab Solutions, called Igor, to index materials extracted from the library management system and other local repositories. It makes use of stemming and other search technologies to provide advanced search capabilities and offers faceted navigation. A unique feature of AquaBrowser is the “cloud of associations” presented visually on a side bar that allows users to see related concepts represented in the search results and launch a new search. AquaBrowser includes an optional federated search component to provide access to external resources such as article databases. Another optional component, called My Discoveries, introduces a social dimension to the product, including tags pulled in from LibraryThing.

In June 2007 Medialab Solutions was acquired by R.R. Bowker, part of the Cambridge Information Group portfolio of companies, which also includes ProQuest and Serials Solutions. In 2010 the corporate structure shifted, with Serials Solutions taking responsibility for AquaBrowser, along with the Summon discovery service it developed.

The changing corporate associations with AquaBrowser have affected some of the dynamics of its distribution in libraries. The acquisition of Medialab by R.R. Bowker meant the end of The Library Corporation's exclusive arrangement to distribute AquaBrowser in its markets, which prompted that company to develop its own end-user discovery product called LS2. Infor Library and Information Solutions, also a former distributor of AquaBrowser, has since developed its own product called Iguana, which it hopes to place in many of the libraries using AquaBrowser. Serials Solutions continues to develop and market AquaBrowser with its own plans for future expansion.

Some of the major deployments in the United Kingdom include:

  • National Library of Scotland (http://discover.nls.uk/)
  • Carmarthenshire Libraries (www.carmarthenshire.gov.uk/english/education/libraries/pages/librarycatalogue.aspx)
  • Croydon College Library (http://minerva.croydon.ac.uk/ipoint/)
  • Kingston University London (http://kuaquabrowser.kingston.ac.uk/ABL/)
  • National Library of Wales (http://discover.llgc.org.uk/)
  • Queen Margaret University (http://quest.qmu.ac.uk/qmu/)
  • Birkbeck College University of London (http://aquabrowser.lib.bbk.ac.uk/ablbirkbeck/)

Primo and Primo Central

Ex Libris, an international company based in Israel, specializes in automation products for national, research and academic libraries. In June 2006 the company launched Primo, a discovery product designed specifically for the needs of these types of library. The basic design of Primo includes a local index for local content, including materials harvested from the library management system. Normalization, indexing and presentation rules can be customized by the library to control the way that search results are retrieved, ordered and presented. Its search facilities, based on the Lucene search engine, produce relevancy-ranked results with tuneable boosting factors that give the library control over the relative precedence of any given type of material in search results.
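The idea of a tuneable boosting factor can be illustrated with a short Python sketch in which each material type has a multiplier applied to the base relevancy score. This is a conceptual example only: the tuple layout, the boost values and the function name are assumptions, and it does not reproduce Primo's configuration or the Lucene API.

```python
def apply_boosts(results, boosts, default=1.0):
    """Re-rank results after multiplying each score by its material-type boost.

    `results` is a list of (record_id, material_type, base_score) tuples;
    `boosts` maps a material type to a multiplier chosen by the library.
    """
    boosted = [
        (record_id, material_type, score * boosts.get(material_type, default))
        for record_id, material_type, score in results
    ]
    return sorted(boosted, key=lambda item: item[2], reverse=True)


# Hypothetical scores: the library chooses to favour books over articles.
results = [("r1", "article", 0.80), ("r2", "book", 0.70), ("r3", "map", 0.75)]
ranked = apply_boosts(results, {"book": 1.5, "article": 1.0})
```

With these illustrative values the book moves ahead of the article even though its base score was lower, which is the kind of local control over precedence described above.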

Primo has been designed to work independently of any library management system. It has been adopted by a large number of libraries using Ex Libris's own Aleph and Voyager library management systems and has also been implemented by libraries that use library management system products from competing companies. Libraries with large and complex collections can take advantage of Primo's capabilities to perform highly customized mapping and relevancy weightings to optimize access for local requirements. While most larger libraries have opted to operate local installations of Primo, Ex Libris also offers a hosted software-as-a-service version.

One area of functionality for these products involves the degree of integration with the local library management system. In addition to being able to display materials from the library management system, including real-time status and availability information, it is also helpful to enable self-service features routinely offered by web-based library catalogues through a patron's account. Functions such as viewing materials currently borrowed, performing renewals, placing reservations or making payments for fines and fees can also be provided via the discovery interface. Early versions of Primo would link into pages delivered by the online catalogue for these features. One of the advancements of Primo Version 3 is the ability to perform such tasks without the need to hand users off to the legacy catalogue interface.

In order to meet the expectations for a discovery interface to offer access beyond the local collections to e-journal subscriptions at the article level, Primo has from the beginning offered an integration with the MetaLib federated search utility to retrieve results from selected products through a secondary search. This approach, while providing some exposure to e-journal content, comes with all the disadvantages of federated search such as slower performance, limitations on the number of remote targets that can be included in a search and shallow numbers of results from each target.

In order to overcome these limitations and have articles available in the initial search with the same level of performance and with full indexing and relevancy rankings, Ex Libris developed Primo Central. In the same way that Primo provides a comprehensive index for content under the direct control of the library, Primo Central aggregates an expansive collection of scholarly articles drawn from the producers of databases and publishers of e-journals.

Primo Central, hosted by Ex Libris, integrates with Primo to produce unified relevancy-ranked search results including both local materials and subscribed articles. Ex Libris has worked to build the Primo Central index by developing arrangements with the major publishers and database providers that offer content to academic and research libraries. While Primo Central is not yet comprehensive, it has reached a critical mass where it represents a very large portion of these materials. Ex Libris continues to expand the Primo Central index through its efforts to develop cooperative partnerships with additional content producers. The Primo Central index has been populated with content that represents the potential universe of content products to which academic and research libraries subscribe. Through a detailed profiling of a library's actual subscriptions, search results can be scoped to exclude items to which library users do not have access. Results from the index of local content are seamlessly integrated with those from Primo Central to present library users with result listings that include all materials available to them.
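Scoping a central index to a library's subscriptions amounts to filtering results against a profile of subscribed collections while always keeping local records. The Python sketch below shows the idea under assumed field names ("source", "collection") and invented collection identifiers; it is not a description of how Primo Central stores or applies its profiles.

```python
def scope_to_subscriptions(results, subscribed_collections):
    """Keep local records plus central-index records the library subscribes to."""
    scoped = []
    for record in results:
        if record["source"] == "local":
            scoped.append(record)  # locally held material is always available
        elif record["collection"] in subscribed_collections:
            scoped.append(record)  # central-index item covered by a subscription
    return scoped


# Hypothetical merged result list and subscription profile.
results = [
    {"id": 1, "source": "local", "collection": None},
    {"id": 2, "source": "central", "collection": "publisher-a"},
    {"id": 3, "source": "central", "collection": "publisher-b"},
]
visible = scope_to_subscriptions(results, {"publisher-a"})
```

Filtering before display means users are not shown items they would be unable to open, at the cost of hiding material that might still be obtainable through interlibrary loan.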

Primo has been implemented by some of the largest libraries in Europe, including:

  • British Library (http://searchbeta.bl.uk)
  • Royal Library and Copenhagen University Library Information Service (http://rex.kb.dk/)
  • University of Oxford (http://solo.bodleian.ox.ac.uk/)
  • University of Manchester John Rylands Library as a software as a service (SaaS) or hosted implementation (http://man-fe.hosted.exlibrisgroup.com/primo_library/libweb/action/search.do).

Encore / Encore Synergy

Developed by Innovative Interfaces, Encore has been widely deployed as the next-generation catalogue for a large portion of libraries using the Millennium library management system. Although designed to operate with any major library management system, to date only a small number of libraries outside the Millennium fold have adopted it. Encore embodies a service-oriented architecture and other modern technology components. In addition to the standard set of features common in the discovery services genre, such as relevancy-based results and faceted navigation, Encore's distinctive features include a word cloud for refining results by tag.

Innovative has developed a relevancy ranking technology, called RightResult, optimized for library content. General keyword search engines may not necessarily present results ordered in ways that make sense for library materials. Encore's use of RightResult helps ensure that the items most likely to match the user's query appear near the top of result lists.

Consistent with other discovery interfaces, Encore has been designed to provide access to content beyond that managed within the library management system. Libraries can bring other content sources into Encore such as resource records from Electronic Resource Management, local digital collections, or from any repository that supports OAI-PMH.
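Harvesting from an OAI-PMH repository follows a simple request and response pattern: the harvester issues a ListRecords request, parses the returned XML, and repeats the request with a resumption token until none is returned. The following Python sketch builds the request URL and parses a small sample response. The repository address and the sample record are invented, and no network request is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI = "{http://www.openarchives.org/OAI/2.0/}"
DC = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc", resumption_token=None):
    """Build a ListRecords request; a resumption token replaces other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_list_records(xml_text):
    """Return (records, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iter(f"{OAI}record"):
        identifier = rec.findtext(f"{OAI}header/{OAI}identifier")
        title = next((el.text for el in rec.iter(f"{DC}title")), None)
        records.append({"identifier": identifier, "title": title})
    token = root.findtext(f"{OAI}ListRecords/{OAI}resumptionToken")
    return records, (token or None)


# Invented sample response with one Dublin Core record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:repo.example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample thesis</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
    <resumptionToken>page2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""

records, token = parse_list_records(SAMPLE)
next_url = list_records_url("https://repo.example.org/oai", resumption_token=token)
```

A full harvester would loop until the token is empty and would also handle deleted records and error responses, which are omitted here for brevity.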

Innovative, through an extended product called Encore Synergy, takes a different approach from its competitors in the way that it extends its discovery product to provide access to scholarly article content. Rather than creating a large aggregate index of articles as do products such as Summon, Primo Central and EBSCO Discovery Service, Encore Synergy uses web services to retrieve article content dynamically from selected content sources. The basic concept of Encore Synergy involves inserting a sampling of articles into the initial result list, which then guides users interested in that type of material to more comprehensive article search results. Since Encore Synergy depends not on harvesting but on real-time access to article resources, it avoids some of the difficulties of harvesting, such as gaps in current materials that may not yet be harvested and indexed, or omissions of content from publishers that have chosen not to provide their materials for harvesting. Innovative asserts that Encore Synergy is not based on a conventional federated search model, but on a more sophisticated implementation of web services that overcomes the issues involved in performance and depth of search results.

United Kingdom implementation examples include:

  • Bangor University (http://encore.bangor.ac.uk)
  • University of Bradford (http://bradfinder.brad.ac.uk)
  • City University London (http://encore.city.ac.uk)
  • University of Exeter (http://encore.exeter.ac.uk)
  • Wellcome Library (http://encore.wellcome.ac.uk)

Sorcer – Civica Library and Learning

Civica Library and Learning offers the Spydus library management system, which has been implemented in many parts of the world, especially in Australia, New Zealand, Taiwan and other countries in Asia, and in the United Kingdom.

Civica characterizes Sorcer as a consumer portal for Spydus, replacing the online catalogue with an interface that aims to be a social network for readers. It provides a modern interface and features such as lists of recommended materials and the ability to view and contribute reviews and ratings. Sorcer offers recommendation features such as “people who borrowed this also borrowed,” similar titles, titles by the same author and titles in the same series. The Sorcer interface uses AJAX technology to retrieve additional information as it is needed. Hovering the mouse over a link, for example, launches a small popup box with additional information on that item. Sorcer uses collapsible heading boxes that retrieve and display information when opened, such as holdings and availability, additional details, or a MARC display of the record. When a user enters a search, Sorcer offers drop-down suggestions once a few characters have been typed. A tabbed search display offers options for categories such as “What's New”, movies, books and general interest. Facets on the right side of the page are shown initially only as category headings, but when opened display individual terms along with the number of associated items.

An RSS icon provides a persistent feed for the current search. Sorcer does not offer relevancy ordering of results; results can be sorted only by title, author, publication date, series, or ratings.
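A recommendation feature of the “people who borrowed this also borrowed” kind can be approximated by counting which other titles appear in the loan histories of borrowers of a given title. The Python sketch below is a generic co-occurrence example with invented data; it is not based on how Sorcer or Spydus computes recommendations.

```python
from collections import Counter


def also_borrowed(loans, title, top_n=3):
    """Return the titles most often borrowed by people who borrowed `title`.

    `loans` maps a borrower identifier to the set of titles borrowed.
    """
    counts = Counter()
    for titles in loans.values():
        if title in titles:
            counts.update(t for t in titles if t != title)
    return [t for t, _ in counts.most_common(top_n)]


# Hypothetical, anonymized loan histories.
loans = {
    "u1": {"A", "B", "C"},
    "u2": {"A", "B"},
    "u3": {"A", "B", "C"},
    "u4": {"C", "D"},
}
suggestions = also_borrowed(loans, "A")
```

Any real deployment would need to consider patron privacy, for example by working only from aggregated or anonymized circulation data.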

Sorcer has been implemented by the city libraries in Bayside and Glenelg, both in Victoria, Australia, but has not yet been implemented by libraries in the United Kingdom. However, new sites are anticipated given that Civica ranks as one of the more successful automation vendors in recent years.

Australian implementations of Sorcer include:

  • Glenelg Library Service (http://glenelg.spydus.com/cgi-bin/sorcer.exe/MSGTRN/SORCER/HOME)
  • Bayside Library Service (http://bayside.spydus.com/cgi-bin/sorcer.exe/MSGTRN/SORCER/HOME)

Summon – Serials Solutions

Serials Solutions, a business unit of ProQuest, offers a discovery service called Summon that aims to provide comprehensive access to all materials in a library's collection. Summon, launched in January 2009, initiated a class of products, often referred to as web-scale discovery services, that address the universe of library content, including the vast collections of individual articles represented in subscriptions to e-journals and other content packages.

Consistent with the company's longstanding emphasis on providing libraries with tools to manage and provide access to electronic content, Summon can be seen as first addressing the incredibly challenging problem of developing a service capable of indexing and providing access in the most comprehensive way possible to all the articles to which a library might subscribe and then integrating access to local content. Technologies and methodologies for discovery within local collections had been well established and have been incorporated into Summon as well.

The strength of Summon lies in its ambition to index the largest possible proportion of the content representing the electronic holdings of any given library's collection. Serials Solutions, as part of ProQuest, gains access to any of the content represented across the company's broad range of content products. It has also made agreements with an expanding array of other aggregators and publishers to harvest and index their content for inclusion in the Summon index.

The fundamental design of Summon involves ingesting citation or full-text data for the purpose of indexing to support its discovery services, but when users select an item for viewing, printing or downloading, they link into the original provider. Summon does not republish the content; its purpose lies in providing libraries with a discovery service to connect their users more efficiently with the content to which they subscribe. This model, while in many cases bypassing the search interfaces offered by aggregators and publishers, maintains, if not strengthens, the value of the content represented in library subscriptions.

Serials Solutions characterizes Summon as a discovery service that is neutral relative to the source of the content and that does not have a built-in bias toward favouring materials associated with ProQuest. Summon also works with all of the major library management systems for integration of locally housed materials and can also ingest content from institutional repositories and digital collections.

Summon finds use in many libraries in the United Kingdom including:

  • University of Huddersfield (http://library.hud.ac.uk/summon/)
  • University of Dundee (http://www.dundee.ac.uk/library/search/llcsearch.htm)
  • University of London Research Library Services (http://external.shl.lon.ac.uk/summon/index.php)

Axiell Arena

Axiell, a major provider of automation products to libraries, archives and museums in Scandinavia and the United Kingdom, created Arena, an end-user portal that extends beyond discovery of library collections to deliver a managed environment for all of the information and services delivered through a library's website. Where most new-generation discovery interfaces replace, or supplement, the online catalogue and other search interfaces, Axiell Arena stands in for the library's entire web presence.

Axiell Arena has been designed to operate with any major library management system, though it is initially positioned to operate with the company's own products, including BOOK-IT, Origo, DDElibra, Libra.SE, Libra.FI and OpenGalaxy, the library management system developed by DS and used by over 60 library services in the United Kingdom and Ireland, including the expanding London Libraries Consortium. Axiell's involvement in the UK stems from its initial collaboration with DS in February 2008 to create a new end-user portal, its eventual acquisition of DS in April 2008 and the transformation of DS into Axiell Limited in 2009.

Arena follows established methodologies for interacting with a library management system: extracting content, displaying real-time status and availability, and enabling services such as viewing items on loan, performing renewals and other features common to new-generation library catalogues. Arena differs in its more ambitious scope, extending beyond resource discovery in an attempt to address all of the functions of a library website.

Examples of libraries in the United Kingdom implementing Axiell Arena include:

  • Doncaster Council Libraries (http://library.doncaster.gov.uk/web/arena)

Infor Iguana

Another European-based supplier of library management systems, Infor Library and Information Solutions, has created a new product called Iguana, which it positions as a “marketing and collaborative user interface” for libraries. Iguana has been designed to serve as the library's complete web presence, providing not only discovery services but a complete set of tools to build a library's website using current web technologies, with an orientation toward engendering collaboration with library users and strengthening relationships with them. On one level, Iguana includes the expected functionality of modern discovery environments, consistent with the other new-generation library catalogue interfaces. Beyond that, it presents the information and services surrounding the discovery interface in a way that helps the library market and promote its collections and services. As a comprehensive library portal product, Iguana provides the ability for a library to configure, customize and manage its entire website through a set of tools that do not require extensive technical knowledge.

One of the early adopters of Iguana is the Breda Public Library in The Netherlands (http://www.bibliotheekbreda.nl/iguana/www.main.cls).

EBSCO Discovery Service

EBSCO, a major content aggregator, entered the discovery services arena in January 2010 with the release of its EBSCO Discovery Service. The company's EBSCOhost family of content products ranks as one of the most widely used platforms for access to articles for libraries. EBSCOhost includes many specialized search features, tailored indexing and controlled vocabularies designed for optimal search and retrieval as well as integrated presentation of full-text articles within many of its offerings. The EBSCO Discovery Service extends the EBSCOhost platform to provide access to content products not part of its own offerings and to the library's local collections.

EBSCO Discovery Service, like other products within this class, aims to build a comprehensive index of all the articles represented by any given library's subscriptions. Access to all of the materials represented within its own EBSCOhost products gives it quite a head start. The company has extended the scope through agreements made with many other content providers. OCLC, for example, provides a representation of its massive WorldCat database in exchange for selected EBSCOhost data that can be loaded into WorldCat.

In addition to all the remote content from EBSCOhost and third-party content providers, EBSCO Discovery Service also uses the standard techniques for harvesting content from the local library management system and other local repositories, with real-time display of availability and status.

EBSCO also offers a product, announced in January 2011, which allows libraries to use the EBSCOhost interface as their local online catalogue. For libraries that rely on EBSCOhost as their primary platform for access to articles, this product will allow them to simplify their environment. Their local catalogue would appear simply as one of the selections offered to their users in the EBSCOhost interface. This online catalogue product does not include the third-party content represented in EBSCO Discovery Service.

Some of the libraries in the United Kingdom implementing EBSCO Discovery Service include:

  • University of Liverpool (http://www.liv.ac.uk/library/)
  • Bournemouth University (http://www.bournemouth.ac.uk/library/resources/mySearch.html)

OCLC WorldCat Local

OCLC, the library membership organization, offers the WorldCat bibliographic service, originally created for the purpose of collaborative cataloguing. Based in Dublin, Ohio in the United States, OCLC has grown into a global organization. OCLC's involvement in Europe has expanded through its acquisition of PICA, the former European-based library and information systems supplier. The WorldCat bibliographic database has grown to massive size, with over 220 million titles represented in early 2011 from libraries in 170 different countries. WorldCat has continually evolved in functionality, providing services such as resource sharing and interlibrary loan.

Beginning in about 2007, OCLC launched WorldCat Local as an interface that could supplement or complement a library's own online catalogue. This product starts with the massive WorldCat database and adds filtering and scoping capabilities to replicate the functionality of an online catalogue, including linkages to the underlying library management system to show current availability and status information. When used as a discovery interface for a library, WorldCat Local would be configured to favour the library's own holdings, showing them first in result listings, followed by holdings in nearby libraries. This approach allows users to learn about resources beyond those available in the local library, presenting additional materials that might be available from interlibrary loan services or other resource sharing arrangements. With the expectation that discovery tools also provide access to articles, OCLC has expanded WorldCat with an increasing body of articles that can be made available to users associated with libraries participating in WorldCat Local.

Beginning in 2009, OCLC launched a further expansion of the functionality based on the WorldCat platform to include circulation, acquisitions and license management of electronic resources in a new product called Web-scale Management Services. This product would essentially eliminate the necessity to operate a library management system. A small number of libraries began using Web-scale Management Services as early adopters in late 2010. In Europe, BibSys, which provides library and information systems to Norway's university libraries, college libraries, a number of research libraries and the National Library, announced that it will use OCLC's Web-scale Management Services for its new library system.

WorldCat Local has been adopted by over a thousand libraries, primarily in the United States. Many of these include OCLC member libraries that have taken advantage of the WorldCat Local Quick Start programme that allows them to make use of the product without additional cost, though without the detailed synchronization between the local library management system and WorldCat performed for those that use the full version.

Libraries in the United Kingdom that have adopted WorldCat Local include:

  • York St. John University (http://yorksj.worldcat.org/)

OCLC's TouchPoint

In addition to WorldCat Local, OCLC also offers an ‘end-user discovery service’ called TouchPoint. TouchPoint is designed to offer an integrated multilingual interface for both physical and digital content that can be integrated with any library management system.

Two key European implementations include:

  • Swissbib: (http://www.swissbib.ch/)
  • Leuphana University Lüneburg (http://t5240ldm-24.gbv.de:18080/TouchPoint/start.do)

SirsiDynix Enterprise

SirsiDynix, a global library management system vendor based in Provo, Utah in the United States, offers Enterprise as its strategic discovery interface product. Launched in July 2008, Enterprise employs a service-oriented architecture and relies on the GlobalBrain search and retrieval technology from BrainWare. GlobalBrain was created as an enterprise-class search platform for unstructured data from organizations in many different industry sectors, offered either as a standalone product or embedded in other applications. SirsiDynix has tailored GlobalBrain to library data as the search technology underlying its end-user products, including both the Enterprise discovery platform and its Portfolio digital asset management system. (See: http://www.brainware.com/search.php). BrainWare and SirsiDynix share corporate ownership by Vista Equity Partners.

Beyond its role as a discovery interface for access to library collections, Enterprise can also serve as a content management platform for information created by the library about its services or specialized topic areas. Content can be organized into groupings designed for different types of audiences. Carnegie Mellon University, for example, uses Enterprise not only as its primary discovery interface but also to manage its library website.

Although Enterprise has been designed to operate with any major library management system, to date it has been implemented primarily by its own sites running either Symphony or Horizon.

Enterprise offers the standard features of a new-generation library interface, including relevancy-based search and retrieval, faceted navigation, and the ability to index content beyond that of the local library management system.

The technology platform for SirsiDynix Enterprise also forms the basis of the company's digital library platform, released in November 2010, called Portfolio. Portfolio can be used to manage and present customized interfaces for digital collections.

Libraries in the United Kingdom implementing SirsiDynix Enterprise include:

  • Blackpool Libraries (http://libraries.blackpool.gov.uk/client/default)
  • Libraries of the British Museum (http://libraries.britishmuseum.org/client/default)
  • Greenwich Library Service (http://gren.ent.sirsidynix.net.uk/client/default)

Open Source Discovery Products

In addition to the discovery products produced and licensed by commercial companies, open source products have been created by libraries. As open source software, these products can be tested, customized, extended and implemented by libraries without licensing costs and without restrictions imposed by a vendor. While there are no direct fees for the use of the software, libraries using open source discovery products will incur costs related to any technical work that needs to be performed to implement and customize the software or to create new features not already available. Many libraries using open source software for their discovery interface may choose to engage external consultants or support firms. Libraries considering using open source discovery interfaces will need to balance the expense of local development and support against the fees paid for a commercial product, along with the value of the increased control and flexibility gained. While specialized support firms have emerged focused on open source library management systems, none have yet emerged offering comprehensive discovery services based on any of the open source products.

For libraries considering using open source software, there is an increasing number of information sources on the web to help them get started:

  • SCONUL Higher Education Library Technology (HELibTech) wiki has a special page dedicated to Open Source: (http://helibtech.com/Open+Source)
  • Open Source Software solutions for libraries mailing list: (https://www.jiscmail.ac.uk/cgi-bin/webadmin?A0=LIS-OSS)
  • Open Source System for Libraries: (http://oss4lib.org/)
  • The software developed as part of the JISC LMS programme which funded projects for enhancing library management systems: (http://code.google.com/p/jisclms/)

The open source Apache Lucene and SOLR search engine frameworks lie at the heart of many of the library discovery products. Not only do open source products such as VuFind and Blacklight make use of these components, but proprietary products do so as well, including Summon and Primo. The open source licence associated with Apache software allows them to be integrated into higher-level commercial products that are not themselves distributed under open source terms. Lucene has proven itself as a highly scalable search engine technology for library discovery systems, capable of indexing hundreds of millions of documents. SOLR extends Lucene with the ability to automatically generate facets, greatly reducing the effort in developing a complete discovery application.
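
The role SOLR plays in generating facets can be illustrated with a minimal sketch. The parameters `q`, `rows`, `wt`, `facet`, `facet.field` and `facet.mincount` are standard SOLR query parameters; the host, the core name `catalogue` and the field names `format` and `language` are assumptions made purely for illustration and are not taken from any product described in this chapter. The sketch only builds the request URL and reshapes a made-up sample of the flat value/count list that SOLR returns by default in its JSON output; it does not contact a server.

```python
from urllib.parse import urlencode


def build_facet_query(base_url, query, facet_fields, rows=10):
    """Return a SOLR select URL that requests facet counts alongside results."""
    params = [
        ("q", query),
        ("rows", str(rows)),
        ("wt", "json"),
        ("facet", "true"),
        ("facet.mincount", "1"),  # hide facet values with no matching records
    ]
    # One facet.field parameter per field to be summarised.
    params += [("facet.field", field) for field in facet_fields]
    return f"{base_url}/select?{urlencode(params)}"


def pair_facet_counts(flat):
    """Turn SOLR's default flat list [value, count, value, count, ...] into pairs."""
    return list(zip(flat[0::2], flat[1::2]))


# Hypothetical core and field names, for illustration only.
url = build_facet_query(
    "http://localhost:8983/solr/catalogue", "title:darwin", ["format", "language"]
)

# Made-up sample of the facet list for one field in a JSON response.
sample_format_facet = ["Book", 120, "Journal", 14, "Map", 3]
facets = pair_facet_counts(sample_format_facet)
```

An interface built on this pattern would render each pair as a clickable limiter and add a filter to the next request when the user selects one.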

Two of the major open source discovery products available include VuFind and Blacklight.

VuFind

VuFind was originally developed at Villanova University in the United States, in an effort led by Andrew Nagy. It relies on Apache Lucene and SOLR for its underlying search engine, with its interface programmed in PHP. VuFind offers a complete set of features, consistent with those seen in the commercial discovery products. Though Nagy has since left Villanova to serve as Market Manager of Discovery Services for Serials Solutions, Villanova continues its development with other personnel, in particular Demian Katz. Demonstrating the ability to blend use of commercial and open source products, Villanova has licensed Summon from Serials Solutions and offers the deep article-level Summon index through the VuFind interface. VuFind has been implemented by many academic and public libraries in the United States, including Marmot Library Network in Colorado, the Consortium of Academic and Research Libraries in Illinois and Auburn University, and by other libraries internationally such as the National Library of Australia. VuFind has an active international developer community, exemplified by the VuFind 2.0 conference in Villanova last year (http://vufind.org/wiki/vufind_2.0_conference) where the VuFind 2.0 Roadmap was developed: (http://vufind.org/docs/roadmap2.pdf).

Some of the libraries in the United Kingdom and Ireland implementing VuFind include:

  • The London School of Economics (https://catalogue.lse.ac.uk/)
  • Swansea University (https://ifind.swwhep.ac.uk/) implemented VuFind as a discovery layer that includes its own collections as well as the collections of the libraries of Swansea Metropolitan University and University of Wales Trinity Saint David.
  • National Library of Ireland (http://catalogue.nli.ie/)

Other European projects based on VuFind include:

    • eBooks-on-Demand (EOD), the trans-European digital document delivery service, uses VuFind for its search interface (http://search.books2ebooks.eu/)
    • The German Common Library Network (Gemeinsamer Bibliotheksverbund, GBV) uses VuFind to provide access to a large collection of scientific and technical resources, including both open access and restricted content (http://finden.nationallizenzen.de/).
    • The Bielefeld University Library in Germany has created a resource, called BASE, the Bielefeld Academic Search Engine, which uses VuFind to provide access to a large collection of open access content (http://www.base-search.net/), currently indexing over 25 million documents.

    VuFind Case Study: eBooks-on-Demand by Silvia Gstrein

    The eBooks-on-Demand (EOD) network provides a trans-European digital document delivery service for end-users from all over the world. Currently the EOD network comprises more than 30 libraries from 12 European countries offering their holdings for the digitisation on demand service (http://books2ebooks.eu/partner.php5). An “EOD” button linking to the order form is placed at the respective metadata record of each book provided for the service.

    Currently the EOD button is placed in some 45 library catalogues, a total made up of the large number of libraries already part of the network including those maintaining more than one library catalogue. From a user's point of view, it is very inconvenient and time consuming to browse individual catalogues of participating libraries in isolation. In addition, sometimes the various catalogue front ends work in an idiosyncratic way and the EOD button is not always implemented identically. Above all, the catalogues in the network use a wide variety of languages for their front-end interfaces. These languages are not always ones that users are familiar with.

    These issues contributed to our idea of creating a common starting point for browsing as many of the books offered for digitisation on demand via EOD as possible. Potentially, in the medium term, this tool should also allow for searching those books already digitised by our participating libraries.

    By the end of 2009, following an investigation of a variety of software packages, including VuFind (http://vufind.org/), Blacklight (http://projectblacklight.org/), LibraryFind (http://libraryfind.org/), Scriblio (http://about.scriblio.net/), and SOPAC (http://thesocialopac.net/), we decided on the software to use for setting up such a common discovery interface. The choice was made in favour of VuFind, a widely used open source package that is easily adaptable to our needs and already supports multiple languages in the front end, a really important feature in a trans-European project. Moreover, the strong community of developers was of great help in setting up and customizing the interface, another reason which contributed to our decision. A list of other organizations currently using or testing VuFind can be found at: (http://vufind.org/wiki/installation_status)

    Finally, by the end of 2010, the current implementation of VuFind was made publicly available: http://search.books2ebooks.eu/. This already included some 1.4 million records imported from about 10 libraries, the vast bulk of them pre-1900 books for sale via the digitisation on demand service. In cooperation with the libraries involved, we agreed on the following record metadata formats and importation methods into the “EOD search interface”:

    • Currently accepted metadata formats: MARC21 or MARCXML
    • Possible import interfaces: Harvests via OAI-PMH (preferred) or batch upload via FTP
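
As a rough illustration of the preferred import route, the following sketch shows the general shape of an OAI-PMH `ListRecords` harvest. It is not the EOD implementation. The `verb`, `metadataPrefix` and `resumptionToken` arguments and the XML namespace come from the OAI-PMH 2.0 protocol; the base URL, the prefix value `marc21` (prefix names vary between repositories) and the sample response are assumptions for illustration. The sketch parses a canned response rather than contacting a server.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix=None, resumption_token=None):
    """Build a ListRecords request; a resumption token replaces all other arguments."""
    if resumption_token:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    return f"{base_url}?{urlencode(params)}"


def parse_page(xml_text):
    """Return (record identifiers, next resumption token or None) for one response page."""
    root = ET.fromstring(xml_text)
    ids = [el.text for el in root.iterfind(".//oai:header/oai:identifier", OAI_NS)]
    token_el = root.find(".//oai:resumptionToken", OAI_NS)
    token = token_el.text if token_el is not None and token_el.text else None
    return ids, token


# Made-up response page from a hypothetical repository.
SAMPLE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>
"""

first_url = list_records_url("http://example.org/oai", metadata_prefix="marc21")
ids, token = parse_page(SAMPLE)
next_url = list_records_url("http://example.org/oai", resumption_token=token)
```

A harvester would repeat the request with each returned token until a page arrives with an empty or missing `resumptionToken`.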

    So far, we have received encouraging feedback as well as a growing degree of awareness from both users and libraries. With the help of XML site maps, the books can easily be found via the Google search engine. This is also reflected in the increasing numbers of hits, including an increasing number of users' clicks on the EOD buttons of single records.
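
For readers unfamiliar with XML site maps, a minimal sketch of generating one is shown here. The namespace is the one defined by the sitemaps.org protocol; the record URLs are hypothetical and do not reflect the actual URL pattern of the EOD search interface.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(record_urls):
    """Return a sitemap document listing one <url><loc> entry per record page."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for record_url in record_urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = record_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical record URLs, for illustration only.
sitemap_xml = build_sitemap(
    [
        "http://search.example.org/Record/1001",
        "http://search.example.org/Record/1002",
    ]
)
```

The protocol limits the number of URLs per file, so a catalogue of over a million records would need to split its entries across several files referenced from a sitemap index.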

    And yet, there are various details which need improvement in the near future:

    • An automated update mechanism at pre-defined intervals for retrieving changes and new records from the participating library catalogues.
    • Support for the import of MAB records, a format widely used in German-speaking libraries. This requires writing a transformation from MAB to MARC based on the existing concordance tables (http://www.d-nb.de/standardisierung/formate/marc21.htm).
    • Finally, a solution is still to be found for integrating records of digitised card catalogues (also called IPACs). In some cases where the digitised card catalogue has been processed through OCR, it might be possible to import results from an automated metadata comparison with online library catalogue records.

    Blacklight

    Blacklight follows a similar approach to VuFind using Lucene and SOLR, though it uses Ruby on Rails as its development framework instead of PHP. Development originated at the University of Virginia as a platform for access to the Nineteenth-century Scholarship Online (http://www.nines.org/), which was further developed into a discovery interface for the library's SirsiDynix Symphony library management system and a variety of other repositories and collections.

    Currently there are no known implementations of Blacklight in Europe. However, some investigation work has been undertaken as part of the JISC LMS programme for Enhancing Library Management Systems (http://www.jisc.ac.uk/whatwedo/programmes/inf11/jisclms.aspx).

    At the University of Hull, a project has investigated the possibility of using the Blacklight discovery interface to extend the library search environment to cover both the library catalogue and the local institutional repository. Further information about the project is available at: (http://code.google.com/p/jisclms/wiki/blathull) and (http://blacklightathull.wordpress.com/).

    Funded under the same programme, the CReDAUL project (Combining Resource Discovery for the Universities of Sussex and Brighton) has been testing Blacklight alongside VuFind to evaluate which system offers the best option for creating a combined web catalogue for the libraries of Brighton and Sussex universities. Further information on the project is available at: http://credaul.wordpress.com/

    In the United States, the libraries using Blacklight include:

    • Johns Hopkins University: https://catalyst.library.jhu.edu/
    • Stanford University: http://searchworks.stanford.edu.

    In addition to the discovery products mentioned above that find use in Europe, additional products have been developed and implemented in other international regions. Some of these include:

    BiblioCommons

    BiblioCommons, developed by a Canadian company of the same name, brings many concepts of social networks to the library catalogue. Its interface includes most of the now-standard features, such as a single search box, results ordered by relevancy, faceted navigation and visual enhancements, and includes a number of advanced search options. BiblioCommons has been implemented by many libraries in Canada, including the public library systems in Edmonton, Ottawa and Vancouver and the Chinook Arch Regional Library; in the United States by the public libraries in Boston, Seattle and Santa Clara County; and by the Christchurch City Libraries in New Zealand.

    eXtensible Catalog

    The eXtensible Catalog, a research project launched in April 2006 by the River Campus Libraries of the University of Rochester, with funding from the Andrew W. Mellon Foundation, has created a number of tools that complement the development of discovery products and services. The main outcomes of the project include a set of connectivity tools, including toolkits for the Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) and for the NISO Circulation Interchange Protocol, as well as the XC Metadata Services Toolkit. This toolkit offers utilities for the transformation and clean-up of metadata as it is extracted from repositories, such as library management systems, and loaded into discovery services. The eXtensible Catalog project has also created the XC Drupal Toolkit that provides a discovery interface with customizable faceted navigation based on content from repositories and the library website. Though the toolkits created by the eXtensible Catalog have been used by many projects, no libraries have yet placed the full set into use as their primary discovery interface.

    SOPAC

    Originally developed at the Ann Arbor District Library in Michigan under the lead of John Blyberg, SOPAC makes use of the open source Drupal content management system as the basis for a new generation library catalogue rich in social features. To date, no major SOPAC installations have taken place in libraries outside the United States.

    Library-based developments

    In addition to implementation of commercial and open source products, many library organisations have created their own new-generation catalogues or discovery projects.

    Beluga

    The Beluga project, underway at the State and University Library in Hamburg, Germany, serves as an example of a locally developed discovery interface that aims to provide contemporary features such as relevancy-based results, faceted browsing, and blended content for images and enriched bibliographic information, as well as a social dimension of user-supplied tags, reviews, and other content. The interface presents a single search entry point, uses Apache SOLR for indexing and offers visualizations that help users interpret results, such as graphs that summarize the types of materials represented. The Beluga interface interacts with the underlying library management systems to show current availability status and offers users suggestions for similar materials. Though the features and functionality of Beluga are similar to those in other commercially provided or open source discovery products, the project stands out through its emphasis on user-centred design and the usability studies performed to shape its development. Beluga currently spans the content of six academic and school libraries in Hamburg with combined holdings of six million titles. The project was led by Anne Christensen from its inception in November 2007, with Jan Frederik taking responsibility for the project in January 2011, as Beluga moves from its beta testing phase to a full production environment. http://beluga.sub.uni-hamburg.de/

    Data Wells

    In Scandinavia and other parts of Europe, large consolidated indexes representing broad repositories of content have come to be called “data wells”, and a number of projects have embraced the concept. In the same way that some commercial companies have worked on collaborations and business relationships with publishers, aggregators, and other content providers to gain access to a broad array of resources for the purpose of indexing in a discovery platform, some library organizations have engaged in similar activities, often with a more specific geographic or academic focus. Projects that include a discovery interface extended through a data well include Summa and Ting in Denmark, and Libris in Sweden.

    E-LIB Bremen

    E-LIB Bremen (http://www.suub.uni-bremen.de/) is the electronic library of the State and University Library in Bremen, Germany. It provides an integrated search across 32 million objects including the library catalogue and digital collections of the library. Of the digital collections an estimated 80% are full-text documents. The search engine has been fully integrated into the library website. The system is built on a search engine product called CIXBase (http://cixbase.dyndns.org/CiXbase/cixdocs/). Further information about the project is available at: (http://elib.suub.uni-bremen.de/projekt_elib_en.html)

    OpenBib

    The University of Cologne in Germany has implemented a search environment called Kölner UniversitätsGesamtkatalog, or KUG (http://kug.ub.uni-koeln.de/), based on an open source discovery interface called OpenBib, originally developed by Oliver Flimm. KUG encompasses a broad range of content, including the materials from the catalogues of 190 associated colleges and institutes and external collections, totalling over 11 million titles. The OpenBib interface includes most of the features now expected in new-generation catalogues such as faceted navigation, recommendations of related materials, tag clouds, and user-supplied tags. Search results can be separated according to target resource or combined. The technology components underlying OpenBib include the Apache web server, Perl, MySQL and Xapian, an open source search engine (http://xapian.org/).

    Summa

    Statsbiblioteket, or the State and University Library, based in Århus, is one of the national libraries in Denmark. It has been engaged in a project to provide a discovery environment that includes the holdings of libraries throughout the country as well as a large repository of scholarly articles. The initial Summa project included the creation of its own data well. The project took a new turn in mid-2010 when the Statsbiblioteket entered into a development partnership with Serials Solutions, involving licensing Summon to integrate its deep index of articles with the interface created for Summa.

    TING.concept

    Another effort to create a new discovery environment for Denmark that includes a data well component has been taken on by an organization called the TING.concept, a collaboration between the public libraries in Aarhus and Copenhagen and DBC, an organization that provides various products and services to libraries in Denmark, including content description or indexing services and library automation infrastructure. The consolidated index, or data well, will be created by DBC. The platform for the TING.concept is based on open source components including Apache Lucene and SOLR, Fedora Commons, PostgreSQL, and Drupal. Ting was implemented in the two original library partners in 2010, with new libraries, including Randers Bibliotek, joining in 2011. (See: http://ting.dk/)

    UniCat

    UniCat (http://www.unicat.be/), a union catalogue of 14 million records from nine major academic libraries in Belgium, is based on a proprietary search and retrieval product called LibHub, developed by Salam Baker and supported by SemperTool, a Danish software development firm. The system is populated through regular transfers of files of MARCXML records from each of the libraries. UniCat provides broad discovery services across the holdings of the participating libraries and integrates with the national and local interlibrary loan systems. SemperTool has also developed electronic resource management products and has developed custom digital library implementations in Africa and the Middle East.

    Based on the consolidated index approach of modern discovery services, UniCat provides traditional union catalogue functionality. Its interface offers some basic faceted navigation, with limiters for holding library, language, resource type and publication year, but not for authors or topical terms as is typical for next-generation discovery services. An availability pop-up shows what library holds a selected title, with a deep link into the local interface to show full bibliographic details, circulation status, and shelf location.

    Conclusion

    The realm of discovery systems has transformed the ways that libraries interact with their users on the web. These tools have made considerable progress in closing the gap between the way other destinations on the web function and the interfaces libraries offer for access to their collections and services. In this chapter we have described a range of modern alternatives to the clumsy online catalogues created in a previous phase of library technologies. The library discovery products available today routinely offer features such as fast, relevancy-based searching, the presentation of facets for guiding users through results, and contemporary interfaces with user-centered design and attractive presentation. More importantly, the scope of these interfaces has broadened beyond the traditional realm of the library management system to increasingly encompass growing amounts of electronic scholarly content at the article level and collections of digital objects. The state of the art of discovery services continues to advance through both improvements in technology and new opportunities to aggregate content. Much of the current movement among discovery projects involves partnerships to index ever-increasing bodies of content and deeper indexing of the full text of the materials, not just citation metadata. We are also seeing an interesting mix of products and projects, including commercial services produced by some of the giants of the automation industry, open source projects with very broad-based communities of developers and implementers, and a few initiatives taken on by individual libraries or consortia.

    Today, despite the availability of many different options for next-generation catalogues or discovery tools, the majority of libraries continue to rely on traditional online catalogues. We anticipate, however, a tipping point to take place where legacy catalogues will rapidly become displaced by the modern alternatives, such as those mentioned in this chapter. As we look forward a few years, we can anticipate that traditional library catalogues will be mostly eclipsed; few library management system vendors will continue to develop these modules and they will grow more obsolete year-by-year. Today, most libraries that have implemented discovery services continue to rely on the legacy online catalogue as an advanced search tool for the library's print collection. As discovery products gain these advanced search and browse capabilities, fewer libraries will feel the need to maintain their legacy catalogues. Open source and proprietary discovery products will continue to prosper: some libraries appreciate the freedom and flexibility inherent in open source software, other libraries will depend on the efforts of commercial organizations to create and maintain massive consolidated indexes of licensed and open access scholarly content. Consistent with general trends in computing technology, increasing proportions of libraries implementing discovery solutions will depend on consortially-hosted or cloud-based infrastructure rather than operating their own installations on local hardware platforms.

    Discovery services, as the primary experience that libraries present to their users, represent the most critical component of a library's infrastructure. This chapter has described the major products and projects, presenting libraries interested in implementing a new discovery interface with a variety of alternatives.

    Publication Year:2013
    Type of Material:Chapter
    Language English
    Published in: Catalog 2.0: The future of the library catalog
    Issue:2013
    Page(s):37-64
    Publisher:Facet Publishing
    Place of Publication:London
    Notes:This is a preprint of a chapter accepted for publication by Facet Publishing and should provide an electronic link to the publisher’s website (www.facetpublishing.co.uk)
    Record Number:18357
    Last Update:2015-09-11 15:16:53
    Date Created:2013-09-25 16:15:04