EBSCO Information Services has made a significant grant to the open source Koha ILS project in support of an ambitious set of enhancements and extensions. The grant was awarded to Koha Gruppo Italiano, an organization devoted to the promotion of Koha in Italy. The development work on the enhancements covered in the grant will be carried out through other Koha support firms, including Catalyst, based in Wellington, New Zealand, and ByWater Solutions. This grant signals strong support toward open source library software that complements EBSCO's partnerships with companies offering proprietary ILS products.
Koha Gruppo Italiano initiated a grant request to EBSCO in May 2014; it was granted on February 11, 2015. The specific monetary amount of the grant was not disclosed, but the tasks funded by the memorandum of understanding were itemized in the press announcement and are described below. The initial proposal requested funding for one enhancement chosen from a selection. According to Senior Vice President Scott Bernier: “We considered their proposal and decided that we wanted to support Koha in an even more significant way and decided to fund all of their enhancements.”
Koha finds use in all regions of the world, implemented by an estimated 7,000 to 9,000 libraries and rising. Because Koha is open source software that individual libraries implement on their own, and that regional or national library authorities promote or implement in many areas of the world, the exact number of libraries using it globally is difficult to track. In the United States, at least 855 library systems, representing 1,568 individual branches, have implemented Koha. Of this number, 41 (5.6%) have collections larger than 200,000 items; 353 (50.9%) have collections between 20,000 and 200,000; and 299 (43.1%) have collections smaller than 20,000 items. Many different types of libraries are represented, including 507 public, 104 academic, and 84 in K-12 schools. These numbers, taken from the libraries.org directory, may not be entirely comprehensive.
The April 2014 issue of Smart Libraries Newsletter provided coverage of Koha, including “History and Background of Koha.”
The Koha Community
Koha is developed through a globally distributed community of developers. OpenHub, a resource that analyzes open source software projects, characterizes Koha as having a “very large, active development team,” with 88 individuals contributing new programming code in the last year and 322 developers since the initiation of the project in 1999. OpenHub reports the Koha development team is in the largest 2 percent of projects it tracks.
Many libraries, especially in the developing world, implement Koha through the efforts of their own personnel, often assisted by members of the broader community via e-mail discussion lists and the Koha IRC channel. Dozens of commercial and nonprofit organizations around the world provide services to support libraries implementing Koha and contribute to its development. One of the original developers of Koha, Chris Cormack, continues work with Koha through Wellington, New Zealand-based Catalyst, a large technology services company involved with open source software in many business sectors. Cormack reports that Catalyst employs more than 235 personnel, including 182 developers, though only a small number are devoted to Koha. In the United States, ByWater Solutions ranks as the leading support provider for Koha with at least 538 clients spanning 892 library facilities. Equinox Software, though primarily involved in services related to the open source Evergreen ILS, also provides support for Koha. BibLibre provides support services to libraries in France and has been one of the companies most involved in software development. PTFS Europe provides services related to Koha in the United Kingdom and in Europe. Koha has become one of the preferred automation systems in public and school libraries in Argentina and is gaining adoption in academic libraries. The Universidad Nacional de Córdoba has been the focal point for support, advocacy, and development of Koha in Argentina. Dozens, if not hundreds, of other companies in many countries provide services in some way related to Koha.
The Koha Gruppo Italiano was formed in February 2012 to promote and support Koha throughout Italy. The organization focuses on advocacy, awareness, and fundraising more than technical support or development. With funding gained, it works with other Koha support firms, including ByWater Solutions, Catalyst, and BibLibre for the execution of technical projects. The organization has organized a number of events to introduce libraries in Italy to open source software, including VuFind and DSpace in addition to Koha. Individuals involved in the organization include:
- Franziska Wallner of the American University of Rome, which implemented Koha in 2006,
- Stefano Bargioni of the Pontificia Università della Santa Croce (Koha since 2011),
- Diego Ramirez of Pontificia Università della Santa Croce,
- Sebastian Hierl of the Arthur and Janet C. Ross Library of the American Academy in Rome, which implemented Koha in 2013 and relies on hosting and support services from ByWater Solutions.
Koha Gruppo Italiano has been active in raising interest and resources for the extension of Koha to include Elasticsearch since February 2014.
Koha Technology Stack
The Koha ILS has been evolving in functionality and architecture since its initial version created for the Horowhenua Library Trust, a relatively small three-branch library system north of Wellington, New Zealand. The original version was a relatively basic database-driven automation system. In the 15 years since its introduction, Koha has been developed into an integrated library system with features and functionality at a comparable level with many of the proprietary products. Koha continues to evolve in its technical architecture to sustain use by larger and more complex libraries.
The codebase underlying Koha is large. According to the GitHub stats for its public repository (http://git.koha-community.org/), the project currently totals 9,398,192 lines of code, distributed over 6,196 program files.
Koha manages its data through the MySQL open source relational database management system, is programmed in Perl, relies on the Apache web server, and runs on the major distributions of Linux. Koha currently includes support for Zebra, an open source search engine developed by Index Data. Zebra includes native support for library-oriented standards including MARC and Z39.50. It is a relatively lightweight infrastructure module that easily co-exists within the same server as the other components of Koha.
The earliest versions of Koha relied entirely on MySQL to deliver search results. LibLime, one of the early commercial support and development firms for Koha, began work in December 2005 to incorporate Zebra from Index Data to support the MARC database and search components of Koha. (LibLime has since been acquired by PTFS and focuses on its own LibLime Koha ILS, which forked from Koha in about 2009.) As Koha reaches into larger and more complex libraries, there has been considerable interest in an alternative search technology for greater scalability and additional features.
Outside the scope of the grant, architectural improvements are being developed to improve Koha's performance in processing transactions. As noted, Koha is written in Perl, an interpreted language in which each script is parsed and executed at run time rather than compiled in advance. Each script is also normally executed as a new process, adding system overhead. In an implementation with a high transaction load, these factors can lead to slower performance. To address this, an additional layer, called Plack, is being introduced that significantly reduces processing overhead. Plack support has already been implemented selectively. Some support and hosting providers have already incorporated it in the online catalogs of their production systems. Work to deliver the staff interface via Plack is also underway, with completion expected in the coming weeks.
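The difference between the two execution models can be illustrated with a small sketch. It is written in Python rather than Perl and does not use Plack itself. The names expensive_startup, handle_per_request, and PersistentApp are hypothetical, and the startup counter merely stands in for the cost of launching an interpreter and loading modules. It shows only the contrast between paying that cost on every request and paying it once in a long-running process.

```python
# Illustrative only: counts how often "startup" work runs under two models.
startup_runs = 0


def expensive_startup():
    """Stand-in for launching an interpreter and loading modules (hypothetical)."""
    global startup_runs
    startup_runs += 1
    return {"ready": True}


def handle_per_request(path):
    """Process-per-request model: startup work repeats on every request."""
    state = expensive_startup()
    return f"200 OK {path}" if state["ready"] else "500"


class PersistentApp:
    """Persistent model: startup work runs once, then requests reuse the state."""

    def __init__(self):
        self.state = expensive_startup()

    def __call__(self, path):
        return f"200 OK {path}" if self.state["ready"] else "500"


def count_startups(handler, paths):
    """Serve every path with the handler and report how many startups occurred."""
    before = startup_runs
    for path in paths:
        handler(path)
    return startup_runs - before


if __name__ == "__main__":
    paths = ["/search", "/checkout", "/renew"]
    print("per-request startups:", count_startups(handle_per_request, paths))
    app = PersistentApp()  # startup happens here, once
    print("persistent startups during requests:", count_startups(app, paths))
```

In the first model the startup count grows with the number of requests; in the second it stays fixed after the application object is created, which is the general idea behind running a web application in a persistent process.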
New Search Architecture Based on Elasticsearch
One of the most ambitious tasks supported by the EBSCO grant will be the extension of Koha to incorporate Elasticsearch (http://www.elasticsearch.org/) as a new option for search and retrieval. Zebra has been a very pragmatic indexing engine and search component for Koha, but does not necessarily offer the performance levels and features of alternatives such as Apache Solr or Elasticsearch, which were developed for very large scale applications across many different industry sectors. Zebra, which requires fewer hardware resources to operate, was originally developed for library-specific applications, with direct support for MARC record formats.
Elasticsearch and Apache Solr are the two most popular indexing and search servers for large-scale applications that involve retrieval of information from large content repositories. Both offer a powerful set of features and are offered as open source software. Elasticsearch and Apache Solr both rely on Apache Lucene as part of their indexing and search infrastructure. Solr is used by index-based discovery services, including Ex Libris Primo and ProQuest Summon.
The use of Elasticsearch with Koha will also provide more capabilities for creating facets to narrow search results and for improving relevancy in the presentation of search results. Elasticsearch finds use in many large-scale projects, such as The Guardian, processing more than 40 million documents per day; GitHub, for indexing all of its repositories; and many others (see http://www.elasticsearch.org/case-studies/).
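To make the idea of facets concrete, the following sketch builds a search request body in the style of the Elasticsearch query DSL, using one terms aggregation per facet. It is not taken from the Koha integration work: the function name build_faceted_query and the field names title, author, and itemtype are assumptions chosen for illustration, and the bool/filter form shown follows more recent versions of the query DSL. The sketch only constructs the request as a Python dictionary and does not contact a server.

```python
import json


def build_faceted_query(text, facet_fields, selected=None, size=10):
    """Build a request body with one terms aggregation per facet field.

    `selected` maps a facet field to a value the patron has already chosen;
    each choice becomes a filter that narrows the result set.
    """
    selected = selected or {}
    return {
        "query": {
            "bool": {
                # Full-text match on a hypothetical "title" field.
                "must": [{"match": {"title": text}}],
                # One exact-value filter per facet value already selected.
                "filter": [
                    {"term": {field: value}}
                    for field, value in sorted(selected.items())
                ],
            }
        },
        # Each aggregation asks for the most common values of one field,
        # which an interface can display as facet links with counts.
        "aggs": {
            field: {"terms": {"field": field, "size": size}}
            for field in facet_fields
        },
    }


if __name__ == "__main__":
    body = build_faceted_query(
        "whales", ["author", "itemtype"], selected={"itemtype": "BOOK"}
    )
    print(json.dumps(body, indent=2))
```

A search server answering such a request returns, alongside the matching records, a list of values and counts for each aggregated field; selecting one of those values and resubmitting it in `selected` narrows the results.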
The enhancement of Koha to use Elasticsearch will be implemented as an optional installation configuration. Many smaller libraries may not require the additional capabilities of Elasticsearch and may prefer not to have to manage a more complex set of components that come with its use. Koha with the Zebra search component can be managed easily on a single server.
Work on the implementation of Elasticsearch in Koha was already underway prior to the award of this grant. Currently one developer is working on the programming to accomplish the integration, with two others assisting in testing, according to the Universidad Nacional de Córdoba's Tomás Cohen Arazi, an active participant in the Koha community of developers and the Release Manager for Koha 3.20.
Other New Features
The grant will also fund the development of the ability for patrons to browse the contents of the library's collection according to author, title, subject, or call number. Browsing according to these categories has been a standard feature in major ILS products. Koha has previously offered the ability in its advanced search to limit queries by specified fields, but has not offered structured browsing.
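The distinction between searching and browsing can be sketched in a few lines. A browse list is essentially a sorted set of headings into which the patron jumps at a chosen starting point and then reads forward. The example below is a minimal illustration in Python, not the design planned for Koha; the class BrowseIndex, the normalize function, and the sample headings are all hypothetical.

```python
from bisect import bisect_left


def normalize(heading):
    """Fold case and collapse whitespace so headings sort and match consistently."""
    return " ".join(heading.casefold().split())


class BrowseIndex:
    """A sorted list of headings (authors, titles, subjects, or call numbers)."""

    def __init__(self, headings):
        # Keep the first display form seen for each normalized heading.
        by_key = {}
        for heading in headings:
            by_key.setdefault(normalize(heading), heading)
        self._keys = sorted(by_key)
        self._display = [by_key[key] for key in self._keys]

    def browse(self, start, count=5):
        """Return up to `count` headings at or after the starting string."""
        position = bisect_left(self._keys, normalize(start))
        return self._display[position:position + count]


if __name__ == "__main__":
    authors = BrowseIndex([
        "Tolkien, J. R. R.",
        "Austen, Jane",
        "Twain, Mark",
        "austen, jane",
        "Atwood, Margaret",
    ])
    print(authors.browse("Au", count=2))
```

Unlike a fielded search, which returns only records matching the query, a browse of this kind always returns the neighboring headings in order, even when nothing matches the starting string exactly.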
Other enhancements covered include the development of a “MARC to RDF crosswalk” and the ability to support other forms of metadata besides MARC21 to describe resources. This capability will be facilitated through the implementation of Elasticsearch, according to Sebastian Hierl, member of Koha Gruppo Italiano and librarian for the American Academy in Rome.
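As a rough illustration of what a crosswalk from MARC to RDF involves, the sketch below maps a few MARC 21 tag and subfield pairs to Dublin Core Terms properties and writes the result as N-Triples lines. The mapping table, the function names, and the example record URI are assumptions made for this example; the crosswalk funded by the grant may use a different vocabulary and a far more complete mapping.

```python
DCTERMS = "http://purl.org/dc/terms/"

# Hypothetical, deliberately tiny mapping: (MARC tag, subfield) -> RDF property.
CROSSWALK = {
    ("245", "a"): DCTERMS + "title",    # title statement
    ("100", "a"): DCTERMS + "creator",  # main entry, personal name
    ("650", "a"): DCTERMS + "subject",  # topical subject heading
}


def escape_literal(value):
    """Escape backslashes, quotes, and newlines for an N-Triples string literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )


def marc_to_ntriples(record_uri, fields):
    """Convert (tag, subfield, value) tuples into N-Triples lines.

    Fields without an entry in CROSSWALK are skipped.
    """
    triples = []
    for tag, code, value in fields:
        predicate = CROSSWALK.get((tag, code))
        if predicate is None:
            continue
        triples.append(
            f'<{record_uri}> <{predicate}> "{escape_literal(value)}" .'
        )
    return triples


if __name__ == "__main__":
    record = [
        ("245", "a", "Emma"),
        ("100", "a", "Austen, Jane"),
        ("999", "c", "42"),  # local field with no mapping, so it is dropped
    ]
    for line in marc_to_ntriples("http://example.org/bib/1", record):
        print(line)
```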
Extending the Koha Patron API
The scope of the grant includes enhancing the functionality of the Koha APIs related to patron-oriented tasks, which benefits Koha in a variety of ways. EBSCO will benefit from improved integration of its EBSCO Discovery Service in libraries using Koha. EBSCO works with almost all of the major ILS products, including Koha, to integrate its EBSCO Discovery Service. This integration can take the form of either a full replacement for the online catalog of the system or a supplementary article-level index queried through the interface of the online catalog provided with the ILS. EBSCO has not developed its own ILS, but has worked to bolster the exposure of EBSCO Discovery Service through partnerships with a broad range of ILS providers, including both proprietary and open source products.
One example of the many libraries that have used EBSCO Discovery Service in conjunction with Koha is the Hammermill Library of Mercyhurst University. EBSCO reports that there are hundreds of libraries using Koha that also subscribe to EBSCO Discovery Service, many of which are fully integrated. EBSCO has also provided support for the Kuali OLE project. The company became a Kuali Commercial Affiliate in April 2013. The scope of Kuali OLE does not include the provision of a discovery interface. In addition to its general support, EBSCO facilitates the integration of Kuali OLE with EBSCO Discovery Service as one of the interface options available. As seen in its partnerships with proprietary providers and open source projects, EBSCO supports a library technology ecosystem or infrastructure that gives libraries options to choose different discovery interfaces with any given integrated library system. This separation between resource management products and discovery services opens options for libraries that may have different choices for these two different domains. The ability to mix and match discovery and management products depends on a robust set of APIs to enable interoperability among diverse systems.
This grant by EBSCO to the Koha project reflects an interesting set of dynamics between one of the largest companies in the industry, involved in a wide range of business activities related to providing content, services, and technology to libraries, and a broad-based open source project. As Koha continues to find use in a growing and diverse set of libraries, EBSCO gains recognition within that global community, which includes libraries that may also be current or potential subscribers to its products and services.