TIND Technologies and Invenio: A New Model of Automation for Research Libraries

Smart Libraries Newsletter [August 2015]


Invenio, developed under the auspices of CERN, is open source software created to address the major areas of involvement for libraries associated with research organizations. The software can be used to support institutional repositories, manage digital assets or multimedia content, provide digital preservation, manage data sets associated with research projects, and manage library materials. Invenio has been most widely implemented to support institutional and disciplinary repositories but is seeing adoption for these other activities. As interest in Invenio expanded beyond the community of institutions closely affiliated with CERN, CERN established TIND Technologies as a new commercial company to promote the software and provide professional services.

Many libraries serving research institutions focus the majority of their efforts on their collections of electronic resources, documents, and digital objects, with print materials drawing a decreasing share of attention. Although physical materials continue as an essential component of these library collections, they do not necessarily rank as the center of the library's existence. The traditional integrated library system was initially conceived to manage print collections and has evolved over time to also handle other types of materials. Invenio offers a new approach: building a library management system on a repository platform, in contrast to the established integrated library systems.

Functionality

Invenio was originally developed to provide an institutional repository of scientific papers for CERN and other institutions but has since expanded in scope. Designed to be flexible and adaptable to many resource management needs, the software has served as the foundation for additional modules that address other types of use. Core functionality centers on the management and dissemination of documents, with its initial implementation as the CERN Document Server, one of the key sources of pre-prints and research reports for high-energy physics. It includes modules for the ingestion, classification, indexing, description, curation, search, and access of documents. Support for OAI-PMH enables Invenio to efficiently harvest documents from other repositories and to disseminate its contents to other services. Invenio supports more than 23 languages.
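To illustrate the kind of exchange OAI-PMH makes possible, the following is a minimal sketch of a harvesting client, not Invenio's own code. It builds a ListRecords request and reads record identifiers and the resumption token from a response. The verb, the metadataPrefix, from, set, and resumptionToken arguments, and the XML namespace come from the OAI-PMH 2.0 protocol; the base URL, the function names, and the sample response are assumptions made up for this example.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://repository.example.org/oai2d"

# Hand-written sample of a ListRecords response (assumed data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record><header><identifier>oai:example.org:1</identifier></header></record>
    <record><header><identifier>oai:example.org:2</identifier></header></record>
    <resumptionToken>page-2</resumptionToken>
  </ListRecords>
</OAI-PMH>"""


def build_list_records_url(base_url, metadata_prefix="oai_dc",
                           from_date=None, set_spec=None,
                           resumption_token=None):
    """Build the URL for an OAI-PMH ListRecords request."""
    if resumption_token:
        # The protocol treats resumptionToken as an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if from_date:
            params["from"] = from_date
        if set_spec:
            params["set"] = set_spec
    return base_url + "?" + urlencode(params)


def parse_list_records(xml_text):
    """Return (identifiers, resumption_token) from a ListRecords response."""
    root = ET.fromstring(xml_text)
    identifiers = [
        element.text
        for element in root.findall(
            ".//oai:record/oai:header/oai:identifier", OAI_NS)
    ]
    token_element = root.find(".//oai:resumptionToken", OAI_NS)
    has_token = token_element is not None and token_element.text
    return identifiers, (token_element.text if has_token else None)


if __name__ == "__main__":
    print(build_list_records_url(BASE_URL, from_date="2015-01-01"))
    print(parse_list_records(SAMPLE_RESPONSE))
```

A real harvester would fetch each URL over HTTP and repeat the request with the returned resumption token until the repository stops supplying one.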

Serving the interest in content beyond documents, Invenio can manage images, video, and other forms of digital media. Through customized interfaces and plug-ins, users can search and view or play multimedia.

Invenio also supports digital preservation, with workflows and features that follow the OAIS (Open Archival Information System) reference model. The system is able to calculate and validate checksums associated with each digital object and provide multiple layers of security for short- and long-term storage.
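The checksum step can be sketched in a few lines. The example below is a generic fixity check written with the Python standard library, not Invenio's implementation; the function names, the choice of SHA-256, and the sample file are assumptions for illustration.

```python
import hashlib
import os
import tempfile


def compute_checksum(path, algorithm="sha256", chunk_size=65536):
    """Hash a file in chunks so large digital objects need not fit in memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_checksum(path, expected, algorithm="sha256"):
    """Return True if the file still matches the checksum recorded earlier."""
    return compute_checksum(path, algorithm) == expected.lower()


def demo():
    """Record a checksum at ingest, then re-check before and after a change."""
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "object.bin")
        with open(path, "wb") as handle:
            handle.write(b"preprint contents")
        stored = compute_checksum(path)          # recorded at ingest time
        intact = validate_checksum(path, stored)
        with open(path, "ab") as handle:         # simulate silent corruption
            handle.write(b"!")
        still_intact = validate_checksum(path, stored)
    return intact, still_intact


if __name__ == "__main__":
    print(demo())
```

In a preservation workflow the stored value would be kept with the object's metadata and re-validated on a schedule, so that a mismatch flags the object for repair from another copy.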

In addition to individual bibliographic items or digital objects, Invenio also includes functionality to help organizations manage the data sets associated with research projects. Providing services in support of the management of research data has been a key area of interest for academic and research libraries in recent years.

Beginning in 2011, developers extended Invenio to provide the basic capabilities of an integrated library system for the circulation and management of physical library materials. The original functionality of the product included providing an online catalog in tandem with a separate ILS. The BibCirculation module enables libraries such as CERN to shift away from the management of these materials in a separate application. Invenio uses MARC21 as its native record structure and includes an editor for working with this format. The software can export records in many other formats, including Dublin Core, MARCXML, JSON, or any type of delimited file. Bibliographic records can be produced and modified individually, or they can be imported in batch.
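As a rough illustration of exporting one record to several formats, the sketch below maps a simplified MARC21-style record to Dublin Core elements, JSON, and a tab-delimited file. It is not Invenio's export code: the dictionary layout of the record, the function names, and the three-field mapping (245$a to title, 100$a to creator, 260$c to date, a common crosswalk) are assumptions chosen to keep the example short. The sample data is the thesis cited in this article.

```python
import csv
import io
import json

# Simplified stand-in for a MARC21 record: tag -> list of subfield dicts.
SAMPLE_RECORD = {
    "100": [{"a": "Silvestre, Joaquim Jorge Rodrigues"}],
    "245": [{"a": "An Integrated Library System on the CERN Document Server"}],
    "260": [{"c": "2010"}],
}

# Assumed minimal crosswalk from (MARC tag, subfield) to a Dublin Core element.
MARC_TO_DC = {
    ("245", "a"): "title",
    ("100", "a"): "creator",
    ("260", "c"): "date",
}


def to_dublin_core(record):
    """Collect Dublin Core elements from the mapped MARC subfields."""
    dublin_core = {}
    for (tag, code), element in MARC_TO_DC.items():
        for field in record.get(tag, []):
            if code in field:
                dublin_core.setdefault(element, []).append(field[code])
    return dublin_core


def to_json(record):
    """Serialize the record structure as JSON text."""
    return json.dumps(record, sort_keys=True)


def to_delimited(records, columns=("title", "creator", "date"),
                 delimiter="\t"):
    """Write one header row and one row per record, joining repeated values."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerow(columns)
    for record in records:
        dublin_core = to_dublin_core(record)
        writer.writerow(
            ["; ".join(dublin_core.get(column, [])) for column in columns])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_dublin_core(SAMPLE_RECORD))
    print(to_json(SAMPLE_RECORD))
    print(to_delimited([SAMPLE_RECORD]))
```

Keeping the crosswalk in a single table makes it straightforward to add further fields or a different target format without touching the export functions.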

Invenio does not yet have specialized functionality for electronic resource management, link resolution, or an article-level discovery service. The simplified library automation model may also lack some of the detailed functionality present in integrated library systems, especially in the areas of acquisitions, serials management, or circulation and resource sharing for institutions with large collections, multiple campuses and service points, and complex loan policies. Invenio has primarily been implemented for library management in institutions that serve researchers and not in those with large student populations.

Development History

Invenio was originally developed in support of the library at CERN, a major research organization specializing in high-energy physics. CERN may be known best in recent years for building the Large Hadron Collider and conducting research with it. The CERN library supports the researchers affiliated with CERN and the dissemination of the documents and reports produced by CERN and related organizations.

CERN initially automated its library using a commercially provided integrated library system. In 1983 the library became one of the earliest implementers outside of Israel of the Aleph ILS from Ex Libris. Given that the World Wide Web was invented at CERN by Tim Berners-Lee, the library was interested in making resources available in this emerging environment. In 1993 the IT group supporting the CERN library created a Web-based interface for Aleph called weblib. This technology was extended in 2002 to create a full document repository, called the CERN Document Server (CDS), to provide a more efficient mechanism for managing and distributing the documents created at CERN and for handling the many requests it received for copies. The software was originally known as CDSware and was later renamed Invenio, reflecting its development and use by a community of institutions beyond CERN.

As time progressed, Invenio came to support a broader set of activities at the CERN library, leaving Aleph in place primarily to manage the book collection. Beginning in 2011, Invenio was extended with library management features to more fully integrate the library's technical infrastructure and to relieve it from having to maintain Aleph. This move both saved the library the fees it paid to Ex Libris and enabled a simpler and more efficient environment for the management of and access to library materials.

Extending Invenio to manage library materials was accomplished through the creation of a new module called BibCirculation. The module was developed in an 18-month project carried out by Joaquim Jorge Rodrigues Silvestre, working in the Information Technology Department of CERN. A complete description of the context, the development process, and the capabilities of the module is given in Silvestre's Master's Thesis (An Integrated Library System on the CERN Document Server. April 2010. Master's Thesis in Computer Science Engineering. Universidade de Evora).

Below is the timeline of CERN and the development of Invenio.

  • 1954: CERN established (European Center for Nuclear Research)
  • 1983: CERN Library implements Aleph to manage its print collections. One of the first European libraries to implement Aleph from Ex Libris
  • 1989: Tim Berners-Lee invents the Web at CERN
  • 1993: CERN Preprint Server made available on the Web as an institutional repository
  • 1996: CERN Library Server (weblib) launched to also include books, periodicals and other materials, functions as a front-end interface for CERN Aleph installation
  • 2000: CERN Document Server (CDS) launched for multimedia material and internal documents, building on weblib
  • 2002: CDSware released as open source for use by other institutions
  • 2006: CDSware renamed to CDS Invenio (Release 0.9x)
  • 2010: Release of Invenio 1.0
  • 2011: Library management features added to Invenio
  • 2013, May: TIND Technologies established
  • 2014: Release of Invenio 2.0

Technical Components and Architecture

Invenio has been in development for more than a decade and has seen some evolution in the technical architecture and components. The software comprises more than 40 modules, which can be selectively implemented depending on the scope of functionality required. Invenio provides all public and staff-oriented functions through Web-based interfaces. The application makes use of a relational database to manage bibliographic records and operational data. Databases supported include MySQL, PostgreSQL, SQLite, and MongoDB. The application is primarily programmed in Python.

SQLAlchemy provides object-relational mapping and other services between the underlying database and Python.

Versions 1.0 and earlier of Invenio use MARC21 as the underlying bibliographic record format; with version 2.0, records will be stored in JSON. The user interface for Invenio makes extensive use of JavaScript and the jQuery library, with Bootstrap planned for future versions. Other technical components include HAProxy for load balancing and the Apache Web server. Invenio can optionally be configured to use Solr for indexing and retrieval, and a change from Solr to Elasticsearch is also planned for version 2.0. The current production version of Invenio is 1.2.

Invenio has been created as a server-based application where each instance requires its own copy of the codebase. It is not a multi-tenant platform, as would be expected for new software services created today. Each instance of Invenio can be expected to scale to support large collections with millions of records. The modules of Invenio include RESTful APIs to communicate with external applications or scripts.

TIND Technologies has developed its own hosting infrastructure for efficient deployment and maintenance of Invenio. The company is able to test and validate fixes and software updates and push approved changes to the software to all of its clients simultaneously.

TIND Technologies Launched to Provide Invenio Services

CERN operates a Knowledge Transfer office to identify opportunities and to create spin-off companies to advance software or other intellectual products created by the organization. Knowledge Transfer partners with an entrepreneurship program of the Norwegian University of Science and Technology. CERN spin-offs seek investors to develop a company and share a portion of the revenues generated.

As external organizations began making use of the Invenio software, CERN continually received requests for information, help with implementation, and other types of assistance. CERN IT planned to continue to develop the software, but was not in a position to devote resources to provide external support. This apparent demand for services surrounding Invenio drove Knowledge Transfer to consider the creation of a spin-off company to fill this niche.

TIND Technologies AS was formed as a commercial CERN spin-off company in 2013, with Kenneth Hole and Alexander Nietzold as its co-founders. The new company works closely with the Invenio development team at CERN, with the Knowledge Transfer group, and with the CERN Library.

CERN has established TIND as the exclusive organization for providing professional services for Invenio. It offers TIND access to the development team for Invenio, giving the company important expertise. TIND primarily supports the software as developed at CERN, but has in-house technical personnel, several of whom came to TIND from CERN IT, for creating and maintaining its hosting platform and for conversion, installation, customization, configuration, and other services. Any individual or organization can download, modify, use, or redistribute Invenio, as it is offered under a GNU open source license, but only TIND has direct access to the Invenio development team, beyond the mailing lists, documentation, and other resources generally available to anyone. TIND is licensed to use the CERN brand and logo and returns a portion of revenues as royalties to CERN.

Because Invenio is available as open source software, the primary business model for TIND lies not in license fees but in support services. TIND has developed a robust hosting infrastructure able to efficiently deploy and maintain instances of Invenio through software as a service (SaaS). The company provides hosting services for any type of implementation using Invenio, with libraries representing the largest portion.

Principals of TIND include:

  • Alexander Nietzold, Managing Director
  • Kenneth Hole, Product Manager and Project Manager
  • Fredrick Carlsen, Software Architect
  • Audun Bjørkøy, Technical Director

The company employs a total workforce of eight individuals. It is based in Trondheim, Norway and has offices in Geneva and Paris. TIND Technologies is owned by its employees and does not rely on capital from external investors. TIND has also received some government grants, primarily from Norway.

Current Implementations

TIND is a relatively new company, but has established itself as a provider of services to a variety of organizations. Institutions that have implemented the library management system from TIND include:

  • United Nations Office of the High Commissioner for Human Rights
  • United Nations Office in Vienna

Those working with TIND for repository or research data management include:

  • The Institute of Applied Mechanics of the Czech Academy of Science
  • University of Applied Sciences in Western Switzerland

For more information:

  • TIND Technologies: tind.io
  • Invenio: invenio-software.org
  • CERN: home.web.cern.ch
Publication Year: 2015
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 35 Number 08
Issue: August 2015
Page(s): 2-5
Publisher: ALA TechSource
Place of Publication: Chicago, IL
Company: TIND
ISSN: 1541-8820