As I visit libraries around the world and throughout the country, I'm constantly impressed with their unique and special materials as much as with their collections of published materials. Often when I visit a national library, I regard myself as fortunate when it shows off some of its most valuable and revered treasures. These special collections include artifacts produced by scholars, literary figures, government leaders, or other prominent individuals or organizations that made important contributions to society. These manuscripts, photographs, correspondence files, and physical objects have long been the focus of archives, museums, and the special collections units of libraries and other cultural institutions. Many also manage rare and valuable printed materials, ranging from incunabula and rare editions to early newspapers or periodicals and other published works too valuable or fragile for general access in a library's circulating collection.
One organization with which I am familiar deals with both traditional library materials and archival collections. It uses the terminology "published versus unpublished" materials to differentiate these two categories, with the library part of the organization managing published materials and its archives specializing in unpublished works. While there might be some gray areas, these categories have been helpful to me as I think about the similarities and differences between libraries and archives or the other special collections units within libraries.
In recent years, many of these special collections have also had to extend their efforts into the digital realm. Since content has been created with some form of computer technology for a few decades now, it's not at all surprising that these special collections are seeing increasing proportions of their acquisitions arrive in digital form, and the digitization of extant materials for access and preservation has become a routine activity.
Computer-oriented materials that might find their way into a special collections unit can be incredibly challenging to manage, to make accessible, and to preserve. Some of the materials acquired into special collections today may have been created during the earlier phases of the computer era. They might be files stored on media that have long been considered obsolete. It's easy to imagine that a retiring professor, for example, will turn over early article drafts, correspondence, and data stored on floppy diskettes that were widely used in the 1980s. But a library today would not likely have computers available that could physically read these files, much less handle the word processing formats created by the software of that time. The various combinations of computer storage devices and file formats used by authors over the decades present some very challenging problems for those who need to gain access today to the information they contain. Dealing with such materials often takes special equipment and technical expertise, and it involves activities that might be characterized as digital archaeology.
Websites and other dynamically generated content repositories present another set of difficulties to an archivist. The legacy of a scholar or researcher might, for example, include a database-driven website. An entity of this nature might consist of a physical server and multiple software components in addition to the actual data and textual content. Each of these components may have a very limited practical life span relative to the very long time frames in which archives aim to preserve them. How can custom-written software as well as the standard software components be maintained in perpetuity to provide access to the digital legacy of these resources? Some data might have been created through proprietary software that may require license keys for ongoing access.
It may be even more complicated to deal with digital heritage of recent vintage. Today, an increasing amount of content is created through cloud-based services rather than through local computing and storage devices, raising very difficult questions about how to preserve it for posterity. How does an archive or special collections unit process a collection that includes materials created and stored on services such as Google Docs or scholarly correspondence conducted through Gmail? Or datasets that reside on Amazon S3 cloud storage? As cloud computing gains wider adoption both by individual scholars and educational institutions, archives will face interesting challenges to ensure access and to preserve this content for the long term. How do things such as social media fit into archival collections? Would the Facebook page of an activist or scholar be available for inclusion? In addition to the technical ability to transfer content out of a cloud service into some kind of an archival environment, there may also be legal constraints, depending on the terms of service and business practices of the organization operating the service. It may require the intervention of legal experts as well as technologists to ensure the longevity of, and access to, content created in this era of cloud computing. It does seem that the cumulative generations of computer technologies present incredibly difficult problems that will need to be addressed by current and future librarians and archivists.
Archival collections have an interesting set of requirements for the technology that might be used to support their management and access. Some areas of concern include metadata management, physical and digital preservation, tools for discovery and access by end users, and automation of the acquisition and management of materials. There are a variety of products available that specialize in the automation needs of archives. Some of the archival management products I have come across recently include Adlib and Calm, both available from Axiell; ArchivesSpace (archivesspace.org), which was created through the combination of Archon and the Archivists' Toolkit and is now supported by LYRASIS; and CollectiveAccess (collectiveaccess.org), as well as products from Eloquent Systems, Inc.; Cuadra Associates, Inc.; and other commercial and nonprofit organizations.
Through a couple of projects in which I've been involved over the last year or so, I've become more acutely aware of the specialized requirements of managing the collections and operations of archives that differ from the realm of libraries with which I have much more experience. Some of the more basic areas of specialized functionality for archives include a more hierarchical approach to descriptive metadata and different business processes for the acquisition of materials, as well as the need for digital asset management and preservation.
When processing new collections, an archivist needs flexibility regarding the levels of description. A traditional collection might include boxes of manuscripts, correspondence, photographs, and other materials. It's important to have the ability to selectively describe collection materials, depending on their relative importance and the resources available. Archivists may create a single record for a collection, may generally describe each box, or may produce inventory lists of each folder in a box. Archivists may also choose to individually describe each manuscript or photograph and produce digital representations. The metadata record structure and any associated search or presentation software must have the ability to support this multilevel approach, representing its hierarchical context. The relatively flat approach of MARC records as used in an ILS may not necessarily adapt well to this aspect of archival collections. Metadata standards widely used in archives include EAD (Encoded Archival Description) and ISAD(G) (General International Standard Archival Description). Images associated with an archival collection may also be described with Dublin Core, VRA Core, or other metadata formats appropriate for digital objects.
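As an illustration only, the multilevel approach can be sketched as a simple tree in which each unit of description carries its level and inherits context from its parents. The class name, the level labels, and the sample titles below are hypothetical; this is not the EAD or ISAD(G) data model, just a minimal sketch of describing some boxes down to the item and others only at the box level.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Unit:
    """One unit of description at any level (collection, box, folder, item)."""
    level: str
    title: str
    children: List["Unit"] = field(default_factory=list)

    def add(self, child: "Unit") -> "Unit":
        """Attach a child unit and return it so deeper levels can be chained."""
        self.children.append(child)
        return child


def walk(unit: Unit, ancestors: Tuple[str, ...] = ()) -> Iterator[Tuple[str, str]]:
    """Yield (level, context path) for every described unit, parents first."""
    path = ancestors + (unit.title,)
    yield unit.level, " > ".join(path)
    for child in unit.children:
        yield from walk(child, path)


# Hypothetical collection: Box 1 is described down to a single letter,
# while Box 2 is described only at the box level.
collection = Unit("collection", "Papers of a retiring professor")
box1 = collection.add(Unit("box", "Box 1"))
folder = box1.add(Unit("folder", "Correspondence"))
folder.add(Unit("item", "Letter to a colleague"))
collection.add(Unit("box", "Box 2"))

records = list(walk(collection))
for level, path in records:
    print(f"{level:<10} {path}")
```

Each record keeps the full path of its ancestors, which is the kind of hierarchical context that a flat, one-record-per-title structure does not naturally carry.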
Digital Asset Management
Archives involved with digitized or born-digital materials will benefit from a digital asset management system. Digitizing a collection of photographs or manuscripts can quickly produce thousands or even millions of images, soon exhausting the practical limits of manual management. It's important to have solid technical infrastructure for the storage of the digital objects, for the creation and management of descriptive and technical metadata, and for their retrieval and display. For digitized manuscripts or textual material, useful features include the ability to organize or package the images consistently with the originals, as well as the ability to search and retrieve individual images. Digital asset management, depending on the magnitude of the collection, might be a feature built into other archival management systems or a dedicated application. If a separate digital asset management system is implemented, it is often necessary to build in linkages from other catalog or management systems.
One vital, but often neglected, activity for special collections involves ensuring the preservation of these materials for future generations. Implementing a technical infrastructure able to provide a stable and secure environment to house and manage digital objects can be relatively expensive. Such an environment would include data management processes with rigorously maintained and tested backup procedures, offsite redundancy, and other proactive measures to ensure that all files can be retained or recovered regardless of human errors or technical failures. But this kind of disaster recovery is just one component of a digital preservation environment capable of preserving content into the distant future. Digital preservation strategies need to include a set of processes that carry content forward even as the media formats, file structures, and access software become obsolete in future generations of technology. The trusted digital repositories created through a digital preservation strategy capture additional layers of metadata in order to be able to describe the technical characteristics of a digital object so that future generations would be able to either migrate it into currently supported file or media formats or to emulate the software needed to access its content. I see the most critical aspect of digital preservation as the enduring commitment from an organization to make the recurring investments that will be needed to ensure that these materials never become abandoned or obsolete. Long-term digital preservation requires a continuous, uninterrupted chain of curatorial activities that continually aligns the content of a prior generation into current technology.
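To make the idea of capturing technical characteristics a little more concrete, here is a minimal sketch that records a few basic properties of a file and later checks whether the file has changed. The choice of a SHA-256 checksum and the field names are my own assumptions for illustration; they are not drawn from any particular preservation standard or product, and a real repository would record far more.

```python
import hashlib
import os
import tempfile
from datetime import datetime, timezone


def technical_metadata(path: str) -> dict:
    """Record basic technical characteristics of one file (illustrative fields)."""
    sha256 = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large digitized images do not need to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    return {
        "file_name": os.path.basename(path),
        "extension": os.path.splitext(path)[1].lower(),
        "size_bytes": os.path.getsize(path),
        "sha256": sha256.hexdigest(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def is_unchanged(path: str, record: dict) -> bool:
    """Compare the file's current checksum with the one recorded earlier."""
    return technical_metadata(path)["sha256"] == record["sha256"]


# Demonstration with a throwaway file standing in for a deposited draft.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "draft_1987.TXT")
    with open(sample, "wb") as handle:
        handle.write(b"early article draft")
    record = technical_metadata(sample)
    intact = is_unchanged(sample, record)
    # Simulate accidental alteration of the stored file.
    with open(sample, "ab") as handle:
        handle.write(b" (altered)")
    damaged_detected = not is_unchanged(sample, record)

print(record["extension"], record["size_bytes"], intact, damaged_detected)
```

A check like this only detects change; recovering the file still depends on the backups and redundancy described above, and carrying it forward still depends on migration or emulation.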
The tasks involved in the acquisition, management, and access of archival materials differ substantially from those for library collections. Libraries purchase or license content from publishers or other content providers to give access to their users, either physically or electronically. How libraries manage physical materials that are purchased for permanent ownership differs in many ways from how they manage electronic resources, which tend to be licensed for access during a set term. Archival materials involve entirely different business processes and processing workflows. Individual items or entire collections of materials may be received as gifts, may be purchased from collectors, or may be acquired through other negotiated circumstances. Materials may need to be inventoried, appraised, insured, assessed, or treated in order to ensure their optimal storage environment for preservation. While archival pieces tend not to circulate to patrons in the same way as regular library materials, they still require inventory control and tracking since they might be loaned to other museums or archives, be placed into exhibitions, or be brought out of the collection from time to time for access by researchers. Since archives manage valuable and irreplaceable materials, they can benefit from automation systems that provide robust asset management capabilities.
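The inventory control and tracking described here can be pictured as a small record of where an item is and how it got there. The sketch below is hypothetical throughout: the status names, the rule that an item must return to storage between uses, and the identifier are assumptions made for illustration, not the workflow of any actual archival management product.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumed rule for this sketch: an item leaves storage for one purpose at a
# time and must come back to storage before it goes anywhere else.
ALLOWED = {
    "in storage": {"on loan", "on exhibition", "in reading room"},
    "on loan": {"in storage"},
    "on exhibition": {"in storage"},
    "in reading room": {"in storage"},
}


@dataclass
class TrackedItem:
    """An archival item with its current status and a history of movements."""
    identifier: str
    status: str = "in storage"
    history: List[Tuple[str, str]] = field(default_factory=list)

    def move(self, new_status: str, note: str) -> None:
        """Change status if the transition is allowed, and log it."""
        if new_status not in ALLOWED[self.status]:
            raise ValueError(
                f"{self.identifier}: cannot go from {self.status!r} to {new_status!r}"
            )
        self.history.append((new_status, note))
        self.status = new_status


item = TrackedItem("MS-001")  # hypothetical identifier
item.move("on loan", "lent to a partner museum")

# An item already on loan cannot also be placed into an exhibition.
try:
    item.move("on exhibition", "spring exhibition")
    rejected = False
except ValueError:
    rejected = True

item.move("in storage", "returned and inspected")
print(item.status, rejected, item.history)
```

Keeping the history alongside the current status gives a custody trail for items that are, by definition, irreplaceable.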
Comprehensive Resource Management
One of the major themes in library automation, especially for research and academic libraries, involves the emergence of a new genre of library services platforms that, among other characteristics, takes a more comprehensive approach to managing resources. The initial focus for these systems has been in offering a single platform that's able to support the management of library materials across different formats, including physical materials and electronic resources. Bringing together the relevant functionality of the traditional ILS and electronic resource management has been a major accomplishment.
It is not unusual for a library to have an extensive special collections or archives division. Most of the academic, research, state, national, and even some larger public libraries deal with archival materials. It is my observation that in most of these libraries the special collections or archives rely on a separate technical infrastructure. The institution may manage these collections with general purpose databases or spreadsheets. They may also use specialized archival management products such as the ones I mentioned earlier. Archival management systems, whether locally crafted or acquired from a vendor, require considerable overhead, both in terms of initial setup and ongoing operations.
Bringing special collections materials into a library's discovery environment for its patrons is often quite a challenge. Many libraries have implemented processes to incorporate collection-level records into their catalogs or discovery systems that point to finding aids or more granular access tools. Others have done more extensive integration that even includes good representations of the hierarchical nature of archival records.
Is it necessary for libraries to operate separate management systems for their special collections? I see a plausible case for extending the concept of comprehensive resource management a notch further than the current paradigm to also include archival materials. I have noted some substantial differences that apply to the management of archival collections in terms of expectations for hierarchical record structures, a specific domain of metadata standards, processing workflows, and asset management. Yet, the genre of library services platforms embraces flexibility as a fundamental concept. Products in this category generally offer the ability to handle a variety of different types of metadata that come into play in describing and managing physical and electronic resources, and some are beginning to include support for digital asset management. They accommodate the fundamentally distinct business rules and processing workflows for purchased and licensed materials. It seems reasonable that bringing in the management and discovery of archival materials could be a possible next phase in the development of library services platforms, more fully realizing the ideal of unified management and discovery across all aspects of a library. I see this as a positive move, allowing a library to bring its most interesting and valuable collections into its strategic environment for access and management rather than treating them as an isolated silo on the side. Although significant development might be required to accommodate the distinctive characteristics of these materials and the exacting requirements of archivists, expanding the scope of library services platforms could potentially offer benefits to libraries as well as open up opportunities for system developers.