Smart Libraries Newsletter

Smart Libraries Q&A: Research data management

Smart Libraries Newsletter [June 2021]


What skills or knowledge should libraries wanting to get involved in research data management develop in staff?

In recent years academic libraries have increased their involvement with research data management. Providing services to help manage data sets related to scientific research will benefit researchers and strengthen the strategic position of the library within the institution. These services align with the core expertise of library professionals.

The skills or expertise needed by a library professional will depend on multiple factors. Some roles may involve assisting researchers with procedures related to applicable policies for data. In other scenarios, a data scientist in the library may have a role in data manipulation or analysis.

Preserving data lies at the heart of scientific research. Experimental processes and data analysis have to be reproducible. While library professionals will not likely be involved in data collection, they may play a role in preserving data sets for validation of analysis or for follow-on research.

An academic library ideally will have defined the scope of services it provides for research data support. It should be clear to researchers what level of service they can expect, and the library must plan how to allocate resources or acquire the corresponding technology tools. As a library becomes involved in research data services, it will do well to be sure that the resources are reasonably scaled according to anticipated interest by researchers. Although this type of service can improve the standing of the library when executed well, it can become a source of frustration when executed poorly.

The Research Data portal for the Purdue University Libraries serves as a good example of a well-defined set of services [1].

Skills in this area fall into policy and planning. Librarians will want to thoroughly review existing policies and services offered through the institutional office of research to avoid overlapping services and to identify any gaps or supplementary services that could be offered through the library. Any proposed services will need to be carefully negotiated with all stakeholders. Researchers and administrators will naturally be protective of their data assets. Libraries will want to tread carefully as they propose new services.

One area of potential library involvement is the data management plans required by most funding agencies. These plans describe the policies, procedures, and platforms that ensure the preservation of research data and its availability to other researchers as may be required by the terms of a grant. The National Science Foundation, for example, describes its data management plan requirements [2]. Librarians interested in this aspect of research data management would need to have a detailed understanding of the data management plan requirements of each of the major agencies that fund research in the institution and develop templates or other tools to facilitate the preparation of this section of a grant proposal.

Libraries may also be involved in providing the technical and administrative infrastructure to fulfill aspects of the plan. In many institutions the library will operate a data repository that provides secure and controlled access to research data and the associated metadata for each dataset. A research data repository may be an extension of an institutional repository provided for local copies of scholarly articles, preprints, or other research outputs. Skills and training related to data repositories would address managing the repository and understanding the metadata formats and vocabularies used to describe datasets.
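As a rough illustration of what the "associated metadata for each dataset" can look like in practice, the following Python sketch models a minimal descriptive record and reports which required fields are still empty. The field names and the choice of required fields are assumptions made for illustration; an actual repository would follow the published metadata schema of its platform.

```python
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical minimal record; field names are illustrative assumptions,
# not the schema of any particular repository platform.
@dataclass
class DatasetRecord:
    title: str
    creators: List[str]
    publication_year: int
    identifier: str = ""  # e.g. a persistent identifier assigned on deposit
    keywords: List[str] = field(default_factory=list)
    license: str = ""

# Assumed set of fields a deposit must fill in before it is accepted.
REQUIRED = ("title", "creators", "publication_year")

def missing_fields(record: DatasetRecord) -> List[str]:
    """Return the names of required fields that are empty."""
    values = asdict(record)
    return [name for name in REQUIRED if not values[name]]

# A deposit with no creators listed is flagged for follow-up.
record = DatasetRecord(
    title="Soil moisture readings",
    creators=[],
    publication_year=2021,
)
print(missing_fields(record))
```

A check of this kind could run when a researcher submits a deposit, so that incomplete descriptions are caught before the dataset is published.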

Some libraries may purchase specialized research services applications. Ex Libris markets its Esploro environment to libraries, offering a wide range of services that support institutional research [3]. Elsevier's Pure [4] and Kuali Research [5] are usually sold directly to the university's office of research.

Research datasets can require massive digital storage. Libraries offering services for research data repositories will need to acquire and manage large-scale digital storage, including capacity for redundancies needed for disaster recovery and long-term preservation. Related skills include familiarity with digital preservation frameworks such as the OAIS Reference Model or pragmatic approaches that ensure long-term availability of datasets held in the repository.
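One pragmatic preservation practice is fixity checking: recording a checksum for every file at deposit time and periodically recomputing it to detect silent corruption or loss. The Python sketch below shows the idea using only the standard library. The function names and the manifest layout are assumptions made for illustration, and this is not a procedure prescribed by any particular framework or repository product.

```python
import hashlib
from pathlib import Path
from typing import Dict, List

def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root: Path) -> Dict[str, str]:
    """Map each file under root (by relative path) to its checksum."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def changed_files(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Return manifest entries that are now missing or no longer match."""
    current = build_manifest(root)
    return sorted(
        name for name, digest in manifest.items()
        if current.get(name) != digest
    )
```

In use, the manifest from `build_manifest` would be stored alongside the dataset at deposit, and `changed_files` would be run on a schedule against each storage copy; any file it reports can then be restored from a redundant copy.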

Some library professionals may take a more active role with the data itself. Such a “data librarian” would work in many different contexts. Some of the possible applications would include:

  • Collection and analysis of data related to the library's own systems and services.
  • Assisting researchers with public data sets or with proprietary data sets the library might license for the institution's research or teaching activities.
  • Working with specialized geospatial data sets.
  • Data manipulation or analysis in collaboration with faculty research or student projects.

Working with data at this level requires more advanced expertise as well as related technologies. Some libraries may opt to have a data scientist on staff holding advanced degrees in statistics or other analytical fields or with related experience in the business sector.

Library professionals interested in more involvement with data management and analysis may be interested in learning programming languages such as R, an environment for statistical computing, graphics, and analytics. Some may prefer general-purpose programming languages such as Python or Ruby.
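To give a sense of the entry-level analysis these languages make possible, here is a minimal Python sketch that summarizes usage data from a library's own services. The monthly download counts are invented sample values and `summarize` is a hypothetical helper; only the standard `statistics` module is used.

```python
import statistics
from typing import Dict, List

def summarize(counts: List[int]) -> Dict[str, float]:
    """Return basic descriptive statistics for a series of counts."""
    return {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "min": min(counts),
        "max": max(counts),
    }

# Invented sample data: dataset downloads per month over half a year.
monthly_downloads = [120, 95, 143, 110, 132, 101]
summary = summarize(monthly_downloads)

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

The same summary could be produced in R or in a statistical package; the point is that a few lines of code are enough to start exploring data the library already collects.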

Some researchers may use more traditional statistical tools such as SAS or SPSS. The choice of programming tools will depend on what the institution's research community is using. Library professionals working with data may also work with analytics engines such as Oracle Business Intelligence or Tableau.

I anticipate that research data management will continue to be one of the fastest growing specializations of academic libraries. Gaining new skills in this area may prove to be a fruitful direction of professional development and career advancement. The options and opportunities range from an introductory level of building awareness of institutional policies and practices to working toward a dedicated role as a data librarian.


  1. “Research Data,” Purdue University, https://www.lib
  2. “Dissemination and Sharing of Research Results - NSF Data Management Plan Requirements,” National Science Foundation, /dmp.jsp
  3. “Esploro: Showcase research work and expertise,” Ex Libris, /esploro-research-services-platform
  4. “Pure: The world's leading Research Information Management System,” Elsevier, /solutions/pure
  5. “Kuali Research: More research less burden,” Kuali,
Publication Year: 2021
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 41 Number 06
Issue: June 2021
Publisher: ALA TechSource
Series: Smart Libraries Q&A
Place of Publication: Chicago, IL
Record Number: 26346
Last Update: 2022-11-23 01:06:53
Date Created: 2021-06-14 10:40:50