The University of California's California Digital Library (CDL) and its partners today (Oct. 2) launched DataUp, a free data management tool.
Researchers struggling to meet new data management requirements from funders, journals and their own institutions now can use the DataUp Web application and a Microsoft Excel add-in to document and archive their tabular data.
"DataUp will change the way scientists do their work, making it easy for them to manage and preserve their spreadsheet data for future use," said Bill Michener, principal investigator for the DataONE project.
Scientific datasets have immeasurable value, but they are useless without proper documentation and long-term storage. Data sharing also is strongly encouraged in the scientific community but is not the norm in many disciplines, including Earth, ecological and environmental sciences. DataUp addresses these issues.
CDL partnered with the Gordon and Betty Moore Foundation, Microsoft Research Connections and DataONE to create the DataUp tool, which is free to use and creates a direct link between researchers and data repositories. CDL also announces today that the DataUp project has been contributed to the Outercurve Foundation's Research Accelerator Gallery.
The DataUp add-in operates within a program many researchers already use: Microsoft Excel. The Web application allows users to upload tabular data in either Excel format or comma-separated value (CSV) format. Both the add-in and the Web application allow users to:
- Perform a "best practices check" to ensure data are well-formatted and organized
- Create standardized metadata, or a description of the data, using a wizard-style template
- Retrieve a unique identifier for their dataset from their data repository
- Post their datasets and associated metadata to the repository.
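To give a sense of what a "best practices check" on tabular data might involve, here is a minimal Python sketch. It is not DataUp's code, and the specific checks shown (missing or duplicate column headers, rows of uneven length, empty cells) are assumptions chosen for illustration; the function name `check_tabular` is hypothetical.

```python
import csv
import io


def check_tabular(csv_text):
    """Return a list of human-readable problems found in CSV text.

    Illustrative only: these checks are assumed examples of
    "best practices" for tabular data, not DataUp's actual rules.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["file is empty"]

    header, body = rows[0], rows[1:]
    names = [h.strip() for h in header]
    problems = []

    # Every column should have a header.
    for i, name in enumerate(names, start=1):
        if not name:
            problems.append(f"column {i} has no header")

    # Header names should be unique.
    seen = set()
    for name in names:
        if name and name in seen:
            problems.append(f"duplicate header: {name}")
        seen.add(name)

    # Each data row should match the header width and have no empty cells.
    # Row numbers count the header as row 1.
    for n, row in enumerate(body, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {n}: expected {len(header)} fields, found {len(row)}"
            )
        elif any(not cell.strip() for cell in row):
            problems.append(f"row {n}: empty cell")

    return problems


if __name__ == "__main__":
    sample = "site,temp,temp\nA,12.5,13.0\nB,,11.2\nC,9.8\n"
    for problem in check_tabular(sample):
        print(problem)
```

A check of this kind returns a list of problems rather than stopping at the first one, so a researcher can fix a spreadsheet in a single pass before describing and depositing it.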
Although hundreds of data repositories are available for archiving, many scientific researchers either are unaware of their existence or do not know how to access them. One of the major outcomes of the DataUp project is the ONEShare repository, created specifically for DataUp, where users can deposit tabular data and metadata directly from the tool.
An added advantage of ONEShare is its connection to the DataONE network of repositories. DataONE links existing data centers and enables users to search for data across participating repositories by using a single search interface. Data deposited into ONEShare will be indexed and made available to any DataONE user, facilitating collaboration and enabling data re-use.
"DataUp is uniquely positioned because it improves the quality and documentation of data in Microsoft Excel, the tool of choice for many researchers who would otherwise not participate in data preservation initiatives," said Matthew Jones, director of informatics at UC Santa Barbara's National Center for Ecological Analysis and Synthesis. "Scientific synthesis will benefit tremendously from the infusion of these small but information-rich data sets from Excel into the DataONE ecosystem of shared data."
CDL envisions the future of DataUp being directed by the participating community at large. Interested developers can expand the tool's functionality to meet the needs of a broad array of researchers. Code for both the add-in and the Web application is open source, and participation in its improvement is strongly encouraged.
About the University of California Curation Center (UC3) at the California Digital Library
UC3 is a creative partnership bringing together the expertise and resources of the University of California. Together with the UC libraries, we provide high quality and cost-effective solutions that enable campus constituencies — museums, libraries, archives, academic departments, research units and individual researchers — to have direct control over the management, curation and preservation of the information resources underpinning their scholarly activities. For more information, visit www.cdlib.org/services/uc3/.
About Microsoft Research Connections
The program collaborates with and supports the work of the world's top academic researchers and institutions. It establishes partnerships to advance the state of the art in computer science and develop technologies that fuel data-intensive scientific research. By connecting leading researchers around the world, Microsoft Research Connections aspires to accelerate the scientific discoveries and breakthroughs that respond to some of the world's most urgent global challenges. Fellowships, grants and awards from Microsoft Research Connections help to inspire the next generation of computer scientists and the broader research community.
About the Gordon and Betty Moore Foundation
The foundation is committed to making a meaningful difference in environmental conservation, patient care and scientific research. Gordon Moore, co-founder of Intel, and his wife, Betty, established the foundation in 2000 to create positive outcomes for future generations. The Moore Foundation focuses on that goal around the world and in the San Francisco Bay Area. For more information, visit www.Moore.org.
About DataONE
DataONE serves as the foundation of innovative environmental science through a distributed framework and sustainable cyber-infrastructure, meeting the needs of science and society with open, persistent, robust and secure access to well-described and easily discovered Earth observational data. It is supported by a $20 million award from the National Science Foundation's DataNet program. With coordination nodes at the University of New Mexico; the University of California, Santa Barbara; and the University of Tennessee, DataONE is a collaboration of universities and government agencies that have teamed up to organize and present vast amounts of diverse, inter-related but often heterogeneous scientific data.
About the Outercurve Foundation
The Outercurve Foundation is a not-for-profit foundation providing software IP management and project development governance that help organizations develop software collaboratively in open-source communities for faster results. The Outercurve Foundation is the only open-source foundation that is platform, technology and license agnostic. For more information, contact firstname.lastname@example.org.