Three renowned researchers in digital humanities and computer science are joining forces with the Library of Congress on three inaugural Computing Cultural Heritage in the Cloud projects, exploring how biblical quotations, photographic styles and "fuzzy searches" reveal more about the collections in the world's largest library than first meets the eye.
Supported by a $1 million grant from the Andrew W. Mellon Foundation awarded in 2019, the initiative combines cutting-edge technology with the Library's vast collections to support digital humanities research at scale. The three outside researchers will collaborate with subject matter experts and technology specialists at the Library of Congress, experimenting in pursuit of answers that only large-scale collections and data can provide. These collaborations will enable research on questions that were previously difficult to address because of technical and data constraints. Expanding the skills and knowledge necessary for this work will enable the Library to support emerging methods in cloud-based computing research, such as machine learning, computer vision and interactive data visualization, as well as other areas of digital humanities and computer science research. As a result, the Library and other cultural heritage institutions may build upon or adapt these approaches for their own use in improving access to text and image collections.
"We know there's so much more treasure to be unlocked here at the Library and in institutions across the country and around the world, and technology can help us understand these collections even more," said Kate Zwaard, director of digital strategy at the Library of Congress. "We want to generate evidence that we can use to make decisions and invest in our future, and we also hope that this work is useful for others to use and adapt in their own organizations."
The three experts beginning work this month are:
- Lincoln Mullen, associate professor at George Mason University in the Department of History and Art History and director of computational history at the Roy Rosenzweig Center for History and New Media, will research "America's Public Bible: Machine-Learning Detection of Biblical Quotations Across LOC Collections via Cloud Computing." Dr. Mullen is no stranger to Library collections, having won first place in the 2016 Chronicling America Data Challenge.
- Lauren Tilton is an assistant professor of digital humanities at the University of Richmond in the Department of Rhetoric & Communication Studies and is co-director of Photogrammar and the Distant Viewing Lab. Tilton's project, "Access & Discovery of Documentary Images," will examine approximately 250,000 images from five early-20th-century photography collections. The project will look for ways computer vision methods could be improved to better consider context and enhance discovery.
- Andromeda Yelton, a software engineer and professionally trained librarian, will research "Situating Ourselves in Cultural Heritage: Using Neural Nets to Expand the Reach of Metadata and See Cultural Data on Our Own Terms." Yelton's project will create an interactive data visualization that clusters conceptually similar documents, helping users who have only a rough idea of the items they're looking for. The project will combine machine learning with "fuzzy search" to help users discover and navigate Library collections.
In addition to the generosity of the Mellon Foundation, the Computing Cultural Heritage in the Cloud initiative is bolstered by the Library's significant investments in IT modernization. In late 2020, the Library transitioned to a hybrid hosting environment consisting of multiple secure physical data centers and the cloud. The initiative will test a cloud-based approach for interacting with and exploring digital collections as data.
The public can follow along with these experiments on the Computing Cultural Heritage in the Cloud page at LC Labs.
Through experimentation, research, collaboration, and reflection, LC Labs works to realize the Library's vision that "all Americans are connected to the Library of Congress" by enabling the Library's Digital Strategy. LC Labs is home to the Library of Congress Innovator in Residence Program; has nurtured experiments in machine learning and the use of collections as data; and has incubated the Library's popular crowdsourced transcription program By the People. Learn more and subscribe to our monthly newsletter at labs.loc.gov.
About The Library of Congress
The Library of Congress is the world's largest library, offering access to the creative record of the United States — and extensive materials from around the world — both on-site and online. It is the main research arm of the U.S. Congress and the home of the U.S. Copyright Office. Explore collections, reference services and other programs and plan a visit at loc.gov; access the official site for U.S. federal legislative information at congress.gov; and register creative works of authorship at copyright.gov.