Smart Libraries Q&A: Voice Recognition

Smart Libraries Newsletter [October 2019]


How can voice recognition software be used in libraries?

Voice technologies, now commonplace in daily life, have interesting possibilities for use in libraries. Library applications and information systems can be enabled to respond to voice commands, providing valuable assistance for persons with disabilities as well as helping library personnel avoid repetitive motion injuries. Speech recognition has become reliable and accurate thanks to years of research and development, along with remarkable advances as smart speakers and other voice-activated appliances have entered the consumer market. For most libraries, however, voice technologies have not yet reached the point of general use and remain in the realm of experimentation or early deployment.

Speech recognition can be used instead of keyboards or touch pads for entering a search query or making other service requests. This capability can be helpful for those who have limitations in motor skills or for those who just find voice commands easier. Speech recognition for entering commands does not necessarily help those who are blind or have limited eyesight. To help these individuals, the interface needs to employ text-to-speech technologies to speak the results. Fortunately, these technologies have become increasingly capable and less expensive, and they can be adapted for library use. Spoken interaction also appeals to users accustomed to using voice commands on their home devices.

Voice recognition technologies, speech-to-text, and text-to-speech technologies can be used in combination with APIs from a library's catalog or discovery environment to create a conversational interaction that enables library patrons to search, select results, and make requests. This kind of conversational tool would also rely on a platform able to parse and interpret voice commands and to assemble the appropriate actions based on requests and responses from the APIs involved. While the needed components are generally available, so far there have not been many off-the-shelf products that enable them. Voice commands are not yet a common feature for library mobile apps, web-based catalogs, or discovery services.
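As a rough illustration of the interaction described above, the following minimal sketch takes text that a speech-to-text component has already transcribed, interprets it as a search or hold request, and builds a reply that a text-to-speech component could read aloud. Everything in it is an assumption made for illustration: the CATALOG list stands in for a real catalog or discovery API, the phrase patterns are far simpler than what a production language platform would use, and names such as parse_utterance and respond are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical in-memory stand-in for a catalog or discovery API.
CATALOG = [
    {"id": "b1", "title": "Dune", "author": "Frank Herbert"},
    {"id": "b2", "title": "Dune Messiah", "author": "Frank Herbert"},
    {"id": "b3", "title": "Emma", "author": "Jane Austen"},
]
HOLDS = []  # record ids with a hold placed during this session


@dataclass
class Intent:
    name: str      # "search", "hold", or "unknown"
    argument: str  # the query or title the patron spoke


def parse_utterance(text: str) -> Intent:
    """Map transcribed speech to an intent using simple phrase patterns."""
    cleaned = text.strip().lower().rstrip(".?!")
    match = re.match(r"(?:search for|find|look up)\s+(.+)", cleaned)
    if match:
        return Intent("search", match.group(1))
    match = re.match(r"(?:place a hold on|hold|reserve)\s+(.+)", cleaned)
    if match:
        return Intent("hold", match.group(1))
    return Intent("unknown", cleaned)


def respond(text: str) -> str:
    """Return the sentence that a text-to-speech engine would speak."""
    intent = parse_utterance(text)
    if intent.name == "search":
        hits = [r for r in CATALOG if intent.argument in r["title"].lower()]
        if not hits:
            return f"I found nothing for {intent.argument}."
        noun = "title" if len(hits) == 1 else "titles"
        listing = ", ".join(r["title"] for r in hits)
        return f"I found {len(hits)} {noun}: {listing}."
    if intent.name == "hold":
        record = next(
            (r for r in CATALOG if r["title"].lower() == intent.argument), None
        )
        if record is None:
            return f"I could not find {intent.argument}."
        HOLDS.append(record["id"])
        return f"A hold has been placed on {record['title']}."
    return "Sorry, I did not understand that request."


print(respond("Search for dune"))
print(respond("Place a hold on Emma."))
```

In a real deployment the two list lookups would be replaced by calls to the library's own catalog API, and the pattern matching by whatever language-understanding service the platform provides.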

Talking to the Library Via Consumer Voice Services

Smart speakers or digital assistants, such as Amazon Alexa and Google Home, have rapidly become common household appliances. Both platforms enable the creation of customized activities. Alexa has a well-established process for creating custom Skills, and Google offers the ability to create custom Actions via its Developer Console.

Enabling library services through one of the consumer platforms, such as Amazon Alexa or Google Home, can be accomplished with moderate technical difficulty. These platforms do not natively interact with library catalogs, but this capability can be created as a custom activity. A script would need to be developed to implement the conversational interactions with the user and to interact with the API of the library catalog or integrated library system to fulfill the requests. Users would need to be aware of the custom action and enable it in order to talk to their library's service.
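The kind of script described above might look something like the following sketch of a request handler. It assumes a JSON request and response envelope in the general style of an Alexa custom skill; the intent name SearchCatalogIntent, the slot name Query, and the find_titles function (a stub standing in for a call to the library's catalog or ILS API) are all hypothetical, and the platform's current developer documentation should be treated as the authority on the actual format.

```python
# Hypothetical stand-in for a call to the library catalog or ILS API.
def find_titles(query):
    sample = ["Dune", "Emma", "Dune Messiah"]
    return [title for title in sample if query.lower() in title.lower()]


def speak(text, end_session=True):
    """Wrap text in a response envelope styled after an Alexa custom skill."""
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": text},
            "shouldEndSession": end_session,
        },
    }


def handler(event, context=None):
    """Entry point the voice platform would call with each patron request."""
    request = event.get("request", {})

    # The patron opened the skill without asking for anything yet.
    if request.get("type") == "LaunchRequest":
        return speak(
            "Welcome to the library. What would you like to find?",
            end_session=False,
        )

    if request.get("type") == "IntentRequest":
        intent = request.get("intent", {})
        if intent.get("name") == "SearchCatalogIntent":
            query = intent.get("slots", {}).get("Query", {}).get("value")
            if not query:
                # Keep the session open and ask again.
                return speak("What title should I search for?", end_session=False)
            titles = find_titles(query)
            if titles:
                return speak("I found " + ", ".join(titles) + ".")
            return speak(f"I found nothing for {query}.")

    return speak("Sorry, I cannot help with that yet.")


example_event = {
    "request": {
        "type": "IntentRequest",
        "intent": {
            "name": "SearchCatalogIntent",
            "slots": {"Query": {"name": "Query", "value": "dune"}},
        },
    }
}
print(handler(example_event)["response"]["outputSpeech"]["text"])
```

Keeping the catalog call in its own function means the same handler logic could be pointed at a different catalog or ILS without changing the conversational part of the script.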

Many libraries have created custom Alexa Skills or Google Actions. A recent search of the listings of the custom skills shows a few dozen library services available. Most seem to be related to searching library catalogs and making basic requests, such as placing holds.

For those libraries without the technical capacity to develop their own custom services on the commercial networks, there are consultants or service providers available. Pellucent Technologies, for example, can assist a library in the creation of Alexa Skills and Google Actions to allow patrons to use smart speakers to make requests addressed to their library's catalog.

Communico offers a suite of patron-facing services, including a discovery interface and other website plug-ins. The company has enabled a custom Alexa Skill, which allows patrons to use voice commands to learn about library events and hours, to search the catalog, or to check account details.

Some library-oriented services have already implemented voice activation. OverDrive's Libby app, for example, has been integrated with Google Home, though only in a limited way. Library patrons with a Google Home device, using their library credentials, can connect to the OverDrive Libby app to perform some tasks, such as searching for a book, getting a recommendation, borrowing a title, or placing a hold. It's not yet possible to listen to books or audiobooks from OverDrive via this Google Action. To listen to an audiobook on a smart speaker, one would need to pair it with a mobile device with the Libby app using a Bluetooth connection. Hoopla has enabled its service on Amazon Alexa devices for full control and streaming of audiobooks.

Making use of the commercial smart network services to make connections with library services raises some concerns regarding patron privacy. These platforms form part of a commercial ecosystem optimized for advertising and ecommerce and for collecting consumer behavior data. The way that these services interact with individuals, especially in regard to capturing and collecting personal information, may not be consistent with library values and patron privacy policies. Some of the interactions may transpire anonymously, but it is difficult to know to what extent voice recognition and other tracking mechanisms could tie search queries and results returned to specific individuals. Any services involving patron sign-on to make requests or access account details would be especially problematic. The voice recordings and transcripts for these interactions, which are equivalent to ILS logs for patron requests, would be handled according to the practices of Amazon or Google and not under the control of the library. These records are likely kept indefinitely, not anonymized, and may be shared with or sold to other entities.

Even without custom actions or skills, the placement of smart speakers in library or educational contexts should be carefully reviewed from a privacy perspective. These devices may increasingly be able to associate conversational requests with specific individuals as their capabilities for voice recognition improve. Recent generations of these devices include cameras and video screens that come with already highly advanced capabilities for facial recognition. The potential for these devices to identify individuals making requests or accessing content is substantially different from that of public computers provided for patron use, where interactions are performed via keyboard. Although most library patrons will see these devices as novel, possibly useful, and innocuous, some patrons with heightened privacy concerns may see them as intrusive.

Fortunately, there are other ways for libraries to deliver the voice-activated services that patrons will undoubtedly come to expect, without entanglement in the consumer and advertising networks. Voice recognition, artificial intelligence, and natural language processing technologies can also be incorporated into library services via platforms independent of those ecosystems.

Integrated Technologies to Power Library Services

Library-specific services or apps with voice activation are quite early in the development stage. One application that I have come across is called Libro, developed by ConverSight. Libro can be integrated with library catalogs, event calendars, or other library systems to provide voice control and responses for specific tasks. Some of the actions available include searching the library's catalog, placing holds, and performing patron account tasks, such as renewing items, listing fees due, listing library opening hours, and managing holds. It can be used with the library's calendaring or event management system to enable patrons to search for events of interest. Patrons can access these voice-enabled capabilities through the Libro mobile app customized for the library or through a customized skill delivered via Amazon Alexa. Libro has not yet been widely implemented. The Bartholomew County Public Library in Indiana has worked with ConverSight to help test the app but has not yet put it into general use. The Iowa State University Library has begun an early implementation of Libro.5

The product is based on the company's “Conversational Insights and action platform,” which enables a variety of services, including natural language processing, conversational modelling, machine learning for enhanced conversation optimization, and multiple messaging channels. The platform also provides custom analytics and API integrations for diverse local systems.6 Applications based on the ConverSight platform have been created for multiple industries, including retail, transportation, manufacturing, warehouse management, and education. Libro has been developed specifically for libraries. The company behind Libro, ConverSight, which is also known as ThickStat, was founded as a startup in June 2017 by Ganesh Gandhieswaran. Gandhieswaran serves as President and Gopinath Jaganmohan serves as the company's Chief Technology Officer.

The Libro technology operates independently of consumer services such as Alexa, Siri, and Google Home. Although the service can also be accessed via Alexa, it does not rely on it for its core capabilities. Separation from the commercial ecosystem has important implications for the privacy and security of patrons as they use such a service.

Voice technologies today for library applications are in a relatively early phase. Libraries, however, may have an interest in pursuing this capability in order to meet patron expectations given that voice commands are increasingly pervasive. In the same way that libraries have worked hard to make their resources accessible via mobile devices, voice interactions seem to be the next wave. Developers of library systems and services may find opportunities for innovation if they are able to design and deploy voice services in ways that enhance engagement with library patrons, address accessibility, and respect patron privacy.

Publication Year: 2019
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 39 Number 10
Issue: October 2019
Publisher: ALA TechSource
Series: Smart Libraries Q&A
Place of Publication: Chicago, IL