Smart Libraries Newsletter

Smart Libraries Q&A: Protecting patron privacy

Smart Libraries Newsletter [April 2020]


What changes can we implement to ensure our catalog and discovery services are protecting our patrons' privacy? How can we help ensure our patrons' confidentiality in regard to reader privacy, reading statistics, and digital information access?

Safeguarding the privacy of patrons as they use library resources is one of the basic values of the library profession. In the context of library catalog and discovery services, protecting privacy involves several different areas of concern.

The data generated when a patron uses a library search tool must be treated with the same degree of care as the data related to the borrowing of a physical item from the library. The need to protect circulation records is well accepted, and the related technical and operational processes are followed by almost all libraries that lend materials to the public. When an individual searches for content on a library catalog or discovery interface, the session involves personal data of equal or higher sensitivity than circulation. The network address, browser cookies, or other identifiers can easily be resolved to the personal identity of the searcher. The text entered into the query box, identifying content of interest, is transmitted across the internet from the web browser to the servers processing the search. The search results returned to the user, along with the items selected, the links followed, and even the specific items read or downloaded, become part of a package representing a very sensitive transaction between an individual and the library.

Given the sensitivity of these data, there are steps that should be taken to ensure patron privacy. The first and most fundamental action is to use encryption. If the transaction is not encrypted, it can be captured by unknown third parties using readily available network eavesdropping software or equipment. Configuring the service to use the HTTPS protocol encrypts the traffic between the patron's web browser and the server, so that anyone intercepting it in transit cannot practically read it. All ecommerce sites depend on this protocol to protect credit card numbers and other sensitive financial data. Configuring library websites and the servers running catalogs and other search services to use HTTPS should today be considered an essential requirement. Chrome and other web browsers currently display prominent warnings for any site that continues to run the unencrypted HTTP protocol. I have been tracking the use of HTTPS vs HTTP on library websites for the last several years. Recent scans of all of the public library websites in the US reveal that about 15 percent still use HTTP; about 6 percent of academic libraries use HTTP.
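As an illustration of how a library might check its own sites, the following is a minimal Python sketch, not the method used for the scans mentioned above. It checks whether a URL, after any redirects, ends up on HTTPS, and computes the share of a list of sites that do not. The function names and the `fetch` hook are hypothetical and exist only for this example.

```python
from urllib.parse import urlsplit
import urllib.request


def uses_https(url: str) -> bool:
    """Return True when the URL's scheme is HTTPS."""
    return urlsplit(url).scheme == "https"


def lands_on_https(start_url: str, fetch=None, timeout: float = 10.0) -> bool:
    """Follow redirects from start_url and check the scheme of the final URL.

    `fetch` is a hypothetical hook that takes a URL and returns the final URL
    after redirects; by default it performs a real request with urllib.
    """
    if fetch is None:
        def fetch(url: str) -> str:
            with urllib.request.urlopen(url, timeout=timeout) as response:
                return response.url
    return uses_https(fetch(start_url))


def percent_not_https(final_urls) -> float:
    """Percentage of the given final URLs that are not served over HTTPS."""
    urls = list(final_urls)
    if not urls:
        return 0.0
    insecure = sum(1 for url in urls if not uses_https(url))
    return 100.0 * insecure / len(urls)
```

A site that answers on HTTP but immediately redirects to HTTPS passes this check; a stricter review would also confirm that no page of the catalog or discovery interface can be loaded over plain HTTP.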

Other measures can be taken to protect data possibly stored on the search service and falling within the bounds of patron privacy. Almost all search services create logs or other types of records for each transaction. These logs support important functions such as statistical reporting and analytics. To ensure privacy, it is essential to anonymize these records. This can be accomplished by truncating IP addresses to identify only users' network or domain and not a specific device. It is important to ensure that all copies of the transaction be anonymized, including raw web server logs in addition to transactions captured in databases within the application.
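The IP truncation described above can be sketched in Python with the standard `ipaddress` module. The prefix lengths (/24 for IPv4, /48 for IPv6) and the assumption that the client address is the first space-separated field of each log line are illustrative choices, not requirements; the function names are hypothetical.

```python
import ipaddress


def truncate_ip(addr: str, v4_prefix: int = 24, v6_prefix: int = 48) -> str:
    """Zero the host portion of an address, keeping only the network part."""
    ip = ipaddress.ip_address(addr)
    prefix = v4_prefix if ip.version == 4 else v6_prefix
    # strict=False lets us pass a host address and get its enclosing network.
    network = ipaddress.ip_network(f"{ip}/{prefix}", strict=False)
    return str(network.network_address)


def anonymize_log_line(line: str) -> str:
    """Truncate the client address in a log line whose first field is an IP.

    Lines whose first field is not a valid IP address are returned unchanged.
    """
    host, sep, rest = line.partition(" ")
    try:
        return truncate_ip(host) + sep + rest
    except ValueError:
        return line
```

For example, `truncate_ip("192.0.2.123")` yields `192.0.2.0`. This sketch handles only the address field; a logged request URL may still contain the patron's search text, which would need separate treatment.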

I would recommend that libraries regularly perform a technical review of their systems to assess the treatment of personally identifiable data within their internal systems. This review would include inspection of log files, backups, and application logs to verify that they are handling these data elements as expected. This privacy review could be incorporated into periodic reviews that the organization would perform regarding its backup and disaster recovery processes.

Many search applications enable logged-in users to view their history of searches or items checked out or downloaded. This feature inherently stores data that associates content use with a specific user. Such data could be accessed through a security breach or a legal demand. At a minimum, patron profile data and associated transaction histories should be encrypted as a barrier to access by an unauthorized intruder. Given the vulnerability possible with storing content items in a patron's profile, libraries can limit this feature to individuals who give specific agreement. Many library services will have opt-in or opt-out options controlling the storage of this type of data. A policy that leaves this storage off by default, enabling it only when a patron opts in, would ensure that data is not saved without patron permission. This issue requires alignment between a library's privacy policy and the technical configuration of all the applications within its service portfolio.
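To make the default concrete, here is a minimal Python sketch of a patron profile in which search history is stored only after the patron has agreed to it. The class and field names are hypothetical and do not describe any particular library system; encryption of the stored data, which would also be needed, is omitted.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PatronProfile:
    patron_id: str
    # Storage is off unless the patron explicitly agrees to it.
    history_opt_in: bool = False
    search_history: List[str] = field(default_factory=list)

    def record_search(self, query: str) -> bool:
        """Store the query only if the patron opted in; report whether it was stored."""
        if not self.history_opt_in:
            return False
        self.search_history.append(query)
        return True

    def opt_out(self) -> None:
        """Withdraw agreement and delete any history already stored."""
        self.history_opt_in = False
        self.search_history.clear()
```

The design choice here is that withdrawing agreement also deletes what was already saved, so that no history outlives the patron's permission.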

Many, if not most, of the search services offered by the library will be implemented on the technical infrastructure provided by an external vendor. The major indexed discovery services such as EBSCO Discovery Service, Ex Libris Primo and Summon, and WorldCat Discovery services are almost always deployed this way. Socially oriented services such as those from BiblioCommons tap into an even greater set of personalized data, likewise hosted on vendor-provided infrastructure. In these cases the library must work closely with the vendor to ensure that the technical operation of the service matches the library's expectations regarding the treatment of personally identifiable data and the opt-in or opt-out retention of search history, and that the vendor's privacy policies and the technical behavior of the system match the library's own privacy policies.

Limiting the collection of personal data can be counter to the interest of the library in delivering personalized services and in performing detailed analytics on the usage of its services. Expectations regarding these capabilities are set by the commercial environment that puts massive effort into extracting all possible personalized data from both online and in-person activities. Libraries, consistent with our distinct interest in protecting private data, cannot necessarily replicate the full extent of personalization and targeted marketing seen in the commercial arena. It is possible, however, to build effective services based on anonymized data, category and demographic data, as well as opt-in personalized data. This difference in values means that any marketing and analytics services used by libraries need to be built around a different set of assumptions than those developed for the commercial arena. That requirement does not necessarily mean avoiding commercial customer relationship management or marketing engines, but rather populating them and using them in ways that respect the library's privacy policies and practices.
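As a small illustration of analytics built on category data rather than individual identities, the following Python sketch counts usage events by patron category and resource type and discards the patron identifier. The event fields (`patron_id`, `patron_category`, `resource_type`) are hypothetical names chosen for this example.

```python
from collections import Counter


def usage_by_category(events):
    """Count events per (patron category, resource type), ignoring patron IDs."""
    counts = Counter()
    for event in events:
        # Only the category fields are read; patron_id never reaches the output.
        counts[(event["patron_category"], event["resource_type"])] += 1
    return dict(counts)
```

A report produced this way can still show, for instance, which resource types each group of patrons uses most, without retaining who performed any single transaction.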

Publication Year:2020
Type of Material:Article
Language:English
Published in: Smart Libraries Newsletter
Publication Info:Volume 40 Number 04
Issue:April 2020
Publisher:ALA TechSource
Series: Smart Libraries Q&A
Place of Publication:Chicago, IL
Record Number:25078
Last Update:2023-01-25 10:44:27
Date Created:2020-04-20 14:26:24