At Vanderbilt--as I'm sure it is at other libraries--we focus much of our attention on finding ways to provide remote services to patrons who don't physically visit the library. Last month, I described some of the issues and options related to providing reference services to remote users. In this month's column, I'll discuss some of the issues associated with connecting remote users to those electronic resources that must be restricted to the library's own clientele.
Libraries have a growing base of users who want to take advantage of our information resources from their homes, offices, and dorm rooms, or from any part of the world as they travel. One of the key challenges to providing services to these patrons involves identifying them as valid library clientele and enabling them to gain access to licensed resources that aren't part of the "free" Web. In more technical jargon, in order to offer access to these services, we must deal with issues of authentication, authorization, and proxy.
Libraries routinely license content from publishers and database producers for significant fees. These resources are often not freely available to anyone on the Web, but only to those who have valid subscriptions. Publishers implement various mechanisms that block access to those who aren't known subscribers. The most common of these methods involves IP address restrictions.
Restriction by IP Address
All computers connected to the Web have an IP address. We're all used to seeing these addresses written as four decimal numbers separated by periods. The IP address of my computer, for example, is 126.96.36.199. Publishers have the ability to control who may use their products by maintaining access lists of allowed IP addresses. When someone attempts to get into an IP-restricted resource, the system simply looks up his or her IP address. If it finds the address in the list, the individual is allowed in. Otherwise, the user is rejected.
It's not always necessary to deal with IP addresses one by one. These access lists can also hold network address ranges. IP addresses are organized in a way that easily identifies whole networks of computers. The addresses of all computers on our main campus, for example, begin with 129.59. This way, we can use the abbreviation 129.59.*.* to represent all the computers on the main part of our network. When purchasing content from an information provider, the usual procedure involves submitting a list of IP addresses or IP networks so that it can be added to the access-control list for that subscribed resource.
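The lookup a provider performs can be sketched in a few lines of Python. This is only an illustration, not any publisher's actual code: the access list, the function name is_allowed, and the single-address entry are assumptions made for the example. The abbreviation 129.59.*.* is written here in the equivalent network notation 129.59.0.0/16.

```python
import ipaddress

# Hypothetical access list a provider might keep for one subscriber.
# 129.59.*.* from the text is the network 129.59.0.0/16; the single
# address below is a made-up entry from a documentation-only range.
ACCESS_LIST = [
    ipaddress.ip_network("129.59.0.0/16"),
    ipaddress.ip_network("192.0.2.17/32"),
]


def is_allowed(client_ip, access_list=ACCESS_LIST):
    """Return True if client_ip falls inside any network on the list."""
    address = ipaddress.ip_address(client_ip)
    return any(address in network for network in access_list)


print(is_allowed("129.59.150.5"))  # an address on the campus network
print(is_allowed("203.0.113.40"))  # an address from some other network
```

Listing whole networks keeps the access list short: one entry covers every computer whose address begins with 129.59.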
I've been following authentication and authorization for a long time. For at least the last 5 years, there has been much discussion that IP authentication isn't sustainable and will soon be superseded. Many have said (myself included) that more sophisticated technologies--such as Public Key Infrastructure (PKI)--will surely emerge to replace primitive and cumbersome IP authentication. However, there are few signs of any movement away from IP authentication. While more advanced authentication systems like PKI offer a more robust authentication environment, implementing and managing these new systems is complex and expensive.
So, it appears that IP restrictions may continue to be the common approach used by libraries and information providers for quite some time. We'd do well to find methods to make this approach work in the best way possible.
Problems with IP Authentication
On one level, this arrangement works well enough. When we purchase a subscription to a resource, we give the provider a list of valid IP addresses to place in its access list. Users who attempt to access the service get through as long as their computer falls within the set of allowed IP addresses.
The fundamental problem with this scheme is that there are many circumstances in which library users work from computers that aren't associated with the organization that sponsors their access to licensed content. When people use the Internet from their homes or offices, they're likely getting that access through an arrangement with an Internet Service Provider (ISP). Their computer's IP address falls within a range associated with the ISP. When connecting to the Internet through an ISP, the computer has no guarantee of getting the same IP address for each session. Since ISPs serve a wide range of customers, there's generally no way to correlate the IP addresses of a given individual with his or her library.
In the academic environment, most universities and colleges are moving away from providing their own dial-in connections to their networks. With dial-in services, the users who connect from their homes and offices receive IP addresses that are part of the university's IP addressing schema. But operating a dial-in service is expensive and requires constant upgrades--especially as modem technologies evolve. ISPs are generally in a better position than universities to offer these services. The advent of high-bandwidth Internet access through cable modem service and DSL technologies makes ISP access to the Internet even more appealing than institution-provided service.
Proxy Servers Fill the Gap
So, given that fewer and fewer of our remote users have IP addresses that we can identify--and given the reality that information producers continue to rely on IP addresses for access--what can a library do to facilitate access to these services for our remote users? The approach that seems to work best involves the use of a proxy server.
Proxy servers, at least in this context, allow a remote computer to access IP-restricted resources. This works by altering the remote computer's network traffic in such a way that it appears to any IP-restricted service as having an IP address that falls within the allowed range.
Before a proxy server will perform this task, it has to be sure that the user is associated with the institution. Most organizations have some sort of username and password scheme to identify their patrons. A university might have a campuswide authentication system used for e-mail and other network services. These schemes may use Kerberos, Radius, or Windows NT domain servers for the actual authentication routines. A library automation system can, in some cases, be used as an authentication server. Many libraries have automation systems that maintain usernames and passwords for renewing books, paying fines, or making ILL requests. Innovative Interfaces, Inc. offers a module for its INNOPAC service called Web Access Protocol that can be used for this purpose. In order to make a proxy service work for controlling access to library applications, some type of username/password authentication scheme is needed.
The traditional proxy server model requires users to adjust the settings in their Web browsers. They must program their browsers in advance with the proxy server's IP address and port settings. Once set and activated, the browser passes its traffic through the proxy server. To initiate the session, users are prompted for their username and password. The proxy should be activated only for access to restricted resources. With most Web browsers, enabling and disabling proxy services is a manual process that takes a few steps.
Problems with Proxy Servers
This model of using a standard Web proxy server was how we initially provided authenticated remote access to our restricted electronic resources. For the most part, this method worked, but there were some significant problems. While experienced Internet users had no trouble following the few steps required to set up their Web browsers to use the proxy service, there were some who found the process too complex. The steps and settings varied between Netscape Navigator and Microsoft Internet Explorer and among version numbers--and even varied according to whether the service was running on a Macintosh or a PC. Writing simple, easy-to-understand instructions that covered each of these permutations was a challenge. Another problem we encountered was that some users would set the proxy settings and leave them enabled permanently. This doesn't necessarily cause problems for the users, but does mean that all their Web surfing is channeled through our proxy server, adding to the overall load.
But the most difficult problem arises when a user needs to interact with multiple proxy servers simultaneously. We found that some of the major ISPs in our area require that their customers go through a proxy server maintained by the ISP. Disabling this proxy setting would cause problems with services offered by the ISP. It was especially complex for the user to have to use one set of proxy settings for library resources and another set for ISP-related services.
EZproxy to the Rescue
The problems we encountered are typical of the traditional Web proxy server. Fortunately, there are other approaches. Ideally, a proxy service would automatically step in when it's needed and wouldn't require users to change their Web browser settings.
After some investigating, we found all of those characteristics in a product called EZproxy from Useful Utilities of Glendale, Arizona. EZproxy is a URL-rewriting proxy server that offers an approach that's well-suited to the problem of providing remote access to library resources. It operates between the user's Web browser and the restricted service, intercepting both the Web browser's requests and the pages returned by the Web server. From the perspective of the restricted Web service, the requests come from an authorized IP address and are accepted as valid. All the URLs on the Web pages that are returned from the restricted service are rewritten with the base address of the EZproxy server before they're displayed in the user's Web browser. This rewriting is necessary so that the links will function correctly--passing through the proxy server--if they are clicked on.
With EZproxy, the initial links to the restricted resources must be written in such a way that they go through a script that directs them to the EZproxy host. For example, to connect to ProQuest--one of our most popular restricted resources--the patron uses a link (http://proxy.library.vanderbilt.edu:2048/login?url=http://www.umi.com/pqdauto) instead of the native URL (http://www.umi.com/pqdauto).
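The pattern in the ProQuest example can be expressed as a small Python helper. This is a minimal sketch that only mirrors the form of the link shown above; the function name proxied_link is an assumption made for illustration, and the native URL is appended unchanged because that is how the example link is written.

```python
# Prefix taken from the example link in the text.
EZPROXY_LOGIN_PREFIX = "http://proxy.library.vanderbilt.edu:2048/login?url="


def proxied_link(native_url, prefix=EZPROXY_LOGIN_PREFIX):
    """Return the starting link that routes a patron through the proxy."""
    # The native URL is appended as-is, matching the example in the text.
    return prefix + native_url


print(proxied_link("http://www.umi.com/pqdauto"))
# prints http://proxy.library.vanderbilt.edu:2048/login?url=http://www.umi.com/pqdauto
```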
In addition to using the new form of the URL for links to the restricted resource, an entry needs to be made in the EZproxy configuration file.
Once the resource has been properly configured, users will be able to access it easily. If they access it from an address that's within the library's IP range, they are taken directly to the resource. When it's used from outside the library's IP-address range, EZproxy will prompt them for their username and password. If these are entered correctly, the users are passed on to the resource. EZproxy will maintain the connection, adjusting all the Web pages so that they'll work correctly.
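The decision described here can be summarized in a short Python sketch. It illustrates the logic only and is not EZproxy's own code: the function name handle_request, the four outcome labels, and the "deny" case for a failed login are assumptions, and the network 129.59.0.0/16 stands in for the library's IP range.

```python
import ipaddress

# Stand-in for the library's IP range (129.59.*.* from earlier in the column).
LIBRARY_NETWORK = ipaddress.ip_network("129.59.0.0/16")


def handle_request(client_ip, credentials_ok=None):
    """Decide what happens when a patron follows a proxied link.

    credentials_ok is None before the patron has been asked to log in,
    and True or False once a username and password have been checked.
    """
    if ipaddress.ip_address(client_ip) in LIBRARY_NETWORK:
        return "direct"   # on-site: go straight to the resource
    if credentials_ok is None:
        return "prompt"   # off-site: ask for username and password
    if credentials_ok:
        return "proxy"    # valid login: relay the session through the proxy
    return "deny"         # failed login (assumed outcome)


print(handle_request("129.59.10.20"))        # on-site patron
print(handle_request("203.0.113.40"))        # remote patron, not yet logged in
print(handle_request("203.0.113.40", True))  # remote patron after a valid login
```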
As EZproxy provides access, it logs each use into a file that can be utilized to create statistical reports. These statistics contain information that allows the library to monitor the use of specific electronic resources. This information is especially helpful in making renewal or cancellation decisions for these resources.
The main advantage of using EZproxy is that library users don't have to make any changes to their browsers to access restricted resources. All they need to do is provide their username and password when asked. While the library must perform some up-front work, it's well worth the effort in order to create a more user-friendly environment.
EZproxy is very inexpensive, with a list price of $495 for a one-server license. It will operate under either UNIX or Windows NT. The supported authentication methods include Radius, LDAP, UNIX login accounts, the INNOPAC Patron API, or a local file of usernames and passwords.
Automating the Library's Work
As I noted above, to use EZproxy the restricted resources' URLs have to be adjusted. That part of our library's Web site is driven by a database in which the pages are generated dynamically. For each resource the library subscribes to, there's a database record that includes its name, URL, description, subject fields, and other metadata. Since the pages that provide access to these resources are already generated dynamically, it was easy to enhance the script to alter the URL to the form needed by EZproxy. We created these dynamic Web pages to provide easy access to the library's electronic resources, including the ability to find resources by subject categories, by keyword searches of titles and descriptions, and through browsing alphabetical lists. The ability to seamlessly integrate EZproxy for remote users was an added benefit.
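A rough idea of this kind of enhancement, sketched in Python rather than in whatever language our actual script uses: each record carries a name and a URL, and the link is written in the EZproxy form only when the resource is restricted. The Resource class, the restricted flag, the link_for function, and the second sample record are all hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from html import escape

# Prefix taken from the ProQuest example earlier in the column.
EZPROXY_LOGIN_PREFIX = "http://proxy.library.vanderbilt.edu:2048/login?url="


@dataclass
class Resource:
    """A simplified, hypothetical version of one database record."""
    name: str
    url: str
    restricted: bool  # assumed flag marking licensed, IP-restricted resources


def link_for(resource):
    """Build the HTML link for one resource, proxied only if restricted."""
    href = resource.url
    if resource.restricted:
        href = EZPROXY_LOGIN_PREFIX + resource.url
    return '<a href="{}">{}</a>'.format(escape(href), escape(resource.name))


records = [
    Resource("ProQuest", "http://www.umi.com/pqdauto", True),
    Resource("A Free Web Resource", "http://www.example.org/", False),  # made up
]
for record in records:
    print(link_for(record))
```

Because the change lives in one place in the page-generating script, every restricted resource picks up the proxied form of its URL without anyone editing individual links.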
If you're interested in looking at our database-driven access to electronic resources, go to www.library.vanderbilt.edu/heard/edatabases.shtml.