Introduction
The Library now has numerous subscriptions to networked electronic resources, mounted locally (ERL) or accessed via the Internet. Generally, licences restrict access to these resources to authorised users, who form some subset of the Library's total customer base, typically staff and enrolled students. Compliance with these restrictions is effected by means of one or more authentication methods, described below. Each of these authentication methods has its drawbacks, and we have found that no single solution meets all of our needs. In particular, authentication poses significant challenges to the provision of access for remote (dial-in) users.
This paper presents an overview of the current options for access to electronic resources.
Authentication methods
Several methods are available to providers for authenticating access to a resource:
IP number (location)
The most common method of authentication is to limit access to machines within a particular range of IP numbers - effectively, this is limiting access to a given location (e.g. "within the University"). Each machine connected to the Internet has a unique number, the IP number, which defines its location within the Internet. Within a particular network, for example within the Library, IP numbers will be grouped together, with all IP numbers starting with the same stem. In the case of the Barr Smith Library, we have two networks, one for staff and one for public machines. The IP numbers for public machines all begin with the stem 129.127.206. - with a final number uniquely identifying each individual machine. Thus a resource could be restricted to just the machines in our public network by specifying an IP range as 129.127.206.* (the asterisk indicating "all").
This method is easily implemented using standard Web server tools, and can provide transparent authentication to large communities. The major drawback is that it restricts access to a particular location, and cannot permit access to users outside the defined network (such as dial-in users).
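The range check described above can be sketched in a few lines. This is a minimal illustration rather than any provider's actual implementation: it assumes the range 129.127.206.* is written as the equivalent /24 network, and the function name is_allowed is hypothetical.

```python
import ipaddress

# The public-machine range 129.127.206.* from the text, written as a /24 network.
PUBLIC_NETWORK = ipaddress.ip_network("129.127.206.0/24")


def is_allowed(client_ip: str) -> bool:
    """Return True if client_ip falls inside the permitted range.

    is_allowed is a hypothetical name used only for this sketch.
    """
    try:
        return ipaddress.ip_address(client_ip) in PUBLIC_NETWORK
    except ValueError:
        # Anything that is not a valid IP number is refused.
        return False
```

A machine at 129.127.206.17 would pass this check, while a dial-in user whose ISP assigns an address outside the stem would not, which is exactly the limitation noted above.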
Login (ID/password)
The second very common method of authentication is by requiring a login, whereby the user is required to enter a valid ID and password before being allowed further access. There are two basic ways in which the login process might be implemented. The first uses an HTML form for login and CGI scripting to validate the ID/password; once validated, the script will then typically refer on to additional scripts to provide content. The second way uses the Web server's built-in authentication options to present a login dialog to the user; after successful validation, the user is allowed access to a set of otherwise restricted web pages.
The second way is very simple to implement, requiring no CGI scripting, but lacks flexibility. The first way, because it is scripted, provides a far greater degree of flexibility and can thus be tailored to suit a wide variety of needs. The scripted approach is sometimes coupled with JavaScript and other methodologies to provide even greater control.
The major drawback of login authentication, however it is implemented, is the need to distribute login details to (potentially) large numbers of users. This is an administrative burden, particularly when passwords change, or when the user community changes (as it does with student populations).
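As an illustration of the validation step, the following sketch checks the credentials carried by a login dialog, assuming the Web server's built-in option is HTTP Basic authentication (the text does not name the scheme). The user store, the ID "student1" and the function name check_basic_auth are all hypothetical, and the plain SHA-256 digest is used only to keep the example short.

```python
import base64
import hashlib
import hmac

# Hypothetical user store: ID -> SHA-256 hex digest of the password.
# A production system would use a dedicated password-hashing scheme.
USERS = {"student1": hashlib.sha256(b"secret").hexdigest()}


def check_basic_auth(header_value: str) -> bool:
    """Validate the value of an HTTP 'Authorization: Basic ...' header."""
    scheme, _, encoded = header_value.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded, validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes alike.
        return False
    user_id, sep, password = decoded.partition(":")
    if not sep:
        return False
    expected = USERS.get(user_id)
    if expected is None:
        return False
    supplied = hashlib.sha256(password.encode("utf-8")).hexdigest()
    # Constant-time comparison of the two digests.
    return hmac.compare_digest(supplied, expected)
```

Every entry in a store like USERS has to be issued to a person and revised when passwords or enrolments change, which is the administrative burden described above.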
Referrer URL
A third option, not much used yet but with some potential, is known as Referrer URL. In this method, the resource provider accepts as valid any connection originating from a known web page, the Referrer URL. (The HTTP protocol provides a request header, spelled "Referer" in the specification, which carries the location of the page containing the link being requested.) If the referring page is certified to the provider as being available only to valid users, then the provider can accept any connection originating from that page, secure in the knowledge that the user has already been authorised by the subscribing institution.
Thus the onus for authentication is placed with the subscriber rather than the provider. This is a good solution in so far as it may be easier for the subscriber to provide strong authentication for its users, using existing authentication schemes. For example, this library uses the barcode and PIN from its library system. Use of LDAP would also be possible as this is already in place for email access.
The major drawback with this method is that it is not yet widely accepted. A less important drawback is the need for the user to pass through an extra web page (the referring page) in order to gain access.
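On the provider's side, the check might look something like the sketch below. It is an assumption about how a provider could implement the idea, not a description of any particular provider; the host name, the path and the function name referrer_is_trusted are hypothetical.

```python
from urllib.parse import urlsplit

# Hypothetical referring page certified by the subscribing library.
TRUSTED_REFERRERS = {("www.library.example.edu", "/secure/databases.html")}


def referrer_is_trusted(headers: dict) -> bool:
    """Accept a request only if its Referer header names a trusted page."""
    parts = urlsplit(headers.get("Referer", ""))
    if parts.scheme not in ("http", "https"):
        return False
    # hostname is lower-cased by urlsplit; path is compared exactly.
    return (parts.hostname, parts.path) in TRUSTED_REFERRERS
```

Because the header is supplied by the user's browser, a check of this kind is only as strong as the provider's willingness to rely on it.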
Types of users
For this library, we may divide users (in terms of access) into three types, each with unique problems to be solved:
- In-house users: those accessing resources from public workstations within the library.
- Local users: those accessing resources from outside the Library, but within the University network, such as academic staff from their own office.
- Remote users: those accessing resources from outside of the University network.
Of these three categories, the easiest to deal with are "local" users, since they are generally easily authenticated by IP number, and thus require no special treatment when that method is used by the provider. Where authentication is by login, local users generate an administrative burden in so far as login details must be distributed and records maintained.
Remote users are more difficult. If the provider uses IP authentication only, then remote users are effectively excluded, since it would be impossible to define IP ranges to cover all the possible ISP accounts which remote users might use. If the provider offers authentication by login, remote access becomes possible, but again the library is faced with the burdens already discussed for that method.
In-house use has additional challenges, in so far as the library is open to the public. Ideally, access through in-house public workstations would be authenticated locally, before allowing access to remote resources (which would then most likely be authenticated by IP). In practice, this is difficult, due to the nature of the PC, which is designed to be "personal" rather than "shared".
Putting the pieces together: Solutions
Experience so far suggests that there is no single, easy solution to all of the above scenarios, and a mix of solutions is therefore necessary.
Scripted logins
Where authentication is by login, it is sometimes possible to hide the login details from the user by scripting the login process locally. Access to the local (CGI) script can be authenticated by the local web server (using either IP or login). This approach has been used successfully with FirstSearch and some other databases.
Unfortunately, some providers use non-standard login methodologies which preclude local scripting. Providers for whom scripting has not proven possible include ABIX, Cochrane and IHS.
Another complication preventing scripted login occurs where multiple logins are provided in order to control access by additional criteria, e.g. department. An example of this is Current Contents, where we have different logins for different University departments, for statistical purposes. It would be possible to have multiple scripts for the different logins, but the work involved would reduce the value of scripting.
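The core of a scripted login can be sketched as follows, assuming a provider that accepts a standard HTML form submission. The URL, the form field names and the credentials are hypothetical; a real provider's login form would dictate them. The sketch only builds the request, and a local script would go on to send it and relay the response to the user.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: a real provider defines its own URL and field names.
PROVIDER_LOGIN_URL = "https://provider.example.com/login"
SHARED_ID = "library-account"
SHARED_PASSWORD = "not-a-real-password"


def build_login_request() -> Request:
    """Build the POST that submits the library's login on the user's behalf."""
    form = urlencode({"id": SHARED_ID, "password": SHARED_PASSWORD})
    return Request(
        PROVIDER_LOGIN_URL,
        data=form.encode("ascii"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )
```

The credentials stay on the local server, so the user never sees them. Supporting several departmental logins would mean one such set of values per department, which is the extra work noted above.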
Access details through secure pages
A second solution to the problems of login authentication is to present the login details in a local web page, and make that web page available only to valid users by using local web server authentication. This approach has been taken with a number of databases: ABI/Inform, ABIX and Cochrane, among others. Access to these pages is allowed only after entry of a valid library barcode and PIN.
This is a solution which works well for both local and remote users, where access to the resource is available by login.
The main drawback with this idea is the lack of transparency: the user sees the resource login details, and must record these for entry at the remote site. The user is effectively required to "login" twice - once to access the local page, and again to access the remote resource.
Proxy server
The above solutions are only viable for resources authenticated by login. Where authentication is by IP only, remote access is simply not possible - unless one uses a proxy server. With this solution, a user configures their browser in such a way that all requests pass through a proxy server, which makes the request to the resource site on the user's behalf, after first (optionally) validating the user. Because the proxy server is within the IP range required by the provider, access is allowed even though the remote user is actually outside of the required range.
This library has been able to set up a proxy server without difficulty, using a standard Pentium PC running Linux and the Squid proxy software.
The main drawback of this solution is that not all ISPs will necessarily allow the use of a remote proxy server, although those few we have tested do. (One ISP required us to use a different port -- 3129 instead of the standard 3128 -- because they were using the same port for their proxy server.)
Another restriction is that some providers specifically prohibit the use of proxy servers in their licences.
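The browser configuration described above has a direct equivalent in a script. The following sketch shows a client routing its requests through a proxy; the host name is hypothetical, and the port defaults to 3128 as mentioned in the text, with 3129 as the alternative used for one ISP.

```python
from urllib.request import OpenerDirector, ProxyHandler, build_opener

PROXY_HOST = "proxy.library.example.edu"  # hypothetical host name
PROXY_PORT = 3128  # standard Squid port, per the text


def proxy_url(host: str = PROXY_HOST, port: int = PROXY_PORT) -> str:
    """Return the proxy address in the form urllib expects."""
    return f"http://{host}:{port}"


def make_proxy_opener(host: str = PROXY_HOST, port: int = PROXY_PORT) -> OpenerDirector:
    """Build an opener that sends HTTP and HTTPS requests via the proxy."""
    address = proxy_url(host, port)
    return build_opener(ProxyHandler({"http": address, "https": address}))
```

Calling make_proxy_opener(port=3129) would give the configuration needed where an ISP already occupies port 3128. The provider then sees the request arriving from the proxy's IP number rather than the remote user's.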
Conclusion
In an ideal world, we would simply and easily solve all access problems by means of a single, transparent solution -- either a proxy server or referrer URL. Experience has shown, however, that the current world is far from ideal, with many providers using many different methods of authenticating users, requiring the library to cater for multiple methodologies. While there are some things the library can do to make access seamless and authentication transparent for the user, there are still some limitations on what can be done. In the worst case, the library has no recourse but to provide passwords to the user.
There is some hope that the referrer URL model will catch on with more providers, this being desirable as the method with the least overhead for the library. However, as more players enter the market, there will inevitably be a mix of sophisticated and simple authentication methods in use. We will probably not see uniformity for a long time, if ever.