For the past several years, the Computers in Libraries conference has offered an evening program in which a panel of speakers gives a lighthearted treatment of technologies in transition. Informally dubbed the "Dead Technologies Session," this event gives library technologists an opportunity to air their opinions on current trends in hardware, software, and systems. As a member of the panel at the Computers in Libraries 2000 conference in March, I vented my concerns about the lack of adequate bandwidth available to support library applications and about some recent developments that exacerbate the problem. In this column I'll continue with that theme.
High-Quality Internet Access
Lately, one of the issues that concerns me is the need for libraries to have sufficient Internet bandwidth to support the information services they provide to their users. As libraries devote more of their resources to subscriptions to Internet-based content, it only makes sense to ensure that adequate delivery mechanisms are in place. Will library users be pleased with our efforts if we offer a rich array of content but access to that content is slow and unreliable? One of my responsibilities with the library system here at Vanderbilt is to ensure that the technical infrastructure is in place to support the various electronic resources we offer.
Most libraries have become increasingly dependent on the Internet. We subscribe to a plethora of full-text resources, citation databases, electronic journals, etc. We depend on inbound Internet access to allow our users to reach our online catalogs and electronic resources from their homes and offices. Libraries today spend a significant portion of their annual budgets on subscriptions to Internet-based content. With such a big investment, it's strategically important for institutions to provide sufficient bandwidth for these services.
It would be great if libraries could calculate their anticipated bandwidth requirements, factor in expected growth, and simply increase the speed of the connection from their Internet service providers accordingly. Librarians generally like the idea of planning ahead to meet anticipated demand. Unfortunately, life on the Internet is not that simple. Most libraries don't have their own private connections to the Internet; they usually share the one provided by their parent organizations. Academic libraries generally share the bandwidth of the campuswide Internet connection; special libraries rely on access from their corporate or agency networks; many public libraries connect through a municipal network. Shared bandwidth is a reality we deal with, but one that makes it impossible to guarantee adequate service for all parties.
Until recently, even within these shared environments, anticipating the amount of Internet capacity needed, or at least reacting to current usage levels, was a fairly manageable process. My organization, for example, has incrementally raised its Internet bandwidth from a 1.5-Mbps circuit in 1996 to the 22-Mbps circuit we have now. Although we've never been able to supply bandwidth ahead of demand for very long, recent events have had a major impact on our Internet access. We're not alone: colleges and universities everywhere face the same challenge. Multimedia computing has hit the Internet in a big way. Let's take a look at the current issue and then consider its wider implications.
Bandwidth-Hungry Applications
MP3 audio technologies have caused a revolution in the music industry. Before MP3, digital audio was too bulky and cumbersome for consumer-oriented applications. But MP3 compresses music tracks and other sound recordings into a form easily managed by personal computers and portable playback devices. While the music industry struggles to adapt this technology in ways consistent with its interests, MP3 has found a growing number of other uses. In particular, a company called Napster has stirred up a lot of controversy.
Napster (http://www.napster.com) offers a system for distributing MP3 files on the Internet, using what it describes as an "integrated browser and communications system." In theory, artists can place samples of their work within the Napster system for promotional purposes. In practice, Napster users download tracks from commercial CDs without the copyright holder's permission, and these pirated audio tracks then become available to Napster users worldwide. It's generally known that the vast majority of the content available through Napster consists of pirated commercial recordings. While Napster and the Recording Industry Association of America (http://www.riaa.org) debate the ethical and legal issues of this system, it's wreaking havoc on campus networks.
Napster follows a distributed client/server architecture, which provides a highly efficient means of distributing digital audio. One of the more disturbing aspects of its use is that many of the servers that make up this distributed system are unknowing participants. Once you download the Napster software and begin to build a collection of online music, your own computer operates as a server for other Napster users. Although it's possible to disable this feature, many users are entirely unaware that their computers are being used in this way.
The widespread use of Napster has a dramatic effect on campus networks. Some universities have observed that this activity consumes up to half of their Internet bandwidth. Napster absorbs incoming bandwidth as its users throughout the Internet take advantage of the faster computers and speedy Internet connections generally available at universities. At the same time, the students' voracious appetite for free music devours outgoing bandwidth.
Few technical solutions exist that allow research and academic applications to be given priority treatment on a network: in other words, to guarantee them adequate bandwidth while leaving what's left for recreational use. Although the concept of "quality of service" (QoS) exists in theory, its implementation is problematic. Determining the relative priority of an application is a difficult administrative decision; all users consider their own work the most vital to the organization. QoS is also difficult to implement technically, especially since so many applications are Web-based and virtually impossible to distinguish at the network level.
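In concept, QoS amounts to something like the toy sketch below (my own illustration in Python, not an actual implementation): pending traffic is served in priority order, so academic work goes first and recreational traffic gets whatever is left. The hard part, as noted above, is the classification step that assigns those priority numbers in the first place.

    # A minimal sketch of the QoS idea: serve queued transfers in
    # priority order. The class assignments here are hypothetical;
    # deciding them is the hard administrative and technical problem.
    import heapq

    queue = []  # (priority, description); lower number = served first

    def enqueue(priority, description):
        heapq.heappush(queue, (priority, description))

    enqueue(1, "citation database query")
    enqueue(1, "electronic journal download")
    enqueue(2, "Internet radio stream")
    enqueue(3, "Napster transfer")

    while queue:
        priority, description = heapq.heappop(queue)
        print("serving (class %d): %s" % (priority, description))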
Many college and university networks have implemented filters that block Napster, which is fairly predictable in its network behavior. Its traffic is mediated through servers at napster.com, and it uses a particular TCP port (6699) by default. The routers that control the flow of data on a network can be programmed to turn away all Napster traffic: incoming, outgoing, or both. Unfortunately, a variety of workarounds have been devised to circumvent any network administrator's attempt to block Napster. These workarounds include programming the Napster client to use other TCP ports or to go through a proxy server based at another institution. As network administrators begin to filter Napster traffic, Napster users are becoming more aggressive in defying these efforts.
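To make the filtering idea concrete, here is a hedged sketch (mine, not an actual router configuration) of the kind of rule such a filter applies, along with the reason the workarounds succeed: once the client moves to another port, the rule no longer matches.

    # A toy packet filter: reject TCP connections to Napster's
    # default port, 6699. A client reconfigured to use another port,
    # or tunneling through an outside proxy, passes this test.
    BLOCKED_PORTS = {6699}

    def allow(packet):
        """Return True if the packet should be forwarded."""
        return packet["dst_port"] not in BLOCKED_PORTS

    print(allow({"dst_port": 6699}))  # False: default Napster traffic
    print(allow({"dst_port": 8888}))  # True: the same traffic, moved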
When Napster and similar systems overload the organization's Internet connection, other applications that rely on the Internet can come to a standstill. Those of us in the libraries especially seem to suffer when the Internet connection bogs down.
The Challenges to Come
My view is that the Napster and MP3 phenomenon is just the beginning. Multimedia computing is taking root on the Internet, and while Napster is vulnerable to legal and ethical challenges and can generally be filtered out technically, the next wave will likely be completely legal and much more impervious to technical restraint. The short-lived "Gnutella" (http://www.nullsoft.com) project, for example, shared many of Napster's characteristics, except that it followed a true peer-to-peer architecture, bypassing Napster's reliance on a central server, which made it much more difficult to block. AOL, the parent company of Gnutella's creators, shut down the project after only a few days of activity. Other applications and services will surely take hold in the near future that will have a tremendous impact on how we use the Internet.
The Napster experience is a wake-up call: we must be prepared to deal with Internet-access issues much differently than we have in the past. Interest in streaming audio and video seems relatively low now, but are we prepared for the day when most or all of our network users become regular consumers of these technologies? Are we ready for the day, for example, when staff throughout a university discover Internet radio as a great way to hear background music while they work?
Just as software developers apparently assume limitless computing power as they write the latest generation of "bloatware," Internet applications seem to be taking a similar course. As more consumers have fast Internet access in their homes through cable modems and ADSL, services proliferate that take advantage of these speedy connections. Unfortunately, these applications don't scale well on large organizational networks. A service that works just fine for a single user on a cable modem connection can have a devastating impact when hundreds or thousands of individuals use such a service on a large network. The overall demand for bandwidth per user will likely increase at a much faster rate than we have experienced to date.
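A back-of-the-envelope calculation shows why. The figures below are hypothetical, but they are of the right order: a modest 128-Kbps audio stream is harmless for one cable-modem user, yet a thousand such listeners would demand several times the capacity of even a generous campus circuit.

    # Hypothetical figures for illustration only.
    stream_kbps = 128      # one typical streaming-audio listener
    listeners = 1000       # a small fraction of a campus population
    circuit_mbps = 22      # our current Internet connection

    demand_mbps = stream_kbps * listeners / 1000.0
    print("demand: %.0f Mbps vs. capacity: %d Mbps"
          % (demand_mbps, circuit_mbps))
    # prints: demand: 128 Mbps vs. capacity: 22 Mbps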
Actions or Reactions?
What can we in libraries do to ensure high-quality connectivity for our costly Internet-based content? We can't control the dynamics that rule the Internet, but we can make a difference if we plan, assess, and lobby.
One thing I'm sure we can't do is control the forces at play. The will of the majority seems to prevail on the Internet in ways that are far less controllable than in non-virtual societies. Few organizations outside the corporate world are able to monitor or restrict recreational computing within their networks. I've heard that many corporations ban all non-work-related use of the Web; in an educational institution, such action would not likely be tolerated. There may come a time, though, when the overhead of such activities becomes so heavy that even universities will have to make hard choices about what kinds of computing can be performed within their limited resources.
Libraries can plan their acquisitions of Internet-based resources in coordination with proportional increases in bandwidth. Back when we relied heavily on networked CD-ROM-based resources, I was always reminding librarians to consider hardware and software requirements, both in terms of availability and cost. If they ordered a database that would ultimately reside on the CD-ROM network, we asked that they let us know so we could plan for additional storage. The same considerations apply today for Internet-based resources. The library should have at least a general assessment of the bandwidth needed to support its resources. There are, unfortunately, many factors to consider in calculating the bandwidth consumed by library-supplied resources, and few vendors of Internet-based resources provide the detailed usage statistics needed to measure network utilization. As much as possible, though, libraries should understand the bandwidth requirements when they purchase major electronic resources and share this information with those in charge of ensuring Internet access.
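Even a rough estimate is better than none. The sketch below (with invented resource names and figures) shows the simple arithmetic involved: multiply the expected peak number of concurrent sessions for each resource by the average transfer rate per session, then total the results.

    # Hypothetical resources and usage figures; treat the inputs
    # as educated guesses, since few vendors publish such numbers.
    resources = {
        # name: (peak concurrent sessions, average Kbps per session)
        "full-text database": (40, 56),
        "e-journal package": (25, 40),
        "citation index": (60, 20),
    }

    total_kbps = sum(sessions * kbps
                     for sessions, kbps in resources.values())
    print("estimated peak demand: %.1f Mbps" % (total_kbps / 1000.0))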
Libraries might actively assess the health of their organization's Internet access. Put yourself in your users' place. Regularly test-drive at least a sampling of the resources you provide. Take note of the perceived performance of these resources. Talk to reference librarians who use these resources frequently. If some resources seem slow, perform diagnostics to understand why. Make frequent use of the "traceroute" network utility to understand the path that data takes from your library to your major suppliers on the Internet. Traceroute will also show the performance of each intervening router or exchange point along the way, allowing you to identify any major bottlenecks.
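If you run these checks regularly, a small script can save some typing. The helper below is an assumption on my part rather than a tool mentioned anywhere in this column; it simply shells out to the standard Unix traceroute utility (tracert on Windows) for each host you list.

    # Run traceroute against each major content supplier and print
    # the hop-by-hop report; bottlenecks show up as jumps in the
    # round-trip times. Requires the traceroute utility and Python 3.5+.
    import subprocess

    SUPPLIERS = ["www.example-vendor.com"]  # substitute your vendors' hosts

    for host in SUPPLIERS:
        print("=== route to %s ===" % host)
        subprocess.run(["traceroute", host])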
Be prepared to lobby for increased bandwidth to support the library's operation. Make sure you can document the dollars spent per year on Internet-based content and the importance of these resources to the mission of the organization.
As we experienced with Napster, life on the Internet is unpredictable. Efforts at detailed planning can quickly be overwhelmed by fast-sweeping trends. Sometimes the best we can hope for is to survive from one crisis to the next.
How are we surviving at Vanderbilt? In reaction to the increased demands associated with Napster, the university increased its Internet connection from 15 Mbps to 22 Mbps. We are currently blocking the outside world from gaining access to MP3 files on our internal networks. Efforts are underway to educate students on the legal and ethical issues related to the duplication of copyrighted material. I suppose we're holding our own this time around, though I'm sure this is just the first of many waves of competition for Internet bandwidth.
Marshall Breeding is the technology analyst at Vanderbilt University's Heard Library and a writer and speaker on library technology issues.