Making information available on the Web is one of the most satisfying activities that library professionals can do. I've been fortunate to be able to create a number of Web resources that others seem to find valuable. In this column, I want to talk about lib-web-cats, a database of libraries that I've been working on for the last few years. Read on to learn about the information available in this database, the search-and-retrieval features it offers, and the technologies involved in creating this resource.
Pertinent Information
lib-web-cats aims to facilitate Web access to libraries and to provide basic information about them. Many Web directories list library Web pages and others list library OPACs, but no major directory includes both. Thomas Dowling's Libweb (http://sunsite.berkeley.edu/libweb) is the major resource for library Web pages, and Peter Scott's WebCats (http://www.lights.com/webcats) lists library OPACs. lib-web-cats takes its name from its interest in providing both library Web sites and online catalogs. I also needed a resource to help me track the automation systems that libraries use.
I've attempted to supply a reasonable amount of information for the libraries represented. For each library, lib-web-cats includes a record with fields for its parent institution, the URL for its Web site, a link to its online catalog, library type, geographical location, mailing address, the automation system currently in place and when it was implemented, any automation systems that were previously used, the number of volumes in the collection, the number of annual circulation transactions, and any major affiliations relevant to the library, such as its membership in the Association of Research Libraries, or any regional consortia or networks. There's also a note field for providing additional information about the library and a hidden note field where I put information about the library for my use that doesn't display to the public. Each record also includes a field for its creation date, the date last modified, and a unique record number.
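As a rough illustration of the record layout described above, here is a minimal Python sketch. The real database is built in DB/TextWorks, not Python, and every field name below is a hypothetical label chosen for this example rather than the actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class LibraryRecord:
    """One directory entry. Field names are illustrative, not the real schema."""
    record_number: int
    library_name: str
    institution: str
    website_url: str
    library_type: str
    country: str
    city: Optional[str] = None
    catalog_url: Optional[str] = None
    mailing_address: Optional[str] = None
    current_system: Optional[str] = None
    system_implemented_year: Optional[int] = None
    previous_systems: List[str] = field(default_factory=list)
    volumes: Optional[int] = None
    annual_circulation: Optional[int] = None
    affiliations: List[str] = field(default_factory=list)
    public_note: str = ""
    hidden_note: str = ""  # staff-only note, never shown to the public
    created: Optional[date] = None
    modified: Optional[date] = None

    def public_view(self) -> dict:
        """Return the record as a dict with the staff-only note removed."""
        data = dict(self.__dict__)
        data.pop("hidden_note")
        return data
```

Only the fields that every record is expected to carry are required here; the rest default to empty, which mirrors the fact that many records are incomplete.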
What Libraries Are Represented?
lib-web-cats currently represents more than 4,700 libraries. Not all of the records in the database have information in each of the fields described above, but all have the library's main URL, its library type, and geographical location. A large percentage also include the URL for the online catalog and the library's current automation system. Although I would like the database to be more international, it's currently dominated by North American libraries--3,410 in the U.S. and 247 in Canada. I've attempted to include as many national libraries as I could track down. The remaining libraries represent about 80 other countries.
Building the Database
The information in lib-web-cats comes from a variety of sources. I've manually entered the majority of the current information by working through lists of libraries provided elsewhere on the Internet. When I find such a list, I check to see if any libraries are missing from lib-web-cats, and then create records as needed. As I enter information for each record, I connect to the library's site and gather as much information as is readily available. The library name, institution, and URL for the Web page are always fairly simple to find. It's often a challenge to find details such as the library's mailing address because many libraries don't place this information in any obvious location. In most cases the library will include a link to its online catalog, and by now I can easily recognize which library automation system underlies it.
I also provide a Web form through which libraries can submit their information for inclusion in lib-web-cats. I generally receive a few of these submissions each week. As the use of lib-web-cats grows, the number of libraries submitted this way also increases.
Finding Libraries
I've created several options for accessing the information in this database. The basic search page allows the user to simply key in the name of the library in a search box. Pressing the search button will search both the library name field and the institution name field. Many libraries, such as the Jean and Alexander Heard Library where I work, have been named after important benefactors. When searching for a library, the general public may not be familiar with such names and will likely search by the institution's name (such as "Vanderbilt"). In lib-web-cats, either form will yield successful results. Names for public libraries can also be tricky. Some are named for cities, some for counties, and some for benefactors. For these libraries, I generally assume the institution has the same name as the county or city where it's located.
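The idea of matching a search term against both the library name and the institution name can be sketched as follows. This is not how DB/Text WebPublisher evaluates queries; it is a small Python illustration, and the dictionary keys and the `search_by_name` function are assumptions made for the example.

```python
from typing import Dict, List

# Sample data: the library and institution names come from the column;
# the dictionary layout is hypothetical.
SAMPLE: List[Dict[str, str]] = [
    {"library_name": "Jean and Alexander Heard Library",
     "institution": "Vanderbilt University"},
    {"library_name": "Library of Congress",
     "institution": "Library of Congress"},
]


def search_by_name(records: List[Dict[str, str]], term: str) -> List[Dict[str, str]]:
    """Return records whose library name OR institution name contains the term."""
    needle = term.strip().lower()
    if not needle:
        return []
    return [
        r for r in records
        if needle in r.get("library_name", "").lower()
        or needle in r.get("institution", "").lower()
    ]
```

With this approach, a search for "Vanderbilt" and a search for "Heard" both find the same record, so the user does not need to know which form of the name the database prefers.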
The basic search page also includes the ability to search by library type and geographic location. You could, for example, use this method to find all the university libraries in a given state or country, or find all the libraries in a given city.
Also on the basic search page is a shortcut that will bring up a listing of all the Association of Research Libraries members and a link to the Library of Congress. These are by far the most sought-after libraries on the Internet, and I wanted to make access to them as simple as possible.
In addition to the basic search page, I've created a page to allow users to browse through lists of libraries by geographic region. While it's possible to get the same information from the queries on the basic page, some individuals find browsing through lists more comfortable. Even though the browsable version is presented as a page of links, each link is coded to execute a query from the database--there are no static, pre-built lists. This geographic browse page includes a link at the bottom of the page that can be used to generate a report of the database showing the number of libraries represented by each country.
An advanced search page provides an additional set of query combinations. Along with the options available on the basic query page, users can search by library automation system, previous automation system, and affiliation. I designed this page to help me pursue my interests in library automation. Through a basic query, I can easily find the libraries that use any given system. But by formulating a query with the previous and current automation systems, I can view data on system migration patterns--at least within the libraries represented in the database. Through this type of search, you could, for example, list the libraries that have migrated from NOTIS to Voyager.
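The migration query described above amounts to filtering on two fields at once. A minimal Python sketch, assuming each record carries a hypothetical `current_system` string and a `previous_systems` list (names invented for this example, not the database's actual fields):

```python
from collections import Counter
from typing import Any, Dict, List

Record = Dict[str, Any]


def migrated(records: List[Record], old_system: str, new_system: str) -> List[Record]:
    """Records that now run new_system and previously ran old_system."""
    return [
        r for r in records
        if r.get("current_system") == new_system
        and old_system in r.get("previous_systems", [])
    ]


def migration_counts(records: List[Record]) -> Counter:
    """Count (previous, current) system pairs across all records."""
    counts: Counter = Counter()
    for r in records:
        current = r.get("current_system")
        if not current:
            continue  # skip records with no known current system
        for previous in r.get("previous_systems", []):
            counts[(previous, current)] += 1
    return counts
```

Calling `migrated(records, "NOTIS", "Voyager")` corresponds to the example query in the text, while `migration_counts` gives a quick tally of every migration path in the data. As noted, any such tally only reflects the libraries represented in the database.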
Presentation of Results
Regardless of the query method used, lib-web-cats will return a list of libraries that includes the institution's name, the library name presented as a link to its main Web page, geographical location, a link to its online catalog, and its current automation system. At the end of each line is a "details" link that displays the full record for the library. The full record presents all of the library's publicly available information.
Using lib-web-cats
lib-web-cats is most commonly used by the general public as a Web-based library directory. There are hundreds, if not thousands, of sites on the Internet (mostly other library sites) that include lib-web-cats as a means of finding other library Web pages and catalogs. lib-web-cats generally gets about 10,000 hits a month. The use of this resource has steadily increased over time. There have been some sudden bursts of activity such as in June 1999 when it was picked by Yahoo! Internet Life as the "Incredibly Useful Site of the Day" (http://web1.zdnet.com/yil/stories/useful/0,4921,2280965,00.html). The fact that so many people find this site helpful is my main motivation in making it available free of charge and keeping it up-to-date.
This resource also lends itself to more specialized applications. Libraries that are considering implementing a new library automation system can use lib-web-cats to find reference sites about different systems. Using the collection size and annual circulation, a library can find comparable sites using the system under consideration. Since the database contains both current and previous automation systems, it can be used to track trends in how libraries migrate from one to another.
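Finding comparable reference sites can be thought of as a filter on automation system plus a size window. The sketch below is one possible reading of that idea in Python; the 25 percent tolerance, the field names, and the `comparable_sites` function are all assumptions for illustration and are not features of lib-web-cats itself.

```python
from typing import Any, Dict, List, Optional

Record = Dict[str, Any]


def comparable_sites(
    records: List[Record],
    system: str,
    volumes: int,
    circulation: int,
    tolerance: float = 0.25,
) -> List[Record]:
    """Libraries on `system` whose size figures fall within +/- tolerance of ours."""

    def within(value: Optional[int], target: int) -> bool:
        # Records missing a figure cannot be compared, so they are excluded.
        return value is not None and abs(value - target) <= tolerance * target

    return [
        r for r in records
        if r.get("current_system") == system
        and within(r.get("volumes"), volumes)
        and within(r.get("annual_circulation"), circulation)
    ]
```

Because many records lack collection or circulation figures, a filter like this will understate the number of comparable sites; widening the tolerance trades precision for more candidates.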
Technology Behind the Scenes
lib-web-cats, like the "Problem Tracking System" I described in the March 2000 issue of Information Today (The Systems Librarian, page 58), is a Web-enabled database that uses the DB/TextWorks and DB/Text WebPublisher software from Inmagic, Inc. DB/TextWorks is a text-oriented database and is well-suited for the type of information found in lib-web-cats.
The Web-based applications that I've created at Vanderbilt, including lib-web-cats, rely on a dual-server approach--the basic Web server and the database application server. All the static HTML pages, such as the main query page, the browse search page, the advanced search page, and the site submittal form, reside on a Novell NetWare server running the Netscape Enterprise Web Server. NetWare has proven itself in the Vanderbilt libraries as a highly reliable and efficient Web server. Since all library staff members use NetWare as their main file-server platform, using this same platform for the library Web server makes it easy for Web authors to create their pages.
The database component of lib-web-cats relies on Windows NT. DB/TextWorks and DB/Text WebPublisher work on a Windows NT server running Microsoft's Internet Information Server (IIS). This is the only operating system that can be used with the Inmagic components--no Unix or NetWare versions are available. The DB/Text WebPublisher application operates as a virtual directory within the IIS environment, and is the middleware that communicates between the DB/TextWorks databases and the Web server.
Running the static pages on a different server than the database transactions is easy, and this configuration distributes the load across multiple servers, giving us faster performance, enhanced security, and increased flexibility. While NetWare makes a great Web server, few Web-oriented applications run on it natively. Windows NT, on the other hand, excels as an applications server.
Using the Windows-based DB/TextWorks interface, the process of creating the database structure and the templates for making records, performing queries, and displaying results lists and records is simple. I used the Windows interface to input information into the database. It was a quick process to cut and paste data from Web pages into records by toggling back and forth between DB/TextWorks and a Web browser.
Inmagic's DB/Text WebPublisher application makes it possible to offer a DB/TextWorks database on the Web. With this component enabled, you can make templates for Web use in much the same way as those for the Windows interface. Templates need to be constructed to display the initial list of results from a query, and then for the full display of a record as it's selected from that list. You can use the graphical interface to choose fonts and formatting options to get started quickly, but for more sophisticated effects it's possible to specify the exact HTML coding you want before and after each field, and for the header and footer of each page.
You can create query pages either through DB/TextWorks' Windows interface or by hand with a text editor. I generally prefer the latter approach since I like the flexibility of building my own interface and combining multiple query options per search page. The key to building HTML-based query pages for DB/Text WebPublisher lies in a good knowledge of HTML forms, and in the specific values and variables needed by DB/Text WebPublisher. These are not particularly well-documented, but can be learned through a careful study of queries automatically generated by DB/TextWorks and DB/Text WebPublisher.
The form for submitting new sites to lib-web-cats was created outside the DB/TextWorks and DB/Text WebPublisher environment. The submission page itself is a basic Web page with an embedded form that prompts you for the necessary information. This form posts its information to a Perl script that resides on the same server as DB/TextWorks and DB/Text WebPublisher. The Perl script parses the information passed from the form and checks for any required fields. If the needed fields are missing, an error message is generated and passed back to the user. If all the needed information is present, the script formats the information and writes it as a delimited text file, which is then saved to the NT Server and e-mailed to me. The script also sends the captured information back to the user, and writes a message of thanks for contributing to the database. When I receive the e-mail, I know there's a record that needs to be added to lib-web-cats. To incorporate the record, I use DB/TextWorks' record-import feature.
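The validate-then-format step of the submission script can be sketched as follows. The original is a Perl script whose details are not given here; this Python version is only an approximation of the logic described. The field names, the choice of which fields are required, and the pipe delimiter are all assumptions, and the e-mail and file-saving steps are left out.

```python
import csv
import io
from typing import Dict, List, Tuple

# Hypothetical field lists; the real form's fields are not documented here.
REQUIRED_FIELDS = ["library_name", "website_url", "library_type", "country"]
ALL_FIELDS = REQUIRED_FIELDS + ["institution", "catalog_url", "current_system"]


def missing_fields(form: Dict[str, str]) -> List[str]:
    """Names of required fields that are absent or blank."""
    return [name for name in REQUIRED_FIELDS if not form.get(name, "").strip()]


def format_record(form: Dict[str, str]) -> str:
    """Render the submission as one delimited line (pipe-delimited by assumption)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerow([form.get(name, "").strip() for name in ALL_FIELDS])
    return buffer.getvalue()


def process_submission(form: Dict[str, str]) -> Tuple[bool, str]:
    """Return (False, error message) or (True, delimited record)."""
    missing = missing_fields(form)
    if missing:
        return False, "Missing required fields: " + ", ".join(missing)
    return True, format_record(form)
```

Using the `csv` module rather than joining strings by hand means a value that happens to contain the delimiter is quoted instead of corrupting the record, which matters when the file is later imported.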
I've recently acquired and implemented the DB/Text Interactive WebPublisher, which automatically imports records into a DB/TextWorks database. This new option makes the process of adding new records to the database truly automatic. All I do now is review the records added each week to check for duplicates and update any incomplete information.
Future Development Plans
Although the development of lib-web-cats is mostly complete, I continue to see the need to add features and expand the database. One feature I would like to add is an additional browse interface: in addition to browsing by geographical location, it might be helpful to have browse pages for type of library or automation system. The site would also benefit from additional help files and from query pages written in other languages. The main opportunity for enhancing the resource lies in expanding the database itself, and I anticipate increasing the number of international libraries it covers.
Finally, I invite the readers of this column to visit lib-web-cats and I welcome any feedback that you might offer. Please check the database for some of your favorite libraries and use the site submission form to let me know about any libraries you fail to find. Also please let me know about broken links or any other incorrect information.
Marshall Breeding is the technology analyst at Vanderbilt University's Heard Library and a writer and speaker on library technology issues.