Serials Solutions has added a new level of sophistication to its Central Search federated-search offering. Last month, the company integrated a results-clustering feature; the new functionality, licensed from Vivísimo, transforms a simple listing of results into an interface that offers labeled groupings, allowing users to quickly home in on items of interest.
The Cluster Cadre
One of the drawbacks of a federated-search environment is the large sets of results returned by the search targets. Although it’s great to be able to retrieve results from multiple information resources via a single search, a single, long list of results can be unwieldy for the searcher to navigate. Even after duplicates are removed and the list is sorted, the results can be difficult to interpret, and pinpointing the interesting and relevant ones can be tedious. Organizing the results into clusters of related items lets users more easily identify and view the results most closely associated with their research topics.
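The merge, deduplicate, and sort step described above can be sketched roughly as follows. This is only an illustration, not how Central Search handles results: the record fields (`title`, `year`), the target names, and the helper functions `normalize` and `merge_results` are all assumptions made for the example, and duplicates are detected by a simple normalized-title match.

```python
import re

def normalize(title):
    """Lowercase a title and collapse punctuation so near-identical records compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def merge_results(result_sets):
    """Merge per-target result lists into one list, dropping duplicate titles."""
    seen = set()
    merged = []
    for target, records in result_sets.items():
        for record in records:
            key = normalize(record["title"])
            if key in seen:
                continue  # the same item was already returned by another target
            seen.add(key)
            merged.append({**record, "target": target})
    # Sort newest first; the sort is stable, so ties keep their arrival order.
    return sorted(merged, key=lambda r: r["year"], reverse=True)

# Hypothetical results from two search targets.
result_sets = {
    "Target A": [
        {"title": "Clustering Search Results", "year": 2005},
        {"title": "Federated Search in Libraries", "year": 2006},
    ],
    "Target B": [
        {"title": "Clustering search results.", "year": 2005},
        {"title": "Faceted Navigation Basics", "year": 2004},
    ],
}

merged = merge_results(result_sets)
for record in merged:
    print(record["year"], record["title"], "(" + record["target"] + ")")
```

Even in this small case the output is still one flat list, which is the limitation that clustering is meant to address.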
The clustering approach developed by Vivísimo has similarities to faceted navigation, which is beginning to gain interest in libraries. Clustering builds its groupings automatically, based on textual similarities among the results.
Faceted navigation works on the basis of categories assigned within the metadata records, and facets can usually be implemented hierarchically so users can click through multiple facets to incrementally narrow the results to a manageable set. Whether accomplished through clustering, hierarchical facets, or other approaches, delivering a better, friendlier search environment for library users is a hot topic these days in the library field.
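A minimal sketch of faceted narrowing, assuming each metadata record carries assigned category fields. The field names (`format`, `subject`) and the helpers `facet_counts` and `narrow` are hypothetical and are not taken from any particular library system.

```python
from collections import Counter

# Hypothetical metadata records with categories assigned by catalogers.
records = [
    {"title": "Organic Synthesis Review", "format": "Article", "subject": "Chemistry"},
    {"title": "Polymer Handbook", "format": "Book", "subject": "Chemistry"},
    {"title": "Protein Folding Studies", "format": "Article", "subject": "Biology"},
    {"title": "Catalysis Letters", "format": "Article", "subject": "Chemistry"},
]

def facet_counts(records, field):
    """Count how many records carry each value of one facet field."""
    return Counter(record[field] for record in records)

def narrow(records, **selected):
    """Keep only the records that match every selected facet value."""
    return [
        record for record in records
        if all(record.get(field) == value for field, value in selected.items())
    ]

# Each click adds one facet and shrinks the result set further.
step1 = narrow(records, subject="Chemistry")
step2 = narrow(step1, format="Article")

print(facet_counts(records, "subject"))
print([record["title"] for record in step2])
```

The contrast with clustering is that every grouping here depends on values already present in the metadata; nothing is derived from the text of the results.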
Vivísimo’s Verve: Search
Vivísimo is one of the leading companies in search technology. Its flagship product is the Velocity Search Platform, which consists of three layered components: the Velocity Search Engine, the Velocity Content Integrator, and the Velocity Clustering Engine. The company’s technologies power the expansive FirstGov.gov Web site and are widely used by corporations, government agencies, and research organizations.
Based in Pittsburgh, Vivísimo began as a group of scientists formerly associated with Carnegie Mellon University. The firm was formally established in 2000 and operates with funding from the National Science Foundation, Innovation Works (a Pittsburgh-based venture capital firm specializing in technology start-up companies), and other private investors.
The Vivísimo Clustering Engine performs an analysis of search results as they are delivered from the remote targets to organize them into groups and determine appropriate text labels for each group. The clustering algorithms work only with the information provided in the results and do not rely on manually created taxonomies or controlled vocabulary lists. This clustering feature, powered by the Vivísimo Clustering Engine, will be provided to Central Search subscribers without additional cost.
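The general idea of grouping and labeling results from their text alone can be illustrated with a toy sketch. This is not Vivísimo's algorithm, which is proprietary and far more sophisticated; it is a simple shared-term grouping under stated assumptions. The stopword list, the sample result titles, and the functions `terms` and `cluster` are all invented for the example.

```python
import re
from collections import defaultdict

# A tiny illustrative stopword list; a real system would use a much larger one.
STOPWORDS = {"a", "an", "and", "the", "of", "in", "for", "on", "to", "with"}

def terms(text, query_terms):
    """Return the distinct content words in a result, ignoring the query itself."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS and w not in query_terms}

def cluster(results, query, min_size=2):
    """Group results by shared terms and use each shared term as the group label."""
    query_terms = set(re.findall(r"[a-z]+", query.lower()))
    by_term = defaultdict(list)
    for index, text in enumerate(results):
        for term in terms(text, query_terms):
            by_term[term].append(index)
    # Keep terms shared by at least min_size results: largest groups first, ties alphabetical.
    labels = sorted(
        (t for t, ids in by_term.items() if len(ids) >= min_size),
        key=lambda t: (-len(by_term[t]), t),
    )
    clusters = {label: by_term[label] for label in labels}
    grouped = {i for ids in clusters.values() for i in ids}
    leftover = [i for i in range(len(results)) if i not in grouped]
    if leftover:
        clusters["Other"] = leftover  # results that share no term with any other result
    return clusters

# Hypothetical result titles for the ambiguous query "jaguar".
results = [
    "Jaguar cars unveils new sports model",
    "Jaguar habitat in the Amazon rainforest",
    "Used Jaguar cars for sale",
    "Rainforest conservation and the jaguar",
    "Jaguar operating system release notes",
]

clusters = cluster(results, "jaguar")
for label, ids in clusters.items():
    print(label, [results[i] for i in ids])
```

As in the approach described above, the labels come only from words in the results; no taxonomy or controlled vocabulary is consulted.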
Users can get a feel for how Vivísimo’s clustering technology works by trying the Clusty search engine at http://clusty.com. Clusty applies clustering to general Web searches, and it provides users with an opportunity to see this search approach in action.
Fast Forwarding Federated Search
Vivísimo’s clustering technology enhances Central Search and provides a rapid path to advancement for Serials Solutions, a relative newcomer in the federated-search domain. Central Search debuted at the January 2005 ALA Midwinter Meeting in Boston, whereas some other companies, such as WebFeat, MuseGlobal, and Ex Libris, have been developing and marketing federated-search products much longer.
Serials Solutions created Central Search through its own development efforts, though some components were licensed from other companies. Besides the recently integrated Vivísimo component, the system uses translation technology licensed from WebFeat.
Other companies involved in providing information resources and technologies to libraries have also taken interest in the Vivísimo technology. In January 2006, Swets announced a new federated-search product, the SwetsWise Searcher, based on metasearch and clustering technology from Vivísimo. In June of this year, Ex Libris announced it had licensed the Vivísimo Clustering Engine and reported plans to integrate this technology into the upcoming Version 4 of the company’s MetaLib federated-search product.
Central Search complements other applications offered by Serials Solutions, providing library customers with an integrated suite for the management of and access to collections of electronic resources. Other components include AMS (Access & Management Suite), Article Linker, and ERMS (Electronic Resource Management System). All these products build on the vendor’s foundation product, a comprehensive database for e-journal titles and holdings. Mastery of holdings data has been the core competency of Serials Solutions from its inception, and in the arena of e-content management, the quality of the holdings data distinguishes a product as much as the capability of the software.
Central Search, like the other members of the Serials Solutions suite of products, follows the vendor-hosted model. Previously called ASP (Application Service Provider), this approach is now referred to as “Software as Service,” and it represents one of the major information-technology trends.
Software as Service involves accessing a product on a server managed by the vendor, sparing the organization the expense and complexity of installing, configuring, and maintaining the software, operating system, and hardware associated with a given technology product. Given the complexity of establishing connection configurations with each of the search targets, the hosted approach is especially attractive in federated-search offerings. The disadvantage is a more limited ability to customize the application and integrate results into local portal environments, but Serials Solutions offers an XML API (application programming interface) to address this limitation.
Among the federated-search products marketed to libraries, some follow the Software as Service model and others rely on local installations. WebFeat, like Serials Solutions, offers its federated-search product in the Software as Service model. MetaLib (from Ex Libris) and MuseGlobal are offered to libraries as locally installed applications.
Additionally, in an interesting turn, Frank Bilotto, formerly head of publishing at Vivísimo, has joined MuseGlobal as its new VP for Publishing and Digital Media.