Yewno Advances as a New Type of Discovery

Smart Libraries Newsletter [April 2017]

The genre of index-based discovery services has become well established, with the majority of academic libraries adopting products such as Ex Libris Primo, Ex Libris Summon, or OCLC's WorldCat Discovery Service. These products operate by indexing vast quantities of full text, metadata, articles, book chapters, and other content items that represent the body of scholarly content of interest to libraries. They make use of indexing and search technologies, such as Apache Solr, Elasticsearch, or proprietary equivalents, to provide a broad-based search environment. This environment retrieves and ranks search results based on the relevancy of full-text matching, assigned subject terms, or other factors. EBSCO Discovery Service, for example, gives heavy weighting to subject terms assigned by domain experts to optimize relevancy. These discovery services make use of facets and other text-oriented tools to enable searchers to navigate through search results to identify items of interest.
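To make the idea of field-weighted relevancy concrete, here is a minimal Python sketch. It is not how any of the named products is implemented: the field names, the weights, and the sample records are all assumptions chosen for illustration, and production engines such as Solr or Elasticsearch use far more elaborate scoring than raw term counts.

```python
# Hypothetical field weights: a match in expert-assigned subject terms
# counts three times as much as a match in the full text.
FIELD_WEIGHTS = {"subject_terms": 3.0, "full_text": 1.0}


def score(record, query_terms):
    """Sum the weighted number of query-term occurrences in each field."""
    total = 0.0
    for field, weight in FIELD_WEIGHTS.items():
        tokens = record.get(field, "").lower().split()
        total += weight * sum(tokens.count(term.lower()) for term in query_terms)
    return total


def search(records, query):
    """Return ids of matching records, highest score first."""
    terms = query.split()
    scored = [(score(record, terms), record["id"]) for record in records]
    ranked = sorted(scored, key=lambda pair: -pair[0])
    return [record_id for value, record_id in ranked if value > 0]


# Invented sample records, for illustration only.
records = [
    {"id": "a", "subject_terms": "climate change", "full_text": "a study of coastal erosion"},
    {"id": "b", "subject_terms": "geology", "full_text": "climate climate data"},
    {"id": "c", "subject_terms": "botany", "full_text": "alpine plants"},
]

results = search(records, "climate")
```

Record "a" mentions the query term only once, but in its subject terms, so under these assumed weights it ranks ahead of record "b", which mentions the term twice in its full text.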

A recently launched product, Discover from Yewno for Education, takes quite a different approach to the discovery of scholarly and educational resources. Yewno Discover (a play on “you know”) provides a new type of discovery environment, presenting a visual search experience that enables researchers and students to explore and select information resources based on concepts rather than keywords. It also allows users to discover information resources as they traverse the connections or relationships among those concepts.

Yewno has developed an intuitive visual interface that can be easily understood by all types of searchers. It relies on a very sophisticated set of technologies, which the company describes as combining computational linguistics, machine learning, and graph theory.

User Interface

Yewno Discover positions itself as offering a next-generation discovery environment based on a fundamentally different set of principles than the current slate of discovery products. Yewno Discover presents a visual interface and is based on concepts rather than terms or keywords. The interface includes a search box, which is used to enter an initial set of query terms; a visual graph in the middle pane; and a definitions box and a context bar on the right. Once the searcher enters a query, a visual graph representing the concepts contained within the documents matching the search appears, with the original terms in the center. The definitions box and context bar are populated based on an initial rendering of the search results and change dynamically based on the current context and selections.

The central concept node will be surrounded by related concepts with lines to show how they relate to others. The design is based on a highly interactive environment where the searcher can click on anything they see. The searcher can single click on a node to understand its specific network of connections, or can double click on any node to expand it and explore its underlying concepts. As the user explores and navigates through the concept graph, chains of related concepts are built, continually displaying the paths of connection. Clicking on the lines connecting nodes will present the concepts held in common.
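The interactions described above can be pictured with a small graph sketch in Python. This is only an illustration under stated assumptions: the `ConceptGraph` class, its method names, and the sample concepts are hypothetical, and reading "concepts held in common" as the shared neighbors of two connected nodes is one plausible interpretation, not Yewno's documented behavior.

```python
from collections import defaultdict


class ConceptGraph:
    """Toy undirected concept graph (hypothetical, not Yewno's data model)."""

    def __init__(self):
        self.neighbors = defaultdict(set)

    def connect(self, a, b):
        """Record a relationship between two concepts."""
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def network(self, concept):
        """Single click on a node: the concepts directly connected to it."""
        return set(self.neighbors.get(concept, set()))

    def in_common(self, a, b):
        """Click on a connecting line: concepts related to both endpoints."""
        shared = self.neighbors.get(a, set()) & self.neighbors.get(b, set())
        return shared - {a, b}


# Invented sample relationships, for illustration only.
graph = ConceptGraph()
for a, b in [
    ("machine learning", "graph theory"),
    ("machine learning", "computational linguistics"),
    ("machine learning", "semantic network"),
    ("graph theory", "semantic network"),
    ("computational linguistics", "semantic network"),
]:
    graph.connect(a, b)

clicked_node = graph.network("graph theory")
clicked_line = graph.in_common("machine learning", "semantic network")
```

In this toy data, selecting "graph theory" shows its two direct connections, while selecting the line between "machine learning" and "semantic network" surfaces the two concepts that both of them touch.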

At any point, the searcher can click on the context bar to view lists of documents relevant to the currently selected context. The list presents documents, titles, and excerpts. When an individual article is selected, the searcher is linked to its publisher's platform for viewing or downloading. Consistent with other types of discovery environments, metadata and excerpts of the documents are available within the interface, but the documents themselves are fulfilled by the original publisher. Access to documents is also contingent on library subscriptions, with other payment or request options offered for items not covered by institutional license arrangements. It is important to keep in mind that the concepts presented through the interface may not be literally represented in the text of the documents. Yewno Discover presents concepts it has extracted from each document, not the keywords or terms that may literally appear. This approach can be especially helpful when working across disciplines that may use different terms to refer to the same concept.

Underlying Technology

Yewno has developed what it calls an inference engine that ingests documents and derives semantic data, which the search and retrieval capabilities present through its visual interface. As with conventional search products, documents are ingested into its platform. But instead of building textual indexes, as technologies such as Apache Solr or Elasticsearch would, Yewno adopts algorithms from the realm of computational linguistics and machine learning to identify and extract concepts contained within its entire, interdisciplinary content set. Yewno projects those concepts into a multi-dimensional semantic space and ultimately builds a semantic network of associated concepts. The platform is currently optimized for multidisciplinary scholarly content but can also be applied to other content domains or business sectors.
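The article does not describe Yewno's actual algorithms, but the general idea of projecting concepts into a semantic space and linking those that lie close together can be sketched in a few lines of Python. Everything here is an assumption for illustration: the three-dimensional vectors are made up, cosine similarity is just one common closeness measure, and the 0.8 threshold is arbitrary.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def build_semantic_network(vectors, threshold=0.8):
    """Link every pair of concepts whose vectors are at least `threshold` similar."""
    edges = {}
    for a, b in combinations(sorted(vectors), 2):
        similarity = cosine(vectors[a], vectors[b])
        if similarity >= threshold:
            edges[(a, b)] = similarity
    return edges


# Made-up coordinates standing in for positions in a semantic space.
concept_vectors = {
    "peptide": [0.9, 0.1, 0.0],
    "protein": [0.8, 0.2, 0.1],
    "interest rate": [0.0, 0.1, 0.9],
}

network = build_semantic_network(concept_vectors)
```

With these invented vectors, only "peptide" and "protein" end up connected; "interest rate" points in a different direction of the space and stays unlinked. A real system would derive the vectors from the ingested documents rather than assign them by hand.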

Content Addressed

Yewno depends on content ingested into its inference engine. The company has developed arrangements with a variety of publishers that contribute content for ingestion that can then be discovered via the Discover interface. This content is provided on the basis that the platform will facilitate discovery and link users back to publisher sites for fulfillment. Yewno is able to provide publishers with data describing the documents accessed via its interface and the concepts that led to document views.

Some of the major content providers currently partnering with Yewno include: Taylor & Francis, SpringerNature, JSTOR, Cambridge University Press, Oxford University Press, PNAS, Alexander Street, MIT Press, Stanford University Press, Association for Computing Machinery, Island Press, the National Academy of Sciences, Wiley, PubMed, Gale, SPIE, Annual Reviews, the Institute of Physics, the Association for Microbiology, and BioOne, as well as repositories of open access content such as bioRxiv, arXiv.org, MIT's DSpace repository, and Harvard University's DASH repository. Yewno plans to continually expand its publisher partners. Yewno recently announced that it has analyzed and ingested over 100 million texts into its platform. Although specific numbers are not disclosed, the index-based discovery services, such as Primo, Summon, and EBSCO Discovery Service, are estimated to address between 1.5 and 2 billion content items. It will be interesting to track whether Yewno is able to expand its coverage to match that of the established discovery products.

Early Adopters

Although Yewno has been in development for about two years, it remains at a relatively early stage in technology development, content acquisition, and marketing. The company made its public debut of Discover at the June 2016 ALA Annual Conference in Orlando, FL.

An extensive beta trial ran from April to December 2016 across beta test sites including a variety of large universities, smaller colleges, and a national library. Libraries currently working with Yewno include Stanford University, the Bavarian State Library, New York University, The Harvard Library, Stonehill College, University of Oxford, University of California, Berkeley, and MIT Libraries. Yewno is actively marketing Discover to other educational institutions in the United States, Canada, Germany, and the United Kingdom.

Yewno Company Background

Yewno was founded in 2014 by Ruggero Gramatica, an entrepreneur and business executive with a PhD in Applied Mathematics and a background in technology across the telecommunications and biomedical sectors. The company emerged out of the vision of Gramatica and Michael Keller, the Vice Provost, University Librarian, and Director of Academic Information Resources at Stanford University, to create a concept and inference engine able to address information discovery across disciplines and in multiple domains.

Gramatica had previously been the CEO of Thermametrics (formerly mondoBIOTECH), a small biotechnology firm involved in the research problem of identifying peptides that could be used in the treatment of rare diseases through the review of scientific literature. The company's original approach was based on work performed by human experts who would work through scientific papers to identify potential associations. Gramatica led the development of a new algorithmic framework able to ingest tens of millions of citations and abstracts from PubMed and other biomedical databases in order to algorithmically identify associations across biological entities and construct hypotheses of various mechanisms or actions. This framework was able to identify dramatically higher numbers of potential treatments for rare diseases than was possible through the manual work of a team of scientists. Gramatica retained ownership of the intellectual property, which was perpetually licensed to Thermametrics.

Michael Keller, a member of the Board of Directors of Thermametrics, invited Gramatica to Silicon Valley to lead a pilot of a new Inference Engine based on Gramatica's original mathematical framework. This time the framework was applied to the broader realm of scholarly literature, to develop a platform similar to the one that Search&Learn used for the biochemical domain (as related in a presentation to the February 2017 Charleston Conference).

The first phase of the company saw the initial development of a new technology infrastructure and interface prototype, which ultimately became the Yewno discovery platform. During this time, the company was also making agreements with publishers to supply content for ingestion. Following this initial two-year period of self-funding by its founders, Yewno was able to attract funding from investors to expand its operations. In November 2015, Yewno secured a $10 million investment from Pacific Capital Group, headed by Silvio Scaglia. In November 2016, the company secured a second round of investment with an additional $5 million from Desmond Shum of GO VR LLC and an additional $1.5 million from Pacific Capital, closing the Series A fundraising with a total of $16.5 million.

The company currently has around 30 employees and is based in Redwood City, CA, with a presence in London. Key personnel include co-founder and Chief Strategy and Business Development Officer Ruth Pickering, Chief Data Scientist Haris Dindo, Chief Operating Officer of Education Franny Lee, and Director of Product Development Ray Shan. Franny Lee had previously led SIPX, another start-up out of Stanford University that created a platform for helping universities reduce costs in course materials, including advanced copyright management functionality. Lee joined ProQuest as General Manager and VP of SIPX in April 2015, when SIPX was acquired by ProQuest, and worked there until November 2016. SIPX is now part of the product portfolio of Ex Libris, a ProQuest Company.

The company has also developed a product for publishers and content curators, called Yewno Unearth, which provides a topical hierarchy of their current catalog. Yewno Unearth offers more detailed information to inform their future content acquisitions operations, help sales and marketing efforts, and provide more granularity to strengthen search capabilities within their own platforms.

Leveraging the versatility of the algorithmic framework, which in the past was developed and successfully applied by Gramatica in the analysis of economic and financial cycles, Yewno is developing implementations of its technology optimized for other business sectors beyond the current Yewno for Education product, including finance, legal, and life sciences.

For more information, see http://yewno.com.

Publication Year: 2017
Type of Material: Article
Language: English
Published in: Smart Libraries Newsletter
Publication Info: Volume 37 Number 04
Issue: April 2017
Page(s): 2-5
Publisher: ALA TechSource
Place of Publication: Chicago, IL
Company: Yewno
ISSN: 1541-8820
Record Number:22562
Last Update:2022-12-05 15:04:07
Date Created:2017-05-02 17:50:07