PINGAR Search Interface Scores Highly in a Controlled User Study
Posted by admin on August 3, 2011
 

The Pingar API provides crucial tools for building search applications that score highly with end users. This has been confirmed in a detailed qualitative study that tested Pingar alongside other publicly available systems.

Bioscientists are some of the most prolific publishers. MEDLINE, which is the largest repository of research in the life sciences, indexes thousands of articles daily and currently contains nearly 20 million citations. To keep up with research around the world, scientists use PubMed, a search engine that provides access to MEDLINE. Occasionally they also use Google to search general biomedical texts.

medline_growth-modified.gif

In collaboration with Dr. Anna Divoli, Pingar conducted a user study to take a closer look at which search systems bioscientists use, which search tools they actually need and how these tools should work. In this study, 10 practicing bioscience researchers at the University of Chicago were interviewed using side-by-side comparisons of search results returned by several publicly available systems and two Pingar prototypes.

Bioscience and general search engines included in the study
Pingar’s two search engine prototypes for the biomedical domain were built using the Pingar API for semantic analysis of queries and documents and Apache Solr for full-text indexing and searching. We indexed the 85,000 articles in the Open Access PubMed dataset for this purpose. The prototypes allowed us to test Pingar’s tools for generating query expansions, related searches, keywords, summaries and taxonomy mapping, as well as Solr’s built-in facetted search and snippet extraction features.
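As a rough illustration of the Solr side of such a prototype, the sketch below builds a select query that asks for facet counts and highlighted snippets. The request parameters (`facet`, `facet.field`, `facet.mincount`, `hl`, `hl.fl`) are standard Solr parameters; the host, the core name `pubmed_oa` and the field names `taxonomy_terms` and `abstract` are assumptions made for this example, not the configuration used in the study.

```python
from urllib.parse import urlencode

# Hypothetical Solr location and core name, for illustration only.
SOLR_SELECT_URL = "http://localhost:8983/solr/pubmed_oa/select"


def build_faceted_query(user_query, facet_field="taxonomy_terms",
                        snippet_field="abstract", rows=10):
    """Return a Solr select URL requesting facets and highlighted snippets."""
    params = [
        ("q", user_query),
        ("rows", rows),
        ("wt", "json"),
        # Facetted search: count matching documents per value of facet_field.
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", 1),
        # Snippet extraction: return highlighted fragments of snippet_field.
        ("hl", "true"),
        ("hl.fl", snippet_field),
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_faceted_query("insulin resistance"))
```

The function only builds the URL, so it can be inspected without a running Solr instance. Sending it to a real server would require a core whose schema actually contains the assumed fields.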

The evaluation also included the systems primarily used for biomedical search (PubMed) and general search (Google), as well as three additional publicly available bioscience search engines:

  • GoPubMed, a commercial search engine that provides facetted refinement of PubMed search results using the original hierarchy of structured vocabularies (Gene Ontology and MeSH);
  • Semedico, a research prototype searching in PubMed with facets containing MeSH and UniProt terms grouped into 20 categories defined by biologists;
  • NextBio, a commercial search engine on PubMed and clinical trials enhanced with features from gene, tissue, disease and compound ontologies.

We also included Bing, which handles queries and result ranking somewhat differently from Google.

Interface features and how these were evaluated by participants
The initial questionnaire indicated that scientists are often unhappy with their search experience. They feel pigeon-holed by autocomplete suggestions, overwhelmed by the refinement options and intimidated by expert search tools. Hence, we decided to study each feature of a search interface separately:

query-expansion-anon.png

  • Autocomplete (autosuggest) provides dynamic search suggestions as the user types the query;
  • Query expansions suggest alternative query terms when users’ guesses result in only a few or incorrect matches;
  • Facetted refinement helps to narrow down results based on a dimension (facet) of the searched item;
  • Related searches include suggestions leading to new searches;
  • Results preview helps judge relevance of search results and can include document title, year, authors, snippet preview, summary and keywords.

We used scientists’ search queries to create screenshots of these features as they are returned by different systems. The systems’ logos were removed to keep the comparison fair (see illustration). We asked the scientists to rate the overall usefulness and aesthetics of how each system handled each feature and to rank the systems in order of preference.
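The post does not say how the individual rankings were combined. One simple option, shown here purely as an illustration, is to average each system's rank position across participants. The participant IDs, system labels and rank values below are invented placeholders, not results from the study.

```python
from collections import defaultdict


def mean_ranks(rankings):
    """Average each system's rank position (1 = most preferred).

    `rankings` maps a participant ID to that participant's list of systems,
    ordered from most to least preferred.
    """
    positions = defaultdict(list)
    for ordered_systems in rankings.values():
        for position, system in enumerate(ordered_systems, start=1):
            positions[system].append(position)
    averages = {s: sum(p) / len(p) for s, p in positions.items()}
    # Lower mean rank means the system was placed nearer the top on average.
    return sorted(averages.items(), key=lambda item: (item[1], item[0]))


if __name__ == "__main__":
    # Invented placeholder rankings for one interface feature.
    example = {
        "P1": ["System A", "System B", "System C"],
        "P2": ["System B", "System A", "System C"],
        "P3": ["System A", "System C", "System B"],
    }
    for system, rank in mean_ranks(example):
        print(f"{system}: {rank:.2f}")
```

With only 10 participants, a summary like this is best read next to the interview comments rather than on its own.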

Results of the user study
Overall, participants told us that aesthetics are important (and need to be “good enough” to use a system) but what really matters is the content. Their answers also show that some interface features are more important than others. For bioscientists, the most important parts of a search interface are a well-executed document preview and the facetted refinement. Related searches are least important because scientists rarely like clicking on suggestions that lead to new searches. Keywords, particularly specific ones, were well received, as were simple interfaces that don’t overwhelm with suggestions.

facetted-refinement-comparison.png

Different systems ranked top for different features. Two systems did particularly well with the participants: Google for autocomplete and related searches, and Pingar for query expansions and facetted refinement. In terms of document preview, scientists prefer snippets when searching for facts, and summaries for other searches. Users also responded positively to Semedico’s query expansions and Pingar’s related searches when searching for facts. Our participants disliked autocomplete suggestions that contain categories and other information (GoPubMed, Semedico, NextBio), and they did not find facetted refinement useful when it offers too many (Semedico) or too few (PubMed) choices.

Facetted search using automated metadata generation by Pingar
The most valuable outcome for Pingar is the bioscientists’ positive response to our prototypes with dynamic facetted refinement options. These options were computed automatically from the top search results using the Pingar API’s taxonomy matching service and the OBO Foundry biomedical taxonomies. This means that each query produced its own combination of relevant facets (see illustration). In previous posts, we have discussed the benefits of using taxonomies for organizing documents and of automatic metadata population using the Pingar API. This study indicates that users benefit from systems that use our technology in real-life settings.
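The idea of deriving facets from the top search results can be sketched as follows. This is a minimal illustration, not Pingar’s implementation: it assumes each top result already carries a list of matched taxonomy terms (in a real system these would come from a taxonomy matching step), and the function name `dynamic_facets` and the sample documents are made up for the example.

```python
from collections import Counter


def dynamic_facets(top_results, max_facets=5):
    """Rank taxonomy terms by how many of the top results they appear in.

    Each result is a dict with a "terms" list. The terms are supplied by
    hand here; no taxonomy matching service is called.
    """
    counts = Counter()
    for result in top_results:
        # Use a set so a term counts once per document.
        counts.update(set(result["terms"]))
    # Sort by descending count, then alphabetically for a stable order.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:max_facets]


if __name__ == "__main__":
    # Made-up example documents, not data from the study.
    results = [
        {"id": "doc1", "terms": ["insulin", "glucose metabolism"]},
        {"id": "doc2", "terms": ["insulin", "obesity"]},
        {"id": "doc3", "terms": ["insulin", "glucose metabolism", "liver"]},
    ]
    for term, count in dynamic_facets(results):
        print(f"{term} ({count})")
```

Because the counts depend on whichever documents rank highest, two different queries will generally yield different facet lists, which is the behaviour described above.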

The results of the study are discussed in our paper submitted to the Human-Computer Interaction and Information Retrieval workshop (HCIR-2011), held at Google in Mountain View, California, in October.
