The Open Information Systems Journal

2011, 5 : 1-7
Published online 2011 May 10. DOI: 10.2174/1874133901105010001
Publisher ID: TOISJ-5-1

The Popularity of Articles in PubMed

L. Smith and W.J. Wilbur
Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, USA.

ABSTRACT

The PubMed search engine displays query results in reverse chronological order, which is appropriate for users interested in the latest publications. The purpose of this paper is to use machine learning to order documents by popularity, defined as the predicted frequency with which an article is viewed by the average PubMed user. Other research on general search engine usability has applied machine learning to order documents by their relevance to a given query. The approach here instead takes a global view of popularity across all users in a given time period, independent of their information needs. An effective method for learning popularity from clickthrough data is identified, and a novel measure of success in this task is proposed. The resulting model shows that the topic of an article has the largest single influence on its popularity, and that its publication date has a strong secondary influence. Possible applications and extensions are discussed.

Keywords:

Clickthrough data, document ranking, schema less, machine learning.