The Open Information Systems Journal

2009, 3 : 1-8
Published online 2009 April 13. DOI: 10.2174/1874133900903010001
Publisher ID: TOISJ-3-1

Using Windmill Expansion for Document Retrieval

Shao Fen Liang, Paul Smart, Alistair Russell and Nigel Shadbolt
School of Electronics and Computer Science, University of Southampton, SO17 1BJ, UK.

ABSTRACT

SEMIOTIKS aims to utilise online information to support the crucial decision-making of those military and civilian agencies involved in the humanitarian removal of landmines in areas of conflict throughout the world. An analysis of the type of information required for such a task has given rise to four main areas of research: information retrieval, document annotation, summarisation and visualisation. The first stage of the research has focused on information retrieval, and a new algorithm, “Windmill Expansion” (WE), has been proposed for this purpose. The algorithm uses retrieval feedback techniques for automated query expansion in order to improve the effectiveness of information retrieval. WE is based on the extraction of human-generated written phrases for automated query expansion. Top and Second Level expansion terms have been generated and their usefulness evaluated. The evaluation has concentrated on measuring the degree of overlap between the retrieved URLs: the less the overlap, the more useful the information provided. The Top Level expansion terms were found to provide 90% useful URLs, and the Second Level 83%. Although the proportion of useful URLs declined from the Top Level to the Second Level, the quantity of relevant information retrieved increased. The originality of SEMIOTIKS lies in its use of the WE algorithm to help non-domain-specific experts automatically explore domain words for relevant and precise information retrieval.

Keywords:

Information retrieval, query expansion, retrieval feedback, humanitarian demining.
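
The overlap-based evaluation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the measure below is an assumption: the share of URLs returned by an expanded query that the original query did not already return. The function name, variable names and example URLs are hypothetical and are not taken from the paper.

```python
def non_overlap_ratio(baseline_urls, expanded_urls):
    """Fraction of the expanded query's URLs that the baseline query did not return.

    Assumed measure for illustration only; the paper's own definition may differ.
    """
    baseline = set(baseline_urls)
    # Remove duplicate URLs from the expanded result list while keeping order.
    expanded = list(dict.fromkeys(expanded_urls))
    if not expanded:
        return 0.0
    new_urls = [url for url in expanded if url not in baseline]
    return len(new_urls) / len(expanded)


# Hypothetical result lists for an original query and one expanded query.
baseline = [
    "http://example.org/a",
    "http://example.org/b",
    "http://example.org/c",
]
expanded = [
    "http://example.org/b",
    "http://example.org/d",
    "http://example.org/e",
    "http://example.org/f",
]

ratio = non_overlap_ratio(baseline, expanded)
print(f"Non-overlapping share: {ratio:.2f}")
```

Under this assumed measure, a higher ratio means the expansion term surfaced more URLs that the original query missed, which matches the abstract's reading that less overlap indicates more useful information.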