Abstract:
The common approach of using clusters of similar documents for ad hoc document retrieval is to rank the clusters in response to the query; then, the cluster ranking is transformed to document ranking. We present a novel supervised approach to transform cluster ranking to document ranking. The approach allows to simultaneously utilize different clusterings and the resultant cluster rankings; this helps to improve the modeling of the document similarity space. Empirical evaluation shows that using our approach results in performance that substantially transcends the state-of-the-art in cluster-based document retrieval.
Cluster-Based Document Retrieval