Unsupervised Query Reformulation through Latent Concept Induction in Large-Scale Heterogeneous Information Retrieval Environments

Mikhail Petrov

Research Article, August 2023

ISCSITR - International Journal of Scientific Research in Information Technology

Abstract

In large-scale heterogeneous information retrieval (IR) environments, user queries are often semantically ambiguous or structurally sparse, which limits retrieval effectiveness. This paper proposes an unsupervised query reformulation framework based on latent concept induction (LCI), which learns implicit semantic structures from retrieved document sets. Unlike supervised approaches, the proposed model uncovers latent concepts autonomously through document co-occurrence and context propagation techniques. Experiments on TREC and ClueWeb datasets show significant improvements in mean average precision (MAP) and normalized discounted cumulative gain (nDCG) over baseline and supervised models. Because the LCI framework improves retrieval effectiveness without requiring annotated query reformulation data, it is scalable across domains and languages.
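For readers unfamiliar with the two metrics named in the abstract, the following is a minimal sketch of their standard textbook definitions. It is not the paper's evaluation code. The function names, the binary-relevance input for MAP, and the choice to compute the ideal ranking from the retrieved list alone are all assumptions made for illustration.

```python
import math


def average_precision(ranked_relevance):
    """Average precision for one query.

    ranked_relevance: list of 0/1 flags in rank order (assumed binary labels).
    Assumption: every relevant document appears somewhere in the list.
    """
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(ranked_relevance, start=1):
        if rel:
            hits += 1
            precision_sum += hits / rank  # precision at this relevant rank
    return precision_sum / hits if hits else 0.0


def mean_average_precision(runs):
    """MAP: the mean of per-query average precision values."""
    return sum(average_precision(r) for r in runs) / len(runs)


def dcg(gains, k=None):
    """Discounted cumulative gain with a log2(rank + 1) discount."""
    gains = gains[:k] if k is not None else gains
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains, k=None):
    """nDCG: DCG divided by the DCG of the ideal (sorted) ordering.

    Assumption: the ideal ordering is built from the retrieved list only.
    """
    ideal = dcg(sorted(gains, reverse=True), k)
    return dcg(gains, k) / ideal if ideal else 0.0
```

Under these definitions a ranking that places all relevant documents first scores 1.0 on both metrics, and pushing a relevant document further down the list lowers both scores.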

Keywords

query reformulation, latent concept induction, unsupervised learning, information retrieval, semantic matching, large-scale retrieval

Details

Volume 4

Issue 2

Pages 1-6

ISSN 1551-1116
