Active Learning for Semi-Supervised Clustering Framework for High Dimensional Data

Pavithra M

doi:10.32804/IRJMST

Research Article, August 2019

International Research Journal of Management Science and Technology

Abstract

In certain clustering tasks, limited supervision is available in the form of pairwise constraints, i.e., pairs of instances labeled as belonging to the same cluster or to different clusters. The resulting problem is known as semi-supervised clustering, an instance of semi-supervised learning that stems from a traditional unsupervised learning setting. Several algorithms use such constraints to improve clustering quality [2], typically by either modifying the clustering objective function or learning the clustering distortion measure. Supervision of this kind, whether labeled instances or pairwise instance constraints, often significantly improves clustering performance. Despite the considerable effort devoted to this problem, most existing work is not designed to handle high-dimensional sparse data [4]. One typical approach specifies a limited number of must-link and cannot-link constraints between pairs of examples. This paper presents a pairwise constrained clustering framework and a new method for actively selecting informative pairwise constraints to improve clustering performance [6]. Both the clustering and the active learning methods scale easily to large datasets and can handle very high-dimensional data. Experimental and theoretical results confirm that actively querying pairwise constraints significantly improves clustering accuracy when only a relatively small amount of supervision is available [5].
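The idea of modifying the clustering objective with must-link and cannot-link constraints can be illustrated with a small sketch. This is not the paper's algorithm: it is a simplified, greedy assignment step in which each violated constraint adds a fixed penalty to the squared distance to a centroid. The function names, the single `penalty` weight, and the in-order greedy assignment are all assumptions made for illustration.

```python
from typing import List, Sequence, Tuple

Point = Sequence[float]
Pair = Tuple[int, int]  # indices of two points that share a constraint


def sq_dist(a: Point, b: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_with_constraints(
    points: List[Point],
    centroids: List[Point],
    must_link: List[Pair],
    cannot_link: List[Pair],
    penalty: float = 1.0,  # assumed uniform cost per violated constraint
) -> List[int]:
    """Greedily assign points (in index order) to the cheapest cluster.

    Cost of putting point i in cluster k = squared distance to centroid k
    plus `penalty` for every constraint this choice would violate against
    points that have already been assigned.
    """
    labels = [-1] * len(points)  # -1 means "not assigned yet"
    for i, p in enumerate(points):
        best_k, best_cost = 0, float("inf")
        for k, c in enumerate(centroids):
            cost = sq_dist(p, c)
            for a, b in must_link:
                if i in (a, b):
                    j = b if a == i else a
                    # Violated if the partner already sits in another cluster.
                    if labels[j] != -1 and labels[j] != k:
                        cost += penalty
            for a, b in cannot_link:
                if i in (a, b):
                    j = b if a == i else a
                    # Violated if the partner already sits in this cluster.
                    if labels[j] == k:
                        cost += penalty
            if cost < best_cost:
                best_k, best_cost = k, cost
        labels[i] = best_k
    return labels


def count_violations(
    labels: List[int], must_link: List[Pair], cannot_link: List[Pair]
) -> int:
    """Number of constraints that a labeling fails to satisfy."""
    ml = sum(1 for a, b in must_link if labels[a] != labels[b])
    cl = sum(1 for a, b in cannot_link if labels[a] == labels[b])
    return ml + cl


# Toy data: the middle point is nearer to the second centroid,
# but a must-link constraint ties it to the first point.
points = [(0.0, 0.0), (0.6, 0.0), (1.0, 0.0)]
centroids = [(0.0, 0.0), (1.0, 0.0)]
unconstrained = assign_with_constraints(points, centroids, [], [])
constrained = assign_with_constraints(points, centroids, [(0, 1)], [])
```

In this toy run the must-link pair pulls the middle point into the first cluster even though the second centroid is closer. A full method would alternate this step with centroid updates, and an active learner would choose which pairs to query rather than receive them up front.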

Details

Volume 10

Issue 8

Pages 35-41

DOI 10.32804/IRJMST

ISSN 2250-1959
