Paper Title

Sparse least-squares Universum twin bounded support vector machine with adaptive Lp-norms and feature selection

Authors

Hossein Moosaei
Fatemeh Bazikar
Milan Hladík

Article Type

Research Article

Issue

Volume: 248 | Page No: 123378

Published On

August, 2024

Abstract

Classification problems in data analysis often involve a large number of features. Not all of them are relevant to the task at hand, and including irrelevant features can degrade learning performance. Selecting the most relevant features is therefore critical, especially for high-dimensional data sets. Feature selection addresses this issue by representing the original data through the relevant features that contain useful information. In this research, we propose a p-norm least-squares Universum twin bounded support vector machine (LSp-UTBSVM) that performs classification and feature selection at the same time. The proposed method outperforms the traditional least-squares Universum twin bounded support vector machine: it achieves good classification accuracy in a reasonable amount of time while also providing a sparse solution. The model is an adaptive learning procedure with a p-norm (0 < p < 1), where the parameter p is selected automatically from the data set. The algorithm we use to find an approximate solution of this model involves solving systems of linear equations. Furthermore, we obtain new bounds for the absolute values of the non-zero components of a local optimal solution. These bounds allow us to remove the zero components from an arbitrary numerical solution. By setting the parameter p, LSp-UTBSVM improves classification accuracy and selects the relevant features. Numerical experiments on handwritten digit recognition, University of California Irvine (UCI) benchmark, Normally Distributed Clusters (NDC), and high-dimensional data sets confirm the superiority of the proposed method over some popular methods in both classification accuracy and the selection of relevant features.
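The abstract describes two ideas: approximating a p-norm model by repeatedly solving linear systems, and removing components that are numerically zero. The sketch below illustrates both on a much simpler problem. It is not the authors' LSp-UTBSVM algorithm. It assumes a plain least-squares regression with an Lp penalty, solved by generic iteratively reweighted least squares. The function names, the parameters `lam`, `eps`, and `iters`, and the fixed cutoff `tol` are all hypothetical; in particular, `tol` is only a stand-in for the bounds derived in the paper.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def lp_least_squares(X, y, lam=1.0, p=0.5, eps=1e-8, iters=30, tol=1e-3):
    """Approximately minimize ||Xw - y||^2 + lam * sum(|w_j|^p), 0 < p < 1.

    Each iteration replaces |w_j|^p by a quadratic upper bound built from
    the current iterate, so the update is one linear system.
    All parameter names and defaults are illustrative assumptions.
    """
    n = len(X[0])
    gram = [[sum(r[i] * r[j] for r in X) for j in range(n)] for i in range(n)]
    rhs = [sum(r[i] * t for r, t in zip(X, y)) for i in range(n)]
    w = [1.0] * n
    for _ in range(iters):
        A = [g[:] for g in gram]
        for j in range(n):
            # Reweighting: small |w_j| gets a large penalty weight.
            A[j][j] += 0.5 * lam * p * (w[j] ** 2 + eps) ** (p / 2 - 1)
        w = solve_linear(A, rhs)
    # Hypothetical cutoff: treat tiny components as exact zeros.
    return [wj if abs(wj) > tol else 0.0 for wj in w]


# Toy data: the target depends on feature 0 only, plus small noise.
x0 = [1, -1, 2, -2, 1, -1, 2, -2]
x1 = [1, 1, -1, -1, 1, -1, 1, -1]
x2 = [1, -1, -1, 1, -1, 1, 1, -1]
noise = [0.1, 0.1, -0.1, 0.1, -0.1, -0.1, 0.1, -0.1]
X = [list(row) for row in zip(x0, x1, x2)]
y = [2 * a + e for a, e in zip(x0, noise)]

w = lp_least_squares(X, y)
selected = [j for j, wj in enumerate(w) if wj != 0.0]
print("weights:", w)
print("selected features:", selected)
```

On this toy data, ordinary least squares assigns small non-zero weights to the two irrelevant features, while the reweighted iterations shrink them below the cutoff. The weight on feature 0 stays close to the true coefficient 2.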
