Research Article, September 2024

LEVERAGING MACHINE LEARNING FOR IMPROVED SPAM DETECTION IN ONLINE NETWORKS

Abstract

This paper proposes a spam detection method that combines N-gram tf.idf feature selection with a deep multi-layer perceptron neural network, further improved by a modified distribution-based balancing algorithm. The method targets high-dimensional data and class imbalance, and it outperforms state-of-the-art methods on benchmark datasets including Enron, SpamAssassin, the SMS Spam Collection, and social networking data. It improves spam classification by capturing complex features, which reduces both false positives and false negatives. These results show that combining deep learning with improved feature extraction and balancing techniques provides a robust approach to spam detection.
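To make the first two stages of the pipeline concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It computes tf.idf weights over word unigrams and bigrams, then balances the classes by random oversampling, which is used here only as a simple stand-in for the paper's modified distribution-based balancing algorithm. The function names, the toy messages, and the idf formula ln(N/df) are assumptions made for illustration; the multi-layer perceptron that would consume these features is omitted.

```python
import math
import random
from collections import Counter

def ngrams(text, n_max=2):
    """Return word n-grams of length 1..n_max from a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(tokens) - n + 1)]

def tfidf(docs, n_max=2):
    """Map each document to a {ngram: tf * idf} dict, with idf = ln(N / df) (assumed variant)."""
    grams = [Counter(ngrams(d, n_max)) for d in docs]
    df = Counter(g for counts in grams for g in counts)  # document frequency per n-gram
    n_docs = len(docs)
    vectors = []
    for counts in grams:
        total = sum(counts.values())
        vectors.append({g: (c / total) * math.log(n_docs / df[g])
                        for g, c in counts.items()})
    return vectors

def oversample(samples, labels, seed=0):
    """Random oversampling (a stand-in, not the paper's algorithm):
    duplicate minority-class examples until every class matches the largest one."""
    rng = random.Random(seed)
    by_label = {}
    for s, y in zip(samples, labels):
        by_label.setdefault(y, []).append(s)
    target = max(len(items) for items in by_label.values())
    out_s, out_y = [], []
    for y, items in by_label.items():
        extra = [rng.choice(items) for _ in range(target - len(items))]
        out_s.extend(items + extra)
        out_y.extend([y] * target)
    return out_s, out_y

# Hypothetical toy corpus: 1 = spam, 0 = legitimate.
docs = ["win free prize now", "meeting at noon",
        "lunch at noon tomorrow", "project meeting notes"]
labels = [1, 0, 0, 0]

vectors = tfidf(docs)
balanced_x, balanced_y = oversample(vectors, labels)
```

In this toy corpus, n-grams that appear only in the spam message, such as "free prize", receive the highest idf, while oversampling repeats the single spam vector until both classes have three examples. A real system would more likely rely on an established library for vectorization and resampling.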

Keywords

Leveraging Machine Learning; Spam Detection; N-gram tf.idf; Feature Selection; High-Dimensional Data; Class Imbalance; Benchmark Datasets; Enron Dataset; SpamAssassin; SMS Spam Collection; Social Networking Data; False Positives; False Negatives; Feature Extraction; Deep Learning; Robust Spam Detection; Machine Learning
Details
Volume 11
Issue 4
Pages 258–273
ISSN 2348-5612