doi:10.36676/urr.v11.i4.1364

Paper Title

LEVERAGING MACHINE LEARNING FOR IMPROVED SPAM DETECTION IN ONLINE NETWORKS

Authors

Balachandar Paulraj

Keywords

Leveraging Machine Learning
Spam Detection
N-gram tf.idf Feature Selection
High-Dimensional Data
Class Imbalance
Benchmark Datasets
Enron Dataset
SpamAssassin
SMS Spam Collection
Social Networking Data
False Positives
False Negatives
Feature Extraction
Deep Learning
Robust Spam Detection
Machine Learning

Article Type

Research Article

Journal

Universal Research Reports

Issue

Volume : 11 | Issue : 4 | Page No : 258–273

Published On

September, 2024

Abstract

This paper proposes a spam detection method that combines N-gram tf.idf feature selection with a deep multi-layer perceptron neural network, further improved by a modified distribution-based balancing algorithm. The method targets two problems: high-dimensional data and class imbalance. On benchmark datasets (Enron, SpamAssassin, SMS Spam Collection, and social networking data), it is reported to outperform state-of-the-art methods. The approach also improves spam classification by capturing complex features, which reduces both false positives and false negatives. These results indicate that combining deep learning with improved feature extraction and balancing techniques yields a robust approach to spam detection.
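To make the N-gram tf.idf step concrete, the following is a minimal sketch in plain Python, not the paper's implementation. It assumes whitespace tokenisation, word unigrams and bigrams, and the unsmoothed weighting tf × log(N / df). The abstract does not specify any of these choices. The function names `word_ngrams` and `tfidf_vectors` and the three sample messages are hypothetical, introduced only for illustration.

```python
import math
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return all word n-grams of length 1..n_max (assumed: whitespace tokens)."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_max=2):
    """Compute one sparse tf.idf dict per document (assumed: unsmoothed idf)."""
    counts = [Counter(word_ngrams(d, n_max)) for d in docs]

    # Document frequency: number of documents containing each n-gram.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vec = {}
        for gram, k in c.items():
            tf = k / total                      # relative term frequency
            idf = math.log(n_docs / df[gram])   # rarer n-grams weigh more
            vec[gram] = tf * idf
        vectors.append(vec)
    return vectors


# Hypothetical messages, invented for this example only.
docs = [
    "win a free prize now",
    "free prize waiting claim now",
    "meeting moved to monday",
]
vectors = tfidf_vectors(docs)
```

In this sketch, an n-gram that occurs in only one message (such as "meeting moved") receives a higher weight than one shared by several messages (such as "free prize"). Vectors of this kind could then be passed to a classifier such as the multi-layer perceptron the abstract mentions. The balancing algorithm is not illustrated because the abstract gives no detail on how it works.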