Transparent Peer Review By Scholar9
CREDIT CARD FRAUD DETECTION USING SUPPORT VECTOR MACHINES AND NAÏVE BAYES (SVM-NB)
Abstract
Financial fraud is an ever-increasing threat to the financial industry, with far-reaching implications. Data mining is crucial for detecting credit card fraud in live transactions. This work investigates the performance of Naïve Bayes, Support Vector Machines, and a hybrid of the two techniques on a highly imbalanced credit card dataset. The dataset, sourced from European cardholders, contains 284,807 transaction records. The work was implemented in Python. Model performance is evaluated using the following metrics: Accuracy, Sensitivity, F1-Score, Prevalence, Precision, Specificity, False Alarm Alert, and Balanced Accuracy. The average accuracies were 99.80% for the Support Vector Machine, 98.0% for Naïve Bayes, and 99.99% for the hybrid model. The comparison shows that the hybrid model improves on both individual classifiers.
Sivaprasad Nadukuru Reviewer
03 Oct 2024 11:30 AM
Approved
Relevance and Originality
The text addresses a critical issue in the financial sector, namely financial fraud, and highlights the importance of detecting credit card fraud using data mining techniques. The focus on a highly imbalanced dataset and the application of several machine learning methods, including Naïve Bayes and Support Vector Machines, shows originality in tackling this persistent problem. By exploring a hybrid of the two techniques, the work contributes novel insights into enhancing fraud detection methods.
Methodology
The methodology outlined in the text is appropriate for the study's objectives. Using a credit card dataset with 284,807 transaction records provides a solid foundation for analysis. However, more detail on the data preprocessing steps, such as handling the imbalanced dataset, would strengthen the methodology description. Additionally, elaborating on the implementation process in Python, including libraries used, could enhance reproducibility for readers interested in similar applications.
Validity & Reliability
The reported results show high accuracy for all three models. However, because the dataset is highly imbalanced, it would be beneficial to discuss the reliability of these findings. Providing insights into cross-validation techniques or discussing how the models performed on unseen data would enhance the study's credibility. Addressing potential biases in model evaluation and the implications of using different metrics would also provide a more robust analysis.
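To illustrate the kind of cross-validation detail that would help here, the following is a minimal sketch of stratified fold assignment, which keeps the share of fraud cases roughly equal in every fold. It is not the authors' procedure: the function name `stratified_folds`, the fold count, the seed, and the toy labels are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign each sample index to one of k folds, preserving class proportions.

    Hypothetical helper for illustration; not taken from the reviewed paper.
    """
    rng = random.Random(seed)

    # Group sample indices by class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Shuffle within each class, then deal indices round-robin into the folds
    # so every fold receives a near-equal share of each class.
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, idx in enumerate(indices):
            folds[position % k].append(idx)
    return folds


# Hypothetical toy data: 1,000 transactions, 10 of them fraudulent (label 1).
labels = [1] * 10 + [0] * 990
folds = stratified_folds(labels, k=5)
fraud_per_fold = [sum(labels[i] for i in fold) for fold in folds]
print(fraud_per_fold)
```

With a plain random split, a fold could easily contain no fraud cases at all, which would make sensitivity undefined on that fold. Stratifying avoids this, and reporting per-fold results would show how stable the models are.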
Clarity and Structure
The text is clear and effectively conveys the main findings. However, the structure could be improved by organizing it into distinct sections, such as "Introduction," "Methodology," "Results," and "Conclusion." This would help readers navigate the content more easily. Additionally, including definitions or explanations for terms like "F1-Score" and "Balanced Accuracy" would make the text more accessible to a broader audience.
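As an example of the definitions that could be included, the sketch below computes the listed metrics from confusion-matrix counts using their standard formulas. The counts are hypothetical and not taken from the paper, and "False Alarm Alert" is assumed here to mean the false positive rate, since the text does not define it.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp/fn: fraud cases that were caught/missed; tn/fp: legitimate
    transactions that were passed/flagged. Ratios with a zero
    denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = (
        2 * precision * sensitivity / (precision + sensitivity)
        if (precision + sensitivity)
        else 0.0
    )
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f1": f1,  # harmonic mean of precision and sensitivity
        "balanced_accuracy": (sensitivity + specificity) / 2,
        # Assumption: "False Alarm Alert" is read as the false positive rate.
        "false_alarm_rate": fp / (fp + tn) if (fp + tn) else 0.0,
        "prevalence": (tp + fn) / total,  # share of fraud cases in the data
    }


# Hypothetical case: 10,000 transactions, 20 fraudulent, and a model that
# labels every transaction as legitimate.
always_legit = classification_metrics(tp=0, fp=0, tn=9980, fn=20)
print(always_legit["accuracy"], always_legit["balanced_accuracy"], always_legit["f1"])
```

In this hypothetical case the accuracy is 99.8% even though no fraud is detected, while Balanced Accuracy is 0.5 and F1-Score is 0. This is why defining these terms, and reporting them alongside accuracy, matters for an imbalanced dataset.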
Result Analysis
The analysis of the results shows promising outcomes, particularly for the hybrid model, which achieved an impressive accuracy of 99.99%. However, the text would benefit from a deeper discussion of the implications of these findings in real-world applications. Including comparisons to existing fraud detection methods and potential limitations of the models would provide a more nuanced view. Furthermore, exploring future work or potential improvements to the models could enrich the discussion and offer insights into ongoing challenges in fraud detection.
IJ Publication Publisher
Done Sir