Abstract
Breast cancer is still among the most prevalent and deadly diseases for women all over the world, and early diagnosis is imperative to improve survival. This paper proposes a machine learning system for detecting breast cancer using two methodologies: tabular data classification using Random Forest and deep learning-based mammography image classification using a Convolutional Neural Network (CNN). The first approach uses the Breast Cancer Wisconsin Diagnostic Dataset, where a Random Forest classifier is applied to structured data to make accurate malignancy predictions. The second approach uses mammography images from databases like CBIS-DDSM and classifies images as benign or malignant using a CNN with transfer learning (VGG16). The evaluation of both models indicates that Random Forest demonstrates 98% accuracy, and CNN achieves 92% accuracy according to precision, recall, and F1-score metrics. Real-time predictions are achieved through Flask API, and the deployment strategies include scalable cloud solutions such as AWS SageMaker and Google Vertex AI. This study determines the potential of machine learning to enhance diagnostic accuracy, reduce errors, and accelerate breast cancer detection. Through a combination of tabular data and imaging-based techniques, the study suggests an integrative solution for early and effective breast cancer diagnosis, paving the way for improved patient outcomes and scalable health solutions.
View more >>