Transparent Peer Review By Scholar9

ASD-Pipeline: An Ensemble Machine Learning Framework Integrating Feature Selection, Behavioural Clustering, and Class Rebalancing for Accurate Autism Spectrum Disorder Prediction

Abstract

Autism Spectrum Disorder (ASD) is a complex neurodevelopmental condition characterized by a variety of behavioral and cognitive patterns. Early and precise detection is critical for enabling timely interventions. Conventional classification models frequently generalize poorly because of irrelevant features, unstructured behavioral data, and severe class imbalance. Despite recent advances in machine learning for ASD detection, existing models do not integrate adaptive feature selection, behavioral grouping, and class-imbalance handling in a unified, end-to-end pipeline. This lack of integration frequently results in suboptimal performance and limited interpretability. This study proposes a new ensemble-based framework, ASD-Pipeline, which integrates flexible feature selection, hybrid clustering, synthetic minority oversampling, and ensemble voting classification to improve predictive performance for ASD identification.

The proposed ASD-Pipeline framework uses a five-stage process. First, the dataset is normalized using Min-Max scaling so that feature ranges remain consistent. Next, feature selection is performed using FlexiFeat, an ensemble method integrating filter-based (CfsSubsetEval with BestFirst), wrapper-based (WrapperSubsetEval with GreedyStepwise), and embedded (ReliefF with Ranker) techniques to retain only the most pertinent features. The ClusterGroup stage uses K-Means clustering (k=5) followed by DBSCAN refinement (ε=0.5, minPts=3) within each cluster to create behavioral groups and remove outliers. The ReBalance stage uses Cluster-SMOTE to tackle class imbalance by generating synthetic samples for the minority class, yielding a balanced dataset. Finally, the ASDClassifier stage trains an ensemble of Logistic Regression, Support Vector Machine, and Gradient Boosting classifiers that are combined using soft voting.

Metrics used to assess the model include accuracy, precision, recall, F1-score, and Matthews Correlation Coefficient (MCC). The proposed ASD-Pipeline surpassed existing models, achieving an accuracy of 96.18% compared to previous techniques ranging from 76.80% to 90.60%. It also scored 91.51% precision, 91.63% recall, 95.57% F1-score, and 92.51% specificity. These findings emphasize the pipeline's efficacy in enhancing generalization and in addressing feature relevance, behavioral grouping, and class imbalance for ASD prediction. The ASD-Pipeline offers a reliable, interpretable, and modular machine learning solution, making it a promising tool for healthcare practitioners and researchers seeking data-driven insights into early ASD detection.
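To make the normalization and rebalancing stages described in the abstract more concrete, the following is a minimal pure-Python sketch, not the authors' implementation. It assumes plain lists of numeric features; the function names `min_max_scale` and `smote_like` are hypothetical, and the oversampler is a simplified SMOTE-style interpolation with a single nearest neighbour rather than the Cluster-SMOTE variant the paper names.

```python
import random

def min_max_scale(rows):
    """Rescale each column of `rows` to [0, 1]; constant columns map to 0.0."""
    cols = list(zip(*rows))
    lows = [min(c) for c in cols]
    spans = [max(c) - min(c) for c in cols]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]

def smote_like(minority, n_new, rng):
    """Create n_new synthetic points, each on the segment between a randomly
    chosen minority sample and its nearest minority neighbour (k = 1).
    Assumes at least two minority samples."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = (p for p in minority if p is not base)
        neighbour = min(others, key=lambda p: sq_dist(base, p))
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (n - b) for b, n in zip(base, neighbour)])
    return synthetic

# Hypothetical toy data: three samples with two features each.
scaled = min_max_scale([[2.0, 10.0], [4.0, 30.0], [6.0, 20.0]])
new_points = smote_like(scaled, n_new=3, rng=random.Random(0))
```

Because each synthetic point is a convex combination of two scaled samples, it stays inside the [0, 1] range produced by the scaling step. In practice the scaling bounds would be fitted on training data only, so that no information leaks from the test split.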

PRONOY CHOPRA Reviewer

Review Request Accepted

30 May 2025 01:23 PM

Approved

Relevance and Originality

The research presents a highly relevant and innovative approach to autism detection by addressing three core challenges: feature relevance, behavioral variability, and class imbalance. By introducing the ASD-Pipeline, which unifies adaptive feature selection, clustering-based behavioral grouping, and synthetic data balancing, the study offers a fresh contribution to the field. The framework’s integration of these stages with ensemble learning sets it apart from conventional models and underscores its originality in tackling a complex neurodevelopmental condition.

Methodology

The methodology is comprehensive and logically structured, combining Min-Max normalization, an ensemble feature selector (FlexiFeat), hybrid clustering with K-Means and DBSCAN, Cluster-SMOTE for class imbalance handling, and an ensemble classifier using Logistic Regression, Support Vector Machine, and Gradient Boosting. Each stage feeds the next in a cohesive flow, enhancing the pipeline's overall effectiveness in autism detection and behavioral clustering. This integrated use of normalization, feature selection, and machine learning techniques strengthens the pipeline's applicability to real-world datasets.
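The soft-voting step mentioned above can be illustrated with a short sketch. It assumes each base classifier outputs a probability per class; the function name `soft_vote` and the probability values are hypothetical and are not taken from the paper.

```python
def soft_vote(prob_lists, weights=None):
    """Average per-class probabilities across models (optionally weighted)
    and return (predicted_label, averaged_probabilities)."""
    n_models = len(prob_lists)
    weights = weights or [1.0] * n_models
    total = sum(weights)
    n_classes = len(prob_lists[0])
    averaged = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    label = max(range(n_classes), key=averaged.__getitem__)
    return label, averaged

# Hypothetical [P(non-ASD), P(ASD)] outputs from the three base models.
logistic_regression = [0.90, 0.10]
support_vector_machine = [0.45, 0.55]
gradient_boosting = [0.40, 0.60]

label, averaged = soft_vote(
    [logistic_regression, support_vector_machine, gradient_boosting]
)
# Two of the three models favour class 1, so a majority (hard) vote would
# pick class 1, but the averaged probabilities favour class 0.
```

The example is chosen to show why soft voting can differ from a simple majority vote: one confident model can outweigh two marginal ones, which is only sensible when the base models' probabilities are reasonably well calibrated.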

Validity & Reliability

The findings appear robust and well-supported, with the ASD-Pipeline achieving high scores across the reported performance metrics, including 96.18% accuracy, precision and recall above 91%, and strong F1-score and specificity values. The inclusion of the Matthews Correlation Coefficient reinforces the credibility of the results by providing a balanced view of performance. Although internal validity is strong, the study's reliability would be further strengthened by external validation on independent datasets, particularly given the pipeline's intended use for predictive modeling in autism detection.
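For readers who want to relate the metrics named above to one another, the sketch below computes them from a binary confusion matrix. The counts are hypothetical and are not the paper's data; the function names are likewise illustrative, and the formulas are the standard definitions.

```python
from math import sqrt

def f1_from(precision, recall):
    """F1 is the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

def binary_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.
    Assumes each class has at least one actual and one predicted sample."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": tn / (tn + fp),
        "f1": f1_from(precision, recall),
        # MCC uses all four cells, so it stays informative under imbalance.
        "mcc": (tp * tn - fp * fn) / denom if denom else 0.0,
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=90, fp=10, tn=85, fn=15)

# Harmonic mean of the precision and recall quoted in the abstract.
f1_check = f1_from(0.9151, 0.9163)
```

Because F1 is a harmonic mean, it always lies between precision and recall, which gives a quick consistency check on any reported set of figures; for a precision of 91.51% and a recall of 91.63%, the formula yields roughly 91.57%.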

Clarity and Structure

The article is well-organized, with a coherent presentation of the problem, proposed solution, and evaluation. Explanations for each pipeline stage are clear, contributing to strong interpretability. The language is accessible while maintaining the technical precision necessary for topics such as feature selection, class imbalance, and ensemble learning. That said, visual aids such as diagrams or flowcharts of the pipeline would further improve clarity, especially for readers less familiar with behavioral clustering and machine learning frameworks.

Results and Analysis

The interpretation of the results is thorough and data-driven, highlighting the pipeline’s superior performance across various benchmarks. The comparison with existing techniques is detailed and convincingly supports the study’s claims. The use of multiple performance indicators allows for a nuanced understanding of how the proposed system outperforms previous models in detecting autism and handling behavioral variability.

IJ Publication Publisher

Respected Sir,

Thank you for your valuable feedback highlighting the strengths of our ensemble-based ASD-Pipeline, especially in feature selection and behavioral grouping. We acknowledge your constructive points regarding external validation and visualization, which are important for improving model generalizability and clarity.

Thank you sincerely for your thoughtful review.

Publisher

IJ Publication

Reviewer

PRONOY CHOPRA

Paper Category

Computer Engineering

Journal Name

IJNRD - INTERNATIONAL JOURNAL OF NOVEL RESEARCH AND DEVELOPMENT

p-ISSN

e-ISSN

2456-4184
