Abstract
Feature engineering remains a critical and resource-intensive phase of the machine learning (ML) lifecycle, especially in large-scale, heterogeneous data ecosystems. This paper investigates how automated machine learning (AutoML) and data mining techniques can be systematically orchestrated to build scalable and adaptive feature engineering pipelines. We present a synthesis of the existing literature and introduce architectural strategies that support both computational scalability and semantic alignment across disparate data sources. Flowcharts and tabular summaries illustrate the challenges and solutions in constructing robust, automated feature transformation pipelines. Our findings suggest that integrating AutoML with knowledge-driven feature selection improves model performance and generalization across diverse domains.