Paper Title

Essential Machine Learning Models: Comprehensive Study of Decision Trees and Regression Models

Keywords

  • Machine Learning
  • Supervised Learning
  • Regression Models
  • Decision Tree Algorithms
  • Ridge, Lasso, and Elastic Net Regression
  • Impurity Measures
  • Cost-Complexity Pruning (CCP)
  • Scikit-Learn (sklearn)
  • Model Evaluation and Comparison

Article Type

Research Article

Journal

TIJER - INTERNATIONAL RESEARCH JOURNAL

Published On

December, 2025

Abstract

Machine learning plays an essential role in modern data-driven decision systems, where accurate prediction, model interpretability, and computational efficiency are critical, reflecting the foundational principles established in statistical learning theory by Vapnik and Mitchell [1], [2]. This paper presents a comprehensive study of two fundamental supervised learning families, regression models and decision-tree algorithms, along with their practical implementation in the Scikit-Learn framework [14]. The regression component covers linear, polynomial, and regularized models, including Ridge, Lasso, and Elastic Net, highlighting their mathematical foundations introduced by Gauss and Legendre [3], [4] and expanded through modern regularization techniques developed by Hoerl and Kennard, Tibshirani, and Zou and Hastie [5], [7]. The decision-tree section examines classification and regression trees through impurity-based splitting criteria, such as Gini impurity, entropy, and variance reduction, grounded in the seminal work of Hunt et al. [8], Quinlan’s ID3 and C4.5 algorithms [9], [10], and the CART methodology introduced by Breiman et al. [11]. It further evaluates pruning techniques, particularly cost-complexity pruning, which are used to address overfitting and improve generalization performance. A comparative analysis illustrates the structural and functional differences between linear models and decision trees, emphasizing their complementary strengths in handling linear trends, nonlinear patterns, and heterogeneous feature interactions, consistent with insights from the modern ML literature [16], [17]. Scikit-Learn’s unified API is showcased as a versatile platform that streamlines model construction, tuning, and evaluation, supporting reproducible and scalable ML workflows [14].
Overall, this work provides a structured and accessible framework for understanding, comparing, and applying regression and decision-tree models, offering practical insights for developing robust and interpretable machine learning solutions.
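
As a minimal illustration of the impurity-based splitting criteria named in the abstract, the sketch below computes Gini impurity, entropy, and variance for a node, and the impurity decrease produced by a candidate split. It uses the standard textbook definitions rather than code from the paper; the function names and the toy data are hypothetical and chosen only for illustration.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over class proportions p_k."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum(p_k * log2(p_k))."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def variance(values):
    """Population variance, the impurity used for regression targets."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def impurity_decrease(parent, left, right, impurity):
    """Parent impurity minus the size-weighted impurity of the two children."""
    n = len(parent)
    weighted = (len(left) / n) * impurity(left) + (len(right) / n) * impurity(right)
    return impurity(parent) - weighted


# Hypothetical toy data: a split that separates the two classes perfectly.
parent_labels = ["a", "a", "b", "b"]
left_labels, right_labels = ["a", "a"], ["b", "b"]

gini_gain = impurity_decrease(parent_labels, left_labels, right_labels, gini)
info_gain = impurity_decrease(parent_labels, left_labels, right_labels, entropy)

# Hypothetical regression targets split into a low group and a high group.
parent_targets = [1.0, 1.0, 3.0, 3.0]
var_reduction = impurity_decrease(parent_targets, [1.0, 1.0], [3.0, 3.0], variance)

print(gini_gain, info_gain, var_reduction)
```

A tree learner would evaluate such a decrease for every candidate split and keep the largest one. Library implementations such as Scikit-Learn add efficiency and stopping rules that this sketch omits.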
