<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE article
  PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.1 20151215//EN" "https://jats.nlm.nih.gov/publishing/1.1/JATS-journalpublishing1.dtd">
<article article-type="research-article" dtd-version="1.1" specific-use="sps-1.9" xml:lang="en" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
	<front>
		<journal-meta>
			<journal-id journal-id-type="publisher-id">rfing</journal-id>
			<journal-title-group>
				<journal-title>Revista Facultad de Ingeniería</journal-title>
				<abbrev-journal-title abbrev-type="publisher">Rev. Fac. Ing.</abbrev-journal-title>
			</journal-title-group>
			<issn pub-type="ppub">0121-1129</issn>
			<publisher>
				<publisher-name>Universidad Pedagógica y Tecnológica de Colombia</publisher-name>
			</publisher>
		</journal-meta>
		<article-meta>
			<article-id pub-id-type="doi">10.19053/01211129.v34.n73.2025.20194</article-id>
			<article-id pub-id-type="other">5</article-id>
			<article-categories>
				<subj-group subj-group-type="heading">
					<subject>Articles</subject>
				</subj-group>
			</article-categories>
			<title-group>
				<article-title>EFFORT ESTIMATION IN SOFTWARE DEVELOPMENT PROJECTS USING SUPERVISED MACHINE LEARNING TECHNIQUES</article-title>
				<trans-title-group xml:lang="es">
					<trans-title>Estimación de esfuerzo en proyectos de desarrollo de software utilizando técnicas de aprendizaje automático</trans-title>
				</trans-title-group>
				<trans-title-group xml:lang="pt">
					<trans-title>ESTIMATIVA DE ESFORÇO EM PROJETOS DE DESENVOLVIMENTO DE SOFTWARE UTILIZANDO TÉCNICAS DE APRENDIZADO DE MÁQUINA</trans-title>
				</trans-title-group>
			</title-group>
			<contrib-group>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0002-4492-1915</contrib-id>
					<name>
						<surname>Getial-Barragán</surname>
						<given-names>Jesús</given-names>
					</name>
					<xref ref-type="aff" rid="aff1">1</xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0000-0002-0006-6654</contrib-id>
					<name>
						<surname>Timarán-Pereira</surname>
						<given-names>Ricardo</given-names>
					</name>
					<xref ref-type="aff" rid="aff2">2</xref>
				</contrib>
				<contrib contrib-type="author">
					<contrib-id contrib-id-type="orcid">0009-0009-4007-8420</contrib-id>
					<name>
						<surname>Bastidas-Torres</surname>
						<given-names>David-Ramiro</given-names>
					</name>
					<xref ref-type="aff" rid="aff3">3</xref>
				</contrib>
			</contrib-group>
			<aff id="aff1">
				<label>1</label>
				<institution content-type="original"> Universidad de Nariño (Pasto-Nariño, Colombia). jesusgetial@udenar.edu.co</institution>
				<institution content-type="normalized">Universidad de Nariño</institution>
				<institution content-type="orgname">Universidad de Nariño</institution>
				<addr-line>
					<city>Pasto</city>
					<state>Nariño</state>
				</addr-line>
				<country country="CO">Colombia</country>
				<email>jesusgetial@udenar.edu.co</email>
			</aff>
			<aff id="aff2">
				<label>2</label>
				<institution content-type="original"> Universidad de Nariño (Pasto-Nariño, Colombia). ritimar@udenar.edu.co</institution>
				<institution content-type="normalized">Universidad de Nariño</institution>
				<institution content-type="orgname">Universidad de Nariño</institution>
				<addr-line>
					<city>Pasto</city>
					<state>Nariño</state>
				</addr-line>
				<country country="CO">Colombia</country>
				<email>ritimar@udenar.edu.co</email>
			</aff>
			<aff id="aff3">
				<label>3</label>
				<institution content-type="original"> Pontificia Universidad Javeriana (Cali-Valle del Cauca, Colombia). davba@javerianacali.edu.co</institution>
				<institution content-type="normalized">Pontificia Universidad Javeriana</institution>
				<institution content-type="orgname">Pontificia Universidad Javeriana</institution>
				<addr-line>
					<city>Cali</city>
					<state>Valle del Cauca</state>
				</addr-line>
				<country country="CO">Colombia</country>
				<email>davba@javerianacali.edu.co</email>
			</aff>
			<author-notes>
				<fn fn-type="equal" id="fn2">
					<label>AUTHORS’ CONTRIBUTION</label>
					<p><bold>Jesús-Alberto Getial-Barragán:</bold> Research, Data curation, Software, Writing original draft. <bold>Ricardo Timarán-Pereira:</bold> Conceptualization, Supervision, Validation, Writing - review &amp; editing. <bold>David-Ramiro Bastidas-Torres:</bold> Formal analysis, Writing - review &amp; editing.</p>
				</fn>
				<fn fn-type="conflict" id="fn3">
					<label>CONFLICT OF INTEREST</label>
					<p> The authors declare no conflict of interest.</p>
				</fn>
			</author-notes>
			<pub-date date-type="pub" publication-format="electronic">
				<day>17</day>
				<month>10</month>
				<year>2025</year>
			</pub-date>
			<pub-date date-type="collection" publication-format="electronic">
				<season>Jul-Sep</season>
				<year>2025</year>
			</pub-date>
			<volume>34</volume>
			<issue>73</issue>
			<elocation-id>a5</elocation-id>
			<history>
				<date date-type="received">
					<day>24</day>
					<month>06</month>
					<year>2025</year>
				</date>
				<date date-type="accepted">
					<day>09</day>
					<month>09</month>
					<year>2025</year>
				</date>
				<date date-type="pub">
					<day>30</day>
					<month>09</month>
					<year>2025</year>
				</date>
			</history>
			<permissions>
				<license license-type="open-access" xlink:href="https://creativecommons.org/licenses/by/4.0/" xml:lang="en">
					<license-p>This is an open-access article distributed under the terms of the Creative Commons Attribution License</license-p>
				</license>
			</permissions>
			<abstract>
				<title>ABSTRACT</title>
				<p>This paper presents the results of a research project focused on effort estimation in software development using supervised machine learning techniques. To structure the analysis process, the CRISP-DM methodology was adopted, given that it is recognized for its comprehensive approach and wide acceptance in data mining. The study was based on a dataset provided by the International Software Benchmarking Standards Group (ISBSG), to which rigorous cleaning, transformation, and variable selection procedures were applied. Four effort categories were defined, and key variables for their classification were identified, including the functional size of the software, team productivity, programming language, and the implementation platform. Eight predictive models were developed using representative supervised learning algorithms: AdaBoost, Decision Trees, Random Forests, SVM, Multilayer Perceptron, KNN, Naive Bayes, and Logistic Regression. Their evaluation was carried out using metrics such as the F1-score, MCC, ROC-AUC, Gini index, accuracy, and standard deviations to assess performance and stability. The results show that tree-based models, particularly Random Forest, offer superior performance, achieving an accuracy of 80%. It is concluded that having systematized and high-quality data is fundamental for building reliable predictive models. As future work, the study proposes examining additional ensemble configurations, incorporating new algorithms, and using updated versions of the ISBSG repository.</p>
			</abstract>
			<trans-abstract xml:lang="es">
				<title>RESUMEN</title>
				<p>Este artículo presenta los resultados de un proyecto de investigación orientado a estimar el esfuerzo en proyectos de desarrollo de software mediante técnicas supervisadas de aprendizaje automático. El análisis se desarrolló siguiendo la metodología CRISP-DM, reconocida por su enfoque estructurado y amplia aceptación en minería de datos. El estudio utilizó el conjunto de datos del International Software Benchmarking Standards Group (ISBSG), al cual se aplicaron rigurosos procesos de limpieza, transformación y selección de variables. Se definieron cuatro categorías de esfuerzo y se identificaron variables clave para su clasificación, como el tamaño funcional del software, la productividad del equipo, el lenguaje de programación y la plataforma de implementación. Se construyeron ocho modelos predictivos empleando algoritmos representativos del aprendizaje supervisado: AdaBoost, Árboles de Decisión, Bosques Aleatorios, SVM, Perceptrón Multicapa, KNN, Naive Bayes y Regresión Logística. Su desempeño se evaluó mediante métricas como F1-score, MCC, ROC-AUC, índice de Gini, exactitud y desviaciones estándar, con el propósito de medir rendimiento y estabilidad. Los resultados muestran que los modelos basados en árboles, especialmente el Bosque Aleatorio, alcanzan el mejor desempeño, con una exactitud del 80 %. El estudio destaca contar con datos sistematizados y de alta calidad para construir modelos predictivos confiables. Como trabajo futuro se propone explorar configuraciones de ensamble, incorporar nuevos algoritmos y usar versiones actualizadas del ISBSG.</p>
			</trans-abstract>
			<trans-abstract xml:lang="pt">
				<title>RESUMO</title>
				<p>Este artigo apresenta os resultados de um projeto de pesquisa com o objetivo de estimar o esforço em projetos de desenvolvimento de software utilizando técnicas de aprendizado de máquina supervisionado. A análise foi desenvolvida seguindo a metodologia CRISP-DM, reconhecida por sua abordagem estruturada e ampla aceitação em mineração de dados. O estudo utilizou o conjunto de dados do International Software Benchmarking Standards Group (ISBSG), ao qual foram aplicados processos rigorosos de limpeza, transformação e seleção de variáveis. Quatro categorias de esforço foram definidas e variáveis-chave para sua classificação foram identificadas, como o tamanho funcional do software, a produtividade da equipe, a linguagem de programação e a plataforma de implementação. Oito modelos preditivos foram construídos utilizando algoritmos representativos de aprendizado supervisionado: AdaBoost, Árvores de Decisão, Florestas Aleatórias, SVM, Perceptron Multicamadas, KNN, Naive Bayes e Regressão Logística. Seu desempenho foi avaliado utilizando métricas como F1-score, MCC, ROC-AUC, índice de Gini, acurácia e desvios padrão, com o objetivo de mensurar o desempenho e a estabilidade. Os resultados mostram que os modelos baseados em árvores, especialmente a Floresta Aleatória, alcançam o melhor desempenho, com uma precisão de 80%. O estudo destaca a importância de se ter dados sistematizados e de alta qualidade para a construção de modelos preditivos confiáveis. Trabalhos futuros propõem explorar configurações de ensemble, incorporar novos algoritmos e utilizar versões atualizadas do ISBSG.</p>
			</trans-abstract>
			<kwd-group xml:lang="en">
				<title>Keywords:</title>
				<kwd>decision tree</kwd>
				<kwd>effort estimation</kwd>
				<kwd>machine learning</kwd>
				<kwd>random forest</kwd>
				<kwd>software development</kwd>
			</kwd-group>
			<kwd-group xml:lang="es">
				<title>Palabras clave:</title>
				<kwd>aprendizaje automático</kwd>
				<kwd>árbol de decisión</kwd>
				<kwd>bosque aleatorio</kwd>
				<kwd>desarrollo de software</kwd>
				<kwd>estimación de esfuerzo</kwd>
			</kwd-group>
			<kwd-group xml:lang="pt">
				<title>Palavras-chave:</title>
				<kwd>aprendizado de máquina</kwd>
				<kwd>árvore de decisão</kwd>
				<kwd>floresta aleatória</kwd>
				<kwd>desenvolvimento de software</kwd>
				<kwd>estimativa de esforço</kwd>
			</kwd-group>
			<counts>
				<fig-count count="2"/>
				<table-count count="5"/>
				<equation-count count="0"/>
				<ref-count count="18"/>
				<page-count count="8"/>
			</counts>
		</article-meta>
	</front>
	<body>
		<sec sec-type="intro">
			<title>1. INTRODUCTION</title>
			<p>Effort estimation in software projects is critical for resource allocation, scheduling, and cost control <sup>[</sup><xref ref-type="bibr" rid="B1"><sup>1</sup></xref><sup>,</sup><xref ref-type="bibr" rid="B2"><sup>2</sup></xref><sup>]</sup>. However, it remains difficult to predict accurately in dynamic and heterogeneous contexts <sup>[</sup><xref ref-type="bibr" rid="B3"><sup>3</sup></xref><sup>,</sup><xref ref-type="bibr" rid="B4"><sup>4</sup></xref><sup>]</sup>. Traditional approaches, such as parametric models or function points, often fail to capture project complexity or adapt to agile development practices <sup>[</sup><xref ref-type="bibr" rid="B5"><sup>5</sup></xref><sup>]</sup>. Machine learning techniques have been explored as alternatives; however, many studies rely on outdated datasets or evaluate only a few algorithms. Few studies have systematically assessed supervised models using recent ISBSG releases <sup>[</sup><xref ref-type="bibr" rid="B6"><sup>6</sup></xref><sup>,</sup><xref ref-type="bibr" rid="B7"><sup>7</sup></xref><sup>]</sup>. Existing evidence shows gains in accuracy and generalization <sup>[</sup><xref ref-type="bibr" rid="B1"><sup>1</sup></xref><sup>]</sup> through models such as decision trees <sup>[</sup><xref ref-type="bibr" rid="B8"><sup>8</sup></xref><sup>]</sup>, ensemble methods <sup>[</sup><xref ref-type="bibr" rid="B9"><sup>9</sup></xref><sup>,</sup><xref ref-type="bibr" rid="B10"><sup>10</sup></xref><sup>]</sup>, neural networks <sup>[</sup><xref ref-type="bibr" rid="B11"><sup>11</sup></xref><sup>]</sup>, analogy-based approaches <sup>[</sup><xref ref-type="bibr" rid="B12"><sup>12</sup></xref><sup>]</sup>, and hybrid techniques <sup>[</sup><xref ref-type="bibr" rid="B13"><sup>13</sup></xref><sup>]</sup>. Among these, Random Forest <sup>[</sup><xref ref-type="bibr" rid="B14"><sup>14</sup></xref><sup>]</sup> is highly effective due to its robustness with noisy and high-dimensional data.</p>
			<p>This study applies supervised classification models to ISBSG Release 2022R1 <sup>[</sup><xref ref-type="bibr" rid="B15"><sup>15</sup></xref><sup>]</sup> following the CRISP-DM methodology <sup>[</sup><xref ref-type="bibr" rid="B16"><sup>16</sup></xref><sup>]</sup>. Eight algorithms were evaluated: AdaBoost, Decision Tree, Random Forest, SVM, Multilayer Perceptron, KNN, Naïve Bayes, and Logistic Regression. Performance was measured using the F1-score, MCC, Gini, AUC, and accuracy <sup>[</sup><xref ref-type="bibr" rid="B17"><sup>17</sup></xref><sup>]</sup>, complemented by standard deviations to assess stability and generalization. Effort was discretized into four balanced quantile-based categories, reducing the influence of extreme values and improving the interpretability for project managers. Within this framework, functional size, productivity, programming language, and development platform emerged as the most influential factors for distinguishing effort levels.</p>
		</sec>
		<sec sec-type="materials|methods">
			<title>2. MATERIALS AND METHODS</title>
			<p>This study used the ISBSG Release 1 dataset (July 2022), which compiles historical information on software projects and serves as the basis for model analysis and validation <sup>[</sup><xref ref-type="bibr" rid="B15"><sup>15</sup></xref><sup>]</sup>. Data processing and modeling were performed using Python 3 and specialized libraries, such as Pandas, NumPy, and Scikit-learn.</p>
			<p>This research followed the CRISP-DM methodology across five phases. In the business understanding phase, the problem of effort estimation in software projects was defined. Data understanding involved analyzing the ISBSG repository in terms of variable nature, distribution, and quality. </p>
			<p>Data preparation included selecting a representative subset through cleaning, imputation, encoding, normalization, outlier removal, and discretization of the target variable. In the modeling phase, eight supervised algorithms widely discussed in the literature were implemented: AdaBoost, Decision Trees, Random Forest, SVM, Multilayer Perceptron, KNN, Naïve Bayes, and Logistic Regression. </p>
			<p>Finally, in the evaluation phase, standard metrics (F1-score, MCC, ROC-AUC, Gini index, and accuracy) and their standard deviations were applied to assess the model robustness and generalization capability.</p>
		</sec>
		<sec sec-type="results">
			<title>3. RESULTS</title>
			<sec>
				<title>3.1 Exploratory Data Analysis</title>
				<p>From the ISBSG repository, a subset of 3,124 projects and 233 variables (95 numerical and 138 categorical) was extracted. The selection considered Main Frame, Personal Computer, and MultiPlatform environments; functional size ratings A or B; normalization rates between 0.9 and 1.3; and NESMA or IFPUG 4+ counting approaches, excluding lines of code. </p>
				<p>After applying a 30% missing-value threshold per variable, 26 variables were retained, of which 20 were selected based on low collinearity (&lt;0.84, see <xref ref-type="table" rid="t2">Table 2</xref>). Imputation strategies varied by data type and proportion of missing values: mean and median for numerical variables (&lt;10% and 10-30%), and mode or C4.5 decision tree for categorical variables (&lt;10% and 10-30%). </p>
				<p>Numerical variables were normalized to the [0,1] range, whereas Project Year and Resource Level were ordinally encoded. The other categorical variables were transformed using ordinal encoding. Outliers were handled through a combination of Winsorization, IQR, and Z-score techniques. Finally, the target variable (effort) was discretized into four categories according to its quartiles (<xref ref-type="table" rid="t1">Table 1</xref>).</p>
				<p>
					<table-wrap id="t1">
						<label>Table 1</label>
						<caption>
							<title>Categories based on quantiles of the statistical distribution</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
								<col/>
								<col/>
							</colgroup>
							<tbody>
								<tr>
									<td align="center">Category 1</td>
									<td align="center"><italic>-∞&lt;Value&lt;Q1</italic></td>
									<td align="center">Very low effort</td>
									<td align="center"><italic>-∞ &lt;Value&lt; 500 hours</italic></td>
								</tr>
								<tr>
									<td align="center">Category 2</td>
									<td align="center"><italic>Q1&lt;Value&lt;Median</italic></td>
									<td align="center">Moderate effort</td>
									<td align="center"><italic>500 hours &lt;Value&lt; 1050 hours</italic></td>
								</tr>
								<tr>
									<td align="center">Category 3</td>
									<td align="center"><italic>Median&lt;Value&lt;Q3</italic></td>
									<td align="center">High effort</td>
									<td align="center"><italic>1050 hours&lt;Value&lt;2153 hours</italic></td>
								</tr>
								<tr>
									<td align="center">Category 4</td>
									<td align="center"><italic>Q3&lt;Value&lt;∞</italic></td>
									<td align="center">Very high effort</td>
									<td align="center"><italic>2153 hours &lt;Value&lt; ∞</italic></td>
								</tr>
							</tbody>
						</table>
					</table-wrap>
				</p>
				<p>
					<table-wrap id="t2">
						<label>Table 2</label>
						<caption>
							<title>Final dataset variables</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
							</colgroup>
							<thead>
								<tr>
									<th align="center">Numeric variables</th>
									<th align="center">Non-numeric variables</th>
								</tr>
							</thead>
							<tbody>
								<tr>
									<td align="center">
										<p>
											<list list-type="bullet">
												<list-item>
													<p>Functional size</p>
												</list-item>
												<list-item>
													<p>Delivery speed</p>
												</list-item>
												<list-item>
													<p>Total project elapsed time</p>
												</list-item>
												<list-item>
													<p>Project execution year</p>
												</list-item>
												<list-item>
													<p>Adjusted function points</p>
												</list-item>
												<list-item>
													<p>Summary work effort in hours</p>
												</list-item>
												<list-item>
													<p>Resource level</p>
												</list-item>
											</list>
										</p>
									</td>
									<td align="center">
										<p>
											<list list-type="bullet">
												<list-item>
													<p>Industry sector</p>
												</list-item>
												<list-item>
													<p>Primary programming language</p>
												</list-item>
												<list-item>
													<p>Software architecture type</p>
												</list-item>
												<list-item>
													<p>Functional size rating</p>
												</list-item>
												<list-item>
													<p>Application group type</p>
												</list-item>
												<list-item>
													<p>Software development type</p>
												</list-item>
												<list-item>
													<p>Development platform</p>
												</list-item>
												<list-item>
													<p>Language type</p>
												</list-item>
												<list-item>
													<p>Counting approach</p>
												</list-item>
												<list-item>
													<p>Relative size</p>
												</list-item>
												<list-item>
													<p>Implementation date</p>
												</list-item>
												<list-item>
													<p>Recording method</p>
												</list-item>
												<list-item>
													<p>Function points standard</p>
												</list-item>
											</list>
										</p>
									</td>
								</tr>
							</tbody>
						</table>
					</table-wrap>
				</p>
			</sec>
			<sec>
				<title>3.2 Results</title>
				<p>Hyperparameter optimization was performed using a grid search to select the configurations with the highest cross-validation accuracy. As shown in <xref ref-type="table" rid="t3">Table 3</xref>, Random Forest and Decision Tree achieved the best results in accuracy, F1, and MCC, which is consistent with the findings of Nassif et al. <sup>[</sup><xref ref-type="bibr" rid="B6"><sup>6</sup></xref><sup>]</sup> and Zakrani et al. <sup>[</sup><xref ref-type="bibr" rid="B14"><sup>14</sup></xref><sup>]</sup>, who also highlighted the stability of Random Forest in high-dimensional settings. To assess generalization and robustness, three partitioning schemes were applied (10%, 20%, and 30% for validation), with the remainder allocated for training and testing. This procedure enabled a sensitivity analysis across different configurations and prevented the results from depending on a specific data split. Among the tested combinations, the 10% validation, 20% test, and 70% training split offered the best balance for MSE, accuracy, and stability, yielding more consistent metrics than the other alternatives. </p>
				<p>The detailed results of these configurations are presented in <xref ref-type="table" rid="t4">Tables 4</xref> and <xref ref-type="table" rid="t5">5</xref>, corresponding to the Decision Tree and Random Forest, respectively. <xref ref-type="fig" rid="f1">Figures 1</xref> and <xref ref-type="fig" rid="f2">2</xref> illustrate the evolution of the mean squared error (MSE) for the training, validation, and test sets as the model depth increases.</p>
				<p>
					<table-wrap id="t3">
						<label>Table 3</label>
						<caption>
							<title>Metrics result for machine learning model selection</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
							</colgroup>
							<thead>
								<tr>
									<th align="center">Model</th>
									<th align="center">Accuracy</th>
									<th align="center">F1</th>
									<th align="center">MCC</th>
									<th align="center">Gini</th>
									<th align="center">ROC</th>
								</tr>
							</thead>
							<tbody>
								<tr>
									<td align="center"><italic>Adaboost</italic></td>
									<td align="center">0.6906</td>
									<td align="center">0.6944</td>
									<td align="center">0.5872</td>
									<td align="center">0.5634</td>
									<td align="center">0.8271</td>
								</tr>
								<tr>
									<td align="center"><italic>Decision Tree</italic></td>
									<td align="center">0.7771</td>
									<td align="center">0.7791</td>
									<td align="center">0.7033</td>
									<td align="center">0.7999</td>
									<td align="center">0.8999</td>
								</tr>
								<tr>
									<td align="center"><italic>Random Forest</italic></td>
									<td align="center">0.8035</td>
									<td align="center">0.8063</td>
									<td align="center">0.7377</td>
									<td align="center">0.8844</td>
									<td align="center">0.9422</td>
								</tr>
								<tr>
									<td align="center"><italic>SVM</italic></td>
									<td align="center">0.5411</td>
									<td align="center">0.5432</td>
									<td align="center">0.3934</td>
									<td align="center">0.4980</td>
									<td align="center">0.7490</td>
								</tr>
								<tr>
									<td align="center"><italic>Multilayer Perceptron</italic></td>
									<td align="center">0.6906</td>
									<td align="center">0.6949</td>
									<td align="center">0.5872</td>
									<td align="center">0.7889</td>
									<td align="center">0.8945</td>
								</tr>
								<tr>
									<td align="center"><italic>KNN</italic></td>
									<td align="center">0.5117</td>
									<td align="center">0.5095</td>
									<td align="center">0.3508</td>
									<td align="center">0.5089</td>
									<td align="center">0.7545</td>
								</tr>
								<tr>
									<td align="center"><italic>Naive Bayes</italic></td>
									<td align="center">0.3006</td>
									<td align="center">0.2488</td>
									<td align="center">0.0897</td>
									<td align="center">0.1921</td>
									<td align="center">0.5960</td>
								</tr>
								<tr>
									<td align="center"><italic>Logistic</italic></td>
									<td align="center">0.3974</td>
									<td align="center">0.3826</td>
									<td align="center">0.2061</td>
									<td align="center">0.3244</td>
									<td align="center">0.6622</td>
								</tr>
							</tbody>
						</table>
					</table-wrap>
				</p>
				<p>
					<table-wrap id="t4">
						<label>Table 4</label>
						<caption>
							<title>Metrics result for the best partition in Decision Tree</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
							</colgroup>
							<tbody>
								<tr>
									<td align="center">Dataset</td>
									<td align="center">%Validation</td>
									<td align="center">%Test</td>
									<td align="center">MSE</td>
									<td align="center">Accuracy</td>
									<td align="center">Recall</td>
									<td align="center">F1</td>
									<td align="center">Std MSE</td>
									<td align="center">Std Accuracy</td>
								</tr>
								<tr>
									<td align="center">Validation</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.415</td>
									<td align="center">0.732</td>
									<td align="center">0.736</td>
									<td align="center">0.730</td>
									<td align="center">0.333</td>
									<td align="center">0.104</td>
								</tr>
								<tr>
									<td align="center">Training</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.202</td>
									<td align="center">0.877</td>
									<td align="center">0.876</td>
									<td align="center">0.870</td>
									<td align="center">0.368</td>
									<td align="center">0.139</td>
								</tr>
								<tr>
									<td align="center">Test</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.329</td>
									<td align="center">0.787</td>
									<td align="center">0.786</td>
									<td align="center">0.781</td>
									<td align="center">0.346</td>
									<td align="center">0.112</td>
								</tr>
							</tbody>
						</table>
					</table-wrap>
				</p>
				<p>
					<table-wrap id="t5">
						<label>Table 5</label>
						<caption>
							<title>Metrics result for the best partition in Random Forest</title>
						</caption>
						<table>
							<colgroup>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
								<col/>
							</colgroup>
							<tbody>
								<tr>
									<td align="center">Dataset</td>
									<td align="center">%Validation</td>
									<td align="center">%Test</td>
									<td align="center">MSE</td>
									<td align="center">Accuracy</td>
									<td align="center">Recall</td>
									<td align="center">F1</td>
									<td align="center">Std MSE</td>
									<td align="center">Std Accuracy</td>
								</tr>
								<tr>
									<td align="center">Validation</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.240</td>
									<td align="center">0.821</td>
									<td align="center">0.819</td>
									<td align="center">0.823</td>
									<td align="center">0.024</td>
									<td align="center">0.009</td>
								</tr>
								<tr>
									<td align="center">Training</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.022</td>
									<td align="center">0.984</td>
									<td align="center">0.984</td>
									<td align="center">0.984</td>
									<td align="center">0.003</td>
									<td align="center">0.002</td>
								</tr>
								<tr>
									<td align="center">Test</td>
									<td align="center">10</td>
									<td align="center">20</td>
									<td align="center">0.193</td>
									<td align="center">0.848</td>
									<td align="center">0.847</td>
									<td align="center">0.848</td>
									<td align="center">0.006</td>
									<td align="center">0.006</td>
								</tr>
							</tbody>
						</table>
					</table-wrap>
				</p>
				<p>
					<fig id="f1">
						<label>Figure 1</label>
						<caption>
							<title><italic>MSE vs. Decision Tree Depth</italic></title>
						</caption>
						<graphic xlink:href="https://revistas.uptc.edu.co/index.php/ingenieria/article/download/20194/version/17733/16804/105541/0121-1129-rfing-34-73-a5-gf1.jpg"/>
					</fig>
				</p>
				<p>
					<fig id="f2">
						<label>Figure 2</label>
						<caption>
							<title><italic>MSE vs Random Forest Depth</italic></title>
						</caption>
						<graphic xlink:href="https://revistas.uptc.edu.co/index.php/ingenieria/article/download/20194/version/17733/16804/105542/0121-1129-rfing-34-73-a5-gf2.jpg"/>
					</fig>
				</p>
			</sec>
		</sec>
		<sec sec-type="discussion">
			<title>4. DISCUSSION</title>
			<p>Based on the results in <xref ref-type="table" rid="t3">Table 3</xref>, Random Forest emerged as the most robust model, achieving an accuracy of 0.8035, a macro F1 of 0.8063, the highest MCC (0.7377), a Gini coefficient of 0.8844, and an ROC-AUC of 0.9422, demonstrating stable classification capability. Decision Tree ranked second with an accuracy of 0.7771, macro F1 of 0.7791, MCC of 0.7033, Gini of 0.7999, and ROC-AUC of 0.8999. In both cases, tree-based models significantly outperformed AdaBoost and Multilayer Perceptron (accuracy near 0.69 and MCC of 0.5872), while SVM, KNN, Naïve Bayes, and Logistic Regression showed weaker performance, with accuracy not exceeding 0.40 in some cases, which limits their generalization capability. Beyond competitive performance, Decision Tree enabled the derivation of interpretable rules linking Adjusted Function Points (AFP), Productivity Delivery Rate (PDR), execution time, architecture, and programming language with four effort categories: (1) very low (≤ 500 hours), associated with small AFP, PDR ≤ 4.24, standard architectures, and short durations; (2) moderate (500-1050 hours), characterized by intermediate PDR, longer delivery times, and common languages; (3) high (1050-2153 hours), defined by extended execution, accelerated PDR, and specialized languages or complex architectures; and (4) very high (&gt; 2153 hours), associated with moderate AFP, PDR &gt; 25, longer durations, and less common platforms, implying greater uncertainty and planning risk.</p>
			<p>Random Forest, in contrast, showed signs of overfitting, with training accuracy near 98.5% versus 84.7% in testing, but maintained stability across metrics such as MSE and accuracy, with reduced standard deviations. The performance was sensitive to the validation and test proportions. The 10% validation and 20% test split yielded the best balance (84.7% accuracy and MSE of 0.192), whereas larger proportions reduced the learning efficiency. The number of estimators improved robustness until stabilizing between 50 and 100, in combination with moderate depths (8-12 nodes), where generalization was optimized without excessive computational cost.</p>
			<p>In summary, both Decision Tree and Random Forest proved to be the most effective approaches. The former, for its interpretability and ability to generate useful classification rules, and the latter, for its higher stability, accuracy, and robustness, though requiring careful hyperparameter tuning to avoid overfitting and to maintain a balance between performance and efficiency. While Decision Tree provided interpretability and computational efficiency, its lower generalization and sensitivity to data partitioning reduce its reliability in demanding scenarios. In contrast, Random Forest delivered a more consistent and robust performance, with lower variance and stability across variable configurations, and an optimal range of 50-100 estimators that balances accuracy and computational cost. These characteristics position Random Forest as the most suitable option for effort estimation on the ISBSG dataset, offering a superior balance of accuracy, stability, and practical applicability in real-world contexts.</p>
		</sec>
		<sec sec-type="conclusions">
			<title>5. CONCLUSIONS</title>
			<p>Discretizing effort into four balanced classes facilitated the classification and improved interpretability. Random Forest proved to be the most effective model (80.3% accuracy), standing out for its generalization capability and stability within an optimal range of 50-100 estimators. Decision Tree, although less accurate, provided interpretable rules that may be valuable in contexts with limited resources. The other models exhibited overfitting or performance below 55%.</p>
			<p>The optimal configuration was identified as 10% validation and 20% testing. The key variables influencing the classification included software size, team productivity, development platform, and programming language. A relevant limitation was the reduction of the ISBSG dataset from 233 to 26 variables due to missing data, underscoring the need to improve dataset quality.</p>
			<p>From a practical perspective, a model with 80% accuracy can reduce uncertainty in the early project phases and support better resource allocation. Future work will explore hybrid models integrating agility metrics <sup>[</sup><xref ref-type="bibr" rid="B2"><sup>2</sup></xref><sup>,</sup><xref ref-type="bibr" rid="B18"><sup>18</sup></xref><sup>]</sup> and explainable AI approaches to strengthen the utility and transparency of effort estimation.</p>
		</sec>
	</body>
	<back>
		<ref-list>
			<title>REFERENCES</title>
			<ref id="B1">
				<label>[1]</label>
				<mixed-citation>[1] V. Thakur, K. Dutta, “Machine learning based effort estimation models for software development projects related datasets with diverse features,” in <italic>Proceedings of the 2nd International Conference on Computational Intelligence, Communication Technology and Networks</italic>, Ghaziabad, India, 2025, pp. 807-813. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/CICTN64563.2025.10932601">https://doi.org/10.1109/CICTN64563.2025.10932601</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Thakur</surname>
							<given-names>V.</given-names>
						</name>
						<name>
							<surname>Dutta</surname>
							<given-names>K.</given-names>
						</name>
					</person-group>
					<source>Machine learning based effort estimation models for software development projects related datasets with diverse features</source>
					<conf-name>Proceedings of the 2nd International Conference on Computational Intelligence, Communication Technology and Networks</conf-name>
					<publisher-loc>Ghaziabad, India</publisher-loc>
					<year>2025</year>
					<fpage>807</fpage>
					<lpage>813</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/CICTN64563.2025.10932601">https://doi.org/10.1109/CICTN64563.2025.10932601</ext-link>
				</element-citation>
			</ref>
			<ref id="B2">
				<label>[2]</label>
				<mixed-citation>[2] X. Zhao, X. Xiong, Z. Mansor, R. Razali, M. Z. Ahmad Nazri, L. Li, “A data-driven cost estimation model for agile development based on Kolmogorov-Arnold networks and AdamW optimization,” <italic>Journal of King Saud University - Computer and Information Sciences</italic>, <italic>vol.</italic> 
 <italic>37</italic>, e85, 2025. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s44443-025-00058-7">https://doi.org/10.1007/s44443-025-00058-7</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Zhao</surname>
							<given-names>X.</given-names>
						</name>
						<name>
							<surname>Xiong</surname>
							<given-names>X.</given-names>
						</name>
						<name>
							<surname>Mansor</surname>
							<given-names>Z.</given-names>
						</name>
						<name>
							<surname>Razali</surname>
							<given-names>R.</given-names>
						</name>
						<name>
							<surname>Ahmad Nazri</surname>
							<given-names>M. Z.</given-names>
						</name>
						<name>
							<surname>Li</surname>
							<given-names>L.</given-names>
						</name>
					</person-group>
					<article-title>A data-driven cost estimation model for agile development based on Kolmogorov-Arnold networks and AdamW optimization</article-title>
					<source>Journal of King Saud University - Computer and Information Sciences</source>
					<volume>37</volume>
					<size units="pages">e85</size>
					<year>2025</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s44443-025-00058-7">https://doi.org/10.1007/s44443-025-00058-7</ext-link>
				</element-citation>
			</ref>
			<ref id="B3">
				<label>[3]</label>
				<mixed-citation>[3] J. A. Timana Peña, C. Piñeros Rodríguez, L. Sierra Martínez, D. Peluffo Ordóñez, “Effort estimation in agile software development: A systematic map study,” <italic>INGE CUC</italic>, <italic>vol.</italic> 
 <italic>19</italic>, no. 1, pp. 22-36, 2023. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17981/ingecuc.19.1.2023.03">https://doi.org/10.17981/ingecuc.19.1.2023.03</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Timana Peña</surname>
							<given-names>J. A.</given-names>
						</name>
						<name>
							<surname>Piñeros Rodríguez</surname>
							<given-names>C.</given-names>
						</name>
						<name>
							<surname>Sierra Martínez</surname>
							<given-names>L.</given-names>
						</name>
						<name>
							<surname>Peluffo Ordóñez</surname>
							<given-names>D.</given-names>
						</name>
					</person-group>
					<article-title>Effort estimation in agile software development: A systematic map study</article-title>
					<source>INGE CUC</source>
					<volume>19</volume>
					<issue>1</issue>
					<fpage>22</fpage>
					<lpage>36</lpage>
					<year>2023</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.17981/ingecuc.19.1.2023.03">https://doi.org/10.17981/ingecuc.19.1.2023.03</ext-link>
				</element-citation>
			</ref>
			<ref id="B4">
				<label>[4]</label>
				<mixed-citation>[4] M. Perkusich, L. C. e Silva, A. Costa, F. Ramos, R. Saraiva, A. Freire, <italic>et al</italic>., “Intelligent software engineering in the context of agile software development: A systematic literature review,” <italic>Information and Software Technology</italic>, <italic>vol.</italic> 
 <italic>119</italic>, e106241, Mar. 2020. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.infsof.2019.106241">https://doi.org/10.1016/j.infsof.2019.106241</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Perkusich</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>e Silva</surname>
							<given-names>L. C.</given-names>
						</name>
						<name>
							<surname>Costa</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Ramos</surname>
							<given-names>F.</given-names>
						</name>
						<name>
							<surname>Saraiva</surname>
							<given-names>R.</given-names>
						</name>
						<name>
							<surname>Freire</surname>
							<given-names>A.</given-names>
						</name>
						<etal/>
					</person-group>
					<article-title>Intelligent software engineering in the context of agile software development: A systematic literature review</article-title>
					<source>Information and Software Technology</source>
					<volume>119</volume>
					<size units="pages">e106241</size>
					<year>2020</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.infsof.2019.106241">https://doi.org/10.1016/j.infsof.2019.106241</ext-link>
				</element-citation>
			</ref>
			<ref id="B5">
				<label>[5]</label>
				<mixed-citation>[5] M. Fernández-Diego, F. González-Ladrón-de-Guevara, “Potential and limitations of the ISBSG dataset in enhancing software engineering research: A mapping review,” <italic>Information and Software Technology</italic>, <italic>vol.</italic> 
 <italic>56</italic>, no. 6, pp. 527-544, Jun. 2014. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.infsof.2014.01.003">https://doi.org/10.1016/j.infsof.2014.01.003</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Fernández-Diego</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>González-Ladrón-de-Guevara</surname>
							<given-names>F.</given-names>
						</name>
					</person-group>
					<article-title>Potential and limitations of the ISBSG dataset in enhancing software engineering research: A mapping review</article-title>
					<source>Information and Software Technology</source>
					<volume>56</volume>
					<issue>6</issue>
					<fpage>527</fpage>
					<lpage>544</lpage>
					<year>2014</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1016/j.infsof.2014.01.003">https://doi.org/10.1016/j.infsof.2014.01.003</ext-link>
				</element-citation>
			</ref>
			<ref id="B6">
				<label>[6]</label>
				<mixed-citation>[6] A. B. Nassif, M. Azzeh, L. F. Capretz, D. Ho, “A comparison between decision trees and decision tree forest models for software development effort estimation,” in <italic>Third</italic> 
 <italic>International Conference on Communications and Information Technology (ICCIT)</italic>, Beirut, Lebanon, 2013, pp. 220-224. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICCITechnology.2013.6579553">https://doi.org/10.1109/ICCITechnology.2013.6579553</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Nassif</surname>
							<given-names>A. B.</given-names>
						</name>
						<name>
							<surname>Azzeh</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Capretz</surname>
							<given-names>L. F.</given-names>
						</name>
						<name>
							<surname>Ho</surname>
							<given-names>D.</given-names>
						</name>
					</person-group>
					<source>A comparison between decision trees and decision tree forest models for software development effort estimation</source>
					<conf-name>Third International Conference on Communications and Information Technology (ICCIT)</conf-name>
					<publisher-loc>Beirut, Lebanon</publisher-loc>
					<year>2013</year>
					<fpage>220</fpage>
					<lpage>224</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICCITechnology.2013.6579553">https://doi.org/10.1109/ICCITechnology.2013.6579553</ext-link>
				</element-citation>
			</ref>
			<ref id="B7">
				<label>[7]</label>
				<mixed-citation>[7] Ritu and P. Bhambri, “Enhancing software development effort estimation with a cloud-based data framework using use case points, fuzzy logic, and machine learning,” <italic>Discover Computing</italic>, <italic>vol.</italic> 
 <italic>28</italic>, e143, 2025. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10791-025-09668-1">https://doi.org/10.1007/s10791-025-09668-1</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Ritu</surname>
							<given-names/>
						</name>
						<name>
							<surname>Bhambri</surname>
							<given-names>P.</given-names>
						</name>
					</person-group>
					<article-title>Enhancing software development effort estimation with a cloud-based data framework using use case points, fuzzy logic, and machine learning</article-title>
					<source>Discover Computing</source>
					<volume>28</volume>
					<size units="pages">e143</size>
					<year>2025</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10791-025-09668-1">https://doi.org/10.1007/s10791-025-09668-1</ext-link>
				</element-citation>
			</ref>
			<ref id="B8">
				<label>[8]</label>
				<mixed-citation>[8] A. Najm, A. Zakrani, A. Marzak, “Decision trees based software development effort estimation: A systematic mapping study,” in <italic>Proceedings of the International Conference on Computer Science and Renewable Energies</italic>, Agadir, Morocco, 2019, pp. 1-6. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICCSRE.2019.8807544">https://doi.org/10.1109/ICCSRE.2019.8807544</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Najm</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Zakrani</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Marzak</surname>
							<given-names>A.</given-names>
						</name>
					</person-group>
					<source>Decision trees based software development effort estimation: A systematic mapping study</source>
					<conf-name>Proceedings of the International Conference on Computer Science and Renewable Energies</conf-name>
					<publisher-loc>Agadir, Morocco</publisher-loc>
					<year>2019</year>
					<fpage>1</fpage>
					<lpage>6</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICCSRE.2019.8807544">https://doi.org/10.1109/ICCSRE.2019.8807544</ext-link>
				</element-citation>
			</ref>
			<ref id="B9">
				<label>[9]</label>
				<mixed-citation>[9] M. Hosni, A. Idri, A. Abran, “Investigating heterogeneous ensembles with filter feature selection for software effort estimation,” in <italic>Proceedings of the ACM International Conference on Software Engineering and Knowledge Engineering</italic>, Pittsburgh, USA, Jul. 2017, pp. 207-220. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3143434.3143456">https://doi.org/10.1145/3143434.3143456</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Hosni</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Idri</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Abran</surname>
							<given-names>A.</given-names>
						</name>
					</person-group>
					<source>Investigating heterogeneous ensembles with filter feature selection for software effort estimation</source>
					<conf-name>Proceedings of the ACM International Conference on Software Engineering and Knowledge Engineering</conf-name>
					<publisher-loc>Pittsburgh, USA</publisher-loc>
					<year>2017</year>
					<fpage>207</fpage>
					<lpage>220</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/3143434.3143456">https://doi.org/10.1145/3143434.3143456</ext-link>
				</element-citation>
			</ref>
			<ref id="B10">
				<label>[10]</label>
				<mixed-citation>[10] I. A. Al-Naimy, M. A. Al-Jawaherry, “Software effort estimation using ensemble learning methods,” <italic>AIP Conference Proceedings</italic>, <italic>vol.</italic> 
 <italic>3264</italic>, no. 1, e040021, Mar. 2025. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1063/5.0259225">https://doi.org/10.1063/5.0259225</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Al-Naimy</surname>
							<given-names>I. A.</given-names>
						</name>
						<name>
							<surname>Al-Jawaherry</surname>
							<given-names>M. A.</given-names>
						</name>
					</person-group>
					<article-title>Software effort estimation using ensemble learning methods</article-title>
					<source>AIP Conference Proceedings</source>
					<volume>3264</volume>
					<issue>1</issue>
					<size units="pages">e040021</size>
					<year>2025</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1063/5.0259225">https://doi.org/10.1063/5.0259225</ext-link>
				</element-citation>
			</ref>
			<ref id="B11">
				<label>[11]</label>
				<mixed-citation>[11] A. G. Priya Varshini, A. Kumari, J. Ramprasath, S. Rishi R, S. Balakrishnan, D. Deepak, “Optimized Convolutional Neural Network Model for Software Effort Estimation,” in <italic>Proceedings of the 2024 Third International Conference on Smart Technologies in Systems and Networking Computing</italic>, Villupuram, India, 2024, pp. 1-6. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICSTSN61422.2024.10671053">https://doi.org/10.1109/ICSTSN61422.2024.10671053</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Priya Varshini</surname>
							<given-names>A. G.</given-names>
						</name>
						<name>
							<surname>Kumari</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Ramprasath</surname>
							<given-names>J.</given-names>
						</name>
						<name>
							<surname>Rishi R</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Balakrishnan</surname>
							<given-names>S.</given-names>
						</name>
						<name>
							<surname>Deepak</surname>
							<given-names>D.</given-names>
						</name>
					</person-group>
					<source>Optimized Convolutional Neural Network Model for Software Effort Estimation</source>
					<conf-name>Proceedings of the 2024 Third International Conference on Smart Technologies in Systems and Networking Computing</conf-name>
					<publisher-loc>Villupuram, India</publisher-loc>
					<year>2024</year>
					<fpage>1</fpage>
					<lpage>6</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/ICSTSN61422.2024.10671053">https://doi.org/10.1109/ICSTSN61422.2024.10671053</ext-link>
				</element-citation>
			</ref>
			<ref id="B12">
				<label>[12]</label>
				<mixed-citation>[12] A. Idri, I. Abnane, “Fuzzy analogy based effort estimation: An empirical comparative study,” in <italic>Proceedings of the IEEE International Conference on Computer and Information Technology</italic>, Helsinki, Finland, 2017, pp. 114-121. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/CIT.2017.29">https://doi.org/10.1109/CIT.2017.29</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>Idri</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Abnane</surname>
							<given-names>I.</given-names>
						</name>
					</person-group>
					<source>Fuzzy analogy based effort estimation: An empirical comparative study</source>
					<conf-name>Proceedings of the IEEE International Conference on Computer and Information Technology</conf-name>
					<publisher-loc>Helsinki, Finland</publisher-loc>
					<year>2017</year>
					<fpage>114</fpage>
					<lpage>121</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1109/CIT.2017.29">https://doi.org/10.1109/CIT.2017.29</ext-link>
				</element-citation>
			</ref>
			<ref id="B13">
				<label>[13]</label>
				<mixed-citation>[13] P. Rai, D. K. Verma, S. Kumar, “A hybrid model for prediction of software effort based on team size,” <italic>IET Software</italic>, <italic>vol.</italic> 
 <italic>15</italic>, no. 6, pp. 546-556, Dec. 2021. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1049/sfw2.12048">https://doi.org/10.1049/sfw2.12048</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Rai</surname>
							<given-names>P.</given-names>
						</name>
						<name>
							<surname>Verma</surname>
							<given-names>D. K.</given-names>
						</name>
						<name>
							<surname>Kumar</surname>
							<given-names>S.</given-names>
						</name>
					</person-group>
					<article-title>A hybrid model for prediction of software effort based on team size</article-title>
					<source>IET Software</source>
					<volume>15</volume>
					<issue>6</issue>
					<fpage>546</fpage>
					<lpage>556</lpage>
					<year>2021</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1049/sfw2.12048">https://doi.org/10.1049/sfw2.12048</ext-link>
				</element-citation>
			</ref>
			<ref id="B14">
				<label>[14]</label>
				<mixed-citation>[14] A. Zakrani, M. Hain, A. Namir, “Software development effort estimation using random forests: An empirical study and evaluation,” <italic>International Journal of Intelligent Engineering and Systems</italic>, <italic>vol.</italic> 
 <italic>11</italic>, no. 6, pp. 300-311, Dec. 2018. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.22266/ijies2018.1231.30">https://doi.org/10.22266/ijies2018.1231.30</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Zakrani</surname>
							<given-names>A.</given-names>
						</name>
						<name>
							<surname>Hain</surname>
							<given-names>M.</given-names>
						</name>
						<name>
							<surname>Namir</surname>
							<given-names>A.</given-names>
						</name>
					</person-group>
					<article-title>Software development effort estimation using random forests: An empirical study and evaluation</article-title>
					<source>International Journal of Intelligent Engineering and Systems</source>
					<volume>11</volume>
					<issue>6</issue>
					<fpage>300</fpage>
					<lpage>311</lpage>
					<year>2018</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.22266/ijies2018.1231.30">https://doi.org/10.22266/ijies2018.1231.30</ext-link>
				</element-citation>
			</ref>
			<ref id="B15">
				<label>[15]</label>
				<mixed-citation>[15] ISBSG, <italic>Academic research projects</italic>, 2023. <ext-link ext-link-type="uri" xlink:href="https://www.isbsg.org">https://www.isbsg.org</ext-link>
				</mixed-citation>
				<element-citation publication-type="book">
					<person-group person-group-type="author">
						<collab>ISBSG</collab>
					</person-group>
					<source>Academic research projects</source>
					<year>2023</year>
					<ext-link ext-link-type="uri" xlink:href="https://www.isbsg.org">https://www.isbsg.org</ext-link>
				</element-citation>
			</ref>
			<ref id="B16">
				<label>[16]</label>
				<mixed-citation>[16] A. M. Shimaoka, R. C. Ferreira, A. Goldman, “The evolution of CRISP-DM for Data Science: Methods, Processes and Frameworks,” <italic>SBC Reviews</italic>, <italic>vol.</italic> 4, no. 1, pp. 28-43, Oct. 2024. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5753/reviews.2024.3757">https://doi.org/10.5753/reviews.2024.3757</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Shimaoka</surname>
							<given-names>A. M.</given-names>
						</name>
						<name>
							<surname>Ferreira</surname>
							<given-names>R. C.</given-names>
						</name>
						<name>
							<surname>Goldman</surname>
							<given-names>A.</given-names>
						</name>
					</person-group>
					<article-title>The evolution of CRISP-DM for Data Science: Methods, Processes and Frameworks</article-title>
					<source>SBC Reviews</source>
					<volume>4</volume>
					<issue>1</issue>
					<fpage>28</fpage>
					<lpage>43</lpage>
					<year>2024</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.5753/reviews.2024.3757">https://doi.org/10.5753/reviews.2024.3757</ext-link>
				</element-citation>
			</ref>
			<ref id="B17">
				<label>[17]</label>
				<mixed-citation>[17] F. González-Ladrón-de-Guevara, M. Fernández-Diego, “ISBSG variables most frequently used for software effort estimation: A mapping review,” in <italic>Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement</italic>, Torino, Italy, Sep. 2014, pp. 1-4. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/2652524.2652550">https://doi.org/10.1145/2652524.2652550</ext-link>
				</mixed-citation>
				<element-citation publication-type="confproc">
					<person-group person-group-type="author">
						<name>
							<surname>González-Ladrón-de-Guevara</surname>
							<given-names>F.</given-names>
						</name>
						<name>
							<surname>Fernández-Diego</surname>
							<given-names>M.</given-names>
						</name>
					</person-group>
					<source>ISBSG variables most frequently used for software effort estimation: A mapping review</source>
					<conf-name>Proceedings of the 8th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement</conf-name>
					<publisher-loc>Torino, Italy</publisher-loc>
					<year>2014</year>
					<fpage>1</fpage>
					<lpage>4</lpage>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1145/2652524.2652550">https://doi.org/10.1145/2652524.2652550</ext-link>
				</element-citation>
			</ref>
			<ref id="B18">
				<label>[18]</label>
				<mixed-citation>[18] B. Alsaadi, K. Saeedi, “Data-driven effort estimation techniques of agile user stories: A systematic literature review,” <italic>Artificial Intelligence Review</italic>, vol. 55, no. 7, pp. 5485-5516, Oct. 2022. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10462-021-10132-x">https://doi.org/10.1007/s10462-021-10132-x</ext-link>
				</mixed-citation>
				<element-citation publication-type="journal">
					<person-group person-group-type="author">
						<name>
							<surname>Alsaadi</surname>
							<given-names>B.</given-names>
						</name>
						<name>
							<surname>Saeedi</surname>
							<given-names>K.</given-names>
						</name>
					</person-group>
					<article-title>Data-driven effort estimation techniques of agile user stories: A systematic literature review</article-title>
					<source>Artificial Intelligence Review</source>
					<volume>55</volume>
					<issue>7</issue>
					<fpage>5485</fpage>
					<lpage>5516</lpage>
					<year>2022</year>
					<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.1007/s10462-021-10132-x">https://doi.org/10.1007/s10462-021-10132-x</ext-link>
				</element-citation>
			</ref>
		</ref-list>
		<fn-group>
			<fn fn-type="other" id="fn1">
				<label>How to cite:</label>
				<p> J. Getial-Barragán; R. Timarán-Pereira &amp; D. R. Bastidas-Torres, “Effort Estimation in Software Development Projects Using Supervised Machine Learning Techniques”. <italic>Revista Facultad de Ingeniería</italic>, vol. 34, no. 73, e20194, 2025. <ext-link ext-link-type="uri" xlink:href="https://doi.org/10.19053/01211129.v34.n73.2025.20194">https://doi.org/10.19053/01211129.v34.n73.2025.20194</ext-link>
				</p>
			</fn>
		</fn-group>
		<fn-group>
			<fn fn-type="financial-disclosure" id="fn4">
				<label>FUNDING</label>
				<p> No external funding was received for this study.</p>
			</fn>
		</fn-group>
	</back>
</article>