Comparison Between Machine Learning Models for Yield Forecast in Cocoa Crops in Santander, Colombia

Henry Lamos-Díaz; David Esteban Puentes-Garzón; Diego Alejandro Zarate-Caicedo

doi:10.19053/01211129.v29.n54.2020.10853

Vol. 29 No. 54 (2020)
Continuous Publication

Papers

https://doi.org/10.19053/01211129.v29.n54.2020.10853

Published 2020-05-15

Henry Lamos-Díaz, Ph. D.
Universidad Industrial de Santander

David Esteban Puentes-Garzón, M.Sc.
Universidad Industrial de Santander

Diego Alejandro Zarate-Caicedo, Ph. D.
Corporación Colombiana de Investigación Agropecuaria-AGROSAVIA

How to Cite

Lamos-Díaz, H., Puentes-Garzón, D. E., & Zarate-Caicedo, D. A. (2020). Comparison Between Machine Learning Models for Yield Forecast in Cocoa Crops in Santander, Colombia. Revista Facultad de Ingeniería, 29(54), e10853. https://doi.org/10.19053/01211129.v29.n54.2020.10853

All articles included in the Revista Facultad de Ingeniería are published under the Creative Commons (BY) license.

Authors must complete, sign, and submit the Review and Publication Authorization Form of the manuscript provided by the Journal; this form should contain all the originality and copyright information of the manuscript.

The authors who publish in this Journal accept the following conditions:

a. The authors retain the copyright and transfer the right of the first publication to the journal, with the work registered under the Creative Commons attribution license, which allows third parties to use what is published as long as they mention the authorship of the work and the first publication in this Journal.

b. Authors can make other independent and additional contractual agreements for the non-exclusive distribution of the version of the article published in this Journal (e.g., include it in an institutional repository or publish it in a book), provided they clearly indicate that the work was first published in this Journal.

c. Authors are allowed and encouraged to publish their work on the Internet (for example, on institutional or personal pages) before and during the review and publication process, as it can lead to productive exchanges and a greater and faster dissemination of published work.

d. The Journal authorizes the total or partial reproduction of the content of the publication, as long as the source is cited, that is, the name of the Journal, name of the author(s), year, volume, publication number, and pages of the article.

e. The ideas and statements issued by the authors are their responsibility and in no case bind the Journal.

Abstract

Identifying the factors that influence crop yield (kg·ha^-1) provides essential information for decisions aimed at predicting and improving productivity, which gives farmers the opportunity to increase their income. This study investigates the application of several machine learning algorithms to cocoa yield prediction and to the identification of influencing factors. Support Vector Machines (SVM) and ensemble learning models (Random Forests, Gradient Boosting) are compared with Least Absolute Shrinkage and Selection Operator (LASSO) regression models. The predictors considered were climate conditions, cocoa variety, fertilization level, and sun exposure in an experimental crop located in Rionegro, Santander. Results showed that Gradient Boosting is the best prediction alternative, with a coefficient of determination (R²) of 68%, a Mean Absolute Error (MAE) of 13.32, and a Root Mean Square Error (RMSE) of 20.41. Crop yield variability is explained mainly by the radiation one month before harvest, the accumulated rainfall in the harvest month, and the temperature one month before harvest. Crop yields were also evaluated by type of sun exposure: radiation one month before harvest is the most influential factor in shade-grown plants, whereas rainfall and soil moisture are the determining variables in sun-grown plants, which is associated with their water requirements. These results suggest differentiated crop management depending on the type of sun exposure to avoid compromising productivity, since there is no significant difference in yield between the two management systems.
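The three scores named in the abstract (R², MAE, and RMSE) can be computed from paired observed and predicted yields. The following is a minimal sketch using only the Python standard library; the function name `regression_metrics` and the sample values are hypothetical illustrations, not data or code from the study. Note that the abstract reports R² as a percentage, so 68% corresponds to 0.68 on the scale returned here.

```python
import math


def regression_metrics(observed, predicted):
    """Return (r2, mae, rmse) for paired observed and predicted values.

    Hypothetical helper for illustration; not taken from the article.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(observed)
    errors = [o - p for o, p in zip(observed, predicted)]

    # Mean Absolute Error: average magnitude of the errors.
    mae = sum(abs(e) for e in errors) / n

    # Root Mean Square Error: penalizes large errors more than MAE does.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 - residual variation / total variation.
    mean_obs = sum(observed) / n
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are equal")
    r2 = 1.0 - ss_res / ss_tot

    return r2, mae, rmse


# Made-up yield values, used only to exercise the function.
observed = [100.0, 120.0, 140.0, 160.0]
predicted = [110.0, 115.0, 150.0, 155.0]
r2, mae, rmse = regression_metrics(observed, predicted)
print(f"R2={r2:.3f}  MAE={mae:.2f}  RMSE={rmse:.2f}")
```

MAE and RMSE are expressed in the units of the predicted variable, while R² is unitless. RMSE is never smaller than MAE, and a wide gap between the two points to a few large errors.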

Keywords

agricultural-yield, agroforestry-system, cocoa, machine-learning, prediction, productivity
