• AliAshraf Sadradini 1

• Amin Sharifi 1

• Saeed Samadianfard 2

• Milad Sharafi 3

  1. Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
  2. Assistant Professor, Department of Water Engineering, Faculty of Agriculture, University of Tabriz, Tabriz, Iran
  3. Department of Water Engineering, Faculty of Agriculture, Urmia University, Urmia, Iran

Abstract

This research aims to improve the accuracy of water quality predictions made with machine learning models. It evaluates the performance of the Random Forest (RF) model and its hybrid with a Genetic Algorithm (GA-RF) in predicting biochemical oxygen demand (BOD) and dissolved oxygen (DO) in the Nahand River Basin, Iran. The models were developed using eleven years of daily water quality data (2013 to 2023) with 11 input variables: total nitrogen, total phosphate, nitrite, nitrate, phosphate, turbidity (in nephelometric turbidity units), water temperature, air temperature, electrical conductivity, pH, and flow. Six scenarios were created from different combinations of these inputs. Model performance was evaluated statistically using the coefficient of determination (R²), root mean square error (RMSE), Nash-Sutcliffe efficiency (NS), and Willmott’s index of agreement (WI). The results showed that GA-RF consistently outperformed the standalone RF. For BOD prediction, the GA-RF-6 and RF-5 models achieved R² values of 0.563 and 0.548, respectively. For DO prediction, GA-RF-5 and RF-6 yielded R² values of 0.81 and 0.792, respectively. The findings indicate that integrating the Genetic Algorithm with Random Forest can improve predictive accuracy in water quality assessment, supporting more informed and sustainable water resource management decisions.
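The four evaluation statistics named in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `evaluation_metrics` is hypothetical, and R² is assumed here to be the squared Pearson correlation between observed and predicted values, a common convention in hydrology that the abstract does not confirm. NS and WI follow their standard textbook definitions.

```python
import math


def evaluation_metrics(observed, predicted):
    """Return R2, RMSE, NS and WI for paired observed/predicted values.

    Hypothetical helper for illustration only. R2 is assumed to be the
    squared Pearson correlation; the abstract does not state which
    definition of R2 the study used.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally sized series with at least 2 values")

    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n

    # Sum of squared errors between observations and predictions.
    sq_err = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Sums of squared deviations from each series' own mean.
    dev_obs = sum((o - mean_obs) ** 2 for o in observed)
    dev_pred = sum((p - mean_pred) ** 2 for p in predicted)
    # Cross-deviation term used by the Pearson correlation.
    cross = sum((o - mean_obs) * (p - mean_pred)
                for o, p in zip(observed, predicted))
    # Willmott's "potential error" denominator.
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))

    # The ratios below are undefined for constant series.
    if dev_obs == 0 or dev_pred == 0:
        raise ValueError("observed and predicted series must not be constant")

    return {
        "R2": cross ** 2 / (dev_obs * dev_pred),
        "RMSE": math.sqrt(sq_err / n),
        "NS": 1.0 - sq_err / dev_obs,
        "WI": 1.0 - sq_err / potential,
    }


if __name__ == "__main__":
    # Made-up numbers, used only to show the call; not data from the study.
    print(evaluation_metrics([1.0, 2.0, 3.0, 4.0], [2.0, 3.0, 4.0, 5.0]))
```

The made-up example has a constant bias of one unit, which shows why the statistics are reported together: a correlation-based R² is insensitive to the bias, while RMSE, NS, and WI all penalize it.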

Keywords

Subjects

 water resource management
