Abstract
Deteriorating water quality is a major threat to human health, and water quality assessment helps protect the public from waterborne diseases. This study uses sensitive water quality parameters measured during the northeast monsoon season, in October 2021, at different locations in Mooli Kulam lake (11°07′17.6″ N, 77°22′59.9″ E) in Tiruppur District, Tamil Nadu, India. The parameters considered are closely related to drinking and irrigation requirements and are divided into drinking water variables and irrigation water variables. The samples were analysed, and the Water Quality Index (WQI) was computed using three methods, namely, the Weighted Arithmetic method, the Canadian Council of Ministers of the Environment (CCME) method, and Horton’s method. The WQI was mapped using Inverse Distance Weighting (IDW) interpolation in ArcMap 10.8 to show its spatial distribution. The dataset was subjected to correlation and regression analysis in order to determine the most significant pollutant. A total of 10 sampling stations and 23 water quality parameters were analysed. The results show that the lake is highly eutrophic, with compounds of potassium, iron, and nitrates.
1. Introduction
Water is one of the greatest sources of life on Earth. Freshwater sources are limited, and population growth and human activities contribute heavily to the deterioration of water quality. Many studies show that water exploitation and water theft have also become major contributors to water scarcity [1]. Many methods have been developed for the assessment of water quality and water pollution, and the development of new methods for assessing pollutant load remains a broad area of study. Various methods and mathematical tools are employed and checked for their fit in assessing water pollutants. The quality of water for urban, rural, and industrial purposes is best determined by calculating the water quality index (WQI). In one study, groundwater samples were taken from wells in the southern part of the Varanasi district of Uttar Pradesh, India. A total of 16 groundwater wells were chosen to assess 22 water parameters for a potability study, and the samples were collected during the premonsoon season in May 2015. According to that study, 20% of the drinking water in the region is unsuitable, while the remaining 80% is classified as good, moderate, bad, or extremely poor according to the WQI. The findings aid in the effective planning and management of available water resources [2]. In the development of a drinking water quality index (DWQI) model, parameter classification and subindex construction were carried out with the help of a regression equation and an aggregation function with the Min–Max operator. For the quality assessment, twenty-two water quality variables were used from 24 groundwater well samples [3]. The WQI of Obulavaripalli Mandal in the YSR district shows that 40% of samples are good for drinking and 30% are very poor for drinking. A total of ten water parameters analysed from 20 groundwater samples showed that the groundwater quality is unfit for human consumption [4].
Water quality must be protected in order to provide safe drinking water to the public. The WQI of the Oros Reservoir in the Northeast of Brazil was studied using Principal Component Analysis (PCA) to assess its suitability for human consumption. The entry points P1, P3, P4, and P5 of the reservoir show the lowest WQI values, while P6 and P7 at the exit point of the reservoir show the highest WQI [5]. In a study of the Lake Taihu Basin, 96 locations along major rivers were sampled four times over four seasons from September 2014 to January 2016. The findings are useful for water quality management and might be implemented in the Lake Taihu Basin for immediate and low-cost assessment of water quality [6]. Groundwater samples collected from Tumkur taluk have WQI values ranging from 89.21 to 660.56. According to the findings, the groundwater in the region requires some type of treatment before consumption, as well as protection from pollution. Thus, the study demonstrates that the WQI is a fundamental tool for assessing the potability of water [7]. A WQI was developed from nine physicochemical properties that were periodically monitored at eighteen sampling locations (January–November 2000) to assess spatial and temporal changes in surface water quality in the watershed. To decrease the expenses involved with its adoption, modifications to the original WQI were made using PCA [8]. To provide safe drinking water on a long-term basis, it is essential to monitor and safeguard its quality. Before evaluating the suitability of groundwater for various purposes, it is necessary to understand its chemical composition. Groundwater contains seven major chemical elements in the dissolved state: Ca2+, Mg2+, Cl−, HCO3−, Na+, K+, and SO42− [9].
Mathematical tools are most helpful for assessing the quality of water. Assessing pollutants and pollution levels requires large datasets drawn from many in situ and laboratory tests. A minimum of 1000 results must be analysed in order to improve water conservation. Environmentalists use these datasets to develop indexes for analysing water quality. This type of indexing uses mathematical tools that compress the dataset and provide a clear view of the water quality values. In a case study of the Suquía River in Córdoba City, Argentina, spatial and seasonal variations in water quality were analysed using two WQIs, one subjective (WQIsub) and one objective (WQIobj), which take twenty variables into consideration. The impact of the city’s rapid urbanisation on water quality is harmful, particularly in the areas where sewage is discharged. The quality of the water seems to be poor during the dry season [10]. For the past few years, the quality of water in lakes has been measured using the WQI method. Analysis of fifteen water parameters in fifty groundwater samples and thirty-five surface water samples from the Brahmaputra plain in the Jia-Bharali River basin indicates that multivariate statistical analysis is an effective tool for assessing water quality. A complex and highly variable dataset can be interpreted with the use of PCA and other multivariate tools. Pollution load can be readily identified using the varimax factors obtained from PCA [11].
For decades, methods have been developed and modified to check the suitability of water for drinking. The WQI is a significant rating that summarises water quality characteristics and assists in the selection of appropriate treatment options. The WQI represents the combined impact of a wide range of water quality criteria and communicates water quality data to the public and authorities [12]. A “modified DWQI” for Iran’s metropolitan regions was studied using the Canadian DWQI to assess drinking water quality. According to the revised DWQI value, approximately 95% of groundwater flow rates were in good condition, while water quality was found to be fair in 3% of the samples and marginal in the remaining 2% [13].
Mathematical and software tools make water quality assessment considerably easier. Water quality evaluation has grown into a large topic of study, and model analysis has become more significant within it. Several computational techniques and mathematical models have been created to analyse water quality. One study focuses on creating an estimation model for four samples collected from Periyakulam Lake, Ukkadam, in Coimbatore, known as the Manchester of South India. The estimation model results show that the sunflower optimization algorithm (SFO) is an efficient computing tool for estimation models [14]. The WQI for Loktak Lake shows that increased human activity causes stress on the wetland. Eleven parameters were utilised to generate the WQI for five sample stations. Based on its importance, each parameter was given a relative weight ranging from 1.46 to 4.09 [15].
A high WQI determined from 21 water quality parameters indicates contamination of river water in the Mettur river basin, Salem, Tamil Nadu. Once the likely sources of contamination in the river basin are identified using factor analysis, preventative measures and monitoring programmes can be implemented to limit future contamination. Multivariate statistics are effective in converting large datasets into simple interpretations by displaying geographical variations, whereas cluster analysis is used to split the areas into high, medium, and low contamination zones [16]. In the Sukinda region of Odisha, Pearson’s correlation analysis using SPSS was applied to surface water quality in rivers, and contour analysis was applied to groundwater quality. Mining in this area supplies 98% of the country’s chromium. Both surface water and groundwater in the region were highly contaminated with chromium, and the people residing in the area were affected by conditions such as respiratory tract problems, skin diseases, and weakened immunity [17].
The overall water quality and possible risks from socioeconomic development around the Parbati River, Himachal Pradesh, were identified using PCA together with other multivariate methods, cluster analysis, and graphical visualisation techniques. Statistical studies were among the tools employed to identify traces of pollutants in the river basin and to understand the factors controlling the chemical characteristics of the water and other sources of contamination [18]. A logistic regression model was used to assess the quality of water in public pools in Medellin, Colombia, by reviewing their microbial and physicochemical characteristics. More specifically, Pearson’s coefficient was used to identify linear relationships, factor analysis was used to evaluate the water quality parameters, and a regression model was used to relate the microbial and water quality parameters. Hypochlorite added to disinfect a pool tends to increase its EC, and some microorganisms are affected by particular water parameters. Regular monitoring should be done to keep pools in adequate condition [19].
In the Yamuna River basin, a fuzzy-based WQI was derived to determine the natural condition of the water, and the sensitivity of the parameters was used in defining the surface water quality. Data from 22 sampling stations covered eight characteristics of the water samples, which were rated as very good, good, fair, poor, or very poor. These five classes of water quality were used as inputs to the fuzzy models. Because that research was limited to freshwater, an extended version of the study could be applied to groundwater [20]. For the water quality metrics of Lake Prashar, Himachal Pradesh, Pearson’s correlation, principal component analysis, and cluster analysis were used. Because there was a significant difference across seasons and months, WQI, PCA, and cluster analysis were applied on a season-by-season basis. PCA identified water temperature, TDS, conductivity, turbidity, phosphate, BOD, hardness, calcium, sodium, and potassium as key characteristics that lowered water quality during the summer and monsoon seasons. The main variables affecting water quality during the monsoon were found to be garbage generated by tourists that entered the system and erosion of the surroundings as a result of overgrazing in the Prashar Lake area [21]. In another study, a relatively small number of wells in the study area showed extraordinarily high values of conductivity and chloride due to the use of fertiliser in agriculture [22].
The main objective of the present study is to evaluate the quality of Mooli Kulam lake water in Tiruppur District and to check its suitability for drinking and irrigation use.
1.1. Study Area
Mooli Kulam is an urban lake situated within the city limits of Tiruppur, with an area of about 0.223 km². The lake receives water from rainfall and from the Noyyal river, which feeds it through a 2 km long canal during floods. It is one of the 32 irrigation tanks of the Noyyal river basin. The lake is situated at 11°07′17.6″ N and 77°22′59.9″ E, and its depth varies from 3 m to 6 m. The lake’s water looks dark green in color and supports a well-developed ecosystem with aquatic plants and fish. The south bank of the river contains many colored dyes and effluents. Tiruppur has a hot semiarid climate, with a mean maximum temperature of 35°C (95.0°F) and a mean minimum temperature of 22°C (71.6°F). The average annual rainfall is around 700 mm (28 in), with 47% falling during the northeast monsoon and 28% during the southwest monsoon. The lake holds water to its maximum level during floods in the Noyyal river. The shape of the lake is irregular, and its bed consists of colluvial sediments. The lake provides water throughout the year to nearby areas for irrigation. The location of the study area is shown in Figure 1.

2. Methodology
Water samples were collected from Mooli Kulam Lake in October 2021, and parameters such as pH, total hardness, TDS, chlorides, sulphates, calcium, magnesium, sodium, fluoride, bicarbonate, and electrical conductivity were analysed in accordance with APHA (2004). The results were used to compute a water quality index. Three WQI methods were adopted, and the resulting WQI values were plotted as a spatial distribution map. Correlation analysis was applied to the dataset to identify similarities and interrelationships between the water quality variables. A regression model was then built from the highly correlated parameters to obtain the best-fit regression model.
2.1. Weighted Arithmetic Method
The weighted arithmetic index method is widely adopted for classifying water quality. It was formulated by Brown in 1972, and many later methods follow it in slightly modified forms. The water quality index is calculated using the following equation:
$$\mathrm{WQI} = \frac{\sum_{i=1}^{n} W_i q_i}{\sum_{i=1}^{n} W_i},$$
where $q_i$ is the quality rating scale of the ith parameter and $W_i$ is its unit weight.
2.2. Canadian Council of Ministers of the Environment
Canadian jurisdictions established a committee to formulate a compatible method of communicating water quality to the public. This committee developed a WQI that many water agencies can apply with slight modifications [22, 23]. The procedure requires that a minimum of four parameters be sampled at least four times [24, 25]. The CCME WQI score is obtained by
$$\mathrm{CCME\ WQI} = 100 - \frac{\sqrt{F_1^2 + F_2^2 + F_3^2}}{1.732},$$
where $F_1$ is the scope (the number of failed variables as a percentage of the total number of variables), $F_2$ is the frequency (the percentage of individual tests that do not meet their objectives), and $F_3$ is the amplitude (the amount by which failed test values do not meet their objectives).
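The three CCME factors can be illustrated with a short Python sketch. It assumes the commonly published form of the index, 100 − √(F1² + F2² + F3²)/1.732, with F3 derived from the normalised sum of excursions, and it assumes that every objective is an upper limit (range objectives such as pH would need extra handling). The function name and the input layout are hypothetical and are not taken from this study.

```python
import math

def ccme_wqi(tests, objectives):
    """Sketch of the CCME WQI for upper-limit objectives only (an assumption).

    tests:      dict of variable name -> list of measured values
    objectives: dict of variable name -> maximum allowed value
    """
    failed_vars = 0
    failed_tests = 0
    total_tests = 0
    excursion_sum = 0.0
    for name, values in tests.items():
        limit = objectives[name]
        var_failed = False
        for v in values:
            total_tests += 1
            if v > limit:
                var_failed = True
                failed_tests += 1
                # Excursion: how far the failed value exceeds its objective.
                excursion_sum += v / limit - 1.0
        if var_failed:
            failed_vars += 1

    f1 = 100.0 * failed_vars / len(tests)      # scope
    f2 = 100.0 * failed_tests / total_tests    # frequency
    nse = excursion_sum / total_tests          # normalised sum of excursions
    f3 = nse / (0.01 * nse + 0.01)             # amplitude, scaled to 0-100
    return 100.0 - math.sqrt(f1**2 + f2**2 + f3**2) / 1.732
```

A dataset in which every test meets its objective gives F1 = F2 = F3 = 0 and therefore a score of 100; each failed variable or test lowers the score.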
Based on the WQI values in weighted arithmetic method and CCME, water quality for drinking can be rated and represented in Table 1.
2.3. Horton Equation
The WQI of the study area is calculated using Horton’s method (Horton, 1965) from ten parameters, namely, TDS, pH, sulphates, total hardness, chlorides, calcium, bicarbonate, sodium, magnesium, and fluoride, by comparing their values with WHO, ICMR, and ISI standards using the following equation:
$$\mathrm{WQI} = \frac{\sum_{n} q_n W_n}{\sum_{n} W_n},$$
where $q_n$ is the quality rating for the nth water quality parameter and $W_n$ is its unit weight factor.
The quality rating is
$$q_n = 100 \times \frac{V_n - V_{id}}{S_n - V_{id}},$$
where $V_n$ is the estimated value of the nth water quality parameter at a given sampling location, $V_{id}$ is the ideal value of the nth water quality parameter in pure water, and $S_n$ is the standard permissible value of the nth water quality parameter.
The ideal values ($V_{id}$) are taken as seven for pH and zero for all the other parameters.
The unit weight is expressed as
$$W_n = \frac{K}{S_n},$$
where $K = 1 / \sum_{n} (1/S_n)$ is the proportionality constant.
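The calculation described in this subsection can be sketched in Python. The sketch assumes the commonly used forms qn = 100 (Vn − Vid)/(Sn − Vid), Wn = K/Sn with K = 1/Σ(1/Sn), and WQI = Σ qn Wn / Σ Wn. The function names and the three-parameter sample values are hypothetical placeholders, not measurements or standards taken from this study.

```python
def unit_weights(standards):
    """Unit weight Wn = K / Sn, with K = 1 / sum(1 / Sn)."""
    k = 1.0 / sum(1.0 / s for s in standards.values())
    return {name: k / s for name, s in standards.items()}

def quality_rating(value, standard, ideal=0.0):
    """Quality rating qn = 100 * (Vn - Vid) / (Sn - Vid)."""
    return 100.0 * (value - ideal) / (standard - ideal)

def horton_wqi(measured, standards, ideals=None):
    """Weighted index: sum(qn * Wn) / sum(Wn) over the measured parameters."""
    ideals = ideals or {}
    weights = unit_weights(standards)
    numerator = sum(
        weights[n] * quality_rating(measured[n], standards[n], ideals.get(n, 0.0))
        for n in measured
    )
    denominator = sum(weights[n] for n in measured)
    return numerator / denominator

# Illustrative placeholder values only (not data from the study).
standards = {"pH": 8.5, "TDS": 500.0, "chlorides": 250.0}
ideals = {"pH": 7.0}  # ideal value is 7 for pH and 0 for everything else
measured = {"pH": 8.3, "TDS": 420.0, "chlorides": 180.0}
print(round(horton_wqi(measured, standards, ideals), 2))
```

Because Wn is inversely proportional to Sn, parameters with small permissible limits (here pH) dominate the index, which is worth keeping in mind when comparing stations.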
Based on the WQI values, water quality for drinking and irrigation can be rated and represented in Table 2.
2.4. Descriptive Analysis
Descriptive statistics is concerned with quantitatively describing the characteristics of a particular individual or a group. It summarizes data from a sample with minimum and maximum values, mean or standard deviation, and the measures of variability and central tendency.
2.5. Correlation Analysis
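Correlation is a measure of the degree of association between two sets of quantitative data. Two variables are considered to be correlated if a change in one variable is accompanied by a change in the other. If the two variables vary in the same direction, that is, if an increase in one variable accompanies an increase in the other, the correlation is positive; if they vary in opposite directions, it is negative. Karl Pearson’s coefficient of correlation can be worked out by using the equation below:
$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\sum (x - \bar{x})^2 \sum (y - \bar{y})^2}},$$
where $x$ and $y$ are the paired values of the two variables and $\bar{x}$ and $\bar{y}$ are their means.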
Water quality analysis depends on several parameters, here calcium, potassium, sodium, carbonate, bicarbonate, chloride, magnesium, sulphate, EC, TDS, fluoride, and pH. A systematic statistical analysis of the correlation coefficients between water quality measures aids in the assessment of overall water quality. A correlation coefficient (r) up to 0.5 indicates no significant linear association between two parameters. A significant linear correlation exists when r is between 0.5 and 0.8, and a strong linear correlation exists when r is greater than 0.8.
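As an illustration of this classification, the following Python sketch computes Pearson’s r for two parameter series and labels the result with the thresholds stated above. Applying the thresholds to the magnitude |r|, so that negative coefficients are classified in the same way, is an assumption of the sketch; the function names and sample values are hypothetical.

```python
import math

def pearson_r(x, y):
    """Pearson's correlation coefficient for two equal-length series.

    Raises ZeroDivisionError if either series has zero variance.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def classify(r):
    """Label r with the thresholds in the text, applied to |r| (assumption)."""
    magnitude = abs(r)
    if magnitude > 0.8:
        return "strong"
    if magnitude > 0.5:
        return "significant"
    return "not significant"

# Illustrative placeholder series (not data from the study).
ec = [1200.0, 1310.0, 1405.0, 1479.0]
tds = [780.0, 850.0, 915.0, 960.0]
r = pearson_r(ec, tds)
print(classify(r))
```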
2.6. Regression Analysis
Multiple Linear Regression (MLR) is a statistical technique that estimates the outcome of a response variable from several explanatory variables. It models the linear relationship between the explanatory (independent) variables and the response (dependent) variable. MLR is an extension of ordinary least squares (OLS) regression that incorporates more than one explanatory variable:
$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \cdots + \beta_p x_{ip} + \epsilon,$$
where, for i = 1, …, n observations, $y_i$ is the dependent variable, $x_{ip}$ are the explanatory variables, $\beta_0$ is the y-intercept (constant term), $\beta_p$ are the slope coefficients for each explanatory variable, and $\epsilon$ is the model’s error term (also known as the residuals).
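A minimal Python sketch of an MLR fit is given below. It solves the least-squares normal equations directly with Gauss-Jordan elimination, which is one simple way to obtain the coefficients and is not necessarily the procedure used by the statistical software in this study. The function names and the sample data are hypothetical.

```python
def fit_ols(X, y):
    """Return [b0, b1, ..., bp] that minimise the sum of squared residuals.

    X is a list of rows of explanatory values; y is the list of responses.
    Solves (A^T A) b = A^T y, where A is X with a leading column of ones.
    Fails with ZeroDivisionError if the explanatory variables are collinear.
    """
    rows = [[1.0] + [float(v) for v in r] for r in X]
    k = len(rows[0])
    # Augmented matrix [A^T A | A^T y].
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for i in range(k):
            if i != c:
                factor = aug[i][c]
                aug[i] = [a - factor * b for a, b in zip(aug[i], aug[c])]
    return [aug[i][k] for i in range(k)]

def r_squared(X, y, beta):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    predicted = [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in X]
    mean_y = sum(y) / len(y)
    ss_res = sum((t - p) ** 2 for t, p in zip(y, predicted))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - ss_res / ss_tot

# Illustrative placeholder data: two explanatory variables, five observations.
X = [[1, 2], [2, 1], [3, 4], [4, 3], [5, 5]]
y = [9.2, 7.9, 19.1, 17.8, 26.0]
beta = fit_ols(X, y)
print(beta, r_squared(X, y, beta))
```

When the number of explanatory variables approaches the number of observations, R² approaches 1 regardless of how informative the variables are, so a high R² from a small sample is best read alongside the F-statistic and its P-value.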
3. Results and Discussion
3.1. Water Quality Index
In total, 11 parameters, namely, pH, TDS, total hardness, chlorides, sulphates, calcium, magnesium, sodium, fluoride, bicarbonate, and electrical conductivity, were used to calculate the WQI of the lake water for drinking. For irrigation, all of these parameters except bicarbonate and fluoride were used.
3.1.1. Weighted Arithmetic Method of WQI
Table 3 shows the variation in the WQI across samples 1 to 10. Between the drinking and irrigation assessments, only a few parameters differ, and the method of calculating the WQI remains the same. Nine samples are of good water quality for drinking, and only one sample is poor for drinking purposes. For irrigation, all samples are found to be fit, with WQI values in the range between 86 and 100.
3.1.2. Horton Method of WQI
Table 3 also shows the variation in the WQI of water samples from Mooli Kulam lake using Horton’s method. The WQI for irrigation is excellent for the parameters included, and against drinking water quality standards the water quality is good for 9 samples and fair for one sample.
3.1.3. Canadian Method of WQI
Table 4 shows the WQI of the water quality samples from Mooli Kulam lake as assessed by the Canadian Council (CCME) method. The samples do not fully meet the standards. The WQI obtained is 68.62, which corresponds to fair water quality. The spatial distribution of WQI for drinking and irrigation is presented in Figure 2.
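The idea behind the IDW interpolation used for the spatial distribution map can be sketched in a few lines of Python. The sketch assumes a power of 2, planar coordinates, and the use of all stations for every estimate. It is not the ArcMap implementation, which exposes further settings such as the search radius. The function name and station values are hypothetical.

```python
import math

def idw(stations, x, y, power=2.0):
    """Inverse Distance Weighting estimate at (x, y).

    stations: list of (xi, yi, value) tuples in planar coordinates.
    Each station is weighted by 1 / distance**power.
    """
    numerator = 0.0
    denominator = 0.0
    for xi, yi, value in stations:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            # The query point coincides with a station: return its value.
            return value
        weight = 1.0 / d ** power
        numerator += weight * value
        denominator += weight
    return numerator / denominator

# Illustrative placeholder stations: (x, y, WQI).
stations = [(0.0, 0.0, 60.0), (100.0, 0.0, 80.0), (0.0, 100.0, 70.0)]
print(idw(stations, 50.0, 50.0))
```

Evaluating the function on a regular grid of points yields the kind of continuous surface shown in a spatial distribution map; estimates always stay within the range of the station values.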

3.2. Multivariate Analysis
3.2.1. Descriptive Analysis
Table 5 shows the descriptive analysis of the lake water samples. The pH has a maximum of 8.42 and a minimum of 8.26, with a low standard deviation of 0.1702. EC has the highest standard deviation of all the variables, and its maximum value of 1479 is close to the permissible limit. Turbidity has a maximum value of 84 and a minimum value of 13, and neither is within permissible limits. Iron has a maximum of 3.5 and a minimum of 1.1, with a standard deviation of 0.733. Potassium has a maximum of 24 and a minimum of 10, with a standard deviation of 5.06.
3.2.2. Correlation Analysis
The value of correlation coefficient for northeast season is given in Table 6. It is seen that TDS has a strong linear correlation with electrical conductivity, with a correlation coefficient of 0.999. Total hardness has a strong linear correlation with EC and TDS with a correlation coefficient of 0.809 and 0.808. Sulphate has a strong linear correlation with EC, TH, and TDS with a correlation coefficient of 0.819, 0.918, and 0.817.
Chloride has a strong linear correlation with TH and SO4 with a correlation coefficient of 0.965 and 0.87. Chloride has negative linear correlation with pH, EC, TDS, and hardness with correlation coefficient (−0.40, −0.23, −0.23, and −0.38). Its value is independent of pH, EC, TDS, and hardness. Magnesium has a strong linear correlation with five parameters, namely, EC, TDS, TH, SO4, and Cl (0.874, 0.872, 0.974, 0.927, and 0.882).
Turbidity has a strong correlation with Na (0.808) and a significant linear correlation with EC, TDS, and SO4 (0.759, 0.759, and 0.529). Fluoride has negative linear correlation with pH, EC, TDS, hardness, sulphates, calcium, magnesium, and sodium with correlation coefficient (−0.46, −0.22, −0.22, −0.21, −0.35, −0.20, −0.21, and −0.17). Its value is independent of pH, EC, TDS, hardness, sulphates, calcium, magnesium, and sodium. Potassium has a significant linear correlation with sodium and turbidity (0.672 and 0.596). HNO3 has a strong linear correlation with EC, TDS, sodium, and turbidity (0.942, 0.943, 0.874, and 0.861) and a significant linear correlation with TH, sulphate, and magnesium (0.664, 0.738, and 0.768).
Iron has a strong linear correlation with turbidity and HNO3 (0.874 and 0.842) and a significant linear correlation with EC, TDS, TH, SO4, Cl, Mg, and Na (0.796, 0.795, 0.665, 0.747, 0.617, 0.669, and 0.610). BOD has a significant linear correlation with TH and Mg (0.524 and 0.538). DO has a negative linear correlation with EC, TDS, hardness, sulphates, calcium, magnesium, sodium, turbidity, bicarbonate alkalinity, and iron with correlation coefficient (−0.79, −0.79, −0.85, −0.79, −0.79, −0.85, −0.48, 0.53, 0.76, and −0.71). Its value is independent of EC, TDS, hardness, sulphates, calcium, magnesium, sodium, turbidity, bicarbonate alkalinity, and iron.
3.2.3. Multiple Linear Regression
The R2 value of 1 in the present model indicates that the water parameters turbidity, electrical conductivity, pH, bicarbonate alkalinity, total alkalinity, carbonate hardness, total hardness, and calcium explain 100% of the variability of TDS. The best-fit MLR equation for predicting the TDS is given below.
A larger R2 value corresponds to a larger F-ratio, which indicates a stronger relationship between the dependent and the independent variables. The overall significance of the regression model is assessed by the F-statistic given in the ANOVA in Figure 3. The P-value of 0.0013 (< 0.05) indicates that the regression model is statistically significant.

4. Conclusion
This research applies a variety of WQI methods to the assessment of lake water quality. The results obtained from the Weighted Arithmetic and Horton’s methods show that station 8 is rated poor and fair, respectively, and cannot be used for drinking. The water quality at the remaining stations is good and can be used for drinking. The CCME method rates the water quality as fair. Thus, the results indicate that the water is safe for drinking except at station 8, and water from all stations can be used for irrigation. The IDW interpolation shows the spatial variation in water quality across the lake. The correlation analysis shows that many variables are linearly correlated with one another, and TDS and EC are very highly correlated. The regression analysis also indicates a strong dependence between total dissolved solids and the majority of the other variables, as shown by the multiple linear regression fit for TDS. The lake’s water turbidity is a major threat to human consumption. Organic pollutants are highly concentrated in the lake water. Strategies for water conservation and water body conservation should be adopted for the wellness of the water body and human health.
Data Availability
The data used to support the findings of this study are included within the article.
Conflicts of Interest
The authors declare that they have no conflicts of interest regarding the publication of this paper.