Research on Monitoring Topping Time of Cotton Based on AdaBoost+Decision Tree

Li, Yibai; Cao, Guangqiao; Ji, Chao; Liu, Dong; Zhang, Jinlong; Li, Liang; Chen, Cong

doi:https://doi.org/10.1155/2022/4214332

Discrete Dynamics in Nature and Society

Research Article | Open Access

Volume 2022 | Article ID 4214332 | https://doi.org/10.1155/2022/4214332

Yibai Li,¹ Guangqiao Cao,¹ Chao Ji,² Dong Liu,¹ Jinlong Zhang,¹ Liang Li,¹ and Cong Chen¹

Academic Editor: Ya Jia

Received 28 Dec 2021

Accepted 24 Feb 2022

Published 24 Mar 2022

Abstract

Topping is an important part of cotton field management, and the spraying time has a great impact on cotton quality. In agricultural production, the timing of cotton topping relies mainly on manual inspection and experience, which is inefficient and lacks a scientific basis. To solve this problem, this paper uses a drone equipped with a multispectral camera to collect multispectral information of the cotton canopy in Shihezi over 12 days spanning the period before and after the topping operation. At the same time, cotton plant height, the number of fruiting branches, and the number of flower buds are collected. After multiple band combinations and vegetation indexes are compared, the combined data of the 550 + 730 + 790 nm bands are selected as the model input. The AdaBoost + decision tree method is proposed as the fitting model. The fitting results show that the coefficient of determination (R²) between the multispectral information and cotton plant height is 0.96, with a prediction error (RMSEP) of 0.40 cm; the R² for the number of fruiting branches is 0.97, with an RMSEP of 0.54; and the R² for the number of flower buds is 0.84, with an RMSEP of 0.49. The output of the fitting model is used as the input of the topping time discriminant model, which achieves an accuracy of 94.03%. The method in this paper can effectively monitor the growth status of cotton during the topping period and provides a technical path for scientifically determining the cotton topping time.

1. Introduction

Cotton is an important economic crop in China [1, 2], and Xinjiang is a major cotton production area with highly mechanized cotton production. Rapid and accurate monitoring of cotton growth status, precision management of cotton fields, and rational arrangement of production plans are essential to improving the production efficiency and yield of cotton in Xinjiang [3, 4]. The flower bud stage is one of the important growth periods of cotton [5, 6], and topping is a crucial operation in this stage. Scientific topping and fertilization promote reasonable conversion of photosynthetic products and improve cotton yield and quality [7, 8]. The Technical Guidance on Mechanized Cotton Production in the Northwest Inland Cotton Growing Regions, issued by the Ministry of Agriculture and Rural Affairs of the People's Republic of China, together with related cotton planting agronomy, stipulates that the appropriate topping time should be determined according to factors such as cotton growth, plant height, and number of fruiting branches, and that early topping, as appropriate, is preferred in order to promote early maturity [9, 10]. Cotton topping is highly time sensitive, and the choice of operation time has a significant impact on cotton growth [6, 11, 12]. Both the operational strategy for unmanned aerial vehicle (UAV) or mechanical topping and the operating window used in path planning depend on monitoring crop growth conditions in the field [13, 14]. Therefore, rapid and accurate monitoring of cotton during the topping period can provide a data basis for scientifically determining the topping time and for planning topping operations [15, 16].

Currently, the cotton topping time is determined mainly by manual field observation of crops, weather, and soil conditions, and the topping time is judged empirically. This approach acquires information inefficiently and lacks interpretability [11]. The indexes for determining cotton topping time mainly include plant height, number of fruiting branches, and number of flower buds [8]. There have been studies on the rapid acquisition of cotton plant height. However, the rapid acquisition of information such as the number of fruiting branches and flower buds has scarcely been studied. Low-altitude UAV remote sensing has the advantages of high mobility, speed and flexibility, low cost, and easy operation, with important application prospects in precision agriculture and agricultural production management [17–20]. Research results have indicated that UAVs equipped with multispectral and hyperspectral cameras can be used to collect remote sensing information of the cotton canopy [21–23], establish regression models, and monitor plant height. However, few studies have established models for the number of fruiting branches and the number of flower buds in the cotton flower bud stage.

Common base regression models include multiple linear regression, support vector machine, neural network, decision tree, etc. [24–28]. Current studies mainly focus on predicting physiological indexes of cotton across one or more growth periods, with long data collection intervals and large differences in plant growth status, and therefore place low requirements on the feature extraction capacity of the prediction models [29–31]. Because a simple regression model has limited training parameters, it is prone to overfitting and generalizes weakly, resulting in low regression accuracy and poor robustness. The morphological changes of cotton are relatively small during the topping period, so the regression model must meet high requirements for feature extraction and generalization. Studies have shown [32] that the features extracted by the common base regression models can be complementary. Ensemble learning improves the feature extraction capacity of models by combining multiple common base regression models [33–35] and can be applied to multispectral information processing to improve prediction.

In summary, this paper addresses the following two problems. (1) Cotton topping time is determined mainly by manual observation of crop growth status, which lacks interpretability and acquires information inefficiently. Using a UAV to collect multispectral information about crop growth makes it possible to obtain the plant growth status quickly and efficiently. (2) To overcome the weak robustness of common base regression models, ensemble learning is used to enhance the regression effect and to provide a rapid crop data acquisition and processing method for determining the cotton topping time.

2. Materials and Methods

2.1. Overview of Experimental Area and Materials

The experimental cotton variety in this paper is Xinluzao 68 (an early maturing hybrid cotton variety). The data were collected from July 1, 2021, to July 12, 2021. The multispectral remote sensing information of the cotton canopy, as well as the crop growth indexes affecting the topping time and strategy, was collected continuously for 12 days before and after topping. Topping was performed in the experimental plot on July 8. Data were collected at the 8th Division of the Xinjiang Production and Construction Corps in Shihezi, Xinjiang, with an experimental plot area of about 0.5 hectares. Located at 43°N 84°E, Shihezi has a typical temperate continental climate, with an average temperature of 25.1–26.1°C, annual precipitation of 125.0–207.7 mm, and long sunshine hours during summer. As an essential economic crop in Shihezi, cotton accounts for more than 85% of the crop planting area, and the region's levels of scale, intensification, and large-scale mechanization rank first among China's three major cotton planting regions, making Shihezi an important base for the development of intelligent cotton production.

2.2. Data Acquisition and Preprocessing

2.2.1. Multispectral Data Acquisition

The experiment was conducted using a 10 kg XAG XMISSION UAV with a maximum load of 6 kg. The target plot was selected by logging into the XAG control system on a smartphone, and the UAV flew along the route automatically planned by the system at an altitude of 7 m and a speed of 3 m/s. Data were collected at 2:00 pm every day. The multispectral camera covered four wavelength bands of 550, 660, 735, and 790 nm, with 20 megapixels, image resolution of pixels, and a camera weight of 0.85 kg, making it suitable to be carried on the UAV flight platform. During the experiment, a cm standard white board was placed on the experimental plot to provide radiation correction data for the remote sensing data. The UAV remote sensing experiment is shown in Figure 1.

2.2.2. Multispectral Data Preprocessing

The multispectral remote sensing data on the cotton canopy were preprocessed after acquisition. The spectral resolution was 10 nm, and the spatial resolution of graphics was . The preprocessing of remote sensing data mainly included two steps. (1) Selection of the region of interest (ROI). A rectangular box of pixels was used for each spectral band in each image. The area of all cotton in the image was manually selected as the ROI, and the mean within the ROI was taken as the spectral value of that band in the image. (2) Radiation correction of multispectral remote sensing images. A rectangular box of pixels was used to select the standard whiteboard position as the white correction region. The spectral values of the white correction region were averaged, and radiation correction was performed on the multispectral remote sensing data according to equation (1):

$$R = \frac{I_{\mathrm{ROI}} - I_{\mathrm{dark}}}{I_{\mathrm{white}} - I_{\mathrm{dark}}} \tag{1}$$

where $I_{\mathrm{ROI}}$ represented the average spectral value of the ROI in a certain band, $I_{\mathrm{white}}$ represented the average spectral value of the standard white correction region in that band on that day, $I_{\mathrm{dark}}$ represented the mean pixel value of that band when the lens was covered on that day, and $R$ represented the spectral reflectance of that band after radiation correction on that day. In total, 20–30 data samples were collected per day, and 1,040 data samples were obtained.
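The two preprocessing steps can be illustrated with a short sketch. It assumes the usual black-and-white correction form, in which the dark (lens-covered) value is subtracted from both the ROI mean and the whiteboard mean before taking their ratio. The function names and the pixel values below are hypothetical and are not taken from the paper's data.

```python
def roi_mean(image, top, left, height, width):
    """Mean pixel value inside a rectangular ROI of a 2-D list of pixels."""
    rows = image[top:top + height]
    values = [px for row in rows for px in row[left:left + width]]
    return sum(values) / len(values)


def reflectance(roi_value, white_value, dark_value):
    """Black-and-white correction: (ROI - dark) / (white - dark)."""
    span = white_value - dark_value
    if span <= 0:
        raise ValueError("white reference must be brighter than dark reference")
    return (roi_value - dark_value) / span


# Hypothetical digital numbers for one band on one day (not measured data).
band = [
    [10, 10, 10, 10],
    [10, 120, 130, 10],
    [10, 110, 140, 10],
    [10, 10, 10, 10],
]
canopy = roi_mean(band, top=1, left=1, height=2, width=2)
r = reflectance(canopy, white_value=240.0, dark_value=10.0)
print(canopy, r)
```

Keeping the ROI averaging separate from the correction makes it easy to reuse the same whiteboard and dark values for every ROI collected on a given day.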

Figure 2 shows the mean reflectance of the samples collected at different wavelengths during the 12-day data acquisition period. The curves indicate that the ranges of spectral reflectance values at 550 nm and 660 nm were close, with broadly similar trends. The range and trend of reflectance values at 730 nm and 790 nm were also basically the same, and the reflectance values gradually decreased over the acquisition period.

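Figure 2: Mean reflectance of samples collected at different wavelengths during the 12-day data acquisition period; panels (a)–(d).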

2.2.3. Acquisition of Data on Cotton Plant Height, Number of Fruiting Branches, and Number of Flower Buds and Determination of Topping Time

To determine the correlation between the spectral information of the cotton canopy and indexes such as cotton plant height, number of fruiting branches, and number of flower buds, these indexes were measured in the field. Five plants, spaced apart at fixed locations in the experimental plot, were randomly selected as sample measuring points. The heights of these five fixed cotton plants were measured daily with a tape and averaged to give the plant height in the plot. The numbers of fruiting branches and flower buds of the five fixed plants were counted and averaged to give the number of fruiting branches and flower buds in the plot on that day. The variation of cotton plant height, number of fruiting branches, and number of flower buds over time is shown in Figure 3.

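Figure 3: Variation of cotton plant height, number of fruiting branches, and number of flower buds over time; panels (a)–(c).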

Figure 3 indicates that cotton plant height, number of fruiting branches, and number of flower buds continued to increase from 62 cm, 6.5, and 17 on day 1 until days 7–8, when almost no further change was observed. The topping operation was performed in the experimental plot on day 8, and the figure shows that the cotton indexes no longer changed after topping.

2.3. Data Processing Methods

2.3.1. Decision Tree

The decision tree algorithm is a data mining algorithm featuring high readability and fast computing speed. Decision trees are constructed based on the calculation of conditional probabilities. Building a decision tree consists of two main processes: structure building and pruning. During learning, a decision tree model is established from the training data based on the principle of loss function minimization.

Common algorithms used for decision tree building include the ID3 and C4.5 algorithms. Maximum likelihood estimation (MLE) is used in the decision tree for probabilistic model calculation. The core of both algorithms is to select a feature at each node of the decision tree using the information gain (ratio) and to build the tree recursively: starting from the root node, the information gain (ratio) of every candidate feature is calculated for the node, and the feature with the largest information gain (ratio) is selected as the node feature. Child nodes are established for the different values of this feature, and the same method is called recursively on the child nodes until the information gain (ratio) of every remaining feature is very small or no features remain to be selected. ID3 uses the information gain to select features, whereas C4.5 uses the information gain ratio; here the features are those of the cotton canopy.

Equation (2) gives the information gain, in which the training dataset is D, the feature is A, and $H(D)$ represents the empirical entropy of the dataset D. $H(D \mid A)$ represents the empirical conditional entropy of the dataset D given feature A:

$$g(D, A) = H(D) - H(D \mid A) \tag{2}$$

Equations (3) and (4) give the information gain ratio, where n is the number of values taken by feature A and $D_i$ is the subset of D on which A takes its i-th value:

$$g_R(D, A) = \frac{g(D, A)}{H_A(D)} \tag{3}$$

$$H_A(D) = -\sum_{i=1}^{n} \frac{|D_i|}{|D|} \log_2 \frac{|D_i|}{|D|} \tag{4}$$

The letters in equations (3) and (4) have the same meaning as in equation (2). This paper chooses formula (3) as the cost function. In the input space where the training dataset is located, each region is recursively divided into two subregions, the output value in each subregion is determined, and a binary decision tree is established.
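As a worked illustration of the information gain and the information gain ratio, the following sketch computes both for a discrete feature. It follows the textbook definitions (gain equals the entropy of the labels minus the conditional entropy given the feature, and the ratio divides the gain by the entropy of the feature values). The toy feature and label values are hypothetical, and this is not the authors' implementation.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Empirical entropy (base 2) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def information_gain(feature, labels):
    """H(D) minus the empirical conditional entropy H(D | A)."""
    n = len(labels)
    groups = {}
    for a, y in zip(feature, labels):
        groups.setdefault(a, []).append(y)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def gain_ratio(feature, labels):
    """Information gain divided by the entropy of the feature values."""
    split_info = entropy(feature)
    if split_info == 0:
        return 0.0  # a constant feature carries no information
    return information_gain(feature, labels) / split_info


# Hypothetical toy data: a binned reflectance feature and a topping label.
labels = ["wait", "wait", "top", "top"]
informative = ["high", "high", "low", "low"]
uninformative = ["a", "b", "a", "b"]

print(information_gain(informative, labels), gain_ratio(informative, labels))
print(information_gain(uninformative, labels))
```

The informative feature separates the two labels completely, so its gain equals the full label entropy of 1 bit, while the uninformative feature leaves the entropy unchanged and has zero gain.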

2.3.2. Decision Tree Algorithm Based on Ensemble Learning

Ensemble learning is widely used in machine learning and statistical learning. The ensemble approach trains different single regression models independently, using the same or different datasets and different parameters for the individual models. The final output is obtained by taking the mean of the outputs of all the single models.

Algorithm 1: Bagging.
	Input: training set $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$; base learning algorithm $\mathcal{L}$; $x_i \in \mathbb{R}^n$ represents an input instance, where n is the number of features; $y_i$ represents the regression label, i = 1, 2, …, N, where N represents the dataset size; T is the number of training rounds.
	Output: $H(x) = \frac{1}{T} \sum_{t=1}^{T} h_t(x)$.
(1)	for t = 1, 2, …, T do
(2)	$h_t = \mathcal{L}(D_t)$, where $D_t$ is the t-th sampling set, obtained by randomly sampling the training set D a total of m times to give a set containing m samples.
(3)	end

(1) Random Forests. A random forest is an ensemble that combines decision trees with bagging; the bagging procedure is given in Algorithm 1. To increase randomness, n samples are selected using sampling with replacement in the sample acquisition process. The steps are as follows.
	(1) n samples are selected from the sample set using Bootstrap sampling (sampling with replacement).
	(2) k attributes are randomly selected from all attributes, and the best segmentation attributes are taken as nodes to build a CART decision tree.
	(3) The above two steps are repeated m times; that is, m CART decision trees are established.
	(4) The m CART decision trees form a random forest, and the regression value is determined by aggregating the outputs of the trees (voting for classification, averaging for regression).
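The bagging idea behind the random forest can be sketched in a few lines. To stay self-contained, this illustration swaps the CART trees for one-split regression stumps on a single feature and omits the random attribute selection of step (2), so it is plain bagging rather than a full random forest. The function names and the reflectance and height values are hypothetical.

```python
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by least squares."""
    best = None
    for thr in sorted(set(xs))[:-1]:  # every split leaves both sides non-empty
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def bagging_fit(xs, ys, rounds=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(rounds):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Average the outputs of all base models."""
    return sum(m(x) for m in models) / len(models)


# Hypothetical reflectance values and plant heights (cm), not measured data.
xs = [0.50, 0.48, 0.46, 0.44, 0.42, 0.40]
ys = [62.0, 63.5, 65.0, 66.5, 68.0, 69.0]
models = bagging_fit(xs, ys)
prediction = bagging_predict(models, 0.41)
print(prediction)
```

Because each base model is trained on an independent bootstrap sample, the loop in `bagging_fit` could run in parallel, which is consistent with the shorter training time the paper reports for random forest.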

(2) AdaBoost + Decision Tree. AdaBoost is an adaptive boosting algorithm, which is described in Algorithm 2.

Algorithm 2: AdaBoost.
	Input: training dataset $T = \{(x_1, y_1), (x_2, y_2), \ldots, (x_N, y_N)\}$, where sample $x_i \in \mathcal{X} \subseteq \mathbb{R}^n$, $y_i \in \mathcal{Y}$, $\mathcal{X}$ is the instance space, and $\mathcal{Y}$ is the set of labels.
	Output: final regression model G(x).
(1)	Initialize the weight distribution of the training data, $D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1N})$, with weight value $w_{1i} = \frac{1}{N}$ for each data sample, i = 1, 2, …, N.
(2)	For m = 1, 2, …, M, where M represents the total number of simple regression models:
(a)	Perform learning using the training dataset with weight distribution $D_m$ to obtain the base regression model $G_m(x)$.
(b)	Calculate the error rate $e_m$ of $G_m(x)$ on the training dataset, where $d(G_m(x_i), y_i)$ represents the distance between $G_m(x_i)$ and $y_i$; this paper chooses the mean square error distance.
(c)	Calculate the coefficient $\alpha_m$ of $G_m(x)$.
(d)	Update the weight distribution of the training dataset to $D_{m+1} = (w_{m+1,1}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$, where $Z_m$ is the normalization factor, which makes $D_{m+1}$ a probability distribution.
(3)	Construct a linear combination of the base regression models, $G(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$, to obtain the final regression model.

In this paper, $G_m(x)$ adopts the decision tree model.
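The boosting loop can be illustrated with a compact sketch. Since the exact error, coefficient, and weight-update formulas are not recoverable here, the sketch uses one widely used regression variant, AdaBoost.R2 (Drucker), with a squared, max-normalized loss and a weighted-median prediction. This is an assumption and may differ from the authors' formulation, which combines the base models linearly. The base learner is a one-split regression stump standing in for a decision tree, and all names and data values are hypothetical.

```python
import math
import random


def fit_stump(xs, ys):
    """Fit a one-split regression tree on one feature by least squares."""
    best = None
    for thr in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= thr]
        right = [y for x, y in zip(xs, ys) if x > thr]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    if best is None:  # all x values identical: fall back to the mean
        mean = sum(ys) / len(ys)
        return lambda x: mean
    _, thr, lm, rm = best
    return lambda x: lm if x <= thr else rm


def adaboost_r2_fit(xs, ys, rounds=20, seed=0):
    rng = random.Random(seed)
    n = len(xs)
    weights = [1.0 / n] * n  # step (1): uniform initial weights
    ensemble = []            # list of (model, coefficient) pairs
    for _ in range(rounds):
        # (a) train on a sample drawn according to the current weights
        idx = rng.choices(range(n), weights=weights, k=n)
        model = fit_stump([xs[i] for i in idx], [ys[i] for i in idx])
        # (b) weighted error using a squared, max-normalized distance
        abs_err = [abs(model(x) - y) for x, y in zip(xs, ys)]
        max_err = max(abs_err)
        if max_err == 0:     # perfect learner: keep it and stop
            ensemble.append((model, 1.0))
            break
        loss = [(e / max_err) ** 2 for e in abs_err]
        err = sum(w * l for w, l in zip(weights, loss))
        if err >= 0.5:       # too weak: discard unless it is the only model
            if not ensemble:
                ensemble.append((model, 1.0))
            break
        # (c) coefficient of this base model
        beta = err / (1.0 - err)
        ensemble.append((model, math.log(1.0 / beta)))
        # (d) shrink weights of well-fitted samples, then renormalize
        weights = [w * beta ** (1.0 - l) for w, l in zip(weights, loss)]
        z = sum(weights)
        weights = [w / z for w in weights]
    return ensemble


def adaboost_r2_predict(ensemble, x):
    """Weighted median of the base-model predictions."""
    preds = sorted((model(x), alpha) for model, alpha in ensemble)
    half = 0.5 * sum(alpha for _, alpha in preds)
    running = 0.0
    for value, alpha in preds:
        running += alpha
        if running >= half:
            return value
    return preds[-1][0]


# Hypothetical reflectance values and plant heights (cm), not measured data.
xs = [0.50, 0.48, 0.46, 0.44, 0.42, 0.40]
ys = [62.0, 63.5, 65.0, 66.5, 68.0, 69.0]
ensemble = adaboost_r2_fit(xs, ys)
prediction = adaboost_r2_predict(ensemble, 0.41)
print(len(ensemble), prediction)
```

Each round depends on the weights produced by the previous one, so the base models must be trained serially.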

2.3.3. Other Multispectral Analysis Algorithms

Common base regression methods include multivariate linear regression, neural network, support vector machine, decision tree, etc.

The multivariate linear method fits the input spectral data to the target data by identifying suitable polynomial parameters, with a relatively short training time. However, its fitting effect on data that are not linearly related is relatively poor. In the neural network, the input data are multiplied by the connection weights, the result is mapped to a nonlinear space by a nonlinear activation function, and the connection weights are adjusted by backpropagation (BP) to obtain a relatively good fitting effect. In the support vector machine, linear or nonlinear kernel functions can be selected to map the data into a high-dimensional space, and high-dimensional features of the data are extracted by identifying the maximum-margin separating hyperplane. The neural network and support vector machine integrate feature extraction and regression analysis. The band subset with the lowest root-mean-square error under cross-validation is taken as the optimal band subset for feature extraction and regression analysis.

3. Experiments and Results

The multispectral data after radiation correction were used as inputs to establish regression prediction models for cotton plant height, number of fruiting branches, number of flower buds, and other indexes. The regression models adopted several common base regression methods, namely multivariate linear regression, neural network, support vector machine, and decision tree. The individual bands were first input into the simple regression models to analyze the effect of each band on the prediction results. Subsequently, multiple bands were combined for regression analysis based on these results. Because the data acquisition interval was short and the changes in cotton plant growth were minor, the fitting effects of the simple regression models were compared in order to improve the feature extraction and target fitting capacity of the model. The model with a good fitting effect and stable performance was selected as the basis for improvement, and the ensemble learning algorithms bagging and AdaBoost were each used to improve the prediction capacity of the simple regression model.

3.1. Dataset and Experimental Environment

The dataset consisted of 1040 samples. The input data were the cotton multispectral data, and the label data were cotton plant height, number of fruiting branches, and number of flower buds. The algorithm was iterated 30,000 times to ensure model convergence; 920 samples were used as training samples, and 120 samples were used as test samples. All algorithms that require iteration were iterated 1000 times. The algorithms were implemented on the Python platform. The processor was an Intel i9, and the operating system was Windows 10.
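The 920/120 partition can be reproduced with a simple index split. The sketch below assumes a seeded random shuffle, which the paper does not specify, and the function name is hypothetical.

```python
import random


def split_indices(n_samples, n_test, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[n_test:], idx[:n_test]


# Sizes taken from the text: 1040 samples, 120 of them held out for testing.
train_idx, test_idx = split_indices(1040, 120)
print(len(train_idx), len(test_idx))
```

Fixing the seed keeps the split identical across models, so that differences in R² and RMSEP reflect the models rather than the partition.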

3.2. Regression Models of Cotton Canopy Spectra for Cotton Plant Height, Number of Fruiting Branches, and Flower Buds at Different Wavelengths

In this paper, the spectral reflectance at each wavelength was first input into the various common base regression models to analyze cotton plant height, number of fruiting branches, and number of flower buds, respectively. The fitting results are shown in Table 1.

The fitting results indicated that the 730 nm band showed the best fit for plant height, number of flower buds, and number of fruiting branches, followed by the 790 nm band, with fitting coefficients above 0.65 for all three targets; the 550 nm band had a relatively poor fit, with fitting coefficients below 0.3 for all targets; and the 660 nm band had the worst fit, with fitting coefficients below 0.1 for all targets.

In terms of regression methods, the neural network and the decision tree gave the better results in the single-band regression models. Both the regression coefficients and the prediction error (RMSEP) of the neural network were slightly higher than those of the decision tree, which suggested that the neural network fitted the data well but was less stable. SVM ranked after these two methods in regression coefficients, and multiple linear regression had the worst effect, indicating that the multispectral data were nonlinear and that their features should be extracted by nonlinear methods.
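The two evaluation metrics used throughout the results can be computed as in the sketch below. It assumes the common definitions, with R² as one minus the ratio of the residual sum of squares to the total sum of squares and RMSEP as the root-mean-square error on the test samples; the paper does not spell out its formulas. The measured and predicted heights are hypothetical.

```python
from math import sqrt


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction on held-out samples."""
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return sqrt(squared / len(y_true))


# Hypothetical plant heights in cm (not the paper's measurements).
measured = [62.0, 64.0, 66.0, 68.0]
predicted = [62.5, 63.5, 66.5, 67.5]
print(r_squared(measured, predicted), rmsep(measured, predicted))
```

Reporting both values is useful because a model can reach a high R² while still having an RMSEP that is large relative to the day-to-day change in the index.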

3.3. Regression Model of Conventional Vegetation Index (VI) for Cotton Plant Height, Number of Fruiting Branches, and Number of Flower Buds

Regression was performed on cotton plant height, number of flower buds, and number of fruiting branches using multiple conventional VIs. Owing to the experimental equipment used in this paper, the VIs associated with the 660, 730, and 790 nm bands were selected for regression. The difference index (DI), difference vegetation index (DVI), red-edge chlorophyll index (CI_rededge), normalized difference vegetation index (NDVI), green normalized difference vegetation index (GNDVI), and triangle vegetation index (TVI) were finally selected. In the formulas for these indexes, R represented the spectral reflectance, and the subscript numbers represented the band with the specified wavelength. Since the spectral resolution error of the spectral instrument was ±30 nm, the bands required by the index formulas could be replaced by the nearest measured band, namely the 790, 660, or 730 nm band. The regression effects are shown in Table 2.

The results in Table 2 indicated that the regression effects of two conventional VIs, DVI and CI_rededge, on cotton plant height, number of buds, and number of fruiting branches were relatively stable, with coefficients of determination (R²) above 0.5, whereas the other VIs gave unstable fits to the three indexes, and changing the fitted index or the method had a significant impact on the fitting effect. The regression effect of combined spectral data was better than that of single spectral data. Among the VIs, those involving the 550, 730, and 790 nm bands fitted better, which agreed with the single-band regression results.
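Four of the listed indexes have widely used definitions that can be sketched directly. The mapping of measured bands to index inputs below (790 nm as near-infrared, 660 nm as red, 550 nm as green, 730 nm as red edge) is an assumption based on the nearest-band substitution described above, and the reflectance values are hypothetical. DI and TVI are omitted because their exact forms in the paper are not recoverable here.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index."""
    return (nir - red) / (nir + red)


def gndvi(nir, green):
    """Green normalized difference vegetation index."""
    return (nir - green) / (nir + green)


def dvi(nir, red):
    """Difference vegetation index."""
    return nir - red


def ci_rededge(nir, rededge):
    """Red-edge chlorophyll index."""
    return nir / rededge - 1.0


# Hypothetical corrected reflectances for one sample, keyed by band (nm).
sample = {550: 0.10, 660: 0.05, 730: 0.30, 790: 0.45}

indexes = {
    "NDVI": ndvi(sample[790], sample[660]),
    "GNDVI": gndvi(sample[790], sample[550]),
    "DVI": dvi(sample[790], sample[660]),
    "CI_rededge": ci_rededge(sample[790], sample[730]),
}
print(indexes)
```

Each index collapses two bands into one scalar, which may explain why the combined raw bands, which keep the separate values, regress better than any single VI in the paper's comparison.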

3.4. Regression Model of Multispectral Data of Cotton Canopy for Cotton Plant Height, Number of Fruiting Branches, Flower Buds in Different Band Combinations

In Table 3, new band combinations, formed by superimposing the bands with relatively good regression effects, were input into the regression models for analysis. The results indicated that the 730 + 790 nm band combination regressed better than the single bands, with regression coefficients above 0.80 for cotton plant height, number of flower buds, and number of fruiting branches; the 550 + 730 + 790 nm band combination was slightly better than 730 + 790 nm, suggesting that the 550 nm band had a positive effect on the fitting results. Regression was also performed on the relevant indexes using the full-band spectra. The regression effect degraded compared with the 730 + 790 nm and 550 + 730 + 790 nm band combinations, which suggested that adding the 660 nm spectral band had a negative effect on regression. The 550 + 730 + 790 nm band combination and the conventional VIs led to the same conclusion.

In terms of regression methods, both neural networks and decision trees performed relatively well, with decision trees achieving higher correlation coefficients, lower RMSEP, and more stable regression results across the indexes than neural networks; the support vector machine and the multivariate linear regression model performed only moderately.

3.5. Band Combination Cotton Topping Index Regression Model Based on Improved Decision Tree

In terms of band combination, the 730 + 790 nm, 550 + 730 + 790 nm, and full-band data were fitted using the random forest and AdaBoost + decision tree methods, respectively. The fitting results are shown in Table 3. The 550 + 730 + 790 nm spectral combination regressed better than 730 + 790 nm, indicating that the 550 nm band was positively correlated with the indexes during the cotton topping period and can supplement the information in the 730 and 790 nm bands. The full band regressed worse than the 550 + 730 + 790 nm combination because the 660 nm band, which regressed relatively poorly on the cotton-related indexes, interfered with the regression. The results are shown in Table 4.

In terms of methods, among the simple regression models, the decision tree and the neural network achieved the better correlation coefficients, and the decision tree had the lower prediction RMSEP, indicating a more stable prediction effect. Hence, the method in this paper was improved on the basis of the decision tree, using the bagging and AdaBoost methods. The bagging method improved the model fitting capacity by training on different resampled subsets of the data; the AdaBoost method improved the feature extraction capacity of the model by establishing multiple regression models and assigning different weights to them. The comparative analysis indicated that AdaBoost + decision tree had a better regression effect, a higher correlation coefficient, a lower prediction RMSEP, and a stronger feature extraction capacity. As can be seen in Table 5, the decision tree, as a simple regression method, consumed the least time; the random forest took 0.69 s, and AdaBoost + decision tree took 0.98 s. The reason is that random forest is an ensemble learning method based on bagging, in which the base regressors are trained in parallel, whereas in AdaBoost the base regressors are trained serially, each with its own coefficient that is adjusted continuously during training, so AdaBoost + decision tree took longer to train than random forest. The results are shown in Table 5.

The 550 + 730 + 790 nm band combination had the highest fit for cotton plant height, number of buds, and number of fruiting branches, with the AdaBoost + decision tree method having the highest fitting coefficients (0.95, 0.96, and 0.84) and the lowest prediction RMSEP (0.40, 0.54, and 0.49). In the 730 + 790 nm band combination, the AdaBoost + decision tree method had the highest correlation coefficients for plant height, number of buds, and number of fruiting branches (0.90, 0.91, and 0.95). In the full-band combination, the AdaBoost + decision tree method also had the highest fitting coefficients (0.93, 0.9, and 0.95) and the lowest prediction RMSEP (0.86, 0.95, and 0.99). The predicted and measured values of cotton plant height, number of fruiting branches, and number of buds for AdaBoost + decision tree with the 550 + 730 + 790 nm band combination are shown in Figure 4.

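Figure 4: Predicted and measured cotton plant height, number of fruiting branches, and number of buds for AdaBoost + decision tree with the 550 + 730 + 790 nm band combination; panels (a)–(c).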

3.6. Cotton Topping Time Determination Test

According to the requirements for cotton topping time in the Technical Guidance on Mechanized Cotton Production in Northwest Inland Cotton Growing Regions issued by the Ministry of Agriculture and Rural Affairs, and in related cotton planting agronomy, the appropriate topping time should be determined based on cotton growth, plant height, number of fruiting branches, and other factors. Early topping should be performed as appropriate, and plant monitoring indexes should be determined with the aim of promoting early maturity. In this paper, the topping indexes of Xinluzao 83 in Shihezi, Xinjiang, were quantified from the indexes specified in the Technical Guidance after field investigation and experimental data analysis. The topping time determination indexes of this variety were set as follows: plant height of 65–70 cm, 9–11 fruiting branches, and 18–21 buds at cotton topping.

On this basis, whether to perform topping was determined from the plant height, number of fruiting branches, and number of buds predicted by the neural network, decision tree, random forest, and AdaBoost + decision tree methods with the 550 + 730 + 790 nm band combination. The predicted topping decisions were compared with those from the actual measured values, and the statistics of determination accuracy are shown in Table 5. Table 5 indicated that the 550 + 730 + 790 nm band combination with the AdaBoost + decision tree method determined the cotton topping time better than the other band combinations and methods. The results are shown in Table 6.
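The determination step can be sketched as a range check followed by an accuracy count. The ranges come from the indexes stated above, but the rule that all three indexes must fall inside their ranges at once is an assumption, since the paper does not state how the criteria are combined. The function names and the sample values are hypothetical.

```python
# Index ranges quoted in the text for the studied variety.
TOPPING_RANGES = {
    "height_cm": (65.0, 70.0),
    "fruiting_branches": (9.0, 11.0),
    "buds": (18.0, 21.0),
}


def ready_for_topping(height_cm, fruiting_branches, buds):
    """Assumed rule: every index must lie inside its range."""
    values = {
        "height_cm": height_cm,
        "fruiting_branches": fruiting_branches,
        "buds": buds,
    }
    return all(lo <= values[key] <= hi for key, (lo, hi) in TOPPING_RANGES.items())


def accuracy(predicted, actual):
    """Share of samples whose predicted decision matches the measured one."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Hypothetical (height, branches, buds) triples: model output vs. field values.
predicted_indexes = [(62.4, 6.8, 17.2), (66.1, 9.3, 18.4), (64.8, 9.1, 18.2)]
measured_indexes = [(62.0, 6.5, 17.0), (66.0, 9.5, 18.5), (65.2, 9.0, 18.0)]

predicted_decisions = [ready_for_topping(*s) for s in predicted_indexes]
measured_decisions = [ready_for_topping(*s) for s in measured_indexes]
print(predicted_decisions, measured_decisions)
print(accuracy(predicted_decisions, measured_decisions))
```

In the third sample a prediction error of only 0.4 cm flips the decision, which shows why low RMSEP near the range boundaries matters for the determination accuracy.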

According to the experimental results, the 550 + 730 + 790 nm data combined with the decision tree, random forest, and AdaBoost + decision tree methods were selected to obtain the fitting models, and the output of each model was input into the topping determination model. In terms of determination accuracy, the AdaBoost + decision tree method showed the better fitting effect, and the 550 + 730 + 790 nm band combination had the best fit for cotton plant height, number of buds, and number of fruiting branches. For this combination, the AdaBoost + decision tree method had the highest coefficients of determination (0.96, 0.84, and 0.97) and the lowest RMSEP (0.40 cm, 0.54, and 0.49). When the data predicted by the model were input into the topping determination model, the determination accuracy was 94.03%. The topping prediction results also corroborated that AdaBoost + decision tree was the best fitting model for cotton plant height, number of fruiting branches, and number of buds.

4. Discussion and Analysis

In this paper, plant indexes were determined and quantified according to the planting guideline and agronomic requirements for determining the topping time of cotton in Xinjiang. The plant height, number of fruiting branches, and number of buds were collected as indexes for determination. The multispectral information of cotton canopy during the topping time was collected to establish the analysis model. The multispectral information processing mainly included two parts: multispectral information preprocessing and spectral information processing modelling. The multispectral information was converted into reflectance by using the black and white correction method. The spectral preprocessing could reduce the interference of other factors such as experimental instruments and background and effectively improve the accuracy of information inversion interpretation. On this basis, simple regression methods such as the multivariate linear regression model, neural network, SVM, and decision tree were used to perform regression on cotton plant height, number of buds, number of fruiting branches, and other indexes based on traditional experience. The bands with good regression effects were selected for combination analysis to determine good prediction combinations and methods. The fitting results indicated that the 730 nm and 790 nm spectral bands had high correlations with cotton plant height, number of fruiting branches, and number of buds; the 550 nm band had weak correlations with the related indexes; the 660 nm spectral band had weak prediction capacity for related indexes. According to the studies in [6, 12, 13, 24, 25], the red-light band information collected from the crop canopy had better fitting effect for common indexes such as chlorophyll and leaf area index in addition to cotton plant height.

In terms of regression algorithms, among the simple algorithms, the decision tree, support vector machine, and neural network methods each performed best on some indicators and band combinations; however, the decision tree was the most stable. The improved methods based on ensemble learning had better feature extraction capacity for crop canopy multispectral information than the simple methods, a result that has also been reported for canopy monitoring of wheat and rice crops [21–25]. Among the ensemble algorithms, the fitting model based on AdaBoost + decision tree was better than the single decision tree in fitting degree. In the random forest, which is based on bagging, each training set was drawn from the original dataset with replacement. The resulting datasets were independent, and uniform sampling was used, with equal weights for each sample. The AdaBoost method, in contrast, changed the weights of the data, adjusting each sample weight according to its error: the higher the error, the greater the weight. Therefore, the AdaBoost + decision tree method had better performance in fitting degree than the random forest. In terms of time consumption, however, AdaBoost + decision tree was slower than the decision tree. The time consumption of both was within 1 s, and the gap between the two was about 0.3 s, which is acceptable for field crop information collection. In addition to ensemble learning methods, deep learning can provide a better fit for cotton height but requires a greater amount of training data [26], which is a direction for further in-depth research.
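The contrast between the two sampling schemes can be sketched in a few lines. The paper does not state which AdaBoost regression variant was used; the sketch below assumes AdaBoost.R2 with a linear loss, a common choice for boosting regression trees. The function names and the error values are hypothetical.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Bagging: draw n indices uniformly with replacement (equal sample weights)."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def adaboost_r2_reweight(weights, abs_errors):
    """One AdaBoost.R2 weight update with the linear loss (assumed variant).

    Returns the new normalized weights and beta for this round.
    """
    max_error = max(abs_errors)
    if max_error == 0:
        return list(weights), 0.0  # perfect fit, nothing to reweight
    losses = [e / max_error for e in abs_errors]  # scaled to [0, 1]
    avg_loss = sum(w * l for w, l in zip(weights, losses))
    if avg_loss >= 0.5:
        raise ValueError("weak learner too poor; boosting would stop here")
    beta = avg_loss / (1.0 - avg_loss)
    # Well-fitted samples (small loss) are shrunk the most, so poorly
    # fitted samples end up with relatively larger weights.
    updated = [w * beta ** (1.0 - l) for w, l in zip(weights, losses)]
    total = sum(updated)
    return [w / total for w in updated], beta

rng = random.Random(0)
bag = bootstrap_indices(4, rng)

# Hypothetical absolute errors of one tree on four samples.
start_weights = [0.25, 0.25, 0.25, 0.25]
abs_errors = [0.1, 0.2, 0.1, 1.0]
new_weights, beta = adaboost_r2_reweight(start_weights, abs_errors)
print(bag, [round(w, 3) for w in new_weights], round(beta, 3))
```

In this example the fourth sample, which has the largest error, receives the largest weight for the next round, whereas the bootstrap draw treats all four samples identically regardless of error.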

In terms of model application, the model in this paper can be integrated into a multispectral unmanned aircraft hardware platform or a multispectral transmission platform to quickly, automatically, and accurately obtain the plant height, number of flower buds, and number of fruiting branches of cotton at the flower bud stage. The automatically obtained results can provide crop indicators for arranging the operation strategy and planning the paths of intelligent plant protection machinery [27]. At the same time, the model building method proposed in this paper could be applied to monitoring the growth state of cotton or other crops over a short period of time.

5. Conclusions

(1) The 550 + 730 + 790 nm band combination had a better fitting effect than the other band combinations and the single bands. Among the single bands, 730 nm and 790 nm had good fitting effects on cotton plant height, number of fruiting branches, and number of buds, followed by 550 nm, which could supplement the red-light band information; 660 nm had the poorest fitting effect.

(2) The AdaBoost + decision tree method was proposed for analyzing the multispectral data of the 550 + 730 + 790 nm band combination. The model test results indicated that the coefficient of determination between the canopy multispectral information and cotton plant height was 0.96, with an RMSEP of 0.40 cm; for the number of fruiting branches it was 0.97, with an RMSEP of 0.54; and for the number of buds it was 0.84, with an RMSEP of 0.49. The accuracy of topping time determination based on the predicted results was 94.03%.

The method proposed in this paper can provide data on crop physiological state to support the determination of cotton topping time and the operation strategy of intelligent topping machinery.

Data Availability

The remote sensing data and codes used in the experiment to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Authors’ Contributions

Conceptualization was done by Li Yibai and Chen Cong; methodology was done by Li Yibai and Cao Guangqiao; hardware was done by Liu Dong; software was done by Li Yibai and Zhang Jinlong; validation was done by Cao Guangqiao and Chen Cong; formal analysis was done by Cao Guangqiao; investigation was done by Li Liang; resources were provided by Ji Chao; data curation was done by Ji Chao; writing (original draft preparation) was done by Li Yibai and A.Z.; writing (review and editing) was done by Chen Cong; supervision was done by Cao Guangqiao; project administration was done by Ji Chao; funding acquisition was done by Zhang Jinlong. All authors have read and agreed to the published version of the manuscript.

Acknowledgments

This work was supported by a grant of the special funding for basic scientific research business expenses of Central Public Welfare Scientific Research Institutes (S202010, S202109-02) and Science and Technology Innovation Project of Chinese Academy of Agricultural Sciences (Academy of Agricultural Sciences Office (2014) no. 216).

References

M. Tian, S. Ban, and Q. Chang, “Use of hyperspectral images from UAV-based imaging spectroradiometer to estimate cotton leaf area index,” Transactions of the Chinese Society of Agricultural Engineering, vol. 32, no. 21, pp. 102–108, 2016.
M. Tian, S. Ban, and R. Chang, “Estimation of SPAD value of cotton leaf using hyperspectral image from UAV-based imaging spectroradiometer,” Transactions of the Chinese Society for Agricultural Machinery, vol. 47, no. 11, pp. 285–293, 2016.
X. Sun, L. Liu, D. Wang, D. Wang, P. Xu, and S. Li, “Research on the progress of key technology of cotton harvester,” in Proceedings of the ASABE Annual International Virtual Meeting, July 2020.
T. Wang, X. Mei, and J. A. Thomasson, “Volunteer cotton habitat prediction model and detection with UAV remote sensing,” in Proceedings of the ASABE Annual International Virtual Meeting, July 2020.
X. Li, C. Yang, W. Huang, J. Tang, Y. Q. Tian, and Q. Zhang, “Identification of cotton root rot by multifeature selection from sentinel-2 images using random forest,” Remote Sensing, vol. 12, no. 21, 2020.
Y. Lin, Z. Zhu, and W. Guo, “Continuous monitoring of cotton stem water potential using sentinel-2 imagery,” Remote Sensing, vol. 12, no. 7, 2020.
S. Mauget, M. Ulloa, and J. Dever, “Planting date effects on cotton lint yield and fiber quality in the U.S. Southern high plains,” Agriculture, vol. 9, no. 4, p. 82, 2019.
M. Zhang, Y. Dai, X. Luo, Y. Junzhi, and Z. Minghui, “Control system design and research of the wireless control power switch for cotton plant topper,” Journal of Chinese Agricultural Mechanization, vol. 35, no. 2, pp. 286–289, 2014.
J. Bian, Z. Zhang, J. Chen, and H. Chen, “Simplified evaluation of cotton water stress using high resolution unmanned aerial vehicle thermal imagery,” Remote Sensing, vol. 11, no. 3, 2019.
Y. Ma, X. Lv, and Y. Xiang, “Monitoring of cotton leaf area index using machine learning,” Transactions of the Chinese Society of Agricultural Engineering, vol. 37, no. 13, pp. 152–162, 2021.
H. Mao, J. Meng, F. Ji, and Q. Zhang, “Comparison of machine learning regression algorithms for cotton leaf area index retrieval using sentinel-2 spectral bands,” Applied Sciences, vol. 9, no. 7, 2019.
I. J. Marang, P. Filippi, and T. B. Weaver, “Machine learning optimized hyperspectral remote sensing retrieves cotton nitrogen status,” Remote Sensing, vol. 13, no. 8, 2021.
C. W. Jeon, H. J. Kim, C. Yun, X. Han, and J. H. Kim, “Design and validation testing of a complete paddy field-coverage path planner for a fully autonomous tillage tractor,” Biosystems Engineering, vol. 208, no. 2, pp. 79–97, 2021.
G. Edwards, J. Hinge, N. Skou-Nielsen, V. H. Andres, A. G. Claus, and G. Ole, “Route planning evaluation of a prototype optimised infield route planner for neutral material flow agricultural operations,” Biosystems Engineering, vol. 153, pp. 149–157, 2017.
B. C. Strik, “Pruning and training systems impact yield and cold hardiness of “marion” trailing blackberry,” Agriculture, vol. 8, no. 9, 2018.
H. Yanjie and S. C. Longlong, “Research status and prospect of cotton terminal bud identification and location technology,” Journal of Chinese Agricultural Mechanization, vol. 39, no. 11, pp. 72–78, 2018.
Z. Fu, S. Yu, J. Zhang et al., “Combining UAV multispectral imagery and ecological factors to estimate leaf nitrogen and grain protein content of wheat,” European Journal of Agronomy, vol. 132, no. 12, p. 6405, 2022.
L. Xun, J. Zhang, D. Cao, J. Wang, S. Zhang, and F. Yao, “Mapping cotton cultivated area combining remote sensing with a fused representation-based classification algorithm,” Computers and Electronics in Agriculture, vol. 181, Article ID 105940, 2021.
J. Q. Zhang, H. Tian, P. Wang, K. Tansey, S. Zhang, and H. Li, “Improving wheat yield estimates using data augmentation models and remotely sensed biophysical indices within deep neural networks in the Guanzhong Plain, PR China,” Computers and Electronics in Agriculture, vol. 192, no. 1, Article ID 106616, 2022.
P. Zhou, C. Gong, and X. Yao, “Machine learning paradigms in high-resolution remote sensing image interpretation,” National Remote Sensing Bulletin, vol. 25, no. 1, pp. 182–197, 2021.
V. Niko, H. Eija, N. Roope, H. Teemu, N. Oiva, and K. Jere, “A novel machine learning method for estimating biomass of grass swards using a photogrammetric canopy height model, images and vegetation indices captured by a drone,” Agriculture, vol. 8, no. 5, p. 70, 2018.
B. Kaiyi, N. Zheng, X. Shunfu et al., “Non-destructive monitoring of maize nitrogen concentration using a hyperspectral LiDAR: an evaluation from leaf-level to plant-level,” Remote Sensing, vol. 13, no. 24, 2021.
Y. Zhang, M. Migliavacca, J. Penuelas, and W. Ju, “Advances in hyperspectral remote sensing of vegetation traits and functions,” Remote Sensing of Environment, vol. 252, no. 1, Article ID 112121, 2020.
H. Amir, R. A. Washington-Allen, and B. G. Leib, “Prediction of cotton lint yield from phenology of crop indices using artificial neural networks,” Computers and Electronics in Agriculture, vol. 152, no. 7, pp. 186–197, 2018.
J. F. I. Nturambirwe, H. N. Hélène, and W. J. Perold, “Detecting bruise damage and level of severity in apples using a contactless NIR spectrometer,” Applied Engineering in Agriculture, vol. 36, no. 3, pp. 257–270, 2020.
G. Shao, W. Han, H. Zhang et al., “Mapping maize crop coefficient Kc using random forest algorithm based on leaf area index and UAV-based multispectral vegetation indices,” Agricultural Water Management, vol. 252, no. 30, Article ID 106906, 2021.
J. Cui, M. Yang, D. Son, C. Seong-In, and G. Kim, “Hyperspectral imaging for tomato bruising damage assessment of simulated harvesting process impact using wavelength interval selection and multivariate analysis,” Applied Engineering in Agriculture, vol. 36, no. 4, pp. 533–547, 2020.
J. Yu, L. S. John, L. Changying, C. R. Glen, and H. P. Andrew, “Ground based hyperspectral imaging to characterize canopy-level photosynthetic activities,” Remote Sensing, vol. 12, no. 2, 2020.
Y. Lan, Z. Huang, X. Deng et al., “Comparison of machine learning methods for citrus greening detection on UAV multispectral images,” Computers and Electronics in Agriculture, vol. 171, no. 10, 2020.
M. A. Soppa, B. Silva, F. Steinmetz et al., “Assessment of polymer atmospheric correction algorithm for hyperspectral remote sensing imagery over coastal waters,” Sensors, vol. 21, no. 12, 2021.
S. Weng, S. Yu, B. Guo, P. Tang, and D. Liang, “Non-destructive detection of strawberry quality using multi-features of hyperspectral imaging and multivariate methods,” Sensors, vol. 20, no. 11, 2020.
C. Yangbo, D. Peng, and Y. Xiaojun, “Improving land use/cover classification with a multiple classifier system using AdaBoost integration technique,” Remote Sensing, vol. 9, no. 10, 2017.
S. Fei, M. A. Hassan, Z. He, and Z. Chen, “Assessment of ensemble learning to predict wheat grain yield based on UAV-multispectral reflectance,” Remote Sensing, vol. 13, no. 12, Article ID 2338, 2021.
D. T. Bui, T. C. Ho, B. Pradhan, and V. Nhu, “GIS-based modeling of rainfall-induced landslides using data mining-based functional trees classifier with AdaBoost, Bagging, and MultiBoost ensemble frameworks,” Environmental Earth Sciences, vol. 75, no. 14, pp. 1–22, 2016.
G. Hu, C. Yin, M. Wan, Y. Zhang, and Y. Fang, “Recognition of diseased Pinus trees in UAV images using deep learning and AdaBoost classifier,” Biosystems Engineering, vol. 194, pp. 138–151, 2020.

Copyright

Copyright © 2022 Yibai Li et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
